51
|
Brasó-Vives M, Povolotskaya IS, Hartasánchez DA, Farré X, Fernandez-Callejo M, Raveendran M, Harris RA, Rosene DL, Lorente-Galdos B, Navarro A, Marques-Bonet T, Rogers J, Juan D. Copy number variants and fixed duplications among 198 rhesus macaques (Macaca mulatta). PLoS Genet 2020; 16:e1008742. [PMID: 32392208 PMCID: PMC7241854 DOI: 10.1371/journal.pgen.1008742] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 08/16/2019] [Revised: 05/21/2020] [Accepted: 03/27/2020] [Indexed: 01/01/2023]
Abstract
The rhesus macaque is an abundant species of Old World monkeys and a valuable model organism for biomedical research due to its close phylogenetic relationship to humans. Copy number variation is one of the main sources of genomic diversity within and between species and a widely recognized cause of inter-individual differences in disease risk. However, copy number differences among rhesus macaques and between the human and macaque genomes, as well as the relevance of this diversity to research involving this nonhuman primate, remain understudied. Here we present a high-resolution map of sequence copy number for the rhesus macaque genome constructed from a dataset of 198 individuals. Our results show that about one-eighth of the rhesus macaque reference genome is composed of recently duplicated regions, either copy number variable regions or fixed duplications. Comparison with human genomic copy number maps based on previously published data shows that, despite overall similarities in the genome-wide distribution of these regions, there are specific differences at the chromosome level. Some of these create differences in the copy number profile between human disease genes and their rhesus macaque orthologs. Our results highlight the importance of addressing the number of copies of target genes in the design of experiments and caution against human-centered assumptions in research conducted with model organisms. Overall, we present a genome-wide copy number map from a large sample of rhesus macaque individuals, an important novel contribution concerning the evolution of copy number in primate genomes.
Affiliation(s)
- Marina Brasó-Vives
- Institut de Biologia Evolutiva (CSIC-Universitat Pompeu Fabra), Parc de Recerca Biomèdica de Barcelona, Barcelona, Catalonia, Spain
- Laboratoire de Biométrie et Biologie Évolutive UMR 5558, Université de Lyon, Université Lyon 1, CNRS, Villeurbanne, France
| | - Inna S. Povolotskaya
- Veltischev Research and Clinical Institute for Pediatrics of the Pirogov Russian National Research Medical University, Moscow, Russia
| | - Diego A. Hartasánchez
- Institut de Biologia Evolutiva (CSIC-Universitat Pompeu Fabra), Parc de Recerca Biomèdica de Barcelona, Barcelona, Catalonia, Spain
| | - Xavier Farré
- Institut de Biologia Evolutiva (CSIC-Universitat Pompeu Fabra), Parc de Recerca Biomèdica de Barcelona, Barcelona, Catalonia, Spain
| | - Marcos Fernandez-Callejo
- National Centre for Genomic Analysis-Centre for Genomic Regulation, Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
| | - Muthuswamy Raveendran
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, United States of America
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
| | - R. Alan Harris
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, United States of America
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
| | - Douglas L. Rosene
- Department of Anatomy and Neurobiology, Boston University School of Medicine, Boston, Massachusetts, United States of America
| | - Belen Lorente-Galdos
- Department of Neuroscience, Yale School of Medicine, New Haven, Connecticut, United States of America
| | - Arcadi Navarro
- Institut de Biologia Evolutiva (CSIC-Universitat Pompeu Fabra), Parc de Recerca Biomèdica de Barcelona, Barcelona, Catalonia, Spain
- National Institute for Bioinformatics (INB), Barcelona, Catalonia, Spain
- Institució Catalana de Recerca i Estudis Avançats, Barcelona, Catalonia, Spain
| | - Tomas Marques-Bonet
- Institut de Biologia Evolutiva (CSIC-Universitat Pompeu Fabra), Parc de Recerca Biomèdica de Barcelona, Barcelona, Catalonia, Spain
- National Centre for Genomic Analysis-Centre for Genomic Regulation, Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Institució Catalana de Recerca i Estudis Avançats, Barcelona, Catalonia, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Catalonia, Spain
| | - Jeffrey Rogers
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, United States of America
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
| | - David Juan
- Institut de Biologia Evolutiva (CSIC-Universitat Pompeu Fabra), Parc de Recerca Biomèdica de Barcelona, Barcelona, Catalonia, Spain
| |
|
52
|
Takahashi KK, Innan H. Duplication with structural modification through extrachromosomal circular and lariat DNA in the human genome. Sci Rep 2020; 10:7150. [PMID: 32345992 PMCID: PMC7188851 DOI: 10.1038/s41598-020-63665-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 01/07/2020] [Accepted: 03/30/2020] [Indexed: 12/02/2022]
Abstract
Duplication plays an important role in creating drastic changes in genome evolution. In addition to well-known tandem duplication, duplication can occur such that a duplicated DNA fragment is inserted at another location in the genome. Here, we report several genomic regions in the human genome that could be best explained by two types of insertion-based duplication mechanisms, where a duplicated DNA fragment was modified structurally and then inserted into the genome. In one process, the DNA fragment is turned into an extrachromosomal circular DNA, cut somewhere in the circle, and reintegrated into another location in the genome. In the other, the DNA fragment forms a “lariat structure” with a “knot”, the strand is swapped at the knot, and the fragment is then reintegrated into the genome. Our results suggest that insertion-based duplication may not be a simple process; it may involve complicated procedures such as structural modification before reintegration. However, the molecular mechanism has yet to be fully understood.
Affiliation(s)
- Kazuki K Takahashi
- SOKENDAI, The Graduate University for Advanced Studies, Hayama, Kanagawa, 240-0193, Japan; Laboratory of Plant Genetics, Graduate School of Agriculture, Kyoto University, Kyoto, 606-8502, Japan
| | - Hideki Innan
- SOKENDAI, The Graduate University for Advanced Studies, Hayama, Kanagawa, 240-0193, Japan.
| |
|
53
|
Bickhart DM, McClure JC, Schnabel RD, Rosen BD, Medrano JF, Smith TPL. Symposium review: Advances in sequencing technology herald a new frontier in cattle genomics and genome-enabled selection. J Dairy Sci 2020; 103:5278-5290. [PMID: 32331872 DOI: 10.3168/jds.2019-17693] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 10/01/2019] [Accepted: 12/03/2019] [Indexed: 11/19/2022]
Abstract
The cattle reference genome assembly has underpinned major innovations in beef and dairy genetics through genome-enabled selection, including removal of deleterious recessive variants and selection for favorable alleles affecting quantitative production traits. The initial reference assemblies, up to and including UMD3.1 and Btau4.1, were based on a combination of clone-by-clone sequencing of bacterial artificial chromosome clones generated from blood DNA of a Hereford bull and whole-genome shotgun sequencing of blood DNA from his inbred daughter/granddaughter named L1 Dominette 01449 (Dominette). The approach introduced assembly gaps, misassemblies, and errors, and it limited the ability to assemble regions that undergo rearrangement in blood cells, such as immune gene clusters. Nonetheless, the reference supported the creation of genotyping tools and provided a basis for many studies of gene expression. Recently, long-read sequencing technologies have emerged that facilitated a re-assembly of the reference genome, using lung tissue from Dominette to resolve many of the problems and providing a bridge to place historical studies in common context. The new reference, ARS-UCD1.2, successfully assembled germline immune gene clusters and improved overall continuity (i.e., reduction of gaps and inversions) by over 250-fold. This reference properly places nearly all of the legacy genetic markers used for over a decade in the industry. In this review, we discuss the improvements made to the cattle reference; remaining issues present in the assembly; tools developed to support genome-based studies in beef and dairy cattle; and the emergence of newer genome assembly methods that are producing even higher-quality assemblies for other breeds of cattle at a fraction of the cost. 
The new frontier for cattle genomics research will likely include a transition from the individual Hereford reference genome, to a "pan-genome" reference, representing all the DNA segments existing in commonly used cattle breeds, bringing the cattle reference into line with the current direction of human genome research.
Affiliation(s)
- D M Bickhart
- US Dairy Forage Research Center, Agricultural Research Service, USDA, Madison, WI 53705.
| | - J C McClure
- US Dairy Forage Research Center, Agricultural Research Service, USDA, Madison, WI 53705
| | - R D Schnabel
- Division of Animal Sciences, University of Missouri, Columbia, 65211; MU Institute for Data Science and Informatics, University of Missouri, Columbia, 65211
| | - B D Rosen
- Animal Genomics and Improvement Laboratory, Agricultural Research Service, USDA, Beltsville, MD 20705
| | - J F Medrano
- Department of Animal Science, University of California Davis, 95616
| | - T P L Smith
- Meat Animal Research Center, Agricultural Research Service, USDA, Clay Center, NE 68933
| |
|
54
|
Louzada S, Lopes M, Ferreira D, Adega F, Escudeiro A, Gama-Carvalho M, Chaves R. Decoding the Role of Satellite DNA in Genome Architecture and Plasticity-An Evolutionary and Clinical Affair. Genes (Basel) 2020; 11:E72. [PMID: 31936645 PMCID: PMC7017282 DOI: 10.3390/genes11010072] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Received: 12/16/2019] [Revised: 12/29/2019] [Accepted: 01/08/2020] [Indexed: 12/11/2022]
Abstract
Repetitive DNA is a major organizational component of eukaryotic genomes, being intrinsically related with their architecture and evolution. Tandemly repeated satellite DNAs (satDNAs) can be found clustered in specific heterochromatin-rich chromosomal regions, building vital structures like functional centromeres and also dispersed within euchromatin. Interestingly, despite their association to critical chromosomal structures, satDNAs are widely variable among species due to their high turnover rates. This dynamic behavior has been associated with genome plasticity and chromosome rearrangements, leading to the reshaping of genomes. Here we present the current knowledge regarding satDNAs in the light of new genomic technologies, and the challenges in the study of these sequences. Furthermore, we discuss how these sequences, together with other repeats, influence genome architecture, impacting its evolution and association with disease.
Affiliation(s)
- Sandra Louzada
- Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisboa, 1749-016 Lisbon, Portugal
| | - Mariana Lopes
- Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisboa, 1749-016 Lisbon, Portugal
| | - Daniela Ferreira
- Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisboa, 1749-016 Lisbon, Portugal
| | - Filomena Adega
- Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisboa, 1749-016 Lisbon, Portugal
| | - Ana Escudeiro
- Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisboa, 1749-016 Lisbon, Portugal
| | - Margarida Gama-Carvalho
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisboa, 1749-016 Lisbon, Portugal
| | - Raquel Chaves
- Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisboa, 1749-016 Lisbon, Portugal
| |
|
55
|
De Coster W, Van Broeckhoven C. Newest Methods for Detecting Structural Variations. Trends Biotechnol 2019; 37:973-982. [DOI: 10.1016/j.tibtech.2019.02.003] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 12/02/2018] [Revised: 02/08/2019] [Accepted: 02/11/2019] [Indexed: 01/28/2023]
|
56
|
Tan S, Kermasson L, Hoslin A, Jaako P, Faille A, Acevedo-Arozena A, Lengline E, Ranta D, Poirée M, Fenneteau O, Ducou le Pointe H, Fumagalli S, Beaupain B, Nitschké P, Bôle-Feysot C, de Villartay JP, Bellanné-Chantelot C, Donadieu J, Kannengiesser C, Warren AJ, Revy P. EFL1 mutations impair eIF6 release to cause Shwachman-Diamond syndrome. Blood 2019; 134:277-290. [PMID: 31151987 PMCID: PMC6754720 DOI: 10.1182/blood.2018893404] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 12/21/2018] [Accepted: 05/10/2019] [Indexed: 12/15/2022]
Abstract
Shwachman-Diamond syndrome (SDS) is a recessive disorder typified by bone marrow failure and predisposition to hematological malignancies. SDS is predominantly caused by deficiency of the allosteric regulator Shwachman-Bodian-Diamond syndrome protein (SBDS), which cooperates with elongation factor-like GTPase 1 (EFL1) to catalyze release of the ribosome antiassociation factor eIF6 and activate translation. Here, we report biallelic mutations in EFL1 in 3 unrelated individuals with clinical features of SDS. Cellular defects in these individuals include impaired ribosomal subunit joining and attenuated global protein translation as a consequence of defective eIF6 eviction. In mice, Efl1 deficiency recapitulates key aspects of the SDS phenotype. By identifying biallelic EFL1 mutations in SDS, we define this leukemia predisposition disorder as a ribosomopathy that is caused by corruption of a fundamental, conserved mechanism, which licenses entry of the large ribosomal subunit into translation.
Affiliation(s)
- Shengjiang Tan
- Cambridge Institute for Medical Research, Cambridge, United Kingdom
- Department of Haematology, University of Cambridge, Cambridge, United Kingdom
- Wellcome Trust-Medical Research Council Stem Cell Institute, University of Cambridge, Cambridge, United Kingdom
| | - Laëtitia Kermasson
- INSERM Unité Mixte de Recherche 1163, Laboratory of Genome Dynamics in the Immune System, Equipe Labellisée Ligue contre le cancer, Paris, France
- Paris Descartes-Sorbonne Paris Cité University, Imagine Institute, Paris, France
| | - Angela Hoslin
- Medical Research Council Mammalian Genetics Unit, Harwell, United Kingdom
| | - Pekka Jaako
- Cambridge Institute for Medical Research, Cambridge, United Kingdom
- Department of Haematology, University of Cambridge, Cambridge, United Kingdom
- Wellcome Trust-Medical Research Council Stem Cell Institute, University of Cambridge, Cambridge, United Kingdom
| | - Alexandre Faille
- Cambridge Institute for Medical Research, Cambridge, United Kingdom
- Department of Haematology, University of Cambridge, Cambridge, United Kingdom
- Wellcome Trust-Medical Research Council Stem Cell Institute, University of Cambridge, Cambridge, United Kingdom
| | - Abraham Acevedo-Arozena
- Medical Research Council Mammalian Genetics Unit, Harwell, United Kingdom
- Unidad de Investigación, Hospital Universitario de Canarias, La Laguna, Spain
- Instituto de Tecnologías Biomédicas, Universidad de La Laguna, La Laguna, Spain
- Centro Investigación Biomédica en Red Enfermedades Neurodegenerativas, La Laguna, Spain
| | - Etienne Lengline
- Department of Hematology, CRNMR Aplasie Médullaire, Saint-Louis University Hospital, Assistance Publique-Hôpitaux de Paris, Paris, France
| | - Dana Ranta
- Department of Haematology, Centre Hospitalier Universitaire de Nancy, Nancy, France
| | - Maryline Poirée
- Department of Pediatric Hematology-Oncology, Centre Hospitalier Universitaire Lenval, Nice, France
| | - Odile Fenneteau
- Assistance Publique-Hôpitaux de Paris, Laboratory of Hematology, Robert Debré University Hospital, Paris, France
| | - Hubert Ducou le Pointe
- Radiology Department, Armand Trousseau Hospital, Assistance Publique-Hôpitaux de Paris, Paris, France
- Department of Pediatric Imaging, Armand Trousseau Hospital, Sorbonne Universités, Pierre et Marie Curie-Paris University, Paris, France
| | - Stefano Fumagalli
- Institut Necker Enfants Malades, Paris, France
- INSERM, U1151, Université Paris Descartes Sorbonne Cité, Paris, France
| | - Blandine Beaupain
- French Neutropenia Registry, Assistance Publique-Hôpitaux de Paris, Trousseau Hospital, Paris, France
| | - Patrick Nitschké
- INSERM Unité Mixte de Recherche 1163, Bioinformatics Platform, Paris Descartes-Sorbonne Paris Cité University, Imagine Institute, Paris, France
| | - Christine Bôle-Feysot
- INSERM Unité Mixte de Recherche 1163, Genomics Platform, Paris Descartes-Sorbonne Paris Cité University, Imagine Institute, Paris, France
| | - Jean-Pierre de Villartay
- INSERM Unité Mixte de Recherche 1163, Laboratory of Genome Dynamics in the Immune System, Equipe Labellisée Ligue contre le cancer, Paris, France
- Paris Descartes-Sorbonne Paris Cité University, Imagine Institute, Paris, France
| | - Christine Bellanné-Chantelot
- Department of Genetics, Hospital Pitié-Salpêtrière, Assistance Publique-Hôpitaux de Paris, Sorbonne Université, Paris, France
| | - Jean Donadieu
- Service d'Hémato-Oncologie Pédiatrique, Assistance Publique-Hôpitaux de Paris Hôpital Trousseau, Registre des neutropénies-Centre de référence des neutropénies chroniques, Paris, France
| | - Caroline Kannengiesser
- Assistance Publique-Hôpitaux de Paris Service de Génétique, Hôpital Bichat, Paris, France
- Université Paris Diderot, Sorbonne Paris Cité, Paris, France
| | - Alan J Warren
- Cambridge Institute for Medical Research, Cambridge, United Kingdom
- Department of Haematology, University of Cambridge, Cambridge, United Kingdom
- Wellcome Trust-Medical Research Council Stem Cell Institute, University of Cambridge, Cambridge, United Kingdom
| | - Patrick Revy
- INSERM Unité Mixte de Recherche 1163, Laboratory of Genome Dynamics in the Immune System, Equipe Labellisée Ligue contre le cancer, Paris, France
- Paris Descartes-Sorbonne Paris Cité University, Imagine Institute, Paris, France
| |
|
57
|
De Coster W, De Rijk P, De Roeck A, De Pooter T, D'Hert S, Strazisar M, Sleegers K, Van Broeckhoven C. Structural variants identified by Oxford Nanopore PromethION sequencing of the human genome. Genome Res 2019; 29:1178-1187. [PMID: 31186302 PMCID: PMC6633254 DOI: 10.1101/gr.244939.118] [Citation(s) in RCA: 92] [Impact Index Per Article: 15.3] [Received: 10/22/2018] [Accepted: 06/06/2019] [Indexed: 01/17/2023]
Abstract
We sequenced the genome of the Yoruban reference individual NA19240 on the long-read sequencing platform Oxford Nanopore PromethION for evaluation and benchmarking of recently published aligners and germline structural variant calling tools, as well as a comparison with the performance of structural variant calling from short-read sequencing data. The structural variant caller Sniffles after NGMLR or minimap2 alignment provides the most accurate results, but additional confidence or sensitivity can be obtained by a combination of multiple variant callers. Sensitive and fast results can be obtained by minimap2 for alignment and a combination of Sniffles and SVIM for variant identification. We describe a scalable workflow for identification, annotation, and characterization of tens of thousands of structural variants from long-read genome sequencing of an individual or population. By discussing the results of this well-characterized reference individual, we provide an approximation of what can be expected in future long-read sequencing studies aiming for structural variant identification.
Affiliation(s)
- Wouter De Coster
- Neurodegenerative Brain Diseases Group, Center for Molecular Neurology, VIB, 2610 Antwerp, Belgium
- Biomedical Sciences, University of Antwerp, 2610 Antwerp, Belgium
| | - Peter De Rijk
- Biomedical Sciences, University of Antwerp, 2610 Antwerp, Belgium
- Neuromics Support Facility, Center for Molecular Neurology, VIB, 2610 Antwerp, Belgium
| | - Arne De Roeck
- Neurodegenerative Brain Diseases Group, Center for Molecular Neurology, VIB, 2610 Antwerp, Belgium
- Biomedical Sciences, University of Antwerp, 2610 Antwerp, Belgium
| | - Tim De Pooter
- Biomedical Sciences, University of Antwerp, 2610 Antwerp, Belgium
- Neuromics Support Facility, Center for Molecular Neurology, VIB, 2610 Antwerp, Belgium
| | - Svenn D'Hert
- Biomedical Sciences, University of Antwerp, 2610 Antwerp, Belgium
- Neuromics Support Facility, Center for Molecular Neurology, VIB, 2610 Antwerp, Belgium
| | - Mojca Strazisar
- Biomedical Sciences, University of Antwerp, 2610 Antwerp, Belgium
- Neuromics Support Facility, Center for Molecular Neurology, VIB, 2610 Antwerp, Belgium
| | - Kristel Sleegers
- Neurodegenerative Brain Diseases Group, Center for Molecular Neurology, VIB, 2610 Antwerp, Belgium
- Biomedical Sciences, University of Antwerp, 2610 Antwerp, Belgium
| | - Christine Van Broeckhoven
- Neurodegenerative Brain Diseases Group, Center for Molecular Neurology, VIB, 2610 Antwerp, Belgium
- Biomedical Sciences, University of Antwerp, 2610 Antwerp, Belgium
| |
|
58
|
Pervaiz N, Shakeel N, Qasim A, Zehra R, Anwar S, Rana N, Xue Y, Zhang Z, Bao Y, Abbasi AA. Evolutionary history of the human multigene families reveals widespread gene duplications throughout the history of animals. BMC Evol Biol 2019; 19:128. [PMID: 31221090 PMCID: PMC6585022 DOI: 10.1186/s12862-019-1441-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 10/08/2018] [Accepted: 05/27/2019] [Indexed: 11/10/2022]
Abstract
BACKGROUND The hypothesis that vertebrates have experienced two ancient whole genome duplications (WGDs) is of central interest to evolutionary biology and has been implicated in the evolution of developmental complexity. Three-way and four-way paralogy regions in human and other vertebrate genomes are considered vital evidence in support of this hypothesis. Alternatively, it has been proposed that such paralogy regions were created by small-scale duplications that occurred at different intervals over the evolution of life. RESULTS To address this debate, the present study investigates the evolutionary history of multigene families with at least three-fold representation on human chromosomes 1, 2, 8 and 20. Phylogenetic analysis and tree topology comparisons classified the members of 36 multigene families into four distinct co-duplicated groups. Gene families falling within the same co-duplicated group might have duplicated together, whereas genes belonging to different co-duplicated groups might have distinct evolutionary origins. CONCLUSION Taken together with previous investigations, the current study yielded no proof in favor of the WGD hypothesis. Rather, it appears that the vertebrate genome evolved as a result of small-scale duplication events that span the entire history of animals.
Affiliation(s)
- Nashaiman Pervaiz
- National Center for Bioinformatics, Programme of Comparative and Evolutionary Genomics, Faculty of Biological Sciences, Quaid-i-Azam University, Islamabad, 45320, Pakistan
| | - Nazia Shakeel
- National Center for Bioinformatics, Programme of Comparative and Evolutionary Genomics, Faculty of Biological Sciences, Quaid-i-Azam University, Islamabad, 45320, Pakistan
| | - Ayesha Qasim
- National Center for Bioinformatics, Programme of Comparative and Evolutionary Genomics, Faculty of Biological Sciences, Quaid-i-Azam University, Islamabad, 45320, Pakistan
| | - Rabail Zehra
- National Center for Bioinformatics, Programme of Comparative and Evolutionary Genomics, Faculty of Biological Sciences, Quaid-i-Azam University, Islamabad, 45320, Pakistan
| | - Saneela Anwar
- National Center for Bioinformatics, Programme of Comparative and Evolutionary Genomics, Faculty of Biological Sciences, Quaid-i-Azam University, Islamabad, 45320, Pakistan
| | - Neenish Rana
- National Center for Bioinformatics, Programme of Comparative and Evolutionary Genomics, Faculty of Biological Sciences, Quaid-i-Azam University, Islamabad, 45320, Pakistan
| | - Yongbiao Xue
- BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101; University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Zhang Zhang
- BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101; University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Yiming Bao
- BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101; University of Chinese Academy of Sciences, Beijing, 100049, China.
| | - Amir Ali Abbasi
- National Center for Bioinformatics, Programme of Comparative and Evolutionary Genomics, Faculty of Biological Sciences, Quaid-i-Azam University, Islamabad, 45320, Pakistan.
| |
|
59
|
Haq F, Saeed U, Khalid R, Qasim M, Mehmood M. Phylogenetic analyses of human 1/2/8/20 paralogons suggest segmental duplications during animal evolution. 3 Biotech 2019; 9:233. [PMID: 31139548 DOI: 10.1007/s13205-019-1768-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 09/25/2017] [Accepted: 05/16/2019] [Indexed: 11/28/2022]
Abstract
Susumu Ohno hypothesized that the diversity of vertebrate gene families and genomes is due to two rounds of whole genome duplication (also referred to as the 2R hypothesis). The quadruplicate paralogous blocks present on chromosomes 1/2/8/20 are taken as one line of evidence in favor of 2R. In this study, we investigated whether 2R has shaped vertebrate evolution using gene families residing on chromosomes 1/2/8/20. The evolutionary history of 22 gene families (11 from the current study and 11 from the previous study) with triplicated or quadruplicated distribution on human chromosomes 1/2/8/20 was evaluated by phylogenetic analysis. The phylogenetic analysis was performed using high-quality whole genomic sequence data of multiple species with neighbor-joining (NJ) and maximum likelihood (ML) methods. The phylogenetic tree topology of these gene families revealed variable duplication time points during invertebrate-vertebrate evolution. The topology comparison approach categorized the 22 gene families into three groups. Tree topologies of ten gene families fell into Group 1 (duplications prior to the invertebrate-vertebrate split), four into Group 2 (i.e., (AB) (C) (D), a topology incongruent with 2R) and eight into Group 3 (((AB) (CD)), a 2R-congruent topology). Therefore, taking together the current and previous data on 1/2/8/20 paralogons, we propose that, in addition to whole genome duplication events, the current developmental, morphological and genomic complexity of vertebrate genomes may also have originated through segmental duplications occurring at varying time points during the course of animal evolution.
Affiliation(s)
- Farhan Haq
- Department of Biosciences, COMSATS University Islamabad, Park Road, Chak Shehzad, Islamabad, Pakistan
| | - Usman Saeed
- Department of Genome Oriented Bioinformatics, Technische Universität München, Wissenschaftzentrum Weihenstephan, Munich, Germany
| | - Rida Khalid
- Department of Biosciences, COMSATS University Islamabad, Park Road, Chak Shehzad, Islamabad, Pakistan
| | | | - Maryam Mehmood
- Department of Biosciences, COMSATS University Islamabad, Park Road, Chak Shehzad, Islamabad, Pakistan
| |
|
60
|
Multi-platform discovery of haplotype-resolved structural variation in human genomes. Nat Commun 2019. [PMID: 30992455 DOI: 10.1038/s41467-018-08148-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 01/22/2023]
Abstract
The incomplete identification of structural variants (SVs) from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long-read, short-read, strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three trios to define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,054 indel variants (<50 bp) and 27,622 SVs (≥50 bp) per genome. We also discover 156 inversions per genome and 58 of the inversions intersect with the critical regions of recurrent microdeletion and microduplication syndromes. Taken together, our SV callsets represent a three to sevenfold increase in SV detection compared to most standard high-throughput sequencing studies, including those from the 1000 Genomes Project. The methods and the dataset presented serve as a gold standard for the scientific community allowing us to make recommendations for maximizing structural variation sensitivity for future genome sequencing studies.
Collapse
|
61
|
Multi-platform discovery of haplotype-resolved structural variation in human genomes. Nat Commun 2019; 10:1784. [PMID: 30992455 PMCID: PMC6467913 DOI: 10.1038/s41467-018-08148-z] [Citation(s) in RCA: 539] [Impact Index Per Article: 89.8] [Received: 10/24/2018] [Accepted: 12/20/2018] [Indexed: 12/30/2022] Open
Abstract
The incomplete identification of structural variants (SVs) from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long-read, short-read, strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three trios to define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,054 indel variants (<50 bp) and 27,622 SVs (≥50 bp) per genome. We also discover 156 inversions per genome and 58 of the inversions intersect with the critical regions of recurrent microdeletion and microduplication syndromes. Taken together, our SV callsets represent a three to sevenfold increase in SV detection compared to most standard high-throughput sequencing studies, including those from the 1000 Genomes Project. The methods and the dataset presented serve as a gold standard for the scientific community allowing us to make recommendations for maximizing structural variation sensitivity for future genome sequencing studies.
|
62
|
Lin YL, Gokcumen O. Fine-Scale Characterization of Genomic Structural Variation in the Human Genome Reveals Adaptive and Biomedically Relevant Hotspots. Genome Biol Evol 2019; 11:1136-1151. [PMID: 30887040 PMCID: PMC6475128 DOI: 10.1093/gbe/evz058] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Accepted: 03/16/2019] [Indexed: 12/25/2022] Open
Abstract
Genomic structural variants (SVs) are distributed nonrandomly across the human genome. The "hotspots" of SVs have been implicated in evolutionary innovations, as well as medical conditions. However, the evolutionary and biomedical features of these hotspots remain incompletely understood. Here, we analyzed data from 2,504 genomes to construct a refined map of 1,148 SV hotspots in human genomes. We confirmed that segmental duplication-related nonallelic homologous recombination is an important mechanistic driver of SV hotspot formation. However, to our surprise, we also found that a majority of SVs in hotspots do not form through such recombination-based mechanisms, suggesting diverse mechanistic and selective forces shaping hotspots. Indeed, our evolutionary analyses showed that the majority of SV hotspots are within gene-poor regions and evolve under relaxed negative selection or neutrality. However, we still found a small subset of SV hotspots harboring genes that are enriched for anthropologically crucial functions and evolve under geography-specific and balancing adaptive forces. These include two independent hotspots on different chromosomes affecting alpha and beta hemoglobin gene clusters. Biomedically, we found that the SV hotspots coincide with breakpoints of clinically relevant, large de novo SVs, significantly more often than genome-wide expectations. For example, we showed that the breakpoints of multiple large SVs, which lead to idiopathic short stature, coincide with SV hotspots. Therefore, the mutational instability in SV hotspots likely enables chromosomal breaks that lead to pathogenic structural variation formations. Overall, our study contributes to a better understanding of the mutational and adaptive landscape of the genome.
Affiliation(s)
- Yen-Lung Lin
  - Department of Biological Sciences, University at Buffalo
- Omer Gokcumen
  - Department of Biological Sciences, University at Buffalo
|
63
|
Shao Y, Chen C, Shen H, He BZ, Yu D, Jiang S, Zhao S, Gao Z, Zhu Z, Chen X, Fu Y, Chen H, Gao G, Long M, Zhang YE. GenTree, an integrated resource for analyzing the evolution and function of primate-specific coding genes. Genome Res 2019; 29:682-696. [PMID: 30862647 PMCID: PMC6442393 DOI: 10.1101/gr.238733.118] [Citation(s) in RCA: 65] [Impact Index Per Article: 10.8] [Received: 04/19/2018] [Accepted: 01/29/2019] [Indexed: 12/13/2022]
Abstract
The origination of new genes contributes to phenotypic evolution in humans. Two major challenges in the study of new genes are the inference of gene ages and annotation of their protein-coding potential. To tackle these challenges, we created GenTree, an integrated online database that compiles age inferences from three major methods together with functional genomic data for new genes. Genome-wide comparison of the age inference methods revealed that the synteny-based pipeline (SBP) is most suited for recently duplicated genes, whereas the protein-family–based methods are useful for ancient genes. For SBP-dated primate-specific protein-coding genes (PSGs), we performed manual evaluation based on published PSG lists and showed that SBP generated a conservative data set of PSGs by masking less reliable syntenic regions. After assessing the coding potential based on evolutionary constraint and peptide evidence from proteomic data, we curated a list of 254 PSGs with different levels of protein evidence. This list also includes 41 candidate misannotated pseudogenes that encode primate-specific short proteins. Coexpression analysis showed that PSGs are preferentially recruited into organs with rapidly evolving pathways such as spermatogenesis, immune response, mother–fetus interaction, and brain development. For brain development, primate-specific KRAB zinc-finger proteins (KZNFs) are specifically up-regulated in the mid-fetal stage, which may have contributed to the evolution of this critical stage. Altogether, hundreds of PSGs are either recruited to processes under strong selection pressure or to processes supporting an evolving novel organ.
Affiliation(s)
- Yi Shao
  - Key Laboratory of Zoological Systematics and Evolution and State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Chunyan Chen
  - Key Laboratory of Zoological Systematics and Evolution and State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Hao Shen
  - College of Computers, Hunan University of Technology, Zhuzhou Hunan 412007, China
- Bin Z He
  - FAS Center for Systems Biology and Howard Hughes Medical Institute, Harvard University, Cambridge, Massachusetts 02138, USA
- Daqi Yu
  - Key Laboratory of Zoological Systematics and Evolution and State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Shuai Jiang
  - State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Center for Bioinformatics, Peking University, Beijing 100871, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing 100871, China
- Shilei Zhao
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Zhiqiang Gao
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - National Center for Mathematics and Interdisciplinary Sciences, Key Laboratory of Random Complex Structures and Data Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, 100190, China
- Zhenglin Zhu
  - School of Life Sciences, Chongqing University, Chongqing 400044, China
- Xi Chen
  - Wuhan Institute of Biotechnology, Wuhan 430072, China
  - Medical Research Institute, Wuhan University, Wuhan 430072, China
- Yan Fu
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - National Center for Mathematics and Interdisciplinary Sciences, Key Laboratory of Random Complex Structures and Data Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, 100190, China
- Hua Chen
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - CAS Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650223, China
- Ge Gao
  - State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Center for Bioinformatics, Peking University, Beijing 100871, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing 100871, China
- Manyuan Long
  - Department of Ecology and Evolution, The University of Chicago, Chicago, Illinois 60637, USA
- Yong E Zhang
  - Key Laboratory of Zoological Systematics and Evolution and State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - CAS Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650223, China
|
64
|
Maggiolini FAM, Cantsilieris S, D’Addabbo P, Manganelli M, Coe BP, Dumont BL, Sanders AD, Pang AWC, Vollger MR, Palumbo O, Palumbo P, Accadia M, Carella M, Eichler EE, Antonacci F. Genomic inversions and GOLGA core duplicons underlie disease instability at the 15q25 locus. PLoS Genet 2019; 15:e1008075. [PMID: 30917130 PMCID: PMC6436712 DOI: 10.1371/journal.pgen.1008075] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 10/10/2018] [Accepted: 03/07/2019] [Indexed: 11/19/2022] Open
Abstract
Human chromosome 15q25 is involved in several disease-associated structural rearrangements, including microdeletions and chromosomal markers with inverted duplications. Using comparative fluorescence in situ hybridization, strand-sequencing, single-molecule, real-time sequencing and Bionano optical mapping analyses, we investigated the organization of the 15q25 region in human and nonhuman primates. We found that two independent inversions occurred in this region after the fission event that gave rise to phylogenetic chromosomes XIV and XV in humans and great apes. One of these inversions is still polymorphic in the human population today and may confer differential susceptibility to 15q25 microdeletions and inverted duplications. The inversion breakpoints map within segmental duplications containing core duplicons of the GOLGA gene family and correspond to the site of an ancestral centromere, which became inactivated about 25 million years ago. The inactivation of this centromere likely released segmental duplications from recombination repression typical of centromeric regions. We hypothesize that this increased the frequency of ectopic recombination creating a hotspot of hominid inversions where dispersed GOLGA core elements now predispose this region to recurrent genomic rearrangements associated with disease.
Affiliation(s)
- Stuart Cantsilieris
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, United States of America
- Pietro D’Addabbo
  - Dipartimento di Biologia, Università degli Studi di Bari “Aldo Moro”, Bari, Italy
- Michele Manganelli
  - Dipartimento di Biologia, Università degli Studi di Bari “Aldo Moro”, Bari, Italy
- Bradley P. Coe
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, United States of America
- Beth L. Dumont
  - The Jackson Laboratory, Bar Harbor, ME, United States of America
- Ashley D. Sanders
  - European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Meyerhofstraße 1, Heidelberg, Germany
- Mitchell R. Vollger
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, United States of America
- Orazio Palumbo
  - Medical Genetics Unit, IRCCS Casa Sollievo della Sofferenza, San Giovanni Rotondo (FG), Italy
- Pietro Palumbo
  - Medical Genetics Unit, IRCCS Casa Sollievo della Sofferenza, San Giovanni Rotondo (FG), Italy
- Maria Accadia
  - Medical Genetics Service, Hospital “Cardinale G. Panico”, Via San Pio X n°4, Tricase, LE, Italy
- Massimo Carella
  - Medical Genetics Unit, IRCCS Casa Sollievo della Sofferenza, San Giovanni Rotondo (FG), Italy
- Evan E. Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, United States of America
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, United States of America
- Francesca Antonacci
  - Dipartimento di Biologia, Università degli Studi di Bari “Aldo Moro”, Bari, Italy
|
65
|
Farré M, Kim J, Proskuryakova AA, Zhang Y, Kulemzina AI, Li Q, Zhou Y, Xiong Y, Johnson JL, Perelman PL, Johnson WE, Warren WC, Kukekova AV, Zhang G, O'Brien SJ, Ryder OA, Graphodatsky AS, Ma J, Lewin HA, Larkin DM. Evolution of gene regulation in ruminants differs between evolutionary breakpoint regions and homologous synteny blocks. Genome Res 2019; 29:576-589. [PMID: 30760546 PMCID: PMC6442394 DOI: 10.1101/gr.239863.118] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 06/25/2018] [Accepted: 02/08/2019] [Indexed: 02/02/2023]
Abstract
The role of chromosome rearrangements in driving evolution has been a long-standing question of evolutionary biology. Here we focused on ruminants as a model to assess how rearrangements may have contributed to the evolution of gene regulation. Using reconstructed ancestral karyotypes of Cetartiodactyls, Ruminants, Pecorans, and Bovids, we traced patterns of gross chromosome changes. We found that the lineage leading to the ruminant ancestor after the split from other cetartiodactyls was characterized by mostly intrachromosomal changes, whereas the lineage leading to the pecoran ancestor (including all livestock ruminants) included multiple interchromosomal changes. We observed that the liver cell putative enhancers in the ruminant evolutionary breakpoint regions are highly enriched for DNA sequences under selective constraint acting on lineage-specific transposable elements (TEs) and a set of 25 specific transcription factor (TF) binding motifs associated with recently active TEs. Coupled with gene expression data, we found that genes near ruminant breakpoint regions exhibit more divergent expression profiles among species, particularly in cattle, which is consistent with the phylogenetic origin of these breakpoint regions. This divergence was significantly greater in genes with enhancers that contain at least one of the 25 specific TF binding motifs and located near bovidae-to-cattle lineage breakpoint regions. Taken together, by combining ancestral karyotype reconstructions with analysis of cis regulatory element and gene expression evolution, our work demonstrated that lineage-specific regulatory elements colocalized with gross chromosome rearrangements may have provided valuable functional modifications that helped to shape ruminant evolution.
Affiliation(s)
- Marta Farré
  - Royal Veterinary College, University of London, London NW1 0TU, United Kingdom
- Jaebum Kim
  - Department of Biomedical Science and Engineering, Konkuk University, Seoul 05029, Korea
- Anastasia A Proskuryakova
  - Institute of Molecular and Cellular Biology, SB RAS, Novosibirsk 630090, Russia
  - Synthetic Biology Unit, Novosibirsk State University, Novosibirsk 630090, Russia
- Yang Zhang
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
- Qiye Li
  - China National GeneBank, BGI-Shenzhen, Shenzhen 518083, China
- Yang Zhou
  - China National GeneBank, BGI-Shenzhen, Shenzhen 518083, China
- Yingqi Xiong
  - China National GeneBank, BGI-Shenzhen, Shenzhen 518083, China
- Jennifer L Johnson
  - Department of Animal Sciences, College of Agricultural, Consumer and Environmental Sciences, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Polina L Perelman
  - Institute of Molecular and Cellular Biology, SB RAS, Novosibirsk 630090, Russia
  - Synthetic Biology Unit, Novosibirsk State University, Novosibirsk 630090, Russia
- Warren E Johnson
  - Smithsonian Conservation Biology Institute, National Zoological Park, Front Royal, Virginia 22630, USA
  - Walter Reed Biosystematics Unit, Museum Support Center, Smithsonian Institution, Suitland, Maryland 20746, USA
- Wesley C Warren
  - Bond Life Sciences Center, University of Missouri, Columbia, Missouri 63201, USA
- Anna V Kukekova
  - Department of Animal Sciences, College of Agricultural, Consumer and Environmental Sciences, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Guojie Zhang
  - China National GeneBank, BGI-Shenzhen, Shenzhen 518083, China
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China
  - Centre for Social Evolution, Department of Biology, University of Copenhagen, DK-2100 Copenhagen, Denmark
- Stephen J O'Brien
  - Theodosius Dobzhansky Center for Genome Bioinformatics, St. Petersburg State University, St. Petersburg 199004, Russia
  - Guy Harvey Oceanographic Center, Halmos College of Natural Sciences and Oceanography, Nova Southeastern University, Fort Lauderdale, Florida 33004, USA
- Oliver A Ryder
  - Institute for Conservation Research, San Diego Zoo, Escondido, California 92027, USA
- Alexander S Graphodatsky
  - Institute of Molecular and Cellular Biology, SB RAS, Novosibirsk 630090, Russia
  - Synthetic Biology Unit, Novosibirsk State University, Novosibirsk 630090, Russia
- Jian Ma
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
- Harris A Lewin
  - Department of Evolution and Ecology and the UC Davis Genome Center, University of California, Davis, California 95616, USA
- Denis M Larkin
  - Royal Veterinary College, University of London, London NW1 0TU, United Kingdom
  - The Federal Research Center Institute of Cytology and Genetics, The Siberian Branch of the Russian Academy of Sciences (ICG SB RAS), Novosibirsk 630090, Russia
|
66
|
Tang W, Mun S, Joshi A, Han K, Liang P. Mobile elements contribute to the uniqueness of human genome with 15,000 human-specific insertions and 14 Mbp sequence increase. DNA Res 2019; 25:521-533. [PMID: 30052927 PMCID: PMC6191304 DOI: 10.1093/dnares/dsy022] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Received: 03/28/2018] [Accepted: 06/20/2018] [Indexed: 02/02/2023] Open
Abstract
Mobile elements (MEs) collectively contribute to at least 50% of the human genome. Due to their past incremental accumulation and ongoing DNA transposition, MEs serve as a significant source for both inter- and intra-species genetic and phenotypic diversity during primate and human evolution. By making use of the most recent genome sequences for human and many other closely related primates and robust multi-way comparative genomic approach, we identified a total of 14,870 human-specific MEs (HS-MEs) with more than 8,000 being newly identified. Collectively, these HS-MEs contribute to a total of 14.2 Mbp net genome sequence increase. Several new observations were made based on these HS-MEs, including the finding of Y chromosome as a strikingly hot target for HS-MEs and a strong mutual preference for SINE-R/VNTR/Alu (SVAs). Furthermore, ∼8,000 of these HS-MEs were found to locate in the vicinity of ∼4,900 genes, and collectively they contribute to ∼84 kb sequences in the human reference transcriptome in association with over 300 genes, including protein-coding sequences for 40 genes. In conclusion, our results demonstrate that MEs made a significant contribution to the evolution of human genome by participating in gene function in a human-specific fashion.
Affiliation(s)
- Wanxiangfu Tang
  - Department of Biological Sciences, Brock University, St. Catharines, ON, Canada
- Seyoung Mun
  - Department of Nanobiomedical Science & BK21 PLUS NBM Global Research, Center for Regenerative Medicine, Dankook University, Cheonan, Republic of Korea
- Aditya Joshi
  - Department of Biological Sciences, Brock University, St. Catharines, ON, Canada
- Kyudong Han
  - Department of Nanobiomedical Science & BK21 PLUS NBM Global Research, Center for Regenerative Medicine, Dankook University, Cheonan, Republic of Korea
- Ping Liang
  - Department of Biological Sciences, Brock University, St. Catharines, ON, Canada
|
67
|
Belyeu JR, Nicholas TJ, Pedersen BS, Sasani TA, Havrilla JM, Kravitz SN, Conway ME, Lohman BK, Quinlan AR, Layer RM. SV-plaudit: A cloud-based framework for manually curating thousands of structural variants. Gigascience 2018; 7:5026174. [PMID: 29860504 PMCID: PMC6030999 DOI: 10.1093/gigascience/giy064] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 03/20/2018] [Accepted: 05/25/2018] [Indexed: 01/21/2023] Open
Abstract
SV-plaudit is a framework for rapidly curating structural variant (SV) predictions. For each SV, we generate an image that visualizes the coverage and alignment signals from a set of samples. Images are uploaded to our cloud framework where users assess the quality of each image using a client-side web application. Reports can then be generated as a tab-delimited file or annotated Variant Call Format (VCF) file. As a proof of principle, nine researchers collaborated for 1 hour to evaluate 1,350 SVs each. We anticipate that SV-plaudit will become a standard step in variant calling pipelines and the crowd-sourced curation of other biological results. Code available at https://github.com/jbelyeu/SV-plaudit. Demonstration video available at https://www.youtube.com/watch?v=ono8kHMKxDs.
Affiliation(s)
- Jonathan R Belyeu
  - Department of Human Genetics, University of Utah, 15 S 2030 E, Salt Lake City, UT, USA
  - USTAR Center for Genetic Discovery, University of Utah, Salt Lake City, UT, USA
- Thomas J Nicholas
  - Department of Human Genetics, University of Utah, 15 S 2030 E, Salt Lake City, UT, USA
  - USTAR Center for Genetic Discovery, University of Utah, Salt Lake City, UT, USA
- Brent S Pedersen
  - Department of Human Genetics, University of Utah, 15 S 2030 E, Salt Lake City, UT, USA
  - USTAR Center for Genetic Discovery, University of Utah, Salt Lake City, UT, USA
- Thomas A Sasani
  - Department of Human Genetics, University of Utah, 15 S 2030 E, Salt Lake City, UT, USA
  - USTAR Center for Genetic Discovery, University of Utah, Salt Lake City, UT, USA
- James M Havrilla
  - Department of Human Genetics, University of Utah, 15 S 2030 E, Salt Lake City, UT, USA
  - USTAR Center for Genetic Discovery, University of Utah, Salt Lake City, UT, USA
- Stephanie N Kravitz
  - Department of Human Genetics, University of Utah, 15 S 2030 E, Salt Lake City, UT, USA
  - USTAR Center for Genetic Discovery, University of Utah, Salt Lake City, UT, USA
- Megan E Conway
  - Department of Human Genetics, University of Utah, 15 S 2030 E, Salt Lake City, UT, USA
- Brian K Lohman
  - Department of Human Genetics, University of Utah, 15 S 2030 E, Salt Lake City, UT, USA
  - USTAR Center for Genetic Discovery, University of Utah, Salt Lake City, UT, USA
- Aaron R Quinlan
  - Department of Human Genetics, University of Utah, 15 S 2030 E, Salt Lake City, UT, USA
  - USTAR Center for Genetic Discovery, University of Utah, Salt Lake City, UT, USA
  - Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, USA
- Ryan M Layer
  - Department of Human Genetics, University of Utah, 15 S 2030 E, Salt Lake City, UT, USA
  - USTAR Center for Genetic Discovery, University of Utah, Salt Lake City, UT, USA
|
68
|
Shao H, Zhou C, Cao MD, Coin LJM. Ongoing human chromosome end extension revealed by analysis of BioNano and nanopore data. Sci Rep 2018; 8:16616. [PMID: 30413723 PMCID: PMC6226469 DOI: 10.1038/s41598-018-34774-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/04/2018] [Accepted: 10/22/2018] [Indexed: 11/08/2022] Open
Abstract
The majority of human chromosome ends remain incompletely assembled due to their highly repetitive structure. In this study, we use BioNano data to anchor and extend chromosome ends from two European trios as well as two unrelated Asian genomes. At least 11 BioNano assembled chromosome ends are structurally divergent from the reference genome, including both missing sequence and extensions. These extensions are heritable and in some cases divergent between Asian and European samples. Six out of nine predicted extension sequences from NA12878 can be confirmed and filled by nanopore data. We identify two multi-kilobase sequence families both enriched more than 100-fold in extension sequence (p-values < 1e-5) whose origins can be traced to interstitial sequence on ancestral primate chromosome 7. Extensive sub-telomeric duplication of these families has occurred in the human lineage subsequent to divergence from chimpanzees.
Affiliation(s)
- Haojing Shao
  - Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD, 4072, Australia
- Chenxi Zhou
  - Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD, 4072, Australia
- Minh Duc Cao
  - Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD, 4072, Australia
- Lachlan J M Coin
  - Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD, 4072, Australia
|
69
|
Sahlin K, Tomaszkiewicz M, Makova KD, Medvedev P. Deciphering highly similar multigene family transcripts from Iso-Seq data with IsoCon. Nat Commun 2018; 9:4601. [PMID: 30389934 PMCID: PMC6214943 DOI: 10.1038/s41467-018-06910-x] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Received: 01/22/2018] [Accepted: 09/29/2018] [Indexed: 12/30/2022] Open
Abstract
A significant portion of genes in vertebrate genomes belongs to multigene families, with each family containing several gene copies whose presence/absence, as well as isoform structure, can be highly variable across individuals. Existing de novo techniques for assaying the sequences of such highly-similar gene families fall short of reconstructing end-to-end transcripts with nucleotide-level precision or assigning alternatively spliced transcripts to their respective gene copies. We present IsoCon, a high-precision method using long PacBio Iso-Seq reads to tackle this challenge. We apply IsoCon to nine Y chromosome ampliconic gene families and show that it outperforms existing methods on both experimental and simulated data. IsoCon has allowed us to detect an unprecedented number of novel isoforms and has opened the door for unraveling the structure of many multigene families and gaining a deeper understanding of genome evolution and human diseases.
Affiliation(s)
- Kristoffer Sahlin
  - Department of Computer Science and Engineering, Pennsylvania State University, University Park, PA, 16802, USA
- Marta Tomaszkiewicz
  - Department of Biology, Pennsylvania State University, University Park, PA, 16802, USA
- Kateryna D Makova
  - Department of Biology, Pennsylvania State University, University Park, PA, 16802, USA
  - Center for Medical Genomics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, 16802, USA
  - Center for Computational Biology and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, 16802, USA
- Paul Medvedev
  - Department of Computer Science and Engineering, Pennsylvania State University, University Park, PA, 16802, USA
  - Center for Medical Genomics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, 16802, USA
  - Center for Computational Biology and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, 16802, USA
  - Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA, 16802, USA
|
70
|
Hartasánchez DA, Brasó-Vives M, Heredia-Genestar JM, Pybus M, Navarro A. Effect of Collapsed Duplications on Diversity Estimates: What to Expect. Genome Biol Evol 2018; 10:2899-2905. [PMID: 30364947 PMCID: PMC6239678 DOI: 10.1093/gbe/evy223] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Accepted: 10/08/2018] [Indexed: 12/19/2022] Open
Abstract
The study of segmental duplications (SDs) and copy-number variants (CNVs) is of great importance in the fields of genomics and evolution. However, SDs and CNVs are usually excluded from genome-wide scans for natural selection. Because of high identity between copies, SDs and CNVs that are not included in reference genomes are prone to being collapsed (that is, mistakenly aligned to the same region) when aligning sequence data from single individuals to the reference. Such collapsed duplications are additionally challenging because concerted evolution between duplications alters their site frequency spectrum and linkage disequilibrium patterns. To investigate the potential effect of collapsed duplications upon natural selection scans, we obtained expectations for four summary statistics from simulations of duplications evolving under a range of interlocus gene conversion and crossover rates. We confirm that summary statistics traditionally used to detect the action of natural selection on DNA sequences cannot be applied to SDs and CNVs, since in some cases values for known duplications mimic selective signatures. As a proof of concept of the pervasiveness of collapsed duplications, we analyzed data from the 1000 Genomes Project. We find that, within regions identified as variable in copy number, diversity between individuals with the duplication is consistently higher than between individuals without the duplication. Furthermore, the frequency of single nucleotide variants (SNVs) deviating from Hardy-Weinberg Equilibrium is higher in individuals with the duplication, which strongly suggests that higher diversity is a consequence of collapsed duplications and incorrect evaluation of SNVs within these CNV regions.
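The Hardy-Weinberg argument in this abstract can be illustrated with a small Python sketch. It is not the authors' pipeline: the function name and the genotype counts are hypothetical, and it assumes the simplest case, in which a fixed difference between two collapsed copies is called as a heterozygous SNV in every individual carrying the duplication.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic for deviation from Hardy-Weinberg proportions.

    Takes observed genotype counts (AA, AB, BB) at one biallelic site.
    Hypothetical helper for illustration only.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation to avoid division by zero.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# A site in Hardy-Weinberg proportions gives a statistic of zero.
balanced = hwe_chi_square(25, 50, 25)

# Assumed collapsed-duplication case: every carrier looks heterozygous,
# so the statistic is far above the usual 3.84 cutoff (1 d.f., alpha = 0.05).
collapsed = hwe_chi_square(0, 100, 0)
```

Under these assumptions the excess of apparent heterozygotes, rather than any real allele-frequency pattern, drives the deviation, which is consistent with the abstract's reading of such SNVs as artifacts of collapsed copies.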
Affiliation(s)
- Diego A Hartasánchez: Institute of Evolutionary Biology (Universitat Pompeu Fabra - CSIC), PRBB, Barcelona, Catalonia, Spain; Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain; Laboratoire de Biométrie et Biologie Évolutive UMR 5558, Université de Lyon, Université Lyon 1, CNRS, Villeurbanne, France
- Marina Brasó-Vives: Institute of Evolutionary Biology (Universitat Pompeu Fabra - CSIC), PRBB, Barcelona, Catalonia, Spain; Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
- Jose Maria Heredia-Genestar: Institute of Evolutionary Biology (Universitat Pompeu Fabra - CSIC), PRBB, Barcelona, Catalonia, Spain; Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
- Marc Pybus: Institute of Evolutionary Biology (Universitat Pompeu Fabra - CSIC), PRBB, Barcelona, Catalonia, Spain; Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
- Arcadi Navarro: Institute of Evolutionary Biology (Universitat Pompeu Fabra - CSIC), PRBB, Barcelona, Catalonia, Spain; Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain; National Institute for Bioinformatics (INB), Barcelona, Catalonia, Spain; Centre for Genomic Regulation (CRG), Barcelona, Catalonia, Spain
|
71
|
Moshiri N, Mirarab S. A Two-State Model of Tree Evolution and Its Applications to Alu Retrotransposition. Syst Biol 2018; 67:475-489. [PMID: 29165679 DOI: 10.1093/sysbio/syx088] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 03/27/2017] [Accepted: 11/15/2017] [Indexed: 11/14/2022] Open
Abstract
Models of tree evolution have mostly focused on capturing the cladogenesis processes behind speciation. Processes that drive the evolution of genomic elements, such as repeats, are not necessarily captured by these existing models. In this article, we design a model of tree evolution that we call the dual-birth model, and we show how it can be useful in studying the evolution of short Alu repeats found in abundance in the human genome. The dual-birth model extends the traditional birth-only model to have two rates of propagation: one for active nodes, which propagate often, and another for inactive nodes, which activate at a lower rate and then start propagating. Adjusting the ratio of the rates controls the expected tree balance. We present several theoretical results under the dual-birth model, introduce parameter estimation techniques, and study the properties of the model in simulations. We then use the dual-birth model to estimate the number of active Alu elements and their rates of propagation and activation in the human genome based on a large phylogenetic tree that we build from close to one million Alu sequences.
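The two-rate process summarized in this abstract can be sketched as a small stochastic simulation. This is an illustration built only from the abstract's wording, not the authors' software: the function name `simulate_dual_birth`, the starting state, and the assumption that every split yields one active and one inactive child are all hypothetical choices made here for the sketch.

```python
import random


def simulate_dual_birth(n_leaves, rate_active, rate_inactive, seed=0):
    """Simulate leaf counts under an assumed two-rate birth process.

    Returns (active_leaves, inactive_leaves, elapsed_time).
    """
    if n_leaves < 2:
        raise ValueError("n_leaves must be at least 2")
    rng = random.Random(seed)
    # Assumed starting state: the root has split into one active
    # and one inactive leaf.
    active, inactive = 1, 1
    elapsed = 0.0
    while active + inactive < n_leaves:
        total_rate = active * rate_active + inactive * rate_inactive
        # Waiting time to the next split is exponential in the total rate.
        elapsed += rng.expovariate(total_rate)
        if rng.random() < active * rate_active / total_rate:
            # An active leaf propagates: it is replaced by one active
            # and one inactive child, so only the inactive count grows.
            inactive += 1
        else:
            # An inactive leaf activates and splits into one active and
            # one inactive child, so only the active count grows.
            active += 1
    return active, inactive, elapsed


# Active leaves propagating ten times faster than inactive ones activate.
print(simulate_dual_birth(1000, rate_active=10.0, rate_inactive=1.0, seed=1))
```

Varying the ratio of the two rates changes how often new lineages arise from already active leaves, which is the knob the abstract describes as controlling expected tree balance.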
Affiliation(s)
- Niema Moshiri: Bioinformatics and Systems Biology Graduate Program, UC San Diego, 9500 Gilman Dr., La Jolla, CA 92093, USA
- Siavash Mirarab: Department of Electrical and Computer Engineering, UC San Diego, 9500 Gilman Dr., La Jolla, CA 92093, USA
|
72
|
Calhoun JD, Carvill GL. Unravelling the genetic architecture of autosomal recessive epilepsy in the genomic era. J Neurogenet 2018; 32:295-312. [PMID: 30247086 DOI: 10.1080/01677063.2018.1513509] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 12/27/2022]
Abstract
The technological advancement of next-generation sequencing has greatly accelerated the pace of variant discovery in epilepsy. Despite an initial focus on autosomal dominant epilepsy due to the tractable nature of variant discovery with trios under a de novo model, more and more variants are being reported in families with epilepsies consistent with autosomal recessive (AR) inheritance. In this review, we touch on the classical AR epilepsy variants such as the inborn errors of metabolism and malformations of cortical development. However, we also highlight recently reported genes that are being identified by next-generation sequencing approaches and online 'matchmaking' platforms. Syndromes mainly characterized by seizures and complex neurodevelopmental disorders comorbid with epilepsy are discussed as an example of the wide phenotypic spectrum associated with the AR epilepsies. We conclude with a foray into the future, from the application of whole-genome sequencing to identify elusive epilepsy variants, to the promise of precision medicine initiatives to provide novel targeted therapeutics specific to the individual based on their clinical genetic testing.
Affiliation(s)
- Jeffrey D Calhoun: Department of Neurology, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Gemma L Carvill: Department of Neurology, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
|
73
|
Poot M. Syndromes Hidden within the 16p11.2 Deletion Region. Mol Syndromol 2018; 9:171-174. [PMID: 30140194 DOI: 10.1159/000490845] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Accepted: 06/11/2018] [Indexed: 12/31/2022] Open
|
74
|
Shao H, Ganesamoorthy D, Duarte T, Cao MD, Hoggart CJ, Coin LJM. npInv: accurate detection and genotyping of inversions using long read sub-alignment. BMC Bioinformatics 2018; 19:261. [PMID: 30001702 PMCID: PMC6044046 DOI: 10.1186/s12859-018-2252-9] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 10/22/2017] [Accepted: 06/18/2018] [Indexed: 11/21/2022] Open
Abstract
Background: Detection of genomic inversions remains challenging. Many existing methods primarily target inversions with a non-repetitive breakpoint, leaving inverted repeat (IR) mediated non-allelic homologous recombination (NAHR) inversions largely unexplored. Results: We present npInv, a novel tool specifically for detecting and genotyping NAHR inversions using long read sub-alignment of long read sequencing data. We benchmark npInv against other tools on both simulated and real data. We use npInv to generate a whole-genome inversion map for NA12878 consisting of 30 NAHR inversions (of which 15 are novel), including all previously known NAHR mediated inversions in NA12878 with flanking IR less than 7 kb. Our genotyping accuracy on this dataset was 94%. We used PCR to confirm the presence of two of these novel inversions. We show that there is a near linear relationship between the length of flanking IR and the minimum inversion size, without inverted repeats. Conclusion: The application of npInv shows high accuracy on both simulated and real data. The results give deeper insight into inversions.
Affiliation(s)
- Haojing Shao: Genomics of Development and Disease Division, Institute for Molecular Bioscience, University of Queensland, 306 Carmody Rd, St Lucia, Brisbane, 4067, Australia
- Devika Ganesamoorthy: Genomics of Development and Disease Division, Institute for Molecular Bioscience, University of Queensland, 306 Carmody Rd, St Lucia, Brisbane, 4067, Australia
- Tania Duarte: Genomics of Development and Disease Division, Institute for Molecular Bioscience, University of Queensland, 306 Carmody Rd, St Lucia, Brisbane, 4067, Australia
- Minh Duc Cao: Genomics of Development and Disease Division, Institute for Molecular Bioscience, University of Queensland, 306 Carmody Rd, St Lucia, Brisbane, 4067, Australia
- Clive J Hoggart: Department of Medicine, Imperial College London, Level 2, Faculty Building South Kensington Campus, London, SW7 2AZ, United Kingdom
- Lachlan J M Coin: Genomics of Development and Disease Division, Institute for Molecular Bioscience, University of Queensland, 306 Carmody Rd, St Lucia, Brisbane, 4067, Australia
|
75
|
Fiddes IT, Lodewijk GA, Mooring M, Bosworth CM, Ewing AD, Mantalas GL, Novak AM, van den Bout A, Bishara A, Rosenkrantz JL, Lorig-Roach R, Field AR, Haeussler M, Russo L, Bhaduri A, Nowakowski TJ, Pollen AA, Dougherty ML, Nuttle X, Addor MC, Zwolinski S, Katzman S, Kriegstein A, Eichler EE, Salama SR, Jacobs FMJ, Haussler D. Human-Specific NOTCH2NL Genes Affect Notch Signaling and Cortical Neurogenesis. Cell 2018; 173:1356-1369.e22. [PMID: 29856954 DOI: 10.1016/j.cell.2018.03.051] [Citation(s) in RCA: 343] [Impact Index Per Article: 49.0] [Received: 11/15/2017] [Revised: 02/16/2018] [Accepted: 03/21/2018] [Indexed: 12/12/2022]
Abstract
Genetic changes causing brain size expansion in human evolution have remained elusive. Notch signaling is essential for radial glia stem cell proliferation and is a determinant of neuronal number in the mammalian cortex. We find that three paralogs of human-specific NOTCH2NL are highly expressed in radial glia. Functional analysis reveals that different alleles of NOTCH2NL have varying potencies to enhance Notch signaling by interacting directly with NOTCH receptors. Consistent with a role in Notch signaling, NOTCH2NL ectopic expression delays differentiation of neuronal progenitors, while deletion accelerates differentiation into cortical neurons. Furthermore, NOTCH2NL genes provide the breakpoints in 1q21.1 distal deletion/duplication syndrome, where duplications are associated with macrocephaly and autism and deletions with microcephaly and schizophrenia. Thus, the emergence of human-specific NOTCH2NL genes may have contributed to the rapid evolution of the larger human neocortex, accompanied by loss of genomic stability at the 1q21.1 locus and resulting recurrent neurodevelopmental disorders.
Affiliation(s)
- Ian T Fiddes: UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA
- Gerrald A Lodewijk: University of Amsterdam, Swammerdam Institute for Life Sciences, Amsterdam, the Netherlands
- Adam D Ewing: UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA
- Gary L Mantalas: UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA; Molecular, Cell and Developmental Biology Department, UC Santa Cruz, Santa Cruz, CA, USA
- Adam M Novak: UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA
- Anouk van den Bout: University of Amsterdam, Swammerdam Institute for Life Sciences, Amsterdam, the Netherlands
- Alex Bishara: Department of Computer Science, Stanford University, Stanford, CA, USA
- Jimi L Rosenkrantz: UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA; Howard Hughes Medical Institute, UC Santa Cruz, Santa Cruz, CA, USA
- Andrew R Field: UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA; Molecular, Cell and Developmental Biology Department, UC Santa Cruz, Santa Cruz, CA, USA
- Lotte Russo: University of Amsterdam, Swammerdam Institute for Life Sciences, Amsterdam, the Netherlands
- Aparna Bhaduri: Department of Neurology and the Eli and Edythe Broad Center for Regeneration Medicine and Stem Cell Research at the University of California, San Francisco, San Francisco, CA, USA
- Tomasz J Nowakowski: Department of Neurology and the Eli and Edythe Broad Center for Regeneration Medicine and Stem Cell Research at the University of California, San Francisco, San Francisco, CA, USA
- Alex A Pollen: Department of Neurology and the Eli and Edythe Broad Center for Regeneration Medicine and Stem Cell Research at the University of California, San Francisco, San Francisco, CA, USA
- Max L Dougherty: Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Xander Nuttle: Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA; Department of Neurology, Harvard Medical School, Boston, MA, USA; Program in Medical and Population Genetics and Stanley Center for Psychiatric Research, Broad Institute, Cambridge, MA, USA
- Simon Zwolinski: Department of Cytogenetics, Northern Genetics Service, Institute of Genetic Medicine, Newcastle upon Tyne, UK
- Sol Katzman: UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA
- Arnold Kriegstein: Department of Neurology and the Eli and Edythe Broad Center for Regeneration Medicine and Stem Cell Research at the University of California, San Francisco, San Francisco, CA, USA
- Evan E Eichler: Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA; Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Sofie R Salama: UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA; Howard Hughes Medical Institute, UC Santa Cruz, Santa Cruz, CA, USA
- Frank M J Jacobs: UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA; University of Amsterdam, Swammerdam Institute for Life Sciences, Amsterdam, the Netherlands
- David Haussler: UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA; Howard Hughes Medical Institute, UC Santa Cruz, Santa Cruz, CA, USA
|
76
|
Pu L, Lin Y, Pevzner PA. Detection and analysis of ancient segmental duplications in mammalian genomes. Genome Res 2018; 28:901-909. [PMID: 29735604 PMCID: PMC5991524 DOI: 10.1101/gr.228718.117] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 08/07/2017] [Accepted: 04/26/2018] [Indexed: 01/07/2023]
Abstract
Although segmental duplications (SDs) represent hotbeds for genomic rearrangements and emergence of new genes, there are still no easy-to-use tools for identifying SDs. Moreover, while most previous studies focused on recently emerged SDs, detection of ancient SDs remains an open problem. We developed an SDquest algorithm for SD finding and applied it to analyzing SDs in human, gorilla, and mouse genomes. Our results demonstrate that previous studies missed many SDs in these genomes and show that SDs account for at least 6.05% of the human genome (version hg19), a 17% increase as compared to the previous estimate. Moreover, SDquest classified 6.42% of the latest GRCh38 version of the human genome as SDs, a large increase as compared to previous studies. We thus propose to re-evaluate evolution of SDs based on their accurate representation across multiple genomes. Toward this goal, we analyzed the complex mosaic structure of SDs and decomposed mosaic SDs into elementary SDs, a prerequisite for follow-up evolutionary analysis. We also introduced the concept of the breakpoint graph of mosaic SDs that revealed SD hotspots and suggested that some SDs may have originated from circular extrachromosomal DNA (ecDNA), not unlike ecDNA that contributes to accelerated evolution in cancer.
Affiliation(s)
- Lianrong Pu: Department of Computer Science and Technology, Shandong University, Jinan 250101, China; Department of Computer Science and Engineering, University of California at San Diego, San Diego, California 92093, USA
- Yu Lin: Department of Computer Science and Engineering, University of California at San Diego, San Diego, California 92093, USA; Research School of Computer Science, Australian National University, Canberra, ACT 2601, Australia
- Pavel A Pevzner: Department of Computer Science and Engineering, University of California at San Diego, San Diego, California 92093, USA
|
77
|
Saitou M, Satta Y, Gokcumen O, Ishida T. Complex evolution of the GSTM gene family involves sharing of GSTM1 deletion polymorphism in humans and chimpanzees. BMC Genomics 2018; 19:293. [PMID: 29695243 PMCID: PMC5918908 DOI: 10.1186/s12864-018-4676-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 01/14/2018] [Accepted: 04/15/2018] [Indexed: 02/06/2023] Open
Abstract
Background: The common deletion of the glutathione S-transferase Mu 1 (GSTM1) gene in humans has been shown to be involved in xenobiotic metabolism and associated with bladder cancer. However, the evolution of this deletion has not been investigated. Results: In this study, we conducted comparative analyses of primate genomes. We demonstrated that the GSTM gene family has evolved through multiple structural variations, involving gene duplications, losses, large inversions and gene conversions. We further showed experimentally that GSTM1 is polymorphically deleted in both humans and chimpanzees, through independent deletion events. To generalize our results, we searched for genic deletions that are polymorphic in both humans and chimpanzees. We found only two such deletions among the thousands that we searched, one being the GSTM1 deletion and the other, surprisingly, affecting another metabolizing gene, UGT2B17. Conclusions: Overall, our results support the emerging notion that metabolizing gene families, such as the GSTM, NAT, UGT and CYP, have been evolving rapidly through gene duplication and deletion events in primates, leading to complex structural variation within and among species with unknown evolutionary consequences.
Affiliation(s)
- M Saitou: Department of Biological Sciences, The University of Tokyo, Tokyo, Japan; Department of Biological Sciences, State University of New York at Buffalo, Buffalo, USA
- Y Satta: The Graduate University for Advanced Studies (SOKENDAI), Hayama, Japan
- O Gokcumen: Department of Biological Sciences, State University of New York at Buffalo, Buffalo, USA
- T Ishida: Department of Biological Sciences, The University of Tokyo, Tokyo, Japan
|
78
|
Bekpen C, Xie C, Nebel A, Tautz D. Involvement of SPATA31 copy number variable genes in human lifespan. Aging (Albany NY) 2018; 10:674-688. [PMID: 29676996 PMCID: PMC5940121 DOI: 10.18632/aging.101421] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 03/06/2018] [Accepted: 04/14/2018] [Indexed: 12/22/2022]
Abstract
The SPATA31 (alias FAM75A) gene family belongs to the core duplicon families that are thought to have contributed significantly to hominoid evolution. It is also among the gene families with the strongest signal of positive selection in hominoids. It has acquired new protein domains in the primate lineage, and a previous study has suggested that the gene family has expanded its function into UV response and DNA repair. Here we show that over-expression of SPATA31A1 in fibroblast cells leads to premature senescence due to interference with aging-related transcription pathways. We show that there are considerable copy number differences for this gene family in human populations and we ask whether this could influence mutation rates and longevity in humans. We find no evidence for an influence on germline mutation rates, but an analysis of long-lived individuals (> 96 years) shows that they carry significantly fewer SPATA31 copies in their genomes than younger individuals in a control group. We propose that the evolution of SPATA31 copy number is an example of antagonistic pleiotropy, providing a fitness benefit during the reproductive phase of life but negatively influencing the overall life span.
Affiliation(s)
- Chen Xie: Max-Planck Institute for Evolutionary Biology, 24306 Plön, Germany
- Almut Nebel: Institute of Clinical Molecular Biology, Kiel University, 24105 Kiel, Germany
- Diethard Tautz: Max-Planck Institute for Evolutionary Biology, 24306 Plön, Germany
|
79
|
Glugoski L, Giuliano-Caetano L, Moreira-Filho O, Vicari MR, Nogaroto V. Co-located hAT transposable element and 5S rDNA in an interstitial telomeric sequence suggest the formation of Robertsonian fusion in armored catfish. Gene 2018; 650:49-54. [DOI: 10.1016/j.gene.2018.01.099] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 12/11/2017] [Revised: 01/23/2018] [Accepted: 01/31/2018] [Indexed: 01/12/2023]
|
80
|
da Silva VH, Laine VN, Bosse M, van Oers K, Dibbits B, Visser ME, Crooijmans RPMA, Groenen MAM. CNVs are associated with genomic architecture in a songbird. BMC Genomics 2018; 19:195. [PMID: 29703149 PMCID: PMC6389189 DOI: 10.1186/s12864-018-4577-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 10/20/2017] [Accepted: 03/02/2018] [Indexed: 12/11/2022] Open
Abstract
Background: Understanding variation in genome structure is essential to understand phenotypic differences within populations and the evolutionary history of species. A promising form of this structural variation is copy number variation (CNV). CNVs can be generated by different recombination mechanisms, such as non-allelic homologous recombination, that rely on specific characteristics of the genome architecture. These structural variants can therefore be more abundant at particular genes, ultimately leading to variation in phenotypes under selection. Detailed characterization of CNVs can therefore reveal evolutionary footprints of selection and provide insight into their contribution to phenotypic variation in wild populations. Results: Here we use genotypic data from a long-term population of great tits (Parus major), a widely studied passerine bird in ecology and evolution, to detect CNVs and identify genomic features prevailing within these regions. We used allele intensities and frequencies from high-density SNP array data from 2,175 birds. We detected 41,029 CNVs concatenated into 8,008 distinct CNV regions (CNVRs). We successfully validated 93.75% of the CNVs tested by qPCR, which were sampled at different frequencies and sizes. A mother-daughter family structure allowed for the evaluation of the inheritance of a number of these CNVs. Only CNVs with 40 probes or more display segregation in accordance with Mendelian inheritance, suggesting a high rate of false negative calls for smaller CNVs. As CNVRs are a coarse-grained map of CNV loci, we also inferred the frequency of coincident CNV start and end breakpoints. We observed frequency-dependent enrichment of these breakpoints at homologous regions, CpG sites and AT-rich intervals. A gene ontology enrichment analysis showed that CNVs are enriched in genes underpinning neural, cardiac and ion transport pathways.
Conclusion: Great tit CNVs are present in almost half of the genes and prominent at repetitive-homologous and regulatory regions. Although they overlap genes under selection, the high number of false negatives makes neutrality or association tests on the CNVs detected here difficult. Therefore, CNVs should be further addressed in the light of their false negative rate and architecture to improve the comprehension of their association with phenotypes and evolutionary history.
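The step of collapsing individual CNV calls into CNV regions (CNVRs) mentioned in this abstract can be sketched as interval merging. This is an assumption about what "concatenated" means here, not the authors' documented procedure: the function name `merge_cnvs`, the overlap rule, and the coordinates below are hypothetical.

```python
def merge_cnvs(cnvs):
    """Merge overlapping CNV calls into CNV regions (CNVRs).

    cnvs: iterable of (chromosome, start, end) tuples with start <= end.
    Calls on the same chromosome that overlap or touch are merged.
    """
    regions = []
    for chrom, start, end in sorted(cnvs):
        if regions and regions[-1][0] == chrom and start <= regions[-1][2]:
            # Extend the current region to cover this overlapping call.
            last = regions[-1]
            regions[-1] = (chrom, last[1], max(last[2], end))
        else:
            # No overlap with the previous region: start a new one.
            regions.append((chrom, start, end))
    return regions


# Hypothetical calls from several individuals (illustrative coordinates).
calls = [
    ("chr1", 100, 200),
    ("chr1", 150, 300),
    ("chr1", 400, 500),
    ("chr2", 120, 180),
]
regions = merge_cnvs(calls)
print(regions)
```

Because merging discards the individual start and end coordinates, a CNVR map is coarser than the underlying calls, which is consistent with the abstract's separate analysis of coincident breakpoints.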
Affiliation(s)
- Vinicius H da Silva: Animal Breeding and Genomics Centre, Wageningen University & Research, Droevendaalsesteeg 1, Wageningen, 6708PB, The Netherlands; Netherlands Institute of Ecology (NIOO-KNAW), Droevendaalsesteeg 10, Wageningen, 6708PB, The Netherlands
- Veronika N Laine: Animal Breeding and Genomics Centre, Wageningen University & Research, Droevendaalsesteeg 1, Wageningen, 6708PB, The Netherlands; Netherlands Institute of Ecology (NIOO-KNAW), Droevendaalsesteeg 10, Wageningen, 6708PB, The Netherlands; Swedish University of Agricultural Sciences (SLU), Ulls väg 26, Uppsala, 750 07, Sweden
- Mirte Bosse: Netherlands Institute of Ecology (NIOO-KNAW), Droevendaalsesteeg 10, Wageningen, 6708PB, The Netherlands
- Kees van Oers: Animal Breeding and Genomics Centre, Wageningen University & Research, Droevendaalsesteeg 1, Wageningen, 6708PB, The Netherlands
- Bert Dibbits: Netherlands Institute of Ecology (NIOO-KNAW), Droevendaalsesteeg 10, Wageningen, 6708PB, The Netherlands
- Marcel E Visser: Animal Breeding and Genomics Centre, Wageningen University & Research, Droevendaalsesteeg 1, Wageningen, 6708PB, The Netherlands
- Richard P M A Crooijmans: Animal Breeding and Genomics Centre, Wageningen University & Research, Droevendaalsesteeg 1, Wageningen, 6708PB, The Netherlands; Netherlands Institute of Ecology (NIOO-KNAW), Droevendaalsesteeg 10, Wageningen, 6708PB, The Netherlands
- Martien A M Groenen: Animal Breeding and Genomics Centre, Wageningen University & Research, Droevendaalsesteeg 1, Wageningen, 6708PB, The Netherlands
|
81
|
Liu C, Ran X, Yu C, Xu Q, Niu X, Zhao P, Wang J. Whole-genome analysis of structural variations between Xiang pigs with larger litter sizes and those with smaller litter sizes. Genomics 2018; 111:310-319. [PMID: 29481841 DOI: 10.1016/j.ygeno.2018.02.005] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 04/18/2017] [Revised: 02/08/2018] [Accepted: 02/11/2018] [Indexed: 11/30/2022]
Abstract
To gain a better knowledge of structural variations (SVs) in the Xiang pig, we used next-generation sequencing to analyze Xiang pigs with larger (XL) or smaller (XS) litter sizes. Our analysis yielded 28,040 putative SVs in the Xiang pig. These SVs were distributed across all chromosomes. Some functional regions, including exons and untranslated regions, were less variable than introns and intergenic regions. We detected 4637 and 4119 group-specific SVs, which contained 1697 and 1582 genes in the XL and XS groups, respectively. These genes were mainly enriched in well-known pathways involved in development and reproduction. Population validation was carried out on 50 candidate SVs using PCR in 144 Xiang pigs. All 50 SVs were confirmed by PCR and 14 SVs were associated with the litter size of Xiang pigs. These results may help to elucidate the regulation of growth and reproduction in the Xiang pig.
Affiliation(s)
- Chang Liu: Institute of Agro-Bioengineering, College of Animal Science, Guizhou University, Guiyang 550025, China
- Xueqin Ran: Institute of Agro-Bioengineering, College of Animal Science, Guizhou University, Guiyang 550025, China
- Changyan Yu: Institute of Agro-Bioengineering, College of Animal Science, Guizhou University, Guiyang 550025, China
- Qian Xu: Institute of Agro-Bioengineering, College of Animal Science, Guizhou University, Guiyang 550025, China
- Xi Niu: Institute of Agro-Bioengineering, College of Animal Science, Guizhou University, Guiyang 550025, China
- Pengju Zhao: College of Animal Science and Technology, China Agricultural University, Beijing 100083, China
- Jiafu Wang: Institute of Agro-Bioengineering, College of Animal Science, Guizhou University, Guiyang 550025, China; Tongren University, Tongren 554300, China
|
82
|
Steenwyk JL, Rokas A. Copy Number Variation in Fungi and Its Implications for Wine Yeast Genetic Diversity and Adaptation. Front Microbiol 2018; 9:288. [PMID: 29520259 PMCID: PMC5826948 DOI: 10.3389/fmicb.2018.00288] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 12/12/2017] [Accepted: 02/07/2018] [Indexed: 11/13/2022] Open
Abstract
In recent years, copy number (CN) variation has emerged as a new and significant source of genetic polymorphisms contributing to the phenotypic diversity of populations. CN variants are defined as genetic loci that, due to duplication and deletion, vary in their number of copies across individuals in a population. CN variants range in size from 50 base pairs to whole chromosomes, can influence gene activity, and are associated with a wide range of phenotypes in diverse organisms, including the budding yeast Saccharomyces cerevisiae. In this review, we introduce CN variation, discuss the genetic and molecular mechanisms implicated in its generation, how they can contribute to genetic and phenotypic diversity in fungal populations, and consider how CN variants may influence wine yeast adaptation in fermentation-related processes. In particular, we focus on reviewing recent work investigating the contribution of changes in CN of fermentation-related genes in yeast wine strains and offer notable illustrations of such changes, including the high levels of CN variation among the CUP genes, which confer resistance to copper, a metal with fungicidal properties, and the preferential deletion and duplication of the MAL1 and MAL3 loci, respectively, which are responsible for metabolizing maltose and sucrose. Based on the available data, we propose that CN variation is a substantial dimension of yeast genetic diversity that occurs largely independent of single nucleotide polymorphisms. As such, CN variation harbors considerable potential for understanding and manipulating yeast strains in the wine fermentation environment and beyond.
Affiliation(s)
- Antonis Rokas: Department of Biological Sciences, Vanderbilt University, Nashville, TN, United States
|
83
|
Capilla L, Sánchez-Guillén RA, Farré M, Paytuví-Gallart A, Malinverni R, Ventura J, Larkin DM, Ruiz-Herrera A. Mammalian Comparative Genomics Reveals Genetic and Epigenetic Features Associated with Genome Reshuffling in Rodentia. Genome Biol Evol 2018; 8:3703-3717. [PMID: 28175287 PMCID: PMC5521730 DOI: 10.1093/gbe/evw276] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Accepted: 11/08/2016] [Indexed: 12/16/2022] Open
Abstract
Understanding how mammalian genomes have been reshuffled through structural changes is fundamental to the dynamics of their composition, evolutionary relationships between species and, in the long run, speciation. In this work, we reveal the evolutionary genomic landscape in Rodentia, the most diverse and speciose mammalian order, by whole-genome comparisons of six rodent species and six representative outgroup mammalian species. The reconstruction of the evolutionary breakpoint regions across rodent phylogeny shows an increased rate of genome reshuffling that is approximately two orders of magnitude greater than in the other mammalian species considered here. We identified novel lineage- and clade-specific breakpoint regions within Rodentia and analyzed their gene content, recombination rates and their relationship with constitutive lamina genomic associated domains, DNase I hypersensitivity sites and chromatin modifications. We detected an accumulation of protein-coding genes in evolutionary breakpoint regions, especially genes implicated in reproduction, pheromone detection and mating. Moreover, we found an association of the evolutionary breakpoint regions with active chromatin state landscapes, most probably related to gene enrichment. Our results have two important implications for understanding the mechanisms that govern and constrain mammalian genome evolution. The first is that the presence of genes related to species-specific phenotypes in evolutionary breakpoint regions reinforces the adaptive value of genome reshuffling. The second is that chromatin conformation, an aspect that has often been overlooked in comparative genomic studies, might play a role in modeling the genomic distribution of evolutionary breakpoints.
Affiliation(s)
- Laia Capilla
- Genome Integrity and Instability Group, Institut de Biotecnologia i Biomedicina (IBB), Universitat Autònoma de Barcelona (UAB), Barcelona, Spain; Departament de Biologia Animal, Biologia Vegetal i Ecologia, Universitat Autònoma de Barcelona (UAB), Barcelona, Spain
- Rosa Ana Sánchez-Guillén
- Genome Integrity and Instability Group, Institut de Biotecnologia i Biomedicina (IBB), Universitat Autònoma de Barcelona (UAB), Barcelona, Spain; Biología Evolutiva, Instituto de Ecología A.C, Xalapa, Veracruz, Apartado, Mexico
- Marta Farré
- Biología Evolutiva, Instituto de Ecología A.C, Xalapa, Veracruz, Apartado, Mexico
- Andreu Paytuví-Gallart
- Department of Comparative Biomedical Sciences, The Royal Veterinary College, London, UK; Sequentia Biotech S.L. Calle Comte d'Urgell, Barcelona, Spain
- Roberto Malinverni
- Departament de Biologia Cel·lular, Fisiologia i Immunologia, Universitat Autònoma de Barcelona (UAB), Barcelona, Spain
- Jacint Ventura
- Departament de Biologia Animal, Biologia Vegetal i Ecologia, Universitat Autònoma de Barcelona (UAB), Barcelona, Spain
- Denis M Larkin
- Biología Evolutiva, Instituto de Ecología A.C, Xalapa, Veracruz, Apartado, Mexico
- Aurora Ruiz-Herrera
- Genome Integrity and Instability Group, Institut de Biotecnologia i Biomedicina (IBB), Universitat Autònoma de Barcelona (UAB), Barcelona, Spain; Sequentia Biotech S.L. Calle Comte d'Urgell, Barcelona, Spain
|
84
|
Poot M. Neocentromeres to the Rescue of Acentric Chromosome Fragments. Mol Syndromol 2017; 8:279-281. [PMID: 29230156 DOI: 10.1159/000481332] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 08/30/2017] [Indexed: 11/19/2022] Open
|
85
|
Heide M, Long KR, Huttner WB. Novel gene function and regulation in neocortex expansion. Curr Opin Cell Biol 2017; 49:22-30. [PMID: 29227861 DOI: 10.1016/j.ceb.2017.11.008] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/01/2017] [Revised: 11/18/2017] [Accepted: 11/26/2017] [Indexed: 01/01/2023]
Abstract
The expansion of the neocortex during human evolution is due to changes in our genome that result in increased and prolonged proliferation of neural stem and progenitor cells during neocortex development. Three principal types of such genomic changes can be distinguished: first, novel gene regulation in human; second, novel function in human of genes existing in both human and non-human species; and third, novel, human-specific genes. The latter comprise both increases in the copy number of genes existing also in non-human species and the emergence of genes giving rise to unique, human-specific gene products. Examples of all these types of changes in the human genome have been identified, with ARHGAP11B constituting a paradigmatic example of a unique, human-specific protein.
Affiliation(s)
- Michael Heide
- Max Planck Institute of Molecular Cell Biology and Genetics, Pfotenhauerstr. 108, D-01307 Dresden, Germany
- Katherine R Long
- Max Planck Institute of Molecular Cell Biology and Genetics, Pfotenhauerstr. 108, D-01307 Dresden, Germany
- Wieland B Huttner
- Max Planck Institute of Molecular Cell Biology and Genetics, Pfotenhauerstr. 108, D-01307 Dresden, Germany.
|
86
|
Shapiro JA. Living Organisms Author Their Read-Write Genomes in Evolution. BIOLOGY 2017; 6:E42. [PMID: 29211049 PMCID: PMC5745447 DOI: 10.3390/biology6040042] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Subscribe] [Scholar Register] [Received: 08/23/2017] [Revised: 11/17/2017] [Accepted: 11/28/2017] [Indexed: 12/18/2022]
Abstract
Evolutionary variations generating phenotypic adaptations and novel taxa resulted from complex cellular activities altering genome content and expression: (i) Symbiogenetic cell mergers producing the mitochondrion-bearing ancestor of eukaryotes and chloroplast-bearing ancestors of photosynthetic eukaryotes; (ii) interspecific hybridizations and genome doublings generating new species and adaptive radiations of higher plants and animals; and, (iii) interspecific horizontal DNA transfer encoding virtually all of the cellular functions between organisms and their viruses in all domains of life. Consequently, assuming that evolutionary processes occur in isolated genomes of individual species has become an unrealistic abstraction. Adaptive variations also involved natural genetic engineering of mobile DNA elements to rewire regulatory networks. In the most highly evolved organisms, biological complexity scales with "non-coding" DNA content more closely than with protein-coding capacity. Coincidentally, we have learned how so-called "non-coding" RNAs that are rich in repetitive mobile DNA sequences are key regulators of complex phenotypes. Both biotic and abiotic ecological challenges serve as triggers for episodes of elevated genome change. The intersections of cell activities, biosphere interactions, horizontal DNA transfers, and non-random Read-Write genome modifications by natural genetic engineering provide a rich molecular and biological foundation for understanding how ecological disruptions can stimulate productive, often abrupt, evolutionary transformations.
Affiliation(s)
- James A Shapiro
- Department of Biochemistry and Molecular Biology, University of Chicago GCIS W123B, 979 E. 57th Street, Chicago, IL 60637, USA.
|
87
|
Wang J, Samuels DC, Zhao S, Xiang Y, Zhao YY, Guo Y. Current Research on Non-Coding Ribonucleic Acid (RNA). Genes (Basel) 2017; 8:366. [PMID: 29206165 PMCID: PMC5748684 DOI: 10.3390/genes8120366] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.9] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/13/2017] [Revised: 11/16/2017] [Accepted: 11/21/2017] [Indexed: 11/16/2022] Open
Abstract
Non-coding ribonucleic acid (RNA) has without a doubt captured the interest of biomedical researchers. The ability to screen the entire human genome with high-throughput sequencing technology has greatly enhanced the identification, annotation and prediction of the functionality of non-coding RNAs. In this review, we discuss the current landscape of non-coding RNA research and quantitative analysis. Non-coding RNA will be categorized into two major groups by size: long non-coding RNAs and small RNAs. In long non-coding RNA, we discuss regular long non-coding RNA, pseudogenes and circular RNA. In small RNA, we discuss miRNA, transfer RNA, piwi-interacting RNA, small nucleolar RNA, small nuclear RNA, Y RNA, signal recognition particle RNA, and 7SK RNA. We elaborate on the origin, detection method, and potential association with disease, putative functional mechanisms, and public resources for these non-coding RNAs. We aim to provide readers with a complete overview of non-coding RNAs and incite additional interest in non-coding RNA research.
Affiliation(s)
- Jing Wang
- Department of Biostatistics, Vanderbilt University, Medical Center, Nashville, TN 37232, USA.
- David C Samuels
- Department of Molecular Physiology and Biophysics, Vanderbilt Genetics Institute, Vanderbilt University Medical School, Nashville, TN 37232, USA.
- Shilin Zhao
- Department of Biostatistics, Vanderbilt University, Medical Center, Nashville, TN 37232, USA.
- Yu Xiang
- Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX 77030, USA.
- Ying-Yong Zhao
- Key Laboratory of Resource Biology and Biotechnology in Western China, School of Life Sciences, Northwest University, Xi'an 710069, Shaanxi, China.
- Yan Guo
- Key Laboratory of Resource Biology and Biotechnology in Western China, School of Life Sciences, Northwest University, Xi'an 710069, Shaanxi, China.
- Department of Internal Medicine, University of New Mexico, Albuquerque, NM 87102, USA.
|
88
|
Abstract
PURPOSE OF REVIEW Copy number variation (CNV) disorders arise from the dosage imbalance of one or more gene(s), resulting from deletions, duplications or other genomic rearrangements that lead to the loss or gain of genetic material. Several disorders, characterized by multiple birth defects and neurodevelopmental abnormalities, have been associated with relatively large (>1 Mb) and often recurrent CNVs. CNVs have also been implicated in the etiology of neuropsychiatric disorders including autism and schizophrenia as well as other common complex diseases. Thus, CNVs have a significant impact on human health and disease. RECENT FINDINGS The use of increasingly higher resolution, genomewide analysis has greatly enhanced the detection of genetic variation, including CNVs. Furthermore, the availability of comprehensive genetic variation data from large cohorts of healthy controls has the potential to greatly improve the identification of disease associated genetic variants in patient samples. SUMMARY This review discusses the current knowledge about CNV disorders, including the mechanisms underlying their formation and phenotypic outcomes, and the advantages and limitations of current methods of detection and disease association.
Affiliation(s)
- Tamim H. Shaikh
- Department of Pediatrics, Section of Clinical Genetics and Metabolism, University of Colorado Denver, Aurora, CO 80045
|
89
|
Chiatante G, Giannuzzi G, Calabrese FM, Eichler EE, Ventura M. Centromere Destiny in Dicentric Chromosomes: New Insights from the Evolution of Human Chromosome 2 Ancestral Centromeric Region. Mol Biol Evol 2017; 34:1669-1681. [PMID: 28333343 DOI: 10.1093/molbev/msx108] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/13/2023] Open
Abstract
Dicentric chromosomes are products of genomic rearrangements that place two centromeres on the same chromosome. Due to the presence of two primary constrictions, they are inherently unstable and overcome their instability by epigenetically inactivating and/or deleting one of the two centromeres, thus resulting in functionally monocentric chromosomes that segregate normally during cell division. Our understanding to date of dicentric chromosome formation, behavior and fate has been largely inferred from observational studies in plants and humans as well as artificially produced de novo dicentrics in yeast and in human cells. We investigate the most recent product of a chromosome fusion event fixed in the human lineage, human chromosome 2, whose stability was acquired by the suppression of one centromere, resulting in a unique difference in chromosome number between humans (46 chromosomes) and our most closely related ape relatives (48 chromosomes). Using molecular cytogenetics, sequencing, and comparative sequence data, we deeply characterize the relicts of the chromosome 2q ancestral centromere and its flanking regions, gaining insight into the ancestral organization that can be easily broadened to all acrocentric chromosome centromeres. Moreover, our analyses offered the opportunity to trace the evolutionary history of rDNA and satellite III sequences among great apes, thus suggesting a new hypothesis for the preferential inactivation of some human centromeres, including IIq. Our results suggest two possible centromere inactivation models to explain the evolutionary stabilization of human chromosome 2 over the last 5-6 million years. Our results strongly favor centromere excision through a one-step process.
Affiliation(s)
- Giorgia Chiatante
- Department of Biology, University of Bari "Aldo Moro", Bari, Italy; Department of Biology, Anthropology Laboratories University of Florence, Florence, Italy
- Giuliana Giannuzzi
- Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
- Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA; Howard Hughes Medical Institute, University of Washington, Seattle, WA
- Mario Ventura
- Department of Biology, University of Bari "Aldo Moro", Bari, Italy
|
90
|
Structural Variation Shapes the Landscape of Recombination in Mouse. Genetics 2017; 206:603-619. [PMID: 28592499 DOI: 10.1534/genetics.116.197988] [Citation(s) in RCA: 44] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/05/2017] [Accepted: 03/13/2017] [Indexed: 01/02/2023] Open
Abstract
Meiotic recombination is an essential feature of sexual reproduction that ensures faithful segregation of chromosomes and redistributes genetic variants in populations. Multiparent populations such as the Diversity Outbred (DO) mouse stock accumulate large numbers of crossover (CO) events between founder haplotypes, and thus present a unique opportunity to study the role of genetic variation in shaping the recombination landscape. We obtained high-density genotype data from [Formula: see text] DO mice, and localized 2.2 million CO events to intervals with a median size of 28 kb. The resulting sex-averaged genetic map of the DO population is highly concordant with large-scale (order 10 Mb) features of previously reported genetic maps for mouse. To examine fine-scale (order 10 kb) patterns of recombination in the DO, we overlaid putative recombination hotspots onto our CO intervals. We found that CO intervals are enriched in hotspots compared to the genomic background. However, as many as [Formula: see text] of CO intervals do not overlap any putative hotspots, suggesting that our understanding of hotspots is incomplete. We also identified coldspots encompassing 329 Mb, or [Formula: see text] of observable genome, in which there is little or no recombination. In contrast to hotspots, which are a few kilobases in size, and widely scattered throughout the genome, coldspots have a median size of 2.1 Mb and are spatially clustered. Coldspots are strongly associated with copy-number variant (CNV) regions, especially multi-allelic clusters, identified from whole-genome sequencing of 228 DO mice. Genes in these regions have reduced expression, and epigenetic features of closed chromatin in male germ cells, which suggests that CNVs may repress recombination by altering chromatin structure in meiosis. 
Our findings demonstrate how multiparent populations, by bridging the gap between large-scale and fine-scale genetic mapping, can reveal new features of the recombination landscape.
|
91
|
Segmental duplications: evolution and impact among the current Lepidoptera genomes. BMC Evol Biol 2017; 17:161. [PMID: 28683762 PMCID: PMC5499213 DOI: 10.1186/s12862-017-1007-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/21/2017] [Accepted: 06/23/2017] [Indexed: 11/10/2022] Open
Abstract
Background Structural variation among genomes is now viewed to be as important as single nucleotide polymorphisms in influencing the phenotype and evolution of a species. Segmental duplication (SD) is defined as segments of DNA with homologous sequence. Results Here, we performed a systematic analysis of segmental duplications (SDs) among five lepidopteran reference genomes (Plutella xylostella, Danaus plexippus, Bombyx mori, Manduca sexta and Heliconius melpomene) to understand their potential impact on the evolution of these species. We find that the SDs content differed substantially among species, ranging from 1.2% of the genome in B. mori to 15.2% in H. melpomene. Most SDs formed very high identity (similarity higher than 90%) blocks but had very few large blocks. Comparative analysis showed that most of the SDs arose after the divergence of each lineage and we found that P. xylostella and H. melpomene showed more duplications than other species, suggesting they might be able to tolerate extensive levels of variation in their genomes. Conserved ancestral and species specific SD events were assessed, revealing multiple examples of the gain, loss or maintenance of SDs over time. SDs content analysis showed that most of the genes embedded in SDs regions belonged to species-specific SDs (“Unique” SDs). Functional analysis of these genes suggested their potential roles in the lineage-specific evolution. SDs and flanking regions often contained transposable elements (TEs) and this association suggested some involvement in SDs formation. Further studies on comparison of gene expression level between SDs and non-SDs showed that the expression level of genes embedded in SDs was significantly lower, suggesting that structure changes in the genomes are involved in gene expression differences in species. Conclusions The results showed that most of the SDs were “unique SDs”, which originated after species formation. 
Functional analysis suggested that SDs might play different roles in different species. Our results provide a valuable resource beyond the genetic mutation to explore the genome structure for future Lepidoptera research.
|
92
|
Abstract
Human genetic studies have been the driving force in bringing to light the underlying biology of psychiatric conditions. As these studies fill in the gaps in our knowledge of the mechanisms at play, we will be better equipped to design therapies in rational and targeted ways, or repurpose existing therapies in previously unanticipated ways. This review is intended for those unfamiliar with psychiatric genetics as a field and provides a primer on different modes of genetic variation, the technologies currently used to probe them, and concepts that provide context for interpreting the gene-phenotype relationship. Like other subfields in human genetics, psychiatric genetics is moving from microarray technology to sequencing-based approaches as barriers of cost and expertise are removed, and the ramifications of this transition are discussed here. A summary is then given of recent genetic discoveries in a number of neuropsychiatric conditions, with particular emphasis on neurodevelopmental conditions. The general impact of genetics on drug development has been to underscore the extensive etiological heterogeneity in seemingly cohesive diagnostic categories. Consequently, the path forward is not in therapies hoping to reach large swaths of patients sharing a clinically defined diagnosis, but rather in targeting patients belonging to specific "biotypes" defined through a combination of objective, quantifiable data, including genotype.
Affiliation(s)
- Jacob J Michaelson
- Department of Psychiatry, University of Iowa Carver College of Medicine, Iowa City, IA, USA.
- Department of Biomedical Engineering, University of Iowa College of Engineering, Iowa City, IA, USA.
- Department of Communication Sciences and Disorders, University of Iowa College of Liberal Arts and Sciences, Iowa City, IA, USA.
- Iowa Institute of Human Genetics, University of Iowa, Iowa City, IA, USA.
- Genetics Cluster Initiative, University of Iowa, Iowa City, IA, USA.
- The DeLTA Center, University of Iowa, Iowa City, IA, USA.
- University of Iowa Informatics Initiative, University of Iowa, Iowa City, IA, USA.
|
93
|
Peng T, Li G, Zhong X, Wang L. Does copy number variation of APOL1 gene affect the susceptibility to focal segmental glomerulosclerosis? Ren Fail 2017; 39:500-504. [PMID: 28494221 PMCID: PMC6014314 DOI: 10.1080/0886022x.2017.1323646] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/27/2022] Open
Abstract
Background: APOL1 risk variants (G1 and G2) are associated with increased susceptibility to focal segmental glomerulosclerosis (FSGS) in African populations. However, the two risk mutations were not found in Chinese FSGS patients. In this study, we explored the association between the copy number variation (CNV) of APOL1 gene and FSGS. Methods: APOL1 copy number variations were detected by quantitative real-time PCR with TaqMan probes and compared between 133 FSGS patients and 123 controls. The association between CNV of APOL1 gene and clinical parameters was also investigated. Results: The distribution of APOL1 CNV did not show significant difference between FSGS patients and controls. The creatinine and proteinuria in the high copy number group (CN ≥ 3) were higher than the other two groups, but the difference was not significant (p > .05). The FSGS pathological types were different among the three groups. Conclusion: There was no significant difference in the distribution of APOL1 gene copy variants between FSGS patients and normal controls, and there was no significant correlation between the APOL1 gene CNV and the FSGS patients’ clinical manifestations. APOL1 CNVs may not be associated with susceptibility to FSGS.
Affiliation(s)
- Ting Peng
- School of Medicine, University of Electronic Science and Technology of China, Renal Division and Institute of Nephrology, Sichuan Academy of Medical Sciences and Sichuan Provincial People's Hospital, Chengdu, China
- Guisen Li
- School of Medicine, University of Electronic Science and Technology of China, Renal Division and Institute of Nephrology, Sichuan Academy of Medical Sciences and Sichuan Provincial People's Hospital, Chengdu, China
- Xiang Zhong
- School of Medicine, University of Electronic Science and Technology of China, Renal Division and Institute of Nephrology, Sichuan Academy of Medical Sciences and Sichuan Provincial People's Hospital, Chengdu, China
- Li Wang
- School of Medicine, University of Electronic Science and Technology of China, Renal Division and Institute of Nephrology, Sichuan Academy of Medical Sciences and Sichuan Provincial People's Hospital, Chengdu, China
|
94
|
Padhi A, Shen B, Jiang J, Zhou Y, Liu GE, Ma L. Ruminant-specific multiple duplication events of PRDM9 before speciation. BMC Evol Biol 2017; 17:79. [PMID: 28292260 PMCID: PMC5351255 DOI: 10.1186/s12862-017-0892-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/01/2016] [Accepted: 01/26/2017] [Indexed: 11/30/2022] Open
Abstract
Background Understanding the genetic and evolutionary mechanisms of speciation genes in sexually reproducing organisms would provide important insights into mammalian reproduction and fitness. PRDM9, a widely known speciation gene, has recently gained attention for its important role in meiotic recombination and hybrid incompatibility. Despite the fact that PRDM9 is a key regulator of recombination and plays a dominant role in hybrid incompatibility, little is known about the underlying genetic and evolutionary mechanisms that generated multiple copies of PRDM9 in many metazoan lineages. Results The present study reports (1) evidence of ruminant-specific multiple gene duplication events, which likely occurred after the ancestral ruminant population diverged from its most recent common ancestor and before the ruminant speciation events, (2) presence of three copies of PRDM9, one copy (lineages I) in chromosome 1 (chr1) and two copies (lineages II & III) in chromosome X (chrX), thus indicating the possibility of ancient inter- and intra-chromosomal unequal crossing over and gene conversion events, (3) while lineages I and II are characterized by the presence of variable tandemly repeated C2H2 zinc finger (ZF) arrays, lineage III lost these arrays, and (4) C2H2 ZFs of lineages I and II, particularly the amino acid residues located at positions −1, 3, and 6 have evolved under strong positive selection. Conclusions Our results demonstrated two gene duplication events of PRDM9 in ruminants: an inter-chromosomal duplication that occurred between chr1 and chrX, and an intra-chromosomal X-linked duplication, which resulted in two additional copies of PRDM9 in ruminants. The observation of such duplication between chrX and chr1 is rare and may possibly have happened due to unequal crossing-over millions of years ago when sex chromosomes were independently derived from a pair of ancestral autosomes. 
Two copies (lineages I & II) are characterized by the presence of variable sized tandem-repeated C2H2 ZFs and evolved under strong positive selection and concerted evolution, supporting the notion of well-established Red Queen hypothesis. Collectively, gene duplication, concerted evolution, and positive selection are the likely driving forces for the expansion of ruminant PRDM9 sub-family.
Affiliation(s)
- Abinash Padhi
- Department of Animal and Avian Sciences, University of Maryland, College Park, MD, 20742, USA.
- Botong Shen
- Department of Animal and Avian Sciences, University of Maryland, College Park, MD, 20742, USA
- Jicai Jiang
- Department of Animal and Avian Sciences, University of Maryland, College Park, MD, 20742, USA
- Yang Zhou
- Animal Genomics and Improvement Laboratory, Agricultural Research Service, USDA, Beltsville, MD, 20705, USA; College of Animal Science and Technology, Northwest A & F University, Shaanxi Key Laboratory of Agricultural Molecular Biology, Yangling, Shaanxi, 712100, China
- George E Liu
- Animal Genomics and Improvement Laboratory, Agricultural Research Service, USDA, Beltsville, MD, 20705, USA
- Li Ma
- Department of Animal and Avian Sciences, University of Maryland, College Park, MD, 20742, USA.
|
95
|
Dougherty ML, Nuttle X, Penn O, Nelson BJ, Huddleston J, Baker C, Harshman L, Duyzend MH, Ventura M, Antonacci F, Sandstrom R, Dennis MY, Eichler EE. The birth of a human-specific neural gene by incomplete duplication and gene fusion. Genome Biol 2017; 18:49. [PMID: 28279197 PMCID: PMC5345166 DOI: 10.1186/s13059-017-1163-9] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/18/2016] [Accepted: 01/27/2017] [Indexed: 01/13/2023] Open
Abstract
BACKGROUND Gene innovation by duplication is a fundamental evolutionary process but is difficult to study in humans due to the large size, high sequence identity, and mosaic nature of segmental duplication blocks. The human-specific gene hydrocephalus-inducing 2, HYDIN2, was generated by a 364 kbp duplication of 79 internal exons of the large ciliary gene HYDIN from chromosome 16q22.2 to chromosome 1q21.1. Because the HYDIN2 locus lacks the ancestral promoter and seven terminal exons of the progenitor gene, we sought to characterize transcription at this locus by coupling reverse transcription polymerase chain reaction and long-read sequencing. RESULTS 5' RACE indicates a transcription start site for HYDIN2 outside of the duplication and we observe fusion transcripts spanning both the 5' and 3' breakpoints. We observe extensive splicing diversity leading to the formation of altered open reading frames (ORFs) that appear to be under relaxed selection. We show that HYDIN2 adopted a new promoter that drives an altered pattern of expression, with highest levels in neural tissues. We estimate that the HYDIN duplication occurred ~3.2 million years ago and find that it is nearly fixed (99.9%) for diploid copy number in contemporary humans. Examination of 73 chromosome 1q21 rearrangement patients reveals that HYDIN2 is deleted or duplicated in most cases. CONCLUSIONS Together, these data support a model of rapid gene innovation by fusion of incomplete segmental duplications, altered tissue expression, and potential subfunctionalization or neofunctionalization of HYDIN2 early in the evolution of the Homo lineage.
Affiliation(s)
- Max L Dougherty
- Department of Genome Sciences, University of Washington School of Medicine, 3720 15 Ave NE, S413C, Box 355065, Seattle, WA, 98195-5065, USA
- Xander Nuttle
- Department of Genome Sciences, University of Washington School of Medicine, 3720 15 Ave NE, S413C, Box 355065, Seattle, WA, 98195-5065, USA
- Osnat Penn
- Department of Genome Sciences, University of Washington School of Medicine, 3720 15 Ave NE, S413C, Box 355065, Seattle, WA, 98195-5065, USA
- Bradley J Nelson
- Department of Genome Sciences, University of Washington School of Medicine, 3720 15 Ave NE, S413C, Box 355065, Seattle, WA, 98195-5065, USA
- John Huddleston
- Department of Genome Sciences, University of Washington School of Medicine, 3720 15 Ave NE, S413C, Box 355065, Seattle, WA, 98195-5065, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, 98195, USA
- Carl Baker
- Department of Genome Sciences, University of Washington School of Medicine, 3720 15 Ave NE, S413C, Box 355065, Seattle, WA, 98195-5065, USA
- Lana Harshman
- Department of Genome Sciences, University of Washington School of Medicine, 3720 15 Ave NE, S413C, Box 355065, Seattle, WA, 98195-5065, USA
- Michael H Duyzend
- Department of Genome Sciences, University of Washington School of Medicine, 3720 15 Ave NE, S413C, Box 355065, Seattle, WA, 98195-5065, USA
- Mario Ventura
- Department of Biology, University of Bari, Bari, 70121, Italy
- Megan Y Dennis
- Department of Genome Sciences, University of Washington School of Medicine, 3720 15 Ave NE, S413C, Box 355065, Seattle, WA, 98195-5065, USA
- Genome Center, MIND Institute, and Department of Biochemistry & Molecular Medicine, University of California, Davis, 95616, CA, USA
- Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, 3720 15 Ave NE, S413C, Box 355065, Seattle, WA, 98195-5065, USA.
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, 98195, USA.
|
96
|
Bekpen C, Künzel S, Xie C, Eaaswarkhanth M, Lin YL, Gokcumen O, Akdis CA, Tautz D. Segmental duplications and evolutionary acquisition of UV damage response in the SPATA31 gene family of primates and humans. BMC Genomics 2017; 18:222. [PMID: 28264649 PMCID: PMC5338094 DOI: 10.1186/s12864-017-3595-8] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/08/2016] [Accepted: 02/20/2017] [Indexed: 12/11/2022] Open
Abstract
Background: Segmental duplications are an abundant source for novel gene functions and evolutionary adaptations. This mechanism of generating novelty was very active during the evolution of primates, particularly in the human lineage. Here, we characterize the evolution and function of the SPATA31 gene family (former designation FAM75A), which was previously shown to be among the gene families with the strongest signal of positive selection in hominoids. The mouse homologue for this gene family is a single copy gene expressed during spermatogenesis. Results: We show that in primates, the SPATA31 gene duplicated into SPATA31A and SPATA31C types and broadened the expression into many tissues. Each type became further segmentally duplicated in the line towards humans with the largest number of full-length copies found for SPATA31A in humans. Copy number estimates of SPATA31A based on digital PCR show an average of 7.5 with a range of 5–11 copies per diploid genome among human individuals. The primate SPATA31 genes also acquired new protein domains that suggest an involvement in UV response and DNA repair. We generated antibodies and show that the protein is re-localized from the nucleolus to the whole nucleus upon UV-irradiation, suggesting a UV damage response. We used CRISPR/Cas mediated mutagenesis to knock out copies of the gene in human primary fibroblast cells. We find that cell lines with reduced functional copies as well as naturally occurring low copy number HFF cells show enhanced sensitivity towards UV-irradiation. Conclusion: The acquisition of new SPATA31 protein functions and its broadening of expression may be related to the evolution of the diurnal lifestyle in primates that required a higher UV tolerance. The increased segmental duplications in hominoids as well as its fast evolution suggest the acquisition of further specific functions, particularly in humans.
Electronic supplementary material: The online version of this article (doi:10.1186/s12864-017-3595-8) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Cemalettin Bekpen
- Max-Planck Institute for Evolutionary Biology, August-Thienemann Strasse 2, 24306, Plön, Germany.
| | - Sven Künzel
- Max-Planck Institute for Evolutionary Biology, August-Thienemann Strasse 2, 24306, Plön, Germany
| | - Chen Xie
- Max-Planck Institute for Evolutionary Biology, August-Thienemann Strasse 2, 24306, Plön, Germany
| | - Muthukrishnan Eaaswarkhanth
- Department of Biological Sciences, State University of New York at Buffalo, Buffalo, 14260-1300, NY, USA
- Present address: Population Genomics and Genetic Epidemiology Unit, Dasman Diabetes Institute, P.O. Box 1180, Dasman, 15462, Kuwait
| | - Yen-Lung Lin
- Department of Biological Sciences, State University of New York at Buffalo, Buffalo, 14260-1300, NY, USA
| | - Omer Gokcumen
- Department of Biological Sciences, State University of New York at Buffalo, Buffalo, 14260-1300, NY, USA
| | - Cezmi A Akdis
- Swiss Institute of Allergy and Asthma Research (SIAF), Davos, CH-7270, Switzerland
| | - Diethard Tautz
- Max-Planck Institute for Evolutionary Biology, August-Thienemann Strasse 2, 24306, Plön, Germany.
| |
|
97
|
Dennis MY, Harshman L, Nelson BJ, Penn O, Cantsilieris S, Huddleston J, Antonacci F, Penewit K, Denman L, Raja A, Baker C, Mark K, Malig M, Janke N, Espinoza C, Stessman HAF, Nuttle X, Hoekzema K, Lindsay-Graves TA, Wilson RK, Eichler EE. The evolution and population diversity of human-specific segmental duplications. Nat Ecol Evol 2017; 1:69. [PMID: 28580430 PMCID: PMC5450946 DOI: 10.1038/s41559-016-0069] [Citation(s) in RCA: 97] [Impact Index Per Article: 12.1] [Indexed: 12/21/2022]
Abstract
Segmental duplications contribute to human evolution, adaptation and genomic instability but are often poorly characterized. We investigate the evolution, genetic variation and coding potential of human-specific segmental duplications (HSDs). We identify 218 HSDs based on analysis of 322 deeply sequenced archaic and contemporary hominid genomes. We sequence 550 human and nonhuman primate genomic clones to reconstruct the evolution of the largest, most complex regions with protein-coding potential (n=80 genes/33 gene families). We show that HSDs are non-randomly organized, associate preferentially with ancestral ape duplications termed “core duplicons”, and evolved primarily in an interspersed inverted orientation. In addition to Homo sapiens-specific gene expansions (e.g., TCAF1/2), we highlight ten gene families (e.g., ARHGAP11B and SRGAP2C) where copy number never returns to the ancestral state, there is evidence of mRNA splicing, and no common gene-disruptive mutations are observed in the general population. Such duplicates are candidates for the evolution of human-specific adaptive traits.
Affiliation(s)
- Megan Y Dennis
- Genome Center, MIND Institute, and Department of Biochemistry & Molecular Medicine, University of California, Davis, CA 95616, USA
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Lana Harshman
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Bradley J Nelson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Osnat Penn
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Stuart Cantsilieris
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - John Huddleston
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
| | - Francesca Antonacci
- Dipartimento di Biologia, Università degli Studi di Bari "Aldo Moro", Bari 70125, Italy
| | - Kelsi Penewit
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Laura Denman
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Archana Raja
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
| | - Carl Baker
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Kenneth Mark
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Maika Malig
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Nicolette Janke
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Claudia Espinoza
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Holly A F Stessman
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Xander Nuttle
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Kendra Hoekzema
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Tina A Lindsay-Graves
- McDonnell Genome Institute at Washington University, Washington University School of Medicine, St. Louis, MO 63108, USA
| | - Richard K Wilson
- McDonnell Genome Institute at Washington University, Washington University School of Medicine, St. Louis, MO 63108, USA
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
| |
|
98
|
Srinivasan S, Bettella F, Hassani S, Wang Y, Witoelar A, Schork AJ, Thompson WK, Collier DA, Desikan RS, Melle I, Dale AM, Djurovic S, Andreassen OA. Probing the Association between Early Evolutionary Markers and Schizophrenia. PLoS One 2017; 12:e0169227. [PMID: 28081145 PMCID: PMC5231388 DOI: 10.1371/journal.pone.0169227] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 09/09/2016] [Accepted: 12/13/2016] [Indexed: 12/31/2022] Open
Abstract
Schizophrenia is suggested to be a by-product of the evolution in humans, a compromise for our language, creative thinking and cognitive abilities, and thus, essentially, a human disorder. The time of its origin during the course of human evolution remains unclear. Here we investigate several markers of early human evolution and their relationship to the genetic risk of schizophrenia. We tested the schizophrenia evolutionary hypothesis by analyzing genome-wide association studies of schizophrenia and other human phenotypes in a statistical framework suited for polygenic architectures. We analyzed evolutionary proxy measures: human accelerated regions, segmental duplications, and ohnologs, representing various time periods of human evolution for overlap with the human genomic loci associated with schizophrenia. Polygenic enrichment plots suggest a higher prevalence of schizophrenia associations in human accelerated regions, segmental duplications and ohnologs. However, the enrichment is mostly accounted for by linkage disequilibrium, especially with functional elements like introns and untranslated regions. Our results did not provide clear evidence that markers of early human evolution are more likely associated with schizophrenia. While SNPs associated with schizophrenia are enriched in HAR, Ohno and SD regions, the enrichment seems to be mediated by affiliation to known genomic enrichment categories. Taken together with previous results, these findings suggest that schizophrenia risk may have mainly developed more recently in human evolution.
Affiliation(s)
- Saurabh Srinivasan
- NORMENT, KG Jebsen Centre for Psychosis Research, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
- Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway
| | - Francesco Bettella
- NORMENT, KG Jebsen Centre for Psychosis Research, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
- Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway
| | - Sahar Hassani
- NORMENT, KG Jebsen Centre for Psychosis Research, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
- Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway
| | - Yunpeng Wang
- NORMENT, KG Jebsen Centre for Psychosis Research, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
- Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway
| | - Aree Witoelar
- NORMENT, KG Jebsen Centre for Psychosis Research, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
- Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway
| | - Andrew J. Schork
- Multimodal Imaging Laboratory, University of California at San Diego, La Jolla, CA, United States of America
- Cognitive Sciences Graduate Program, University of California, San Diego, La Jolla, CA, United States of America
- Center for Human Development, University of California at San Diego, La Jolla, CA, United States of America
| | - Wesley K. Thompson
- Institute of Biological Psychiatry, Mental Health Center St. Hans, Mental Health Services Copenhagen, Roskilde, Denmark
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
| | - David A. Collier
- Eli Lilly & Co, Erl Wood Manor, Windlesham, Surrey, United Kingdom
| | - Rahul S. Desikan
- Neuroradiology Section, Department of Radiology and Biomedical Imaging, University of California, San Francisco, CA, United States of America
| | - Ingrid Melle
- NORMENT, KG Jebsen Centre for Psychosis Research, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
- Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway
| | - Anders M. Dale
- Multimodal Imaging Laboratory, University of California at San Diego, La Jolla, CA, United States of America
- Center for Human Development, University of California at San Diego, La Jolla, CA, United States of America
- Department of Neuroscience, University of California at San Diego, La Jolla, CA, United States of America
- Neuroradiology Section, Department of Radiology and Biomedical Imaging, University of California at San Francisco, San Francisco, CA, United States of America
| | - Srdjan Djurovic
- Department of Medical Genetics, Oslo University Hospital, Oslo, Norway
- NORMENT, KG Jebsen Centre for Psychosis Research, Department of Clinical Science, University of Bergen, Bergen, Norway
| | - Ole A. Andreassen
- NORMENT, KG Jebsen Centre for Psychosis Research, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
- Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway
- Institute of Biological Psychiatry, Mental Health Center St. Hans, Mental Health Services Copenhagen, Roskilde, Denmark
| |
|
99
|
Jahic A, Hinreiner S, Emberger W, Hehr U, Zuchner S, Beetz C. Doublet-Mediated DNA Rearrangement: A Novel and Potentially Underestimated Mechanism for the Formation of Recurrent Pathogenic Deletions. Hum Mutat 2016; 38:275-278. [PMID: 28008689 DOI: 10.1002/humu.23162] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/17/2016] [Revised: 12/02/2016] [Accepted: 12/14/2016] [Indexed: 11/09/2022]
Abstract
Deletions and duplications of genomic DNA contribute to evolution, phenotypic diversity, and human disease. The underlying mechanisms are incompletely understood. We identified deletions of exon 10 of the SPAST gene in two unrelated families with hereditary spastic paraplegia. We excluded a founder event, but observed that the breakpoints map to identical repeat regions. These regions likely represent an intragenic "doublet," that is, an enigmatic class of local duplications. The fusion sequences for both deletions are compatible with recombination-based as well as with replication-based mechanisms. Searching the literature, we identified a partial SLC24A4 deletion that involved two copies of another doublet, and was likely formed in an analogous way. Comparing the SPAST and the SLC24A4 doublets with doublets identified previously suggested that many additional doublets have a high potential for triggering rearrangements. Considering that doublets are still being formed in the human genome, and that they likely create high local instability, we suggest that a two-step mechanism consisting of doublet generation and subsequent doublet-mediated deletion/duplication may underlie certain copy-number changes for which other mechanisms are currently assumed. Further studies are necessary to delineate the significance of the thus-far understudied doublets for the formation of copy-number variation.
Affiliation(s)
- Amir Jahic
- Department of Clinical Chemistry and Laboratory Medicine, Jena University Hospital, Jena, Germany
| | - Sophie Hinreiner
- Department of Human Genetics, University of Regensburg, Regensburg, Germany
| | - Werner Emberger
- Department of Human Genetics, Graz Medical University, Graz, Austria
| | - Ute Hehr
- Department of Human Genetics, University of Regensburg, Regensburg, Germany
| | - Stephan Zuchner
- John T. Macdonald Department of Human Genetics and John P. Hussman Institute for Human Genomics, Miller School of Medicine, University of Miami, Miami, Florida
| | - Christian Beetz
- Department of Clinical Chemistry and Laboratory Medicine, Jena University Hospital, Jena, Germany
| |
|
100
|
Shao M, Moret BME. On Computing Breakpoint Distances for Genomes with Duplicate Genes. J Comput Biol 2016; 24:571-580. [PMID: 27788022 DOI: 10.1089/cmb.2016.0149] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Indexed: 11/12/2022] Open
Abstract
A fundamental problem in comparative genomics is to compute the distance between two genomes in terms of their higher-level organization (given by genes or syntenic blocks). For two genomes without duplicate genes, we can easily define (and almost always efficiently compute) a variety of distance measures, but the problem is NP-hard under most models when genomes contain duplicate genes. To tackle duplicate genes, three formulations (exemplar, maximum matching, and any matching) have been proposed, all of which aim to build a matching between homologous genes so as to minimize some distance measure. Of the many distance measures, the breakpoint distance (the number of nonconserved adjacencies) was the first one to be studied and remains of significant interest because of its simplicity and model-free property. The three breakpoint distance problems corresponding to the three formulations have been widely studied. Although we provided last year a solution for the exemplar problem that runs very fast on full genomes, computing optimal solutions for the other two problems has remained challenging. In this article, we describe very fast, exact algorithms for these two problems. Our algorithms rely on a compact integer-linear program that we further simplify by developing an algorithm to remove variables, based on new results on the structure of adjacencies and matchings. Through extensive experiments using both simulations and biological data sets, we show that our algorithms run very fast (in seconds) on mammalian genomes and scale well beyond. We also apply these algorithms (as well as the classic orthology tool MSOAR) to create orthology assignments, then compare their quality in terms of both accuracy and coverage. We find that our algorithm for the "any matching" formulation significantly outperforms other methods in terms of accuracy while achieving nearly maximum coverage.
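Note: the abstract defines the breakpoint distance as the number of nonconserved adjacencies. As a rough illustration of that definition only, the Python sketch below handles the easy case the abstract mentions, two genomes without duplicate genes. It is not the authors' integer-linear-programming method for duplicated genes. The function names, the signed-integer gene encoding, and the choice to ignore chromosome ends (telomeres) are all assumptions made for this example.

```python
def adjacencies(genome):
    """Return the set of adjacencies of a signed, linear gene order.

    Assumption: a genome is a list of nonzero integers, where the sign
    encodes strand. Reading the same chromosome from the opposite strand
    turns the adjacency (a, b) into (-b, -a), so both are stored under
    one canonical key.
    """
    adj = set()
    for a, b in zip(genome, genome[1:]):
        adj.add(min((a, b), (-b, -a)))
    return adj


def breakpoint_distance(g1, g2):
    """Count adjacencies of g1 that are not conserved in g2.

    Only valid when both genomes contain the same genes, each exactly once
    (the duplicate-free case). Chromosome ends are ignored in this sketch.
    """
    genes1 = sorted(abs(g) for g in g1)
    genes2 = sorted(abs(g) for g in g2)
    if genes1 != genes2 or len(set(genes1)) != len(genes1):
        raise ValueError("genomes must share the same genes, each exactly once")
    return len(adjacencies(g1) - adjacencies(g2))


if __name__ == "__main__":
    reference = [1, 2, 3, 4, 5]
    inverted = [1, -3, -2, 4, 5]  # genes 2..3 inverted relative to reference
    print(breakpoint_distance(reference, inverted))
```

In this toy example the inversion breaks the adjacencies on either side of the inverted segment while the adjacency inside it is conserved. Once genes are duplicated, a matching between homologous copies has to be chosen first, which is the hard problem the abstract addresses.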
Affiliation(s)
- Mingfu Shao
- Laboratory for Computational Biology and Bioinformatics, School of Computer and Communication Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania
| | - Bernard M E Moret
- Laboratory for Computational Biology and Bioinformatics, School of Computer and Communication Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| |
|