1
de Oliveira AM, Souza GM, Toma GA, Dos Santos N, Dos Santos RZ, Goes CAG, Deon GA, Setti PG, Porto-Foresti F, Utsunomia R, Gunski RJ, Del Valle Garnero A, Herculano Correa de Oliveira E, Kretschmer R, Cioffi MDB. Satellite DNAs, heterochromatin, and sex chromosomes of the wattled jacana (Charadriiformes; Jacanidae): a species with highly rearranged karyotype. Genome 2024; 67:109-118. [PMID: 38316150 DOI: 10.1139/gen-2023-0082] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/07/2024]
Abstract
Charadriiformes, which comprises shorebirds and their relatives, is one of the most diverse avian orders, with over 390 species showing a wide range of karyotypes. Here, we isolated and characterized the whole collection of satellite DNAs (satDNAs), at both the molecular and cytogenetic levels, of one of its representative species, the wattled jacana (Jacana jacana), which has a typical ZZ/ZW sex chromosome system and a highly rearranged karyotype. In addition, we investigated the in situ location of telomeric and microsatellite repeats. A small catalog of 11 satDNAs was identified that typically accumulated on microchromosomes and on the W chromosome. The latter also showed a significant accumulation of telomeric signals, with (GA)10 being the only microsatellite with positive hybridization signals among the 16 tested. These findings contribute to our understanding of the genomic organization of repetitive DNAs in a bird species with a high degree of chromosomal reorganization, in contrast to the majority of bird species, which have stable karyotypes.
Affiliation(s)
- Alan Moura de Oliveira
- Departamento de Genética e Evolução, Universidade Federal de São Carlos, São Carlos, São Paulo, Brazil
- Guilherme Mota Souza
- Departamento de Genética e Evolução, Universidade Federal de São Carlos, São Carlos, São Paulo, Brazil
- Gustavo Akira Toma
- Departamento de Genética e Evolução, Universidade Federal de São Carlos, São Carlos, São Paulo, Brazil
- Geize Aparecida Deon
- Departamento de Genética e Evolução, Universidade Federal de São Carlos, São Carlos, São Paulo, Brazil
- Princia Grejo Setti
- Departamento de Genética e Evolução, Universidade Federal de São Carlos, São Carlos, São Paulo, Brazil
- Rafael Kretschmer
- Departamento de Ecologia, Zoologia e Genética, Instituto de Biologia, Universidade Federal de Pelotas, Pelotas, Rio Grande do Sul, Brazil
- Marcelo de Bello Cioffi
- Departamento de Genética e Evolução, Universidade Federal de São Carlos, São Carlos, São Paulo, Brazil
2
Kretschmer R, Toma GA, Deon GA, dos Santos N, dos Santos RZ, Utsunomia R, Porto-Foresti F, Gunski RJ, Garnero ADV, Liehr T, de Oliveira EHC, de Freitas TRO, Cioffi MDB. Satellitome Analysis in the Southern Lapwing (Vanellus chilensis) Genome: Implications for SatDNA Evolution in Charadriiform Birds. Genes (Basel) 2024; 15:258. [PMID: 38397247 PMCID: PMC10887557 DOI: 10.3390/genes15020258] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2024] [Revised: 02/14/2024] [Accepted: 02/16/2024] [Indexed: 02/25/2024] Open
Abstract
Vanellus (Charadriidae; Charadriiformes) comprises around 20 species commonly referred to as lapwings. In this study, by integrating cytogenetic and genomic approaches, we assessed the satellite DNA (satDNA) composition of one typical species, Vanellus chilensis, with a highly conserved karyotype. We additionally underlined its role in the evolution, structure, and differentiation process of the present ZW sex chromosome system. Seven distinct satellite DNA families were identified within its genome, accumulating on the centromeres, microchromosomes, and the W chromosome. However, these identified satellite DNA families were not found in two other Charadriiformes members, namely Jacana jacana and Calidris canutus. The hybridization of microsatellite sequences revealed the presence of a few repetitive sequences in V. chilensis, with only two out of sixteen displaying positive hybridization signals. Overall, our results contribute to understanding the genomic organization and satDNA evolution in Charadriiform birds.
Affiliation(s)
- Rafael Kretschmer
- Departamento de Ecologia, Zoologia e Genética, Universidade Federal de Pelotas, Pelotas 96010-900, RS, Brazil
- Gustavo A. Toma
- Departamento de Genética e Evolução, Universidade Federal de São Carlos, São Carlos 13565-905, SP, Brazil
- Geize Aparecida Deon
- Departamento de Genética e Evolução, Universidade Federal de São Carlos, São Carlos 13565-905, SP, Brazil
- Natalia dos Santos
- Faculdade de Ciências, Universidade Estadual Paulista, Bauru 13506-900, SP, Brazil
- Rodrigo Zeni dos Santos
- Faculdade de Ciências, Universidade Estadual Paulista, Bauru 13506-900, SP, Brazil
- Ricardo Utsunomia
- Faculdade de Ciências, Universidade Estadual Paulista, Bauru 13506-900, SP, Brazil
- Fabio Porto-Foresti
- Faculdade de Ciências, Universidade Estadual Paulista, Bauru 13506-900, SP, Brazil
- Ricardo José Gunski
- Laboratório de Diversidade Genética Animal, Universidade Federal do Pampa, São Gabriel 97300-162, RS, Brazil
- Analía Del Valle Garnero
- Laboratório de Diversidade Genética Animal, Universidade Federal do Pampa, São Gabriel 97300-162, RS, Brazil
- Thomas Liehr
- Institute of Human Genetics, Friedrich Schiller University, University Hospital Jena, 07747 Jena, Germany
- Edivaldo Herculano Corrêa de Oliveira
- Laboratório de Citogenômica e Mutagênese Ambiental, Seção de Meio Ambiente, Instituto Evandro Chagas, Ananindeua 67030-000, PA, Brazil
- Instituto de Ciências Exatas e Naturais, Universidade Federal do Pará, Belém 66075-110, PA, Brazil
- Thales Renato Ochotorena de Freitas
- Laboratório de Citogenética e Evolução, Departamento de Genética, Instituto de Biociências, Universidade Federal do Rio Grande do Sul, Porto Alegre 91509-900, RS, Brazil
- Marcelo de Bello Cioffi
- Departamento de Genética e Evolução, Universidade Federal de São Carlos, São Carlos 13565-905, SP, Brazil
3
Koga A, Nishihara H, Tanabe H, Tanaka R, Kayano R, Matsumoto S, Endo T, Srikulnath K, O'Neill RJ. Kangaroo endogenous retrovirus (KERV) forms megasatellite DNA with a simple repetition pattern in which the provirus structure is retained. Virology 2023; 586:56-66. [PMID: 37487326 DOI: 10.1016/j.virol.2023.07.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Revised: 07/07/2023] [Accepted: 07/10/2023] [Indexed: 07/26/2023]
Abstract
The kangaroo endogenous retrovirus (KERV) was previously reported to have undergone a rapid copy number increase in the red-necked wallaby; however, the mode of amplification remained to be clarified. The present study revealed that the long terminal repeat (LTR) (0.6 kb) and internal region (2.0 kb) of a provirus are repeated alternately, forming megasatellite DNA, which we named kervRep. This repetition pattern was the same as that observed for walbRep, megasatellite DNA originating from another endogenous retrovirus. Their formation process can be explained using a simple model: pairing slippage followed by homologous recombination. In this model, the initial step is triggered by the presence of two identical sequences within a short distance; the possession of LTRs by endogenous retroviruses fulfills this condition. The discovery of two cases suggests that formation of this type of satellite DNA is one of the non-negligible effects of endogenous retroviruses on their host genomes.
Affiliation(s)
- Akihiko Koga
- Center for Evolutionary Origins of Human Behavior, Kyoto University, Inuyama 484-8506, Japan; Animal Genomics and Bioresource Research Unit, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand.
- Hidenori Nishihara
- School of Life Science and Technology, Tokyo Institute of Technology, Yokohama 226-8503, Japan
- Hideyuki Tanabe
- Research Center for Integrative Evolutionary Science, SOKENDAI (The Graduate University for Advanced Studies), Hayama 240-0193, Japan
- Rieko Tanaka
- Saitama Children's Zoo, Higashimatsuyama 355-0065, Japan
- Rika Kayano
- Saitama Children's Zoo, Higashimatsuyama 355-0065, Japan
- Kornsorn Srikulnath
- Animal Genomics and Bioresource Research Unit, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand
- Rachel J O'Neill
- Institute for Systems Genomics and Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269, USA
4
Peona V, Kutschera VE, Blom MPK, Irestedt M, Suh A. Satellite DNA evolution in Corvoidea inferred from short and long reads. Mol Ecol 2023; 32:1288-1305. [PMID: 35488497 DOI: 10.1111/mec.16484] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 12/08/2021] [Revised: 04/11/2022] [Accepted: 04/17/2022] [Indexed: 11/29/2022]
Abstract
Satellite DNA (satDNA) is a fast-evolving portion of eukaryotic genomes. The homogeneous and repetitive nature of satDNA causes problems during genome assembly, so it remains difficult to study in detail in nonmodel organisms and across broad evolutionary timescales. Here, we combined short- and long-read data to explore the diversity and evolution of satDNA between individuals of the same species and between genera of birds spanning ~40 million years of bird evolution, using birds-of-paradise (Paradisaeidae) and crow (Corvus) species. These avian species highlighted the presence of a GC-rich Corvoidea satellitome composed of 61 satellite families and provided a set of candidate centromeric satDNA monomers on the basis of length, abundance, homogeneity and transcription. Surprisingly, we found that the satDNA of crow species diverged rapidly between closely related species, while the satDNA appeared more similar between birds-of-paradise species belonging to different genera.
Affiliation(s)
- Valentina Peona
- Department of Organismal Biology - Systematic Biology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Verena E Kutschera
- Department of Biochemistry and Biophysics, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Stockholm University, Solna, Sweden
- Mozes P K Blom
- Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden; Museum für Naturkunde, Leibniz Institut für Evolutions- und Biodiversitätsforschung, Berlin, Germany
- Martin Irestedt
- Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
- Alexander Suh
- Department of Organismal Biology - Systematic Biology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden; School of Biological Sciences-Organisms and the Environment, University of East Anglia, Norwich, UK
5
Zhang S, Pei Z, Lei C, Zhu S, Deng K, Zhou J, Yang J, Lu D, Sun X, Xu C, Xu C. Detection of cryptic balanced chromosomal rearrangements using high-resolution optical genome mapping. J Med Genet 2023; 60:274-284. [PMID: 35710108 DOI: 10.1136/jmedgenet-2022-108553] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 03/07/2022] [Accepted: 05/28/2022] [Indexed: 12/24/2022]
Abstract
BACKGROUND: Chromosomal rearrangements have profound consequences in diverse human genetic diseases. Currently, the detection of balanced chromosomal rearrangements (BCRs) mainly relies on routine cytogenetic G-banded karyotyping. However, cryptic BCRs are hard to detect by karyotyping, and the risk of miscarriage or delivering abnormal offspring with congenital malformations in carrier couples is significantly increased. In the present study, we aimed to investigate the potential of single-molecule optical genome mapping (OGM) in unravelling cryptic chromosomal rearrangements. METHODS: Eleven couples with normal karyotypes that had abortions/affected offspring with unbalanced rearrangements were enrolled. Ultra-high-molecular-weight DNA was isolated from peripheral blood cells and processed via OGM. Genome assembly was performed, followed by variant calling and annotation. Meanwhile, multiple detection strategies, including FISH, long-range-PCR amplicon-based next-generation sequencing and Sanger sequencing, were implemented to confirm the results obtained from OGM. RESULTS: High-resolution OGM successfully detected cryptic reciprocal translocation in all recruited couples, which was consistent with the results of FISH and sequencing. All high-confidence cryptic chromosomal translocations detected by OGM were confirmed by sequencing analysis of rearrangement breakpoints. Moreover, OGM revealed additional complex rearrangement events such as inverted aberrations, further refining potential genetic interpretation. CONCLUSION: To the best of our knowledge, this is the first study wherein OGM facilitates the rapid and robust detection of cryptic chromosomal reciprocal translocations in clinical practice. Given its excellent performance, our findings suggest that OGM is well qualified as an accurate, comprehensive and first-line method for detecting cryptic BCRs in routine clinical testing.
Affiliation(s)
- Shuo Zhang
- Shanghai Ji Ai Genetics & IVF Institute, Shanghai Key Laboratory of Female Reproductive Endocrine Related Diseases, Obstetrics and Gynecology Hospital of Fudan University, Shanghai, China
- Zhenle Pei
- Shanghai Ji Ai Genetics & IVF Institute, Shanghai Key Laboratory of Female Reproductive Endocrine Related Diseases, Obstetrics and Gynecology Hospital of Fudan University, Shanghai, China
- Caixia Lei
- Shanghai Ji Ai Genetics & IVF Institute, Shanghai Key Laboratory of Female Reproductive Endocrine Related Diseases, Obstetrics and Gynecology Hospital of Fudan University, Shanghai, China
- Saijuan Zhu
- Shanghai Ji Ai Genetics & IVF Institute, Shanghai Key Laboratory of Female Reproductive Endocrine Related Diseases, Obstetrics and Gynecology Hospital of Fudan University, Shanghai, China
- Ke Deng
- Shanghai Ji Ai Genetics & IVF Institute, Shanghai Key Laboratory of Female Reproductive Endocrine Related Diseases, Obstetrics and Gynecology Hospital of Fudan University, Shanghai, China
- Jing Zhou
- Shanghai Ji Ai Genetics & IVF Institute, Shanghai Key Laboratory of Female Reproductive Endocrine Related Diseases, Obstetrics and Gynecology Hospital of Fudan University, Shanghai, China
- Jingmin Yang
- State Key Laboratory of Genetic Engineering, School of Life Science, Fudan University, Shanghai, China; NHC Key Laboratory of Birth Defects and Reproductive Health, Chongqing Key Laboratory of Birth Defects and Reproductive Health, Chongqing Population and Family Planning, Science and Technology Research Institute, Chongqing, China
- Daru Lu
- State Key Laboratory of Genetic Engineering, School of Life Science, Fudan University, Shanghai, China; NHC Key Laboratory of Birth Defects and Reproductive Health, Chongqing Key Laboratory of Birth Defects and Reproductive Health, Chongqing Population and Family Planning, Science and Technology Research Institute, Chongqing, China
- Xiaoxi Sun
- Shanghai Ji Ai Genetics & IVF Institute, Shanghai Key Laboratory of Female Reproductive Endocrine Related Diseases, Obstetrics and Gynecology Hospital of Fudan University, Shanghai, China
- Chenming Xu
- Shanghai Ji Ai Genetics & IVF Institute, Shanghai Key Laboratory of Female Reproductive Endocrine Related Diseases, Obstetrics and Gynecology Hospital of Fudan University, Shanghai, China
- Congjian Xu
- Shanghai Ji Ai Genetics & IVF Institute, Shanghai Key Laboratory of Female Reproductive Endocrine Related Diseases, Obstetrics and Gynecology Hospital of Fudan University, Shanghai, China
6
Koga A, Hashimoto K, Honda Y, Nishihara H. Marsupial genome analysis suggests that satellite DNA formation from walb endogenous retrovirus is an event specific to the red-necked wallaby. Genes Cells 2023; 28:149-155. [PMID: 36527312 DOI: 10.1111/gtc.12999] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 10/18/2022] [Revised: 11/29/2022] [Accepted: 12/11/2022] [Indexed: 12/23/2022]
Abstract
We recently identified walbRep, a satellite DNA residing in the genome of the red-necked wallaby Notamacropus rufogriseus. It originates from the walb endogenous retrovirus and is organized in a manner in which the provirus structure is retained. The walbRep repeat units feature an average pairwise nucleotide identity as high as 99.5%, raising the possibility of a recent origin. The tammar wallaby N. eugenii is a species estimated to have diverged from the red-necked wallaby 2-3 million years ago. In PCR analyses of these two and other related species, walbRep-specific fragment amplification was observed only in the red-necked wallaby. Sequence database searches for the tammar wallaby resulted in sequence alignment lists that were sufficiently powerful to exclude the possibility of walbRep existence. These results suggested that walbRep formation occurred in the red-necked wallaby lineage after its divergence from the tammar wallaby lineage, and thus within a time span of at most 3 million years.
Affiliation(s)
- Akihiko Koga
- Center for Evolutionary Origins of Human Behavior, Kyoto University, Inuyama, Japan
- Yusuke Honda
- Noichi Zoological Park of Kochi Prefecture, Konan, Japan
- Hidenori Nishihara
- School of Life Science and Technology, Tokyo Institute of Technology, Yokohama, Japan
7
Lundberg M, Mackintosh A, Petri A, Bensch S. Inversions maintain differences between migratory phenotypes of a songbird. Nat Commun 2023; 14:452. [PMID: 36707538 PMCID: PMC9883250 DOI: 10.1038/s41467-023-36167-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2021] [Accepted: 01/18/2023] [Indexed: 01/28/2023] Open
Abstract
Structural rearrangements have been shown to be important in local adaptation and speciation, but have been difficult to reliably identify and characterize in non-model species. Here we combine long reads, linked reads and optical mapping to characterize three divergent chromosome regions in the willow warbler Phylloscopus trochilus, of which two are associated with differences in migration and one with an environmental gradient. We show that there are inversions (0.4-13 Mb) in each of the regions and that the divergence times between inverted and non-inverted haplotypes are similar across the regions (~1.2 Myrs), which is compatible with a scenario where inversions arose in either of two allopatric populations that subsequently hybridized. The improved genomes allow us to detect additional functional differences in the divergent regions, providing candidate genes for migration and adaptations to environmental gradients.
Affiliation(s)
- Max Lundberg
- Department of Biology, Lund University, Lund, Sweden.
- Anna Petri
- Science for Life Laboratory, Uppsala Genome Center, Uppsala University, Uppsala, Sweden
8
Westerdahl H, Mellinger S, Sigeman H, Kutschera VE, Proux-Wéra E, Lundberg M, Weissensteiner M, Churcher A, Bunikis I, Hansson B, Wolf JBW, Strandh M. The genomic architecture of the passerine MHC region: high repeat content and contrasting evolutionary histories of single copy and tandemly duplicated MHC genes. Mol Ecol Resour 2022; 22:2379-2395. [PMID: 35348299 DOI: 10.1111/1755-0998.13614] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/11/2021] [Revised: 03/09/2022] [Accepted: 03/23/2022] [Indexed: 12/01/2022]
Abstract
The Major Histocompatibility Complex (MHC) is of central importance to the immune system, and an optimal MHC diversity is believed to maximize pathogen elimination. Birds show substantial variation in MHC diversity, ranging from few genes in most bird orders to very many genes in passerines. Our understanding of the evolutionary trajectories of the MHC in passerines is hampered by lack of data on genomic organization. Therefore, we assemble and annotate the MHC genomic region of the great reed warbler (Acrocephalus arundinaceus), using long-read sequencing and optical mapping. The MHC region is large (>5.5Mb), characterized by structural changes compared to hitherto investigated bird orders and shows higher repeat content than the genome average. These features were supported by analyses in three additional passerines. MHC genes in passerines are found in two different chromosomal arrangements, either as single copy MHC genes located among non-MHC genes, or as tandemly duplicated tightly linked MHC genes. Some single copy MHC genes are old and putative orthologs among species. In contrast tandemly duplicated MHC genes are monophyletic within species and have evolved by simultaneous gene duplication of several MHC genes. Structural differences in the MHC genomic region among bird orders seem substantial compared to mammals and have possibly been fuelled by clade-specific immune system adaptations. Our study provides methodological guidance in characterizing complex genomic regions, constitutes a resource for MHC research in birds, and calls for a revision of the general belief that avian MHC has a conserved gene order and small size compared to mammals.
Affiliation(s)
- Helena Westerdahl
- Molecular Ecology and Evolution Lab, Department of Biology, Lund University, Sölvegatan 37, SE-223 62, Lund, Sweden
- Samantha Mellinger
- Molecular Ecology and Evolution Lab, Department of Biology, Lund University, Sölvegatan 37, SE-223 62, Lund, Sweden
- Hanna Sigeman
- Molecular Ecology and Evolution Lab, Department of Biology, Lund University, Sölvegatan 37, SE-223 62, Lund, Sweden
- Verena E Kutschera
- Department of Biochemistry and Biophysics, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Stockholm University, Box 1031, SE-17121, Solna, Sweden
- Estelle Proux-Wéra
- Department of Biochemistry and Biophysics, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Stockholm University, Box 1031, SE-17121, Solna, Sweden
- Max Lundberg
- Molecular Ecology and Evolution Lab, Department of Biology, Lund University, Sölvegatan 37, SE-223 62, Lund, Sweden
- Matthias Weissensteiner
- Division of Evolutionary Biology, Faculty of Biology, LMU Munich, Grosshaderner Str. 2, 82152, Planegg-Martinsried, Germany
- Allison Churcher
- National Bioinformatics Infrastructure Sweden, Department of Molecular Biology, Umeå University, SE-901 87, Umeå, Sweden
- Ignas Bunikis
- Uppsala Genome Center, Science for Life Laboratory, Dept. of Immunology, Genetics and Pathology, Uppsala University, BMC, Box 815, SE-752 37, Uppsala, Sweden
- Bengt Hansson
- Molecular Ecology and Evolution Lab, Department of Biology, Lund University, Sölvegatan 37, SE-223 62, Lund, Sweden
- Jochen B W Wolf
- Division of Evolutionary Biology, Faculty of Biology, LMU Munich, Grosshaderner Str. 2, 82152, Planegg-Martinsried, Germany
- Maria Strandh
- Molecular Ecology and Evolution Lab, Department of Biology, Lund University, Sölvegatan 37, SE-223 62, Lund, Sweden
9
Gall-Duncan T, Sato N, Yuen RKC, Pearson CE. Advancing genomic technologies and clinical awareness accelerates discovery of disease-associated tandem repeat sequences. Genome Res 2022; 32:1-27. [PMID: 34965938 PMCID: PMC8744678 DOI: 10.1101/gr.269530.120] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Received: 07/29/2020] [Accepted: 11/29/2021] [Indexed: 11/25/2022]
Abstract
Expansions of gene-specific DNA tandem repeats (TRs), first described in 1991 as a disease-causing mutation in humans, are now known to cause >60 phenotypes, not just disease, and not only in humans. TRs are a common form of genetic variation with biological consequences, observed, so far, in humans, dogs, plants, oysters, and yeast. Repeat diseases show atypical clinical features, genetic anticipation, and multiple and partially penetrant phenotypes among family members. Discovery of disease-causing repeat expansion loci accelerated through technological advances in DNA sequencing and computational analyses. Between 2019 and 2021, 17 new disease-causing TR expansions were reported, totaling 63 TR loci (>69 diseases), with a likelihood of more discoveries, and in more organisms. Recent and historical lessons reveal that properly assessed clinical presentations, coupled with genetic and biological awareness, can guide discovery of disease-causing unstable TRs. We highlight critical but underrecognized aspects of TR mutations. Repeat motifs may not be present in current reference genomes but will be in forthcoming gapless long-read references. Repeat motif size can be a single nucleotide to kilobases/unit. At a given locus, repeat motif sequence purity can vary with consequence. Pathogenic repeats can be "insertions" within nonpathogenic TRs. Expansions, contractions, and somatic length variations of TRs can have clinical/biological consequences. TR instabilities occur in humans and other organisms. TRs can be epigenetically modified and/or chromosomal fragile sites. We discuss the expanding field of disease-associated TR instabilities, highlighting prospects, clinical and genetic clues, tools, and challenges for further discoveries of disease-causing TR instabilities and understanding their biological and pathological impacts: a vista that is about to expand.
Affiliation(s)
- Terence Gall-Duncan
- Program of Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario M5G 1L7, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 1A8, Canada
- Nozomu Sato
- Program of Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario M5G 1L7, Canada
- Ryan K C Yuen
- Program of Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario M5G 1L7, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 1A8, Canada
- Christopher E Pearson
- Program of Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario M5G 1L7, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 1A8, Canada
10
Mukherjee K, Dole-Muinos D, Ajayi A, Rossi M, Prosperi M, Boucher C. Finding Overlapping Rmaps via Clustering. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021; PP:1-1. [PMID: 34890332 DOI: 10.1109/tcbb.2021.3132534] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Optical mapping has been largely automated, and first produces single-molecule restriction maps, called Rmaps, which are assembled to generate genome-wide optical maps. Since the location and orientation of each Rmap are unknown, the first problem in the analysis of these data is finding related Rmaps, i.e., pairs of Rmaps that share the same orientation and have significant overlap in their genomic location. Although heuristics for identifying related Rmaps exist, they all require quantization of the data, which leads to a loss in precision. In this paper, we propose a clustering method based on Gaussian mixture modelling, which we refer to as OMclust, that finds overlapping Rmaps without quantization. Using both simulated and real datasets, we show that OMclust substantially improves the precision (from 48.3% to 73.3%) over state-of-the-art methods while also reducing CPU time and memory consumption. Further, we integrated OMclust into the error correction methods (Elmeri and Comet) to demonstrate the increase in the performance of these methods. When OMclust was combined with Comet to error correct Rmap data generated from human DNA, it was able to error correct close to 3x more Rmaps, and reduced the CPU time by more than 35x.
11
Kriegova E, Fillerova R, Minarik J, Savara J, Manakova J, Petrackova A, Dihel M, Balcarkova J, Krhovska P, Pika T, Gajdos P, Behalek M, Vasinek M, Papajik T. Whole-genome optical mapping of bone-marrow myeloma cells reveals association of extramedullary multiple myeloma with chromosome 1 abnormalities. Sci Rep 2021; 11:14671. [PMID: 34282158 PMCID: PMC8289962 DOI: 10.1038/s41598-021-93835-z] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 02/01/2021] [Accepted: 06/24/2021] [Indexed: 11/18/2022] Open
Abstract
Extramedullary disease (EMM) represents a rare, aggressive and mostly resistant phenotype of multiple myeloma (MM). EMM is frequently associated with high-risk cytogenetics, but their complex genomic architecture is largely unexplored. We used whole-genome optical mapping (Saphyr, Bionano Genomics) to analyse the genomic architecture of CD138+ cells isolated from bone-marrow aspirates from an unselected cohort of newly diagnosed patients with EMM (n = 4) and intramedullary MM (n = 7). Large intrachromosomal rearrangements (> 5 Mbp) within chromosome 1 were detected in all EMM samples. These rearrangements, predominantly deletions with/without inversions, encompassed hundreds of genes and led to changes in the gene copy number on large regions of chromosome 1. Compared with intramedullary MM, EMM was characterised by more deletions (size range of 500 bp–50 kbp) and fewer interchromosomal translocations, and two EMM samples had copy number loss in the 17p13 region. Widespread genomic heterogeneity and novel aberrations in the high-risk IGH/IGK/IGL, 8q24 and 13q14 regions were detected in individual patients but were not specific to EMM/MM. Our pilot study revealed an association of chromosome 1 abnormalities in bone marrow myeloma cells with extramedullary progression. Optical mapping showed the potential for refining the complex genomic architecture in MM and its phenotypes.
Affiliation(s)
- Eva Kriegova
- Department of Immunology, Faculty of Medicine and Dentistry, Palacky University Olomouc and University Hospital Olomouc, Hnevotinska 3, 779 00, Olomouc, Czech Republic.
- Regina Fillerova
- Department of Immunology, Faculty of Medicine and Dentistry, Palacky University Olomouc and University Hospital Olomouc, Hnevotinska 3, 779 00, Olomouc, Czech Republic
| | - Jiri Minarik
- Department of Hemato-Oncology, Faculty of Medicine and Dentistry, Palacky University Olomouc and University Hospital Olomouc, Olomouc, Czech Republic
| | - Jakub Savara
- Department of Immunology, Faculty of Medicine and Dentistry, Palacky University Olomouc and University Hospital Olomouc, Hnevotinska 3, 779 00, Olomouc, Czech Republic.,Department of Computer Science, Faculty of Electrical Engineering and Computer Science, VŠB-Technical University of Ostrava, Ostrava, Czech Republic
| | - Jirina Manakova
- Department of Immunology, Faculty of Medicine and Dentistry, Palacky University Olomouc and University Hospital Olomouc, Hnevotinska 3, 779 00, Olomouc, Czech Republic
| | - Anna Petrackova
- Department of Immunology, Faculty of Medicine and Dentistry, Palacky University Olomouc and University Hospital Olomouc, Hnevotinska 3, 779 00, Olomouc, Czech Republic
| | - Martin Dihel
- Department of Immunology, Faculty of Medicine and Dentistry, Palacky University Olomouc and University Hospital Olomouc, Hnevotinska 3, 779 00, Olomouc, Czech Republic
| | - Jana Balcarkova
- Department of Hemato-Oncology, Faculty of Medicine and Dentistry, Palacky University Olomouc and University Hospital Olomouc, Olomouc, Czech Republic
| | - Petra Krhovska
- Department of Hemato-Oncology, Faculty of Medicine and Dentistry, Palacky University Olomouc and University Hospital Olomouc, Olomouc, Czech Republic
| | - Tomas Pika
- Department of Hemato-Oncology, Faculty of Medicine and Dentistry, Palacky University Olomouc and University Hospital Olomouc, Olomouc, Czech Republic
| | - Petr Gajdos
- Department of Computer Science, Faculty of Electrical Engineering and Computer Science, VŠB-Technical University of Ostrava, Ostrava, Czech Republic
| | - Marek Behalek
- Department of Computer Science, Faculty of Electrical Engineering and Computer Science, VŠB-Technical University of Ostrava, Ostrava, Czech Republic
| | - Michal Vasinek
- Department of Computer Science, Faculty of Electrical Engineering and Computer Science, VŠB-Technical University of Ostrava, Ostrava, Czech Republic
| | - Tomas Papajik
- Department of Hemato-Oncology, Faculty of Medicine and Dentistry, Palacky University Olomouc and University Hospital Olomouc, Olomouc, Czech Republic
| |
Collapse
|
12
|
Lopes M, Louzada S, Gama-Carvalho M, Chaves R. Genomic Tackling of Human Satellite DNA: Breaking Barriers through Time. Int J Mol Sci 2021; 22:4707. [PMID: 33946766 PMCID: PMC8125562 DOI: 10.3390/ijms22094707] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/31/2021] [Revised: 04/24/2021] [Accepted: 04/27/2021] [Indexed: 12/12/2022]
Abstract
(Peri)centromeric repetitive sequences and, more specifically, satellite DNA (satDNA) sequences, constitute a major human genomic component. SatDNA sequences can vary on a large number of features, including nucleotide composition, complexity, and abundance. Several satDNA families have been identified and characterized in the human genome through time, albeit at different speeds. Human satDNA families present a high degree of sub-variability, leading to the definition of various subfamilies with different organization and clustered localization. Evolution of satDNA analysis has enabled the progressive characterization of satDNA features. Despite recent advances in the sequencing of centromeric arrays, comprehensive genomic studies to assess their variability are still required to provide accurate and proportional representation of satDNA (peri)centromeric/acrocentric short arm sequences. Approaches combining multiple techniques have been successfully applied and seem to be the path to follow for generating integrated knowledge in the promising field of human satDNA biology.
Affiliation(s)
- Mariana Lopes
  - Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal
  - Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisbon, 1749-016 Lisbon, Portugal
- Sandra Louzada
  - Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal
  - Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisbon, 1749-016 Lisbon, Portugal
- Margarida Gama-Carvalho
  - Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisbon, 1749-016 Lisbon, Portugal
- Raquel Chaves
  - Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal
  - Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisbon, 1749-016 Lisbon, Portugal

13
Garg S. Computational methods for chromosome-scale haplotype reconstruction. Genome Biol 2021; 22:101. [PMID: 33845884 PMCID: PMC8040228 DOI: 10.1186/s13059-021-02328-9] [Citation(s) in RCA: 40] [Impact Index Per Article: 13.3] [Received: 01/10/2021] [Accepted: 03/25/2021] [Indexed: 12/13/2022]
Abstract
High-quality chromosome-scale haplotype sequences of diploid genomes, polyploid genomes, and metagenomes provide important insights into genetic variation associated with disease and biodiversity. However, whole-genome short-read sequencing does not directly yield haplotype information spanning whole chromosomes. Computational assembly of shorter haplotype fragments is required for haplotype reconstruction, which can be challenging owing to limited fragment lengths and high haplotype and repeat variability across genomes. Recent advancements in long-read and chromosome-scale sequencing technologies, alongside computational innovations, are improving the reconstruction of haplotypes at the level of whole chromosomes. Here, we review and discuss recent methodological progress and perspectives in these areas.
Affiliation(s)
- Shilpa Garg
  - Department of Biology, University of Copenhagen, Copenhagen, Denmark

14
Blom MPK. Opportunities and challenges for high-quality biodiversity tissue archives in the age of long-read sequencing. Mol Ecol 2021; 30:5935-5948. [PMID: 33786900 DOI: 10.1111/mec.15909] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 12/31/2020] [Revised: 03/06/2021] [Accepted: 03/22/2021] [Indexed: 12/11/2022]
Abstract
The technological ability to characterize genetic variation at a genome-wide scale provides an unprecedented opportunity to study the genetic underpinnings and evolutionary mechanisms that promote and sustain biodiversity. The transition from short- to long-read sequencing is particularly promising and allows a more holistic view on any changes in genetic diversity across time and space. Long-read sequencing has tremendous potential but sequencing success strongly depends on the long-range integrity of DNA molecules and therefore on the availability of high-quality tissue samples. With the scope of genomic experiments expanding and wild populations simultaneously disappearing at an unprecedented rate, access to high-quality samples may soon be a major concern for many projects. The need for high-quality biodiversity tissue archives is therefore urgent but sampling and preserving high-quality samples is not a trivial exercise. In this review, I will briefly outline how long-read sequencing can benefit the study of molecular ecology, how this will substantially increase the demand for high-quality tissues and why it is challenging to preserve DNA integrity. I will then provide an overview of preservation approaches and end with a call for support to acknowledge the efforts needed to assemble high-quality tissue archives. In doing so, I hope to simultaneously motivate field biologists to expand sampling practices and molecular biologists to develop (cost) efficient guidelines for the sampling and long-term storage of tissues. A concerted, interdisciplinary, effort is needed to catalogue the genetic variation underlying contemporary biodiversity and will eventually provide a critical resource for future studies.
Affiliation(s)
- Mozes P K Blom
  - Leibniz Institut für Evolutions- und Biodiversitätsforschung, Museum für Naturkunde, Berlin, Germany

15
McGurk MP, Dion-Côté AM, Barbash DA. Rapid evolution at the Drosophila telomere: transposable element dynamics at an intrinsically unstable locus. Genetics 2021; 217:iyaa027. [PMID: 33724410 PMCID: PMC8045721 DOI: 10.1093/genetics/iyaa027] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 10/07/2020] [Accepted: 12/03/2020] [Indexed: 12/26/2022]
Abstract
Drosophila telomeres have been maintained by three families of active transposable elements (TEs), HeT-A, TAHRE, and TART, collectively referred to as HTTs, for tens of millions of years, which contrasts with an unusually high degree of HTT interspecific variation. While the impacts of conflict and domestication are often invoked to explain HTT variation, the telomeres are unstable structures such that neutral mutational processes and evolutionary tradeoffs may also drive HTT evolution. We leveraged population genomic data to analyze nearly 10,000 HTT insertions in 85 Drosophila melanogaster genomes and compared their variation to other more typical TE families. We observe that occasional large-scale copy number expansions of both HTTs and other TE families occur, highlighting that the HTTs are, like their feral cousins, typically repressed but primed to take over given the opportunity. However, large expansions of HTTs are not caused by the runaway activity of any particular HTT subfamilies or even associated with telomere-specific TE activity, as might be expected if HTTs are in strong genetic conflict with their hosts. Rather than conflict, we instead suggest that distinctive aspects of HTT copy number variation and sequence diversity largely reflect telomere instability, with HTT insertions being lost at much higher rates than other TEs elsewhere in the genome. We extend previous observations that telomere deletions occur at a high rate, and surprisingly discover that more than one-third do not appear to have been healed with an HTT insertion. We also report that some HTT families may be preferentially activated by the erosion of whole telomeres, implying the existence of HTT-specific host control mechanisms. We further suggest that the persistent telomere localization of HTTs may reflect a highly successful evolutionary strategy that trades away a stable insertion site in order to have reduced impact on the host genome. 
We propose that HTT evolution is driven by multiple processes, with niche specialization and telomere instability being previously underappreciated and likely predominant.
Affiliation(s)
- Michael P McGurk
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
- Anne-Marie Dion-Côté
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
  - Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, SE-752 36 Uppsala, Sweden
- Daniel A Barbash
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA

16
Ottenburghs J, Geng K, Suh A, Kutter C. Genome Size Reduction and Transposon Activity Impact tRNA Gene Diversity While Ensuring Translational Stability in Birds. Genome Biol Evol 2021; 13:6127176. [PMID: 33533905 PMCID: PMC8044555 DOI: 10.1093/gbe/evab016] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Accepted: 01/22/2021] [Indexed: 12/12/2022]
Abstract
As a highly diverse vertebrate class, bird species have adapted to various ecological systems. How this phenotypic diversity can be explained genetically is intensively debated and is likely grounded in differences in the genome content. Larger and more complex genomes could allow for greater genetic regulation that results in more phenotypic variety. Surprisingly, avian genomes are much smaller compared to other vertebrates but contain as many protein-coding genes as other vertebrates. This supports the notion that the phenotypic diversity is largely determined by selection on non-coding gene sequences. Transfer RNAs (tRNAs) represent a group of non-coding genes. However, the characteristics of tRNA genes across bird genomes have remained largely unexplored. Here, we exhaustively investigated the evolution and functional consequences of these crucial translational regulators within bird species and across vertebrates. Our dense sampling of 55 avian genomes representing each bird order revealed an average of 169 tRNA genes with at least 31% being actively used. Unlike other vertebrates, avian tRNA genes are reduced in number and complexity but are still in line with vertebrate wobble pairing strategies and mutation-driven codon usage. Our detailed phylogenetic analyses further uncovered that new tRNA genes can emerge through multiplication by transposable elements. Together, this study provides the first comprehensive avian and cross-vertebrate tRNA gene analyses and demonstrates that tRNA gene evolution is flexible albeit constrained within functional boundaries of general mechanisms in protein translation.
Affiliation(s)
- Jente Ottenburghs
  - Department of Microbiology, Tumor and Cell Biology, Science for Life Laboratory, Karolinska Institute, Stockholm, Sweden
  - Department of Ecology and Genetics, Evolutionary Biology Centre, Science for Life Laboratory, Uppsala University, Sweden
- Keyi Geng
  - Department of Microbiology, Tumor and Cell Biology, Science for Life Laboratory, Karolinska Institute, Stockholm, Sweden
- Alexander Suh
  - Department of Ecology and Genetics, Evolutionary Biology Centre, Science for Life Laboratory, Uppsala University, Sweden
- Claudia Kutter
  - Department of Microbiology, Tumor and Cell Biology, Science for Life Laboratory, Karolinska Institute, Stockholm, Sweden

17
Peona V, Blom MPK, Xu L, Burri R, Sullivan S, Bunikis I, Liachko I, Haryoko T, Jønsson KA, Zhou Q, Irestedt M, Suh A. Identifying the causes and consequences of assembly gaps using a multiplatform genome assembly of a bird-of-paradise. Mol Ecol Resour 2021; 21:263-286. [PMID: 32937018 PMCID: PMC7757076 DOI: 10.1111/1755-0998.13252] [Citation(s) in RCA: 65] [Impact Index Per Article: 21.7] [Received: 04/14/2020] [Revised: 08/21/2020] [Accepted: 08/26/2020] [Indexed: 01/09/2023]
Abstract
Genome assemblies are currently being produced at an impressive rate by consortia and individual laboratories. The low costs and increasing efficiency of sequencing technologies now enable assembling genomes at unprecedented quality and contiguity. However, the difficulty in assembling repeat-rich and GC-rich regions (genomic "dark matter") limits insights into the evolution of genome structure and regulatory networks. Here, we compare the efficiency of currently available sequencing technologies (short/linked/long reads and proximity ligation maps) and combinations thereof in assembling genomic dark matter. By adopting different de novo assembly strategies, we compare individual draft assemblies to a curated multiplatform reference assembly and identify the genomic features that cause gaps within each assembly. We show that a multiplatform assembly implementing long-read, linked-read and proximity sequencing technologies performs best at recovering transposable elements, multicopy MHC genes, GC-rich microchromosomes and the repeat-rich W chromosome. Telomere-to-telomere assemblies are not a reality yet for most organisms, but by leveraging technology choice it is now possible to minimize genome assembly gaps for downstream analysis. We provide a roadmap to tailor sequencing projects for optimized completeness of both the coding and noncoding parts of nonmodel genomes.
Affiliation(s)
- Valentina Peona
  - Department of Ecology and Genetics - Evolutionary Biology, Science for Life Laboratories, Uppsala University, Uppsala, Sweden
  - Department of Organismal Biology - Systematic Biology, Science for Life Laboratories, Uppsala University, Uppsala, Sweden
- Mozes P. K. Blom
  - Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
  - Museum für Naturkunde, Leibniz Institut für Evolutions- und Biodiversitätsforschung, Berlin, Germany
- Luohao Xu
  - Department of Neurosciences and Developmental Biology, University of Vienna, Vienna, Austria
- Reto Burri
  - Department of Population Ecology, Institute of Ecology and Evolution, Friedrich-Schiller-University Jena, Jena, Germany
- Ignas Bunikis
  - Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala Genome Center, Uppsala University, Uppsala, Sweden
- Tri Haryoko
  - Research Centre for Biology, Museum Zoologicum Bogoriense, Indonesian Institute of Sciences (LIPI), Cibinong, Indonesia
- Knud A. Jønsson
  - Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark
- Qi Zhou
  - Department of Neurosciences and Developmental Biology, University of Vienna, Vienna, Austria
  - MOE Laboratory of Biosystems Homeostasis & Protection, Life Sciences Institute, Zhejiang University, Hangzhou, China
  - Center for Reproductive Medicine, The 2nd Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
- Martin Irestedt
  - Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
- Alexander Suh
  - Department of Ecology and Genetics - Evolutionary Biology, Science for Life Laboratories, Uppsala University, Uppsala, Sweden
  - Department of Organismal Biology - Systematic Biology, Science for Life Laboratories, Uppsala University, Uppsala, Sweden
  - School of Biological Sciences - Organisms and the Environment, University of East Anglia, Norwich, UK

18
Dussex N, Kutschera VE, Wiberg RAW, Parker DJ, Hunt GR, Gray RD, Rutherford K, Abe H, Fleischer RC, Ritchie MG, Rutz C, Wolf JBW, Gemmell NJ. A genome-wide investigation of adaptive signatures in protein-coding genes related to tool behaviour in New Caledonian and Hawaiian crows. Mol Ecol 2020; 30:973-986. [PMID: 33305388 DOI: 10.1111/mec.15775] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/28/2020] [Revised: 11/27/2020] [Accepted: 12/04/2020] [Indexed: 12/30/2022]
Abstract
Very few animals habitually manufacture and use tools. It has been suggested that advanced tool behaviour co-evolves with a suite of behavioural, morphological and life history traits. In fact, there are indications for such an adaptive complex in tool-using crows (genus Corvus species). Here, we sequenced the genomes of two habitually tool-using and ten non-tool-using crow species to search for genomic signatures associated with a tool-using lifestyle. Using comparative genomic and population genetic approaches, we screened for signals of selection in protein-coding genes in the tool-using New Caledonian and Hawaiian crows. While we detected signals of recent selection in New Caledonian crows near genes associated with bill morphology, our data indicate that genetic changes in these two lineages are surprisingly subtle, with little evidence at present for convergence. We explore the biological explanations for these findings, such as the relative roles of gene regulation and protein-coding changes, as well as the possibility that statistical power to detect selection in recently diverged lineages may have been insufficient. Our study contributes to a growing body of literature aiming to decipher the genetic basis of recently evolved complex behaviour.
Affiliation(s)
- Nicolas Dussex
  - Department of Anatomy, University of Otago, Dunedin, New Zealand
  - Department of Bioinformatics and Genetics, Centre for Palaeogenetics, Swedish Museum of Natural History, Stockholm, Sweden
- Verena E Kutschera
  - Department of Evolutionary Biology, Uppsala University, Uppsala, Sweden
  - Department of Biochemistry and Biophysics, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Stockholm University, Solna, Sweden
- R Axel W Wiberg
  - Centre for Biological Diversity, School of Biology, University of St Andrews, St Andrews, UK
  - Department of Environmental Sciences, Evolutionary Biology, University of Basel, Basel, Switzerland
- Darren J Parker
  - Centre for Biological Diversity, School of Biology, University of St Andrews, St Andrews, UK
  - Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
- Gavin R Hunt
  - University of Auckland, Science Centre 302, Auckland, New Zealand
- Russell D Gray
  - University of Auckland, Science Centre 302, Auckland, New Zealand
  - Max Planck Institute for the Science of Human History, Jena, Germany
- Kim Rutherford
  - Department of Anatomy, University of Otago, Dunedin, New Zealand
- Hideaki Abe
  - Department of Anatomy, University of Otago, Dunedin, New Zealand
  - Wildlife Research Center, Kyoto University, Kyoto, Japan
- Robert C Fleischer
  - Center for Conservation Genomics, Smithsonian Conservation Biology Institute, Washington, DC, USA
- Michael G Ritchie
  - Centre for Biological Diversity, School of Biology, University of St Andrews, St Andrews, UK
- Christian Rutz
  - Centre for Biological Diversity, School of Biology, University of St Andrews, St Andrews, UK
- Jochen B W Wolf
  - Department of Evolutionary Biology, Uppsala University, Uppsala, Sweden
  - Division of Evolutionary Biology, Faculty of Biology, LMU Munich, Planegg-Martinsried, Germany
- Neil J Gemmell
  - Department of Anatomy, University of Otago, Dunedin, New Zealand

19
Knief U, Forstmeier W, Pei Y, Wolf J, Kempenaers B. A test for meiotic drive in hybrids between Australian and Timor zebra finches. Ecol Evol 2020; 10:13464-13475. [PMID: 33304552 PMCID: PMC7713956 DOI: 10.1002/ece3.6951] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/28/2020] [Revised: 09/14/2020] [Accepted: 09/28/2020] [Indexed: 12/16/2022]
Abstract
Meiotic drivers have been proposed as a potent evolutionary force underlying genetic and phenotypic variation, genome structure, and also speciation. Due to their strong selective advantage, they are expected to rapidly spread through a population despite potentially detrimental effects on organismal fitness. Once fixed, autosomal drivers are cryptic within populations and only become visible in between-population crosses lacking the driver or corresponding suppressor. However, the assumed ubiquity of meiotic drivers has rarely been assessed in crosses between populations or species. Here we test for meiotic drive in hybrid embryos and offspring of Timor and Australian zebra finches, subspecies that have evolved in isolation for about two million years, using 38,541 informative transmissions of 56 markers linked to either centromeres or distal chromosome ends. We did not find evidence for meiotic driver loci on specific chromosomes. However, we observed a weak overall transmission bias toward Timor alleles at centromeres in females (transmission probability of Australian alleles of 47%, nominal p = 6 × 10⁻⁵). While this is in line with the centromere drive theory, it goes against the expectation that the subspecies with the larger effective population size (i.e., the Australian zebra finch) should have evolved the more potent meiotic drivers. We thus caution against interpreting our finding as definite evidence for centromeric drive. Yet, weak centromeric meiotic drivers may be more common than generally anticipated and we encourage further studies that are designed to also detect small-effect meiotic drivers.
Affiliation(s)
- Ulrich Knief
  - Department of Behavioural Ecology and Evolutionary Genetics, Max Planck Institute for Ornithology, Seewiesen, Germany
  - Division of Evolutionary Biology, Faculty of Biology, Ludwig Maximilian University of Munich, Planegg-Martinsried, Germany
- Wolfgang Forstmeier
  - Department of Behavioural Ecology and Evolutionary Genetics, Max Planck Institute for Ornithology, Seewiesen, Germany
- Yifan Pei
  - Department of Behavioural Ecology and Evolutionary Genetics, Max Planck Institute for Ornithology, Seewiesen, Germany
- Jochen Wolf
  - Division of Evolutionary Biology, Faculty of Biology, Ludwig Maximilian University of Munich, Planegg-Martinsried, Germany
- Bart Kempenaers
  - Department of Behavioural Ecology and Evolutionary Genetics, Max Planck Institute for Ornithology, Seewiesen, Germany

20
Kim SB, Karre S, Wu Q, Park M, Meyers E, Claeys H, Wisser R, Jackson D, Balint-Kurti P. Multiple insertions of COIN, a novel maize Foldback transposable element, in the Conring gene cause a spontaneous progressive cell death phenotype. Plant J 2020; 104:581-595. [PMID: 32748440 DOI: 10.1111/tpj.14945] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2020] [Revised: 07/16/2020] [Accepted: 07/21/2020] [Indexed: 06/11/2023]
Abstract
Similar progressive leaf lesion phenotypes, named conring for "concentric ring," were identified in 10 independently derived maize lines. Complementation and mapping experiments indicated that the phenotype had the same genetic basis in each line: a single recessive gene located in a 1.1-Mb region on chromosome 2. Among the 15 predicted genes in this interval, Zm00001d003866 (subsequently renamed Conring or Cnr) had insertions of four related 138-bp transposable element (TE) sequences at precisely the same site in exon 4 in nine of the 10 cnr alleles. The 10th cnr allele had a distinct insertion of 226 bp in exon 3. Genetic evidence suggested that the 10 cnr alleles were independently derived, and arose during the derivation of each line. The four TEs, named COINa (for COnring INsertion) through COINd, have not been previously characterized and consist entirely of imperfect 69-bp terminal inverted repeats characteristic of the Foldback class of TEs. They belong to three clades of a family of maize TEs comprising hundreds of sequences in the genome of the B73 maize line. COIN elements preferentially insert at TNA sequences with a preference for C and G nucleotides in the immediately flanking 5' and 3' regions, respectively. They produce a three-base target site duplication and do not have homology to other characterized TEs. We propose that Cnr is an unstable gene that is mutated insertionally at high frequency, most commonly due to COIN element insertions at a specific site in the gene.
Affiliation(s)
- Saet-Byul Kim
  - Department of Entomology and Plant Pathology, NC State University, Raleigh, NC, USA
- Shailesh Karre
  - Department of Entomology and Plant Pathology, NC State University, Raleigh, NC, USA
- Qingyu Wu
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
  - Institute of Agricultural Resources and Regional Planning, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
- Minkyu Park
  - Horticultural Sciences Department, University of Florida, 2550 Hull Rd, Gainesville, FL, 32611, USA
- Emily Meyers
  - Department of Entomology and Plant Pathology, NC State University, Raleigh, NC, USA
- Hannes Claeys
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
- Randall Wisser
  - Department of Plant and Soil Sciences, University of Delaware, Newark, DE, 19716, USA
- David Jackson
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
- Peter Balint-Kurti
  - Department of Entomology and Plant Pathology, NC State University, Raleigh, NC, USA
  - Plant Science Research Unit USDA-ARS, NC State University, Raleigh, NC, USA

21
Blommaert J. Genome size evolution: towards new model systems for old questions. Proc Biol Sci 2020; 287:20201441. [PMID: 32842932 PMCID: PMC7482279 DOI: 10.1098/rspb.2020.1441] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Received: 06/18/2020] [Accepted: 07/29/2020] [Indexed: 12/20/2022]
Abstract
Genome size (GS) variation is a fundamental biological characteristic; however, its evolutionary causes and consequences are the topic of ongoing debate. Whether GS is a neutral trait or one subject to selective pressures, and how strong these selective pressures are, may remain open questions. Fundamentally, the genomic sequences responsible for this variation directly impact the potential evolutionary outcomes and, equally, are the targets of different evolutionary pressures. For example, duplications and deletions of genic regions (large or small) can have immediate and drastic phenotypic effects, while an expansion or contraction of non-coding DNA is less likely to cause catastrophic phenotypic effects. However, in the long term, the accumulation or deletion of ncDNA is likely to have larger effects. Modern sequencing technologies are allowing for the dissection of these proximate causes, but a combination of these new technologies with more traditional evolutionary experiments and approaches could revolutionize this debate and potentially resolve many of these arguments. Here, I discuss an ambitious way forward for GS research, putting it in context of historical debates, theories and sometimes contradictory evidence, and highlighting the promise of combining new sequencing technologies and analytical developments with more traditional experimental evolution approaches.
Affiliation(s)
- Julie Blommaert
  - Department of Organismal Biology, Uppsala University, Uppsala, Sweden

22
Near-chromosome level genome assembly of the fruit pest Drosophila suzukii using long-read sequencing. Sci Rep 2020; 10:11227. [PMID: 32641717 PMCID: PMC7343843 DOI: 10.1038/s41598-020-67373-z] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 01/19/2020] [Accepted: 06/02/2020] [Indexed: 12/31/2022]
Abstract
Over the past decade, the spotted wing Drosophila, Drosophila suzukii, has invaded Europe and America and has become a major agricultural pest in these areas, thereby prompting intense research activities to better understand its biology. Two draft genome assemblies already exist for this species but contain pervasive assembly errors and are highly fragmented, which limits their value. Our purpose here was to improve the assembly of the D. suzukii genome and to annotate it in a way that facilitates comparisons with D. melanogaster. For this, we generated PacBio long-read sequencing data and assembled a novel, high-quality D. suzukii genome assembly. It is one of the largest Drosophila genomes, notably because of the expansion of its repeatome. We found that despite 16 rounds of full-sib crossings, the D. suzukii strain that we sequenced has maintained high levels of polymorphism in some regions of its genome. As a consequence, the quality of the assembly of these regions was reduced. We explored possible origins of this high residual diversity, including the presence of structural variants and a possible heterogeneous admixture pattern of North American and Asian ancestry. Overall, our assembly and annotation constitute a high-quality genomic resource that can be used both for high-throughput sequencing approaches and for manipulative genetic technologies to study D. suzukii.
|
23
|
Weissensteiner MH, Bunikis I, Catalán A, Francoijs KJ, Knief U, Heim W, Peona V, Pophaly SD, Sedlazeck FJ, Suh A, Warmuth VM, Wolf JBW. Discovery and population genomics of structural variation in a songbird genus. Nat Commun 2020; 11:3403. [PMID: 32636372 PMCID: PMC7341801 DOI: 10.1038/s41467-020-17195-4] [Citation(s) in RCA: 51] [Impact Index Per Article: 12.8] [Received: 12/06/2019] [Accepted: 06/16/2020] [Indexed: 02/07/2023] Open
Abstract
Structural variation (SV) constitutes an important type of genetic mutation, providing the raw material for evolution. Here, we uncover the genome-wide spectrum of intra- and interspecific SV segregating in natural populations of seven songbird species in the genus Corvus. Combining short-read (N = 127) and long-read re-sequencing (N = 31), as well as optical mapping (N = 16), we apply both assembly-based and read-mapping approaches to detect SV and characterize a total of 220,452 insertions, deletions and inversions. We exploit sampling across wide phylogenetic timescales to validate SV genotypes and assess the contribution of SV to evolutionary processes in an avian model of incipient speciation. We reveal an evolutionarily young (~530,000 years) cis-acting 2.25-kb LTR retrotransposon insertion reducing expression of the NDP gene with consequences for premating isolation. Our results attest to the wealth and evolutionary significance of SV segregating in natural populations and highlight the need for reliable SV genotyping.
Affiliation(s)
- Matthias H Weissensteiner
- Department of Evolutionary Biology and Science for Life Laboratory, Uppsala University, 752 36, Uppsala, Sweden.
- Division of Evolutionary Biology, Faculty of Biology, LMU Munich, Grosshaderner Str. 2, 82152, Planegg-Martinsried, Germany.
- Department of Biology, Pennsylvania State University, 310 Wartik Lab, University Park, PA, 16802, USA.
| | - Ignas Bunikis
- Uppsala Genome Center, Science for Life Laboratory, Department of Immunology, Genetics and Pathology, Uppsala University, BMC, Box 815, 752 37, Uppsala, Sweden
| | - Ana Catalán
- Division of Evolutionary Biology, Faculty of Biology, LMU Munich, Grosshaderner Str. 2, 82152, Planegg-Martinsried, Germany
| | | | - Ulrich Knief
- Division of Evolutionary Biology, Faculty of Biology, LMU Munich, Grosshaderner Str. 2, 82152, Planegg-Martinsried, Germany
| | - Wieland Heim
- Institute of Landscape Ecology, University of Münster, Heisenbergstrasse 2, 48149, Münster, Germany
| | - Valentina Peona
- Department of Evolutionary Biology and Science for Life Laboratory, Uppsala University, 752 36, Uppsala, Sweden
- Department of Organismal Biology - Systematic Biology, Uppsala University, 752 36, Uppsala, Sweden
| | - Saurabh D Pophaly
- Division of Evolutionary Biology, Faculty of Biology, LMU Munich, Grosshaderner Str. 2, 82152, Planegg-Martinsried, Germany
- Max Planck Institute for Plant Breeding Research, Carl-von-Linné-Weg 10, 50829, Cologne, Germany
| | - Fritz J Sedlazeck
- Human Genome Sequencing Center at Baylor College of Medicine, 1 Baylor Plaza, Houston, TX, 77030, USA
| | - Alexander Suh
- Department of Evolutionary Biology and Science for Life Laboratory, Uppsala University, 752 36, Uppsala, Sweden
- Department of Organismal Biology - Systematic Biology, Uppsala University, 752 36, Uppsala, Sweden
- School of Biological Sciences, University of East Anglia, Norwich Research Park, Norwich, NR4 7TU, UK
| | - Vera M Warmuth
- Division of Evolutionary Biology, Faculty of Biology, LMU Munich, Grosshaderner Str. 2, 82152, Planegg-Martinsried, Germany
| | - Jochen B W Wolf
- Department of Evolutionary Biology and Science for Life Laboratory, Uppsala University, 752 36, Uppsala, Sweden.
- Division of Evolutionary Biology, Faculty of Biology, LMU Munich, Grosshaderner Str. 2, 82152, Planegg-Martinsried, Germany.
| |
|
24
|
From molecules to populations: appreciating and estimating recombination rate variation. Nat Rev Genet 2020; 21:476-492. [DOI: 10.1038/s41576-020-0240-1] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Accepted: 04/15/2020] [Indexed: 02/07/2023]
|
25
|
Jayakumar V, Sakakibara Y. Comprehensive evaluation of non-hybrid genome assembly tools for third-generation PacBio long-read sequence data. Brief Bioinform 2020; 20:866-876. [PMID: 29112696 PMCID: PMC6585154 DOI: 10.1093/bib/bbx147] [Citation(s) in RCA: 51] [Impact Index Per Article: 12.8] [Received: 06/29/2017] [Revised: 09/22/2017] [Indexed: 12/20/2022] Open
Abstract
Long reads obtained from third-generation sequencing platforms can help overcome the long-standing challenge of the de novo assembly of sequences for the genomic analysis of non-model eukaryotic organisms. Numerous long-read-aided de novo assemblies have been published recently, which exhibited superior quality of the assembled genomes in comparison with those achieved using earlier second-generation sequencing technologies. Evaluating assemblies is important in guiding the appropriate choice for specific research needs. In this study, we evaluated 10 long-read assemblers using a variety of metrics on Pacific Biosciences (PacBio) data sets from different taxonomic categories with considerable differences in genome size. The results allowed us to narrow down the list to a few assemblers that can be effectively applied to eukaryotic assembly projects. Moreover, we highlight how best to use limited genomic resources for effectively evaluating the genome assemblies of non-model organisms.
|
26
|
Amarasinghe SL, Su S, Dong X, Zappia L, Ritchie ME, Gouil Q. Opportunities and challenges in long-read sequencing data analysis. Genome Biol 2020; 21:30. [PMID: 32033565 PMCID: PMC7006217 DOI: 10.1186/s13059-020-1935-5] [Citation(s) in RCA: 679] [Impact Index Per Article: 169.8] [Received: 08/28/2019] [Accepted: 01/15/2020] [Indexed: 12/11/2022] Open
Abstract
Long-read technologies are overcoming early limitations in accuracy and throughput, broadening their application domains in genomics. Dedicated analysis tools that take into account the characteristics of long-read data are thus required, but the fast pace of development of such tools can be overwhelming. To assist in the design and analysis of long-read sequencing projects, we review the current landscape of available tools and present an online interactive database, long-read-tools.org, to facilitate their browsing. We further focus on the principles of error correction, base modification detection, and long-read transcriptomics analysis and highlight the challenges that remain.
Affiliation(s)
- Shanika L. Amarasinghe
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, 3052 Australia
- Department of Medical Biology, The University of Melbourne, Parkville, 3010 Australia
| | - Shian Su
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, 3052 Australia
- Department of Medical Biology, The University of Melbourne, Parkville, 3010 Australia
| | - Xueyi Dong
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, 3052 Australia
- Department of Medical Biology, The University of Melbourne, Parkville, 3010 Australia
| | - Luke Zappia
- Bioinformatics, Murdoch Children’s Research Institute, Parkville, 3052 Australia
- School of Biosciences, Faculty of Science, The University of Melbourne, Parkville, 3010 Australia
| | - Matthew E. Ritchie
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, 3052 Australia
- Department of Medical Biology, The University of Melbourne, Parkville, 3010 Australia
- School of Mathematics and Statistics, The University of Melbourne, Parkville, 3010 Australia
| | - Quentin Gouil
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, 3052 Australia
- Department of Medical Biology, The University of Melbourne, Parkville, 3010 Australia
| |
|
27
|
Mugal CF, Kutschera VE, Botero-Castro F, Wolf JBW, Kaj I. Polymorphism Data Assist Estimation of the Nonsynonymous over Synonymous Fixation Rate Ratio ω for Closely Related Species. Mol Biol Evol 2020; 37:260-279. [PMID: 31504782 PMCID: PMC6984366 DOI: 10.1093/molbev/msz203] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Indexed: 12/24/2022] Open
Abstract
The ratio of nonsynonymous over synonymous sequence divergence, dN/dS, is a widely used estimate of the nonsynonymous over synonymous fixation rate ratio ω, which measures the extent to which natural selection modulates protein sequence evolution. Its computation is based on a phylogenetic approach and computes sequence divergence of protein-coding DNA between species, traditionally using a single representative DNA sequence per species. This approach ignores the presence of polymorphisms and relies on the indirect assumption that new mutations fix instantaneously, an assumption which is generally violated and reasonable only for distantly related species. The violation of the underlying assumption leads to a time-dependence of sequence divergence, and biased estimates of ω in particular for closely related species, where the contribution of ancestral and lineage-specific polymorphisms to sequence divergence is substantial. We here use a time-dependent Poisson random field model to derive an analytical expression of dN/dS as a function of divergence time and sample size. We then extend our framework to the estimation of the proportion of adaptive protein evolution α. This mathematical treatment enables us to show that the joint usage of polymorphism and divergence data can assist the inference of selection for closely related species. Moreover, our analytical results provide the basis for a protocol for the estimation of ω and α for closely related species. We illustrate the performance of this protocol by studying a population data set of four corvid species, which involves the estimation of ω and α at different time-scales and for several choices of sample sizes.
Affiliation(s)
- Carina F Mugal
- Department of Ecology and Genetics, Uppsala University, Uppsala, Sweden
| | - Verena E Kutschera
- Department of Ecology and Genetics, Uppsala University, Uppsala, Sweden; Science for Life Laboratory, Stockholm University, Stockholm, Sweden; Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
| | - Fidel Botero-Castro
- Division of Evolutionary Biology, Faculty of Biology, LMU Munich, Planegg-Martinsried, Germany
| | - Jochen B W Wolf
- Department of Ecology and Genetics, Uppsala University, Uppsala, Sweden; Division of Evolutionary Biology, Faculty of Biology, LMU Munich, Planegg-Martinsried, Germany
| | - Ingemar Kaj
- Department of Mathematics, Uppsala University, Uppsala, Sweden
| |
|
28
|
Louzada S, Lopes M, Ferreira D, Adega F, Escudeiro A, Gama-Carvalho M, Chaves R. Decoding the Role of Satellite DNA in Genome Architecture and Plasticity-An Evolutionary and Clinical Affair. Genes (Basel) 2020; 11:E72. [PMID: 31936645 PMCID: PMC7017282 DOI: 10.3390/genes11010072] [Citation(s) in RCA: 37] [Impact Index Per Article: 9.3] [Received: 12/16/2019] [Revised: 12/29/2019] [Accepted: 01/08/2020] [Indexed: 12/11/2022] Open
Abstract
Repetitive DNA is a major organizational component of eukaryotic genomes, being intrinsically related to their architecture and evolution. Tandemly repeated satellite DNAs (satDNAs) can be found clustered in specific heterochromatin-rich chromosomal regions, building vital structures like functional centromeres, and also dispersed within euchromatin. Interestingly, despite their association with critical chromosomal structures, satDNAs are widely variable among species due to their high turnover rates. This dynamic behavior has been associated with genome plasticity and chromosome rearrangements, leading to the reshaping of genomes. Here we present the current knowledge regarding satDNAs in the light of new genomic technologies, and the challenges in the study of these sequences. Furthermore, we discuss how these sequences, together with other repeats, influence genome architecture, impacting its evolution and association with disease.
Affiliation(s)
- Sandra Louzada
- Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisboa, 1749-016 Lisbon, Portugal
| | - Mariana Lopes
- Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisboa, 1749-016 Lisbon, Portugal
| | - Daniela Ferreira
- Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisboa, 1749-016 Lisbon, Portugal
| | - Filomena Adega
- Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisboa, 1749-016 Lisbon, Portugal
| | - Ana Escudeiro
- Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisboa, 1749-016 Lisbon, Portugal
| | - Margarida Gama-Carvalho
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisboa, 1749-016 Lisbon, Portugal
| | - Raquel Chaves
- Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisboa, 1749-016 Lisbon, Portugal
| |
|
29
|
Vondrak T, Ávila Robledillo L, Novák P, Koblížková A, Neumann P, Macas J. Characterization of repeat arrays in ultra-long nanopore reads reveals frequent origin of satellite DNA from retrotransposon-derived tandem repeats. Plant J 2020; 101:484-500. [PMID: 31559657 PMCID: PMC7004042 DOI: 10.1111/tpj.14546] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Received: 07/03/2019] [Revised: 09/09/2019] [Accepted: 09/12/2019] [Indexed: 05/21/2023]
Abstract
Amplification of monomer sequences into long contiguous arrays is the main feature distinguishing satellite DNA from other tandem repeats, yet it is also the main obstacle in its investigation because these arrays are in principle difficult to assemble. Here we explore an alternative, assembly-free approach that utilizes ultra-long Oxford Nanopore reads to infer the length distribution of satellite repeat arrays, their association with other repeats and the prevailing sequence periodicities. Using the satellite DNA-rich legume plant Lathyrus sativus as a model, we demonstrated this approach by analyzing 11 major satellite repeats using a set of nanopore reads ranging from 30 to over 200 kb in length and representing 0.73× genome coverage. We found surprising differences between the analyzed repeats because only two of them were predominantly organized in long arrays typical for satellite DNA. The remaining nine satellites were found to be derived from short tandem arrays located within LTR-retrotransposons that occasionally expanded in length. While the corresponding LTR-retrotransposons were dispersed across the genome, this array expansion occurred mainly in the primary constrictions of the L. sativus chromosomes, which suggests that these genome regions are favourable for satellite DNA accumulation.
Affiliation(s)
- Tihana Vondrak
- Biology Centre, Czech Academy of Sciences, Branišovská 31, České Budějovice, CZ-37005, Czech Republic
- Faculty of Science, University of South Bohemia, České Budějovice, Czech Republic
| | - Laura Ávila Robledillo
- Biology Centre, Czech Academy of Sciences, Branišovská 31, České Budějovice, CZ-37005, Czech Republic
- Faculty of Science, University of South Bohemia, České Budějovice, Czech Republic
| | - Petr Novák
- Biology Centre, Czech Academy of Sciences, Branišovská 31, České Budějovice, CZ-37005, Czech Republic
| | - Andrea Koblížková
- Biology Centre, Czech Academy of Sciences, Branišovská 31, České Budějovice, CZ-37005, Czech Republic
| | - Pavel Neumann
- Biology Centre, Czech Academy of Sciences, Branišovská 31, České Budějovice, CZ-37005, Czech Republic
| | - Jiří Macas
- Biology Centre, Czech Academy of Sciences, Branišovská 31, České Budějovice, CZ-37005, Czech Republic
| |
|
30
|
Dhar R, Seethy A, Pethusamy K, Singh S, Rohil V, Purkayastha K, Mukherjee I, Goswami S, Singh R, Raj A, Srivastava T, Acharya S, Rajashekhar B, Karmakar S. De novo assembly of the Indian blue peacock (Pavo cristatus) genome using Oxford Nanopore technology and Illumina sequencing. Gigascience 2019; 8:5488106. [PMID: 31077316 PMCID: PMC6511069 DOI: 10.1093/gigascience/giz038] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 07/29/2018] [Revised: 09/30/2018] [Accepted: 03/18/2019] [Indexed: 01/23/2023] Open
Abstract
Background: The Indian peafowl (Pavo cristatus) is native to South Asia and is the national bird of India. Here we present a draft genome sequence of the male blue peacock using Illumina and Oxford Nanopore technology (ONT). Results: ONT sequencing gave ∼2.3-fold sequencing coverage, whereas Illumina generated 150–base pair paired-end sequence data at 284.6-fold coverage from 5 libraries. Subsequently, we generated a 0.915-gigabase pair de novo assembly of the peacock genome with a scaffold N50 of 0.23 megabase pairs (Mb). We predict that the peacock genome contains 23,153 protein-coding genes and 75.3 Mb (7.33%) of repetitive sequences. Conclusions: We report a high-quality assembly of the peacock genome using a hybrid approach of sequences generated by both Illumina and ONT. The long-read chemistry generated by ONT was useful for addressing challenges related to de novo assembly, particularly at regions containing repetitive sequences spanning longer than the read length, and which could not be resolved with only short-read–based assembly. Contig assembly of Illumina short reads gave an N50 of 1,639 bases, whereas with ONT, the N50 increased by >9-fold to 14,749 bases. The initial contig assembly based on Illumina sequencing reads alone gave 685,241 contigs. Further scaffolding on assembled contigs using both Illumina and ONT sequencing reads resulted in a final assembly of 15,025 super-scaffolds, with an N50 of ∼0.23 Mb. Ninety-five percent of proteins predicted by homology matched with those in a public repository, verifying the completeness of our assembly. Like other phylogenetic studies of avian conserved genes, we found P. cristatus to be most closely related to Gallus gallus, followed by Meleagris gallopavo and Anas platyrhynchos. Compared with the recently published peacock genome assembly, the current, superior, hybrid assembly has greater sequencing depth, fewer non-ATGC sequences, and fewer scaffolds.
Affiliation(s)
- Ruby Dhar
- Department of Biochemistry, Room 3020, AIIMS - All India Institute of Medical Sciences, Ansari Nagar, New Delhi 110029, India
| | - Ashikh Seethy
- Department of Biochemistry, Room 3020, AIIMS - All India Institute of Medical Sciences, Ansari Nagar, New Delhi 110029, India
| | - Karthikeyan Pethusamy
- Department of Biochemistry, Room 3020, AIIMS - All India Institute of Medical Sciences, Ansari Nagar, New Delhi 110029, India
| | - Sunil Singh
- Department of Biochemistry, Room 3020, AIIMS - All India Institute of Medical Sciences, Ansari Nagar, New Delhi 110029, India
| | - Vishwajeet Rohil
- Vallabhbhai Patel Chest Institute (VPCI), Delhi University, New Delhi 110007, India
| | - Kakali Purkayastha
- Vallabhbhai Patel Chest Institute (VPCI), Delhi University, New Delhi 110007, India
| | - Indrani Mukherjee
- Department of Biochemistry, Room 3020, AIIMS - All India Institute of Medical Sciences, Ansari Nagar, New Delhi 110029, India
| | - Sandeep Goswami
- Department of Biochemistry, Room 3020, AIIMS - All India Institute of Medical Sciences, Ansari Nagar, New Delhi 110029, India
| | - Rakesh Singh
- Kanpur Zoo, Hastings Ave, Azad Nagar, Nawabganj, Kanpur, Uttar Pradesh 208002, India
| | - Ankita Raj
- Department of Biochemistry, Room 3020, AIIMS - All India Institute of Medical Sciences, Ansari Nagar, New Delhi 110029, India
| | - Tryambak Srivastava
- Department of Biochemistry, Room 3020, AIIMS - All India Institute of Medical Sciences, Ansari Nagar, New Delhi 110029, India
| | - Sovon Acharya
- Department of Biochemistry, Room 3020, AIIMS - All India Institute of Medical Sciences, Ansari Nagar, New Delhi 110029, India
| | - Balaji Rajashekhar
- Institute of Computer Science, University of Tartu, J. Liivi, Tartu 50409, Estonia; Celixa, 19/1 Sankey Road, Bangalore 560020, India
| | - Subhradip Karmakar
- Department of Biochemistry, Room 3020, AIIMS - All India Institute of Medical Sciences, Ansari Nagar, New Delhi 110029, India
| |
|
31
|
Tørresen OK, Star B, Mier P, Andrade-Navarro MA, Bateman A, Jarnot P, Gruca A, Grynberg M, Kajava AV, Promponas VJ, Anisimova M, Jakobsen KS, Linke D. Tandem repeats lead to sequence assembly errors and impose multi-level challenges for genome and protein databases. Nucleic Acids Res 2019; 47:10994-11006. [PMID: 31584084 PMCID: PMC6868369 DOI: 10.1093/nar/gkz841] [Citation(s) in RCA: 153] [Impact Index Per Article: 30.6] [Received: 06/07/2019] [Revised: 09/03/2019] [Accepted: 10/01/2019] [Indexed: 12/13/2022] Open
Abstract
The widespread occurrence of repetitive stretches of DNA in genomes of organisms across the tree of life imposes fundamental challenges for sequencing, genome assembly, and automated annotation of genes and proteins. This multi-level problem can lead to errors in genome and protein databases that are often not recognized or acknowledged. As a consequence, end users working with sequences with repetitive regions are faced with 'ready-to-use' deposited data whose trustworthiness is difficult to determine, let alone to quantify. Here, we provide a review of the problems associated with tandem repeat sequences that originate from different stages during the sequencing-assembly-annotation-deposition workflow, and that may proliferate in public database repositories affecting all downstream analyses. As a case study, we provide examples of the Atlantic cod genome, whose sequencing and assembly were hindered by a particularly high prevalence of tandem repeats. We complement this case study with examples from other species, where mis-annotations and sequencing errors have propagated into protein databases. With this review, we aim to raise the awareness level within the community of database users, and alert scientists working in the underlying workflow of database creation that the data they omit or improperly assemble may well contain important biological information valuable to others.
Affiliation(s)
- Ole K Tørresen
- Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, NO-0316 Oslo, Norway
| | - Bastiaan Star
- Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, NO-0316 Oslo, Norway
| | - Pablo Mier
- Faculty of Biology, Johannes Gutenberg University Mainz, Hans-Dieter-Husch-Weg 15, 55128 Mainz, Germany
| | - Miguel A Andrade-Navarro
- Faculty of Biology, Johannes Gutenberg University Mainz, Hans-Dieter-Husch-Weg 15, 55128 Mainz, Germany
| | - Alex Bateman
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, CB10 1SD, UK
| | - Patryk Jarnot
- Institute of Informatics, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland
| | - Aleksandra Gruca
- Institute of Informatics, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland
| | - Marcin Grynberg
- Institute of Biochemistry and Biophysics PAS, Pawińskiego 5A, 02-106 Warsaw, Poland
| | - Andrey V Kajava
- Centre de Recherche en Biologie cellulaire de Montpellier, UMR 5237 CNRS, Université Montpellier, 1919 Route de Mende, CEDEX 5, 34293 Montpellier, France
- Institut de Biologie Computationnelle, 34095 Montpellier, France
| | - Vasilis J Promponas
- Bioinformatics Research Laboratory, Department of Biological Sciences, University of Cyprus, PO Box 20537, CY 1678 Nicosia, Cyprus
| | - Maria Anisimova
- Institute of Applied Simulations, School of Life Sciences and Facility Management, Zurich University of Applied Sciences (ZHAW), Wädenswil, Switzerland
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
| | - Kjetill S Jakobsen
- Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, NO-0316 Oslo, Norway
| | - Dirk Linke
- Section for Genetics and Evolutionary Biology, Department of Biosciences, University of Oslo, NO-0316 Oslo, Norway
| |
|
32
|
Bahbahani H, Musa HH, Wragg D, Shuiep ES, Almathen F, Hanotte O. Genome Diversity and Signatures of Selection for Production and Performance Traits in Dromedary Camels. Front Genet 2019; 10:893. [PMID: 31608121 PMCID: PMC6761857 DOI: 10.3389/fgene.2019.00893] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 01/17/2019] [Accepted: 08/23/2019] [Indexed: 12/20/2022] Open
Abstract
Dromedary camels (Camelus dromedarius) are single-humped animals found throughout the deserts of Africa, the Arabian Peninsula, and the southwest of Asia. This well-adapted species is mainly used for milk and meat production, although some specific types exhibit superior running performance and are used in racing competitions. However, neither performance nor production camels are bred under intensive genomic selection programs with specific aims to improve these traits. In this study, the full genome sequence data of six camels from the Arabian Peninsula and the genotyping-by-sequencing data of 44 camels (29 packing and 15 racing) from Sudan were analyzed to assess their genome diversities, relationships, and candidate signatures of positive selection. Genome ADMIXTURE and principal component analyses indicate clear geographic separation between the Sudanese and the Arabian Peninsula camels, but with no population-specific genetic distinction within populations. Camel samples from the Arabian Peninsula show higher mean heterozygosity (0.560 ± 0.003) than those from Sudan (0.347 ± 0.003). Analyses of signatures of selection, using a pooled heterozygosity (Hp) approach, in the Sudanese camels revealed 176, 189, and 308 candidate regions under positive selection in the combined, packing, and racing camel populations, respectively. These regions host genes that might be associated with adaptation to arid environments, dairy traits, energy homeostasis, and chondrogenesis. Eight regions show high genetic differentiation, based on Fst analysis, between the Sudanese packing and racing camel types. Genes associated with chondrogenesis, energy balance, and urinary system development were found within these regions. Our results advocate for further detailed investigation of the genome of the dromedary camel to identify and characterize genes and variants associated with their valuable phenotypic traits, the results of which may support the development of breeding programs to improve the production and performance traits of this unique domesticated species.
Affiliation(s)
- Hussain Bahbahani
- Department of Biological Sciences, Faculty of Science, Kuwait University, Kuwait City, Kuwait
| | - Hassan H Musa
- Department of Medical Microbiology, Faculty of Medical Laboratory Sciences, University of Khartoum, Khartoum North, Sudan
| | - David Wragg
- Centre for Tropical Livestock Genetics and Health, The Roslin Institute, Edinburgh, United Kingdom
| | - Eltahir S Shuiep
- Department of Animal Production, Faculty of Agricultural and Environmental Sciences, University of Gadarif, Gadarif State, Sudan
| | - Faisal Almathen
- Department of Public Health, College of Veterinary Medicine, King Faisal University, Al-Hasa, Saudi Arabia
| | - Olivier Hanotte
- LiveGene, International Livestock Research Institute (ILRI), Addis Ababa, Ethiopia
| |
|
33
|
|
34
|
Ellegren H, Wolf JBW. Parallelism in genomic landscapes of differentiation, conserved genomic features and the role of linked selection. J Evol Biol 2019; 30:1516-1518. [PMID: 28786191 DOI: 10.1111/jeb.13113] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 03/20/2017] [Revised: 05/01/2017] [Accepted: 05/03/2017] [Indexed: 01/01/2023]
Affiliation(s)
- H Ellegren
- Department of Evolutionary Biology, Uppsala University, Uppsala, Sweden
| | - J B W Wolf
- Department of Evolutionary Biology, Uppsala University, Uppsala, Sweden; Division of Evolutionary Biology, Faculty of Biology, Ludwig-Maximilians-Universität München, Planegg-Martinsried, Germany
| |
|
35
|
A Multireference-Based Whole Genome Assembly for the Obligate Ant-Following Antbird, Rhegmatorhina melanosticta (Thamnophilidae). Diversity (Basel) 2019. [DOI: 10.3390/d11090144] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 12/16/2022]
Abstract
Current generation high-throughput sequencing technology has facilitated the generation of more genomic-scale data than ever before, thus greatly improving our understanding of avian biology across a range of disciplines. Recent developments in linked-read sequencing (Chromium 10×) and reference-based whole-genome assembly offer an exciting prospect of more accessible chromosome-level genome sequencing in the near future. We sequenced and assembled a genome of the Hairy-crested Antbird (Rhegmatorhina melanosticta), which represents the first publicly available genome for any antbird (Thamnophilidae). Our objectives were to (1) assemble scaffolds to chromosome level based on multiple reference genomes, and report on differences relative to other genomes, (2) assess genome completeness and compare content to other related genomes, and (3) assess the suitability of linked-read sequencing technology for future studies in comparative phylogenomics and population genomics studies. Our R. melanosticta assembly was both highly contiguous (de novo scaffold N50 = 3.3 Mb, reference based N50 = 53.3 Mb) and relatively complete (contained close to 90% of evolutionarily conserved single-copy avian genes and known tetrapod ultraconserved elements). The high contiguity and completeness of this assembly enabled the genome to be successfully mapped to the chromosome level, which uncovered a consistent structural difference between R. melanosticta and other avian genomes. Our results are consistent with the observation that avian genomes are structurally conserved. Additionally, our results demonstrate the utility of linked-read sequencing for non-model genomics. Finally, we demonstrate the value of our R. melanosticta genome for future researchers by mapping reduced representation sequencing data, and by accurately reconstructing the phylogenetic relationships among a sample of thamnophilid species.
|
36
|
Jung H, Winefield C, Bombarely A, Prentis P, Waterhouse P. Tools and Strategies for Long-Read Sequencing and De Novo Assembly of Plant Genomes. TRENDS IN PLANT SCIENCE 2019; 24:700-724. [PMID: 31208890 DOI: 10.1016/j.tplants.2019.05.003] [Citation(s) in RCA: 49] [Impact Index Per Article: 9.8] [Received: 01/06/2019] [Revised: 05/01/2019] [Accepted: 05/10/2019] [Indexed: 05/16/2023]
Abstract
The commercial release of third-generation sequencing technologies (TGSTs), giving long and ultra-long sequencing reads, has stimulated the development of new tools for assembling highly contiguous genome sequences with unprecedented accuracy across complex repeat regions. We survey here a wide range of emerging sequencing platforms and analytical tools for de novo assembly, provide background information for each of their steps, and discuss the spectrum of available options. Our decision tree recommends workflows for the generation of a high-quality genome assembly when used in combination with the specific needs and resources of a project.
Affiliation(s)
- Hyungtaek Jung
- Centre for Tropical Crops and Biocommodities, Queensland University of Technology, Brisbane, QLD 4001, Australia.
| | - Christopher Winefield
- Department of Wine, Food, and Molecular Biosciences, Lincoln University, 7647 Christchurch, New Zealand
| | - Aureliano Bombarely
- Department of Bioscience, University of Milan, Milan 20133, Italy; School of Plants and Environmental Sciences, Virginia Tech, Blacksburg, VA 24061, USA
| | - Peter Prentis
- School of Earth, Environmental, and Biological Sciences, Queensland University of Technology, Brisbane, QLD, 4001, Australia
| | - Peter Waterhouse
- Centre for Tropical Crops and Biocommodities, Queensland University of Technology, Brisbane, QLD 4001, Australia; School of Biological Sciences, University of Sydney, Sydney, NSW 2006, Australia.
| |
|
37
|
Wallberg A, Bunikis I, Pettersson OV, Mosbech MB, Childers AK, Evans JD, Mikheyev AS, Robertson HM, Robinson GE, Webster MT. A hybrid de novo genome assembly of the honeybee, Apis mellifera, with chromosome-length scaffolds. BMC Genomics 2019; 20:275. [PMID: 30961563 PMCID: PMC6454739 DOI: 10.1186/s12864-019-5642-0] [Citation(s) in RCA: 120] [Impact Index Per Article: 24.0] [Received: 07/27/2018] [Accepted: 03/24/2019] [Indexed: 01/27/2023] Open
Abstract
Background: The ability to generate long sequencing reads and access long-range linkage information is revolutionizing the quality and completeness of genome assemblies. Here we use a hybrid approach that combines data from four genome sequencing and mapping technologies to generate a new genome assembly of the honeybee Apis mellifera. We first generated contigs based on PacBio sequencing libraries, which were then merged with linked-read 10x Chromium data followed by scaffolding using a BioNano optical genome map and a Hi-C chromatin interaction map, complemented by a genetic linkage map. Results: Each of the assembly steps reduced the number of gaps and incorporated a substantial amount of additional sequence into scaffolds. The new assembly (Amel_HAv3) is significantly more contiguous and complete than the previous one (Amel_4.5), based mainly on Sanger sequencing reads. N50 of contigs is 120-fold higher (5.381 Mbp compared to 0.053 Mbp) and we anchor > 98% of the sequence to chromosomes. All of the 16 chromosomes are represented as single scaffolds with an average of three sequence gaps per chromosome. The improvements are largely due to the inclusion of repetitive sequence that was unplaced in previous assemblies. In particular, our assembly is highly contiguous across centromeres and telomeres and includes hundreds of AvaI and AluI repeats associated with these features. Conclusions: The improved assembly will be of utility for refining gene models, studying genome function, mapping functional genetic variation, identification of structural variants, and comparative genomics.
Affiliation(s)
- Andreas Wallberg
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
| | - Ignas Bunikis
- Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
| | - Olga Vinnere Pettersson
- Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
| | - Mai-Britt Mosbech
- Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
| | - Anna K Childers
- USDA-ARS Insect Genetics and Biochemistry Research Unit, Fargo, ND, USA; USDA-ARS Bee Research Lab, Beltsville, MD, USA
| | - Jay D Evans
- USDA-ARS Bee Research Lab, Beltsville, MD, USA
| | | | - Hugh M Robertson
- Department of Entomology and Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, USA
| | - Gene E Robinson
- Department of Entomology and Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, USA
| | - Matthew T Webster
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden.
| |
|
38
|
Cirillo D, Valencia A. Big data analytics for personalized medicine. Curr Opin Biotechnol 2019; 58:161-167. [PMID: 30965188 DOI: 10.1016/j.copbio.2019.03.004] [Citation(s) in RCA: 71] [Impact Index Per Article: 14.2] [Received: 10/12/2018] [Revised: 02/22/2019] [Accepted: 03/01/2019] [Indexed: 01/06/2023]
Abstract
Big Data are radically changing biomedical research. The unprecedented advances in automated collection of large-scale molecular and clinical data pose major challenges to data analysis and interpretation, calling for the development of new computational approaches. The creation of powerful systems for the effective use of biomedical Big Data in Personalized Medicine (a.k.a. Precision Medicine) will require significant scientific and technical developments, including infrastructure, engineering, project and financial management. We review here how the evolution of data-driven methods offers the possibility to address many of these problems, guiding the formulation of hypotheses on systems functioning and the generation of mechanistic models, and facilitating the design of clinical procedures in Personalized Medicine.
Affiliation(s)
- Davide Cirillo
- Barcelona Supercomputing Center (BSC), C/Jordi Girona 29, 08034, Barcelona, Spain.
| | - Alfonso Valencia
- Barcelona Supercomputing Center (BSC), C/Jordi Girona 29, 08034, Barcelona, Spain; ICREA, Pg. Lluís Companys 23, 08010, Barcelona, Spain
| |
|
39
|
Kolmogorov M, Yuan J, Lin Y, Pevzner PA. Assembly of long, error-prone reads using repeat graphs. Nat Biotechnol 2019; 37:540-546. [DOI: 10.1038/s41587-019-0072-8] [Citation(s) in RCA: 1327] [Impact Index Per Article: 265.4] [Received: 05/14/2018] [Accepted: 02/06/2019] [Indexed: 01/02/2023]
|
40
|
Knief U, Bossu CM, Saino N, Hansson B, Poelstra J, Vijay N, Weissensteiner M, Wolf JBW. Epistatic mutations under divergent selection govern phenotypic variation in the crow hybrid zone. Nat Ecol Evol 2019; 3:570-576. [PMID: 30911146 PMCID: PMC6445362 DOI: 10.1038/s41559-019-0847-9] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Received: 10/09/2018] [Accepted: 02/18/2019] [Indexed: 12/22/2022]
Abstract
The evolution of genetic barriers opposing inter-specific gene flow is key to the origin of new species. Drawing from information of over 400 admixed genomes sourced from replicate transects across the European hybrid zone between all-black carrion crows and grey-coated hooded crows, we decipher the interplay between phenotypic divergence and selection at the molecular level. Over 68% of plumage variation was explained by epistasis between the gene NDP and a ~2.8 Mb region on chromosome 18 with suppressed recombination. Both pigmentation loci showed evidence for divergent selection resisting introgression. This study reveals how few, large-effect loci can govern prezygotic isolation and shield phenotypic divergence from gene flow.
Affiliation(s)
- Ulrich Knief
- Division of Evolutionary Biology, Faculty of Biology, LMU Munich, Munich, Germany
| | - Christen M Bossu
- Science for Life Laboratories and Department of Evolutionary Biology, Uppsala University, Uppsala, Sweden; Population Genetics, Department of Zoology, Stockholm University, Stockholm, Sweden; Institute of the Environment and Sustainability, Center for Tropical Research, University of California, Los Angeles, CA, USA
| | - Nicola Saino
- Department of Environmental Science and Policy, University of Milan, Milan, Italy
| | - Bengt Hansson
- Department of Biology, Lund University, Lund, Sweden
| | - Jelmer Poelstra
- Science for Life Laboratories and Department of Evolutionary Biology, Uppsala University, Uppsala, Sweden; Biology Department, Duke University, Durham, NC, USA
| | - Nagarjun Vijay
- Science for Life Laboratories and Department of Evolutionary Biology, Uppsala University, Uppsala, Sweden; Department of Biological Sciences, Indian Institute of Science Education and Research, Bhopal, India
| | - Matthias Weissensteiner
- Division of Evolutionary Biology, Faculty of Biology, LMU Munich, Munich, Germany; Science for Life Laboratories and Department of Evolutionary Biology, Uppsala University, Uppsala, Sweden
| | - Jochen B W Wolf
- Division of Evolutionary Biology, Faculty of Biology, LMU Munich, Munich, Germany; Science for Life Laboratories and Department of Evolutionary Biology, Uppsala University, Uppsala, Sweden.
| |
|
41
|
Manthey JD, Moyle RG, Boissinot S. Multiple and Independent Phases of Transposable Element Amplification in the Genomes of Piciformes (Woodpeckers and Allies). Genome Biol Evol 2018; 10:1445-1456. [PMID: 29850797 PMCID: PMC6007501 DOI: 10.1093/gbe/evy105] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Accepted: 05/22/2018] [Indexed: 12/15/2022] Open
Abstract
The small and conserved genomes of birds are likely a result of flight-related metabolic constraints. Recombination-driven deletions and minimal transposable element (TE) expansions have led to continually shrinking genomes during evolution of many lineages of volant birds. Despite constraints of genome size in birds, we identified multiple waves of amplification of TEs in Piciformes (woodpeckers, honeyguides, toucans, and barbets). Relative to other bird species’ genomic TE abundance (< 10% of genome), we found ∼17–30% TE content in multiple clades within Piciformes. Several families of the retrotransposon superfamily chicken repeat 1 (CR1) expanded in at least three different waves of activity. The most recent CR1 expansions (∼4–7% of genome) preceded bursts of diversification in the woodpecker clade and in the American barbets + toucans clade. Additionally, we identified several thousand polymorphic CR1 insertions (hundreds per individual) in three closely related woodpecker species. Woodpecker CR1 insertion polymorphisms are maintained at lower frequencies than single nucleotide polymorphisms indicating that purifying selection is acting against additional CR1 copies and that these elements impose a fitness cost on their host. These findings provide evidence of large scale and ongoing TE activity in avian genomes despite continual constraint on genome size.
Affiliation(s)
- Joseph D Manthey
- New York University Abu Dhabi, UAE; Department of Biological Sciences, Texas Tech University
| | - Robert G Moyle
- Department of Ecology and Evolutionary Biology, Biodiversity Institute, University of Kansas
| | | |
|
42
|
D'Agostino N, Tamburino R, Cantarella C, De Carluccio V, Sannino L, Cozzolino S, Cardi T, Scotti N. The Complete Plastome Sequences of Eleven Capsicum Genotypes: Insights into DNA Variation and Molecular Evolution. Genes (Basel) 2018; 9:E503. [PMID: 30336638 PMCID: PMC6210379 DOI: 10.3390/genes9100503] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 09/17/2018] [Revised: 10/11/2018] [Accepted: 10/11/2018] [Indexed: 11/16/2022] Open
Abstract
Members of the genus Capsicum are of great economic importance, including both wild forms and cultivars of peppers and chilies. The high number of potentially informative characteristics that can be identified through next-generation sequencing technologies gave a huge boost to evolutionary and comparative genomic research in higher plants. Here, we determined the complete nucleotide sequences of the plastomes of eight Capsicum species (eleven genotypes), representing the three main taxonomic groups in the genus and estimated molecular diversity. Comparative analyses highlighted a wide spectrum of variation, ranging from point mutations to small/medium size insertions/deletions (InDels), with accD, ndhB, rpl20, ycf1, and ycf2 being the most variable genes. The global pattern of sequence variation is consistent with the phylogenetic signal. Maximum-likelihood tree estimation revealed that Capsicum chacoense is sister to the baccatum complex. Divergence and positive selection analyses unveiled that protein-coding genes were generally well conserved, but we identified 25 positive signatures distributed in six genes involved in different essential plastid functions, suggesting positive selection during evolution of Capsicum plastomes. Finally, the identified sequence variation allowed us to develop simple PCR-based markers useful in future work to discriminate species belonging to different Capsicum complexes.
Affiliation(s)
- Nunzio D'Agostino
- CREA Research Centre for Vegetable and Ornamental Crops, Via dei Cavalleggeri 25, 84098 Pontecagnano Faiano (SA), Italy.
| | - Rachele Tamburino
- CNR-IBBR, National Research Council of Italy, Institute of Biosciences and BioResources, Via Università 133, 80055 Portici (NA), Italy.
| | - Concita Cantarella
- CREA Research Centre for Vegetable and Ornamental Crops, Via dei Cavalleggeri 25, 84098 Pontecagnano Faiano (SA), Italy.
| | - Valentina De Carluccio
- CREA Research Centre for Vegetable and Ornamental Crops, Via dei Cavalleggeri 25, 84098 Pontecagnano Faiano (SA), Italy.
- Department of Biology, University of Naples Federico II, Via Cinthia, 80126 Naples, Italy.
| | - Lorenza Sannino
- CNR-IBBR, National Research Council of Italy, Institute of Biosciences and BioResources, Via Università 133, 80055 Portici (NA), Italy.
| | - Salvatore Cozzolino
- Department of Biology, University of Naples Federico II, Via Cinthia, 80126 Naples, Italy.
| | - Teodoro Cardi
- CREA Research Centre for Vegetable and Ornamental Crops, Via dei Cavalleggeri 25, 84098 Pontecagnano Faiano (SA), Italy.
| | - Nunzia Scotti
- CNR-IBBR, National Research Council of Italy, Institute of Biosciences and BioResources, Via Università 133, 80055 Portici (NA), Italy.
| |
|
43
|
Tilak MK, Botero-Castro F, Galtier N, Nabholz B. Illumina Library Preparation for Sequencing the GC-Rich Fraction of Heterogeneous Genomic DNA. Genome Biol Evol 2018; 10:616-622. [PMID: 29385572 PMCID: PMC5808798 DOI: 10.1093/gbe/evy022] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Accepted: 01/18/2018] [Indexed: 02/06/2023] Open
Abstract
Standard Illumina libraries are biased toward sequences of intermediate GC-content. This results in an underrepresentation of GC-rich regions in sequencing projects of genomes with heterogeneous base composition, such as mammals and birds. We developed a simple, cost-effective protocol to enrich sheared genomic DNA in its GC-rich fraction by subtracting AT-rich DNA. This was achieved by heating DNA up to 90 °C before applying Illumina library preparation. We tested the new approach on chicken DNA and found that heated DNA increased average coverage in the GC-richest chromosomes by a factor up to six. Using a Taq polymerase supposedly appropriate for PCR amplification of GC-rich sequences had a much weaker effect. Our protocol should greatly facilitate sequencing and resequencing of the GC-richest regions of heterogeneous genomes, in combination with standard short-read and long-read technologies.
Affiliation(s)
- Marie-Ka Tilak
- Institut des Sciences de l'Evolution, ISEM, Université de Montpellier, CNRS, IRD, EPHE, France
| | - Fidel Botero-Castro
- Institut des Sciences de l'Evolution, ISEM, Université de Montpellier, CNRS, IRD, EPHE, France
| | - Nicolas Galtier
- Institut des Sciences de l'Evolution, ISEM, Université de Montpellier, CNRS, IRD, EPHE, France
| | - Benoit Nabholz
- Institut des Sciences de l'Evolution, ISEM, Université de Montpellier, CNRS, IRD, EPHE, France
| |
|
44
|
Piégu B, Arensburger P, Guillou F, Bigot Y. But where did the centromeres go in the chicken genome models? Chromosome Res 2018; 26:297-306. [PMID: 30225548 DOI: 10.1007/s10577-018-9585-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 07/17/2018] [Revised: 08/31/2018] [Accepted: 09/03/2018] [Indexed: 11/30/2022]
Abstract
The chicken genome was the third vertebrate to be sequenced. To date, its sequence and feature annotations are used as the reference for avian models in genome sequencing projects developed on birds and other Sauropsida species, and in genetic studies of domesticated birds of economic and evolutionary biology interest. Therefore, an accurate description of this genome model is important to a wide number of scientists. Here, we review the location and features of a very basic element, the centromeres of chromosomes in the galGal5 genome model. Centromeres are elements that are not determined by their DNA sequence but by their epigenetic status, in particular by the accumulation of the histone-like protein CENP-A. Comparison of data from several public sources (primarily marker probes flanking centromeres using fluorescent in situ hybridization done on giant lampbrush chromosomes and CENP-A ChIP-seq datasets) with galGal5 annotations revealed that centromeres are likely inappropriately mapped in 9 of the 16 galGal5 chromosome models in which they are described. Analysis of karyology data confirmed that the location of the main CENP-A peaks in chromosomes is the best means of locating the centromeres in 25 galGal5 chromosome models, the majority of which (16) are fully sequenced and assembled. This data re-analysis reaffirms that several sources of information should be examined to produce accurate genome annotations, particularly for basic structures such as centromeres that are epigenetically determined.
Affiliation(s)
- Benoît Piégu
- PRC, UMR INRA0085, CNRS 7247, Centre INRA Val de Loire, 37380, Nouzilly, France
| | - Peter Arensburger
- Biological Sciences Department, California State Polytechnic University, Pomona, CA, 91768, USA
| | - Florian Guillou
- PRC, UMR INRA0085, CNRS 7247, Centre INRA Val de Loire, 37380, Nouzilly, France
| | - Yves Bigot
- PRC, UMR INRA0085, CNRS 7247, Centre INRA Val de Loire, 37380, Nouzilly, France.
| |
|
45
|
Peona V, Weissensteiner MH, Suh A. How complete are “complete” genome assemblies?-An avian perspective. Mol Ecol Resour 2018; 18:1188-1195. [DOI: 10.1111/1755-0998.12933] [Citation(s) in RCA: 84] [Impact Index Per Article: 14.0] [Received: 03/19/2018] [Revised: 06/11/2018] [Accepted: 07/06/2018] [Indexed: 12/26/2022]
Affiliation(s)
- Valentina Peona
- Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
| | - Matthias H. Weissensteiner
- Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
- Division of Evolutionary Biology, Faculty of Biology, Ludwig-Maximilian University of Munich, Planegg-Martinsried, Germany
| | - Alexander Suh
- Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
| |
|
46
|
Affiliation(s)
- Adam M Phillippy
- National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
| |
|
47
|
Sutton JT, Helmkampf M, Steiner CC, Bellinger MR, Korlach J, Hall R, Baybayan P, Muehling J, Gu J, Kingan S, Masuda BM, Ryder OA. A High-Quality, Long-Read De Novo Genome Assembly to Aid Conservation of Hawaii's Last Remaining Crow Species. Genes (Basel) 2018; 9:genes9080393. [PMID: 30071683 PMCID: PMC6115840 DOI: 10.3390/genes9080393] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 06/06/2018] [Revised: 07/23/2018] [Accepted: 07/27/2018] [Indexed: 11/16/2022] Open
Abstract
Genome-level data can provide researchers with unprecedented precision to examine the causes and genetic consequences of population declines, which can inform conservation management. Here, we present a high-quality, long-read, de novo genome assembly for one of the world’s most endangered bird species, the ʻAlalā (Corvus hawaiiensis; Hawaiian crow). As the only remaining native crow species in Hawaiʻi, the ʻAlalā survived solely in a captive-breeding program from 2002 until 2016, at which point a long-term reintroduction program was initiated. The high-quality genome assembly was generated to lay the foundation for both comparative genomics studies and the development of population-level genomic tools that will aid conservation and recovery efforts. We illustrate how the quality of this assembly places it amongst the very best avian genomes assembled to date, comparable to intensively studied model systems. We describe the genome architecture in terms of repetitive elements and runs of homozygosity, and we show that compared with more outbred species, the ʻAlalā genome is substantially more homozygous. We also provide annotations for a subset of immunity genes that are likely to be important in conservation management, and we discuss how this genome is currently being used as a roadmap for downstream conservation applications.
Affiliation(s)
- Jolene T Sutton
- Department of Biology, University of Hawaii at Hilo, Hilo, HI 96720, USA.
| | - Martin Helmkampf
- Department of Biology, University of Hawaii at Hilo, Hilo, HI 96720, USA.
| | - Cynthia C Steiner
- Institute for Conservation Research, San Diego Zoo, Escondido, CA 92027, USA.
| | - M Renee Bellinger
- Department of Biology, University of Hawaii at Hilo, Hilo, HI 96720, USA.
| | | | | | | | | | - Jenny Gu
- Pacific Biosciences, Menlo Park, CA 94025, USA.
| | | | - Bryce M Masuda
- Institute for Conservation Research, San Diego Zoo Global, Volcano, HI 96785, USA.
| | - Oliver A Ryder
- Institute for Conservation Research, San Diego Zoo, Escondido, CA 92027, USA.
| |
|
48
|
Kronenberg ZN, Fiddes IT, Gordon D, Murali S, Cantsilieris S, Meyerson OS, Underwood JG, Nelson BJ, Chaisson MJP, Dougherty ML, Munson KM, Hastie AR, Diekhans M, Hormozdiari F, Lorusso N, Hoekzema K, Qiu R, Clark K, Raja A, Welch AE, Sorensen M, Baker C, Fulton RS, Armstrong J, Graves-Lindsay TA, Denli AM, Hoppe ER, Hsieh P, Hill CM, Pang AWC, Lee J, Lam ET, Dutcher SK, Gage FH, Warren WC, Shendure J, Haussler D, Schneider VA, Cao H, Ventura M, Wilson RK, Paten B, Pollen A, Eichler EE. High-resolution comparative analysis of great ape genomes. Science 2018; 360:eaar6343. [PMID: 29880660 PMCID: PMC6178954 DOI: 10.1126/science.aar6343] [Citation(s) in RCA: 225] [Impact Index Per Article: 37.5] [Received: 12/07/2017] [Accepted: 04/02/2018] [Indexed: 12/22/2022]
Abstract
Genetic studies of human evolution require high-quality contiguous ape genome assemblies that are not guided by the human reference. We coupled long-read sequence assembly and full-length complementary DNA sequencing with a multiplatform scaffolding approach to produce ab initio chimpanzee and orangutan genome assemblies. By comparing these with two long-read de novo human genome assemblies and a gorilla genome assembly, we characterized lineage-specific and shared great ape genetic variation ranging from single- to mega-base pair-sized variants. We identified ~17,000 fixed human-specific structural variants identifying genic and putative regulatory changes that have emerged in humans since divergence from nonhuman apes. Interestingly, these variants are enriched near genes that are down-regulated in human compared to chimpanzee cerebral organoids, particularly in cells analogous to radial glial neural progenitors.
Affiliation(s)
- Zev N Kronenberg
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Ian T Fiddes
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
| | - David Gordon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
| | - Shwetha Murali
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
| | - Stuart Cantsilieris
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Olivia S Meyerson
- Department of Neurology, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Jason G Underwood
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Pacific Biosciences (PacBio) of California, Inc., Menlo Park, CA 94025, USA
| | - Bradley J Nelson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Mark J P Chaisson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Computational Biology and Bioinformatics, University of Southern California, Los Angeles, CA 90089, USA
| | - Max L Dougherty
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Katherine M Munson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | | | - Mark Diekhans
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
| | - Fereydoun Hormozdiari
- Department of Biochemistry and Molecular Medicine, University of California, Davis, Davis, CA 95817, USA
| | - Nicola Lorusso
- Department of Biology, University of Bari, Aldo Moro, Bari 70121, Italy
| | - Kendra Hoekzema
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Ruolan Qiu
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Karen Clark
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
| | - Archana Raja
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
| | - AnneMarie E Welch
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Melanie Sorensen
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Carl Baker
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Robert S Fulton
- Departments of Medicine and Genetics, McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO 63108, USA
| | - Joel Armstrong
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
| | - Tina A Graves-Lindsay
- Departments of Medicine and Genetics, McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO 63108, USA
| | - Ahmet M Denli
- The Salk Institute for Biological Studies, La Jolla, CA 92037, USA
| | - Emma R Hoppe
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - PingHsun Hsieh
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Christopher M Hill
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | | | - Joyce Lee
- Bionano Genomics, San Diego, CA 92121, USA
| | | | - Susan K Dutcher
- Departments of Medicine and Genetics, McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO 63108, USA
| | - Fred H Gage
- The Salk Institute for Biological Studies, La Jolla, CA 92037, USA
| | - Wesley C Warren
- Departments of Medicine and Genetics, McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO 63108, USA
| | - Jay Shendure
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
| | - David Haussler
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
- Howard Hughes Medical Institute, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
| | - Valerie A Schneider
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
| | - Han Cao
- Bionano Genomics, San Diego, CA 92121, USA
| | - Mario Ventura
- Department of Biology, University of Bari, Aldo Moro, Bari 70121, Italy
| | - Richard K Wilson
- Departments of Medicine and Genetics, McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO 63108, USA
| | - Benedict Paten
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
| | - Alex Pollen
- Department of Neurology, University of California, San Francisco, San Francisco, CA 94158, USA
- Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA 94143, USA
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA.
- Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
| |
|
49
|
Rohde F, Schusser B, Hron T, Farkašová H, Plachý J, Härtle S, Hejnar J, Elleder D, Kaspers B. Characterization of Chicken Tumor Necrosis Factor-α, a Long Missed Cytokine in Birds. Front Immunol 2018; 9:605. [PMID: 29719531 PMCID: PMC5913325 DOI: 10.3389/fimmu.2018.00605] [Citation(s) in RCA: 54] [Impact Index Per Article: 9.0] [Received: 12/05/2017] [Accepted: 03/09/2018] [Indexed: 11/13/2022] Open
Abstract
Tumor necrosis factor-α (TNF-α) is a pleiotropic cytokine playing critical roles in host defense and in acute and chronic inflammation. It has been described in fish, amphibians, and mammals but was considered to be absent from avian genomes. Here, we report on the identification and functional characterization of the avian ortholog. The chicken TNF-α (chTNF-α) is encoded by a highly GC-rich gene, whose product shares with its mammalian counterpart 45% homology in the extracellular part displaying the characteristic TNF homology domain. Orthologs of chTNF-α were identified in the genomes of 12 additional avian species including Palaeognathae and Neognathae, and the synteny of the closely adjacent loci with mammalian TNF-α orthologs was demonstrated in the crow (Corvus cornix) genome. In addition to chTNF-α, we obtained full sequences for homologs of TNF-α receptors 1 and 2 (TNFR1, TNFR2). chTNF-α mRNA is strongly induced by lipopolysaccharide (LPS) stimulation of monocyte-derived, splenic, and bone marrow macrophages, and significantly upregulated in splenic tissue in response to i.v. LPS treatment. Activation of T-lymphocytes by TCR crosslinking induces chTNF-α expression in CD4+ but not in CD8+ cells. To gain insights into its biological activity, we generated recombinant chTNF-α in eukaryotic and prokaryotic expression systems. Both the full-length cytokine and the extracellular domain rapidly induced an NFκB-luciferase reporter in stably transfected CEC-32 reporter cells. Collectively, these data provide strong evidence for the existence of a fully functional TNF-α/TNF-α receptor system in birds, thus filling a gap in our understanding of the evolution of cytokine systems.
Affiliation(s)
- Franziska Rohde
- Department of Veterinary Science, Ludwig-Maximilians-Universität, Munich, Germany
| | - Benjamin Schusser
- Reproductive Biotechnology, Department of Animal Sciences, Technical University Munich, Munich, Germany
| | - Tomáš Hron
- Laboratory of Viral and Cellular Genetics, Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, Czechia
| | - Helena Farkašová
- Laboratory of Viral and Cellular Genetics, Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, Czechia
| | - Jiří Plachý
- Laboratory of Viral and Cellular Genetics, Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, Czechia
| | - Sonja Härtle
- Department of Veterinary Science, Ludwig-Maximilians-Universität, Munich, Germany
| | - Jiří Hejnar
- Laboratory of Viral and Cellular Genetics, Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, Czechia
| | - Daniel Elleder
- Laboratory of Viral and Cellular Genetics, Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, Czechia
| | - Bernd Kaspers
- Department of Veterinary Science, Ludwig-Maximilians-Universität, Munich, Germany
| |
|
50
|
Lower SS, McGurk MP, Clark AG, Barbash DA. Satellite DNA evolution: old ideas, new approaches. Curr Opin Genet Dev 2018; 49:70-78. [PMID: 29579574 PMCID: PMC5975084 DOI: 10.1016/j.gde.2018.03.003] [Citation(s) in RCA: 93] [Impact Index Per Article: 15.5] [Received: 11/15/2017] [Revised: 02/02/2018] [Accepted: 03/08/2018] [Indexed: 12/22/2022]
Abstract
A substantial portion of the genomes of most multicellular eukaryotes consists of large arrays of tandemly repeated sequence, collectively called satellite DNA. The processes generating and maintaining different satellite DNA abundances across lineages are important to understand as satellites have been linked to chromosome mis-segregation, disease phenotypes, and reproductive isolation between species. While much theory has been developed to describe satellite evolution, empirical tests of these models have fallen short because of the challenges in assessing satellite repeat regions of the genome. Advances in computational tools and sequencing technologies now enable identification and quantification of satellite sequences genome-wide. Here, we describe some of these tools and how their applications are furthering our knowledge of satellite evolution and function.
Affiliation(s)
- Sarah Sander Lower
- Department of Molecular Biology and Genetics, Cornell University, 526 Campus Rd, Ithaca, NY 14853, United States
| | - Michael P McGurk
- Department of Molecular Biology and Genetics, Cornell University, 526 Campus Rd, Ithaca, NY 14853, United States
| | - Andrew G Clark
- Department of Molecular Biology and Genetics, Cornell University, 526 Campus Rd, Ithaca, NY 14853, United States
| | - Daniel A Barbash
- Department of Molecular Biology and Genetics, Cornell University, 526 Campus Rd, Ithaca, NY 14853, United States
| |
|