1
Merondun J, Marques CI, Andrade P, Meshcheryagina S, Galván I, Afonso S, Alves JM, Araújo PM, Bachurin G, Balacco J, Bán M, Fedrigo O, Formenti G, Fossøy F, Fülöp A, Golovatin M, Granja S, Hewson C, Honza M, Howe K, Larson G, Marton A, Moskát C, Mountcastle J, Procházka P, Red’kin Y, Sims Y, Šulc M, Tracey A, Wood JMD, Jarvis ED, Hauber ME, Carneiro M, Wolf JBW. Evolution and genetic architecture of sex-limited polymorphism in cuckoos. Sci Adv 2024; 10:eadl5255. [PMID: 38657058 PMCID: PMC11042743 DOI: 10.1126/sciadv.adl5255] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Accepted: 03/20/2024] [Indexed: 04/26/2024]
Abstract
Sex-limited polymorphism has evolved in many species including our own. Yet, we lack a detailed understanding of the underlying genetic variation and evolutionary processes at work. The brood parasitic common cuckoo (Cuculus canorus) is a prime example of female-limited color polymorphism, where adult males are monochromatic gray and females exhibit either gray or rufous plumage. This polymorphism has been hypothesized to be governed by negative frequency-dependent selection whereby the rarer female morph is protected against harassment by males or from mobbing by parasitized host species. Here, we show that female plumage dichromatism maps to the female-restricted genome. We further demonstrate that, consistent with balancing selection, ancestry of the rufous phenotype is shared with the likewise female dichromatic sister species, the oriental cuckoo (Cuculus optatus). This study shows that sex-specific polymorphism in trait variation can be resolved by genetic variation residing on a sex-limited chromosome and be maintained across species boundaries.
Affiliation(s)
- Justin Merondun
- Division of Evolutionary Biology, LMU Munich, Planegg-Martinsried, Germany
- Department of Ornithology, Max Planck Institute for Biological Intelligence, Seewiesen, Germany
- Cristiana I. Marques
- CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Universidade do Porto, Vairão, Portugal
- Departamento de Biologia, Faculdade de Ciências da Universidade do Porto, Porto, Portugal
- BIOPOLIS Program in Genomics, Biodiversity and Land Planning, CIBIO, Vairão, Portugal
- Pedro Andrade
- CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Universidade do Porto, Vairão, Portugal
- BIOPOLIS Program in Genomics, Biodiversity and Land Planning, CIBIO, Vairão, Portugal
- Swetlana Meshcheryagina
- Institute of Plant and Animal Ecology, Ural Branch, Russian Academy of Sciences, Yekaterinburg, Russia
- Ismael Galván
- Departamento de Ecología Evolutiva, Museo Nacional de Ciencias Naturales, CSIC, Madrid, Spain
- Sandra Afonso
- CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Universidade do Porto, Vairão, Portugal
- BIOPOLIS Program in Genomics, Biodiversity and Land Planning, CIBIO, Vairão, Portugal
- Joel M. Alves
- CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Universidade do Porto, Vairão, Portugal
- BIOPOLIS Program in Genomics, Biodiversity and Land Planning, CIBIO, Vairão, Portugal
- Department of Genetics, University of Cambridge, Cambridge, CB2 3EH, UK
- Palaeogenomics and Bio-Archaeology Research Network, School of Archaeology, University of Oxford, Oxford, OX1 3QY, UK
- Pedro M. Araújo
- CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Universidade do Porto, Vairão, Portugal
- BIOPOLIS Program in Genomics, Biodiversity and Land Planning, CIBIO, Vairão, Portugal
- Department of Life Sciences, MARE–Marine and Environmental Sciences Centre/ARNET–Aquatic Research Network, University of Coimbra, Coimbra, Portugal
- Jennifer Balacco
- The Vertebrate Genome Lab, Rockefeller University, New York, NY 10065, USA
- Miklós Bán
- HUN-REN-UD Behavioral Ecology Research Group, Department of Evolutionary Zoology and Human Biology, University of Debrecen, Debrecen, Hungary
- Olivier Fedrigo
- The Vertebrate Genome Lab, Rockefeller University, New York, NY 10065, USA
- Giulio Formenti
- The Vertebrate Genome Lab, Rockefeller University, New York, NY 10065, USA
- Frode Fossøy
- Centre for Biodiversity Genetics, Norwegian Institute for Nature Research, Trondheim, Norway
- Attila Fülöp
- HUN-REN-UD Behavioral Ecology Research Group, Department of Evolutionary Zoology and Human Biology, University of Debrecen, Debrecen, Hungary
- Evolutionary Ecology Group, Hungarian Department of Biology and Ecology, Babeş-Bolyai University, Cluj-Napoca, Romania
- STAR-UBB Institute of Advanced Studies in Science and Technology, Babeş-Bolyai University, Cluj-Napoca, Romania
- Mikhail Golovatin
- Institute of Plant and Animal Ecology, Ural Branch, Russian Academy of Sciences, Yekaterinburg, Russia
- Sofia Granja
- CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Universidade do Porto, Vairão, Portugal
- BIOPOLIS Program in Genomics, Biodiversity and Land Planning, CIBIO, Vairão, Portugal
- Palaeogenomics and Bio-Archaeology Research Network, School of Archaeology, University of Oxford, Oxford, OX1 3QY, UK
- Marcel Honza
- Institute of Vertebrate Biology, Czech Academy of Sciences, Brno, Czech Republic
- Kerstin Howe
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Greger Larson
- Palaeogenomics and Bio-Archaeology Research Network, School of Archaeology, University of Oxford, Oxford, OX1 3QY, UK
- Attila Marton
- Evolutionary Ecology Group, Faculty of Biology and Geology, Babeș-Bolyai University, Cluj-Napoca, Romania
- Department of Evolutionary Zoology and Human Biology, University of Debrecen, Debrecen, Hungary
- Csaba Moskát
- Hungarian Natural History Museum, Budapest, Hungary
- Petr Procházka
- Institute of Vertebrate Biology, Czech Academy of Sciences, Brno, Czech Republic
- Ying Sims
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Michal Šulc
- Institute of Vertebrate Biology, Czech Academy of Sciences, Brno, Czech Republic
- Alan Tracey
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Erich D. Jarvis
- The Vertebrate Genome Lab, Rockefeller University, New York, NY 10065, USA
- Mark E. Hauber
- Advanced Science Research Center and Program in Psychology, Graduate Center of the City University of New York, New York, NY 10031, USA
- Miguel Carneiro
- CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Universidade do Porto, Vairão, Portugal
- BIOPOLIS Program in Genomics, Biodiversity and Land Planning, CIBIO, Vairão, Portugal
- Jochen B. W. Wolf
- Division of Evolutionary Biology, LMU Munich, Planegg-Martinsried, Germany
2
Mirarab S, Rivas-González I, Feng S, Stiller J, Fang Q, Mai U, Hickey G, Chen G, Brajuka N, Fedrigo O, Formenti G, Wolf JBW, Howe K, Antunes A, Schierup MH, Paten B, Jarvis ED, Zhang G, Braun EL. A region of suppressed recombination misleads neoavian phylogenomics. Proc Natl Acad Sci U S A 2024; 121:e2319506121. [PMID: 38557186 PMCID: PMC11009670 DOI: 10.1073/pnas.2319506121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Accepted: 02/07/2024] [Indexed: 04/04/2024]
Abstract
Genomes are typically mosaics of regions with different evolutionary histories. When speciation events are closely spaced in time, recombination makes the regions sharing the same history small, and the evolutionary history changes rapidly as we move along the genome. When examining rapid radiations such as the early diversification of Neoaves 66 Mya, typically no consistent history is observed across segments exceeding kilobases of the genome. Here, we report an exception. We found that a 21-Mb region in avian genomes, mapped to chicken chromosome 4, shows an extremely strong and discordance-free signal for a history different from that of the inferred species tree. Such a strong discordance-free signal, indicative of suppressed recombination across many millions of base pairs, is not observed elsewhere in the genome for any deep avian relationships. Although long regions with suppressed recombination have been documented in recently diverged species, our results pertain to relationships dating circa 65 Mya. We provide evidence that this strong signal may be due to an ancient rearrangement that blocked recombination and remained polymorphic for several million years prior to fixation. We show that the presence of this region has misled previous phylogenomic efforts with lower taxon sampling, showing the interplay between taxon and locus sampling. We predict that similar ancient rearrangements may confound phylogenetic analyses in other clades, pointing to a need for new analytical models that incorporate the possibility of such events.
Affiliation(s)
- Siavash Mirarab
- Electrical and Computer Engineering Department, University of California, San Diego, CA 95032
- Shaohong Feng
- Center for Evolutionary & Organismal Biology, Zhejiang University School of Medicine, Hangzhou 310058, China
- Liangzhu Laboratory, Zhejiang University, Hangzhou 311121, China
- Josefin Stiller
- Section for Ecology & Evolution, Department of Biology, University of Copenhagen, København 2100, Denmark
- Qi Fang
- BGI-Research, Shenzhen 518083, China
- Uyen Mai
- Electrical and Computer Engineering Department, University of California, San Diego, CA 95032
- Glenn Hickey
- Genomics Institute, University of California, Santa Cruz, CA 96064
- Guangji Chen
- Center for Evolutionary & Organismal Biology, Zhejiang University School of Medicine, Hangzhou 310058, China
- Liangzhu Laboratory, Zhejiang University, Hangzhou 311121, China
- Nadolina Brajuka
- Vertebrate Genome Lab, Rockefeller University, New York, NY 10065
- Olivier Fedrigo
- Vertebrate Genome Lab, Rockefeller University, New York, NY 10065
- Giulio Formenti
- Vertebrate Genome Lab, Rockefeller University, New York, NY 10065
- Jochen B. W. Wolf
- Division of Evolutionary Biology, Faculty of Biology, Ludwig-Maximilians-Universität, Munich 82152, Germany
- Kerstin Howe
- Tree of Life Division, Wellcome Sanger Institute, Cambridge CB10 1RQ, United Kingdom
- Agostinho Antunes
- Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Porto 4099-002, Portugal
- Department of Biology, Faculty of Sciences, University of Porto, Porto 4099-002, Portugal
- Benedict Paten
- Genomics Institute, University of California, Santa Cruz, CA 96064
- Erich D. Jarvis
- Vertebrate Genome Lab, Rockefeller University, New York, NY 10065
- Guojie Zhang
- Center for Evolutionary & Organismal Biology, Zhejiang University School of Medicine, Hangzhou 310058, China
- Edward L. Braun
- Department of Biology, University of Florida, Gainesville, FL 32611
3
De Jode A, Faria R, Formenti G, Sims Y, Smith TP, Tracey A, Wood JMD, Zagrodzka ZB, Johannesson K, Butlin RK, Leder EH. Chromosome-scale Genome Assembly of the Rough Periwinkle Littorina saxatilis. Genome Biol Evol 2024; 16:evae076. [PMID: 38584387 PMCID: PMC11050657 DOI: 10.1093/gbe/evae076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Revised: 03/26/2024] [Accepted: 03/29/2024] [Indexed: 04/09/2024]
Abstract
The intertidal gastropod Littorina saxatilis is a model system to study speciation and local adaptation. The repeated occurrence of distinct ecotypes showing different levels of genetic divergence makes L. saxatilis particularly suited to study different stages of the speciation continuum in the same lineage. A major finding is the presence of several large chromosomal inversions associated with the divergence of ecotypes and, specifically, the species offers a system to study the role of inversions in this divergence. The genome of L. saxatilis is 1.35 Gb and composed of 17 chromosomes. The first reference genome of the species was assembled using Illumina data, was highly fragmented (N50 of 44 kb), and was quite incomplete, with a BUSCO completeness of 80.1% on the Metazoan dataset. A linkage map of one full-sibling family enabled the placement of 587 Mbp of the genome into 17 linkage groups corresponding to the haploid number of chromosomes, but the fragmented nature of this reference genome limited the understanding of the interplay between divergent selection and gene flow during ecotype formation. Here, we present a newly generated reference genome that is highly contiguous, with an N50 of 67 Mb and 90.4% of the total assembly length placed in 17 super-scaffolds. It is also highly complete, with a BUSCO completeness of 94.1% on the Metazoa dataset. This new reference will allow for investigations into the genomic regions implicated in ecotype formation as well as better characterization of the inversions and their role in speciation.
Affiliation(s)
- Aurélien De Jode
- Department of Marine Sciences, Tjärnö Marine Laboratory, University of Gothenburg, SE 45296 Strömstad, Sweden
- Department of Biological Sciences, University of Alabama, Tuscaloosa, AL, USA
- Dauphin Island Sea Lab, Dauphin Island, AL, USA
- Rui Faria
- CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Campus de Vairão, Universidade do Porto, 4485-661 Vairão, Portugal
- BIOPOLIS Program in Genomics, Biodiversity and Land Planning, CIBIO, Campus de Vairão, 4485-661 Vairão, Portugal
- Ecology and Evolutionary Biology, School of Biosciences, The University of Sheffield, Sheffield S10 2TN, UK
- Giulio Formenti
- The Vertebrate Genome Laboratory, The Rockefeller University, New York, NY 10065, USA
- Ying Sims
- Tree of Life, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK
- Timothy P Smith
- USDA Agricultural Research Service, U.S. Meat Animal Research Center, Clay Center, NE 68933, USA
- Alan Tracey
- Tree of Life, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK
- Jonathan M D Wood
- Tree of Life, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK
- Zuzanna B Zagrodzka
- Ecology and Evolutionary Biology, School of Biosciences, The University of Sheffield, Sheffield S10 2TN, UK
- Kerstin Johannesson
- Department of Marine Sciences, Tjärnö Marine Laboratory, University of Gothenburg, SE 45296 Strömstad, Sweden
- Roger K Butlin
- Department of Marine Sciences, Tjärnö Marine Laboratory, University of Gothenburg, SE 45296 Strömstad, Sweden
- Ecology and Evolutionary Biology, School of Biosciences, The University of Sheffield, Sheffield S10 2TN, UK
- Erica H Leder
- Department of Marine Sciences, Tjärnö Marine Laboratory, University of Gothenburg, SE 45296 Strömstad, Sweden
- Natural History Museum, University of Oslo, Oslo, Norway
4
Hickey G, Monlong J, Ebler J, Novak AM, Eizenga JM, Gao Y, Marschall T, Li H, Paten B, Abel HJ, Antonacci-Fulton LL, Asri M, Baid G, Baker CA, Belyaeva A, Billis K, Bourque G, Buonaiuto S, Carroll A, Chaisson MJP, Chang PC, Chang XH, Cheng H, Chu J, Cody S, Colonna V, Cook DE, Cook-Deegan RM, Cornejo OE, Diekhans M, Doerr D, Ebert P, Ebler J, Eichler EE, Eizenga JM, Fairley S, Fedrigo O, Felsenfeld AL, Feng X, Fischer C, Flicek P, Formenti G, Frankish A, Fulton RS, Gao Y, Garg S, Garrison E, Garrison NA, Giron CG, Green RE, Groza C, Guarracino A, Haggerty L, Hall IM, Harvey WT, Haukness M, Haussler D, Heumos S, Hickey G, Hoekzema K, Hourlier T, Howe K, Jain M, Jarvis ED, Ji HP, Kenny EE, Koenig BA, Kolesnikov A, Korbel JO, Kordosky J, Koren S, Lee H, Lewis AP, Li H, Liao WW, Lu S, Lu TY, Lucas JK, Magalhães H, Marco-Sola S, Marijon P, Markello C, Marschall T, Martin FJ, McCartney A, McDaniel J, Miga KH, Mitchell MW, Monlong J, Mountcastle J, Munson KM, Mwaniki MN, Nattestad M, Novak AM, Nurk S, Olsen HE, Olson ND, Paten B, Pesout T, Phillippy AM, Popejoy AB, Porubsky D, Prins P, Puiu D, Rautiainen M, Regier AA, Rhie A, Sacco S, Sanders AD, Schneider VA, Schultz BI, Shafin K, Sibbesen JA, Sirén J, Smith MW, Sofia HJ, Tayoun ANA, Thibaud-Nissen F, Tomlinson C, Tricomi FF, Villani F, Vollger MR, Wagner J, Walenz B, Wang T, Wood JMD, Zimin AV, Zook JM. Pangenome graph construction from genome alignments with Minigraph-Cactus. Nat Biotechnol 2024; 42:663-673. [PMID: 37165083 PMCID: PMC10638906 DOI: 10.1038/s41587-023-01793-w] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 10/06/2022] [Accepted: 04/18/2023] [Indexed: 05/12/2023]
Abstract
Pangenome references address biases of reference genomes by storing a representative set of diverse haplotypes and their alignment, usually as a graph. Alternate alleles determined by variant callers can be used to construct pangenome graphs, but advances in long-read sequencing are leading to widely available, high-quality phased assemblies. Constructing a pangenome graph directly from assemblies, as opposed to variant calls, leverages the graph's ability to represent variation at different scales. Here we present the Minigraph-Cactus pangenome pipeline, which creates pangenomes directly from whole-genome alignments, and demonstrate its ability to scale to 90 human haplotypes from the Human Pangenome Reference Consortium. The method builds graphs containing all forms of genetic variation while allowing use of current mapping and genotyping tools. We measure the effect of the quality and completeness of reference genomes used for analysis within the pangenomes and show that using the CHM13 reference from the Telomere-to-Telomere Consortium improves the accuracy of our methods. We also demonstrate construction of a Drosophila melanogaster pangenome.
Affiliation(s)
- Glenn Hickey
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- These authors contributed equally: Glenn Hickey, Jean Monlong
- Jean Monlong
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- These authors contributed equally: Glenn Hickey, Jean Monlong
- Jana Ebler
- Institute for Medical Biometry and Bioinformatics, Medical Faculty, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Center for Digital Medicine, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Adam M. Novak
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Jordan M. Eizenga
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Yan Gao
- Center for Computational and Genomic Medicine, The Children’s Hospital of Philadelphia, Philadelphia, PA, USA
- Tobias Marschall
- Institute for Medical Biometry and Bioinformatics, Medical Faculty, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Center for Digital Medicine, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Heng Li
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Benedict Paten
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Haley J. Abel
- Division of Oncology, Department of Internal Medicine, Washington University School of Medicine, St. Louis, MO, USA
- Mobin Asri
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Carl A. Baker
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Konstantinos Billis
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Guillaume Bourque
- Department of Human Genetics, McGill University, Montreal, QC, Canada
- Canadian Center for Computational Genomics, McGill University, Montreal, QC, Canada
- Institute for the Advanced Study of Human Biology (WPI-ASHBi), Kyoto University, Kyoto, Japan
- Silvia Buonaiuto
- Institute of Genetics and Biophysics, National Research Council, Naples, Italy
- Mark J. P. Chaisson
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
- Xian H. Chang
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Haoyu Cheng
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Justin Chu
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
- Sarah Cody
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, USA
- Vincenza Colonna
- Institute of Genetics and Biophysics, National Research Council, Naples, Italy
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Robert M. Cook-Deegan
- Arizona State University, Barrett and O’Connor Washington Center, Washington, DC, USA
- Omar E. Cornejo
- Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, Santa Cruz, CA, USA
- Mark Diekhans
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Daniel Doerr
- Center for Digital Medicine, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Institute for Medical Biometry and Bioinformatics, Medical Faculty, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Peter Ebert
- Center for Digital Medicine, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Institute for Medical Biometry and Bioinformatics, Medical Faculty, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Core Unit Bioinformatics, Medical Faculty, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Jana Ebler
- Center for Digital Medicine, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Institute for Medical Biometry and Bioinformatics, Medical Faculty, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Evan E. Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Jordan M. Eizenga
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Susan Fairley
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Olivier Fedrigo
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
- Adam L. Felsenfeld
- National Institutes of Health (NIH)–National Human Genome Research Institute, Bethesda, MD, USA
- Xiaowen Feng
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Christian Fischer
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Giulio Formenti
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
- Adam Frankish
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Robert S. Fulton
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, USA
- Department of Genetics, Washington University School of Medicine, St. Louis, MO, USA
- Yan Gao
- Center for Computational and Genomic Medicine, The Children’s Hospital of Philadelphia, Philadelphia, PA, USA
- Shilpa Garg
- Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Copenhagen, Denmark
- Erik Garrison
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Nanibaa’ A. Garrison
- Institute for Society and Genetics, College of Letters and Science, University of California, Los Angeles, Los Angeles, CA, USA
- Institute for Precision Health, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Division of General Internal Medicine and Health Services Research, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Carlos Garcia Giron
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Richard E. Green
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA, USA
- Dovetail Genomics, Scotts Valley, CA, USA
- Cristian Groza
- Quantitative Life Sciences, McGill University, Montreal, QC, Canada
- Andrea Guarracino
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Genomics Research Centre, Human Technopole, Milan, Italy
- Leanne Haggerty
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Ira M. Hall
- Department of Genetics, Yale University School of Medicine, New Haven, CT, USA
- Center for Genomic Health, Yale University School of Medicine, New Haven, CT, USA
- William T. Harvey
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Marina Haukness
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- David Haussler
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Simon Heumos
- Quantitative Biology Center (QBiC), University of Tübingen, Tübingen, Germany
- Biomedical Data Science, Department of Computer Science, University of Tübingen, Tübingen, Germany
- Glenn Hickey
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- These authors contributed equally: Glenn Hickey, Jean Monlong
- Kendra Hoekzema
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Thibaut Hourlier
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Kerstin Howe
- Tree of Life, Wellcome Sanger Institute, Hinxton, Cambridge, UK
- Miten Jain
- Northeastern University, Boston, MA, USA
- Erich D. Jarvis
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
- Laboratory of Neurogenetics of Language, The Rockefeller University, New York, NY, USA
- Hanlee P. Ji
- Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Eimear E. Kenny
- Institute for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Barbara A. Koenig
- Program in Bioethics and Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA
- Jan O. Korbel
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
- Jennifer Kordosky
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Sergey Koren
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- HoJoon Lee
- Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Alexandra P. Lewis
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Heng Li
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Wen-Wei Liao
- Department of Genetics, Yale University School of Medicine, New Haven, CT, USA
- Center for Genomic Health, Yale University School of Medicine, New Haven, CT, USA
- Division of Biology and Biomedical Sciences, Washington University School of Medicine, St. Louis, MO, USA
- Shuangjia Lu
- Department of Genetics, Yale University School of Medicine, New Haven, CT, USA
- Tsung-Yu Lu
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
- Julian K. Lucas
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Hugo Magalhães
- Center for Digital Medicine, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Institute for Medical Biometry and Bioinformatics, Medical Faculty, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Santiago Marco-Sola
- Computer Sciences Department, Barcelona Supercomputing Center, Barcelona, Spain
- Departament d’Arquitectura de Computadors i Sistemes Operatius, Universitat Autònoma de Barcelona, Barcelona, Spain
- Pierre Marijon
- Center for Digital Medicine, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Institute for Medical Biometry and Bioinformatics, Medical Faculty, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Charles Markello
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Tobias Marschall
- Center for Digital Medicine, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Institute for Medical Biometry and Bioinformatics, Medical Faculty, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Fergal J. Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Ann McCartney
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Jennifer McDaniel
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
- Karen H. Miga
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Jean Monlong
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- These authors contributed equally: Glenn Hickey, Jean Monlong
- Katherine M. Munson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Adam M. Novak
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Sergey Nurk
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Hugh E. Olsen
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Nathan D. Olson
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Benedict Paten
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Trevor Pesout
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Adam M. Phillippy
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Alice B. Popejoy
- Department of Public Health Sciences, University of California, Davis, Davis, CA, USA
| | - David Porubsky
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Pjotr Prins
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
| | - Daniela Puiu
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
| | - Mikko Rautiainen
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Allison A. Regier
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, USA
| | - Arang Rhie
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Samuel Sacco
- Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Ashley D. Sanders
- Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Berlin, Germany
| | - Valerie A. Schneider
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - Baergen I. Schultz
- National Institutes of Health (NIH)–National Human Genome Research Institute, Bethesda, MD, USA
| | | | - Jonas A. Sibbesen
- Center for Health Data Science, University of Copenhagen, Copenhagen, Denmark
| | - Jouni Sirén
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Michael W. Smith
- National Institutes of Health (NIH)–National Human Genome Research Institute, Bethesda, MD, USA
| | - Heidi J. Sofia
- National Institutes of Health (NIH)–National Human Genome Research Institute, Bethesda, MD, USA
| | - Ahmad N. Abou Tayoun
- Al Jalila Genomics Center of Excellence, Al Jalila Children’s Specialty Hospital, Dubai, UAE
- Center for Genomic Discovery, Mohammed Bin Rashid University of Medicine and Health Sciences, Dubai, UAE
| | - Françoise Thibaud-Nissen
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - Chad Tomlinson
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, USA
| | - Francesca Floriana Tricomi
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Flavia Villani
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
| | - Mitchell R. Vollger
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Division of Medical Genetics, University of Washington School of Medicine, Seattle, WA, USA
| | - Justin Wagner
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Brian Walenz
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Ting Wang
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, USA
- Department of Genetics, Washington University School of Medicine, St. Louis, MO, USA
| | | | - Aleksey V. Zimin
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
| | - Justin M. Zook
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
5
Cadena CD, Pabón L, DoNascimiento C, Abueg L, Tilley T, O-Toole B, Absolon D, Sims Y, Formenti G, Fedrigo O, Jarvis ED, Torres M. A reference genome for the Andean cavefish Trichomycterus rosablanca (Siluriformes, Trichomycteridae): building genomic resources to study evolution in cave environments. J Hered 2024:esae019. [PMID: 38513109 DOI: 10.1093/jhered/esae019] [Received: 11/11/2023] [Indexed: 03/23/2024]
Abstract
Animals living in caves are of broad relevance to evolutionary biologists interested in understanding the mechanisms underpinning convergent evolution. In the Eastern Andes of Colombia, populations from at least two distinct clades of Trichomycterus catfishes (Siluriformes) independently colonized cave environments and converged in phenotype by losing their eyes and pigmentation. We are pursuing several research questions using genomics to understand the evolutionary forces and molecular mechanisms responsible for repeated morphological changes in this system. As a foundation for such studies, here we describe a diploid, chromosome-scale, long-read reference genome for Trichomycterus rosablanca, a blind, depigmented species endemic to the karstic system of the department of Santander. The nuclear genome comprises 1 Gb in 27 chromosomes, with 40.0x HiFi long-read genome coverage, a scaffold N50 of 40.4 Mb, a contig N50 of 13.1 Mb, and 96.9% (Eukaryota) and 95.4% (Actinopterygii) universal single-copy orthologs (BUSCO). This assembly provides the first reference genome for the speciose genus Trichomycterus, serving as a key resource for research on the genomics of phenotypic evolution.
Affiliation(s)
| | - Laura Pabón
- Departamento de Ciencias Biológicas, Universidad de los Andes, Bogotá, Colombia
| | | | - Linelle Abueg
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, United States
| | - Tatiana Tilley
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, United States
| | - Brian O-Toole
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, United States
| | - Dominic Absolon
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, United States
| | - Ying Sims
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, United States
| | - Giulio Formenti
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, United States
| | - Olivier Fedrigo
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, United States
| | - Erich D Jarvis
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, United States
6
Bukhman YV, Morin PA, Meyer S, Chu LF, Jacobsen JK, Antosiewicz-Bourget J, Mamott D, Gonzales M, Argus C, Bolin J, Berres ME, Fedrigo O, Steill J, Swanson SA, Jiang P, Rhie A, Formenti G, Phillippy AM, Harris RS, Wood JMD, Howe K, Kirilenko BM, Munegowda C, Hiller M, Jain A, Kihara D, Johnston JS, Ionkov A, Raja K, Toh H, Lang A, Wolf M, Jarvis ED, Thomson JA, Chaisson MJP, Stewart R. A High-Quality Blue Whale Genome, Segmental Duplications, and Historical Demography. Mol Biol Evol 2024; 41:msae036. [PMID: 38376487 PMCID: PMC10919930 DOI: 10.1093/molbev/msae036] [Received: 03/08/2023] [Revised: 01/11/2024] [Accepted: 01/22/2024] [Indexed: 02/21/2024]
Abstract
The blue whale, Balaenoptera musculus, is the largest animal known to have ever existed, making it an important case study in longevity and resistance to cancer. To further this and other blue whale-related research, we report a reference-quality, long-read-based genome assembly of this fascinating species. We assembled the genome from PacBio long reads and utilized Illumina/10×, optical maps, and Hi-C data for scaffolding, polishing, and manual curation. We also provided long read RNA-seq data to facilitate the annotation of the assembly by NCBI and Ensembl. Additionally, we annotated both haplotypes using TOGA and measured the genome size by flow cytometry. We then compared the blue whale genome with other cetaceans and artiodactyls, including vaquita (Phocoena sinus), the world's smallest cetacean, to investigate blue whale's unique biological traits. We found a dramatic amplification of several genes in the blue whale genome resulting from a recent burst in segmental duplications, though the possible connection between this amplification and giant body size requires further study. We also discovered sites in the insulin-like growth factor-1 gene correlated with body size in cetaceans. Finally, using our assembly to examine the heterozygosity and historical demography of Pacific and Atlantic blue whale populations, we found that the genomes of both populations are highly heterozygous and that their genetic isolation dates to the last interglacial period. Taken together, these results indicate how a high-quality, annotated blue whale genome will serve as an important resource for biology, evolution, and conservation research.
Affiliation(s)
- Yury V Bukhman
- Regenerative Biology, Morgridge Institute for Research, Madison, WI 53715, USA
| | - Phillip A Morin
- Southwest Fisheries Science Center, National Oceanic and Atmospheric Administration (NOAA), La Jolla, CA 92037, USA
| | - Susanne Meyer
- Neuroscience Research Institute, University of California, Santa Barbara, CA, USA
| | - Li-Fang Chu
- Regenerative Biology, Morgridge Institute for Research, Madison, WI 53715, USA
- Department of Comparative Biology and Experimental Medicine, University of Calgary, Calgary, Canada
| | | | | | - Daniel Mamott
- Regenerative Biology, Morgridge Institute for Research, Madison, WI 53715, USA
| | - Maylie Gonzales
- Neuroscience Research Institute, University of California, Santa Barbara, CA, USA
| | - Cara Argus
- Regenerative Biology, Morgridge Institute for Research, Madison, WI 53715, USA
| | - Jennifer Bolin
- Regenerative Biology, Morgridge Institute for Research, Madison, WI 53715, USA
| | - Mark E Berres
- University of Wisconsin Biotechnology Center, Bioinformatics Resource Center, University of Wisconsin - Madison, Madison, WI 53706, USA
| | - Olivier Fedrigo
- Vertebrate Genome Lab, The Rockefeller University, New York, NY 10065, USA
| | - John Steill
- Regenerative Biology, Morgridge Institute for Research, Madison, WI 53715, USA
| | - Scott A Swanson
- Regenerative Biology, Morgridge Institute for Research, Madison, WI 53715, USA
| | - Peng Jiang
- Center for Gene Regulation in Health and Disease (GRHD), Cleveland State University, Cleveland, OH, USA
- Department of Biological, Geological and Environmental Sciences, Cleveland State University, Cleveland, OH, USA
- Center for RNA Science and Therapeutics, School of Medicine, Case Western Reserve University, Cleveland, OH, USA
| | - Arang Rhie
- Genome Informatics Section, National Human Genome Research Institute, Bethesda, MD 20892, USA
| | - Giulio Formenti
- Laboratory of Neurogenetics of Language, The Rockefeller University/HHMI, New York, NY 10065, USA
| | - Adam M Phillippy
- Genome Informatics Section, National Human Genome Research Institute, Bethesda, MD 20892, USA
| | - Robert S Harris
- Department of Biology, Pennsylvania State University, University Park, PA 16802, USA
| | | | - Kerstin Howe
- Tree of Life, Wellcome Sanger Institute, Cambridge CB10 1SA, UK
| | - Bogdan M Kirilenko
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
- Institute of Cell Biology and Neuroscience, Faculty of Biosciences, Goethe University Frankfurt, 60438 Frankfurt, Germany
| | - Chetan Munegowda
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
- Institute of Cell Biology and Neuroscience, Faculty of Biosciences, Goethe University Frankfurt, 60438 Frankfurt, Germany
| | - Michael Hiller
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
- Institute of Cell Biology and Neuroscience, Faculty of Biosciences, Goethe University Frankfurt, 60438 Frankfurt, Germany
| | - Aashish Jain
- Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
| | - Daisuke Kihara
- Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
- Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
| | - J Spencer Johnston
- Department of Entomology, Texas A&M University, College Station, TX 77843, USA
| | - Alexander Ionkov
- Regenerative Biology, Morgridge Institute for Research, Madison, WI 53715, USA
| | - Kalpana Raja
- Regenerative Biology, Morgridge Institute for Research, Madison, WI 53715, USA
| | - Huishi Toh
- Neuroscience Research Institute, University of California, Santa Barbara, CA, USA
| | - Aimee Lang
- Southwest Fisheries Science Center, National Oceanic and Atmospheric Administration (NOAA), La Jolla, CA 92037, USA
| | - Magnus Wolf
- Institute for Evolution and Biodiversity (IEB), University of Muenster, 48149, Muenster, Germany
- Senckenberg Biodiversity and Climate Research Centre (BiK-F), Frankfurt am Main, Germany
| | - Erich D Jarvis
- Vertebrate Genome Lab, The Rockefeller University, New York, NY 10065, USA
- Laboratory of Neurogenetics of Language, The Rockefeller University/HHMI, New York, NY 10065, USA
| | - James A Thomson
- Regenerative Biology, Morgridge Institute for Research, Madison, WI 53715, USA
- Department of Molecular, Cellular and Developmental Biology, University of California Santa Barbara, Santa Barbara, CA 93106, USA
- Department of Cell and Regenerative Biology, University of Wisconsin School of Medicine and Public Health, Madison, WI 53726, USA
| | - Mark J P Chaisson
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, Los Angeles, CA 90089, USA
| | - Ron Stewart
- Regenerative Biology, Morgridge Institute for Research, Madison, WI 53715, USA
7
Larivière D, Abueg L, Brajuka N, Gallardo-Alba C, Grüning B, Ko BJ, Ostrovsky A, Palmada-Flores M, Pickett BD, Rabbani K, Antunes A, Balacco JR, Chaisson MJP, Cheng H, Collins J, Couture M, Denisova A, Fedrigo O, Gallo GR, Giani AM, Gooder GM, Horan K, Jain N, Johnson C, Kim H, Lee C, Marques-Bonet T, O'Toole B, Rhie A, Secomandi S, Sozzoni M, Tilley T, Uliano-Silva M, van den Beek M, Williams RW, Waterhouse RM, Phillippy AM, Jarvis ED, Schatz MC, Nekrutenko A, Formenti G. Scalable, accessible and reproducible reference genome assembly and evaluation in Galaxy. Nat Biotechnol 2024; 42:367-370. [PMID: 38278971 DOI: 10.1038/s41587-023-02100-3] [Indexed: 01/28/2024]
Affiliation(s)
- Delphine Larivière
- Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA, USA
| | - Linelle Abueg
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
| | - Nadolina Brajuka
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
| | - Cristóbal Gallardo-Alba
- Bioinformatics Group, Department of Computer Science, Albert-Ludwigs University Freiburg, Freiburg, Germany
| | - Bjorn Grüning
- Bioinformatics Group, Department of Computer Science, Albert-Ludwigs University Freiburg, Freiburg, Germany
| | - Byung June Ko
- Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, Republic of Korea
| | - Alex Ostrovsky
- Departments of Biology and Computer Science, Johns Hopkins University, Baltimore, MD, USA
| | - Marc Palmada-Flores
- Department of Medicine and Life Sciences (MELIS), Institut de Biologia Evolutiva, Universitat Pompeu Fabra-CSIC, Barcelona, Spain
| | - Brandon D Pickett
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Keon Rabbani
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
| | - Agostinho Antunes
- CIIMAR/CIMAR, Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Porto, Portugal
- Department of Biology, Faculty of Sciences, University of Porto, Porto, Portugal
| | - Jennifer R Balacco
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
| | - Mark J P Chaisson
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
| | - Haoyu Cheng
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | | | - Melanie Couture
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
| | - Alexandra Denisova
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia
| | - Olivier Fedrigo
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
| | | | | | | | - Kathleen Horan
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
| | - Nivesh Jain
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
| | - Cassidy Johnson
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
| | - Heebal Kim
- Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, Republic of Korea
- eGnome, Inc., Seoul, Republic of Korea
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
| | - Chul Lee
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
- Laboratory of Neurogenetics of Language, The Rockefeller University, New York, NY, USA
| | - Tomas Marques-Bonet
- Department of Medicine and Life Sciences (MELIS), Institut de Biologia Evolutiva, Universitat Pompeu Fabra-CSIC, Barcelona, Spain
- Catalan Institution of Research and Advanced Studies (ICREA), Barcelona, Spain
- CNAG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Spain
| | - Brian O'Toole
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
| | - Arang Rhie
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Simona Secomandi
- Department of Biological Sciences, University of Cyprus, Nicosia, Cyprus
| | | | - Tatiana Tilley
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
| | | | - Marius van den Beek
- Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA, USA
| | - Robert W Williams
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
| | - Robert M Waterhouse
- Department of Ecology & Evolution and Swiss Institute of Bioinformatics, University of Lausanne, Lausanne, Switzerland
| | - Adam M Phillippy
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Erich D Jarvis
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
| | - Michael C Schatz
- Departments of Biology and Computer Science, Johns Hopkins University, Baltimore, MD, USA
| | - Anton Nekrutenko
- Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA, USA
| | - Giulio Formenti
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
8
Bukhman YV, Meyer S, Chu LF, Abueg L, Antosiewicz-Bourget J, Balacco J, Brecht M, Dinatale E, Fedrigo O, Formenti G, Fungtammasan A, Giri SJ, Hiller M, Howe K, Kihara D, Mamott D, Mountcastle J, Pelan S, Rabbani K, Sims Y, Tracey A, Wood JMD, Jarvis ED, Thomson JA, Chaisson MJP, Stewart R. Chromosome level genome assembly of the Etruscan shrew Suncus etruscus. Sci Data 2024; 11:176. [PMID: 38326333 PMCID: PMC10850158 DOI: 10.1038/s41597-024-03011-x] [Received: 07/28/2023] [Accepted: 01/26/2024] [Indexed: 02/09/2024]
Abstract
Suncus etruscus is one of the world's smallest mammals, with an average body mass of about 2 grams. The Etruscan shrew's small body is accompanied by a very high energy demand and numerous metabolic adaptations. Here we report a chromosome-level genome assembly using PacBio long read sequencing, 10X Genomics linked short reads, optical mapping, and Hi-C linked reads. The assembly is partially phased, with the 2.472 Gbp primary pseudohaplotype and 1.515 Gbp alternate. We manually curated the primary assembly and identified 22 chromosomes, including X and Y sex chromosomes. The NCBI genome annotation pipeline identified 39,091 genes, 19,819 of them protein-coding. We also identified segmental duplications, inferred GO term annotations, and computed orthologs of human and mouse genes. This reference-quality genome will be an important resource for research on mammalian development, metabolism, and body size control.
Affiliation(s)
- Yury V Bukhman
- Regenerative Biology, Morgridge Institute for Research, 330 N. Orchard St., Madison, WI, 53715, USA
| | - Susanne Meyer
- Neuroscience Research Institute, University of California - Santa Barbara, 494 UCEN Rd, Isla Vista, CA, 93117, USA
| | - Li-Fang Chu
- Department of Comparative Biology and Experimental Medicine, University of Calgary, 2500 University Drive NW, Calgary, Alberta, T2N 1N4, Canada
| | - Linelle Abueg
- Vertebrate Genome Lab, The Rockefeller University, 1230 York Avenue, New York, NY, 10065, USA
| | | | - Jennifer Balacco
- Vertebrate Genome Lab, The Rockefeller University, 1230 York Avenue, New York, NY, 10065, USA
| | - Michael Brecht
- BCCN/Humboldt University Berlin, Philippstr, 13 House 6, 10115, Berlin, Germany
| | - Erica Dinatale
- Max Planck Institute for Biology Tübingen, Max-Planck-Ring 5, 72076, Tübingen, Germany
| | - Olivier Fedrigo
- Vertebrate Genome Lab, The Rockefeller University, 1230 York Avenue, New York, NY, 10065, USA
| | - Giulio Formenti
- Laboratory of Neurogenetics of Language, The Rockefeller University/HHMI, 1230 York Avenue, New York, NY, 10065, USA
| | | | - Swagarika Jaharlal Giri
- Department of Computer Science, Purdue University, 249 S. Martin Jischke Dr, West Lafayette, IN, 47907, USA
| | - Michael Hiller
- LOEWE Centre for Translational Biodiversity Genomics, Senckenberganlage 25, 60325, Frankfurt, Germany
- Senckenberg Research Institute, Senckenberganlage 25, 60325, Frankfurt, Germany
- Institute of Cell Biology and Neuroscience, Faculty of Biosciences, Goethe University Frankfurt, Max-von-Laue-Str. 9, 60438, Frankfurt, Germany
| | - Kerstin Howe
- Tree of Life, Wellcome Sanger Institute, Cambridge, CB10 1SA, UK
| | - Daisuke Kihara
- Department of Computer Science, Purdue University, 249 S. Martin Jischke Dr, West Lafayette, IN, 47907, USA
- Department of Biological Sciences, Purdue University, 249 S. Martin Jischke Dr., West Lafayette, IN, 47907, USA
| | - Daniel Mamott
- Regenerative Biology, Morgridge Institute for Research, 330 N. Orchard St., Madison, WI, 53715, USA
| | - Jacquelyn Mountcastle
- Vertebrate Genome Lab, The Rockefeller University, 1230 York Avenue, New York, NY, 10065, USA
| | - Sarah Pelan
- Tree of Life, Wellcome Sanger Institute, Cambridge, CB10 1SA, UK
| | - Keon Rabbani
- Department of Quantitative and Computational Biology, University of Southern California, 1050 Childs Way RRI 408, Los Angeles, CA, 90089, USA
| | - Ying Sims
- Tree of Life, Wellcome Sanger Institute, Cambridge, CB10 1SA, UK
| | - Alan Tracey
- Tree of Life, Wellcome Sanger Institute, Cambridge, CB10 1SA, UK
| | | | - Erich D Jarvis
- Vertebrate Genome Lab, The Rockefeller University, 1230 York Avenue, New York, NY, 10065, USA
- Laboratory of Neurogenetics of Language, The Rockefeller University/HHMI, 1230 York Avenue, New York, NY, 10065, USA
| | - James A Thomson
- Regenerative Biology, Morgridge Institute for Research, 330 N. Orchard St., Madison, WI, 53715, USA
- Department of Molecular, Cellular and Developmental Biology, University of California Santa Barbara, Santa Barbara, CA, 93106, USA
- Department of Cell and Regenerative Biology, University of Wisconsin School of Medicine and Public Health, Madison, WI, 53726, USA
| | - Mark J P Chaisson
- Department of Quantitative and Computational Biology, University of Southern California, 1050 Childs Way RRI 408, Los Angeles, CA, 90089, USA
| | - Ron Stewart
- Regenerative Biology, Morgridge Institute for Research, 330 N. Orchard St., Madison, WI, 53715, USA
9
Volpe E, Corda L, Tommaso ED, Pelliccia F, Ottalevi R, Licastro D, Guarracino A, Capulli M, Formenti G, Tassone E, Giunta S. The complete diploid reference genome of RPE-1 identifies human phased epigenetic landscapes. bioRxiv 2023:2023.11.01.565049. [PMID: 38168337 PMCID: PMC10760208 DOI: 10.1101/2023.11.01.565049] [Indexed: 01/05/2024]
Abstract
Comparative analysis of recent human genome assemblies highlights profound sequence divergence that peaks within polymorphic loci such as centromeres. This raises the question about the adequacy of relying on human reference genomes to accurately analyze sequencing data derived from experimental cell lines. Here, we generated the complete diploid genome assembly for the human retinal epithelial cells (RPE-1), a widely used non-cancer laboratory cell line with a stable karyotype, to use as matched reference for multi-omics sequencing data analysis. Our RPE1v1.0 assembly presents completely phased haplotypes and chromosome-level scaffolds that span centromeres with ultra-high base accuracy (>QV60). We mapped the haplotype-specific genomic variation specific to this cell line including t(Xq;10q), a stable 73.18 Mb duplication of chromosome 10 translocated onto the microdeleted chromosome X telomere t(Xq;10q). Polymorphisms between haplotypes of the same genome reveal genetic and epigenetic variation for all chromosomes, especially at centromeres. The RPE-1 assembly as matched reference genome improves mapping quality of multi-omics reads originating from RPE-1 cells with drastic reduction in alignment mismatches compared to using the most complete human reference to date (CHM13). Leveraging the accuracy achieved using a matched reference, we were able to identify the kinetochore sites at base pair resolution and show unprecedented variation between haplotypes. This work showcases the use of matched reference genomes for multiomics analyses and serves as the foundation for a call to comprehensively assemble experimentally relevant cell lines for widespread application.
Affiliation(s)
- Emilia Volpe
- Giunta Laboratory of Genome Evolution, Department of Biology and Biotechnologies Charles Darwin, University of Rome “Sapienza”, Piazzale Aldo Moro 5, 00185 Rome, Italy
| | - Luca Corda
- Giunta Laboratory of Genome Evolution, Department of Biology and Biotechnologies Charles Darwin, University of Rome “Sapienza”, Piazzale Aldo Moro 5, 00185 Rome, Italy
| | - Elena Di Tommaso
- Giunta Laboratory of Genome Evolution, Department of Biology and Biotechnologies Charles Darwin, University of Rome “Sapienza”, Piazzale Aldo Moro 5, 00185 Rome, Italy
| | - Franca Pelliccia
- Giunta Laboratory of Genome Evolution, Department of Biology and Biotechnologies Charles Darwin, University of Rome “Sapienza”, Piazzale Aldo Moro 5, 00185 Rome, Italy
| | - Riccardo Ottalevi
- Department of Bioinformatic, Dante Genomics Corp Inc., 667 Madison Avenue, New York, NY 10065 USA and S.s.17, 67100, L’Aquila, Italy
| | | | - Andrea Guarracino
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN 38163, USA
| | - Mattia Capulli
- Department of Biotechnological and Applied Clinical Sciences, University of L’Aquila, L’Aquila, Italy
| | - Giulio Formenti
- The Rockefeller University, 1230 York Avenue, 10065 New York, USA
| | - Evelyne Tassone
- Giunta Laboratory of Genome Evolution, Department of Biology and Biotechnologies Charles Darwin, University of Rome “Sapienza”, Piazzale Aldo Moro 5, 00185 Rome, Italy
| | - Simona Giunta
- Giunta Laboratory of Genome Evolution, Department of Biology and Biotechnologies Charles Darwin, University of Rome “Sapienza”, Piazzale Aldo Moro 5, 00185 Rome, Italy
10
Lee YH, Abueg L, Kim JK, Kim YW, Fedrigo O, Balacco J, Formenti G, Howe K, Tracey A, Wood J, Thibaud-Nissen F, Nam BH, No ES, Kim HR, Lee C, Jarvis ED, Kim H. Chromosome-level genome assembly of chub mackerel (Scomber japonicus) from the Indo-Pacific Ocean. Sci Data 2023; 10:880. [PMID: 38066002 PMCID: PMC10709322 DOI: 10.1038/s41597-023-02782-z] [Received: 06/14/2023] [Accepted: 11/23/2023] [Indexed: 12/18/2023]
Abstract
Chub mackerels (Scomber japonicus) are a migratory marine fish widely distributed in the Indo-Pacific Ocean. They are globally consumed for their high Omega-3 content, but their population is declining due to global warming. Here, we generated the first chromosome-level genome assembly of chub mackerel (fScoJap1) using the Vertebrate Genomes Project assembly pipeline with PacBio HiFi genomic sequencing and Arima Hi-C chromosome contact data. The final assembly is 828.68 Mb with 24 chromosomes, nearly all containing telomeric repeats at their ends. We annotated 31,656 genes and discovered that approximately 2.19% of the genome contained DNA transposon elements repressed within duplicated genes. Analyzing 5-methylcytosine (5mC) modifications using HiFi reads, we observed open/close chromatin patterns at gene promoters, including the FADS2 gene involved in Omega-3 production. This chromosome-level reference genome provides unprecedented opportunities for advancing our knowledge of chub mackerels in biology, industry, and conservation.
Affiliation(s)
- Young Ho Lee
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
- Linelle Abueg
  - Vertebrate Genome Laboratory, The Rockefeller University, New York, New York, USA
- Jin-Koo Kim
  - Department of Marine Biology, Pukyong National University, Busan, 48513, Republic of Korea
- Young Wook Kim
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
- Olivier Fedrigo
  - Vertebrate Genome Laboratory, The Rockefeller University, New York, New York, USA
- Jennifer Balacco
  - Vertebrate Genome Laboratory, The Rockefeller University, New York, New York, USA
- Giulio Formenti
  - Vertebrate Genome Laboratory, The Rockefeller University, New York, New York, USA
- Kerstin Howe
  - Tree of Life, Wellcome Sanger Institute, Cambridge, CB10 1SA, UK
- Alan Tracey
  - Tree of Life, Wellcome Sanger Institute, Cambridge, CB10 1SA, UK
- Jonathan Wood
  - Tree of Life, Wellcome Sanger Institute, Cambridge, CB10 1SA, UK
- Françoise Thibaud-Nissen
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Bo Hye Nam
  - Biotechnology Research Division, National Institute of Fisheries Science, Haean-ro 216, Gijang-eup, Gijang-gun, Busan, 46083, Korea
- Eun Soo No
  - Biotechnology Research Division, National Institute of Fisheries Science, Haean-ro 216, Gijang-eup, Gijang-gun, Busan, 46083, Korea
- Hye Ran Kim
  - Plant Systems Engineering Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Korea
- Chul Lee
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
  - Laboratory of Neurogenetics of Language, The Rockefeller University, New York City, NY, 10065, USA
- Erich D Jarvis
  - Vertebrate Genome Laboratory, The Rockefeller University, New York, New York, USA
  - Laboratory of Neurogenetics of Language, The Rockefeller University, New York City, NY, 10065, USA
  - Howard Hughes Medical Institute, Chevy Chase, Maryland, USA
- Heebal Kim
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
  - eGnome inc., C-1008, H Businesspark, 26, Beobwon-ro 9-gil, Songpa-gu, Seoul, Republic of Korea
  - Department of Agricultural Biotechnology and Research Institute for Agriculture and Life Sciences, Seoul National University, Seoul, Republic of Korea
11
Rice ES, Alberdi A, Alfieri J, Athrey G, Balacco JR, Bardou P, Blackmon H, Charles M, Cheng HH, Fedrigo O, Fiddaman SR, Formenti G, Frantz LAF, Gilbert MTP, Hearn CJ, Jarvis ED, Klopp C, Marcos S, Mason AS, Velez-Irizarry D, Xu L, Warren WC. A pangenome graph reference of 30 chicken genomes allows genotyping of large and complex structural variants. BMC Biol 2023; 21:267. [PMID: 37993882 PMCID: PMC10664547 DOI: 10.1186/s12915-023-01758-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2023] [Accepted: 11/02/2023] [Indexed: 11/24/2023] Open
Abstract
BACKGROUND: The red junglefowl, the wild outgroup of domestic chickens, has historically served as a reference for genomic studies of domestic chickens. These studies have provided insight into the etiology of traits of commercial importance. However, the use of a single reference genome does not capture diversity present among modern breeds, many of which have accumulated molecular changes due to drift and selection. While reference-based resequencing is well-suited to cataloging simple variants such as single-nucleotide changes and short insertions and deletions, it is mostly inadequate to discover more complex structural variation in the genome.
METHODS: We present a pangenome for the domestic chicken consisting of thirty assemblies of chickens from different breeds and research lines.
RESULTS: We demonstrate how this pangenome can be used to catalog structural variants present in modern breeds and untangle complex nested variation. We show that alignment of short reads from 100 diverse wild and domestic chickens to this pangenome reduces reference bias by 38%, which affects downstream genotyping results. This approach also allows for the accurate genotyping of a large and complex pair of structural variants at the K feathering locus using short reads, which would not be possible using a linear reference.
CONCLUSIONS: We expect that this new paradigm of genomic reference will allow better pinpointing of exact mutations responsible for specific phenotypes, which will in turn be necessary for breeding chickens that meet new sustainability criteria and are resilient to quickly evolving pathogen threats.
Affiliation(s)
- Edward S Rice
  - Bond Life Sciences Center, University of Missouri, Columbia, MO, USA
  - Faculty of Veterinary Medicine, Ludwig-Maximilians-Universität, Munich, Germany
- Antton Alberdi
  - Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen (UCPH), Copenhagen, Denmark
- James Alfieri
  - Department of Ecology & Evolutionary Biology, Texas A&M University, College Station, TX, USA
- Giridhar Athrey
  - Department of Poultry Science, Texas A&M University, College Station, TX, USA
- Jennifer R Balacco
  - Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
- Philippe Bardou
  - Sigenae, GenPhySE, Université de Toulouse, INRAE, ENVT, Castanet Tolosan, 31326, France
- Heath Blackmon
  - Department of Biology, Texas A&M University, College Station, TX, USA
- Mathieu Charles
  - University Paris-Saclay, INRAE, AgroParisTech, GABI, Sigenae, Jouy-en-Josas, France
- Hans H Cheng
  - Avian Disease and Oncology Laboratory, USDA, ARS, USNPRC, East Lansing, MI, USA
- Olivier Fedrigo
  - Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
- Giulio Formenti
  - Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
- Laurent A F Frantz
  - Faculty of Veterinary Medicine, Ludwig-Maximilians-Universität, Munich, Germany
  - School of Biological and Behavioural Sciences, Queen Mary University of London, London, E1 4DQ, UK
- M Thomas P Gilbert
  - Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen (UCPH), Copenhagen, Denmark
- Cari J Hearn
  - Avian Disease and Oncology Laboratory, USDA, ARS, USNPRC, East Lansing, MI, USA
- Erich D Jarvis
  - Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
  - The Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Christophe Klopp
  - Sigenae, Genotoul Bioinfo, MIAT UR875, INRAE, Castanet Tolosan, France
- Sofia Marcos
  - Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen (UCPH), Copenhagen, Denmark
  - Applied Genomics and Bioinformatics, University of the Basque Country (UPV/EHU), Leioa, Bilbao, Spain
- Luohao Xu
  - Key Laboratory of Freshwater Fish Reproduction and Development (Ministry of Education), Key Laboratory of Aquatic Science of Chongqing, School of Life Sciences, Southwest University, Chongqing, 400715, China
- Wesley C Warren
  - Department of Animal Sciences, University of Missouri, Columbia, MO, USA
12
Skorupski J, Brandes F, Seebass C, Festl W, Śmietana P, Balacco J, Jain N, Tilley T, Abueg L, Wood J, Sims Y, Formenti G, Fedrigo O, Jarvis ED. Prioritizing Endangered Species in Genome Sequencing: Conservation Genomics in Action with the First Platinum-Standard Reference-Quality Genome of the Critically Endangered European Mink Mustela lutreola L., 1761. Int J Mol Sci 2023; 24:14816. [PMID: 37834264 PMCID: PMC10573602 DOI: 10.3390/ijms241914816] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Revised: 09/23/2023] [Accepted: 09/26/2023] [Indexed: 10/15/2023] Open
Abstract
The European mink Mustela lutreola (Mustelidae) ranks among the most endangered mammalian species globally, experiencing a rapid and severe decline in population size, density, and distribution. Given the critical need for effective conservation strategies, understanding its genomic characteristics becomes paramount. To address this challenge, the platinum-quality, chromosome-level reference genome assembly for the European mink was successfully generated under the project of the European Mink Centre consortium. Leveraging PacBio HiFi long reads, we obtained a 2586.3 Mbp genome comprising 25 scaffolds, with an N50 length of 154.1 Mbp. Through Hi-C data, we clustered and ordered the majority of the assembly (>99.9%) into 20 chromosomal pseudomolecules, including heterosomes, ranging from 6.8 to 290.1 Mbp. The newly sequenced genome displays a GC base content of 41.9%. Additionally, we successfully assembled the complete mitochondrial genome, spanning 16.6 kbp in length. The assembly achieved a BUSCO (Benchmarking Universal Single-Copy Orthologs) completeness score of 98.2%. This high-quality reference genome serves as a valuable genomic resource for future population genomics studies concerning the European mink and related taxa. Furthermore, the newly assembled genome holds significant potential in addressing key conservation challenges faced by M. lutreola. Its applications encompass potential revision of management units, assessment of captive breeding impacts, resolution of phylogeographic questions, and facilitation of monitoring and evaluating the efficiency and effectiveness of dedicated conservation strategies for the European mink. This species serves as an example that highlights the paramount importance of prioritizing endangered species in genome sequencing projects due to the race against time, which necessitates the comprehensive exploration and characterization of their genomic resources before their populations face extinction.
Affiliation(s)
- Jakub Skorupski
  - Institute of Marine and Environmental Sciences, University of Szczecin, Wąska 13 St., 71-415 Szczecin, Poland
  - Polish Society for Conservation Genetics LUTREOLA, Maciejkowa 21 St., 71-784 Szczecin, Poland
- Florian Brandes
  - Wildtier- und Artenschutzstation e.V., Hohe Warte 1, 31553 Sachsenhagen, Germany
- Wolfgang Festl
  - EuroNerz e.V., Kleine Gildewart 3, 49074 Osnabrück, Germany
- Przemysław Śmietana
  - Institute of Marine and Environmental Sciences, University of Szczecin, Wąska 13 St., 71-415 Szczecin, Poland
  - Polish Society for Conservation Genetics LUTREOLA, Maciejkowa 21 St., 71-784 Szczecin, Poland
- Jennifer Balacco
  - Vertebrate Genome Laboratory, The Rockefeller University, 1230 York Avenue, Box 366, New York, NY 10065, USA
- Nivesh Jain
  - Vertebrate Genome Laboratory, The Rockefeller University, 1230 York Avenue, Box 366, New York, NY 10065, USA
- Tatiana Tilley
  - Vertebrate Genome Laboratory, The Rockefeller University, 1230 York Avenue, Box 366, New York, NY 10065, USA
- Linelle Abueg
  - Vertebrate Genome Laboratory, The Rockefeller University, 1230 York Avenue, Box 366, New York, NY 10065, USA
- Jonathan Wood
  - Vertebrate Genome Laboratory, The Rockefeller University, 1230 York Avenue, Box 366, New York, NY 10065, USA
- Ying Sims
  - Vertebrate Genome Laboratory, The Rockefeller University, 1230 York Avenue, Box 366, New York, NY 10065, USA
- Giulio Formenti
  - Vertebrate Genome Laboratory, The Rockefeller University, 1230 York Avenue, Box 366, New York, NY 10065, USA
- Olivier Fedrigo
  - Vertebrate Genome Laboratory, The Rockefeller University, 1230 York Avenue, Box 366, New York, NY 10065, USA
- Erich D. Jarvis
  - Vertebrate Genome Laboratory, The Rockefeller University, 1230 York Avenue, Box 366, New York, NY 10065, USA
13
Rhie A, Nurk S, Cechova M, Hoyt SJ, Taylor DJ, Altemose N, Hook PW, Koren S, Rautiainen M, Alexandrov IA, Allen J, Asri M, Bzikadze AV, Chen NC, Chin CS, Diekhans M, Flicek P, Formenti G, Fungtammasan A, Garcia Giron C, Garrison E, Gershman A, Gerton JL, Grady PGS, Guarracino A, Haggerty L, Halabian R, Hansen NF, Harris R, Hartley GA, Harvey WT, Haukness M, Heinz J, Hourlier T, Hubley RM, Hunt SE, Hwang S, Jain M, Kesharwani RK, Lewis AP, Li H, Logsdon GA, Lucas JK, Makalowski W, Markovic C, Martin FJ, Mc Cartney AM, McCoy RC, McDaniel J, McNulty BM, Medvedev P, Mikheenko A, Munson KM, Murphy TD, Olsen HE, Olson ND, Paulin LF, Porubsky D, Potapova T, Ryabov F, Salzberg SL, Sauria MEG, Sedlazeck FJ, Shafin K, Shepelev VA, Shumate A, Storer JM, Surapaneni L, Taravella Oill AM, Thibaud-Nissen F, Timp W, Tomaszkiewicz M, Vollger MR, Walenz BP, Watwood AC, Weissensteiner MH, Wenger AM, Wilson MA, Zarate S, Zhu Y, Zook JM, Eichler EE, O'Neill RJ, Schatz MC, Miga KH, Makova KD, Phillippy AM. The complete sequence of a human Y chromosome. Nature 2023; 621:344-354. [PMID: 37612512 PMCID: PMC10752217 DOI: 10.1038/s41586-023-06457-y] [Citation(s) in RCA: 41] [Impact Index Per Article: 41.0] [Received: 12/02/2022] [Accepted: 07/19/2023] [Indexed: 08/25/2023]
Abstract
The human Y chromosome has been notoriously difficult to sequence and assemble because of its complex repeat structure that includes long palindromes, tandem repeats and segmental duplications [1-3]. As a result, more than half of the Y chromosome is missing from the GRCh38 reference sequence and it remains the last human chromosome to be finished [4,5]. Here, the Telomere-to-Telomere (T2T) consortium presents the complete 62,460,029-base-pair sequence of a human Y chromosome from the HG002 genome (T2T-Y) that corrects multiple errors in GRCh38-Y and adds over 30 million base pairs of sequence to the reference, showing the complete ampliconic structures of gene families TSPY, DAZ and RBMY; 41 additional protein-coding genes, mostly from the TSPY family; and an alternating pattern of human satellite 1 and 3 blocks in the heterochromatic Yq12 region. We have combined T2T-Y with a previous assembly of the CHM13 genome [4] and mapped available population variation, clinical variants and functional genomics data to produce a complete and comprehensive reference sequence for all 24 human chromosomes.
Affiliation(s)
- Arang Rhie
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Sergey Nurk
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
  - Oxford Nanopore Technologies Inc., Oxford, UK
- Monika Cechova
  - Faculty of Informatics, Masaryk University, Brno, Czech Republic
  - Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
- Savannah J Hoyt
  - Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
- Dylan J Taylor
  - Department of Biology, Johns Hopkins University, Baltimore, MD, USA
- Nicolas Altemose
  - Department of Molecular and Cell Biology, University of California, Berkeley, CA, USA
- Paul W Hook
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Sergey Koren
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Mikko Rautiainen
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Ivan A Alexandrov
  - Federal Research Center of Biotechnology of the Russian Academy of Sciences, Moscow, Russia
  - Center for Algorithmic Biotechnology, Saint Petersburg State University, St Petersburg, Russia
  - Department of Anatomy and Anthropology and Department of Human Molecular Genetics and Biochemistry, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv-Yafo, Israel
- Jamie Allen
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Mobin Asri
  - UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Andrey V Bzikadze
  - Graduate Program in Bioinformatics and Systems Biology, University of California, San Diego, CA, USA
- Nae-Chyun Chen
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Chen-Shan Chin
  - GeneDX Holdings Corp, Stamford, CT, USA
  - Foundation of Biological Data Science, Belmont, CA, USA
- Mark Diekhans
  - UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Paul Flicek
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
  - Department of Genetics, University of Cambridge, Cambridge, UK
- Carlos Garcia Giron
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Erik Garrison
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Ariel Gershman
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Jennifer L Gerton
  - Stowers Institute for Medical Research, Kansas City, MO, USA
  - University of Kansas Medical Center, Kansas City, MO, USA
- Patrick G S Grady
  - Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
- Andrea Guarracino
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
  - Genomics Research Centre, Human Technopole, Milan, Italy
- Leanne Haggerty
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Reza Halabian
  - Institute of Bioinformatics, Faculty of Medicine, University of Münster, Münster, Germany
- Nancy F Hansen
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
  - Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Robert Harris
  - Department of Biology, Pennsylvania State University, University Park, PA, USA
- Gabrielle A Hartley
  - Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
- William T Harvey
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Marina Haukness
  - UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Jakob Heinz
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Thibaut Hourlier
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Sarah E Hunt
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Stephen Hwang
  - XDBio Program, Johns Hopkins University, Baltimore, MD, USA
- Miten Jain
  - Department of Bioengineering, Department of Physics, Northeastern University, Boston, MA, USA
- Rupesh K Kesharwani
  - Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX, USA
- Alexandra P Lewis
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Heng Li
  - Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Glennis A Logsdon
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Julian K Lucas
  - Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
  - UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Wojciech Makalowski
  - Institute of Bioinformatics, Faculty of Medicine, University of Münster, Münster, Germany
- Christopher Markovic
  - Genome Technology Access Center at the McDonnell Genome Institute, Washington University, St. Louis, MO, USA
- Fergal J Martin
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Ann M Mc Cartney
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Rajiv C McCoy
  - Department of Biology, Johns Hopkins University, Baltimore, MD, USA
- Jennifer McDaniel
  - Biosystems and Biomaterials Division, National Institute of Standards and Technology, Gaithersburg, MD, USA
- Brandy M McNulty
  - Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
  - UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Paul Medvedev
  - Department of Computer Science and Engineering, Pennsylvania State University, University Park, PA, USA
  - Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA, USA
  - Center for Computational Biology and Bioinformatics, Pennsylvania State University, University Park, PA, USA
- Alla Mikheenko
  - Center for Algorithmic Biotechnology, Saint Petersburg State University, St Petersburg, Russia
  - UCL Queen Square Institute of Neurology, UCL, London, UK
- Katherine M Munson
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Terence D Murphy
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Hugh E Olsen
  - Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
  - UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Nathan D Olson
  - Biosystems and Biomaterials Division, National Institute of Standards and Technology, Gaithersburg, MD, USA
- Luis F Paulin
  - Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX, USA
- David Porubsky
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Tamara Potapova
  - Stowers Institute for Medical Research, Kansas City, MO, USA
- Fedor Ryabov
  - Masters Program in National Research University Higher School of Economics, Moscow, Russia
- Steven L Salzberg
  - Departments of Biomedical Engineering, Computer Science, and Biostatistics, Johns Hopkins University, Baltimore, MD, USA
- Fritz J Sedlazeck
  - Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX, USA
  - Department of Computer Science, Rice University, Houston, TX, USA
- Alaina Shumate
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Likhitha Surapaneni
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Angela M Taravella Oill
  - Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Françoise Thibaud-Nissen
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Winston Timp
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Marta Tomaszkiewicz
  - Department of Biology, Pennsylvania State University, University Park, PA, USA
  - Department of Biomedical Engineering, Pennsylvania State University, State College, PA, USA
- Mitchell R Vollger
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Brian P Walenz
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Allison C Watwood
  - Department of Biology, Pennsylvania State University, University Park, PA, USA
- Melissa A Wilson
  - Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Samantha Zarate
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Yiming Zhu
  - Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX, USA
- Justin M Zook
  - Biosystems and Biomaterials Division, National Institute of Standards and Technology, Gaithersburg, MD, USA
- Evan E Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
  - Investigator, Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Rachel J O'Neill
  - Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
  - Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
  - Department of Genetics and Genome Sciences, UConn Health, Farmington, CT, USA
- Michael C Schatz
  - Department of Biology, Johns Hopkins University, Baltimore, MD, USA
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Karen H Miga
  - Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
  - UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Kateryna D Makova
  - Department of Biology, Pennsylvania State University, University Park, PA, USA
- Adam M Phillippy
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
14
Sharaf A, Ndiribe CC, Omotoriogun TC, Abueg L, Badaoui B, Badiane Markey FJ, Beedessee G, Diouf D, Duru VC, Ebuzome C, Eziuzor SC, Jaufeerally Fakim Y, Formenti G, Ghanmi N, Guerfali FZ, Houaga I, Ideozu JE, Katee SM, Khayi S, Kuja JO, Kwon-Ndung EH, Marks RA, Moila AM, Mungloo-Dilmohamud Z, Muzemil S, Nigussie H, Osuji JO, Ras V, Tchiechoua YH, Zoclanclounon YAB, Tolley KA, Ziyomo C, Mapholi N, Muigai AWT, Djikeng A, Ebenezer TE. Bridging the gap in African biodiversity genomics and bioinformatics. Nat Biotechnol 2023; 41:1348-1354. [PMID: 37699986 DOI: 10.1038/s41587-023-01933-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/14/2023]
Affiliation(s)
- Abdoallah Sharaf
  - SequAna Core Facility, Department of Biology, University of Konstanz, Konstanz, Germany
  - Genetic Department, Faculty of Agriculture, Ain Shams University, Cairo, Egypt
- Charlotte C Ndiribe
  - Department of Cell Biology and Genetics, University of Lagos, Lagos, Nigeria
- Taiwo Crossby Omotoriogun
  - Biotechnology Unit, Department of Biological Sciences, Elizade University, Ilara-Mokin, Nigeria
  - A.P. Leventis Ornithological Research Institute, University of Jos, Jos, Nigeria
- Linelle Abueg
  - Vertebrate Genome Lab, The Rockefeller University, New York, NY, USA
- Bouabid Badaoui
  - Mohammed V University in Rabat, Rabat, Morocco
  - African Sustainable Agriculture Research Institute, Mohammed VI Polytechnic University, Laâyoune, Morocco
- Girish Beedessee
  - Department of Biochemistry, University of Cambridge, Cambridge, UK
- Diaga Diouf
  - Laboratoire Campus de Biotechnologies Végétales, Département de Biologie Végétale, Faculté des Sciences et Techniques, Université Cheikh Anta Diop, Dakar, Sénégal
- Vincent C Duru
  - Department of Parasitology and Entomology, Nnamdi Azikiwe University, Awka, Nigeria
- Samuel C Eziuzor
  - Department of Isotope Biogeochemistry, Helmholtz Center for Environmental Research-UFZ, Leipzig, Germany
- Giulio Formenti
  - Vertebrate Genome Lab, The Rockefeller University, New York, NY, USA
- Nidhal Ghanmi
  - Bioinformatics Lab, Pasteur Institute of Tunis, Tunis, Tunisia
- Fatma Zahra Guerfali
  - Laboratory of Transmission, Control and Immunobiology of Infections, Pasteur Institute of Tunis, Tunis, Tunisia
  - University of Tunis El Manar, University Campus Farhat Hached, Tunis, Tunisia
- Isidore Houaga
  - Centre for Tropical Livestock Genetics and Health, Roslin Institute, University of Edinburgh, Edinburgh, UK
- Slimane Khayi
  - Biotechnology Research Unit, CRRA-Rabat, National Institute of Agricultural Research, Rabat, Morocco
- Josiah O Kuja
  - Bioinformatics Center, University of Copenhagen, Copenhagen, Denmark
- Rose A Marks
  - Department of Horticulture, Michigan State University, East Lansing, MI, USA
  - Department of Molecular and Cell Biology, University of Cape Town, Cape Town, South Africa
- Sadik Muzemil
  - School of Life Science, University of Warwick, Coventry, UK
- Helen Nigussie
  - Department of Microbial, Cellular and Molecular Biology, Addis Ababa University, Addis Ababa, Ethiopia
- Verena Ras
  - Computational Biology Division, Department of Integrative Biomedical Sciences, IDM, CIDRI Africa Wellcome Trust Centre, University of Cape Town, Cape Town, South Africa
  - Department of Biodiversity and Conservation Biology, University of the Western Cape, Bellville, South Africa
- Yves H Tchiechoua
  - Pan African University Institute for Basic Sciences Technology and Innovation, Nairobi, Kenya
- Krystal A Tolley
  - South African National Biodiversity Institute, Claremont, Cape Town, South Africa
  - Centre for Ecological Genomics and Wildlife Conservation, University of Johannesburg, Johannesburg, South Africa
- Ntanganedzeni Mapholi
  - Department of Agriculture and Animal Health, University of South Africa, Florida, South Africa
- Anne W T Muigai
  - Jomo Kenyatta University of Agriculture and Technology, Nairobi, Kenya
  - National Defense University-Kenya, Nakuru, Kenya
- Appolinaire Djikeng
  - Centre for Tropical Livestock Genetics and Health, Roslin Institute, University of Edinburgh, Edinburgh, UK
  - International Livestock Research Institute, Nairobi, Kenya
  - Department of Agriculture and Animal Health, University of South Africa, Florida, South Africa
- ThankGod Echezona Ebenezer
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
  - Early Cancer Institute, Department of Oncology, School of Clinical Medicine, University of Cambridge, Cambridge, UK
15
Sozzoni M, Ferrer Obiol J, Formenti G, Tigano A, Paris JR, Balacco JR, Jain N, Tilley T, Collins J, Sims Y, Wood J, Benowitz-Fredericks ZM, Field KA, Seyoum E, Gatt MC, Léandri-Breton DJ, Nakajima C, Whelan S, Gianfranceschi L, Hatch SA, Elliott KH, Shoji A, Cecere JG, Jarvis ED, Pilastro A, Rubolini D. A Chromosome-Level Reference Genome for the Black-Legged Kittiwake (Rissa tridactyla), a Declining Circumpolar Seabird. Genome Biol Evol 2023; 15:evad153. [PMID: 37590950 PMCID: PMC10457150 DOI: 10.1093/gbe/evad153] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Revised: 08/02/2023] [Accepted: 08/09/2023] [Indexed: 08/19/2023] Open
Abstract
Amidst the current biodiversity crisis, the availability of genomic resources for declining species can provide important insights into the factors driving population decline. In the early 1990s, the black-legged kittiwake (Rissa tridactyla), a pelagic gull widely distributed across the arctic, subarctic, and temperate zones, suffered a steep population decline following an abrupt warming of sea surface temperature across its distribution range and is currently listed as Vulnerable by the International Union for the Conservation of Nature. Kittiwakes have long been the focus for field studies of physiology, ecology, and ecotoxicology and are primary indicators of fluctuating ecological conditions in arctic and subarctic marine ecosystems. We present a high-quality chromosome-level reference genome and annotation for the black-legged kittiwake using a combination of Pacific Biosciences HiFi sequencing, Bionano optical maps, Hi-C reads, and RNA-Seq data. The final assembly spans 1.35 Gb across 32 chromosomes, with a scaffold N50 of 88.21 Mb and a BUSCO completeness of 97.4%. This genome assembly substantially improves the quality of a previous draft genome, showing an approximately 5× increase in contiguity and a more complete annotation. Using this new chromosome-level reference genome and three more chromosome-level assemblies of Charadriiformes, we uncover several lineage-specific chromosome fusions and fissions, but find no shared rearrangements, suggesting that interchromosomal rearrangements have been commonplace throughout the diversification of Charadriiformes. This new high-quality genome assembly will enable population genomic, transcriptomic, and phenotype-genotype association studies in a widely studied sentinel species, which may provide important insights into the impacts of global change on marine systems.
Affiliation(s)
- Marcella Sozzoni
- Department of Biology, University of Florence, Sesto Fiorentino, Florence, Italy
- Joan Ferrer Obiol
- Department of Environmental Science and Policy, University of Milan, Milan, Italy
- Giulio Formenti
- Vertebrate Genome Laboratory, The Rockefeller University, New York, New York, USA
- Anna Tigano
- Department of Biology, Queen’s University, Kingston, Ontario, Canada
- Department of Biology, The University of British Columbia, Kelowna, British Columbia, Canada
- Josephine R Paris
- Department of Life and Environmental Sciences, Marche Polytechnic University, Ancona, Italy
- Jennifer R Balacco
- Vertebrate Genome Laboratory, The Rockefeller University, New York, New York, USA
- Nivesh Jain
- Vertebrate Genome Laboratory, The Rockefeller University, New York, New York, USA
- Tatiana Tilley
- Vertebrate Genome Laboratory, The Rockefeller University, New York, New York, USA
- Joanna Collins
- Tree of Life, Wellcome Sanger Institute, Cambridge, United Kingdom
- Ying Sims
- Tree of Life, Wellcome Sanger Institute, Cambridge, United Kingdom
- Jonathan Wood
- Tree of Life, Wellcome Sanger Institute, Cambridge, United Kingdom
- Kenneth A Field
- Department of Biology, Bucknell University, Lewisburg, Pennsylvania, USA
- Eyuel Seyoum
- Department of Biology, Bucknell University, Lewisburg, Pennsylvania, USA
- Marie Claire Gatt
- Department of Environmental Science and Policy, University of Milan, Milan, Italy
- Don-Jean Léandri-Breton
- Department of Natural Resource Sciences, McGill University, Ste-Anne-de-Bellevue, Quebec, Canada
- Centre d’Études Biologiques de Chizé (CEBC), UMR 7372 - CNRS & Université de La Rochelle, Villiers-en-Bois, France
- Chinatsu Nakajima
- Department of Life and Environmental Science, University of Tsukuba, Tsukuba, Japan
- Shannon Whelan
- Department of Natural Resource Sciences, McGill University, Ste-Anne-de-Bellevue, Quebec, Canada
- Scott A Hatch
- Institute for Seabird Research and Conservation, Anchorage, Alaska, USA
- Kyle H Elliott
- Department of Natural Resource Sciences, McGill University, Ste-Anne-de-Bellevue, Quebec, Canada
- Akiko Shoji
- Department of Life and Environmental Science, University of Tsukuba, Tsukuba, Japan
- Erich D Jarvis
- Vertebrate Genome Laboratory, The Rockefeller University, New York, New York, USA
- Howard Hughes Medical Institute, Chevy Chase, Maryland, USA
- Diego Rubolini
- Department of Environmental Science and Policy, University of Milan, Milan, Italy
- Water Research Institute, IRSA-CNR, Brugherio, Monza and Brianza, Italy
16
Uliano-Silva M, Ferreira JGRN, Krasheninnikova K, Formenti G, Abueg L, Torrance J, Myers EW, Durbin R, Blaxter M, McCarthy SA. MitoHiFi: a python pipeline for mitochondrial genome assembly from PacBio high fidelity reads. BMC Bioinformatics 2023; 24:288. [PMID: 37464285 PMCID: PMC10354987 DOI: 10.1186/s12859-023-05385-y] [Citation(s) in RCA: 47] [Impact Index Per Article: 47.0] [Received: 03/03/2023] [Accepted: 06/13/2023] [Indexed: 07/20/2023]
Abstract
BACKGROUND: PacBio high fidelity (HiFi) sequencing reads are both long (15-20 kb) and highly accurate (> Q20). Because of these properties, they have revolutionised genome assembly, leading to more accurate and contiguous genomes. In eukaryotes the mitochondrial genome is sequenced alongside the nuclear genome, often at very high coverage. A dedicated tool for mitochondrial genome assembly using HiFi reads is still missing. RESULTS: MitoHiFi was developed within the Darwin Tree of Life Project to assemble mitochondrial genomes from the HiFi reads generated for target species. The input for MitoHiFi is either the raw reads or the assembled contigs, and the tool outputs a mitochondrial genome sequence fasta file along with annotation of protein and RNA genes. Variants arising from heteroplasmy are assembled independently, and nuclear insertions of mitochondrial sequences are identified and not used in organellar genome assembly. MitoHiFi has been used to assemble 374 mitochondrial genomes (368 Metazoa and 6 Fungi species) for the Darwin Tree of Life Project, the Vertebrate Genomes Project and the Aquatic Symbiosis Genome Project. Inspection of 60 mitochondrial genomes assembled with MitoHiFi for species that already have reference sequences in public databases showed the widespread presence of previously unreported repeats. CONCLUSIONS: MitoHiFi is able to assemble mitochondrial genomes from a wide phylogenetic range of taxa from PacBio HiFi data. MitoHiFi is written in Python and is freely available on GitHub (https://github.com/marcelauliano/MitoHiFi). MitoHiFi is available with its dependencies as a Docker container on GitHub (ghcr.io/marcelauliano/mitohifi:master).
Affiliation(s)
- João Gabriel R. N. Ferreira
- Bio Bureau Biotecnologia, Rio de Janeiro, Brazil
- Instituto de Biofísica Carlos Chagas Filho, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
- James Torrance
- Tree of Life, Wellcome Sanger Institute, Cambridge, CB10 1SA UK
- Eugene W. Myers
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
- Okinawa Institute of Science and Technology, Okinawa, Japan
- Richard Durbin
- Tree of Life, Wellcome Sanger Institute, Cambridge, CB10 1SA UK
- Department of Genetics, University of Cambridge, Cambridge, CB2 3EH UK
- Mark Blaxter
- Tree of Life, Wellcome Sanger Institute, Cambridge, CB10 1SA UK
- Shane A. McCarthy
- Tree of Life, Wellcome Sanger Institute, Cambridge, CB10 1SA UK
- Department of Genetics, University of Cambridge, Cambridge, CB2 3EH UK
17
Larivière D, Abueg L, Brajuka N, Gallardo-Alba C, Grüning B, Ko BJ, Ostrovsky A, Palmada-Flores M, Pickett BD, Rabbani K, Balacco JR, Chaisson M, Cheng H, Collins J, Denisova A, Fedrigo O, Gallo GR, Giani AM, Gooder GM, Jain N, Johnson C, Kim H, Lee C, Marques-Bonet T, O'Toole B, Rhie A, Secomandi S, Sozzoni M, Tilley T, Uliano-Silva M, van den Beek M, Waterhouse RM, Phillippy AM, Jarvis ED, Schatz MC, Nekrutenko A, Formenti G. Scalable, accessible, and reproducible reference genome assembly and evaluation in Galaxy. bioRxiv 2023:2023.06.28.546576. [PMID: 37425881 PMCID: PMC10327048 DOI: 10.1101/2023.06.28.546576] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 07/11/2023]
Abstract
Improvements in genome sequencing and assembly are enabling high-quality reference genomes for all species. However, the assembly process is still laborious, computationally and technically demanding, lacks standards for reproducibility, and is not readily scalable. Here we present the latest Vertebrate Genomes Project assembly pipeline and demonstrate that it delivers high-quality reference genomes at scale across a set of vertebrate species arising over the last ~500 million years. The pipeline is versatile and combines PacBio HiFi long reads and Hi-C-based haplotype phasing in a new graph-based paradigm. Standardized quality control is performed automatically to troubleshoot assembly issues and assess biological complexities. We make the pipeline freely accessible through Galaxy, accommodating researchers even without local computational resources and enhancing reproducibility by democratizing the training and assembly process. We demonstrate the flexibility and reliability of the pipeline by assembling reference genomes for 51 vertebrate species from major taxonomic groups (fish, amphibians, reptiles, birds, and mammals).
Affiliation(s)
- Delphine Larivière
- Dept. of Biochemistry and Molecular Biology, Pennsylvania State University, USA
- Linelle Abueg
- Vertebrate Genome Laboratory, The Rockefeller University, USA
- Cristóbal Gallardo-Alba
- Bioinformatics Group, Department of Computer Science, Albert-Ludwigs-University Freiburg, Freiburg, Germany
- Bjorn Grüning
- Bioinformatics Group, Department of Computer Science, Albert-Ludwigs-University Freiburg, Freiburg, Germany
- Byung June Ko
- Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, Republic of Korea
- Alex Ostrovsky
- Departments of Biology and Computer Science, Johns Hopkins University, USA
- Marc Palmada-Flores
- Department of Medicine and Life Sciences (MELIS), Institut de Biologia Evolutiva, Universitat Pompeu Fabra-CSIC, Barcelona 08003, Spain
- Brandon D Pickett
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Keon Rabbani
- Department of Quantitative and Computational Biology, University of Southern California
- Mark Chaisson
- Department of Quantitative and Computational Biology, University of Southern California
- Haoyu Cheng
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Joanna Collins
- Wellcome Sanger Institute, Cambridge CB10 1SA, United Kingdom
- Alexandra Denisova
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russia
- Olivier Fedrigo
- Vertebrate Genome Laboratory, The Rockefeller University, USA
- Nivesh Jain
- Vertebrate Genome Laboratory, The Rockefeller University, USA
- Cassidy Johnson
- Vertebrate Genome Laboratory, The Rockefeller University, USA
- Heebal Kim
- Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, Republic of Korea
- eGnome, Inc, Seoul, Republic of Korea
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
- Chul Lee
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
- Laboratory of Neurogenetics of Language, The Rockefeller University, New York City, NY, 10065, USA
- Tomas Marques-Bonet
- Department of Medicine and Life Sciences (MELIS), Institut de Biologia Evolutiva, Universitat Pompeu Fabra-CSIC, Barcelona 08003, Spain
- Catalan Institution of Research and Advanced Studies (ICREA), Barcelona 08010, Spain
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona 08028, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Cerdanyola del Vallès 08193, Spain
- Brian O'Toole
- Vertebrate Genome Laboratory, The Rockefeller University, USA
- Arang Rhie
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Simona Secomandi
- Department of Biological Sciences, University of Cyprus, Nicosia, Cyprus
- Marcella Sozzoni
- University of Florence, Department of Biology, Via Madonna del Piano 6, Sesto Fiorentino (FI)
- Tatiana Tilley
- Vertebrate Genome Laboratory, The Rockefeller University, USA
- Marius van den Beek
- Dept. of Biochemistry and Molecular Biology, Pennsylvania State University, USA
- Robert M Waterhouse
- Department of Ecology & Evolution and Swiss Institute of Bioinformatics, University of Lausanne, Lausanne, Switzerland
- Adam M Phillippy
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Erich D Jarvis
- Vertebrate Genome Laboratory, The Rockefeller University, USA
- Michael C Schatz
- Departments of Biology and Computer Science, Johns Hopkins University, USA
- Anton Nekrutenko
- Dept. of Biochemistry and Molecular Biology, Pennsylvania State University, USA
- Giulio Formenti
- Vertebrate Genome Laboratory, The Rockefeller University, USA
18
Squires TE, Rödin-Mörch P, Formenti G, Tracey A, Abueg L, Brajuka N, Jarvis E, Halapi EC, Melsted P, Höglund J, Magnússon KP. A Chromosome-Level Genome Assembly for the Rock Ptarmigan (Lagopus muta). G3 (Bethesda) 2023:7152384. [PMID: 37141262 DOI: 10.1093/g3journal/jkad099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2023] [Revised: 04/21/2023] [Accepted: 04/27/2023] [Indexed: 05/05/2023]
Abstract
The Rock Ptarmigan (Lagopus muta) is a cold-adapted, largely sedentary game bird with a Holarctic distribution. The species represents an important example of an organism likely to be affected by ongoing climatic shifts across a disparate range. We provide here a high-quality reference genome and mitogenome for the Rock Ptarmigan assembled from PacBio HiFi and Hi-C sequencing of a female bird from Iceland. The total size of the genome is 1.03 Gb with a scaffold N50 of 71.23 Mb and a contig N50 of 17.91 Mb. The final scaffolds represent all 40 predicted chromosomes and the mitochondrial genome, with a BUSCO score of 98.6%. Gene annotation resulted in 16,078 protein-coding genes out of a total of 19,831 predicted (81.08% excluding pseudogenes). The genome included 21.07% repeat sequences, and the average lengths of genes, exons, and introns were 33,605, 394, and 4,265 bp, respectively. The availability of a new reference-quality genome will contribute to understanding the Rock Ptarmigan's unique evolutionary history, vulnerability to climate change, and demographic trajectories around the globe while serving as a benchmark for species in the family Phasianidae (order Galliformes).
Affiliation(s)
- Theodore E Squires
- University of Akureyri, Faculty of Natural Resource Sciences, Borgir við Norðurslóð, Akureyri 600 Iceland
- Uppsala University, Faculty of Animal Ecology, Centre for Evolution and Genomics, Norbyvägen 18D, Uppsala 75236 Sweden
- Patrik Rödin-Mörch
- University of Akureyri, Faculty of Natural Resource Sciences, Borgir við Norðurslóð, Akureyri 600 Iceland
- Uppsala University, Faculty of Animal Ecology, Centre for Evolution and Genomics, Norbyvägen 18D, Uppsala 75236 Sweden
- Giulio Formenti
- The Rockefeller University, Center for Genomics and Systems Biology, 1230 York Ave, New York, NY 10065 United States of America
- Alan Tracey
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SA United Kingdom
- Linelle Abueg
- The Rockefeller University, Center for Genomics and Systems Biology, 1230 York Ave, New York, NY 10065 United States of America
- Nadolina Brajuka
- The Rockefeller University, Center for Genomics and Systems Biology, 1230 York Ave, New York, NY 10065 United States of America
- Erich Jarvis
- The Rockefeller University, Center for Genomics and Systems Biology, 1230 York Ave, New York, NY 10065 United States of America
- Eva C Halapi
- University of Akureyri, Faculty of Natural Resource Sciences, Borgir við Norðurslóð, Akureyri 600 Iceland
- Páll Melsted
- University of Iceland, Department of Computer Science, Sæmundargata 2, Reykjavík 102 Iceland
- University of Iceland Biomedical Center, Medical Park, Vatnsmýrarvegur 16, Reykjavík 101 Iceland
- Jacob Höglund
- Uppsala University, Faculty of Animal Ecology, Centre for Evolution and Genomics, Norbyvägen 18D, Uppsala 75236 Sweden
- Kristinn Pétur Magnússon
- University of Akureyri, Faculty of Natural Resource Sciences, Borgir við Norðurslóð, Akureyri 600 Iceland
- University of Iceland Biomedical Center, Medical Park, Vatnsmýrarvegur 16, Reykjavík 101 Iceland
- Icelandic Institute of Natural History, Borgir við Norðurslóð, Akureyri 600 Iceland
19
Liao WW, Asri M, Ebler J, Doerr D, Haukness M, Hickey G, Lu S, Lucas JK, Monlong J, Abel HJ, Buonaiuto S, Chang XH, Cheng H, Chu J, Colonna V, Eizenga JM, Feng X, Fischer C, Fulton RS, Garg S, Groza C, Guarracino A, Harvey WT, Heumos S, Howe K, Jain M, Lu TY, Markello C, Martin FJ, Mitchell MW, Munson KM, Mwaniki MN, Novak AM, Olsen HE, Pesout T, Porubsky D, Prins P, Sibbesen JA, Sirén J, Tomlinson C, Villani F, Vollger MR, Antonacci-Fulton LL, Baid G, Baker CA, Belyaeva A, Billis K, Carroll A, Chang PC, Cody S, Cook DE, Cook-Deegan RM, Cornejo OE, Diekhans M, Ebert P, Fairley S, Fedrigo O, Felsenfeld AL, Formenti G, Frankish A, Gao Y, Garrison NA, Giron CG, Green RE, Haggerty L, Hoekzema K, Hourlier T, Ji HP, Kenny EE, Koenig BA, Kolesnikov A, Korbel JO, Kordosky J, Koren S, Lee H, Lewis AP, Magalhães H, Marco-Sola S, Marijon P, McCartney A, McDaniel J, Mountcastle J, Nattestad M, Nurk S, Olson ND, Popejoy AB, Puiu D, Rautiainen M, Regier AA, Rhie A, Sacco S, Sanders AD, Schneider VA, Schultz BI, Shafin K, Smith MW, Sofia HJ, Abou Tayoun AN, Thibaud-Nissen F, Tricomi FF, Wagner J, Walenz B, Wood JMD, Zimin AV, Bourque G, Chaisson MJP, Flicek P, Phillippy AM, Zook JM, Eichler EE, Haussler D, Wang T, Jarvis ED, Miga KH, Garrison E, Marschall T, Hall IM, Li H, Paten B. A draft human pangenome reference. Nature 2023; 617:312-324. [PMID: 37165242 PMCID: PMC10172123 DOI: 10.1038/s41586-023-05896-x] [Citation(s) in RCA: 170] [Impact Index Per Article: 170.0] [Received: 07/09/2022] [Accepted: 02/28/2023] [Indexed: 05/12/2023]
Abstract
Here the Human Pangenome Reference Consortium presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals. These assemblies cover more than 99% of the expected sequence in each genome and are more than 99% accurate at the structural and base pair levels. Based on alignments of the assemblies, we generate a draft pangenome that captures known variants and haplotypes and reveals new alleles at structurally complex loci. We also add 119 million base pairs of euchromatic polymorphic sequences and 1,115 gene duplications relative to the existing reference GRCh38. Roughly 90 million of the additional base pairs are derived from structural variation. Using our draft pangenome to analyse short-read data reduced small variant discovery errors by 34% and increased the number of structural variants detected per haplotype by 104% compared with GRCh38-based workflows, which enabled the typing of the vast majority of structural variant alleles per sample.
Affiliation(s)
- Wen-Wei Liao
- Department of Genetics, Yale University School of Medicine, New Haven, CT, USA
- Center for Genomic Health, Yale University School of Medicine, New Haven, CT, USA
- Division of Biology and Biomedical Sciences, Washington University School of Medicine, St. Louis, MO, USA
- Mobin Asri
- Genomics Institute, University of California, Santa Cruz, CA, USA
- Jana Ebler
- Institute for Medical Biometry and Bioinformatics, Medical Faculty, Heinrich Heine University, Düsseldorf, Germany
- Center for Digital Medicine, Heinrich Heine University, Düsseldorf, Germany
- Daniel Doerr
- Institute for Medical Biometry and Bioinformatics, Medical Faculty, Heinrich Heine University, Düsseldorf, Germany
- Center for Digital Medicine, Heinrich Heine University, Düsseldorf, Germany
- Marina Haukness
- Genomics Institute, University of California, Santa Cruz, CA, USA
- Glenn Hickey
- Genomics Institute, University of California, Santa Cruz, CA, USA
- Shuangjia Lu
- Department of Genetics, Yale University School of Medicine, New Haven, CT, USA
- Center for Genomic Health, Yale University School of Medicine, New Haven, CT, USA
- Julian K Lucas
- Genomics Institute, University of California, Santa Cruz, CA, USA
- Jean Monlong
- Genomics Institute, University of California, Santa Cruz, CA, USA
- Haley J Abel
- Division of Oncology, Department of Internal Medicine, Washington University School of Medicine, St. Louis, MO, USA
- Silvia Buonaiuto
- Institute of Genetics and Biophysics, National Research Council, Naples, Italy
- Xian H Chang
- Genomics Institute, University of California, Santa Cruz, CA, USA
- Haoyu Cheng
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Justin Chu
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
- Vincenza Colonna
- Institute of Genetics and Biophysics, National Research Council, Naples, Italy
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Jordan M Eizenga
- Genomics Institute, University of California, Santa Cruz, CA, USA
- Xiaowen Feng
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Christian Fischer
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Robert S Fulton
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, USA
- Department of Genetics, Washington University School of Medicine, St. Louis, MO, USA
- Shilpa Garg
- Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Copenhagen, Denmark
- Cristian Groza
- Quantitative Life Sciences, McGill University, Montréal, Québec, Canada
- Andrea Guarracino
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Genomics Research Centre, Human Technopole, Milan, Italy
- William T Harvey
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Simon Heumos
- Quantitative Biology Center (QBiC), University of Tübingen, Tübingen, Germany
- Biomedical Data Science, Department of Computer Science, University of Tübingen, Tübingen, Germany
- Kerstin Howe
- Tree of Life, Wellcome Sanger Institute, Hinxton, Cambridge, UK
- Miten Jain
- Northeastern University, Boston, MA, USA
- Tsung-Yu Lu
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
- Charles Markello
- Genomics Institute, University of California, Santa Cruz, CA, USA
- Fergal J Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Katherine M Munson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Adam M Novak
- Genomics Institute, University of California, Santa Cruz, CA, USA
- Hugh E Olsen
- Genomics Institute, University of California, Santa Cruz, CA, USA
- Trevor Pesout
- Genomics Institute, University of California, Santa Cruz, CA, USA
- David Porubsky
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Pjotr Prins
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Jonas A Sibbesen
- Center for Health Data Science, University of Copenhagen, Copenhagen, Denmark
- Jouni Sirén
- Genomics Institute, University of California, Santa Cruz, CA, USA
- Chad Tomlinson
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, USA
- Flavia Villani
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Mitchell R Vollger
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Division of Medical Genetics, University of Washington School of Medicine, Seattle, WA, USA
- Carl A Baker
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Konstantinos Billis
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Sarah Cody
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, USA
- Robert M Cook-Deegan
- Barrett and O'Connor Washington Center, Arizona State University, Washington, DC, USA
- Omar E Cornejo
- Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, CA, USA
- Mark Diekhans
- Genomics Institute, University of California, Santa Cruz, CA, USA
- Peter Ebert
- Institute for Medical Biometry and Bioinformatics, Medical Faculty, Heinrich Heine University, Düsseldorf, Germany
- Center for Digital Medicine, Heinrich Heine University, Düsseldorf, Germany
- Core Unit Bioinformatics, Medical Faculty, Heinrich Heine University, Düsseldorf, Germany
- Susan Fairley
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Olivier Fedrigo
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
- Adam L Felsenfeld
- National Institutes of Health (NIH)-National Human Genome Research Institute, Bethesda, MD, USA
- Giulio Formenti
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
- Adam Frankish
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Yan Gao
- Center for Computational and Genomic Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Nanibaa' A Garrison
- Institute for Society and Genetics, College of Letters and Science, University of California, Los Angeles, CA, USA
- Institute for Precision Health, David Geffen School of Medicine, University of California, Los Angeles, CA, USA
- Division of General Internal Medicine and Health Services Research, David Geffen School of Medicine, University of California, Los Angeles, CA, USA
- Carlos Garcia Giron
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Richard E Green
- Department of Biomolecular Engineering, University of California, Santa Cruz, CA, USA
- Dovetail Genomics, Scotts Valley, CA, USA
- Leanne Haggerty
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Kendra Hoekzema
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Thibaut Hourlier
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Hanlee P Ji
- Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Eimear E Kenny
- Institute for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Barbara A Koenig
- Program in Bioethics and Institute for Human Genetics, University of California, San Francisco, CA, USA
- Jan O Korbel
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
- Jennifer Kordosky
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Sergey Koren
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- HoJoon Lee
- Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Alexandra P Lewis
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Hugo Magalhães
- Institute for Medical Biometry and Bioinformatics, Medical Faculty, Heinrich Heine University, Düsseldorf, Germany
- Center for Digital Medicine, Heinrich Heine University, Düsseldorf, Germany
- Santiago Marco-Sola
- Computer Sciences Department, Barcelona Supercomputing Center, Barcelona, Spain
- Departament d'Arquitectura de Computadors i Sistemes Operatius, Universitat Autònoma de Barcelona, Barcelona, Spain
- Pierre Marijon
- Institute for Medical Biometry and Bioinformatics, Medical Faculty, Heinrich Heine University, Düsseldorf, Germany
- Center for Digital Medicine, Heinrich Heine University, Düsseldorf, Germany
- Ann McCartney
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Jennifer McDaniel
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
- Sergey Nurk
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Nathan D Olson
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
- Alice B Popejoy
- Department of Public Health Sciences, University of California, Davis, CA, USA
- Daniela Puiu
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Mikko Rautiainen
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Allison A Regier
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, USA
- Arang Rhie
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Samuel Sacco
- Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, CA, USA
- Ashley D Sanders
- Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Berlin, Germany
- Valerie A Schneider
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Baergen I Schultz
- National Institutes of Health (NIH)-National Human Genome Research Institute, Bethesda, MD, USA
- Michael W Smith
- National Institutes of Health (NIH)-National Human Genome Research Institute, Bethesda, MD, USA
- Heidi J Sofia
- National Institutes of Health (NIH)-National Human Genome Research Institute, Bethesda, MD, USA
- Ahmad N Abou Tayoun
- Al Jalila Genomics Center of Excellence, Al Jalila Children's Specialty Hospital, Dubai, UAE
- Center for Genomic Discovery, Mohammed Bin Rashid University of Medicine and Health Sciences, Dubai, UAE
- Françoise Thibaud-Nissen
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Francesca Floriana Tricomi
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Justin Wagner
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
- Brian Walenz
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Aleksey V Zimin
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
| | - Guillaume Bourque
- Department of Human Genetics, McGill University, Montréal, Québec, Canada
- Canadian Center for Computational Genomics, McGill University, Montréal, Québec, Canada
- Institute for the Advanced Study of Human Biology (WPI-ASHBi), Kyoto University, Kyoto, Japan
| | - Mark J P Chaisson
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
| | - Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Adam M Phillippy
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Justin M Zook
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
| | - David Haussler
- Genomics Institute, University of California, Santa Cruz, CA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
| | - Ting Wang
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, USA
- Department of Genetics, Washington University School of Medicine, St. Louis, MO, USA
| | - Erich D Jarvis
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Laboratory of Neurogenetics of Language, The Rockefeller University, New York, NY, USA
| | - Karen H Miga
- Genomics Institute, University of California, Santa Cruz, CA, USA
| | - Erik Garrison
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA.
| | - Tobias Marschall
- Institute for Medical Biometry and Bioinformatics, Medical Faculty, Heinrich Heine University, Düsseldorf, Germany.
- Center for Digital Medicine, Heinrich Heine University, Düsseldorf, Germany.
| | - Ira M Hall
- Department of Genetics, Yale University School of Medicine, New Haven, CT, USA.
- Center for Genomic Health, Yale University School of Medicine, New Haven, CT, USA.
| | - Heng Li
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA.
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
| | - Benedict Paten
- Genomics Institute, University of California, Santa Cruz, CA, USA.
|
20
|
Timoshevskaya N, Eşkut KI, Timoshevskiy VA, Robb SMC, Holt C, Hess JE, Parker HJ, Baker CF, Miller AK, Saraceno C, Yandell M, Krumlauf R, Narum SR, Lampman RT, Gemmell NJ, Mountcastle J, Haase B, Balacco JR, Formenti G, Pelan S, Sims Y, Howe K, Fedrigo O, Jarvis ED, Smith JJ. An improved germline genome assembly for the sea lamprey Petromyzon marinus illuminates the evolution of germline-specific chromosomes. Cell Rep 2023; 42:112263. [PMID: 36930644 PMCID: PMC10166183 DOI: 10.1016/j.celrep.2023.112263] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 06/07/2022] [Revised: 10/17/2022] [Accepted: 02/28/2023] [Indexed: 03/17/2023] Open
Abstract
Programmed DNA loss is a gene silencing mechanism that is employed by several vertebrate and nonvertebrate lineages, including all living jawless vertebrates and songbirds. Reconstructing the evolution of somatically eliminated (germline-specific) sequences in these species has proven challenging due to a high content of repeats and gene duplications in eliminated sequences and a corresponding lack of highly accurate and contiguous assemblies for these regions. Here, we present an improved assembly of the sea lamprey (Petromyzon marinus) genome that was generated using recently standardized methods that increase the contiguity and accuracy of vertebrate genome assemblies. This assembly resolves highly contiguous, somatically retained chromosomes and at least one germline-specific chromosome, permitting new analyses that reconstruct the timing, mode, and repercussions of recruitment of genes to the germline-specific fraction. These analyses reveal major roles of interchromosomal segmental duplication, intrachromosomal duplication, and positive selection for germline functions in the long-term evolution of germline-specific chromosomes.
Affiliation(s)
| | - Kaan I Eşkut
- Department of Biology, University of Kentucky, Lexington, KY 40506, USA
| | | | - Sofia M C Robb
- Stowers Institute for Medical Research, Kansas City, MO 64110, USA
| | - Carson Holt
- Department of Human Genetics, University of Utah, Salt Lake City, UT 84112, USA
| | - Jon E Hess
- Columbia River Inter-Tribal Fish Commission, Portland, OR 97232, USA
| | - Hugo J Parker
- Stowers Institute for Medical Research, Kansas City, MO 64110, USA
| | - Cindy F Baker
- National Institute of Water and Atmospheric Research Limited (NIWA), Hamilton, Waikato 3261, New Zealand
| | - Allison K Miller
- Department of Anatomy, School of Biomedical Sciences, University of Otago, Dunedin, Otago 9054, New Zealand
| | - Cody Saraceno
- Department of Biology, University of Kentucky, Lexington, KY 40506, USA
| | - Mark Yandell
- Department of Human Genetics, University of Utah, Salt Lake City, UT 84112, USA
| | - Robb Krumlauf
- Stowers Institute for Medical Research, Kansas City, MO 64110, USA; Department of Anatomy & Cell Biology, The University of Kansas School of Medicine, Kansas City, KS 66160, USA
| | - Shawn R Narum
- Columbia River Inter-Tribal Fish Commission, Hagerman, ID 83332, USA
| | - Ralph T Lampman
- Yakama Nation Fisheries Resource Management Program, Pacific Lamprey Project, Toppenish, WA 98948, USA
| | - Neil J Gemmell
- Department of Anatomy, School of Biomedical Sciences, University of Otago, Dunedin, Otago 9054, New Zealand
| | | | - Bettina Haase
- Vertebrate Genome Lab, The Rockefeller University, New York, NY 10065, USA
| | - Jennifer R Balacco
- Vertebrate Genome Lab, The Rockefeller University, New York, NY 10065, USA
| | - Giulio Formenti
- Vertebrate Genome Lab, The Rockefeller University, New York, NY 10065, USA; Laboratory of Neurogenetics of Language, The Rockefeller University, New York, NY 10065, USA
| | - Sarah Pelan
- Tree of Life, Wellcome Sanger Institute, Cambridge CB10 1SA, UK
| | - Ying Sims
- Tree of Life, Wellcome Sanger Institute, Cambridge CB10 1SA, UK
| | - Kerstin Howe
- Tree of Life, Wellcome Sanger Institute, Cambridge CB10 1SA, UK
| | - Olivier Fedrigo
- Vertebrate Genome Lab, The Rockefeller University, New York, NY 10065, USA
| | - Erich D Jarvis
- Vertebrate Genome Lab, The Rockefeller University, New York, NY 10065, USA; Laboratory of Neurogenetics of Language, The Rockefeller University, New York, NY 10065, USA; Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA
| | - Jeramiah J Smith
- Department of Biology, University of Kentucky, Lexington, KY 40506, USA.
|
21
|
Gabrielli M, Benazzo A, Biello R, Ancona L, Fuselli S, Iannucci A, Balacco J, Mountcastle J, Tracey A, Ficetola GF, Salvi D, Sollitto M, Fedrigo O, Formenti G, Jarvis ED, Gerdol M, Ciofi C, Trucchi E, Bertorelle G. A high-quality reference genome for the critically endangered Aeolian wall lizard, Podarcis raffonei. J Hered 2023; 114:279-285. [PMID: 36866448 DOI: 10.1093/jhered/esad014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2023] [Accepted: 03/01/2023] [Indexed: 03/04/2023] Open
Abstract
The Aeolian wall lizard, Podarcis raffonei, is an endangered species endemic to the Aeolian archipelago, Italy, where it is present only in three tiny islets and a narrow promontory of a larger island. Because of the extremely limited area of occupancy, severe population fragmentation and observed decline, it has been classified as Critically Endangered by the International Union for the Conservation of Nature (IUCN). Using Pacific Biosciences (PacBio) High Fidelity (HiFi) long read sequencing, Bionano optical mapping and Arima chromatin conformation capture sequencing (Hi-C), we produced a high-quality, chromosome-scale reference genome for the Aeolian wall lizard, including Z and W sexual chromosomes. The final assembly spans 1.51 Gb across 28 scaffolds with a contig N50 of 61.4 Mb, a scaffold N50 of 93.6 Mb, and a BUSCO completeness score of 97.3%. This genome constitutes a valuable resource for the species to guide potential conservation efforts and more generally for the squamate reptiles that are underrepresented in terms of available high-quality genomic resources.
Affiliation(s)
- Maëva Gabrielli
- Department of Life Sciences and Biotechnology, University of Ferrara, Ferrara, Italy
| | - Andrea Benazzo
- Department of Life Sciences and Biotechnology, University of Ferrara, Ferrara, Italy
| | - Roberto Biello
- Department of Life Sciences and Biotechnology, University of Ferrara, Ferrara, Italy
| | - Lorena Ancona
- Department of Life and Environmental Sciences, Marche Polytechnic University, Ancona, Italy
| | - Silvia Fuselli
- Department of Life Sciences and Biotechnology, University of Ferrara, Ferrara, Italy
| | | | - Jennifer Balacco
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
| | | | - Alan Tracey
- Tree of Life, Wellcome Sanger Institute, Cambridge, United Kingdom
| | - Gentile Francesco Ficetola
- Department of Environmental Sciences and Policy, University of Milan, Milan, Italy; Laboratoire d'Ecologie Alpine (LECA), CNRS, Université Grenoble Alpes and Université Savoie Mont Blanc, Grenoble, France
| | - Daniele Salvi
- Department of Health, Life & Environmental Sciences - University of L'Aquila, L'Aquila, Italy
| | - Marco Sollitto
- Department of Life Sciences, University of Trieste, Trieste, Italy
| | - Olivier Fedrigo
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
| | - Giulio Formenti
- Department of Biology, University of Florence, Florence, Italy; Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
| | - Erich D Jarvis
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA; Howard Hughes Medical Institute, Chevy Chase, MD, USA
| | - Marco Gerdol
- Department of Life Sciences, University of Trieste, Trieste, Italy
| | - Claudio Ciofi
- Department of Biology, University of Florence, Florence, Italy
| | - Emiliano Trucchi
- Department of Life and Environmental Sciences, Marche Polytechnic University, Ancona, Italy
| | - Giorgio Bertorelle
- Department of Life Sciences and Biotechnology, University of Ferrara, Ferrara, Italy
|
22
|
Theissinger K, Fernandes C, Formenti G, Bista I, Berg PR, Bleidorn C, Bombarely A, Crottini A, Gallo GR, Godoy JA, Jentoft S, Malukiewicz J, Mouton A, Oomen RA, Paez S, Palsbøll PJ, Pampoulie C, Ruiz-López MJ, Secomandi S, Svardal H, Theofanopoulou C, de Vries J, Waldvogel AM, Zhang G, Jarvis ED, Bálint M, Ciofi C, Waterhouse RM, Mazzoni CJ, Höglund J. How genomics can help biodiversity conservation. Trends Genet 2023:S0168-9525(23)00020-3. [PMID: 36801111 DOI: 10.1016/j.tig.2023.01.005] [Citation(s) in RCA: 29] [Impact Index Per Article: 29.0] [Received: 06/17/2022] [Revised: 11/08/2022] [Accepted: 01/19/2023] [Indexed: 02/18/2023]
Abstract
The availability of public genomic resources can greatly assist biodiversity assessment, conservation, and restoration efforts by providing evidence for scientifically informed management decisions. Here we survey the main approaches and applications in biodiversity and conservation genomics, considering practical factors, such as cost, time, prerequisite skills, and current shortcomings of applications. Most approaches perform best in combination with reference genomes from the target species or closely related species. We review case studies to illustrate how reference genomes can facilitate biodiversity research and conservation across the tree of life. We conclude that the time is ripe to view reference genomes as fundamental resources and to integrate their use as a best practice in conservation genomics.
Affiliation(s)
- Kathrin Theissinger
- LOEWE Centre for Translational Biodiversity Genomics, Senckenberg Biodiversity and Climate Research Centre, Georg-Voigt-Str. 14-16, 60325 Frankfurt/Main, Germany
| | - Carlos Fernandes
- CE3C - Centre for Ecology, Evolution and Environmental Changes & CHANGE - Global Change and Sustainability Institute, Departamento de Biologia Animal, Faculdade de Ciências, Universidade de Lisboa, 1749-016 Lisboa, Portugal; Faculdade de Psicologia, Universidade de Lisboa, Alameda da Universidade, 1649-013 Lisboa, Portugal
| | - Giulio Formenti
- The Rockefeller University, 1230 York Ave, New York, NY 10065, USA
| | - Iliana Bista
- Naturalis Biodiversity Center, Darwinweg 2, 2333 CR Leiden, The Netherlands; Wellcome Sanger Institute, Tree of Life, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
| | - Paul R Berg
- NIVA - Norwegian Institute for Water Research, Økernveien, 94, 0579 Oslo, Norway; Centre for Coastal Research, University of Agder, Gimlemoen 25j, 4630 Kristiansand, Norway; Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, PO BOX 1066 Blindern, 0316 Oslo, Norway
| | - Christoph Bleidorn
- University of Göttingen, Department of Animal Evolution and Biodiversity, Untere Karspüle, 2, 37073, Göttingen, Germany
| | | | - Angelica Crottini
- CIBIO/InBio, Centro de Investigação em Biodiversidade e Recursos Genéticos, Rua Padre Armando Quintas, 7, 4485-661, Portugal; Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, 4099-002 Porto, Portugal; BIOPOLIS Program in Genomics, Biodiversity and Land Planning, CIBIO, Campus de Vairão, 4485-661 Vairão, Portugal
| | - Guido R Gallo
- Department of Biosciences, University of Milan, Milan, Italy
| | - José A Godoy
- Estación Biológica de Doñana, CSIC, Calle Americo Vespucio 26, 41092, Seville, Spain
| | - Sissel Jentoft
- Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, PO BOX 1066 Blindern, 0316 Oslo, Norway
| | - Joanna Malukiewicz
- Primate Genetics Laboratory, German Primate Center, Kellnerweg 4, 37077, Göttingen, Germany
| | - Alice Mouton
- InBios - Conservation Genetics Lab, University of Liege, Chemin de la Vallée 4, 4000, Liege, Belgium
| | - Rebekah A Oomen
- Centre for Coastal Research, University of Agder, Gimlemoen 25j, 4630 Kristiansand, Norway; Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, PO BOX 1066 Blindern, 0316 Oslo, Norway
| | - Sadye Paez
- The Rockefeller University, 1230 York Ave, New York, NY 10065, USA
| | - Per J Palsbøll
- Groningen Institute of Evolutionary Life Sciences, University of Groningen, Nijenborgh, 9747, AG, Groningen, The Netherlands; Center for Coastal Studies, 5 Holway Avenue, Provincetown, MA 02657, USA
| | - Christophe Pampoulie
- Marine and Freshwater Research Institute, Fornubúðir 5, 220 Hafnarfjörður, Iceland
| | - María J Ruiz-López
- Estación Biológica de Doñana, CSIC, Calle Americo Vespucio 26, 41092, Seville, Spain; CIBER de Epidemiología y Salud Pública (CIBERESP), Spain
| | | | - Hannes Svardal
- Department of Biology, University of Antwerp, Universiteitsplein 1, 2610 Wilrijk, Antwerp, Belgium
| | - Constantina Theofanopoulou
- The Rockefeller University, 1230 York Ave, New York, NY 10065, USA; Hunter College, City University of New York, NY, USA
| | - Jan de Vries
- University of Goettingen, Institute for Microbiology and Genetics, Department of Applied Bioinformatics, Goettingen Center for Molecular Biosciences (GZMB), Campus Institute Data Science (CIDAS), Goldschmidtstr. 1, 37077, Goettingen, Germany
| | - Ann-Marie Waldvogel
- Institute of Zoology, University of Cologne, Zülpicherstrasse 47b, D-50674, Cologne, Germany
| | - Guojie Zhang
- Evolutionary & Organismal Biology Research Center, Zhejiang University School of Medicine, Hangzhou, 310058, China; Villum Center for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Denmark; State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
| | - Erich D Jarvis
- The Rockefeller University, 1230 York Ave, New York, NY 10065, USA
| | - Miklós Bálint
- LOEWE Centre for Translational Biodiversity Genomics, Senckenberg Biodiversity and Climate Research Centre, Georg-Voigt-Str. 14-16, 60325 Frankfurt/Main, Germany
| | - Claudio Ciofi
- University of Florence, Department of Biology, Via Madonna del Piano 6, Sesto Fiorentino, (FI) 50019, Italy
| | - Robert M Waterhouse
- University of Lausanne, Department of Ecology and Evolution, Le Biophore, UNIL-Sorge, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Camila J Mazzoni
- Leibniz Institute for Zoo and Wildlife Research (IZW), Alfred-Kowalke-Str 17, 10315 Berlin, Germany; Berlin Center for Genomics in Biodiversity Research (BeGenDiv), Koenigin-Luise-Str 6-8, 14195 Berlin, Germany
| | - Jacob Höglund
- Department of Ecology and Genetics, Uppsala University, Norbyvägen 18D, 75246, Uppsala, Sweden.
|
23
|
Secomandi S, Gallo GR, Sozzoni M, Iannucci A, Galati E, Abueg L, Balacco J, Caprioli M, Chow W, Ciofi C, Collins J, Fedrigo O, Ferretti L, Fungtammasan A, Haase B, Howe K, Kwak W, Lombardo G, Masterson P, Messina G, Møller AP, Mountcastle J, Mousseau TA, Ferrer Obiol J, Olivieri A, Rhie A, Rubolini D, Saclier M, Stanyon R, Stucki D, Thibaud-Nissen F, Torrance J, Torroni A, Weber K, Ambrosini R, Bonisoli-Alquati A, Jarvis ED, Gianfranceschi L, Formenti G. A chromosome-level reference genome and pangenome for barn swallow population genomics. Cell Rep 2023; 42:111992. [PMID: 36662619 PMCID: PMC10044405 DOI: 10.1016/j.celrep.2023.111992] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2022] [Revised: 07/20/2022] [Accepted: 01/04/2023] [Indexed: 01/20/2023] Open
Abstract
Insights into the evolution of non-model organisms are limited by the lack of reference genomes of high accuracy, completeness, and contiguity. Here, we present a chromosome-level, karyotype-validated reference genome and pangenome for the barn swallow (Hirundo rustica). We complement these resources with a reference-free multialignment of the reference genome with other bird genomes and with the most comprehensive catalog of genetic markers for the barn swallow. We identify potentially conserved and accelerated genes using the multialignment and estimate genome-wide linkage disequilibrium using the catalog. We use the pangenome to infer core and accessory genes and to detect variants using it as a reference. Overall, these resources will foster population genomics studies in the barn swallow, enable detection of candidate genes in comparative genomics studies, and help reduce bias toward a single reference genome.
Affiliation(s)
- Simona Secomandi
- Department of Biosciences, University of Milan, Milan, Italy; Department of Biological Sciences, University of Cyprus, Nicosia, Cyprus
| | - Guido R Gallo
- Department of Biosciences, University of Milan, Milan, Italy
| | | | - Alessio Iannucci
- Department of Biology, University of Florence, Sesto Fiorentino (FI), Italy
| | - Elena Galati
- Department of Biosciences, University of Milan, Milan, Italy
| | - Linelle Abueg
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
| | - Jennifer Balacco
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
| | - Manuela Caprioli
- Department of Environmental Sciences and Policy, University of Milan, Milan, Italy
| | | | - Claudio Ciofi
- Department of Biology, University of Florence, Sesto Fiorentino (FI), Italy
| | | | - Olivier Fedrigo
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
| | - Luca Ferretti
- Department of Biology and Biotechnology "L. Spallanzani", University of Pavia, Pavia, Italy
| | | | - Bettina Haase
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
| | | | - Woori Kwak
- Department of Medical and Biological Sciences, The Catholic University of Korea, Bucheon 14662, Korea
| | - Gianluca Lombardo
- Department of Biology and Biotechnology "L. Spallanzani", University of Pavia, Pavia, Italy
| | - Patrick Masterson
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
| | | | - Anders P Møller
- Ecologie Systématique Evolution, Université Paris-Sud, CNRS, AgroParisTech, Université Paris-Saclay, Orsay Cedex, France
| | | | - Timothy A Mousseau
- Department of Biological Sciences, University of South Carolina, Columbia, SC 29208, USA
| | - Joan Ferrer Obiol
- Department of Environmental Sciences and Policy, University of Milan, Milan, Italy
| | - Anna Olivieri
- Department of Biology and Biotechnology "L. Spallanzani", University of Pavia, Pavia, Italy
| | - Arang Rhie
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Diego Rubolini
- Department of Environmental Sciences and Policy, University of Milan, Milan, Italy
| | | | - Roscoe Stanyon
- Department of Biology, University of Florence, Sesto Fiorentino (FI), Italy
| | | | - Françoise Thibaud-Nissen
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
| | | | - Antonio Torroni
- Department of Biology and Biotechnology "L. Spallanzani", University of Pavia, Pavia, Italy
| | | | - Roberto Ambrosini
- Department of Environmental Sciences and Policy, University of Milan, Milan, Italy
| | - Andrea Bonisoli-Alquati
- Department of Biological Sciences, California State Polytechnic University - Pomona, Pomona, CA, USA
| | - Erich D Jarvis
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA; The Howard Hughes Medical Institute, Chevy Chase, MD, USA
| | | | - Giulio Formenti
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA.
|
24
|
Smith J, Alfieri JM, Anthony N, Arensburger P, Athrey GN, Balacco J, Balic A, Bardou P, Barela P, Bigot Y, Blackmon H, Borodin PM, Carroll R, Casono MC, Charles M, Cheng H, Chiodi M, Cigan L, Coghill LM, Crooijmans R, Das N, Davey S, Davidian A, Degalez F, Dekkers JM, Derks M, Diack AB, Djikeng A, Drechsler Y, Dyomin A, Fedrigo O, Fiddaman SR, Formenti G, Frantz LAF, Fulton JE, Gaginskaya E, Galkina S, Gallardo RA, Geibel J, Gheyas AA, Godinez CJP, Goodell A, Graves JAM, Griffin DK, Haase B, Han JL, Hanotte O, Henderson LJ, Hou ZC, Howe K, Huynh L, Ilatsia E, Jarvis ED, Johnson SM, Kaufman J, Kelly T, Kemp S, Kern C, Keroack JH, Klopp C, Lagarrigue S, Lamont SJ, Lange M, Lanke A, Larkin DM, Larson G, Layos JKN, Lebrasseur O, Malinovskaya LP, Martin RJ, Martin Cerezo ML, Mason AS, McCarthy FM, McGrew MJ, Mountcastle J, Muhonja CK, Muir W, Muret K, Murphy TD, Ng'ang'a I, Nishibori M, O'Connor RE, Ogugo M, Okimoto R, Ouko O, Patel HR, Perini F, Pigozzi MI, Potter KC, Price PD, Reimer C, Rice ES, Rocos N, Rogers TF, Saelao P, Schauer J, Schnabel RD, Schneider VA, Simianer H, Smith A, Stevens MP, Stiers K, Tiambo CK, Tixier-Boichard M, Torgasheva AA, Tracey A, Tregaskes CA, Vervelde L, Wang Y, Warren WC, Waters PD, Webb D, Weigend S, Wolc A, Wright AE, Wright D, Wu Z, Yamagata M, Yang C, Yin ZT, Young MC, Zhang G, Zhao B, Zhou H. Fourth Report on Chicken Genes and Chromosomes 2022. Cytogenet Genome Res 2023; 162:405-528. [PMID: 36716736 DOI: 10.1159/000529376] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/16/2023] [Accepted: 01/22/2023] [Indexed: 02/01/2023] Open
Affiliation(s)
- Jacqueline Smith
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush Campus, Edinburgh, UK
| | - James M Alfieri
- Interdisciplinary Program in Ecology and Evolutionary Biology, Texas A&M University, College Station, Texas, USA
- Department of Biology, Texas A&M University, College Station, Texas, USA
- Department of Poultry Science, Texas A&M University, College Station, Texas, USA
| | | | - Peter Arensburger
- Biological Sciences Department, California State Polytechnic University, Pomona, California, USA
| | - Giridhar N Athrey
- Interdisciplinary Program in Ecology and Evolutionary Biology, Texas A&M University, College Station, Texas, USA
- Department of Poultry Science, Texas A&M University, College Station, Texas, USA
| | | | - Adam Balic
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush Campus, Edinburgh, UK
| | - Philippe Bardou
- Université de Toulouse, INRAE, ENVT, GenPhySE, Sigenae, Castanet Tolosan, France
| | | | - Yves Bigot
- PRC, UMR INRAE 0085, CNRS 7247, Centre INRAE Val de Loire, Nouzilly, France
| | - Heath Blackmon
- Interdisciplinary Program in Ecology and Evolutionary Biology, Texas A&M University, College Station, Texas, USA
- Department of Biology, Texas A&M University, College Station, Texas, USA
| | - Pavel M Borodin
- Department of Molecular Genetics, Cell Biology and Bioinformatics, Institute of Cytology and Genetics of Siberian Branch of Russian Academy of Sciences, Novosibirsk, Russian Federation
| | - Rachel Carroll
- Department of Animal Sciences, Data Science and Informatics Institute, University of Missouri, Columbia, Missouri, USA
| | | | - Mathieu Charles
- University Paris-Saclay, INRAE, AgroParisTech, GABI, Sigenae, Jouy-en-Josas, France
| | - Hans Cheng
- USDA, ARS, USNPRC, Avian Disease and Oncology Laboratory, East Lansing, Michigan, USA
| | | | | | - Lyndon M Coghill
- Department of Veterinary Pathology, University of Missouri, Columbia, Missouri, USA
| | - Richard Crooijmans
- Animal Breeding and Genomics, Wageningen University and Research, Wageningen, The Netherlands
| | | | - Sean Davey
- University of Arizona, Tucson, Arizona, USA
| | - Asya Davidian
- Saint Petersburg State University, Saint Petersburg, Russian Federation
| | - Fabien Degalez
- INRAE, INSTITUT AGRO, PEGASE UMR 1348, Saint-Gilles, France
| | - Jack M Dekkers
- Feed the Future Innovation Lab for Genomics to Improve Poultry, University of California, Davis, California, USA
- Department of Animal Science, Iowa State University, Ames, Iowa, USA
| | - Martijn Derks
- Animal Breeding and Genomics, Wageningen University and Research, Wageningen, The Netherlands
| | - Abigail B Diack
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush Campus, Edinburgh, UK
| | - Appolinaire Djikeng
- Centre for Tropical Livestock Genetics and Health (CTLGH) - The Roslin Institute, Edinburgh, UK
| | - Yvonne Drechsler
- College of Veterinary Medicine, Western University of Health Sciences, Pomona, California, USA
| | - Alexander Dyomin
- Saint Petersburg State University, Saint Petersburg, Russian Federation
| | | | | | | | - Laurent A F Frantz
- Queen Mary University of London, Bethnal Green, London, UK
- Palaeogenomics Group, Department of Veterinary Sciences, LMU Munich, Munich, Germany
| | - Janet E Fulton
- Hy-Line International, Research and Development, Dallas Center, Iowa, USA
| | - Elena Gaginskaya
- Saint Petersburg State University, Saint Petersburg, Russian Federation
| | - Svetlana Galkina
- Saint Petersburg State University, Saint Petersburg, Russian Federation
| | - Rodrigo A Gallardo
- Feed the Future Innovation Lab for Genomics to Improve Poultry, University of California, Davis, California, USA
- School of Veterinary Medicine, University of California, Davis, California, USA
| | - Johannes Geibel
- Institute of Farm Animal Genetics, Friedrich-Loeffler-Institut, Neustadt, Germany
- Center for Integrated Breeding Research, University of Göttingen, Göttingen, Germany
| | - Almas A Gheyas
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush Campus, Edinburgh, UK
| | - Cyrill John P Godinez
- Department of Animal Science, College of Agriculture and Food Science, Visayas State University, Baybay City, Philippines
| | | | - Jennifer A M Graves
- Department of Environment and Genetics, La Trobe University, Melbourne, Victoria, Australia
- Institute for Applied Ecology, University of Canberra, Canberra, Australian Capital Territory, Australia
| | | | | | - Jian-Lin Han
- CAAS-ILRI Joint Laboratory on Livestock and Forage Genetic Resources, Institute of Animal Science, Chinese Academy of Agricultural Sciences (CAAS), Beijing, China
- International Livestock Research Institute (ILRI), Addis Ababa, Ethiopia
| | - Olivier Hanotte
- International Livestock Research Institute (ILRI), Addis Ababa, Ethiopia
- Cells, Organisms and Molecular Genetics, School of Life Sciences, University of Nottingham, Nottingham, UK
- Centre for Tropical Livestock Genetics and Health, The Roslin Institute, Edinburgh, UK
- Lindsay J Henderson
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush Campus, Edinburgh, UK
- Zhuo-Cheng Hou
- National Engineering Laboratory for Animal Breeding and Key Laboratory of Animal Genetics, Breeding and Reproduction, MARA, College of Animal Science and Technology, China Agricultural University, Beijing, China
- Lan Huynh
- Institute for Immunology and Infection Research, University of Edinburgh, Edinburgh, UK
- Evans Ilatsia
- Dairy Research Institute, Kenya Agricultural and Livestock Organization, Naivasha, Kenya
- Jim Kaufman
- Institute for Immunology and Infection Research, University of Edinburgh, Edinburgh, UK
- Department of Veterinary Medicine, University of Cambridge, Cambridge, UK
- Department of Pathology, University of Cambridge, Cambridge, UK
- Terra Kelly
- Feed the Future Innovation Lab for Genomics to Improve Poultry, University of California, Davis, California, USA
- School of Veterinary Medicine, University of California, Davis, California, USA
- Steve Kemp
- Centre for Tropical Livestock Genetics and Health (CTLGH) - ILRI, Nairobi, Kenya
- Colin Kern
- Department of Animal Science, University of California, Davis, California, USA
- Susan J Lamont
- Feed the Future Innovation Lab for Genomics to Improve Poultry, University of California, Davis, California, USA
- Department of Animal Science, Iowa State University, Ames, Iowa, USA
- Margaret Lange
- Department of Molecular Microbiology and Immunology, University of Missouri, Columbia, Missouri, USA
- Anika Lanke
- BASIS Chandler High School, Chandler, Arizona, USA
- Denis M Larkin
- Department of Comparative Biomedical Sciences, Royal Veterinary College, University of London, London, UK
- Greger Larson
- The Palaeogenomics and Bio-Archaeology Research Network, Research Laboratory for Archaeology and History of Art, The University of Oxford, Oxford, UK
- John King N Layos
- College of Agriculture and Forestry, Capiz State University, Mambusao, Philippines
- Ophélie Lebrasseur
- Centre d'Anthropobiologie et de Génomique de Toulouse (CAGT), CNRS UMR 5288, Université Toulouse III Paul Sabatier, Toulouse, France
- Instituto Nacional de Antropología y Pensamiento Latinoamericano, Ciudad Autónoma de Buenos Aires, Argentina
- Lyubov P Malinovskaya
- Department of Cytology and Genetics, Novosibirsk State University, Novosibirsk, Russian Federation
- Michael J McGrew
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush Campus, Edinburgh, UK
- Centre for Tropical Livestock Genetics and Health (CTLGH) - The Roslin Institute, Edinburgh, UK
- Christine Kamidi Muhonja
- Dairy Research Institute, Kenya Agricultural and Livestock Organization, Naivasha, Kenya
- Centre for Tropical Livestock Genetics and Health (CTLGH) - ILRI, Nairobi, Kenya
- William Muir
- Department of Animal Sciences, Purdue University, West Lafayette, Indiana, USA
- Kévin Muret
- Université Paris-Saclay, Commissariat à l'Energie Atomique et aux Energies Alternatives, Centre National de Recherche en Génomique Humaine, Evry, France
- Terence D Murphy
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
- Masahide Nishibori
- Laboratory of Animal Genetics, Graduate School of Integrated Sciences for Life, Hiroshima University, Higashi-Hiroshima, Japan
- Moses Ogugo
- Centre for Tropical Livestock Genetics and Health (CTLGH) - ILRI, Nairobi, Kenya
- Ron Okimoto
- Cobb-Vantress, Siloam Springs, Arkansas, USA
- Ochieng Ouko
- Dairy Research Institute, Kenya Agricultural and Livestock Organization, Naivasha, Kenya
- Hardip R Patel
- The John Curtin School of Medical Research, Australian National University, Canberra, Australian Capital Territory, Australia
- Francesco Perini
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush Campus, Edinburgh, UK
- Department of Agricultural, Food and Environmental Sciences, University of Perugia, Perugia, Italy
- María Ines Pigozzi
- INBIOMED (CONICET-UBA), Facultad de Medicina, Universidad de Buenos Aires, Buenos Aires, Argentina
- Peter D Price
- Ecology and Evolutionary Biology, School of Biosciences, University of Sheffield, Sheffield, UK
- Christian Reimer
- Institute of Farm Animal Genetics, Friedrich-Loeffler-Institut, Neustadt, Germany
- Edward S Rice
- Department of Animal Sciences, Bond Life Sciences Center, University of Missouri, Columbia, Missouri, USA
- Nicolas Rocos
- Institute for Immunology and Infection Research, University of Edinburgh, Edinburgh, UK
- Thea F Rogers
- Department of Molecular Evolution and Development, University of Vienna, Vienna, Austria
- Perot Saelao
- Feed the Future Innovation Lab for Genomics to Improve Poultry, University of California, Davis, California, USA
- Department of Animal Science, University of California, Davis, California, USA
- Veterinary Pest Genetics Research Unit, USDA, Kerrville, Texas, USA
- Jens Schauer
- Institute of Farm Animal Genetics, Friedrich-Loeffler-Institut, Neustadt, Germany
- Robert D Schnabel
- Department of Animal Sciences, University of Missouri, Columbia, Missouri, USA
- Valerie A Schneider
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
- Henner Simianer
- Center for Integrated Breeding Research, University of Göttingen, Göttingen, Germany
- Adrian Smith
- Department of Zoology, University of Oxford, Oxford, UK
- Mark P Stevens
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush Campus, Edinburgh, UK
- Kyle Stiers
- Department of Veterinary Pathology, University of Missouri, Columbia, Missouri, USA
- Anna A Torgasheva
- Department of Molecular Genetics, Cell Biology and Bioinformatics, Institute of Cytology and Genetics of Siberian Branch of Russian Academy of Sciences, Novosibirsk, Russian Federation
- Alan Tracey
- Wellcome Trust Sanger Institute, Hinxton, UK
- Clive A Tregaskes
- Department of Veterinary Medicine, University of Cambridge, Cambridge, UK
- Department of Pathology, University of Cambridge, Cambridge, UK
- Lonneke Vervelde
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush Campus, Edinburgh, UK
- Ying Wang
- Feed the Future Innovation Lab for Genomics to Improve Poultry, University of California, Davis, California, USA
- Department of Animal Science, University of California, Davis, California, USA
- Wesley C Warren
- Department of Animal Sciences, Bond Life Sciences Center, University of Missouri, Columbia, Missouri, USA
- Department of Animal Sciences, University of Missouri, Columbia, Missouri, USA
- Paul D Waters
- School of Biotechnology and Biomolecular Science, Faculty of Science, UNSW Sydney, Sydney, New South Wales, Australia
- David Webb
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
- Steffen Weigend
- Institute of Farm Animal Genetics, Friedrich-Loeffler-Institut, Neustadt, Germany
- Center for Integrated Breeding Research, University of Göttingen, Göttingen, Germany
- Anna Wolc
- Department of Animal Science, Iowa State University, Ames, Iowa, USA
- Hy-Line International, Research and Development, Dallas Center, Iowa, USA
- Alison E Wright
- Ecology and Evolutionary Biology, School of Biosciences, University of Sheffield, Sheffield, UK
- Dominic Wright
- AVIAN Behavioural Genomics and Physiology, IFM Biology, Linköping University, Linköping, Sweden
- Zhou Wu
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush Campus, Edinburgh, UK
- Masahito Yamagata
- Center for Brain Science, Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts, USA
- Zhong-Tao Yin
- National Engineering Laboratory for Animal Breeding and Key Laboratory of Animal Genetics, Breeding and Reproduction, MARA, College of Animal Science and Technology, China Agricultural University, Beijing, China
- Guojie Zhang
- Center for Evolutionary and Organismal Biology, Zhejiang University School of Medicine, Hangzhou, China
- Bingru Zhao
- College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, China
- Huaijun Zhou
- Feed the Future Innovation Lab for Genomics to Improve Poultry, University of California, Davis, California, USA
- Department of Animal Science, University of California, Davis, California, USA
25
Meyer BS, Moiron M, Caswara C, Chow W, Fedrigo O, Formenti G, Haase B, Howe K, Mountcastle J, Uliano-Silva M, Wood J, Jarvis ED, Liedvogel M, Bouwhuis S. Sex-specific changes in autosomal methylation rate in ageing common terns. Front Ecol Evol 2023. [DOI: 10.3389/fevo.2023.982443] [Indexed: 01/31/2023]
Abstract
Senescence, an age-related decline in survival and/or reproductive performance, occurs in species across the tree of life. Molecular mechanisms underlying this within-individual phenomenon are still largely unknown, but DNA methylation changes with age are among the candidates. Using a longitudinal approach, we investigated age-specific changes in autosomal methylation of common terns, relatively long-lived migratory seabirds known to show senescence. We collected blood at 1-, 3- and/or 4-year intervals, extracted DNA from the erythrocytes and estimated autosomal DNA methylation by mapping Reduced Representation Bisulfite Sequencing reads to a de novo assembled reference genome. We found autosomal methylation levels to decrease with age within females, but not males, and no evidence for selective (dis)appearance of birds of either sex in relation to their methylation level. Moreover, although we found positions in the genome to consistently vary in their methylation levels, individuals did not show such strong consistent variance. These results pave the way for studies at the level of genome features or specific positions, which should elucidate the functional consequences of the patterns observed, and how they translate to the ageing phenotype.
26
Karawita AC, Cheng Y, Chew KY, Challagulla A, Kraus R, Mueller RC, Tong MZW, Hulme KD, Bielefeldt-Ohmann H, Steele LE, Wu M, Sng J, Noye E, Bruxner TJ, Au GG, Lowther S, Blommaert J, Suh A, McCauley AJ, Kaur P, Dudchenko O, Aiden E, Fedrigo O, Formenti G, Mountcastle J, Chow W, Martin FJ, Ogeh DN, Thiaud-Nissen F, Howe K, Tracey A, Smith J, Kuo RI, Renfree MB, Kimura T, Sakoda Y, McDougall M, Spencer HG, Pyne M, Tolf C, Waldenström J, Jarvis ED, Baker ML, Burt DW, Short KR. The swan genome and transcriptome, it is not all black and white. Genome Biol 2023; 24:13. [PMID: 36683094 PMCID: PMC9867998 DOI: 10.1186/s13059-022-02838-0] [Received: 05/27/2022] [Accepted: 12/12/2022] [Indexed: 01/24/2023]
Abstract
Background
The Australian black swan (Cygnus atratus) is an iconic species with contrasting plumage to that of the closely related northern hemisphere white swans. The relative geographic isolation of the black swan may have resulted in a limited immune repertoire and increased susceptibility to infectious diseases, notably infectious diseases from which Australia has been largely shielded. Unlike mallard ducks and the mute swan (Cygnus olor), the black swan is extremely sensitive to highly pathogenic avian influenza. Understanding this susceptibility has been impaired by the absence of any available swan genome and transcriptome information.
Results
Here, we generate the first chromosome-length black and mute swan genomes annotated with transcriptome data, all using long-read based pipelines generated for vertebrate species. We use these genomes and transcriptomes to show that, unlike other wild waterfowl, black swans lack an expanded immune gene repertoire, lack a key viral pattern-recognition receptor in endothelial cells and mount a poorly controlled inflammatory response to highly pathogenic avian influenza. We also implicate genetic differences in the SLC45A2 gene in the iconic plumage of the black swan.
Conclusion
Together, these data suggest that the immune system of the black swan is such that, should any avian viral infection become established in its native habitat, the black swan would be in significant peril.
Affiliation(s)
- Anjana C. Karawita
- School of Chemistry and Molecular Biosciences, The University of Queensland, St Lucia, QLD 4072 Australia
- Commonwealth Scientific and Industrial Research Organisation, Australian Centre for Disease Preparedness, 5 Portarlington Road, Geelong, VIC 3220 Australia
- Yuanyuan Cheng
- School of Life and Environmental Sciences, The University of Sydney, Sydney, NSW 2006 Australia
- Keng Yih Chew
- School of Chemistry and Molecular Biosciences, The University of Queensland, St Lucia, QLD 4072 Australia
- Arjun Challagulla
- Commonwealth Scientific and Industrial Research Organisation, Australian Centre for Disease Preparedness, 5 Portarlington Road, Geelong, VIC 3220 Australia
- Robert Kraus
- Department of Migration, Max Planck Institute of Animal Behavior, Radolfzell, 78315 Germany
- Department of Biology, University of Konstanz, Konstanz, 78457 Germany
- Ralf C. Mueller
- Department of Migration, Max Planck Institute of Animal Behavior, Radolfzell, 78315 Germany
- Department of Biology, University of Konstanz, Konstanz, 78457 Germany
- Marcus Z. W. Tong
- School of Chemistry and Molecular Biosciences, The University of Queensland, St Lucia, QLD 4072 Australia
- Katina D. Hulme
- School of Chemistry and Molecular Biosciences, The University of Queensland, St Lucia, QLD 4072 Australia
- Helle Bielefeldt-Ohmann
- School of Chemistry and Molecular Biosciences, The University of Queensland, St Lucia, QLD 4072 Australia
- Lauren E. Steele
- School of Chemistry and Molecular Biosciences, The University of Queensland, St Lucia, QLD 4072 Australia
- Melanie Wu
- School of Chemistry and Molecular Biosciences, The University of Queensland, St Lucia, QLD 4072 Australia
- Julian Sng
- School of Chemistry and Molecular Biosciences, The University of Queensland, St Lucia, QLD 4072 Australia
- Ellesandra Noye
- School of Chemistry and Molecular Biosciences, The University of Queensland, St Lucia, QLD 4072 Australia
- Timothy J. Bruxner
- Institute for Molecular Bioscience, The University of Queensland, St Lucia, QLD 4072 Australia
- Gough G. Au
- Commonwealth Scientific and Industrial Research Organisation, Australian Centre for Disease Preparedness, 5 Portarlington Road, Geelong, VIC 3220 Australia
- Suzanne Lowther
- Commonwealth Scientific and Industrial Research Organisation, Australian Centre for Disease Preparedness, 5 Portarlington Road, Geelong, VIC 3220 Australia
- Julie Blommaert
- Department of Organismal Biology – Systematic Biology, Evolutionary Biology Centre, Uppsala University, Science for Life Laboratory, Uppsala, 752 36 Sweden
- The New Zealand Institute for Plant & Food Research Ltd, Nelson, 7010 New Zealand
- Alexander Suh
- Department of Organismal Biology – Systematic Biology, Evolutionary Biology Centre, Uppsala University, Science for Life Laboratory, Uppsala, 752 36 Sweden
- School of Biological Sciences, University of East Anglia, Norwich Research Park, Norwich, NR4 7TU UK
- Alexander J. McCauley
- Commonwealth Scientific and Industrial Research Organisation, Australian Centre for Disease Preparedness, 5 Portarlington Road, Geelong, VIC 3220 Australia
- Parwinder Kaur
- School of Agriculture and Environment, The University of Western Australia, Perth, WA 6009 Australia
- Olga Dudchenko
- The Centre for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030 USA
- Centre for Theoretical Biological Physics and Department of Computer Science, Rice University, Houston, TX 77030 USA
- Erez Aiden
- School of Agriculture and Environment, The University of Western Australia, Perth, WA 6009 Australia
- The Centre for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030 USA
- Centre for Theoretical Biological Physics and Department of Computer Science, Rice University, Houston, TX 77030 USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02139 USA
- Shanghai Institute for Advanced Immunochemical Studies, ShanghaiTech, Pudong, 201210 China
- Olivier Fedrigo
- The Vertebrate Genome Laboratory, The Rockefeller University, NY, 10065 USA
- Giulio Formenti
- The Vertebrate Genome Laboratory, The Rockefeller University, NY, 10065 USA
- Jacquelyn Mountcastle
- The Vertebrate Genome Laboratory, The Rockefeller University, NY, 10065 USA
- William Chow
- Tree of Life, Wellcome Sanger Institute, Cambridge, CB10 1SA UK
- Fergal J. Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD UK
- Denye N. Ogeh
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD UK
- Françoise Thiaud-Nissen
- National Centre for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD USA
- Kerstin Howe
- Tree of Life, Wellcome Sanger Institute, Cambridge, CB10 1SA UK
- Alan Tracey
- Tree of Life, Wellcome Sanger Institute, Cambridge, CB10 1SA UK
- Jacqueline Smith
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush Campus, Midlothian, EH25 9RG UK
- Richard I. Kuo
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush Campus, Midlothian, EH25 9RG UK
- Marilyn B. Renfree
- School of Biosciences, The University of Melbourne, Melbourne, VIC 3052 Australia
- Takashi Kimura
- Faculty of Veterinary Medicine, Hokkaido University, Sapporo, Hokkaido 060-0818 Japan
- Yoshihiro Sakoda
- Faculty of Veterinary Medicine, Hokkaido University, Sapporo, Hokkaido 060-0818 Japan
- Mathew McDougall
- New Zealand Fish & Game – Eastern Region, Rotorua, 3046 New Zealand
- Hamish G. Spencer
- Department of Zoology, University of Otago, Dunedin, 9054 New Zealand
- Michael Pyne
- Currumbin Wildlife Sanctuary, Currumbin, QLD 4223 Australia
- Conny Tolf
- Centre for Ecology and Evolution in Microbial Model Systems (EEMiS), Linnaeus University, Kalmar, SE-391 82 Sweden
- Jonas Waldenström
- Centre for Ecology and Evolution in Microbial Model Systems (EEMiS), Linnaeus University, Kalmar, SE-391 82 Sweden
- Erich D. Jarvis
- The Vertebrate Genome Laboratory, The Rockefeller University, NY, 10065 USA
- Michelle L. Baker
- Commonwealth Scientific and Industrial Research Organisation, Australian Centre for Disease Preparedness, 5 Portarlington Road, Geelong, VIC 3220 Australia
- David W. Burt
- School of Chemistry and Molecular Biosciences, The University of Queensland, St Lucia, QLD 4072 Australia
- Kirsty R. Short
- School of Chemistry and Molecular Biosciences, The University of Queensland, St Lucia, QLD 4072 Australia
27
Toh H, Yang C, Formenti G, Raja K, Yan L, Tracey A, Chow W, Howe K, Bergeron LA, Zhang G, Haase B, Mountcastle J, Fedrigo O, Fogg J, Kirilenko B, Munegowda C, Hiller M, Jain A, Kihara D, Rhie A, Phillippy AM, Swanson SA, Jiang P, Clegg DO, Jarvis ED, Thomson JA, Stewart R, Chaisson MJP, Bukhman YV. A haplotype-resolved genome assembly of the Nile rat facilitates exploration of the genetic basis of diabetes. BMC Biol 2022; 20:245. [DOI: 10.1186/s12915-022-01427-8] [Received: 03/10/2022] [Accepted: 09/29/2022] [Indexed: 11/09/2022]
Abstract
Background
The Nile rat (Arvicanthis niloticus) is an important animal model because of its robust diurnal rhythm, a cone-rich retina, and a propensity to develop diet-induced diabetes without chemical or genetic modifications. A closer similarity to humans in these aspects, compared to the widely used Mus musculus and Rattus norvegicus models, holds the promise of better translation of research findings to the clinic.
Results
We report a 2.5 Gb, chromosome-level reference genome assembly with fully resolved parental haplotypes, generated with the Vertebrate Genomes Project (VGP). The assembly is highly contiguous, with contig N50 of 11.1 Mb, scaffold N50 of 83 Mb, and 95.2% of the sequence assigned to chromosomes. We used a novel workflow to identify 3613 segmental duplications and quantify duplicated genes. Comparative analyses revealed unique genomic features of the Nile rat, including some that affect genes associated with type 2 diabetes and metabolic dysfunctions. We discuss 14 genes that are heterozygous in the Nile rat or highly diverged from the house mouse.
Conclusions
Our findings reflect the exceptional level of genomic resolution present in this assembly, which will greatly expand the potential of the Nile rat as a model organism.
28
Jarvis ED, Formenti G, Rhie A, Guarracino A, Yang C, Wood J, Tracey A, Thibaud-Nissen F, Vollger MR, Porubsky D, Cheng H, Asri M, Logsdon GA, Carnevali P, Chaisson MJP, Chin CS, Cody S, Collins J, Ebert P, Escalona M, Fedrigo O, Fulton RS, Fulton LL, Garg S, Gerton JL, Ghurye J, Granat A, Green RE, Harvey W, Hasenfeld P, Hastie A, Haukness M, Jaeger EB, Jain M, Kirsche M, Kolmogorov M, Korbel JO, Koren S, Korlach J, Lee J, Li D, Lindsay T, Lucas J, Luo F, Marschall T, Mitchell MW, McDaniel J, Nie F, Olsen HE, Olson ND, Pesout T, Potapova T, Puiu D, Regier A, Ruan J, Salzberg SL, Sanders AD, Schatz MC, Schmitt A, Schneider VA, Selvaraj S, Shafin K, Shumate A, Stitziel NO, Stober C, Torrance J, Wagner J, Wang J, Wenger A, Xiao C, Zimin AV, Zhang G, Wang T, Li H, Garrison E, Haussler D, Hall I, Zook JM, Eichler EE, Phillippy AM, Paten B, Howe K, Miga KH. Semi-automated assembly of high-quality diploid human reference genomes. Nature 2022; 611:519-531. [PMID: 36261518 PMCID: PMC9668749 DOI: 10.1038/s41586-022-05325-5] [Received: 12/29/2021] [Accepted: 09/06/2022] [Indexed: 01/01/2023]
Abstract
The current human reference genome, GRCh38, represents over 20 years of effort to generate a high-quality assembly, which has benefitted society [1,2]. However, it still has many gaps and errors, and does not represent a biological genome as it is a blend of multiple individuals [3,4]. Recently, a high-quality telomere-to-telomere reference, CHM13, was generated with the latest long-read technologies, but it was derived from a hydatidiform mole cell line with a nearly homozygous genome [5]. To address these limitations, the Human Pangenome Reference Consortium formed with the goal of creating high-quality, cost-effective, diploid genome assemblies for a pangenome reference that represents human genetic diversity [6]. Here, in our first scientific report, we determined which combination of current genome sequencing and assembly approaches yields the most complete and accurate diploid genome assembly with minimal manual curation. Approaches that used highly accurate long reads and parent-child data with graph-based haplotype phasing during assembly outperformed those that did not. Developing a combination of the top-performing methods, we generated our first high-quality diploid reference assembly, containing only approximately four gaps per chromosome on average, with most chromosomes within ±1% of the length of CHM13. Nearly 48% of protein-coding genes have non-synonymous amino acid changes between haplotypes, and centromeric regions showed the highest diversity. Our findings serve as a foundation for assembling near-complete diploid human genomes at scale for a pangenome reference to capture global genetic variation from single nucleotides to structural rearrangements.
Affiliation(s)
- Erich D. Jarvis
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY USA
- Howard Hughes Medical Institute, Chevy Chase, MD USA
- Giulio Formenti
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY USA
- Arang Rhie
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD USA
- Andrea Guarracino
- Genomics Research Centre, Human Technopole, Viale Rita Levi-Montalcini, Milan, Italy
- Chentao Yang
- BGI-Shenzhen, Shenzhen, China
- Jonathan Wood
- Tree of Life, Wellcome Sanger Institute, Cambridge, UK
- Alan Tracey
- Tree of Life, Wellcome Sanger Institute, Cambridge, UK
- Francoise Thibaud-Nissen
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD USA
- Mitchell R. Vollger
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA USA
- David Porubsky
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA USA
- Haoyu Cheng
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA USA
- Mobin Asri
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA USA
- Glennis A. Logsdon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA USA
- Paolo Carnevali
- Chan Zuckerberg Initiative, Redwood City, CA USA
- Mark J. P. Chaisson
- Quantitative and Computational Biology, University of Southern California, Los Angeles, CA USA
- Sarah Cody
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO USA
- Joanna Collins
- Tree of Life, Wellcome Sanger Institute, Cambridge, UK
- Peter Ebert
- Institute for Medical Biometry and Bioinformatics, Medical Faculty, Heinrich Heine University, Düsseldorf, Germany
- Merly Escalona
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA USA
- Olivier Fedrigo
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY USA
- Robert S. Fulton
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO USA
- Lucinda L. Fulton
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO USA
- Shilpa Garg
- Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Jennifer L. Gerton
- Stowers Institute for Medical Research, Kansas City, MO USA
- Jay Ghurye
- Dovetail Genomics, Scotts Valley, CA USA
- Richard E. Green
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA USA
- William Harvey
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA USA
- Patrick Hasenfeld
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
- Alex Hastie
- Bionano Genomics, San Diego, CA USA
- Marina Haukness
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA USA
- Erich B. Jaeger
- Illumina, Inc., San Diego, CA USA
- Miten Jain
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA USA
- Melanie Kirsche
- Department of Computer Science, Johns Hopkins University, Baltimore, MD USA
- Mikhail Kolmogorov
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA USA
- Jan O. Korbel
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
- Sergey Koren
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD USA
- Jonas Korlach
- Pacific Biosciences, Menlo Park, CA USA
- Joyce Lee
- Bionano Genomics, San Diego, CA USA
- Daofeng Li
- Department of Genetics, Washington University School of Medicine, St. Louis, MO USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO USA
- Tina Lindsay
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO USA
- Julian Lucas
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA USA
- Feng Luo
- School of Computing, Clemson University, Clemson, SC USA
- Tobias Marschall
- Institute for Medical Biometry and Bioinformatics, Medical Faculty, Heinrich Heine University, Düsseldorf, Germany
- Matthew W. Mitchell
- Coriell Institute for Medical Research, Camden, NJ USA
- Jennifer McDaniel
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD USA
- Fan Nie
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, China
- Hugh E. Olsen
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA USA
- Nathan D. Olson
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD USA
- Trevor Pesout
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA USA
- Tamara Potapova
- Stowers Institute for Medical Research, Kansas City, MO USA
- Daniela Puiu
- grid.21107.350000 0001 2171 9311Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD USA
| | - Allison Regier
- grid.511991.40000 0004 4910 5831DNAnexus, Mountain View, CA USA
| | - Jue Ruan
- grid.410727.70000 0001 0526 1937Agricultural Genomics Institute, Chinese Academy of Agricultural Sciences, Shenzhen, China
| | - Steven L. Salzberg
- grid.21107.350000 0001 2171 9311Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD USA
| | - Ashley D. Sanders
- grid.419491.00000 0001 1014 0849Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany
| | - Michael C. Schatz
- grid.21107.350000 0001 2171 9311Department of Computer Science, Johns Hopkins University, Baltimore, MD USA
| | | | - Valerie A. Schneider
- grid.94365.3d0000 0001 2297 5165National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD USA
| | | | - Kishwar Shafin
- grid.205975.c0000 0001 0740 6917UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA USA
| | - Alaina Shumate
- grid.21107.350000 0001 2171 9311Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD USA
| | - Nathan O. Stitziel
- grid.4367.60000 0001 2355 7002McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO USA ,grid.4367.60000 0001 2355 7002Department of Genetics, Washington University School of Medicine, St. Louis, MO USA ,grid.4367.60000 0001 2355 7002Cardiovascular Division, John T. Milliken Department of Internal Medicine, Washington University School of Medicine, St. Louis, USA
| | - Catherine Stober
- grid.4709.a0000 0004 0495 846XEuropean Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
| | - James Torrance
- grid.10306.340000 0004 0606 5382Tree of Life, Wellcome Sanger Institute, Cambridge, UK
| | - Justin Wagner
- grid.94225.38000000012158463XMaterial Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD USA
| | - Jianxin Wang
- grid.216417.70000 0001 0379 7164Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, China
| | - Aaron Wenger
- grid.423340.20000 0004 0640 9878Pacific Biosciences, Menlo Park, CA USA
| | - Chuanle Xiao
- grid.12981.330000 0001 2360 039XState Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
| | - Aleksey V. Zimin
- grid.21107.350000 0001 2171 9311Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD USA
| | - Guojie Zhang
- grid.13402.340000 0004 1759 700XCenter for Evolutionary & Organismal Biology, Zhejiang University School of Medicine, Hangzhou, China
| | - Ting Wang
- grid.4367.60000 0001 2355 7002McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO USA ,grid.4367.60000 0001 2355 7002Department of Genetics, Washington University School of Medicine, St. Louis, MO USA ,grid.4367.60000 0001 2355 7002The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO USA
| | - Heng Li
- grid.65499.370000 0001 2106 9910Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA USA
| | - Erik Garrison
- grid.267301.10000 0004 0386 9246Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN USA
| | - David Haussler
- grid.413575.10000 0001 2167 1581Howard Hughes Medical Institute, Chevy Chase, MD USA ,grid.205975.c0000 0001 0740 6917Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, CA USA
| | - Ira Hall
- grid.47100.320000000419368710Yale School of Medicine, New Haven, CT USA
| | - Justin M. Zook
- grid.94225.38000000012158463XMaterial Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD USA
| | - Evan E. Eichler
- grid.413575.10000 0001 2167 1581Howard Hughes Medical Institute, Chevy Chase, MD USA ,grid.34477.330000000122986657Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA USA
| | - Adam M. Phillippy
- grid.94365.3d0000 0001 2297 5165Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD USA
| | - Benedict Paten
- grid.205975.c0000 0001 0740 6917UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA USA
| | - Kerstin Howe
- grid.10306.340000 0004 0606 5382Tree of Life, Wellcome Sanger Institute, Cambridge, UK
| | - Karen H. Miga
- grid.205975.c0000 0001 0740 6917UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA USA
29
Kim J, Lee C, Ko BJ, Yoo DA, Won S, Phillippy AM, Fedrigo O, Zhang G, Howe K, Wood J, Durbin R, Formenti G, Brown S, Cantin L, Mello CV, Cho S, Rhie A, Kim H, Jarvis ED. False gene and chromosome losses in genome assemblies caused by GC content variation and repeats. Genome Biol 2022; 23:204. [PMID: 36167554; PMCID: PMC9516821; DOI: 10.1186/s13059-022-02765-0] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 06/16/2021] [Accepted: 09/02/2022] [Indexed: 11/10/2022] [Open access]
Abstract
BACKGROUND: Many short-read genome assemblies have been found to be incomplete and contain mis-assemblies. The Vertebrate Genomes Project has been producing new reference genome assemblies with an emphasis on being as complete and error-free as possible, which requires utilizing long reads, long-range scaffolding data, new assembly algorithms, and manual curation. A more thorough evaluation of the recent references relative to prior assemblies can provide a detailed overview of the types and magnitude of improvements. RESULTS: Here we evaluate new vertebrate genome references relative to the previous assemblies for the same species and, in two cases, the same individuals, including a mammal (platypus), two birds (zebra finch, Anna's hummingbird), and a fish (climbing perch). We find that up to 11% of genomic sequence is entirely missing in the previous assemblies. In the Vertebrate Genomes Project zebra finch assembly, we identify eight new GC- and repeat-rich micro-chromosomes with high gene density. The impact of missing sequences is biased towards GC-rich 5'-proximal promoters and 5' exon regions of protein-coding genes and long non-coding RNAs. Between 26 and 60% of genes include structural or sequence errors that could lead to misunderstanding of their function when using the previous genome assemblies. CONCLUSIONS: Our findings reveal novel regulatory landscapes and protein coding sequences that have been greatly underestimated in previous assemblies and are now present in the Vertebrate Genomes Project reference genomes.
Affiliation(s)
- Juwan Kim
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
- Chul Lee
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
- Byung June Ko
  - Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, Republic of Korea
- Dong Ahn Yoo
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
- Sohyoung Won
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
- Adam M Phillippy
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, MD, USA
- Olivier Fedrigo
  - Vertebrate Genome Lab, The Rockefeller University, New York City, USA
- Guojie Zhang
  - BGI-Shenzhen, Shenzhen, 518083, China
  - Villum Centre for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Universitetsparken 15, 2100, Copenhagen, Denmark
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
  - Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, 650223, China
- Richard Durbin
  - Wellcome Sanger Institute, Cambridge, UK
  - Department of Genetics, University of Cambridge, Cambridge, UK
- Giulio Formenti
  - Vertebrate Genome Lab, The Rockefeller University, New York City, USA
  - Laboratory of Neurogenetics of Language, The Rockefeller University, New York City, USA
- Samara Brown
  - Laboratory of Neurogenetics of Language, The Rockefeller University, New York City, USA
- Lindsey Cantin
  - Laboratory of Neurogenetics of Language, The Rockefeller University, New York City, USA
- Claudio V Mello
  - Department of Behavioral Neuroscience, Oregon Health and Science University, Portland, OR, 97239, USA
- Seoae Cho
  - eGnome, Inc, Seoul, Republic of Korea
- Arang Rhie
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, MD, USA
- Heebal Kim
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
  - Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, Republic of Korea
  - eGnome, Inc, Seoul, Republic of Korea
- Erich D Jarvis
  - Vertebrate Genome Lab, The Rockefeller University, New York City, USA
  - Laboratory of Neurogenetics of Language, The Rockefeller University, New York City, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
30
Ko BJ, Lee C, Kim J, Rhie A, Yoo DA, Howe K, Wood J, Cho S, Brown S, Formenti G, Jarvis ED, Kim H. Widespread false gene gains caused by duplication errors in genome assemblies. Genome Biol 2022; 23:205. [PMID: 36167596; PMCID: PMC9516828; DOI: 10.1186/s13059-022-02764-1] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 06/16/2021] [Accepted: 09/02/2022] [Indexed: 12/22/2022] [Open access]
Abstract
Background: False duplications in genome assemblies lead to false biological conclusions. We quantified false duplications in popularly used previous genome assemblies for platypus, zebra finch, and Anna’s Hummingbird, and their new counterparts of the same species generated by the Vertebrate Genomes Project, of which the Vertebrate Genomes Project pipeline attempted to eliminate false duplications through haplotype phasing and purging. These assemblies are among the first generated by the Vertebrate Genomes Project where there was a prior chromosomal level reference assembly to compare with. Results: Whole genome alignments revealed that 4 to 16% of the sequences are falsely duplicated in the previous assemblies, impacting hundreds to thousands of genes. These lead to overestimated gene family expansions. The main source of the false duplications is heterotype duplications, where the haplotype sequences were relatively more divergent than other parts of the genome leading the assembly algorithms to classify them as separate genes or genomic regions. A minor source is sequencing errors. Ancient ATP nucleotide binding gene families have a higher prevalence of false duplications compared to other gene families. Although present in a smaller proportion, we observe false duplications remaining in the Vertebrate Genomes Project assemblies that can be identified and purged. Conclusions: This study highlights the need for more advanced assembly methods that better separate haplotypes and sequence errors, and the need for cautious analyses on gene gains. Supplementary Information: The online version contains supplementary material available at 10.1186/s13059-022-02764-1.
Affiliation(s)
- Byung June Ko
  - Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, Republic of Korea
- Chul Lee
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
- Juwan Kim
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
- Arang Rhie
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, USA
- Dong Ahn Yoo
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
- Seoae Cho
  - eGnome, Inc, Seoul, Republic of Korea
- Samara Brown
  - Laboratory of the Neurogenetics of Language, The Rockefeller University, New York, NY, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Giulio Formenti
  - Laboratory of the Neurogenetics of Language, The Rockefeller University, New York, NY, USA
- Erich D Jarvis
  - Laboratory of the Neurogenetics of Language, The Rockefeller University, New York, NY, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Heebal Kim
  - Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, Republic of Korea
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
  - eGnome, Inc, Seoul, Republic of Korea
31
Dahn HA, Mountcastle J, Balacco J, Winkler S, Bista I, Schmitt AD, Pettersson OV, Formenti G, Oliver K, Smith M, Tan W, Kraus A, Mac S, Komoroske LM, Lama T, Crawford AJ, Murphy RW, Brown S, Scott AF, Morin PA, Jarvis ED, Fedrigo O. Benchmarking ultra-high molecular weight DNA preservation methods for long-read and long-range sequencing. Gigascience 2022; 11:6659719. [PMID: 35946988; PMCID: PMC9364683; DOI: 10.1093/gigascience/giac068] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 09/06/2021] [Revised: 01/26/2022] [Accepted: 06/16/2022] [Indexed: 11/14/2022] [Open access]
Abstract
BACKGROUND: Studies in vertebrate genomics require sampling from a broad range of tissue types, taxa, and localities. Recent advancements in long-read and long-range genome sequencing have made it possible to produce high-quality chromosome-level genome assemblies for almost any organism. However, adequate tissue preservation for the requisite ultra-high molecular weight DNA (uHMW DNA) remains a major challenge. Here we present a comparative study of preservation methods for field and laboratory tissue sampling, across vertebrate classes and different tissue types. RESULTS: We find that storage temperature was the strongest predictor of uHMW fragment lengths. While immediate flash-freezing remains the sample preservation gold standard, samples preserved in 95% EtOH or 20-25% DMSO-EDTA showed little fragment length degradation when stored at 4°C for 6 hours. Samples in 95% EtOH or 20-25% DMSO-EDTA kept at 4°C for 1 week after dissection still yielded adequate amounts of uHMW DNA for most applications. Tissue type was a significant predictor of total DNA yield but not fragment length. Preservation solution had a smaller but significant influence on both fragment length and DNA yield. CONCLUSION: We provide sample preservation guidelines that ensure sufficient DNA integrity and amount required for use with long-read and long-range sequencing technologies across vertebrates. Our best practices generated the uHMW DNA needed for the high-quality reference genomes for phase 1 of the Vertebrate Genomes Project, whose ultimate mission is to generate chromosome-level reference genome assemblies of all ∼70,000 extant vertebrate species.
Affiliation(s)
- Sylke Winkler
  - Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Saxony 01307, Germany
- Iliana Bista
  - Tree of Life Program, Wellcome Sanger Institute, Hinxton, Cambridgeshire CB10 1SA, UK
  - Department of Genetics, University of Cambridge, Cambridge, Cambridgeshire CB2 3EH, UK
- Karen Oliver
  - Tree of Life Program, Wellcome Sanger Institute, Hinxton, Cambridgeshire CB10 1SA, UK
- Michelle Smith
  - Tree of Life Program, Wellcome Sanger Institute, Hinxton, Cambridgeshire CB10 1SA, UK
- Wenhua Tan
  - Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Saxony 01307, Germany
- Anne Kraus
  - Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Saxony 01307, Germany
- Stephen Mac
  - Arima Genomics, Inc., San Diego, CA 92121, USA
- Lisa M Komoroske
  - Department of Environmental Conservation, University of Massachusetts Amherst, Amherst, MA 01003-9285, USA
- Tanya Lama
  - Department of Environmental Conservation, University of Massachusetts Amherst, Amherst, MA 01003-9285, USA
- Andrew J Crawford
  - Department of Biological Sciences, Universidad de los Andes, Bogotá 111711, Colombia
- Robert W Murphy
  - Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Ontario M5S 3B2, Canada
- Samara Brown
  - The Rockefeller University, New York, NY 10065, USA
- Alan F Scott
  - Department of Medicine, Johns Hopkins University, Baltimore, MD 21287, USA
- Phillip A Morin
  - Southwest Fisheries Science Center, National Marine Fisheries Service, NOAA, La Jolla, CA 92037, USA
- Erich D Jarvis
  - The Rockefeller University, New York, NY 10065, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA
- Olivier Fedrigo
  - Correspondence address: Olivier Fedrigo, Vertebrate Genome Laboratory, The Rockefeller University, 1230 York Avenue, Box 366, New York, NY 10065, USA
32
Formenti G, Abueg L, Brajuka A, Brajuka N, Gallardo-Alba C, Giani A, Fedrigo O, Jarvis ED. Gfastats: conversion, evaluation and manipulation of genome sequences using assembly graphs. Bioinformatics 2022; 38:4214-4216. [PMID: 35799367; PMCID: PMC9438950; DOI: 10.1093/bioinformatics/btac460] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 03/08/2022] [Revised: 05/28/2022] [Accepted: 07/06/2022] [Indexed: 12/24/2022] [Open access]
Abstract
MOTIVATION: With the current pace at which reference genomes are being produced, the availability of tools that can reliably and efficiently generate genome assembly summary statistics has become critical. Additionally, with the emergence of new algorithms and data types, tools that can improve the quality of existing assemblies through automated and manual curation are required. RESULTS: We sought to address both these needs by developing gfastats, as part of the Vertebrate Genomes Project (VGP) effort to generate high-quality reference genomes at scale. Gfastats is a standalone tool to compute assembly summary statistics and manipulate assembly sequences in FASTA, FASTQ or GFA [.gz] format. Gfastats stores assembly sequences internally in a GFA-like format. This feature allows gfastats to seamlessly convert FAST* to and from GFA [.gz] files. Gfastats can also build an assembly graph that can in turn be used to manipulate the underlying sequences following instructions provided by the user, while simultaneously generating key metrics for the new sequences. AVAILABILITY AND IMPLEMENTATION: Gfastats is implemented in C++. Precompiled releases (Linux, MacOS, Windows) and commented source code for gfastats are available under MIT licence at https://github.com/vgl-hub/gfastats. Examples of how to run gfastats are provided in the GitHub. Gfastats is also available in Bioconda, in Galaxy (https://assembly.usegalaxy.eu) and as a MultiQC module (https://github.com/ewels/MultiQC). An automated test workflow is available to ensure consistency of software updates. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
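To illustrate the kind of assembly summary statistics the abstract above refers to, the following is a minimal Python sketch that reads FASTA-formatted text and reports a few common metrics. It is not gfastats code and does not use its interface: the function names `read_fasta`, `n50` and `summarize` are hypothetical, and the choice of N50 and GC fraction as example metrics is an assumption, since the abstract does not enumerate which statistics the tool reports.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def n50(lengths: List[int]) -> int:
    """Length L such that sequences of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


def summarize(lines: Iterable[str]) -> Dict[str, float]:
    """Compute illustrative summary statistics for a set of sequences."""
    seqs = [seq.upper() for _, seq in read_fasta(lines)]
    lengths = [len(s) for s in seqs]
    total = sum(lengths)
    gc = sum(s.count("G") + s.count("C") for s in seqs)
    return {
        "n_sequences": len(lengths),
        "total_length": total,
        "longest": max(lengths, default=0),
        "n50": n50(lengths),
        "gc_fraction": gc / total if total else 0.0,
    }


if __name__ == "__main__":
    # Toy input made up for illustration; real assemblies would be read from a file.
    example = [">a", "ACGT", "ACGT", ">b", "GG", ">c", "AT"]
    print(summarize(example))
```

For real assemblies, passing an open file handle to `summarize` works the same way, because the reader only iterates over lines.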
Affiliation(s)
- Linelle Abueg
  - The Vertebrate Genome Laboratory, The Rockefeller University, New York, NY 10065, USA
- Angelo Brajuka
  - The Vertebrate Genome Laboratory, The Rockefeller University, New York, NY 10065, USA
- Nadolina Brajuka
  - The Vertebrate Genome Laboratory, The Rockefeller University, New York, NY 10065, USA
- Cristóbal Gallardo-Alba
  - Bioinformatics Group, Department of Computer Science, Albert-Ludwigs-University Freiburg, Freiburg 79110, Germany
- Alice Giani
  - Helen and Robert Appel Alzheimer Disease Research Institute, Feil Family Brain and Mind Research Institute, Weill Cornell Medicine, New York, NY 10021, USA
- Olivier Fedrigo
  - The Vertebrate Genome Laboratory, The Rockefeller University, New York, NY 10065, USA
- Erich D Jarvis
  - The Vertebrate Genome Laboratory, The Rockefeller University, New York, NY 10065, USA
  - Howard Hughes Medical Institute, Chevy Chase, Maryland 20815, USA
33
Mc Cartney AM, Shafin K, Alonge M, Bzikadze AV, Formenti G, Fungtammasan A, Howe K, Jain C, Koren S, Logsdon GA, Miga KH, Mikheenko A, Paten B, Shumate A, Soto DC, Sović I, Wood JMD, Zook JM, Phillippy AM, Rhie A. Chasing perfection: validation and polishing strategies for telomere-to-telomere genome assemblies. Nat Methods 2022; 19:687-695. [PMID: 35361931; PMCID: PMC9812399; DOI: 10.1038/s41592-022-01440-3] [Citation(s) in RCA: 35] [Impact Index Per Article: 17.5] [Received: 07/13/2021] [Accepted: 03/04/2022] [Indexed: 01/07/2023]
Abstract
Advances in long-read sequencing technologies and genome assembly methods have enabled the recent completion of the first telomere-to-telomere human genome assembly, which resolves complex segmental duplications and large tandem repeats, including centromeric satellite arrays in a complete hydatidiform mole (CHM13). Although derived from highly accurate sequences, evaluation revealed evidence of small errors and structural misassemblies in the initial draft assembly. To correct these errors, we designed a new repeat-aware polishing strategy that made accurate assembly corrections in large repeats without overcorrection, ultimately fixing 51% of the existing errors and improving the assembly quality value from 70.2 to 73.9 measured from PacBio high-fidelity and Illumina k-mers. By comparing our results to standard automated polishing tools, we outline common polishing errors and offer practical suggestions for genome projects with limited resources. We also show how sequencing biases in both high-fidelity and Oxford Nanopore Technologies reads cause signature assembly errors that can be corrected with a diverse panel of sequencing technologies.
Affiliation(s)
- Ann M. Mc Cartney
  - Genome Informatics Section, Computational and Statistical Genomics Branch, NHGRI, NIH
- Kishwar Shafin
  - UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Michael Alonge
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Andrey V. Bzikadze
  - Graduate Program in Bioinformatics and Systems Biology, University of California San Diego, La Jolla, CA, USA
- Giulio Formenti
  - Laboratory of Neurogenetics of Language and The Vertebrate Genome Lab, The Rockefeller University, New York, NY, USA
- Chirag Jain
  - Genome Informatics Section, Computational and Statistical Genomics Branch, NHGRI, NIH
  - Department of Computational and Data Sciences, Indian Institute of Science, Bangalore KA, India
- Sergey Koren
  - Genome Informatics Section, Computational and Statistical Genomics Branch, NHGRI, NIH
- Glennis A. Logsdon
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Karen H. Miga
  - UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Alla Mikheenko
  - Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg, Russia
- Benedict Paten
  - UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Alaina Shumate
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Daniela C. Soto
  - Genome Center, MIND Institute, Department of Biochemistry and Molecular Medicine, University of California, Davis, CA, USA
- Ivan Sović
  - Pacific Biosciences, Menlo Park, CA, USA
  - Digital BioLogic d.o.o., Ivanić-Grad, Croatia
- Justin M. Zook
  - Biosystems and Biomaterials Division, National Institute of Standards and Technology, Gaithersburg, MD, USA
- Adam M. Phillippy
  - Genome Informatics Section, Computational and Statistical Genomics Branch, NHGRI, NIH (corresponding author)
- Arang Rhie
  - Genome Informatics Section, Computational and Statistical Genomics Branch, NHGRI, NIH (corresponding author)
34
Lombardo G, Migliore NR, Colombo G, Capodiferro MR, Formenti G, Caprioli M, Moroni E, Caporali L, Lancioni H, Secomandi S, Gallo GR, Costanzo A, Romano A, Garofalo M, Cereda C, Carelli V, Gillespie L, Liu Y, Kiat Y, Marzal A, López-Calderón C, Balbontín J, Mousseau TA, Matyjasiak P, Møller AP, Semino O, Ambrosini R, Alquati AB, Rubolini D, Ferretti L, Achilli A, Gianfranceschi L, Olivieri A, Torroni A. The Mitogenome Relationships and Phylogeography of Barn Swallows (Hirundo rustica). Mol Biol Evol 2022; 39:6591937. [PMID: 35617136; PMCID: PMC9174979; DOI: 10.1093/molbev/msac113] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/13/2022] [Open access]
Abstract
The barn swallow (Hirundo rustica) poses a number of fascinating scientific questions, including the taxonomic status of postulated subspecies. Here we obtained and assessed the sequence variation of 411 complete mitogenomes, mainly from the European H. r. rustica, but other subspecies as well. In almost every case, we observed subspecies-specific haplogroups, which we employed together with estimated radiation times to postulate a model for the geographical and temporal worldwide spread of the species. The female barn swallow carrying the Hirundo rustica ancestral mitogenome left Africa (or its vicinity) around 280 thousand years ago (kya), and her descendants expanded first into Eurasia and then, at least 51 kya, into the Americas, from where a relatively recent (< 20 kya) back migration to Asia took place. The exception to the haplogroup subspecies specificity is represented by the sedentary Levantine H. r. transitiva that extensively shares haplogroup A with the migratory European H. r. rustica and, to a lesser extent, haplogroup B with the Egyptian H. r. savignii. Our data indicate that rustica and transitiva most likely derive from a sedentary Levantine population source that split at the end of the Younger Dryas (11.7 kya). Since then, however, transitiva received genetic inputs from and admixed with both the closely related rustica and the adjacent savignii. Demographic analyses confirm this species' strong link with climate fluctuations and human activities making it an excellent indicator for monitoring and assessing the impact of current global changes on wildlife.
Affiliation(s)
- Gianluca Lombardo
- Dipartimento di Biologia e Biotecnologie "Lazzaro Spallanzani", Università di Pavia, 27100 Pavia, Italy
| | - Nicola Rambaldi Migliore
- Dipartimento di Biologia e Biotecnologie "Lazzaro Spallanzani", Università di Pavia, 27100 Pavia, Italy
| | - Giulia Colombo
- Dipartimento di Biologia e Biotecnologie "Lazzaro Spallanzani", Università di Pavia, 27100 Pavia, Italy
| | - Marco Rosario Capodiferro
- Dipartimento di Biologia e Biotecnologie "Lazzaro Spallanzani", Università di Pavia, 27100 Pavia, Italy
| | - Giulio Formenti
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY 10065, USA
| | - Manuela Caprioli
- Dipartimento di Scienze e Politiche Ambientali, Università degli Studi di Milano, 20133 Milan, Italy
| | - Elisabetta Moroni
- Dipartimento di Biologia e Biotecnologie "Lazzaro Spallanzani", Università di Pavia, 27100 Pavia, Italy
| | - Leonardo Caporali
- IRCCS Istituto delle Scienze Neurologiche di Bologna, Programma di Neurogenetica, 40139 Bologna, Italy
| | - Hovirag Lancioni
- Dipartimento di Chimica, Biologia e Biotecnologie, Università di Perugia, 06123 Perugia, Italy
| | - Simona Secomandi
- Dipartimento di Bioscienze, Università degli Studi di Milano, 20133 Milan, Italy
| | - Guido Roberto Gallo
- Dipartimento di Bioscienze, Università degli Studi di Milano, 20133 Milan, Italy
| | - Alessandra Costanzo
- Dipartimento di Scienze e Politiche Ambientali, Università degli Studi di Milano, 20133 Milan, Italy
| | - Andrea Romano
- Dipartimento di Scienze e Politiche Ambientali, Università degli Studi di Milano, 20133 Milan, Italy
| | - Maria Garofalo
- Genomic and Post-Genomic Unit, IRCCS Mondino Foundation, 27100 Pavia, Italy
| | - Cristina Cereda
- Genomic and Post-Genomic Unit, IRCCS Mondino Foundation, 27100 Pavia, Italy
| | - Valerio Carelli
- IRCCS Istituto delle Scienze Neurologiche di Bologna, Programma di Neurogenetica, 40139 Bologna, Italy; Dipartimento di Scienze Biomediche e Neuromotorie, Università di Bologna, 40139 Bologna, Italy
| | - Lauren Gillespie
- Department of Academic Education, Central Community College, Columbus, NE 68601, USA
| | - Yang Liu
- State Key Laboratory of Biocontrol, School of Ecology, Sun Yat-sen University, Guangzhou 510275, China
| | - Yosef Kiat
- Israeli Bird Ringing Center (IBRC), Israel Ornithological Center, Tel Aviv, Israel
| | - Alfonso Marzal
- Department of Zoology, University of Extremadura, 06071 Badajoz, Spain
| | - Cosme López-Calderón
- Department of Wetland Ecology, Estación Biológica de Doñana CSIC, 41092 Seville, Spain
| | - Javier Balbontín
- Department of Zoology, University of Seville, 41012 Seville, Spain
| | - Timothy A Mousseau
- Department of Biological Sciences, University of South Carolina, Columbia, SC 29208, USA
| | - Piotr Matyjasiak
- Institute of Biological Sciences, Cardinal Stefan Wyszyński University in Warsaw, 01-938 Warsaw, Poland
| | - Anders Pape Møller
- Ecologie Systématique Evolution, Université Paris-Sud, CNRS, AgroParisTech, Université Paris-Saclay, 91405, Orsay Cedex, France
| | - Ornella Semino
- Dipartimento di Biologia e Biotecnologie "Lazzaro Spallanzani", Università di Pavia, 27100 Pavia, Italy
| | - Roberto Ambrosini
- Dipartimento di Scienze e Politiche Ambientali, Università degli Studi di Milano, 20133 Milan, Italy
| | - Andrea Bonisoli Alquati
- Department of Biological Sciences, California State Polytechnic University - Pomona, Pomona, CA 91767, USA
| | - Diego Rubolini
- Dipartimento di Scienze e Politiche Ambientali, Università degli Studi di Milano, 20133 Milan, Italy
| | - Luca Ferretti
- Dipartimento di Biologia e Biotecnologie "Lazzaro Spallanzani", Università di Pavia, 27100 Pavia, Italy
| | - Alessandro Achilli
- Dipartimento di Biologia e Biotecnologie "Lazzaro Spallanzani", Università di Pavia, 27100 Pavia, Italy
| | - Luca Gianfranceschi
- Dipartimento di Bioscienze, Università degli Studi di Milano, 20133 Milan, Italy
| | - Anna Olivieri
- Dipartimento di Biologia e Biotecnologie "Lazzaro Spallanzani", Università di Pavia, 27100 Pavia, Italy
| | - Antonio Torroni
- Dipartimento di Biologia e Biotecnologie "Lazzaro Spallanzani", Università di Pavia, 27100 Pavia, Italy
|
35
|
Nurk S, Koren S, Rhie A, Rautiainen M, Bzikadze AV, Mikheenko A, Vollger MR, Altemose N, Uralsky L, Gershman A, Aganezov S, Hoyt SJ, Diekhans M, Logsdon GA, Alonge M, Antonarakis SE, Borchers M, Bouffard GG, Brooks SY, Caldas GV, Chen NC, Cheng H, Chin CS, Chow W, de Lima LG, Dishuck PC, Durbin R, Dvorkina T, Fiddes IT, Formenti G, Fulton RS, Fungtammasan A, Garrison E, Grady PG, Graves-Lindsay TA, Hall IM, Hansen NF, Hartley GA, Haukness M, Howe K, Hunkapiller MW, Jain C, Jain M, Jarvis ED, Kerpedjiev P, Kirsche M, Kolmogorov M, Korlach J, Kremitzki M, Li H, Maduro VV, Marschall T, McCartney AM, McDaniel J, Miller DE, Mullikin JC, Myers EW, Olson ND, Paten B, Peluso P, Pevzner PA, Porubsky D, Potapova T, Rogaev EI, Rosenfeld JA, Salzberg SL, Schneider VA, Sedlazeck FJ, Shafin K, Shew CJ, Shumate A, Sims Y, Smit AFA, Soto DC, Sović I, Storer JM, Streets A, Sullivan BA, Thibaud-Nissen F, Torrance J, Wagner J, Walenz BP, Wenger A, Wood JMD, Xiao C, Yan SM, Young AC, Zarate S, Surti U, McCoy RC, Dennis MY, Alexandrov IA, Gerton JL, O’Neill RJ, Timp W, Zook JM, Schatz MC, Eichler EE, Miga KH, Phillippy AM. The complete sequence of a human genome. Science 2022; 376:44-53. [PMID: 35357919 PMCID: PMC9186530 DOI: 10.1126/science.abj6987] [Citation(s) in RCA: 894] [Impact Index Per Article: 447.0] [Indexed: 12/11/2022]
Abstract
Since its initial release in 2000, the human reference genome has covered only the euchromatic fraction of the genome, leaving important heterochromatic regions unfinished. Addressing the remaining 8% of the genome, the Telomere-to-Telomere (T2T) Consortium presents a complete 3.055 billion-base pair sequence of a human genome, T2T-CHM13, that includes gapless assemblies for all chromosomes except Y, corrects errors in the prior references, and introduces nearly 200 million base pairs of sequence containing 1956 gene predictions, 99 of which are predicted to be protein coding. The completed regions include all centromeric satellite arrays, recent segmental duplications, and the short arms of all five acrocentric chromosomes, unlocking these complex regions of the genome to variational and functional studies.
Affiliation(s)
- Sergey Nurk
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD USA
| | - Sergey Koren
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD USA
| | - Arang Rhie
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD USA
| | - Mikko Rautiainen
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD USA
| | - Andrey V. Bzikadze
- Graduate Program in Bioinformatics and Systems Biology, University of California, San Diego; La Jolla, CA, USA
| | - Alla Mikheenko
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University; Saint Petersburg, Russia
| | - Mitchell R. Vollger
- Department of Genome Sciences, University of Washington School of Medicine; Seattle, WA, USA
| | - Nicolas Altemose
- Department of Bioengineering, University of California, Berkeley; Berkeley, CA, USA
| | - Lev Uralsky
- Sirius University of Science and Technology; Sochi, Russia
- Vavilov Institute of General Genetics; Moscow, Russia
| | - Ariel Gershman
- Department of Molecular Biology and Genetics, Johns Hopkins University; Baltimore, MD, USA
| | - Sergey Aganezov
- Department of Computer Science, Johns Hopkins University; Baltimore, MD, USA
| | - Savannah J. Hoyt
- Institute for Systems Genomics and Department of Molecular and Cell Biology, University of Connecticut; Storrs, CT, USA
| | - Mark Diekhans
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz; Santa Cruz, CA, USA
| | - Glennis A. Logsdon
- Department of Genome Sciences, University of Washington School of Medicine; Seattle, WA, USA
| | - Michael Alonge
- Department of Computer Science, Johns Hopkins University; Baltimore, MD, USA
| | | | | | - Gerard G. Bouffard
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD, USA
| | - Shelise Y. Brooks
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD, USA
| | - Gina V. Caldas
- Department of Molecular and Cell Biology, University of California, Berkeley; Berkeley, CA, USA
| | - Nae-Chyun Chen
- Department of Computer Science, Johns Hopkins University; Baltimore, MD, USA
| | - Haoyu Cheng
- Department of Data Sciences, Dana-Farber Cancer Institute; Boston, MA
- Department of Biomedical Informatics, Harvard Medical School; Boston, MA
| | | | | | | | - Philip C. Dishuck
- Department of Genome Sciences, University of Washington School of Medicine; Seattle, WA, USA
| | - Richard Durbin
- Wellcome Sanger Institute; Cambridge, UK
- Department of Genetics, University of Cambridge; Cambridge, UK
| | - Tatiana Dvorkina
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University; Saint Petersburg, Russia
| | | | - Giulio Formenti
- Laboratory of Neurogenetics of Language and The Vertebrate Genome Lab, The Rockefeller University; New York, NY, USA
- Howard Hughes Medical Institute; Chevy Chase, MD, USA
| | - Robert S. Fulton
- Department of Genetics, Washington University School of Medicine; St. Louis, MO, USA
| | | | - Erik Garrison
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz; Santa Cruz, CA, USA
- University of Tennessee Health Science Center; Memphis, TN, USA
| | - Patrick G.S. Grady
- Institute for Systems Genomics and Department of Molecular and Cell Biology, University of Connecticut; Storrs, CT, USA
| | | | - Ira M. Hall
- Department of Genetics, Yale University School of Medicine; New Haven, CT, USA
| | - Nancy F. Hansen
- Comparative Genomics Analysis Unit, Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD, USA
| | - Gabrielle A. Hartley
- Institute for Systems Genomics and Department of Molecular and Cell Biology, University of Connecticut; Storrs, CT, USA
| | - Marina Haukness
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz; Santa Cruz, CA, USA
| | | | | | - Chirag Jain
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD USA
- Department of Computational and Data Sciences, Indian Institute of Science; Bangalore KA, India
| | - Miten Jain
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz; Santa Cruz, CA, USA
| | - Erich D. Jarvis
- Laboratory of Neurogenetics of Language and The Vertebrate Genome Lab, The Rockefeller University; New York, NY, USA
- Howard Hughes Medical Institute; Chevy Chase, MD, USA
| | | | - Melanie Kirsche
- Department of Computer Science, Johns Hopkins University; Baltimore, MD, USA
| | - Mikhail Kolmogorov
- Department of Computer Science and Engineering, University of California, San Diego; San Diego, CA, USA
| | | | - Milinn Kremitzki
- McDonnell Genome Institute, Washington University in St. Louis; St. Louis, MO, USA
| | - Heng Li
- Department of Data Sciences, Dana-Farber Cancer Institute; Boston, MA
- Department of Biomedical Informatics, Harvard Medical School; Boston, MA
| | - Valerie V. Maduro
- Undiagnosed Diseases Program, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD, USA
| | - Tobias Marschall
- Heinrich Heine University Düsseldorf, Medical Faculty, Institute for Medical Biometry and Bioinformatics; Düsseldorf, Germany
| | - Ann M. McCartney
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD USA
| | - Jennifer McDaniel
- Biosystems and Biomaterials Division, National Institute of Standards and Technology; Gaithersburg, MD, USA
| | - Danny E. Miller
- Department of Genome Sciences, University of Washington School of Medicine; Seattle, WA, USA
- Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children’s Hospital; Seattle, WA, USA
| | - James C. Mullikin
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD, USA
- Comparative Genomics Analysis Unit, Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD, USA
| | - Eugene W. Myers
- Max-Planck Institute of Molecular Cell Biology and Genetics; Dresden, Germany
| | - Nathan D. Olson
- Biosystems and Biomaterials Division, National Institute of Standards and Technology; Gaithersburg, MD, USA
| | - Benedict Paten
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz; Santa Cruz, CA, USA
| | | | - Pavel A. Pevzner
- Department of Computer Science and Engineering, University of California, San Diego; San Diego, CA, USA
| | - David Porubsky
- Department of Genome Sciences, University of Washington School of Medicine; Seattle, WA, USA
| | - Tamara Potapova
- Stowers Institute for Medical Research; Kansas City, MO, USA
| | - Evgeny I. Rogaev
- Sirius University of Science and Technology; Sochi, Russia
- Vavilov Institute of General Genetics; Moscow, Russia
- Department of Psychiatry, University of Massachusetts Medical School; Worcester, MA, USA
- Faculty of Biology, Lomonosov Moscow State University; Moscow, Russia
| | | | - Steven L. Salzberg
- Department of Computer Science, Johns Hopkins University; Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins University; Baltimore, MD, USA
| | - Valerie A. Schneider
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health; Bethesda, MD, USA
| | - Fritz J. Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine; Houston TX, USA
| | - Kishwar Shafin
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz; Santa Cruz, CA, USA
| | - Colin J. Shew
- Genome Center, MIND Institute, Department of Biochemistry and Molecular Medicine, University of California, Davis; CA, USA
| | - Alaina Shumate
- Department of Biomedical Engineering, Johns Hopkins University; Baltimore, MD, USA
| | - Ying Sims
- Wellcome Sanger Institute; Cambridge, UK
| | | | - Daniela C. Soto
- Genome Center, MIND Institute, Department of Biochemistry and Molecular Medicine, University of California, Davis; CA, USA
| | - Ivan Sović
- Pacific Biosciences; Menlo Park, CA, USA
- Digital BioLogic d.o.o.; Ivanić-Grad, Croatia
| | | | - Aaron Streets
- Department of Bioengineering, University of California, Berkeley; Berkeley, CA, USA
- Chan Zuckerberg Biohub; San Francisco, CA, USA
| | - Beth A. Sullivan
- Department of Molecular Genetics and Microbiology, Duke University School of Medicine; Durham, NC, USA
| | - Françoise Thibaud-Nissen
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health; Bethesda, MD, USA
| | | | - Justin Wagner
- Biosystems and Biomaterials Division, National Institute of Standards and Technology; Gaithersburg, MD, USA
| | - Brian P. Walenz
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD USA
| | | | | | - Chunlin Xiao
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health; Bethesda, MD, USA
| | - Stephanie M. Yan
- Department of Biology, Johns Hopkins University; Baltimore, MD, USA
| | - Alice C. Young
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD, USA
| | - Samantha Zarate
- Department of Computer Science, Johns Hopkins University; Baltimore, MD, USA
| | - Urvashi Surti
- Department of Pathology, University of Pittsburgh; Pittsburgh, PA, USA
| | - Rajiv C. McCoy
- Department of Biology, Johns Hopkins University; Baltimore, MD, USA
| | - Megan Y. Dennis
- Genome Center, MIND Institute, Department of Biochemistry and Molecular Medicine, University of California, Davis; CA, USA
| | - Ivan A. Alexandrov
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University; Saint Petersburg, Russia
- Vavilov Institute of General Genetics; Moscow, Russia
- Research Center of Biotechnology of the Russian Academy of Sciences; Moscow, Russia
| | - Jennifer L. Gerton
- Stowers Institute for Medical Research; Kansas City, MO, USA
- Department of Biochemistry and Molecular Biology, University of Kansas Medical School; Kansas City, MO, USA
| | - Rachel J. O’Neill
- Institute for Systems Genomics and Department of Molecular and Cell Biology, University of Connecticut; Storrs, CT, USA
| | - Winston Timp
- Department of Molecular Biology and Genetics, Johns Hopkins University; Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins University; Baltimore, MD, USA
| | - Justin M. Zook
- Biosystems and Biomaterials Division, National Institute of Standards and Technology; Gaithersburg, MD, USA
| | - Michael C. Schatz
- Department of Computer Science, Johns Hopkins University; Baltimore, MD, USA
- Department of Biology, Johns Hopkins University; Baltimore, MD, USA
| | - Evan E. Eichler
- Department of Genome Sciences, University of Washington School of Medicine; Seattle, WA, USA
- Howard Hughes Medical Institute; Chevy Chase, MD, USA
| | - Karen H. Miga
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz; Santa Cruz, CA, USA
- Department of Biomolecular Engineering, University of California Santa Cruz, CA, USA
| | - Adam M. Phillippy
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD USA
|
36
|
Formenti G, Theissinger K, Fernandes C, Bista I, Bombarely A, Bleidorn C, Ciofi C, Crottini A, Godoy JA, Höglund J, Malukiewicz J, Mouton A, Oomen RA, Paez S, Palsbøll PJ, Pampoulie C, Ruiz-López MJ, Svardal H, Theofanopoulou C, de Vries J, Waldvogel AM, Zhang G, Mazzoni CJ, Jarvis ED, Bálint M. The era of reference genomes in conservation genomics. Trends Ecol Evol 2022; 37:197-202. [PMID: 35086739 DOI: 10.1016/j.tree.2021.11.008] [Citation(s) in RCA: 73] [Impact Index Per Article: 36.5] [Received: 04/25/2021] [Revised: 11/10/2021] [Accepted: 11/16/2021] [Indexed: 02/08/2023]
Abstract
Progress in genome sequencing now enables the large-scale generation of reference genomes. Various international initiatives aim to generate reference genomes representing global biodiversity. These genomes provide unique insights into genomic diversity and architecture, thereby enabling comprehensive analyses of population and functional genomics, and are expected to revolutionize conservation genomics.
Affiliation(s)
- Giulio Formenti
- The Rockefeller University, 1230 York Ave, New York, NY 10065, USA
| | - Kathrin Theissinger
- LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Georg-Voigt-Str. 14-16, 60325 Frankfurt/Main, Germany; University of Koblenz-Landau, Institute for Environmental Sciences, Fortstrasse 7, 76829 Landau, Germany; Senckenberg Biodiversity and Climate Research Centre, Georg-Voigt-Str. 14-16, 60325 Frankfurt/Main, Germany
| | - Carlos Fernandes
- CE3C - Centre for Ecology, Evolution and Environmental Changes, Departamento de Biologia Animal, Faculdade de Ciências, Universidade de Lisboa, 1749-016 Lisboa, Portugal; Faculdade de Psicologia, Universidade de Lisboa, Alameda da Universidade, 1649-013 Lisboa, Portugal
| | - Iliana Bista
- University of Cambridge, Department of Genetics, Cambridge CB2 3EH, UK; Wellcome Sanger Institute, CB10 1SA, Hinxton, UK
| | | | - Christoph Bleidorn
- University of Göttingen, Department of Animal Evolution and Biodiversity, Untere Karspüle, 2, 37073, Germany
| | - Claudio Ciofi
- University of Florence, Department of Biology, Via Madonna del Piano 6, Sesto Fiorentino (FI) 50019, Italy
| | - Angelica Crottini
- CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Campus de Vairão, Universidade do Porto, 4485-661 Vairão, Portugal
| | - José A Godoy
- Estación Biológica de Doñana, Consejo Superior de Investigaciones Científicas, Av. Américo Vespucio, 26, 41092, Spain
| | - Jacob Höglund
- Dept. of Ecology and Genetics, Uppsala University, Norbyvägen 18D, 75246, Sweden
| | | | - Alice Mouton
- InBios - Conservation Genetics Lab, University of Liege, Chemin de la Vallée 4, 4000, Belgium
| | - Rebekah A Oomen
- Centre for Ecological and Evolutionary Synthesis, University of Oslo, Blindernveien 31, 0371 Oslo, Norway; Centre for Coastal Research, University of Agder, Gimlemoen 25j, 4630 Kristiansand, Norway
| | - Sadye Paez
- The Rockefeller University, 1230 York Ave, New York, NY 10065, USA
| | - Per J Palsbøll
- Groningen Institute of Evolutionary Life Sciences University of Groningen Nijenborgh, 9747, AG, Groningen, the Netherlands; Center for Coastal Studies, 5 Holway Avenue, Provincetown, MA 02657, USA
| | - Christophe Pampoulie
- Marine and Freshwater Research Institute, Fornubúðir, 5, 220 Hanafjörður, Iceland
| | - María J Ruiz-López
- Estación Biológica de Doñana, Consejo Superior de Investigaciones Científicas, Av. Américo Vespucio, 26, 41092, Spain
| | - Hannes Svardal
- Department of Biology, University of Antwerp, Groenenborgerlaan 171, 2020, Belgium
| | | | - Jan de Vries
- University of Göttingen, Institute for Microbiology and Genetics, Dept. of Applied Bioinformatics, Goettingen Center for Molecular Biosciences (GZMB), Campus Institute Data Science (CIDAS), Goldschmidtstr. 1, 37077, Germany
| | - Ann-Marie Waldvogel
- Institute of Zoology, University of Cologne, Zülpicherstrasse 47b, D-50674, Germany
| | - Guojie Zhang
- Villum Center for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Denmark, Build 3, Universitetsparken 15, Copenhagen 2100, Denmark; China National Genebank, BGI-Shenzhen, Jinsha Road, Dapeng District, Shenzhen 518083, China
| | - Camila J Mazzoni
- Leibniz Institute for Zoo and Wildlife Research (IZW), Alfred-Kowalke-Str 17, 10315 Berlin, Germany
| | - Erich D Jarvis
- The Rockefeller University, 1230 York Ave, New York, NY 10065, USA
| | - Miklós Bálint
- LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Georg-Voigt-Str. 14-16, 60325 Frankfurt/Main, Germany; Senckenberg Biodiversity and Climate Research Centre, Georg-Voigt-Str. 14-16, 60325 Frankfurt/Main, Germany; Institute for Insect Biotechnology, Justus-Liebig University Gießen, Heinrich-Buff-Ring 26-32, 35392 Giessen, Germany
|
37
|
Lewin HA, Richards S, Lieberman Aiden E, Allende ML, Archibald JM, Bálint M, Barker KB, Baumgartner B, Belov K, Bertorelle G, Blaxter ML, Cai J, Caperello ND, Carlson K, Castilla-Rubio JC, Chaw SM, Chen L, Childers AK, Coddington JA, Conde DA, Corominas M, Crandall KA, Crawford AJ, DiPalma F, Durbin R, Ebenezer TE, Edwards SV, Fedrigo O, Flicek P, Formenti G, Gibbs RA, Gilbert MTP, Goldstein MM, Graves JM, Greely HT, Grigoriev IV, Hackett KJ, Hall N, Haussler D, Helgen KM, Hogg CJ, Isobe S, Jakobsen KS, Janke A, Jarvis ED, Johnson WE, Jones SJM, Karlsson EK, Kersey PJ, Kim JH, Kress WJ, Kuraku S, Lawniczak MKN, Leebens-Mack JH, Li X, Lindblad-Toh K, Liu X, Lopez JV, Marques-Bonet T, Mazard S, Mazet JAK, Mazzoni CJ, Myers EW, O'Neill RJ, Paez S, Park H, Robinson GE, Roquet C, Ryder OA, Sabir JSM, Shaffer HB, Shank TM, Sherkow JS, Soltis PS, Tang B, Tedersoo L, Uliano-Silva M, Wang K, Wei X, Wetzer R, Wilson JL, Xu X, Yang H, Yoder AD, Zhang G. The Earth BioGenome Project 2020: Starting the clock. Proc Natl Acad Sci U S A 2022; 119:e2115635118. [PMID: 35042800 PMCID: PMC8795548 DOI: 10.1073/pnas.2115635118] [Citation(s) in RCA: 66] [Impact Index Per Article: 33.0] [Indexed: 12/22/2022]
Affiliation(s)
- Harris A Lewin
- Department of Evolution and Ecology, College of Biological Sciences, University of California, Davis, CA 95616
- Department of Population Health and Reproduction, University of California, Davis, CA 95616
| | - Stephen Richards
- University of California Davis Genome Center, University of California, Davis, CA 95616
| | - Erez Lieberman Aiden
- DNA Zoo and The Center for Genome Architecture, Baylor College of Medicine, Houston, TX 77030
| | - Miguel L Allende
- Center for Genome Regulation, Universidad de Chile 3425 Santiago, Chile
- Facultad de Ciencias, Universidad de Chile 3425 Santiago, Chile
| | - John M Archibald
- Department of Biochemistry & Molecular Biology, Dalhousie University, Halifax, NS B3H 4H7, Canada
| | - Miklós Bálint
- LOEWE Centre of Translational Biodiversity Genomics, Senckenberg Leibniz Institution for Biodiversity and Earth System Research 60325 Frankfurt am Main, Germany
- Institute for Insect Biotechnology, Justus-Liebig University 35392 Giessen, Germany
| | - Katharine B Barker
- Global Genome Biodiversity Network Secretariat, National Museum of Natural History, Smithsonian Institution, Washington, DC 20560
| | | | - Katherine Belov
- School of Life and Environmental Sciences, University of Sydney, Sydney, NSW 2006, Australia
| | - Giorgio Bertorelle
- Department of Life Sciences and Biotechnology, University of Ferrara 44121 Ferrara, Italy
| | - Mark L Blaxter
- Tree of Life, Wellcome Sanger Institute, Cambridge CB10 1SA, United Kingdom
| | - Jing Cai
- School of Ecology and Environment, Northwestern Polytechnical University 710072 Xi'an, China
| | - Nicolette D Caperello
- University of California Davis Genome Center, University of California, Davis, CA 95616
| | - Keith Carlson
- The Novim Group, University of California, Santa Barbara, CA 93106
| | | | - Shu-Miaw Chaw
- Biodiversity Research Center, Academia Sinica 11529 Taipei, Taiwan
| | - Lei Chen
- School of Ecology and Environment, Northwestern Polytechnical University 710072 Xi'an, China
| | - Anna K Childers
- Bee Research Laboratory, Beltsville Agricultural Research Center, US Department of Agriculture, Agriculture Research Service, Beltsville, MD 20705
| | - Jonathan A Coddington
- Global Genome Initiative, National Museum of Natural History, Smithsonian Institution, Washington, DC 20560
| | - Dalia A Conde
- Conservation Science, Species360 Conservation Science Alliance, Bloomington, MN 55425
- Department of Biology, University of Southern Denmark 5230 Odense M, Denmark
| | - Montserrat Corominas
- Department of Genetics, Microbiology, and Statistics, Universitat de Barcelona 08028 Barcelona, Spain
- Catalan Society for Biology, Institute for Catalan Studies 08001 Barcelona, Spain
| | - Keith A Crandall
- Department of Biostatistics & Bioinformatics, Computational Biology Institute, George Washington University, Washington, DC 20052
- Department of Biostatistics & Bioinformatics, Milken Institute School of Public Health, George Washington University, Washington, DC 20052
| | - Andrew J Crawford
- Department of Biological Sciences, Universidad de los Andes 111711 Bogotá, Colombia
| | | | - Richard Durbin
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
- Wellcome Sanger Institute, Cambridge CB10 1SA, United Kingdom
| | - ThankGod E Ebenezer
- UniProt, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge CB10 1SD, United Kingdom
| | - Scott V Edwards
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138
- Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138
| | - Olivier Fedrigo
- Laboratory of the Neurogenetics of Language, The Rockefeller University, New York, NY 10065
| | - Paul Flicek
- Wellcome Sanger Institute, Cambridge CB10 1SA, United Kingdom
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge CB10 1SD, United Kingdom
| | - Giulio Formenti
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY 10065
| | - Richard A Gibbs
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030
| | - M Thomas P Gilbert
- GLOBE Institute, University of Copenhagen 1350 Copenhagen, Denmark
- University Museum, Norwegian University of Science and Technology 7491 Trondheim, Norway
| | - Melissa M Goldstein
- Department of Health Policy and Management, George Washington University, Washington, DC 20052
| | - Jennifer Marshall Graves
- School of Life Sciences, La Trobe University, Bundoora, VIC 3086, Australia
- Institute for Applied Ecology, University of Canberra, Bruce, ACT 2617, Australia
| | - Henry T Greely
- Stanford Law School, Stanford University, Stanford, CA 94305
| | - Igor V Grigoriev
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720
- Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720
| | - Kevin J Hackett
- Office of National Programs, US Department of Agriculture, Agricultural Research Service, Beltsville, MD 20705
| | - Neil Hall
- Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, United Kingdom
| | - David Haussler
- Genome Institute, University of California, Santa Cruz, CA 95060
- HHMI, Chevy Chase, MD 20815
| | - Kristofer M Helgen
- Australian Museum Research Institute, Australian Museum, Sydney, NSW 2000, Australia
| | - Carolyn J Hogg
- School of Life and Environmental Sciences, University of Sydney, Sydney, NSW 2006, Australia
| | - Sachiko Isobe
- Department of Frontier Research and Development, Kazusa DNA Research Institute, Chiba 292-0818, Japan
| | | | - Axel Janke
- LOEWE Centre of Translational Biodiversity Genomics, Senckenberg Leibniz Institution for Biodiversity and Earth System Research 60325 Frankfurt am Main, Germany
| | - Erich D Jarvis
- Laboratory of the Neurogenetics of Language, The Rockefeller University, New York, NY 10065
- HHMI, Chevy Chase, MD 20815
| | - Warren E Johnson
- Walter Reed Biosystematics Unit, Smithsonian Institution, Suitland, MD 20746
- Center for Species Survival, Smithsonian Conservation Biology Institute, National Zoological Park, Front Royal, VA 22630
| | - Steven J M Jones
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC V5Z 4S6, Canada
| | - Elinor K Karlsson
- Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605
- Broad Institute of MIT and Harvard, Cambridge, MA 02142
| | - Paul J Kersey
- Royal Botanic Gardens, Kew, Richmond TW9 3AE, United Kingdom
| | - Jin-Hyoung Kim
- Division of Life Sciences, Korea Polar Research Institute 21990 Incheon, South Korea
| | - W John Kress
- Museum of Natural History, Smithsonian Institution, Washington, DC 20013-7012
| | - Shigehiro Kuraku
- Department of Genomics and Evolutionary Biology, National Institute of Genetics 411-8540 Shizuoka, Japan
- Laboratory for Phyloinformatics, RIKEN Center for Biosystems Dynamics Research 650-0047 Hyogo, Japan
| | - Mara K N Lawniczak
- Tree of Life, Wellcome Sanger Institute, Cambridge CB10 1SA, United Kingdom
| | | | - Xueyan Li
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences 650223 Yunnan, China
| | - Kerstin Lindblad-Toh
- Broad Institute of MIT and Harvard, Cambridge, MA 02142
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University 752 36 Uppsala, Sweden
| | - Xin Liu
- BGI-Research, Beijing Genomics Institute-Shenzhen 518083 Shenzhen, China
| | - Jose V Lopez
- Department of Biological Sciences, Halmos College of Arts and Sciences, Nova Southeastern University, Dania Beach, FL 33004
- Guy Harvey Oceanographic Center, Dania Beach, FL 33004
| | - Tomas Marques-Bonet
- Institute of Evolutionary Biology, Pompeu Fabra University, Consejo Superior de Investigaciones Cientificas, Parc de Recerca Biomedica de Barcelona 08003 Barcelona, Spain
- Catalan Institute of Research and Advanced Studies 08010 Barcelona, Spain
- Centre Nacional d'Anàlisi Genòmica, Centre for Genomic Regulation, Barcelona Institute of Science and Technology 08028 Barcelona, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona 08193 Barcelona, Spain
| | - Sophie Mazard
- Bioplatforms Australia, Macquarie University, Sydney, NSW 2109, Australia
| | - Jonna A K Mazet
- One Health Institute, University of California Davis, CA 95616
| | - Camila J Mazzoni
- Berlin Center for Genomics in Biodiversity Research 14195 Berlin, Germany
- Evolutionary Genetics Department, Leibniz Institute for Zoo and Wildlife Research 10315 Berlin, Germany
| | - Eugene W Myers
- Max Planck Institute for Molecular Cell Biology and Genetics 01307 Dresden, Germany
| | - Rachel J O'Neill
- Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269
| | - Sadye Paez
- Laboratory of the Neurogenetics of Language, The Rockefeller University, New York, NY 10065
| | - Hyun Park
- Division of Biotechnology, Korea University 02841 Seoul, Korea
| | - Gene E Robinson
- Department of Entomology, Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801
| | - Cristina Roquet
- Systematics and Evolution of Vascular Plants Associated Unit to Consejo Superior de Investigaciones Cientificas, Departament de Biologia Animal, Biologia Vegetal i Ecologia, Universitat Autònoma de Barcelona 08193 Bellaterra, Spain
- Laboratoire d'Ecologie Alpine, University Grenoble Alpes, University Savoie Mont Blanc, CNRS 38000 Grenoble, France
| | - Oliver A Ryder
- Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, CA 92027
- Division of Biology, Department of Evolution, Behavior, and Ecology, University of California, San Diego, La Jolla, CA 92039
| | - Jamal S M Sabir
- Department of Biological Sciences, Faculty of Science, King Abdulaziz University 21589 Jeddah, Saudi Arabia
- Centre of Excellence in Bionanoscience Research, King Abdulaziz University 21589 Jeddah, Saudi Arabia
| | - H Bradley Shaffer
- La Kretz Center for California Conservation Science, Institute of Environment and Sustainability, University of California, Los Angeles, CA 90024
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA 90095
| | - Timothy M Shank
- Biology Department, Woods Hole Oceanographic Institution, Woods Hole, MA 02543
| | - Jacob S Sherkow
- Department of Entomology, Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801
- College of Law, University of Illinois at Urbana-Champaign, Champaign, IL 61820
| | - Pamela S Soltis
- Florida Museum of Natural History, University of Florida, Gainesville, FL 32611
- Biodiversity Institute, University of Florida, Gainesville, FL 32611
| | - Boping Tang
- Jiangsu Key Laboratory for Bioresources of Saline Soils, Jiangsu Provincial Key Laboratory of Coastal Wetland Bioresources and Environmental Protection, Jiangsu Synthetic Innovation Center for Coastal Bio-agriculture, School of Wetlands, Yancheng Teachers University 224002 Yancheng, China
| | - Leho Tedersoo
- Center of Mycology and Microbiology, University of Tartu 50411 Tartu, Estonia
- College of Science, King Saud University 11451 Riyadh, Saudi Arabia
| | | | - Kun Wang
- School of Ecology and Environment, Northwestern Polytechnical University 710072 Xi'an, China
| | - Xiaofeng Wei
- BGI-Research, Beijing Genomics Institute-Shenzhen 518083 Shenzhen, China
| | - Regina Wetzer
- Research and Collections, Natural History Museum of Los Angeles County, Los Angeles, CA 90007
- Biological Sciences, University of Southern California, Los Angeles, CA 90089
| | - Julia L Wilson
- Wellcome Sanger Institute, Cambridge CB10 1SA, United Kingdom
| | - Xun Xu
- BGI-Research, Beijing Genomics Institute-Shenzhen 518083 Shenzhen, China
| | - Huanming Yang
- BGI-Research, Beijing Genomics Institute-Shenzhen 518083 Shenzhen, China
| | - Anne D Yoder
- Department of Biology, Duke University, Durham, NC 27708
- Duke Center for Genomic and Computational Biology, Duke University, Durham, NC 27708
| | - Guojie Zhang
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences 650223 Yunnan, China
- BGI-Research, Beijing Genomics Institute-Shenzhen 518083 Shenzhen, China
- Villum Center for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen 2100 Copenhagen, Denmark
- China National Genebank, Beijing Genomics Institute 51803 Shenzhen, China
|
38
|
Secomandi S, Spina F, Formenti G, Gallo GR, Caprioli M, Ambrosini R, Riello S. The genome sequence of the European nightjar, Caprimulgus europaeus (Linnaeus, 1758). Wellcome Open Res 2021; 6:332. [PMID: 35028428 PMCID: PMC8729189 DOI: 10.12688/wellcomeopenres.17451.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/02/2021] [Indexed: 11/28/2022] Open
Abstract
We present a genome assembly from an individual female Caprimulgus europaeus (the European nightjar; Chordata; Aves; Caprimulgiformes; Caprimulgidae). The genome sequence is 1,178 megabases in span. The majority of the assembly (99.33%) is scaffolded into 37 chromosomal pseudomolecules, including the W and Z sex chromosomes.
Affiliation(s)
| | - Fernando Spina
- Institute for Environmental Protection and Research (ISPRA), Ozzano dell'Emilia, Italy
| | - Giulio Formenti
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
| | | | - Manuela Caprioli
- Department of Environmental Sciences and Policy, University of Milan, Milan, Italy
| | - Roberto Ambrosini
- Department of Environmental Sciences and Policy, University of Milan, Milan, Italy
| | - Sara Riello
- Riserva Naturale Statale “Isole di Ventotene e S. Stefano”, Ventotene, Italy
| | - Wellcome Sanger Institute Tree of Life programme
- Department of Biosciences, University of Milan, Milan, Italy
- Institute for Environmental Protection and Research (ISPRA), Ozzano dell'Emilia, Italy
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Department of Environmental Sciences and Policy, University of Milan, Milan, Italy
- Riserva Naturale Statale “Isole di Ventotene e S. Stefano”, Ventotene, Italy
| | - Wellcome Sanger Institute Scientific Operations: DNA Pipelines collective
- Department of Biosciences, University of Milan, Milan, Italy
- Institute for Environmental Protection and Research (ISPRA), Ozzano dell'Emilia, Italy
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Department of Environmental Sciences and Policy, University of Milan, Milan, Italy
- Riserva Naturale Statale “Isole di Ventotene e S. Stefano”, Ventotene, Italy
| | - Tree of Life Core Informatics collective
- Department of Biosciences, University of Milan, Milan, Italy
- Institute for Environmental Protection and Research (ISPRA), Ozzano dell'Emilia, Italy
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Department of Environmental Sciences and Policy, University of Milan, Milan, Italy
- Riserva Naturale Statale “Isole di Ventotene e S. Stefano”, Ventotene, Italy
|
39
|
Peart CR, Williams C, Pophaly SD, Neely BA, Gulland FMD, Adams DJ, Ng BL, Cheng W, Goebel ME, Fedrigo O, Haase B, Mountcastle J, Fungtammasan A, Formenti G, Collins J, Wood J, Sims Y, Torrance J, Tracey A, Howe K, Rhie A, Hoffman JI, Johnson J, Jarvis ED, Breen M, Wolf JBW. Hi-C scaffolded short- and long-read genome assemblies of the California sea lion are broadly consistent for syntenic inference across 45 million years of evolution. Mol Ecol Resour 2021; 21:2455-2470. [PMID: 34097816 PMCID: PMC9732816 DOI: 10.1111/1755-0998.13443] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/03/2021] [Revised: 05/06/2021] [Accepted: 05/26/2021] [Indexed: 12/13/2022]
Abstract
With the advent of chromatin-interaction maps, chromosome-level genome assemblies have become a reality for a wide range of organisms. Scaffolding quality is, however, difficult to judge. To explore this gap, we generated multiple chromosome-scale genome assemblies of an emerging wild animal model for carcinogenesis, the California sea lion (Zalophus californianus). Short-read assemblies were scaffolded with two independent chromatin interaction mapping data sets (Hi-C and Chicago), and long-read assemblies with three data types (Hi-C, optical maps and 10X linked reads) following the "Vertebrate Genomes Project (VGP)" pipeline. In both approaches, 18 major scaffolds recovered the karyotype (2n = 36), with scaffold N50s of 138 and 147 Mb, respectively. Synteny relationships at the chromosome level with other pinniped genomes (2n = 32-36), ferret (2n = 34), red panda (2n = 36) and domestic dog (2n = 78) were consistent across approaches and recovered known fissions and fusions. Comparative chromosome painting and multicolour chromosome tiling with a panel of 264 genome-integrated single-locus canine bacterial artificial chromosome probes provided independent evaluation of genome organization. Broad-scale discrepancies between the approaches were observed within chromosomes, most commonly in translocations centred around centromeres and telomeres, which were better resolved in the VGP assembly. Genomic and cytological approaches agreed on near-perfect synteny of the X chromosome, and in combination allowed detailed investigation of autosomal rearrangements between dog and sea lion. This study presents high-quality genomes of an emerging cancer model and highlights that even highly fragmented short-read assemblies scaffolded with Hi-C can yield reliable chromosome-level scaffolds suitable for comparative genomic analyses.
Affiliation(s)
- Claire R. Peart
- Division of Evolutionary Biology, Faculty of Biology, LMU Munich, Munchen, Germany
| | - Christina Williams
- Department of Molecular Biomedical Sciences, College of Veterinary Medicine, North Carolina State University, Raleigh, North Carolina, USA
| | - Saurabh D. Pophaly
- Division of Evolutionary Biology, Faculty of Biology, LMU Munich, Munchen, Germany; Max Planck Institute for Plant Breeding Research, Cologne, Germany
| | - Benjamin A. Neely
- National Institute of Standards and Technology, NIST Charleston, Charleston, South Carolina, USA
| | - Frances M. D. Gulland
- Karen Dryer Wildlife Health Center, University of California Davis, Davis, California, USA
| | - David J. Adams
- Cytometry Core Facility, Wellcome Sanger Institute, Cambridge, UK
| | - Bee Ling Ng
- Cytometry Core Facility, Wellcome Sanger Institute, Cambridge, UK
| | - William Cheng
- Cytometry Core Facility, Wellcome Sanger Institute, Cambridge, UK
| | - Michael E. Goebel
- Institute of Marine Science, University of California Santa Cruz, Santa Cruz, California, USA
| | - Olivier Fedrigo
- Vertebrate Genome Lab, The Rockefeller University, New York City, New York, USA
| | - Bettina Haase
- Vertebrate Genome Lab, The Rockefeller University, New York City, New York, USA
| | | | | | - Giulio Formenti
- Vertebrate Genome Lab, The Rockefeller University, New York City, New York, USA; Laboratory of Neurogenetics of Language, The Rockefeller University, New York City, New York, USA
| | - Joanna Collins
- Tree of Life Programme, Wellcome Sanger Institute, Cambridge, UK
| | - Jonathan Wood
- Tree of Life Programme, Wellcome Sanger Institute, Cambridge, UK
| | - Ying Sims
- Tree of Life Programme, Wellcome Sanger Institute, Cambridge, UK
| | - James Torrance
- Tree of Life Programme, Wellcome Sanger Institute, Cambridge, UK
| | - Alan Tracey
- Tree of Life Programme, Wellcome Sanger Institute, Cambridge, UK
| | - Kerstin Howe
- Tree of Life Programme, Wellcome Sanger Institute, Cambridge, UK
| | - Arang Rhie
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, NIH, Bethesda, Maryland, USA
| | - Joseph I. Hoffman
- Department of Animal Behaviour, Bielefeld University, Bielefeld, Germany; British Antarctic Survey, Cambridge, UK
| | - Jeremy Johnson
- Broad Institute of Harvard and Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts, USA
| | - Erich D. Jarvis
- Vertebrate Genome Lab, The Rockefeller University, New York City, New York, USA; Howard Hughes Medical Institute, Chevy Chase, Maryland, USA
| | - Matthew Breen
- Department of Molecular Biomedical Sciences, College of Veterinary Medicine, North Carolina State University, Raleigh, North Carolina, USA; Comparative Medicine Institute, North Carolina State University, Raleigh, North Carolina, USA
| | - Jochen B. W. Wolf
- Division of Evolutionary Biology, Faculty of Biology, LMU Munich, Munchen, Germany
|
40
|
Dussex N, van der Valk T, Morales HE, Wheat CW, Díez-del-Molino D, von Seth J, Foster Y, Kutschera VE, Guschanski K, Rhie A, Phillippy AM, Korlach J, Howe K, Chow W, Pelan S, Mendes Damas JD, Lewin HA, Hastie AR, Formenti G, Fedrigo O, Guhlin J, Harrop TW, Le Lec MF, Dearden PK, Haggerty L, Martin FJ, Kodali V, Thibaud-Nissen F, Iorns D, Knapp M, Gemmell NJ, Robertson F, Moorhouse R, Digby A, Eason D, Vercoe D, Howard J, Jarvis ED, Robertson BC, Dalén L. Population genomics of the critically endangered kākāpō. Cell Genom 2021; 1:100002. [PMID: 36777713 PMCID: PMC9903828 DOI: 10.1016/j.xgen.2021.100002] [Citation(s) in RCA: 66] [Impact Index Per Article: 22.0] [Received: 10/20/2020] [Revised: 04/23/2021] [Accepted: 06/22/2021] [Indexed: 12/30/2022]
Abstract
The kākāpō is a flightless parrot endemic to New Zealand. Once common in the archipelago, only 201 individuals remain today, most of them descending from an isolated island population. We report the first genome-wide analyses of the species, including a high-quality genome assembly for kākāpō, one of the first chromosome-level reference genomes sequenced by the Vertebrate Genomes Project (VGP). We also sequenced and analyzed 35 modern genomes from the sole surviving island population and 14 genomes from the extinct mainland population. While theory suggests that such a small population is likely to have accumulated deleterious mutations through genetic drift, our analyses on the impact of the long-term small population size in kākāpō indicate that present-day island kākāpō have a reduced number of harmful mutations compared to mainland individuals. We hypothesize that this reduced mutational load is due to the island population having been subjected to a combination of genetic drift and purging of deleterious mutations, through increased inbreeding and purifying selection, since its isolation from the mainland ∼10,000 years ago. Our results provide evidence that small populations can survive even when isolated for hundreds of generations. This work provides key insights into kākāpō breeding and recovery and more generally into the application of genetic tools in conservation efforts for endangered species.
Affiliation(s)
- Nicolas Dussex
- Centre for Palaeogenetics, Svante Arrhenius väg 20C, 10691 Stockholm, Sweden; Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Box 50007, 10405 Stockholm, Sweden; Department of Zoology, Stockholm University, 10691 Stockholm, Sweden; Department of Anatomy, University of Otago, PO Box 913, Dunedin 9016, New Zealand; Corresponding author
| | - Tom van der Valk
- Centre for Palaeogenetics, Svante Arrhenius väg 20C, 10691 Stockholm, Sweden; Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Box 50007, 10405 Stockholm, Sweden
| | - Hernán E. Morales
- Section for Evolutionary Genomics, GLOBE Institute, University of Copenhagen, Copenhagen, Denmark
| | | | - David Díez-del-Molino
- Centre for Palaeogenetics, Svante Arrhenius väg 20C, 10691 Stockholm, Sweden; Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Box 50007, 10405 Stockholm, Sweden
| | - Johanna von Seth
- Centre for Palaeogenetics, Svante Arrhenius väg 20C, 10691 Stockholm, Sweden; Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Box 50007, 10405 Stockholm, Sweden; Department of Zoology, Stockholm University, 10691 Stockholm, Sweden
| | - Yasmin Foster
- Department of Zoology, University of Otago, PO Box 56, Dunedin 9054, New Zealand
| | - Verena E. Kutschera
- Department of Biochemistry and Biophysics, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Stockholm University, Box 1031, 17121 Solna, Sweden
| | - Katerina Guschanski
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK; Department of Ecology and Genetics, Animal Ecology, Uppsala University, 75236 Uppsala, Sweden
| | - Arang Rhie
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Adam M. Phillippy
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Jonas Korlach
- Pacific Biosciences, 1305 O’Brien Drive, Menlo Park, CA 94025, USA
| | - Kerstin Howe
- Wellcome Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
| | - William Chow
- Wellcome Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
| | - Sarah Pelan
- Wellcome Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
| | - Joanna D. Mendes Damas
- Department of Evolution and Ecology and the UC Davis Genome Center, 4321 Genome and Biomedical Sciences Facility, University of California Davis, Davis, CA 95616, USA
| | - Harris A. Lewin
- Department of Evolution and Ecology and the UC Davis Genome Center, 4321 Genome and Biomedical Sciences Facility, University of California Davis, Davis, CA 95616, USA
| | - Alex R. Hastie
- Bionano Genomics, 9540 Towne Centre Drive, San Diego, CA 92121, USA
| | - Giulio Formenti
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY 10065, USA; Laboratory of Neurogenetics of Language, Box 54, The Rockefeller University, New York, NY 10065, USA; Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA
| | - Olivier Fedrigo
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY 10065, USA
| | - Joseph Guhlin
- Genomics Aotearoa and Laboratory for Evolution and Development, Department of Biochemistry, University of Otago, PO Box 56, Dunedin 9016, New Zealand
| | - Thomas W.R. Harrop
- Genomics Aotearoa and Laboratory for Evolution and Development, Department of Biochemistry, University of Otago, PO Box 56, Dunedin 9016, New Zealand
| | - Marissa F. Le Lec
- Genomics Aotearoa and Laboratory for Evolution and Development, Department of Biochemistry, University of Otago, PO Box 56, Dunedin 9016, New Zealand
| | - Peter K. Dearden
- Genomics Aotearoa and Laboratory for Evolution and Development, Department of Biochemistry, University of Otago, PO Box 56, Dunedin 9016, New Zealand
| | - Leanne Haggerty
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Fergal J. Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Vamsi Kodali
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
| | - Françoise Thibaud-Nissen
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
| | - David Iorns
- The Genetic Rescue Foundation, Wellington, New Zealand
| | - Michael Knapp
- Department of Anatomy, University of Otago, PO Box 913, Dunedin 9016, New Zealand
| | - Neil J. Gemmell
- Department of Anatomy, University of Otago, PO Box 913, Dunedin 9016, New Zealand
| | - Fiona Robertson
- Department of Zoology, University of Otago, PO Box 56, Dunedin 9054, New Zealand
| | - Ron Moorhouse
- Kākāpō Recovery, Department of Conservation, PO Box 743, Invercargill 9840, New Zealand
| | - Andrew Digby
- Kākāpō Recovery, Department of Conservation, PO Box 743, Invercargill 9840, New Zealand
| | - Daryl Eason
- Kākāpō Recovery, Department of Conservation, PO Box 743, Invercargill 9840, New Zealand
| | - Deidre Vercoe
- Kākāpō Recovery, Department of Conservation, PO Box 743, Invercargill 9840, New Zealand
| | - Jason Howard
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY 10065, USA; BioSkryb Genomics, 701 W Main Street, Suite 200, Durham, NC 27701, USA
| | - Erich D. Jarvis
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY 10065, USA; Laboratory of Neurogenetics of Language, Box 54, The Rockefeller University, New York, NY 10065, USA; Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA; Corresponding author
| | - Bruce C. Robertson
- Department of Zoology, University of Otago, PO Box 56, Dunedin 9054, New Zealand; Corresponding author
| | - Love Dalén
- Centre for Palaeogenetics, Svante Arrhenius väg 20C, 10691 Stockholm, Sweden; Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Box 50007, 10405 Stockholm, Sweden; Department of Zoology, Stockholm University, 10691 Stockholm, Sweden; Corresponding author
|
41
|
Formenti G, Rhie A, Balacco J, Haase B, Mountcastle J, Fedrigo O, Brown S, Capodiferro MR, Al-Ajli FO, Ambrosini R, Houde P, Koren S, Oliver K, Smith M, Skelton J, Betteridge E, Dolucan J, Corton C, Bista I, Torrance J, Tracey A, Wood J, Uliano-Silva M, Howe K, McCarthy S, Winkler S, Kwak W, Korlach J, Fungtammasan A, Fordham D, Costa V, Mayes S, Chiara M, Horner DS, Myers E, Durbin R, Achilli A, Braun EL, Phillippy AM, Jarvis ED. Complete vertebrate mitogenomes reveal widespread repeats and gene duplications. Genome Biol 2021; 22:120. [PMID: 33910595 PMCID: PMC8082918 DOI: 10.1186/s13059-021-02336-9] [Citation(s) in RCA: 47] [Impact Index Per Article: 15.7] [Received: 08/17/2020] [Accepted: 03/31/2021] [Indexed: 01/22/2023] Open
Abstract
BACKGROUND: Modern sequencing technologies should make the assembly of the relatively small mitochondrial genomes an easy undertaking. However, few tools exist that address mitochondrial assembly directly. RESULTS: As part of the Vertebrate Genomes Project (VGP) we develop mitoVGP, a fully automated pipeline for similarity-based identification of mitochondrial reads and de novo assembly of mitochondrial genomes that incorporates both long (> 10 kbp, PacBio or Nanopore) and short (100-300 bp, Illumina) reads. Our pipeline leads to successful complete mitogenome assemblies of 100 vertebrate species of the VGP. We observe that tissue type and library size selection have considerable impact on mitogenome sequencing and assembly. Comparing our assemblies to purportedly complete reference mitogenomes based on short-read sequencing, we identify errors, missing sequences, and incomplete genes in those references, particularly in repetitive regions. Our assemblies also identify novel gene region duplications. The presence of repeats and duplications in over half of the species herein assembled indicates that their occurrence is a principle of mitochondrial structure rather than an exception, shedding new light on mitochondrial genome evolution and organization. CONCLUSIONS: Our results indicate that even in the "simple" case of vertebrate mitogenomes the completeness of many currently available reference sequences can be further improved, and caution should be exercised before claiming the complete assembly of a mitogenome, particularly from short reads alone.
Affiliation(s)
- Giulio Formenti
- The Vertebrate Genome Lab, Rockefeller University, New York, NY, USA.
- Laboratory of Neurogenetics of Language, Rockefeller University, New York, NY, USA.
- The Howard Hughes Medical Institute, Chevy Chase, MD, USA.
| | - Arang Rhie
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Jennifer Balacco
- The Vertebrate Genome Lab, Rockefeller University, New York, NY, USA
| | - Bettina Haase
- The Vertebrate Genome Lab, Rockefeller University, New York, NY, USA
| | | | - Olivier Fedrigo
- The Vertebrate Genome Lab, Rockefeller University, New York, NY, USA
| | - Samara Brown
- Laboratory of Neurogenetics of Language, Rockefeller University, New York, NY, USA
- The Howard Hughes Medical Institute, Chevy Chase, MD, USA
| | | | - Farooq O Al-Ajli
- Monash University Malaysia Genomics Facility, School of Science, Bandar Sunway, Selangor Darul Ehsan, Malaysia
- Tropical Medicine and Biology Multidisciplinary Platform, Monash University Malaysia, Bandar Sunway, Selangor Darul Ehsan, Malaysia
- Qatar Falcon Genome Project, Doha, State of Qatar
| | - Roberto Ambrosini
- Department of Environmental Science and Policy, University of Milan, Milan, Italy
| | - Peter Houde
- Department of Biology, New Mexico State University, Las Cruces, NM, USA
| | - Sergey Koren
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | | | | | | | | | | | | | - Iliana Bista
- Wellcome Sanger Institute, Cambridge, UK
- Department of Genetics, University of Cambridge, Cambridge, UK
| | | | | | | | | | | | - Shane McCarthy
- Wellcome Sanger Institute, Cambridge, UK
- Department of Genetics, University of Cambridge, Cambridge, UK
| | - Sylke Winkler
- Max Planck Institute of Molecular Cell Biology & Genetics, Dresden, Germany
| | | | | | | | - Daniel Fordham
- Oxford Nanopore Technologies Ltd, Oxford Science Park, Oxford, UK
| | - Vania Costa
- Oxford Nanopore Technologies Ltd, Oxford Science Park, Oxford, UK
| | - Simon Mayes
- Oxford Nanopore Technologies Ltd, Oxford Science Park, Oxford, UK
| | - Matteo Chiara
- Department of Biosciences, University of Milan, Milan, Italy
| | - David S Horner
- Department of Biosciences, University of Milan, Milan, Italy
| | - Eugene Myers
- Max Planck Institute of Molecular Cell Biology & Genetics, Dresden, Germany
| | - Richard Durbin
- Wellcome Sanger Institute, Cambridge, UK
- Department of Genetics, University of Cambridge, Cambridge, UK
| | - Alessandro Achilli
- Department of Biology and Biotechnology "L. Spallanzani", University of Pavia, Pavia, Italy
| | - Edward L Braun
- Department of Biology, University of Florida, Gainesville, FL, USA
| | - Adam M Phillippy
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Erich D Jarvis
- The Vertebrate Genome Lab, Rockefeller University, New York, NY, USA.
- Laboratory of Neurogenetics of Language, Rockefeller University, New York, NY, USA.
- The Howard Hughes Medical Institute, Chevy Chase, MD, USA.
|
42
|
Yang C, Zhou Y, Marcus S, Formenti G, Bergeron LA, Song Z, Bi X, Bergman J, Rousselle MMC, Zhou C, Zhou L, Deng Y, Fang M, Xie D, Zhu Y, Tan S, Mountcastle J, Haase B, Balacco J, Wood J, Chow W, Rhie A, Pippel M, Fabiszak MM, Koren S, Fedrigo O, Freiwald WA, Howe K, Yang H, Phillippy AM, Schierup MH, Jarvis ED, Zhang G. Evolutionary and biomedical insights from a marmoset diploid genome assembly. Nature 2021; 594:227-233. [PMID: 33910227 PMCID: PMC8189906 DOI: 10.1038/s41586-021-03535-x] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Received: 09/20/2020] [Accepted: 04/12/2021] [Indexed: 01/23/2023]
Abstract
The accurate and complete assembly of both haplotype sequences of a diploid organism is essential to understanding the role of variation in genome functions, phenotypes and diseases. Here, using a trio-binning approach, we present a high-quality, diploid reference genome, with both haplotypes assembled independently at the chromosome level, for the common marmoset (Callithrix jacchus), a primate model system that is widely used in biomedical research. The full spectrum of heterozygosity between the two haplotypes involves 1.36% of the genome, much higher than the 0.13% indicated by the standard estimation based on single-nucleotide heterozygosity alone. The de novo mutation rate is 0.43 × 10⁻⁸ per site per generation, and the paternal inherited genome acquired twice as many mutations as the maternal. Our diploid assembly enabled us to discover a recent expansion of the sex-differentiation region and unique evolutionary changes in the marmoset Y chromosome. In addition, we identified many genes with signatures of positive selection that might have contributed to the evolution of Callithrix biological features. Brain-related genes were highly conserved between marmosets and humans, although several genes experienced lineage-specific copy number variations or diversifying selection, with implications for the use of marmosets as a model system.
Affiliation(s)
- Chentao Yang
- BGI-Shenzhen, Shenzhen, China; Villum Centre for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | | | - Stephanie Marcus
- Laboratory of Neurogenetics of Language, The Rockefeller University, New York, NY, USA
| | - Giulio Formenti
- Laboratory of Neurogenetics of Language, The Rockefeller University, New York, NY, USA; Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
| | - Lucie A Bergeron
- Villum Centre for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Zhenzhen Song
- University of the Chinese Academy of Sciences, Beijing, China
| | | | - Juraj Bergman
- Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
| | | | | | | | - Yuan Deng
- BGI-Shenzhen, Shenzhen, China; Villum Centre for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | | | - Duo Xie
- BGI-Shenzhen, Shenzhen, China
| | | | | | | | - Bettina Haase
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
| | - Jennifer Balacco
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
| | | | | | - Arang Rhie
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Martin Pippel
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany; Center for Systems Biology, Dresden, Germany
| | | | - Sergey Koren
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Olivier Fedrigo
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
| | - Winrich A Freiwald
- Laboratory of Neural Systems, The Rockefeller University, New York, NY, USA.,Center for Brains, Minds and Machines (CBMM), The Rockefeller University, New York, NY, USA
| | | | - Huanming Yang
- BGI-Shenzhen, Shenzhen, China.,University of the Chinese Academy of Sciences, Beijing, China.,James D. Watson Institute of Genome Sciences, Hangzhou, China.,Guangdong Provincial Academician Workstation of BGI Synthetic Genomics, BGI-Shenzhen, Shenzhen, China
| | - Adam M Phillippy
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | | | - Erich D Jarvis
- Laboratory of Neurogenetics of Language, The Rockefeller University, New York, NY, USA.,Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA.,Howard Hughes Medical Institute, Chevy Chase, MD, USA
| | - Guojie Zhang
- Villum Centre for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Copenhagen, Denmark.,State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China.,China National GeneBank, BGI-Shenzhen, Shenzhen, China.,Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, China.
|
43
|
Rhie A, McCarthy SA, Fedrigo O, Damas J, Formenti G, Koren S, Uliano-Silva M, Chow W, Fungtammasan A, Kim J, Lee C, Ko BJ, Chaisson M, Gedman GL, Cantin LJ, Thibaud-Nissen F, Haggerty L, Bista I, Smith M, Haase B, Mountcastle J, Winkler S, Paez S, Howard J, Vernes SC, Lama TM, Grutzner F, Warren WC, Balakrishnan CN, Burt D, George JM, Biegler MT, Iorns D, Digby A, Eason D, Robertson B, Edwards T, Wilkinson M, Turner G, Meyer A, Kautt AF, Franchini P, Detrich HW, Svardal H, Wagner M, Naylor GJP, Pippel M, Malinsky M, Mooney M, Simbirsky M, Hannigan BT, Pesout T, Houck M, Misuraca A, Kingan SB, Hall R, Kronenberg Z, Sović I, Dunn C, Ning Z, Hastie A, Lee J, Selvaraj S, Green RE, Putnam NH, Gut I, Ghurye J, Garrison E, Sims Y, Collins J, Pelan S, Torrance J, Tracey A, Wood J, Dagnew RE, Guan D, London SE, Clayton DF, Mello CV, Friedrich SR, Lovell PV, Osipova E, Al-Ajli FO, Secomandi S, Kim H, Theofanopoulou C, Hiller M, Zhou Y, Harris RS, Makova KD, Medvedev P, Hoffman J, Masterson P, Clark K, Martin F, Howe K, Flicek P, Walenz BP, Kwak W, Clawson H, Diekhans M, Nassar L, Paten B, Kraus RHS, Crawford AJ, Gilbert MTP, Zhang G, Venkatesh B, Murphy RW, Koepfli KP, Shapiro B, Johnson WE, Di Palma F, Marques-Bonet T, Teeling EC, Warnow T, Graves JM, Ryder OA, Haussler D, O'Brien SJ, Korlach J, Lewin HA, Howe K, Myers EW, Durbin R, Phillippy AM, Jarvis ED. Towards complete and error-free genome assemblies of all vertebrate species. Nature 2021; 592:737-746. [PMID: 33911273 PMCID: PMC8081667 DOI: 10.1038/s41586-021-03451-0] [Citation(s) in RCA: 591] [Impact Index Per Article: 197.0] [Received: 05/22/2020] [Accepted: 03/12/2021] [Indexed: 02/02/2023]
Abstract
High-quality and complete reference genome assemblies are fundamental for the application of genomics to biology, disease, and biodiversity conservation. However, such assemblies are available for only a few non-microbial species1-4. To address this issue, the international Genome 10K (G10K) consortium5,6 has worked over a five-year period to evaluate and develop cost-effective methods for assembling highly accurate and nearly complete reference genomes. Here we present lessons learned from generating assemblies for 16 species that represent six major vertebrate lineages. We confirm that long-read sequencing technologies are essential for maximizing genome quality, and that unresolved complex repeats and haplotype heterozygosity are major sources of assembly error when not handled correctly. Our assemblies correct substantial errors, add missing sequence in some of the best historical reference genomes, and reveal biological discoveries. These include the identification of many false gene duplications, increases in gene sizes, chromosome rearrangements that are specific to lineages, a repeated independent chromosome breakpoint in bat genomes, and a canonical GC-rich pattern in protein-coding genes and their regulatory regions. Adopting these lessons, we have embarked on the Vertebrate Genomes Project (VGP), an international effort to generate high-quality, complete reference genomes for all of the roughly 70,000 extant vertebrate species and to help to enable a new era of discovery across the life sciences.
Affiliation(s)
- Arang Rhie
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Shane A McCarthy
- Department of Genetics, University of Cambridge, Cambridge, UK
- Wellcome Sanger Institute, Cambridge, UK
| | - Olivier Fedrigo
- Vertebrate Genome Lab, The Rockefeller University, New York, NY, USA
| | - Joana Damas
- The Genome Center, University of California Davis, Davis, CA, USA
| | - Giulio Formenti
- Vertebrate Genome Lab, The Rockefeller University, New York, NY, USA
- Laboratory of Neurogenetics of Language, The Rockefeller University, New York, NY, USA
| | - Sergey Koren
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Marcela Uliano-Silva
- Leibniz Institute for Zoo and Wildlife Research, Department of Evolutionary Genetics, Berlin, Germany
- Berlin Center for Genomics in Biodiversity Research, Berlin, Germany
| | | | | | - Juwan Kim
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
| | - Chul Lee
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
| | - Byung June Ko
- Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, Republic of Korea
| | - Mark Chaisson
- University of Southern California, Los Angeles, CA, USA
| | - Gregory L Gedman
- Laboratory of Neurogenetics of Language, The Rockefeller University, New York, NY, USA
| | - Lindsey J Cantin
- Laboratory of Neurogenetics of Language, The Rockefeller University, New York, NY, USA
| | - Francoise Thibaud-Nissen
- National Center for Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD, USA
| | - Leanne Haggerty
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
| | - Iliana Bista
- Department of Genetics, University of Cambridge, Cambridge, UK
- Wellcome Sanger Institute, Cambridge, UK
| | | | - Bettina Haase
- Vertebrate Genome Lab, The Rockefeller University, New York, NY, USA
| | | | - Sylke Winkler
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
- DRESDEN-concept Genome Center, Dresden, Germany
| | - Sadye Paez
- Vertebrate Genome Lab, The Rockefeller University, New York, NY, USA
- Laboratory of Neurogenetics of Language, The Rockefeller University, New York, NY, USA
| | | | - Sonja C Vernes
- Neurogenetics of Vocal Communication Group, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands
- School of Biology, University of St Andrews, St Andrews, UK
| | - Tanya M Lama
- University of Massachusetts Cooperative Fish and Wildlife Research Unit, Amherst, MA, USA
| | - Frank Grutzner
- School of Biological Science, The Environment Institute, University of Adelaide, Adelaide, South Australia, Australia
| | - Wesley C Warren
- Bond Life Sciences Center, University of Missouri, Columbia, MO, USA
| | | | - Dave Burt
- UQ Genomics, University of Queensland, Brisbane, Queensland, Australia
| | - Julia M George
- Department of Biological Sciences, Clemson University, Clemson, SC, USA
| | - Matthew T Biegler
- Laboratory of Neurogenetics of Language, The Rockefeller University, New York, NY, USA
| | - David Iorns
- The Genetic Rescue Foundation, Wellington, New Zealand
| | - Andrew Digby
- Kākāpō Recovery, Department of Conservation, Invercargill, New Zealand
| | - Daryl Eason
- Kākāpō Recovery, Department of Conservation, Invercargill, New Zealand
| | - Bruce Robertson
- Department of Zoology, University of Otago, Dunedin, New Zealand
| | | | - Mark Wilkinson
- Department of Life Sciences, Natural History Museum, London, UK
| | - George Turner
- School of Natural Sciences, Bangor University, Gwynedd, UK
| | - Axel Meyer
- Department of Biology, University of Konstanz, Konstanz, Germany
| | - Andreas F Kautt
- Department of Biology, University of Konstanz, Konstanz, Germany
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
| | - Paolo Franchini
- Department of Biology, University of Konstanz, Konstanz, Germany
| | - H William Detrich
- Department of Marine and Environmental Sciences, Northeastern University Marine Science Center, Nahant, MA, USA
| | - Hannes Svardal
- Department of Biology, University of Antwerp, Antwerp, Belgium
- Naturalis Biodiversity Center, Leiden, The Netherlands
| | - Maximilian Wagner
- Institute of Biology, Karl-Franzens University of Graz, Graz, Austria
| | - Gavin J P Naylor
- Florida Museum of Natural History, University of Florida, Gainesville, FL, USA
| | - Martin Pippel
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
- Center for Systems Biology, Dresden, Germany
| | - Milan Malinsky
- Wellcome Sanger Institute, Cambridge, UK
- Zoological Institute, University of Basel, Basel, Switzerland
| | | | | | | | - Trevor Pesout
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
| | | | | | | | | | | | - Ivan Sović
- Pacific Biosciences, Menlo Park, CA, USA
- Digital BioLogic, Ivanić-Grad, Croatia
| | | | - Zemin Ning
- Wellcome Sanger Institute, Cambridge, UK
| | | | - Joyce Lee
- Bionano Genomics, San Diego, CA, USA
| | | | - Richard E Green
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
- Dovetail Genomics, Santa Cruz, CA, USA
| | | | - Ivo Gut
- CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology, Barcelona, Spain
- Universitat Pompeu Fabra, Barcelona, Spain
| | - Jay Ghurye
- Dovetail Genomics, Santa Cruz, CA, USA
- Department of Computer Science, University of Maryland College Park, College Park, MD, USA
| | - Erik Garrison
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
| | - Ying Sims
- Wellcome Sanger Institute, Cambridge, UK
| | | | | | | | | | | | | | - Dengfeng Guan
- Department of Genetics, University of Cambridge, Cambridge, UK
- School of Computer Science and Technology, Center for Bioinformatics, Harbin Institute of Technology, Harbin, China
| | - Sarah E London
- Department of Psychology, Institute for Mind and Biology, University of Chicago, Chicago, IL, USA
| | - David F Clayton
- Department of Genetics and Biochemistry, Clemson University, Clemson, SC, USA
| | - Claudio V Mello
- Department of Behavioral Neuroscience, Oregon Health and Science University, Portland, OR, USA
| | - Samantha R Friedrich
- Department of Behavioral Neuroscience, Oregon Health and Science University, Portland, OR, USA
| | - Peter V Lovell
- Department of Behavioral Neuroscience, Oregon Health and Science University, Portland, OR, USA
| | - Ekaterina Osipova
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
- Center for Systems Biology, Dresden, Germany
- Max Planck Institute for the Physics of Complex Systems, Dresden, Germany
| | - Farooq O Al-Ajli
- Monash University Malaysia Genomics Facility, School of Science, Selangor Darul Ehsan, Malaysia
- Tropical Medicine and Biology Multidisciplinary Platform, Monash University Malaysia, Selangor Darul Ehsan, Malaysia
- Qatar Falcon Genome Project, Doha, Qatar
| | | | - Heebal Kim
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
- Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, Republic of Korea
- eGnome, Inc., Seoul, Republic of Korea
| | | | - Michael Hiller
- LOEWE Centre for Translational Biodiversity Genomics, Frankfurt, Germany
- Senckenberg Research Institute, Frankfurt, Germany
- Goethe-University, Faculty of Biosciences, Frankfurt, Germany
| | | | - Robert S Harris
- Department of Biology, Pennsylvania State University, University Park, PA, USA
| | - Kateryna D Makova
- Department of Biology, Pennsylvania State University, University Park, PA, USA
- Center for Medical Genomics, Pennsylvania State University, University Park, PA, USA
- Center for Computational Biology and Bioinformatics, Pennsylvania State University, University Park, PA, USA
| | - Paul Medvedev
- Center for Medical Genomics, Pennsylvania State University, University Park, PA, USA
- Center for Computational Biology and Bioinformatics, Pennsylvania State University, University Park, PA, USA
- Department of Computer Science and Engineering, Pennsylvania State University, University Park, PA, USA
- Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA, USA
| | - Jinna Hoffman
- National Center for Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD, USA
| | - Patrick Masterson
- National Center for Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD, USA
| | - Karen Clark
- National Center for Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD, USA
| | - Fergal Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
| | - Kevin Howe
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
| | - Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
| | - Brian P Walenz
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Woori Kwak
- eGnome, Inc., Seoul, Republic of Korea
- Hoonygen, Seoul, Korea
| | - Hiram Clawson
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
| | - Mark Diekhans
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
| | - Luis Nassar
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
| | - Benedict Paten
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
| | - Robert H S Kraus
- Department of Biology, University of Konstanz, Konstanz, Germany
- Department of Migration, Max Planck Institute of Animal Behavior, Radolfzell, Germany
| | - Andrew J Crawford
- Department of Biological Sciences, Universidad de los Andes, Bogotá, Colombia
| | - M Thomas P Gilbert
- Center for Evolutionary Hologenomics, The GLOBE Institute, University of Copenhagen, Copenhagen, Denmark
- University Museum, NTNU, Trondheim, Norway
| | - Guojie Zhang
- China National Genebank, BGI-Shenzhen, Shenzhen, China
- Villum Center for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, China
| | - Byrappa Venkatesh
- Institute of Molecular and Cell Biology, A*STAR, Biopolis, Singapore, Singapore
| | - Robert W Murphy
- Centre for Biodiversity, Royal Ontario Museum, Toronto, Ontario, Canada
| | - Klaus-Peter Koepfli
- Smithsonian Conservation Biology Institute, Center for Species Survival, National Zoological Park, Washington, DC, USA
| | - Beth Shapiro
- Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, CA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
| | - Warren E Johnson
- Smithsonian Conservation Biology Institute, Center for Species Survival, National Zoological Park, Washington, DC, USA
- The Walter Reed Biosystematics Unit, Museum Support Center MRC-534, Smithsonian Institution, Suitland, MD, USA
- Walter Reed Army Institute of Research, Silver Spring, MD, USA
| | - Federica Di Palma
- Department of Biological Sciences, Earlham Institute, University of East Anglia, Norwich, UK
| | - Tomas Marques-Bonet
- Institute of Evolutionary Biology (UPF-CSIC), PRBB, Barcelona, Spain
- Catalan Institution of Research and Advanced Studies (ICREA), Barcelona, Spain
- Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Barcelona, Spain
| | - Emma C Teeling
- School of Biology and Environmental Science, University College Dublin, Dublin, Ireland
| | - Tandy Warnow
- Department of Computer Science, The University of Illinois at Urbana-Champaign, Urbana, IL, USA
| | | | - Oliver A Ryder
- San Diego Zoo Global, Escondido, CA, USA
- Department of Evolution, Behavior, and Ecology, University of California San Diego, La Jolla, CA, USA
| | - David Haussler
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
- Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, CA, USA
| | - Stephen J O'Brien
- Laboratory of Genomics Diversity-Center for Computer Technologies, ITMO University, St. Petersburg, Russian Federation
- Guy Harvey Oceanographic Center, Halmos College of Natural Sciences and Oceanography, Nova Southeastern University, Fort Lauderdale, FL, USA
| | | | - Harris A Lewin
- The Genome Center, University of California Davis, Davis, CA, USA
- Department of Evolution and Ecology, University of California Davis, Davis, CA, USA
- John Muir Institute for the Environment, University of California Davis, Davis, CA, USA
| | | | - Eugene W Myers
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany.
- Center for Systems Biology, Dresden, Germany.
- Faculty of Computer Science, Technical University Dresden, Dresden, Germany.
| | - Richard Durbin
- Department of Genetics, University of Cambridge, Cambridge, UK.
- Wellcome Sanger Institute, Cambridge, UK.
| | - Adam M Phillippy
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA.
| | - Erich D Jarvis
- Vertebrate Genome Lab, The Rockefeller University, New York, NY, USA.
- Laboratory of Neurogenetics of Language, The Rockefeller University, New York, NY, USA.
- Howard Hughes Medical Institute, Chevy Chase, MD, USA.
|
44
|
Morin PA, Archer FI, Avila CD, Balacco JR, Bukhman YV, Chow W, Fedrigo O, Formenti G, Fronczek JA, Fungtammasan A, Gulland FMD, Haase B, Peter Heide-Jorgensen M, Houck ML, Howe K, Misuraca AC, Mountcastle J, Musser W, Paez S, Pelan S, Phillippy A, Rhie A, Robinson J, Rojas-Bracho L, Rowles TK, Ryder OA, Smith CR, Stevenson S, Taylor BL, Teilmann J, Torrance J, Wells RS, Westgate AJ, Jarvis ED. Reference genome and demographic history of the most endangered marine mammal, the vaquita. Mol Ecol Resour 2020; 21:1008-1020. [PMID: 33089966 PMCID: PMC8247363 DOI: 10.1111/1755-0998.13284] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Received: 06/01/2020] [Revised: 09/08/2020] [Accepted: 10/08/2020] [Indexed: 12/12/2022]
Abstract
The vaquita is the most critically endangered marine mammal, with fewer than 19 remaining in the wild. First described in 1958, the vaquita has been in rapid decline for more than 20 years resulting from inadvertent deaths due to the increasing use of large-mesh gillnets. To understand the evolutionary and demographic history of the vaquita, we used combined long-read sequencing and long-range scaffolding methods with long- and short-read RNA sequencing to generate a near error-free annotated reference genome assembly from cell lines derived from a female individual. The genome assembly consists of 99.92% of the assembled sequence contained in 21 nearly gapless chromosome-length autosome scaffolds and the X-chromosome scaffold, with a scaffold N50 of 115 Mb. Genome-wide heterozygosity is the lowest (0.01%) of any mammalian species analysed to date, but heterozygosity is evenly distributed across the chromosomes, consistent with long-term small population size at genetic equilibrium, rather than low diversity resulting from a recent population bottleneck or inbreeding. Historical demography of the vaquita indicates long-term population stability at less than 5,000 (Ne) for over 200,000 years. Together, these analyses indicate that the vaquita genome has had ample opportunity to purge highly deleterious alleles and potentially maintain diversity necessary for population health.
Affiliation(s)
- Phillip A Morin
- Southwest Fisheries Science Center, National Marine Fisheries Service, NOAA, La Jolla, CA, USA
| | - Frederick I Archer
- Southwest Fisheries Science Center, National Marine Fisheries Service, NOAA, La Jolla, CA, USA
| | - Catherine D Avila
- San Diego Zoo Institute for Conservation Research, Escondido, CA, USA
| | - Jennifer R Balacco
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
| | - Yury V Bukhman
- Regenerative Biology, Morgridge Institute for Research, Madison, WI, USA
| | | | - Olivier Fedrigo
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
| | - Giulio Formenti
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
| | - Julie A Fronczek
- San Diego Zoo Institute for Conservation Research, Escondido, CA, USA
| | | | | | - Bettina Haase
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
| | | | - Marlys L Houck
- San Diego Zoo Institute for Conservation Research, Escondido, CA, USA
| | | | - Ann C Misuraca
- San Diego Zoo Institute for Conservation Research, Escondido, CA, USA
| | | | | | - Sadye Paez
- Laboratory of Neurogenetics of Language, The Rockefeller University, New York, NY, USA
| | | | - Adam Phillippy
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, MD, USA
| | - Arang Rhie
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, MD, USA
| | - Jacqueline Robinson
- Institute for Human Genetics, University of California, San Francisco, CA, USA
| | | | - Teri K Rowles
- Office of Protected Resources, National Marine Fisheries Service, NOAA, Silver Spring, MD, USA
| | - Oliver A Ryder
- San Diego Zoo Institute for Conservation Research, Escondido, CA, USA
| | | | | | - Barbara L Taylor
- Southwest Fisheries Science Center, National Marine Fisheries Service, NOAA, La Jolla, CA, USA
| | - Jonas Teilmann
- Marine Mammal Research, Department of Bioscience, Aarhus University, Roskilde, Denmark
| | | | - Randall S Wells
- Chicago Zoological Society's Sarasota Dolphin Research Program, c/o Mote Marine Laboratory, Sarasota, FL, USA
| | | | - Erich D Jarvis
- Laboratory of Neurogenetics of Language, The Rockefeller University, New York, NY, USA.,Howard Hughes Medical Institute, Chevy Chase, MD, USA
|
45
|
Giani AM, Gallo GR, Gianfranceschi L, Formenti G. Long walk to genomics: History and current approaches to genome sequencing and assembly. Comput Struct Biotechnol J 2019; 18:9-19. [PMID: 31890139 PMCID: PMC6926122 DOI: 10.1016/j.csbj.2019.11.002] [Citation(s) in RCA: 99] [Impact Index Per Article: 19.8] [Received: 08/29/2019] [Revised: 11/03/2019] [Accepted: 11/06/2019] [Indexed: 12/13/2022] Open
Abstract
Genomes represent the starting point of genetic studies. Since the discovery of DNA structure, scientists have devoted great efforts to determine their sequence in an exact way. In this review we provide a comprehensive historical background of the improvements in DNA sequencing technologies that have accompanied the major milestones in genome sequencing and assembly, ranging from early sequencing methods to Next-Generation Sequencing platforms. We then focus on the advantages and challenges of the current technologies and approaches, collectively known as Third Generation Sequencing. As these technical advancements have been accompanied by progress in analytical methods, we also review the bioinformatic tools currently employed in de novo genome assembly, as well as some applications of Third Generation Sequencing technologies and high-quality reference genomes.
Key Words
- BAC, Bacterial Artificial Chromosome
- Bioinformatics
- Genome assembly
- HGP, Human Genome Project
- HMW, high molecular weight
- HapMap, haplotype map
- NGS, Next Generation Sequencing
- Next-generation
- OLC, Overlap-Layout-Consensus
- QV, Quality Value
- Reference
- SBS, Sequencing by Synthesis
- SMRT, Single Molecule Real-Time
- SNPs, Single Nucleotide Polymorphisms
- SRA, Short Read Archive
- SV, Structural Variant
- Sequencing
- TGS, Third Generation Sequencing
- Third-generation
- WGS, Whole Genome Sequencing
- ZMW, Zero-Mode Waveguide
- bp, base pair
- dNTPs, deoxynucleoside triphosphates
- ddNTP, 2′,3′-dideoxynucleoside triphosphate
Affiliation(s)
- Alice Maria Giani
- Department of Surgery, Weill Cornell Medical College, New York, NY, USA
|
46
|
Carnevali I, Riva C, Chiaravalli AM, Sahnane N, Di Lauro E, Viel A, Rovera F, Formenti G, Ghezzi F, Sessa F, Tibiletti MG. Inherited cancer syndromes in 220 Italian ovarian cancer patients. Cancer Genet 2019; 237:55-62. [PMID: 31447066 DOI: 10.1016/j.cancergen.2019.06.005] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 03/25/2019] [Revised: 05/20/2019] [Accepted: 06/10/2019] [Indexed: 01/06/2023]
Abstract
BACKGROUND: A subset of ovarian carcinomas (OCs) is related to inherited conditions including Hereditary Breast and Ovarian Cancers (HBOC) and Lynch Syndrome (LS). The identification of inherited conditions using genetic testing might be a strategic model for cancer prevention that includes benefits for ovarian cancer patients and for their family members. METHODS: We describe a retrospective Italian experience for the identification of inherited conditions in 232 patients affected by OCs using both somatic and germline analyses. RESULTS: Immunohistochemical and microsatellite analyses performed on OCs identified 20 MMR-defective cancers out of 101, and 15 of these were from patients carrying MMR germline pathogenetic variants. BRCA1 and BRCA2 testing offered to 198 OC patients revealed 67 (34%) carriers of pathogenetic variants in the BRCA1/2 genes. Interestingly, LS patients had a mean age of OC onset of 45.4 years, which was significantly lower than the mean age of OC onset in HBOC patients. CONCLUSIONS: Somatic and germline analyses offered to OC patients have proved to be an efficient strategy for the identification of inherited conditions involving OC, even in the absence of suggestive family histories. The identification of LS and HBOC syndromes through OC patients is an effective tool for OC prevention.
Affiliation(s)
- I Carnevali
- Department of Pathology, Ospedale di Circolo, ASST-Sette Laghi, Via O. Rossi 9, 21100 Varese, Italy; Research Center for the Study of Hereditary and Familial Tumors, Department of Medicine and Surgery, University of Insubria, Varese, Italy.
| | - C Riva
- Research Center for the Study of Hereditary and Familial Tumors, Department of Medicine and Surgery, University of Insubria, Varese, Italy; Department of Medicine and Surgery, University of Insubria, Varese, Italy
| | - A M Chiaravalli
- Department of Pathology, Ospedale di Circolo, ASST-Sette Laghi, Via O. Rossi 9, 21100 Varese, Italy; Research Center for the Study of Hereditary and Familial Tumors, Department of Medicine and Surgery, University of Insubria, Varese, Italy
| | - N Sahnane
- Research Center for the Study of Hereditary and Familial Tumors, Department of Medicine and Surgery, University of Insubria, Varese, Italy; Department of Medicine and Surgery, University of Insubria, Varese, Italy
| | - E Di Lauro
- Department of Pathology, Ospedale di Circolo, ASST-Sette Laghi, Via O. Rossi 9, 21100 Varese, Italy
| | - A Viel
- Unit of Functional Onco-genomics and Genetics, Centro di Riferimento Oncologico di Aviano (CRO) IRCCS, Aviano, PN, Italy
| | - F Rovera
- Research Center for the Study of Hereditary and Familial Tumors, Department of Medicine and Surgery, University of Insubria, Varese, Italy; Breast Unit Ospedale di Circolo, ASST Settelaghi, Varese, Italy
| | - G Formenti
- Department of Obstetrics and Gynaecology, Ospedale F.Del Ponte, ASST Settelaghi, Varese, Italy
| | - F Ghezzi
- Research Center for the Study of Hereditary and Familial Tumors, Department of Medicine and Surgery, University of Insubria, Varese, Italy; Department of Obstetrics and Gynaecology, Ospedale F.Del Ponte, ASST Settelaghi, Varese, Italy
| | - F Sessa
- Research Center for the Study of Hereditary and Familial Tumors, Department of Medicine and Surgery, University of Insubria, Varese, Italy; Department of Medicine and Surgery, University of Insubria, Varese, Italy
| | - M G Tibiletti
- Department of Pathology, Ospedale di Circolo, ASST-Sette Laghi, Via O. Rossi 9, 21100 Varese, Italy; Research Center for the Study of Hereditary and Familial Tumors, Department of Medicine and Surgery, University of Insubria, Varese, Italy
|
47
|
Saino N, Albetti B, Ambrosini R, Caprioli M, Costanzo A, Mariani J, Parolini M, Romano A, Rubolini D, Formenti G, Gianfranceschi L, Bollati V. Inter-generational resemblance of methylation levels at circadian genes and associations with phenology in the barn swallow. Sci Rep 2019; 9:6505. [PMID: 31019206 PMCID: PMC6482194 DOI: 10.1038/s41598-019-42798-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 12/18/2018] [Accepted: 03/25/2019] [Indexed: 12/16/2022] Open
Abstract
Regulation of gene expression can occur via epigenetic effects as mediated by DNA methylation. The potential for epigenetic effects to be transmitted across generations, thus modulating phenotypic variation and affecting ecological and evolutionary processes, is increasingly appreciated. However, the study of variation in epigenomes and inter-generational transmission of epigenetic alterations in wild populations is in its infancy. We studied sex- and age-related variation in DNA methylation and parent-offspring resemblance in methylation profiles in barn swallows. We focused on a class of highly conserved ‘clock’ genes (clock, cry1, per2, per3, timeless) relevant to the timing of activities of major ecological importance. In addition, we considerably expanded previous analyses of the relationship between methylation at clock genes and breeding date, a key fitness trait in barn swallows. We found positive assortative mating for methylation at one clock locus. Methylation varied between the nestling and the adult stage, and according to sex. Individuals with relatively high methylation as nestlings also had high methylation levels as adults. Extensive parent-nestling resemblance in methylation levels was observed. The occurrence of extra-pair fertilizations allowed us to disclose evidence hinting at a prevalence of paternal germline or sperm quality effects over common environment effects in generating father-offspring resemblance in methylation. Finally, we found an association between methylation at the clock poly-Q region, but not at other loci, and breeding date. We thus provided evidence for sex-dependent variation and the first account of parent-offspring resemblance in methylation in any wild vertebrate. We also showed that epigenetics may influence phenotypic plasticity of the timing of life cycle events, thus having a major impact on fitness.
Affiliation(s)
- Nicola Saino
- Department of Environmental Science and Policy, University of Milan, via Celoria 26, I-20133, Milan, Italy
- Benedetta Albetti
- Department of Clinical Sciences and Community Health, via S. Barnaba 8, I-20122, Milan, Italy
- Roberto Ambrosini
- Department of Environmental Science and Policy, University of Milan, via Celoria 26, I-20133, Milan, Italy
- Manuela Caprioli
- Department of Environmental Science and Policy, University of Milan, via Celoria 26, I-20133, Milan, Italy
- Alessandra Costanzo
- Department of Environmental Science and Policy, University of Milan, via Celoria 26, I-20133, Milan, Italy
- Jacopo Mariani
- Department of Clinical Sciences and Community Health, via S. Barnaba 8, I-20122, Milan, Italy
- Marco Parolini
- Department of Environmental Science and Policy, University of Milan, via Celoria 26, I-20133, Milan, Italy
- Andrea Romano
- Department of Environmental Science and Policy, University of Milan, via Celoria 26, I-20133, Milan, Italy; Department of Ecology and Evolution, University of Lausanne, Building Biophore, CH-1015, Lausanne, Switzerland
- Diego Rubolini
- Department of Environmental Science and Policy, University of Milan, via Celoria 26, I-20133, Milan, Italy
- Giulio Formenti
- Department of Environmental Science and Policy, University of Milan, via Celoria 26, I-20133, Milan, Italy
- Luca Gianfranceschi
- Department of Biosciences, University of Milan, via Celoria 26, I-20133, Milan, Italy
- Valentina Bollati
- Department of Clinical Sciences and Community Health, via S. Barnaba 8, I-20122, Milan, Italy
|
48
|
Formenti G, Chiara M, Poveda L, Francoijs KJ, Bonisoli-Alquati A, Canova L, Gianfranceschi L, Horner DS, Saino N. SMRT long reads and Direct Label and Stain optical maps allow the generation of a high-quality genome assembly for the European barn swallow (Hirundo rustica rustica). Gigascience 2019; 8:5202456. [PMID: 30496513 PMCID: PMC6324554 DOI: 10.1093/gigascience/giy142] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 07/23/2018] [Accepted: 11/14/2018] [Indexed: 11/12/2022] Open
Abstract
Background: The barn swallow (Hirundo rustica) is a migratory bird that has been the focus of a large number of ecological, behavioral, and genetic studies. To facilitate further population genetics and genomic studies, we present a reference genome assembly for the European subspecies (H. r. rustica). Findings: As part of the Genome10K effort on generating high-quality vertebrate genomes (Vertebrate Genomes Project), we have assembled a highly contiguous genome assembly using single molecule real-time (SMRT) DNA sequencing and several Bionano optical map technologies. We compared and integrated optical maps derived from both the Nick, Label, Repair, and Stain technology and from the Direct Label and Stain (DLS) technology. As proposed by Bionano, DLS more than doubled the scaffold N50 with respect to the nickase. The dual enzyme hybrid scaffold led to a further marginal increase in scaffold N50 and an overall increase of confidence in the scaffolds. After removal of haplotigs, the final assembly is approximately 1.21 Gbp in size, with a scaffold N50 value of more than 25.95 Mbp. Conclusions: This high-quality genome assembly represents a valuable resource for future studies of population genetics and genomics in the barn swallow and for studies concerning the evolution of avian genomes. It also represents one of the very first genomes assembled by combining SMRT long-read sequencing with the new Bionano DLS technology for scaffolding. The quality of this assembly demonstrates the potential of this methodology to substantially increase the contiguity of genome assemblies.
Affiliation(s)
- Giulio Formenti
- Department of Environmental Science and Policy, University of Milan, via Celoria 2, Milan, 20133, Italy
- Matteo Chiara
- Department of Biosciences, University of Milan, via Celoria 26, Milan, 20133, Italy
- Lucy Poveda
- Functional Genomics Center of Zurich, University of Zurich, Winterthurerstrasse 190, Zürich, 8057, Switzerland
- Andrea Bonisoli-Alquati
- Department of Biological Sciences, California State Polytechnic University, 3801 West Temple Avenue, Pomona, California, 91768, USA
- Luca Canova
- Department of Biochemistry, University of Pavia, Via Taramelli 12, Pavia, 27100, Italy
- Luca Gianfranceschi
- Department of Biosciences, University of Milan, via Celoria 26, Milan, 20133, Italy
- David Stephen Horner
- Department of Biosciences, University of Milan, via Celoria 26, Milan, 20133, Italy
- Nicola Saino
- Department of Environmental Science and Policy, University of Milan, via Celoria 2, Milan, 20133, Italy
|