1
Llamas B, Narzisi G, Schneider V, Audano PA, Biederstedt E, Blauvelt L, Bradbury P, Chang X, Chin CS, Fungtammasan A, Clarke WE, Cleary A, Ebler J, Eizenga J, Sibbesen JA, Markello CJ, Garrison E, Garg S, Hickey G, Lazo GR, Lin MF, Mahmoud M, Marschall T, Minkin I, Monlong J, Musunuri RL, Sagayaradj S, Novak AM, Rautiainen M, Regier A, Sedlazeck FJ, Siren J, Souilmi Y, Wagner J, Wrightsman T, Yokoyama TT, Zeng Q, Zook JM, Paten B, Busby B. A strategy for building and using a human reference pangenome. F1000Res 2019; 8:1751. [PMID: 34386196 PMCID: PMC8350888 DOI: 10.12688/f1000research.19630.1] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Accepted: 07/23/2021] [Indexed: 01/27/2024]
Abstract
In March 2019, 45 scientists and software engineers from around the world converged at the University of California, Santa Cruz for the first pangenomics codeathon. The purpose of the meeting was to propose technical specifications and standards for a usable human pangenome as well as to build relevant tools for genome graph infrastructures. During the meeting, the group held several intense and productive discussions covering a diverse set of topics, including advantages of graph genomes over a linear reference representation, design of new methods that can leverage graph-based data structures, and novel visualization and annotation approaches for pangenomes. Additionally, the participants self-organized into teams that worked intensely over a three-day period to build a set of pipelines and tools for specific pangenomic applications. A summary of the questions raised and the tools developed is reported in this manuscript.
Affiliation(s)
- Bastien Llamas
  - Australian Centre for Ancient DNA, School of Biological Sciences, Environment Institute, The University of Adelaide, Adelaide, South Australia, 5005, Australia
- Valerie Schneider
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Peter A. Audano
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, 98195, USA
- Evan Biederstedt
  - Kravis Center for Molecular Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, 02215, USA
- Lon Blauvelt
  - Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Peter Bradbury
  - Robert W. Holley Center, USDA-ARS, Ithaca, NY, 14853, USA
- Xian Chang
  - Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Alan Cleary
  - National Center for Genome Resources, Santa Fe, NM, 87505, USA
- Jana Ebler
  - Max Planck Institute for Informatics, Saarbrücken, Germany
- Jordan Eizenga
  - Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Jonas A. Sibbesen
  - Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Charles J. Markello
  - Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Erik Garrison
  - Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Shilpa Garg
  - Department of Genetics, Harvard Medical School, Boston, MA, 02115, USA
- Glenn Hickey
  - Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Gerard R. Lazo
  - Western Regional Research Center, USDA-ARS, Albany, CA, 94710-1105, USA
- Medhat Mahmoud
  - Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Ilia Minkin
  - Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA, 16802, USA
- Jean Monlong
  - Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Sagayamary Sagayaradj
  - Genome Center, University of California, Davis, Davis, CA, USA
  - BASF, West Sacramento, CA, USA
- Adam M. Novak
  - Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Allison Regier
  - McDonnell Genome Institute, Washington University in St Louis, St Louis, MO, 63108, USA
- Fritz J. Sedlazeck
  - Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Jouni Siren
  - Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Yassine Souilmi
  - Australian Centre for Ancient DNA, School of Biological Sciences, Environment Institute, The University of Adelaide, Adelaide, South Australia, 5005, Australia
- Justin Wagner
  - Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, 20899, USA
- Travis Wrightsman
  - Section of Plant Breeding and Genetics, Cornell University, Ithaca, NY, 14853, USA
- Toshiyuki T. Yokoyama
  - Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
- Qiandong Zeng
  - Laboratory Corporation of America Holdings, Westborough, MA, 01581, USA
- Justin M. Zook
  - Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, 20899, USA
- Benedict Paten
  - Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95064, USA
- Ben Busby
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
2
Llamas B, Narzisi G, Schneider V, Audano PA, Biederstedt E, Blauvelt L, Bradbury P, Chang X, Chin CS, Fungtammasan A, Clarke WE, Cleary A, Ebler J, Eizenga J, Sibbesen JA, Markello CJ, Garrison E, Garg S, Hickey G, Lazo GR, Lin MF, Mahmoud M, Marschall T, Minkin I, Monlong J, Musunuri RL, Sagayaradj S, Novak AM, Rautiainen M, Regier A, Sedlazeck FJ, Siren J, Souilmi Y, Wagner J, Wrightsman T, Yokoyama TT, Zeng Q, Zook JM, Paten B, Busby B. A strategy for building and using a human reference pangenome. F1000Res 2019; 8:1751. [PMID: 34386196 PMCID: PMC8350888 DOI: 10.12688/f1000research.19630.2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Accepted: 07/23/2021] [Indexed: 11/20/2022]