2
Xiao W, Ren L, Chen Z, Fang LT, Zhao Y, Lack J, Guan M, Zhu B, Jaeger E, Kerrigan L, Blomquist TM, Hung T, Sultan M, Idler K, Lu C, Scherer A, Kusko R, Moos M, Xiao C, Sherry ST, Abaan OD, Chen W, Chen X, Nordlund J, Liljedahl U, Maestro R, Polano M, Drabek J, Vojta P, Kõks S, Reimann E, Madala BS, Mercer T, Miller C, Jacob H, Truong T, Moshrefi A, Natarajan A, Granat A, Schroth GP, Kalamegham R, Peters E, Petitjean V, Walton A, Shen TW, Talsania K, Vera CJ, Langenbach K, de Mars M, Hipp JA, Willey JC, Wang J, Shetty J, Kriga Y, Raziuddin A, Tran B, Zheng Y, Yu Y, Cam M, Jailwala P, Nguyen C, Meerzaman D, Chen Q, Yan C, Ernest B, Mehra U, Jensen RV, Jones W, Li JL, Papas BN, Pirooznia M, Chen YC, Seifuddin F, Li Z, Liu X, Resch W, Wang J, Wu L, Yavas G, Miles C, Ning B, Tong W, Mason CE, Donaldson E, Lababidi S, Staudt LM, Tezak Z, Hong H, Wang C, Shi L. Toward best practice in cancer mutation detection with whole-genome and whole-exome sequencing. Nat Biotechnol 2021; 39:1141-1150. [PMID: 34504346 PMCID: PMC8506910 DOI: 10.1038/s41587-021-00994-5] [Citation(s) in RCA: 43] [Impact Index Per Article: 14.3] [Received: 09/26/2018] [Accepted: 06/18/2021] [Indexed: 02/01/2023]
Abstract
Clinical applications of precision oncology require accurate tests that can distinguish true cancer-specific mutations from errors introduced at each step of next-generation sequencing (NGS). To date, no bulk sequencing study has addressed the effects of cross-site reproducibility, nor the biological, technical and computational factors that influence variant identification. Here we report a systematic interrogation of somatic mutations in paired tumor-normal cell lines to identify factors affecting detection reproducibility and accuracy at six different centers. Using whole-genome sequencing (WGS) and whole-exome sequencing (WES), we evaluated the reproducibility of different sample types with varying input amount and tumor purity, and multiple library construction protocols, followed by processing with nine bioinformatics pipelines. We found that read coverage and callers affected both WGS and WES reproducibility, but WES performance was influenced by insert fragment size, genomic copy content and the global imbalance value (GIV; G > T/C > A). Finally, taking into account library preparation protocol, tumor content, read coverage and bioinformatics processes concomitantly, we recommend actionable practices to improve the reproducibility and accuracy of NGS experiments for cancer mutation detection.
Affiliation(s)
- Wenming Xiao
  - The Center for Devices and Radiological Health, US Food and Drug Administration, Silver Spring, MD, USA
- Luyao Ren
  - State Key Laboratory of Genetic Engineering, Human Phenome Institute, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, China
- Zhong Chen
  - Center for Genomics, Loma Linda University School of Medicine, Loma Linda, CA, USA
- Li Tai Fang
  - Bioinformatics Research & Early Development, Roche Sequencing Solutions Inc., Belmont, CA, USA
- Yongmei Zhao
  - Advanced Biomedical and Computational Sciences, Biomedical Informatics and Data Science Directorate, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
- Justin Lack
  - Advanced Biomedical and Computational Sciences, Biomedical Informatics and Data Science Directorate, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
- Bin Zhu
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Rockville, MD, USA
- Thomas M Blomquist
  - Departments of Medicine and Pathology, University of Toledo Medical Center, Toledo, OH, USA
- Marc Sultan
  - Biomarker Development, Novartis Institutes for Biomedical Research, Basel, Switzerland
- Kenneth Idler
  - Computational Genomics, Genomics Research Center, AbbVie, North Chicago, IL, USA
- Charles Lu
  - Computational Genomics, Genomics Research Center, AbbVie, North Chicago, IL, USA
- Andreas Scherer
  - Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland
  - European Infrastructure for Translational Medicine, Amsterdam, the Netherlands
- Malcolm Moos
  - The Center for Biologics Evaluation and Research, US Food and Drug Administration, Silver Spring, MD, USA
- Chunlin Xiao
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Stephen T Sherry
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Ogan D Abaan
  - Illumina Inc., Foster City, CA, USA
  - Seven Bridges Genomics Inc., Cambridge, MA, USA
- Wanqiu Chen
  - Center for Genomics, Loma Linda University School of Medicine, Loma Linda, CA, USA
- Xin Chen
  - Center for Genomics, Loma Linda University School of Medicine, Loma Linda, CA, USA
- Jessica Nordlund
  - European Infrastructure for Translational Medicine, Amsterdam, the Netherlands
  - Department of Medical Sciences, Molecular Medicine and Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Ulrika Liljedahl
  - European Infrastructure for Translational Medicine, Amsterdam, the Netherlands
  - Centro di Riferimento Oncologico di Aviano IRCCS, National Cancer Institute, Unit of Oncogenetics and Functional Oncogenomics, Aviano, Italy
- Roberta Maestro
  - European Infrastructure for Translational Medicine, Amsterdam, the Netherlands
  - Centro di Riferimento Oncologico di Aviano IRCCS, National Cancer Institute, Unit of Oncogenetics and Functional Oncogenomics, Aviano, Italy
- Maurizio Polano
  - European Infrastructure for Translational Medicine, Amsterdam, the Netherlands
  - Centro di Riferimento Oncologico di Aviano IRCCS, National Cancer Institute, Unit of Oncogenetics and Functional Oncogenomics, Aviano, Italy
- Jiri Drabek
  - European Infrastructure for Translational Medicine, Amsterdam, the Netherlands
  - IMTM, Faculty of Medicine and Dentistry, Palacky University Olomouc, Olomouc, Czech Republic
- Petr Vojta
  - European Infrastructure for Translational Medicine, Amsterdam, the Netherlands
  - IMTM, Faculty of Medicine and Dentistry, Palacky University Olomouc, Olomouc, Czech Republic
- Sulev Kõks
  - European Infrastructure for Translational Medicine, Amsterdam, the Netherlands
  - Perron Institute for Neurological and Translational Science, Nedlands, Perth, Western Australia, Australia
  - Centre for Molecular Medicine and Innovative Therapeutics, Murdoch University, Murdoch, Perth, Western Australia, Australia
- Ene Reimann
  - European Infrastructure for Translational Medicine, Amsterdam, the Netherlands
  - Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia
- Bindu Swapna Madala
  - Garvan Institute of Medical Research, The Kinghorn Cancer Centre, Darlinghurst, New South Wales, Australia
- Timothy Mercer
  - Garvan Institute of Medical Research, The Kinghorn Cancer Centre, Darlinghurst, New South Wales, Australia
- Chris Miller
  - Computational Genomics, Genomics Research Center, AbbVie, North Chicago, IL, USA
- Howard Jacob
  - Computational Genomics, Genomics Research Center, AbbVie, North Chicago, IL, USA
- Virginie Petitjean
  - Biomarker Development, Novartis Institutes for Biomedical Research, Basel, Switzerland
- Ashley Walton
  - Advanced Biomedical and Computational Sciences, Biomedical Informatics and Data Science Directorate, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
- Tsai-Wei Shen
  - Advanced Biomedical and Computational Sciences, Biomedical Informatics and Data Science Directorate, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
- Keyur Talsania
  - Advanced Biomedical and Computational Sciences, Biomedical Informatics and Data Science Directorate, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
- Cristobal Juan Vera
  - Advanced Biomedical and Computational Sciences, Biomedical Informatics and Data Science Directorate, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
- Jennifer A Hipp
  - Departments of Medicine and Pathology, University of Toledo Medical Center, Toledo, OH, USA
- James C Willey
  - Departments of Medicine and Pathology, University of Toledo Medical Center, Toledo, OH, USA
- Jing Wang
  - National Institute of Metrology, Beijing, China
- Jyoti Shetty
  - Sequencing Facility, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
- Yuliya Kriga
  - Sequencing Facility, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
- Arati Raziuddin
  - Sequencing Facility, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
- Bao Tran
  - Sequencing Facility, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
- Yuanting Zheng
  - State Key Laboratory of Genetic Engineering, Human Phenome Institute, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, China
- Ying Yu
  - State Key Laboratory of Genetic Engineering, Human Phenome Institute, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, China
- Margaret Cam
  - CCR Collaborative Bioinformatics Resource, Office of Science and Technology Resources, Center for Cancer Research, Bethesda, MD, USA
- Parthav Jailwala
  - CCR Collaborative Bioinformatics Resource, Office of Science and Technology Resources, Center for Cancer Research, Bethesda, MD, USA
- Cu Nguyen
  - Computational Genomics and Bioinformatics Branch, Center for Biomedical Informatics and Information Technology, National Cancer Institute, Rockville, MD, USA
- Daoud Meerzaman
  - Computational Genomics and Bioinformatics Branch, Center for Biomedical Informatics and Information Technology, National Cancer Institute, Rockville, MD, USA
- Qingrong Chen
  - Computational Genomics and Bioinformatics Branch, Center for Biomedical Informatics and Information Technology, National Cancer Institute, Rockville, MD, USA
- Chunhua Yan
  - Computational Genomics and Bioinformatics Branch, Center for Biomedical Informatics and Information Technology, National Cancer Institute, Rockville, MD, USA
- Roderick V Jensen
  - Department of Biological Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA
- Jian-Liang Li
  - Integrative Bioinformatics, National Institute of Environmental Health Sciences, Durham, NC, USA
- Brian N Papas
  - Integrative Bioinformatics, National Institute of Environmental Health Sciences, Durham, NC, USA
- Mehdi Pirooznia
  - Bioinformatics and Computational Biology Core, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, MD, USA
- Yun-Ching Chen
  - Bioinformatics and Computational Biology Core, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, MD, USA
- Fayaz Seifuddin
  - Bioinformatics and Computational Biology Core, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, MD, USA
- Zhipan Li
  - Sentieon Inc., Mountain View, CA, USA
- Xuelu Liu
  - Center for Information Technology, National Institutes of Health, Bethesda, MD, USA
- Wolfgang Resch
  - Center for Information Technology, National Institutes of Health, Bethesda, MD, USA
- Leihong Wu
  - National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
- Gokhan Yavas
  - National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
- Corey Miles
  - National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
- Baitang Ning
  - National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
- Weida Tong
  - National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
- Christopher E Mason
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
- Eric Donaldson
  - The Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD, USA
- Samir Lababidi
  - Office of the Chief Scientist, Office of the Commissioner, US Food and Drug Administration, Silver Spring, MD, USA
- Louis M Staudt
  - Lymphoid Malignancies Branch, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Zivana Tezak
  - The Center for Devices and Radiological Health, US Food and Drug Administration, Silver Spring, MD, USA
- Huixiao Hong
  - National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
- Charles Wang
  - Center for Genomics, Loma Linda University School of Medicine, Loma Linda, CA, USA
- Leming Shi
  - State Key Laboratory of Genetic Engineering, Human Phenome Institute, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, China
3
Zhang F, Christiansen L, Thomas J, Pokholok D, Jackson R, Morrell N, Zhao Y, Wiley M, Welch E, Jaeger E, Granat A, Norberg SJ, Halpern A, Rogert MC, Ronaghi M, Shendure J, Gormley N, Gunderson KL, Steemers FJ. Haplotype phasing of whole human genomes using bead-based barcode partitioning in a single tube. Nat Biotechnol 2017. [PMID: 28650462 DOI: 10.1038/nbt.3897] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Indexed: 01/28/2023]
Abstract
Haplotype-resolved genome sequencing promises to unlock a wealth of information in population and medical genetics. However, for the vast majority of genomes sequenced to date, haplotypes have not been determined because of cumbersome haplotyping workflows that require fractions of the genome to be sequenced in a large number of compartments. Here we demonstrate barcode partitioning of long DNA molecules in a single compartment using "on-bead" barcoded tagmentation. The key to the method that we call "contiguity preserving transposition" sequencing on beads (CPTv2-seq) is transposon-mediated transfer of homogenous populations of barcodes from beads to individual long DNA molecules that get fragmented at the same time (tagmentation). These are then processed to sequencing libraries wherein all sequencing reads originating from each long DNA molecule share a common barcode. Single-tube, bulk processing of long DNA molecules with ∼150,000 different barcoded bead types provides a barcode-linked read structure that reveals long-range molecular contiguity. This technology provides a simple, rapid, plate-scalable and automatable route to accurate, haplotype-resolved sequencing, and phasing of structural variants of the genome.
Affiliation(s)
- Fan Zhang
  - Advanced Research Department, Illumina, San Diego, California, USA
- Jerushah Thomas
  - Advanced Research Department, Illumina, San Diego, California, USA
- Dmitry Pokholok
  - Advanced Research Department, Illumina, San Diego, California, USA
- Ros Jackson
  - Technology Development Department, Illumina, Little Chesterford, Essex, UK
- Natalie Morrell
  - Technology Development Department, Illumina, Little Chesterford, Essex, UK
- Yannan Zhao
  - Technology Development, Illumina, San Diego, California, USA
- Melissa Wiley
  - Technology Development, Illumina, San Diego, California, USA
- Emily Welch
  - Technology Development, Illumina, San Diego, California, USA
- Erich Jaeger
  - Gene Expression Department, Illumina, San Francisco, California, USA
- Ana Granat
  - Gene Expression Department, Illumina, San Francisco, California, USA
- Steven J Norberg
  - Advanced Research Department, Illumina, San Diego, California, USA
- Aaron Halpern
  - Gene Expression Department, Illumina, San Francisco, California, USA
- Maria C Rogert
  - Technology Development, Illumina, San Diego, California, USA
- Mostafa Ronaghi
  - Advanced Research Department, Illumina, San Diego, California, USA
- Jay Shendure
  - Department of Genome Sciences, University of Washington, Seattle, Washington, USA
- Niall Gormley
  - Technology Development Department, Illumina, Little Chesterford, Essex, UK
- Frank J Steemers
  - Advanced Research Department, Illumina, San Diego, California, USA