Hunt M, Hinrichs AS, Anderson D, Karim L, Dearlove BL, Knaggs J, Constantinides B, Fowler PW, Rodger G, Street T, Lumley S, Webster H, Sanderson T, Ruis C, Kotzen B, de Maio N, Amenga-Etego LN, Amuzu DSY, Avaro M, Awandare GA, Ayivor-Djanie R, Barkham T, Bashton M, Batty EM, Bediako Y, De Belder D, Benedetti E, Bergthaler A, Boers SA, Campos J, Carr RAA, Chen YYC, Cuba F, Dattero ME, Dejnirattisai W, Dilthey A, Duedu KO, Endler L, Engelmann I, Francisco NM, Fuchs J, Gnimpieba EZ, Groc S, Gyamfi J, Heemskerk D, Houwaart T, Hsiao NY, Huska M, Hölzer M, Iranzadeh A, Jarva H, Jeewandara C, Jolly B, Joseph R, Kant R, Ki KKK, Kurkela S, Lappalainen M, Lataretu M, Lemieux J, Liu C, Malavige GN, Mashe T, Mongkolsapaya J, Montes B, Mora JAM, Morang'a CM, Mvula B, Nagarajan N, Nelson A, Ngoi JM, da Paixão JP, Panning M, Poklepovich T, Quashie PK, Ranasinghe D, Russo M, San JE, Sanderson ND, Scaria V, Screaton G, Sessions OM, Sironen T, Sisay A, Smith D, Smura T, Supasa P, Suphavilai C, Swann J, Tegally H, Tegomoh B, Vapalahti O, Walker A, Wilkinson RJ, Williamson C, Zair X, de Oliveira T, Peto TE, Crook D, Corbett-Detig R, Iqbal Z. Addressing pandemic-wide systematic errors in the SARS-CoV-2 phylogeny. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.04.29.591666. [PMID: 38746185 PMCID: PMC11092452 DOI: 10.1101/2024.04.29.591666] [Indexed: 05/16/2024]
Abstract
The SARS-CoV-2 genome occupies a unique place in infection biology - it is the most highly sequenced genome on earth (making up over 20% of public sequencing datasets) with fine scale information on sampling date and geography, and has been subject to unprecedentedly intense analysis. As a result, these phylogenetic data are an incredibly valuable resource for science and public health. However, the vast majority of the data was sequenced by tiling amplicons across the full genome, with amplicon schemes that changed over the pandemic as mutations in the viral genome interacted with primer binding sites. In combination with the disparate set of genome assembly workflows and lack of consistent quality control (QC) processes, the current genomes have many systematic errors that have evolved with the virus and amplicon schemes. These errors have significant impacts on the phylogeny, and therefore over the last few years, many thousands of hours of researchers' time have been spent in "eyeballing" trees, looking for artefacts, and then patching the tree. Given the huge value of this dataset, we therefore set out to reprocess the complete set of public raw sequence data in a rigorous amplicon-aware manner, and build a cleaner phylogeny. Here we provide a global tree of 4,471,579 samples, built from a consistently assembled set of high quality consensus sequences from all available public data as of June 2024, viewable at https://viridian.taxonium.org. Each genome was constructed using a novel assembly tool called Viridian (https://github.com/iqbal-lab-org/viridian), developed specifically to process amplicon sequence data, which eliminates artefactual errors and masks the genome at low quality positions. We provide simulation and empirical validation of the methodology, and quantify the improvement in the phylogeny. We hope the tree, consensus sequences and Viridian will be a valuable resource for researchers.
Affiliation(s)
- Martin Hunt
  - European Molecular Biology Laboratory - European Bioinformatics Institute, Hinxton, UK
  - Nuffield Department of Medicine, University of Oxford, Oxford, UK
  - National Institute of Health Research Oxford Biomedical Research Centre, John Radcliffe Hospital, Headley Way, Oxford, UK
  - Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance, University of Oxford, Oxford, UK
- Angie S Hinrichs
  - Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA
- Daniel Anderson
  - European Molecular Biology Laboratory - European Bioinformatics Institute, Hinxton, UK
- Lily Karim
  - Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA
  - Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA
- Bethany L Dearlove
  - Institute for Hygiene and Applied Immunology, Center for Pathophysiology, Infectiology and Immunology, Medical University of Vienna, Vienna 1090, Austria
- Jeff Knaggs
  - European Molecular Biology Laboratory - European Bioinformatics Institute, Hinxton, UK
  - Nuffield Department of Medicine, University of Oxford, Oxford, UK
  - National Institute of Health Research Oxford Biomedical Research Centre, John Radcliffe Hospital, Headley Way, Oxford, UK
  - Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance, University of Oxford, Oxford, UK
- Bede Constantinides
  - Nuffield Department of Medicine, University of Oxford, Oxford, UK
  - Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance, University of Oxford, Oxford, UK
- Philip W Fowler
  - Nuffield Department of Medicine, University of Oxford, Oxford, UK
  - National Institute of Health Research Oxford Biomedical Research Centre, John Radcliffe Hospital, Headley Way, Oxford, UK
  - Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance, University of Oxford, Oxford, UK
- Gillian Rodger
  - Nuffield Department of Medicine, University of Oxford, Oxford, UK
  - Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance, University of Oxford, Oxford, UK
- Teresa Street
  - Nuffield Department of Medicine, University of Oxford, Oxford, UK
  - National Institute of Health Research Oxford Biomedical Research Centre, John Radcliffe Hospital, Headley Way, Oxford, UK
- Sheila Lumley
  - Nuffield Department of Medicine, University of Oxford, Oxford, UK
  - Department of Infectious Diseases and Microbiology, John Radcliffe Hospital, Oxford, UK
- Hermione Webster
  - Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Christopher Ruis
  - Victor Phillip Dahdaleh Heart & Lung Research Institute, University of Cambridge, Cambridge, UK
  - Department of Veterinary Medicine, University of Cambridge, Cambridge, UK
- Benjamin Kotzen
  - Department of Infectious Diseases, Massachusetts General Hospital, Boston, Massachusetts, USA
- Nicola de Maio
  - European Molecular Biology Laboratory - European Bioinformatics Institute, Hinxton, UK
- Lucas N Amenga-Etego
  - West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), University of Ghana, Accra, Ghana
- Dominic S Y Amuzu
  - West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), University of Ghana, Accra, Ghana
- Martin Avaro
  - Servicio de Virus Respiratorios, Instituto Nacional Enfermedades Infecciosas, ANLIS "Dr. Carlos G. Malbrán", Buenos Aires, Argentina
- Gordon A Awandare
  - West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), University of Ghana, Accra, Ghana
- Reuben Ayivor-Djanie
  - Laboratory for Medical Biotechnology and Biomanufacturing, International Centre for Genetic Engineering and Biotechnology, Trieste, Italy
  - Department of Biomedical Sciences, University of Health and Allied Sciences, Ho, Ghana
- Matthew Bashton
  - The Hub for Biotechnology in the Built Environment, Department of Applied Sciences, Faculty of Health and Life Sciences, Northumbria University, Newcastle upon Tyne, NE1 8ST, UK
- Elizabeth M Batty
  - Centre for Tropical Medicine and Global Health, Nuffield Department of Medicine, University of Oxford, Oxford, UK
  - Mahidol-Oxford Tropical Medicine Research Unit, Bangkok, Thailand
- Yaw Bediako
  - West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), University of Ghana, Accra, Ghana
- Denise De Belder
  - Unidad Operativa Centro Nacional de Genómica y Bioinformática, ANLIS "Dr. Carlos G. Malbrán", Buenos Aires, Argentina
- Estefania Benedetti
  - Servicio de Virus Respiratorios, Instituto Nacional Enfermedades Infecciosas, ANLIS "Dr. Carlos G. Malbrán", Buenos Aires, Argentina
- Andreas Bergthaler
  - Institute for Hygiene and Applied Immunology, Center for Pathophysiology, Infectiology and Immunology, Medical University of Vienna, Vienna 1090, Austria
- Stefan A Boers
  - Dept. Medical Microbiology, Leiden University Medical Center, Albinusdreef 2, 2333 ZA, Leiden, The Netherlands
- Josefina Campos
  - Unidad Operativa Centro Nacional de Genómica y Bioinformática, ANLIS "Dr. Carlos G. Malbrán", Buenos Aires, Argentina
- Rosina Afua Ampomah Carr
  - Department of Biomedical Sciences, University of Health and Allied Sciences, Ho, Ghana
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Facundo Cuba
  - Unidad Operativa Centro Nacional de Genómica y Bioinformática, ANLIS "Dr. Carlos G. Malbrán", Buenos Aires, Argentina
- Maria Elena Dattero
  - Servicio de Virus Respiratorios, Instituto Nacional Enfermedades Infecciosas, ANLIS "Dr. Carlos G. Malbrán", Buenos Aires, Argentina
- Wanwisa Dejnirattisai
  - Division of Emerging Infectious Disease, Research Department, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkoknoi, Bangkok 10700, Thailand
- Alexander Dilthey
  - Institute of Medical Microbiology and Hospital Hygiene, University Hospital Düsseldorf, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Kwabena Obeng Duedu
  - Department of Biomedical Sciences, University of Health and Allied Sciences, Ho, Ghana
  - College of Life Sciences, Birmingham City University, Birmingham, UK
- Lukas Endler
  - Institute for Hygiene and Applied Immunology, Center for Pathophysiology, Infectiology and Immunology, Medical University of Vienna, Vienna 1090, Austria
- Ilka Engelmann
  - Pathogenesis and Control of Chronic and Emerging Infections, Univ Montpellier, INSERM, Etablissement Français du Sang, Virology Laboratory, CHU Montpellier, Montpellier, France
- Ngiambudulu M Francisco
  - Grupo de Investigação Microbiana e Imunológica, Instituto Nacional de Investigação em Saúde (National Institute for Health Research), Luanda, Angola
- Jonas Fuchs
  - Institute of Virology, Freiburg University Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Etienne Z Gnimpieba
  - Biomedical Engineering Department, University of South Dakota, Sioux Falls, SD 57107
- Soraya Groc
  - Virology Laboratory, CHU Montpellier, Montpellier, France
- Jones Gyamfi
  - Department of Biomedical Sciences, University of Health and Allied Sciences, Ho, Ghana
  - School of Health and Life Sciences, Teesside University, Middlesbrough, UK
- Dennis Heemskerk
  - Dept. Medical Microbiology, Leiden University Medical Center, Albinusdreef 2, 2333 ZA, Leiden, The Netherlands
- Torsten Houwaart
  - Institute of Medical Microbiology and Hospital Hygiene, University Hospital Düsseldorf, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Nei-Yuan Hsiao
  - Division of Medical Virology, University of Cape Town and National Health Laboratory Service
- Matthew Huska
  - Genome Competence Center (MF1), Robert Koch Institute, Nordufer 20, 13353 Berlin, Germany
- Martin Hölzer
  - Genome Competence Center (MF1), Robert Koch Institute, Nordufer 20, 13353 Berlin, Germany
- Hanna Jarva
  - HUS Diagnostic Center, Clinical Microbiology, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
- Chandima Jeewandara
  - Allergy Immunology and Cell Biology Unit, Department of Immunology and Molecular Medicine, University of Sri Jayewardenepura, Nugegoda, Sri Lanka
- Bani Jolly
  - Karkinos Healthcare Private Limited (KHPL), Aurbis Business Parks, Bellandur, Bengaluru, Karnataka, 560103, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh, India
- Ravi Kant
  - Department of Veterinary Biosciences, University of Helsinki, 00014 Helsinki, Finland
  - Department of Virology, University of Helsinki, 00014 Helsinki, Finland
  - Department of Tropical Parasitology, Institute of Maritime and Tropical Medicine, Medical University of Gdansk, 81-519 Gdynia, Poland
- Satu Kurkela
  - HUS Diagnostic Center, Clinical Microbiology, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
- Maija Lappalainen
  - HUS Diagnostic Center, Clinical Microbiology, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
- Marie Lataretu
  - Genome Competence Center (MF1), Robert Koch Institute, Nordufer 20, 13353 Berlin, Germany
- Jacob Lemieux
  - Department of Infectious Diseases, Massachusetts General Hospital, Boston, Massachusetts, USA
- Chang Liu
  - Chinese Academy of Medical Science (CAMS) Oxford Institute (COI), University of Oxford, Oxford, UK
  - Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Gathsaurie Neelika Malavige
  - Allergy Immunology and Cell Biology Unit, Department of Immunology and Molecular Medicine, University of Sri Jayewardenepura, Nugegoda, Sri Lanka
- Tapfumanei Mashe
  - Health System Strengthening Unit, World Health Organisation, Harare, Zimbabwe
- Juthathip Mongkolsapaya
  - Mahidol-Oxford Tropical Medicine Research Unit, Bangkok, Thailand
  - Chinese Academy of Medical Science (CAMS) Oxford Institute (COI), University of Oxford, Oxford, UK
  - Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Jose Arturo Molina Mora
  - Centro de investigación en Enfermedades Tropicales & Facultad de Microbiología, Universidad de Costa Rica, Costa Rica
- Collins M Morang'a
  - West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), University of Ghana, Accra, Ghana
- Bernard Mvula
  - Public Health Institute of Malawi, Ministry of Health, Malawi
- Niranjan Nagarajan
  - Genome Institute of Singapore, Agency for Science, Technology and Research (A*STAR), Singapore
  - Yong Loo Lin School of Medicine, National University of Singapore, Singapore
- Andrew Nelson
  - Department of Applied Sciences, Faculty of Health and Life Sciences, Northumbria University, Newcastle upon Tyne, NE1 8ST, UK
- Joyce M Ngoi
  - West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), University of Ghana, Accra, Ghana
- Joana Paula da Paixão
  - Grupo de Investigação Microbiana e Imunológica, Instituto Nacional de Investigação em Saúde (National Institute for Health Research), Luanda, Angola
- Marcus Panning
  - Institute of Virology, Freiburg University Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Tomas Poklepovich
  - Unidad Operativa Centro Nacional de Genómica y Bioinformática, ANLIS "Dr. Carlos G. Malbrán", Buenos Aires, Argentina
- Peter K Quashie
  - West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), University of Ghana, Accra, Ghana
- Diyanath Ranasinghe
  - Allergy Immunology and Cell Biology Unit, Department of Immunology and Molecular Medicine, University of Sri Jayewardenepura, Nugegoda, Sri Lanka
- Mara Russo
  - Servicio de Virus Respiratorios, Instituto Nacional Enfermedades Infecciosas, ANLIS "Dr. Carlos G. Malbrán", Buenos Aires, Argentina
- James Emmanuel San
  - Duke Human Vaccine Institute, Duke University, Durham, NC 27710
  - University of KwaZulu Natal, Durban, South Africa, 4001
- Nicholas D Sanderson
  - Nuffield Department of Medicine, University of Oxford, Oxford, UK
  - National Institute of Health Research Oxford Biomedical Research Centre, John Radcliffe Hospital, Headley Way, Oxford, UK
- Vinod Scaria
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh, India
  - Vishwanath Cancer Care Foundation (VCCF), Neelkanth Business Park Kirol Village, West Mumbai, Maharashtra, 400086, India
- Gavin Screaton
  - Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Tarja Sironen
  - Department of Veterinary Biosciences, University of Helsinki, 00014 Helsinki, Finland
  - Department of Virology, University of Helsinki, 00014 Helsinki, Finland
- Abay Sisay
  - Department of Medical Laboratory Sciences, College of Health Sciences, Addis Ababa University, P.O.Box 1176, Addis Ababa, Ethiopia
- Darren Smith
  - The Hub for Biotechnology in the Built Environment, Department of Applied Sciences, Faculty of Health and Life Sciences, Northumbria University, Newcastle upon Tyne, NE1 8ST, UK
- Teemu Smura
  - Department of Veterinary Biosciences, University of Helsinki, 00014 Helsinki, Finland
  - Department of Virology, University of Helsinki, 00014 Helsinki, Finland
- Piyada Supasa
  - Chinese Academy of Medical Science (CAMS) Oxford Institute (COI), University of Oxford, Oxford, UK
  - Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Chayaporn Suphavilai
  - Genome Institute of Singapore, Agency for Science, Technology and Research (A*STAR), Singapore
- Jeremy Swann
  - Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Houriiyah Tegally
  - Centre for Epidemic Response and Innovation (CERI), Stellenbosch University, South Africa
- Bryan Tegomoh
  - Centre de Coordination des Opérations d'Urgences de Santé Publique, Ministere de Sante Publique, Cameroun
  - University of California, Berkeley, Berkeley, California, USA
  - Nebraska Department of Health and Human Services, Lincoln, Nebraska, USA
- Olli Vapalahti
  - Department of Veterinary Biosciences, University of Helsinki, 00014 Helsinki, Finland
  - Department of Virology, University of Helsinki, 00014 Helsinki, Finland
- Andreas Walker
  - Institute of Virology, University Hospital Düsseldorf, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Robert J Wilkinson
  - Francis Crick Institute, London, UK
  - Centre for Infectious Diseases Research in Africa, University of Cape Town
  - Imperial College London, UK
- Xavier Zair
  - Saw Swee Hock School of Public Health, National University of Singapore
- Tulio de Oliveira
  - Centre for Epidemic Response and Innovation (CERI), Stellenbosch University, South Africa
  - KwaZulu-Natal Research Innovation and Sequencing Platform (KRISP), University of KwaZulu-Natal, South Africa
- Timothy E A Peto
  - Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Derrick Crook
  - Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Russell Corbett-Detig
  - Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA
  - Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA
- Zamin Iqbal
  - European Molecular Biology Laboratory - European Bioinformatics Institute, Hinxton, UK
  - Milner Centre for Evolution, University of Bath, UK