1
Fiam RN, István C, Norbert S. Comparing full variation profile analysis with the conventional consensus method in SARS-CoV-2 phylogeny. Brief Bioinform 2024; 25:bbae296. [PMID: 38920083 PMCID: PMC11199993 DOI: 10.1093/bib/bbae296] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Revised: 05/13/2024] [Indexed: 06/27/2024]
Abstract
This study proposes a novel approach to studying severe acute respiratory syndrome coronavirus 2 virus mutations through sequencing data comparison. Traditional consensus-based methods, which focus on the most common nucleotide at each position, might overlook or obscure the presence of low-frequency variants. Our method, in contrast, retains all sequenced nucleotides at each position, forming a genomic matrix. Utilizing simulated short reads from genomes with specified mutations, we contrasted our genomic matrix approach with the consensus sequence method. Our matrix methodology, across multiple simulated datasets, accurately reflected the known mutations with an average accuracy improvement of 20% over the consensus method. In real-world tests using data from GISAID and NCBI-SRA, our approach demonstrated an increase in reliability by reducing the error margin by approximately 15%. The genomic matrix approach offers a more accurate representation of the viral genomic diversity, thereby providing superior insights into virus evolution and epidemiology.
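The contrast drawn in this abstract, between keeping only the most common nucleotide per position and keeping every observed nucleotide, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `(start, sequence)` read representation, and the assumption that reads are already aligned to the reference without indels are all hypothetical simplifications chosen for illustration.

```python
from collections import Counter

BASES = "ACGT"

def position_counts(reads, genome_length):
    """Count each base at each reference position.

    `reads` is a list of (start, sequence) pairs, assumed to be already
    aligned to the reference with no insertions or deletions.
    """
    counts = [Counter() for _ in range(genome_length)]
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < genome_length and base in BASES:
                counts[pos][base] += 1
    return counts

def consensus(counts):
    """Most common base per position; 'N' where there is no coverage."""
    return "".join(c.most_common(1)[0][0] if c else "N" for c in counts)

def frequency_matrix(counts):
    """Per-position base frequencies: one row of (A, C, G, T) per position."""
    matrix = []
    for c in counts:
        depth = sum(c.values())
        matrix.append([c[b] / depth if depth else 0.0 for b in BASES])
    return matrix

# Toy data: one of four reads carries a T at position 2.
reads = [(0, "ACGT"), (0, "ACGT"), (0, "ACTT"), (1, "CGT")]
counts = position_counts(reads, 4)
print(consensus(counts))            # the minority T is not visible here
print(frequency_matrix(counts)[2])  # the minority T keeps its frequency here
```

In this toy case the consensus string hides the low-frequency variant at position 2, while the matrix row for that position still records it.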
Affiliation(s)
- Regina Nóra Fiam
  - Department of Physics of Complex Systems, Eötvös Loránd University, 1117 Budapest, Hungary
- Csabai István
  - Department of Physics of Complex Systems, Eötvös Loránd University, 1117 Budapest, Hungary
- Solymosi Norbert
  - Department of Physics of Complex Systems, Eötvös Loránd University, 1117 Budapest, Hungary
  - Centre for Bioinformatics, University of Veterinary Medicine, 1078 Budapest, Hungary
2
Miura S, Dolker T, Sanderford M, Kumar S. Improving cellular phylogenies through the integrated use of mutation order and optimality principles. Comput Struct Biotechnol J 2023; 21:3894-3903. [PMID: 37602230 PMCID: PMC10432911 DOI: 10.1016/j.csbj.2023.07.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2023] [Revised: 07/10/2023] [Accepted: 07/19/2023] [Indexed: 08/22/2023]
Abstract
The study of tumor evolution is being revolutionized by single-cell sequencing technologies that survey the somatic variation of cancer cells. In these endeavors, reliable inference of the evolutionary relationship of single cells is a key step. However, single-cell sequences contain many errors and missing bases, which necessitate advancing standard molecular phylogenetics approaches for applications in analyzing these datasets. We have developed a computational approach that integratively applies standard phylogenetic optimality principles and patterns of co-occurrence of sequence variations to produce more expansive and accurate cellular phylogenies from single-cell sequence datasets. We found the new approach to also perform well for CRISPR/Cas9 genome editing datasets, suggesting that it can be useful for various applications. We apply the new approach to some empirical datasets to showcase its use for reconstructing recurrent mutations and mutational reversals as well as for phylodynamics analysis to infer metastatic cell migrations between tumors.
Affiliation(s)
- Sayaka Miura
  - Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA 19122, USA
  - Department of Biology, Temple University, Philadelphia, PA 19122, USA
- Tenzin Dolker
  - Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA 19122, USA
  - Department of Biology, Temple University, Philadelphia, PA 19122, USA
- Maxwell Sanderford
  - Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA 19122, USA
  - Department of Biology, Temple University, Philadelphia, PA 19122, USA
- Sudhir Kumar
  - Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA 19122, USA
  - Department of Biology, Temple University, Philadelphia, PA 19122, USA
3
Neher RA. Contributions of adaptation and purifying selection to SARS-CoV-2 evolution. Virus Evol 2022; 8:veac113. [PMID: 37593203 PMCID: PMC10431346 DOI: 10.1093/ve/veac113] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 09/01/2022] [Revised: 10/30/2022] [Accepted: 12/05/2022] [Indexed: 08/19/2023]
Abstract
Continued evolution and adaptation of SARS-CoV-2 has led to more transmissible and immune-evasive variants with profound impacts on the course of the pandemic. Here I analyze the evolution of the virus over 2.5 years since its emergence and estimate the rates of evolution for synonymous and non-synonymous changes separately for evolution within clades (well-defined monophyletic groups with gradual evolution) and for the pandemic overall. The rate of synonymous mutation is found to be around 6 changes per year. Synonymous rates within variants vary little from variant to variant and are compatible with the overall rate of 7 changes per year (or [Formula: see text] per year and codon). In contrast, the rate at which variants accumulate amino acid changes (non-synonymous mutations) was initially around 12-16 changes per year, but in 2021 and 2022 it dropped to 6-9 changes per year. The overall rate of non-synonymous evolution, that is, across variants, is estimated to be about 26 amino acid changes per year (or [Formula: see text] per year and codon). This strong acceleration of the overall rate compared to within-clade evolution indicates that the evolutionary process that gave rise to the different variants is qualitatively different from that in typical transmission chains and likely dominated by adaptive evolution. I further quantify the spectrum of mutations and purifying selection in different SARS-CoV-2 proteins and show that the massive global sampling of SARS-CoV-2 is sufficient to estimate site-specific fitness costs across the entire genome. Many accessory proteins evolve under limited evolutionary constraints with little short-term purifying selection. About half of the mutations in other proteins are strongly deleterious.
Affiliation(s)
- Richard A Neher
  - Biozentrum, University of Basel, Spitalstrasse 41, Basel 4053, Switzerland
  - Swiss Institute of Bioinformatics, Spitalstrasse 41, Basel 4053, Switzerland
5
Pekar JE, Magee A, Parker E, Moshiri N, Izhikevich K, Havens JL, Gangavarapu K, Malpica Serrano LM, Crits-Christoph A, Matteson NL, Zeller M, Levy JI, Wang JC, Hughes S, Lee J, Park H, Park MS, Ching KZY, Lin RTP, Mat Isa MN, Noor YM, Vasylyeva TI, Garry RF, Holmes EC, Rambaut A, Suchard MA, Andersen KG, Worobey M, Wertheim JO. The molecular epidemiology of multiple zoonotic origins of SARS-CoV-2. Science 2022; 377:960-966. [PMID: 35881005 PMCID: PMC9348752 DOI: 10.1126/science.abp8337] [Citation(s) in RCA: 114] [Impact Index Per Article: 38.0] [Received: 03/03/2022] [Accepted: 07/18/2022] [Indexed: 01/08/2023]
Abstract
Understanding the circumstances that lead to pandemics is important for their prevention. We analyzed the genomic diversity of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) early in the coronavirus disease 2019 (COVID-19) pandemic. We show that SARS-CoV-2 genomic diversity before February 2020 likely comprised only two distinct viral lineages, denoted "A" and "B." Phylodynamic rooting methods, coupled with epidemic simulations, reveal that these lineages were the result of at least two separate cross-species transmission events into humans. The first zoonotic transmission likely involved lineage B viruses around 18 November 2019 (23 October to 8 December), and the separate introduction of lineage A likely occurred within weeks of this event. These findings indicate that it is unlikely that SARS-CoV-2 circulated widely in humans before November 2019 and define the narrow window between when SARS-CoV-2 first jumped into humans and when the first cases of COVID-19 were reported. As with other coronaviruses, SARS-CoV-2 emergence likely resulted from multiple zoonotic events.
Affiliation(s)
- Jonathan E. Pekar
  - Bioinformatics and Systems Biology Graduate Program, University of California San Diego, La Jolla, CA 92093, USA
  - Department of Biomedical Informatics, University of California San Diego, La Jolla, CA 92093, USA
- Andrew Magee
  - Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA
- Edyth Parker
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Niema Moshiri
  - Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA 92093, USA
- Katherine Izhikevich
  - Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA 92093, USA
  - Department of Mathematics, University of California San Diego, La Jolla, CA 92093, USA
- Jennifer L. Havens
  - Bioinformatics and Systems Biology Graduate Program, University of California San Diego, La Jolla, CA 92093, USA
- Karthik Gangavarapu
  - Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA
- Alexander Crits-Christoph
  - W. Harry Feinstone Department of Molecular Microbiology and Immunology, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland 21205, USA
- Nathaniel L. Matteson
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Mark Zeller
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Joshua I. Levy
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Jade C. Wang
  - New York City Public Health Laboratory, New York City Department of Health and Mental Hygiene, New York, NY 11101, USA
- Scott Hughes
  - New York City Public Health Laboratory, New York City Department of Health and Mental Hygiene, New York, NY 11101, USA
- Jungmin Lee
  - Department of Microbiology, Institute for Viral Diseases, Biosafety Center, College of Medicine, Korea University, Seoul, South Korea
- Heedo Park
  - Department of Microbiology, Institute for Viral Diseases, Biosafety Center, College of Medicine, Korea University, Seoul, South Korea
  - BK21 Graduate Program, Department of Biomedical Sciences, Korea University College of Medicine, Seoul, 02841, Republic of Korea
- Man-Seong Park
  - Department of Microbiology, Institute for Viral Diseases, Biosafety Center, College of Medicine, Korea University, Seoul, South Korea
  - BK21 Graduate Program, Department of Biomedical Sciences, Korea University College of Medicine, Seoul, 02841, Republic of Korea
- Raymond Tzer Pin Lin
  - National Public Health Laboratory, National Centre for Infectious Diseases, Singapore
- Mohd Noor Mat Isa
  - Malaysia Genome and Vaccine Institute, Jalan Bangi, 43000 Kajang, Selangor, Malaysia
- Yusuf Muhammad Noor
  - Malaysia Genome and Vaccine Institute, Jalan Bangi, 43000 Kajang, Selangor, Malaysia
- Tetyana I. Vasylyeva
  - Department of Medicine, University of California San Diego, La Jolla, CA 92093, USA
- Robert F. Garry
  - Tulane University, School of Medicine, Department of Microbiology and Immunology, New Orleans, LA 70112, USA
  - Zalgen Labs, LLC, Frederick, MD 21703, USA
  - Global Virus Network (GVN), Baltimore, MD 21201, USA
- Edward C. Holmes
  - Sydney Institute for Infectious Diseases, School of Life and Environmental Sciences and School of Medical Sciences, The University of Sydney, Sydney, NSW 2006, Australia
- Andrew Rambaut
  - Institute of Evolutionary Biology, University of Edinburgh, King's Buildings, Edinburgh, EH9 3FL, UK
- Marc A. Suchard
  - Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA
  - Department of Biomathematics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA
  - Department of Biostatistics, Fielding School of Public Health, University of California Los Angeles, Los Angeles, CA 90095, USA
- Kristian G. Andersen
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037, USA
  - Scripps Research Translational Institute, La Jolla, CA 92037, USA
- Michael Worobey
  - Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ 85721, USA
- Joel O. Wertheim
  - Department of Medicine, University of California San Diego, La Jolla, CA 92093, USA