1. Petrillo M, Querci M, Brogna C, Ponti J, Cristoni S, Markov PV, Valsesia A, Leoni G, Benedetti A, Wiss T, Van den Eede G. Evidence of SARS-CoV-2 bacteriophage potential in human gut microbiota. F1000Res 2025; 11:292. [PMID: 40444030 PMCID: PMC12120431 DOI: 10.12688/f1000research.109236.2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/17/2025] [Indexed: 06/02/2025]
Abstract
Background: In previous studies we have shown that severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) replicates in vitro in bacterial growth medium, that the viral replication follows bacterial growth, and it is influenced by the administration of specific antibiotics. These observations are compatible with a 'bacteriophage-like' behaviour of SARS-CoV-2. Methods: We have further elaborated on these unusual findings and here we present the results of three different supplementary experiments: (1) an electron-microscope analysis of samples of bacteria obtained from a faecal sample of a subject positive to SARS-CoV-2; (2) mass spectrometric analysis of these cultures to assess the eventual de novo synthesis of SARS-CoV-2 spike protein; (3) sequencing of SARS-CoV-2 collected from plaques obtained from two different gut microbial bacteria inoculated with supernatant from faecal microbiota of an individual positive to SARS-CoV-2. Results: Immuno-labelling with Anti-SARS-CoV-2 nucleocapsid protein antibody confirmed presence of SARS-CoV-2 both outside and inside bacteria. De novo synthesis of SARS-CoV-2 spike protein was observed, as evidence that SARS-CoV-2 RNA is translated in the bacterial cultures. In addition, phage-like plaques were spotted on faecal bacteria cultures after inoculation with supernatant from faecal microbiota of an individual positive to SARS-CoV-2. Bioinformatic analyses on the reads obtained by sequencing RNA extracted from the plaques revealed nucleic acid polymorphisms, suggesting different replication environment in the two bacterial cultures. Conclusions: Based on these results we conclude that, in addition to its well-documented interactions with eukaryotic cells, SARS-CoV-2 may act as a bacteriophage when interacting with at least two bacterial species known to be present in the human microbiota. If the hypothesis proposed, i.e., that under certain conditions SARS-CoV-2 may multiply at the expense of human gut bacteria, is further substantiated, it would drastically change the model of acting and infecting of SARS-CoV-2, and most likely that of other human pathogenic viruses.
Affiliation(s)
- Jessica Ponti
  - European Commission Joint Research Centre, Ispra, 21027, Italy
- Peter V Markov
  - European Commission Joint Research Centre, Ispra, 21027, Italy
- Andrea Valsesia
  - European Commission Joint Research Centre, Ispra, 21027, Italy
- Gabriele Leoni
  - European Commission Joint Research Centre, Ispra, 21027, Italy
  - International School for Advanced Studies (SISSA), Trieste, 34136, Italy
- Thierry Wiss
  - European Commission Joint Research Centre, Karlsruhe, 76344, Germany
2. Petrillo M, Querci M, Brogna C, Ponti J, Cristoni S, Markov PV, Valsesia A, Leoni G, Benedetti A, Wiss T, Van den Eede G. Evidence of SARS-CoV-2 bacteriophage potential in human gut microbiota. F1000Res 2025; 11:292. [PMID: 40444030 PMCID: PMC12120431 DOI: 10.12688/f1000research.109236.1] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Accepted: 04/17/2025] [Indexed: 06/11/2025]
3. Penn MJ, Scheidwasser N, Penn J, Donnelly CA, Duchêne DA, Bhatt S. Leaping through Tree Space: Continuous Phylogenetic Inference for Rooted and Unrooted Trees. Genome Biol Evol 2023; 15:evad213. [PMID: 38085949 PMCID: PMC10745275 DOI: 10.1093/gbe/evad213] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/16/2023] [Indexed: 12/24/2023]
Abstract
Phylogenetics is now fundamental in life sciences, providing insights into the earliest branches of life and the origins and spread of epidemics. However, finding suitable phylogenies from the vast space of possible trees remains challenging. To address this problem, for the first time, we perform both tree exploration and inference in a continuous space where the computation of gradients is possible. This continuous relaxation allows for major leaps across tree space in both rooted and unrooted trees, and is less susceptible to convergence to local minima. Our approach outperforms the current best methods for inference on unrooted trees and, in simulation, accurately infers the tree and root in ultrametric cases. The approach is effective in cases of empirical data with negligible amounts of data, which we demonstrate on the phylogeny of jawed vertebrates. Indeed, only a few genes with an ultrametric signal were generally sufficient for resolving the major lineages of vertebrates. Optimization is possible via automatic differentiation and our method presents an effective way forward for exploring the most difficult, data-deficient phylogenetic questions.
Affiliation(s)
- Matthew J Penn
  - Department of Statistics, University of Oxford, Oxford, United Kingdom
- Neil Scheidwasser
  - Section of Epidemiology, University of Copenhagen, Copenhagen, Denmark
- Joseph Penn
  - Department of Physics, University of Oxford, Oxford, United Kingdom
- Christl A Donnelly
  - Department of Statistics, University of Oxford, Oxford, United Kingdom
  - Pandemic Sciences Institute, University of Oxford, Oxford, United Kingdom
  - Department of Infectious Disease Epidemiology, MRC Centre for Global Infectious Disease Analysis, School of Public Health, Faculty of Medicine, Imperial College London, London, United Kingdom
- David A Duchêne
  - Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Samir Bhatt
  - Section of Epidemiology, University of Copenhagen, Copenhagen, Denmark
  - Department of Infectious Disease Epidemiology, MRC Centre for Global Infectious Disease Analysis, School of Public Health, Faculty of Medicine, Imperial College London, London, United Kingdom
4. Wei Y, Zou Q, Tang F, Yu L. WMSA: a novel method for multiple sequence alignment of DNA sequences. Bioinformatics 2022; 38:5019-5025. [PMID: 36179076 DOI: 10.1093/bioinformatics/btac658] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 07/11/2022] [Revised: 08/30/2022] [Accepted: 09/29/2022] [Indexed: 12/24/2022]
Abstract
MOTIVATION: Multiple sequence alignment (MSA) is a fundamental problem in bioinformatics. The quality of alignment will affect downstream analysis. MAFFT has adopted the Fast Fourier Transform method for searching the homologous segments and using them as anchors to divide the sequences, then making alignment only on segments, which can save time and memory without overly reducing the sequence alignment quality. MAFFT becomes slow when the dataset is large. RESULTS: We made a software, WMSA, which uses the divide-and-conquer method to split the sequences into clusters, aligns those clusters into profiles with the center star strategy and then makes a progressive profile-profile alignment. The alignment is conducted by the compiled algorithms of MAFFT, K-Band with multithread parallelism. Our method can balance time, space and quality and performs better than MAFFT in test experiments on highly conserved datasets. AVAILABILITY AND IMPLEMENTATION: Source code is freely available at https://github.com/malabz/WMSA/, which is implemented in C/C++ and supported on Linux, and datasets are available at https://github.com/malabz/WMSA-dataset. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
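The center star strategy named in this abstract starts by choosing a "center" sequence: the one whose total distance to every other sequence in the cluster is smallest. The sketch below illustrates only that selection step, in Python, and is not taken from the WMSA source (which is written in C/C++). The function names `edit_distance` and `pick_center` are hypothetical, and plain Levenshtein distance is an assumed stand-in for whatever scoring WMSA actually uses.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a rolling one-row DP table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def pick_center(seqs: list[str]) -> int:
    """Index of the sequence with the smallest summed distance to all others.

    Assumption for illustration: distance is plain edit distance.
    Ties are broken in favour of the earliest sequence.
    """
    totals = [0] * len(seqs)
    for i, j in combinations(range(len(seqs)), 2):
        d = edit_distance(seqs[i], seqs[j])
        totals[i] += d
        totals[j] += d
    return min(range(len(seqs)), key=totals.__getitem__)


if __name__ == "__main__":
    cluster = ["ACGT", "ACGA", "ACGT", "TCGT"]
    print("center index:", pick_center(cluster))
```

In a full center star alignment, every other sequence in the cluster would then be aligned pairwise against the chosen center and the gaps merged into one profile; that merging step is omitted here.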
Affiliation(s)
- Yanming Wei
  - School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi 710071, China
- Quan Zou
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang 324003, China
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, Sichuan 610054, China
- Furong Tang
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang 324003, China
- Liang Yu
  - School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi 710071, China
5. Sokhansanj BA, Rosen GL. Mapping Data to Deep Understanding: Making the Most of the Deluge of SARS-CoV-2 Genome Sequences. mSystems 2022; 7:e0003522. [PMID: 35311562 PMCID: PMC9040592 DOI: 10.1128/msystems.00035-22] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Accepted: 02/27/2022] [Indexed: 12/22/2022]
Abstract
Next-generation sequencing has been essential to the global response to the COVID-19 pandemic. As of January 2022, nearly 7 million severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) sequences are available to researchers in public databases. Sequence databases are an abundant resource from which to extract biologically relevant and clinically actionable information. As the pandemic has gone on, SARS-CoV-2 has rapidly evolved, involving complex genomic changes that challenge current approaches to classifying SARS-CoV-2 variants. Deep sequence learning could be a potentially powerful way to build complex sequence-to-phenotype models. Unfortunately, while they can be predictive, deep learning typically produces "black box" models that cannot directly provide biological and clinical insight. Researchers should therefore consider implementing emerging methods for visualizing and interpreting deep sequence models. Finally, researchers should address important data limitations, including (i) global sequencing disparities, (ii) insufficient sequence metadata, and (iii) screening artifacts due to poor sequence quality control.
Affiliation(s)
- Bahrad A. Sokhansanj
  - Drexel University, Ecological and Evolutionary Signal-Processing and Informatics Laboratory, Department of Electrical & Computer Engineering, College of Engineering, Philadelphia, Pennsylvania, USA
- Gail L. Rosen
  - Drexel University, Ecological and Evolutionary Signal-Processing and Informatics Laboratory, Department of Electrical & Computer Engineering, College of Engineering, Philadelphia, Pennsylvania, USA
6. Chao J, Tang F, Xu L. Developments in Algorithms for Sequence Alignment: A Review. Biomolecules 2022; 12:biom12040546. [PMID: 35454135 PMCID: PMC9024764 DOI: 10.3390/biom12040546] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 02/11/2022] [Revised: 03/29/2022] [Accepted: 03/31/2022] [Indexed: 01/27/2023]
Abstract
The continuous development of sequencing technologies has enabled researchers to obtain large amounts of biological sequence data, and this has resulted in increasing demands for software that can perform sequence alignment fast and accurately. A number of algorithms and tools for sequence alignment have been designed to meet the various needs of biologists. Here, the ideas that prevail in the research of sequence alignment and some quality estimation methods for multiple sequence alignment tools are summarized.
Affiliation(s)
- Jiannan Chao
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
- Furong Tang
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324003, China
  - School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen 518055, China
- Lei Xu
  - School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen 518055, China
7. Beck KL, Seabolt E, Agarwal A, Nayar G, Bianco S, Krishnareddy H, Ngo TA, Kunitomi M, Mukherjee V, Kaufman JH. Semi-Supervised Pipeline for Autonomous Annotation of SARS-CoV-2 Genomes. Viruses 2021; 13:2426. [PMID: 34960694 PMCID: PMC8706859 DOI: 10.3390/v13122426] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 09/17/2021] [Revised: 11/17/2021] [Accepted: 11/20/2021] [Indexed: 12/12/2022]
Abstract
SARS-CoV-2 genomic sequencing efforts have scaled dramatically to address the current global pandemic and aid public health. However, autonomous genome annotation of SARS-CoV-2 genes, proteins, and domains is not readily accomplished by existing methods and results in missing or incorrect sequences. To overcome this limitation, we developed a novel semi-supervised pipeline for automated gene, protein, and functional domain annotation of SARS-CoV-2 genomes that differentiates itself by not relying on the use of a single reference genome and by overcoming atypical genomic traits that challenge traditional bioinformatic methods. We analyzed an initial corpus of 66,000 SARS-CoV-2 genome sequences collected from labs across the world using our method and identified the comprehensive set of known proteins with 98.5% set membership accuracy and 99.1% accuracy in length prediction, compared to proteome references, including Replicase polyprotein 1ab (with its transcriptional slippage site). Compared to other published tools, such as Prokka (base) and VAPiD, we yielded a 6.4- and 1.8-fold increase in protein annotations. Our method generated 13,000,000 gene, protein, and domain sequences, some conserved across time and geography and others representing emerging variants. We observed 3362 non-redundant sequences per protein on average within this corpus and described key D614G and N501Y variants spatiotemporally in the initial genome corpus. For spike glycoprotein domains, we achieved greater than 97.9% sequence identity to references and characterized receptor binding domain variants. We further demonstrated the robustness and extensibility of our method on an additional 4000 variant diverse genomes containing all named variants of concern and interest as of August 2021. In this cohort, we successfully identified all keystone spike glycoprotein mutations in our predicted protein sequences with greater than 99% accuracy as well as demonstrating high accuracy of the protein and domain annotations. This work comprehensively presents the molecular targets to refine biomedical interventions for SARS-CoV-2 with a scalable, high-accuracy method to analyze newly sequenced infections as they arise.
Affiliation(s)
- Kristen L. Beck
  - AI and Cognitive Software, IBM Almaden Research Center, San Jose, CA 95120, USA
- Edward Seabolt
  - AI and Cognitive Software, IBM Almaden Research Center, San Jose, CA 95120, USA
- Akshay Agarwal
  - AI and Cognitive Software, IBM Almaden Research Center, San Jose, CA 95120, USA
- Gowri Nayar
  - AI and Cognitive Software, IBM Almaden Research Center, San Jose, CA 95120, USA
- Simone Bianco
  - AI and Cognitive Software, IBM Almaden Research Center, San Jose, CA 95120, USA
  - NSF Center for Cellular Construction, San Francisco, CA 94158, USA
- Harsha Krishnareddy
  - AI and Cognitive Software, IBM Almaden Research Center, San Jose, CA 95120, USA
- Timothy A. Ngo
  - AI and Cognitive Software, IBM Almaden Research Center, San Jose, CA 95120, USA
- Mark Kunitomi
  - AI and Cognitive Software, IBM Almaden Research Center, San Jose, CA 95120, USA
- Vandana Mukherjee
  - AI and Cognitive Software, IBM Almaden Research Center, San Jose, CA 95120, USA
- James H. Kaufman
  - AI and Cognitive Software, IBM Almaden Research Center, San Jose, CA 95120, USA
8. Leong LEX, Soubrier J, Turra M, Denehy E, Walters L, Kassahn K, Higgins G, Dodd T, Hall R, D'Onise K, Spurrier N, Bastian I, Lim CK. Whole-Genome Sequencing of SARS-CoV-2 from Quarantine Hotel Outbreak. Emerg Infect Dis 2021; 27:2219-2221. [PMID: 34287141 PMCID: PMC8314830 DOI: 10.3201/eid2708.204875] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 12/23/2022]
Abstract
Hotel quarantine for international travelers has been used to prevent coronavirus disease spread into Australia. A quarantine hotel–associated community outbreak was detected in South Australia. Real-time genomic sequencing enabled rapid confirmation tracking the outbreak to a recently returned traveler and linked 2 cases of infection in travelers at the same facility.