1
Laskar R, Hoque M, Ali S. Phylogeogenomic analysis of the earliest reported sequences of SARS-CoV-2 from 161 countries. APMIS 2025; 133:e13499. [PMID: 39563179 DOI: 10.1111/apm.13499] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2024] [Accepted: 10/31/2024] [Indexed: 11/21/2024]
Abstract
SARS-CoV-2 is the causative agent of COVID-19, and its evolutionary path in geographical context is the focus of the present study. The first reported sequence from each of 161 countries was downloaded from the GISAID database. Multiple sequence alignment was performed using MAFFT v.7, and a TCS-based network was constructed using PopART v.1.7. A total of 27 proteins were analyzed, including structural and non-structural proteins. NSP3 and NSP12, responsible for viral replication and RNA synthesis, respectively, had the highest mutation incidence and frequency among non-structural proteins. The spike (S) protein, critical for viral attachment and entry, had the highest prevalence and frequency of mutations. ORF3a had the highest mutation incidence and frequency among accessory proteins. The phylogeogenomic network identified six haplogroups containing 35 sequences, while the remaining sequences belonged to different haplotypes. The virus's genetic distinctiveness was higher in European genomes, with four haplogroups dominated by Europe-linked sequences. The triangular-shaped pattern observed in the virus's evolutionary path suggests that it spread to different continents from Asia. Multiple transmission pathways connecting different countries affirm the virus's ability to emerge in multiple countries by early 2020. The possibility of new species emergence through "saltation" due to the pandemic is also discussed.
Affiliation(s)
- Rezwanuzzaman Laskar
  - Clinical and Applied Genomics (CAG) Laboratory, Department of Biological Sciences, Aliah University, Kolkata, India
- Mehboob Hoque
  - Applied Bio-Chemistry (ABC) Lab, Department of Biological Sciences, Aliah University, Kolkata, India
- Safdar Ali
  - Clinical and Applied Genomics (CAG) Laboratory, Department of Biological Sciences, Aliah University, Kolkata, India
2
Fuhrmann L, Jablonski KP, Topolsky I, Batavia AA, Borgsmüller N, Baykal PI, Carrara M, Chen C, Dondi A, Dragan M, Dreifuss D, John A, Langer B, Okoniewski M, du Plessis L, Schmitt U, Singer F, Stadler T, Beerenwinkel N. V-pipe 3.0: a sustainable pipeline for within-sample viral genetic diversity estimation. Gigascience 2024; 13:giae065. [PMID: 39347649 PMCID: PMC11440432 DOI: 10.1093/gigascience/giae065] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 06/11/2024] [Accepted: 08/13/2024] [Indexed: 10/01/2024] Open
Abstract
The large amount and diversity of viral genomic datasets generated by next-generation sequencing technologies poses a set of challenges for computational data analysis workflows, including rigorous quality control, scaling to large sample sizes, and tailored steps for specific applications. Here, we present V-pipe 3.0, a computational pipeline designed for analyzing next-generation sequencing data of short viral genomes. It is developed to enable reproducible, scalable, adaptable, and transparent inference of genetic diversity of viral samples. By presenting 2 large-scale data analysis projects, we demonstrate the effectiveness of V-pipe 3.0 in supporting sustainable viral genomic data science.
Affiliation(s)
- Lara Fuhrmann
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4056, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Kim Philipp Jablonski
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4056, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Ivan Topolsky
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4056, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Aashil A Batavia
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4056, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Nico Borgsmüller
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4056, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Pelin Icer Baykal
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4056, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Matteo Carrara
  - SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
  - NEXUS Personalized Health Technologies, ETH Zurich, Basel 4058, Switzerland
- Chaoran Chen
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4056, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Arthur Dondi
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4056, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Monica Dragan
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4056, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- David Dreifuss
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4056, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Anika John
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4056, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Benjamin Langer
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4056, Switzerland
- Louis du Plessis
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4056, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Uwe Schmitt
  - Scientific IT Services, ETH Zurich, Zurich 8092, Switzerland
- Franziska Singer
  - NEXUS Personalized Health Technologies, ETH Zurich, Basel 4058, Switzerland
- Tanja Stadler
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4056, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Niko Beerenwinkel
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4056, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
3
Tan M, Xia J, Luo H, Meng G, Zhu Z. Applying the digital data and the bioinformatics tools in SARS-CoV-2 research. Comput Struct Biotechnol J 2023; 21:4697-4705. [PMID: 37841328 PMCID: PMC10568291 DOI: 10.1016/j.csbj.2023.09.044] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/12/2023] [Revised: 09/29/2023] [Accepted: 09/29/2023] [Indexed: 10/17/2023] Open
Abstract
Bioinformatics has played a crucial role in the scientific effort against the coronavirus disease 2019 (COVID-19) pandemic caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Advances in novel algorithms, mega data technology, artificial intelligence and deep learning have assisted the development of novel bioinformatics tools to analyze the daily increasing volume of SARS-CoV-2 data in recent years. These tools were applied in genomic analyses, evolutionary tracking, epidemiological analyses, protein structure interpretation, studies of virus-host interaction and clinical performance. To promote in-silico analysis in the future, we conducted a review summarizing the databases, web services and software applied in SARS-CoV-2 research. These digital resources may also contribute to research on other coronaviruses and non-coronavirus viruses.
Affiliation(s)
- Meng Tan
  - School of Life Sciences, Chongqing University, Chongqing, China
- Jiaxin Xia
  - School of Life Sciences, Chongqing University, Chongqing, China
- Haitao Luo
  - School of Life Sciences, Chongqing University, Chongqing, China
- Geng Meng
  - College of Veterinary Medicine, China Agricultural University, Beijing, China
- Zhenglin Zhu
  - School of Life Sciences, Chongqing University, Chongqing, China
4
Sun S, Cheng F, Han D, Wei S, Zhong A, Massoudian S, Johnson AB. Pairwise comparative analysis of six haplotype assembly methods based on users' experience. BMC Genom Data 2023; 24:35. [PMID: 37386408 PMCID: PMC10311811 DOI: 10.1186/s12863-023-01134-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2022] [Accepted: 05/25/2023] [Indexed: 07/01/2023] Open
Abstract
BACKGROUND A haplotype is a set of DNA variants inherited together from one parent or chromosome. Haplotype information is useful for studying genetic variation and disease association. Haplotype assembly (HA) is a process of obtaining haplotypes using DNA sequencing data. Currently, there are many HA methods with their own strengths and weaknesses. This study focused on comparing six HA methods or algorithms: HapCUT2, MixSIH, PEATH, WhatsHap, SDhaP, and MAtCHap using two NA12878 datasets named hg19 and hg38. The 6 HA algorithms were run on chromosome 10 of these two datasets, each with 3 filtering levels based on sequencing depth (DP1, DP15, and DP30). Their outputs were then compared.
RESULT Run time (CPU time) was compared to assess the efficiency of 6 HA methods. HapCUT2 was the fastest HA for 6 datasets, with run time consistently under 2 min. In addition, WhatsHap was relatively fast, and its run time was 21 min or less for all 6 datasets. The other 4 HA algorithms' run time varied across different datasets and coverage levels. To assess their accuracy, pairwise comparisons were conducted for each pair of the six packages by generating their disagreement rates for both haplotype blocks and Single Nucleotide Variants (SNVs). The authors also compared them using switch distance (error), i.e., the number of positions where two chromosomes of a certain phase must be switched to match with the known haplotype. HapCUT2, PEATH, MixSIH, and MAtCHap generated output files with similar numbers of blocks and SNVs, and they had relatively similar performance. WhatsHap generated a much larger number of SNVs in the hg19 DP1 output, which caused it to have high disagreement percentages with other methods. However, for the hg38 data, WhatsHap had similar performance as the other 4 algorithms, except SDhaP. The comparison analysis showed that SDhaP had a much larger disagreement rate when it was compared with the other algorithms in all 6 datasets.
CONCLUSION The comparative analysis is important because each algorithm is different. The findings of this study provide a deeper understanding of the performance of currently available HA algorithms and useful input for other users.
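The switch distance mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes each haplotype is encoded as a list of 0/1 alleles over the same ordered heterozygous SNV positions within one block; the function name `switch_distance` and this encoding are illustrative assumptions, not taken from the paper or from any of the six tools it compares.

```python
from typing import Sequence


def switch_distance(predicted: Sequence[int], truth: Sequence[int]) -> int:
    """Count the phase switches needed to make `predicted` match `truth`.

    Both inputs are hypothetical 0/1 allele encodings of one haplotype
    over the same ordered heterozygous SNV positions.
    """
    if len(predicted) != len(truth):
        raise ValueError("haplotypes must cover the same SNV positions")
    # mismatch[i] is 1 where the predicted allele differs from the truth.
    mismatch = [p ^ t for p, t in zip(predicted, truth)]
    # A switch occurs wherever the mismatch state flips between
    # consecutive positions, i.e. the phase changes relative to the truth.
    return sum(1 for prev, cur in zip(mismatch, mismatch[1:]) if prev != cur)


if __name__ == "__main__":
    truth = [0, 0, 0, 0, 0, 0]
    predicted = [0, 0, 1, 1, 0, 0]
    print(switch_distance(predicted, truth))  # 2
```

Under this encoding, a prediction that is the exact complement of the truth scores 0, because it describes the same pair of haplotypes with the labels swapped.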
Affiliation(s)
- Shuying Sun
  - Department of Mathematics, Texas State University, San Marcos, TX USA
- Flora Cheng
  - Carnegie Mellon University, Pittsburgh, PA USA
- Daphne Han
  - Carnegie Mellon University, Pittsburgh, PA USA
- Sarah Wei
  - Massachusetts Institute of Technology, Cambridge, MA USA
5
Li Y, Han L, Wang Y, Wang X, Jia L, Li J, Han J, Zhao J, Li H, Li L. Establishment and application of a method of tagged-amplicon deep sequencing for low-abundance drug resistance in HIV-1. Front Microbiol 2022; 13:895227. [PMID: 36071961 PMCID: PMC9444182 DOI: 10.3389/fmicb.2022.895227] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/13/2022] [Accepted: 07/29/2022] [Indexed: 11/16/2022] Open
Abstract
In the latest HIV-1 global drug resistance report released by WHO, countries are advised to strengthen pre-treatment monitoring of drug resistance in AIDS patients. In this study, we established an NGS-based segmented amplification method for detecting HIV-1 drug resistance mutations. The pol region of HIV-1 was divided into three short fragments for NGS. Using a barcode sequence corresponding to each sample made the entire amplification and sequencing panel more cost-effective and allowed samples to be batched. Each parameter was evaluated using samples with known resistance variant frequencies. The nucleotide sequence error rate, amino acid error rate, and noise value of the NGS-based segmented amplification method were all less than 1%. When the threshold was 2%, the consensus sequences of the HIV-1 NL4-3 strain were completely consistent with the Sanger sequences. This method can detect a minimum sample viral load of 10² copies/ml, and the input frequency and detection frequency of HIV-1 resistance mutations within the range of 1%–100% showed good agreement (R² = 0.9963; R² = 0.9955). This method showed no non-specific amplification for Hepatitis B and C. Under the 2% threshold, the incidence of surveillance drug resistance mutations in ART-naive HIV-infected patients was 20.69%, most of which were NRTI-class resistance mutations.
Affiliation(s)
- Yang Li
  - School of Basic Medical Sciences, Anhui Medical University, Hefei, Anhui, China
  - Department of Virology, State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing, China
- Leilei Han
  - Department of Virology, State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing, China
  - School of Public Health, North China University of Science and Technology, Tangshan, Hebei, China
- Yanglan Wang
  - Department of Virology, State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing, China
  - College of Life Science and Technology, Beijing University of Chemical Technology, Beijing, China
- Xiaolin Wang
  - Department of Virology, State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing, China
- Lei Jia
  - Department of Virology, State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing, China
- Jingyun Li
  - Department of Virology, State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing, China
- Jingwan Han
  - Department of Virology, State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing, China
- Jin Zhao
  - Shenzhen Center for Disease Control and Prevention, Shenzhen, Guangdong, China
- Hanping Li
  - Department of Virology, State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing, China
  - *Correspondence: Hanping Li,
- Lin Li
  - School of Basic Medical Sciences, Anhui Medical University, Hefei, Anhui, China
  - Department of Virology, State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing, China
  - Lin Li,
6
Kumar S, Kumar GS, Maitra SS, Malý P, Bharadwaj S, Sharma P, Dwivedi VD. Viral informatics: bioinformatics-based solution for managing viral infections. Brief Bioinform 2022; 23:6659740. [PMID: 35947964 DOI: 10.1093/bib/bbac326] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/27/2022] [Revised: 06/26/2022] [Accepted: 07/18/2022] [Indexed: 11/13/2022] Open
Abstract
Several new viral infections have emerged in the human population and established themselves as global pandemics. With advancements in translational research, the scientific community has developed potential therapeutics to eradicate or control certain viral infections, such as smallpox and polio, responsible for billions of disabilities and deaths in the past. Unfortunately, some viral infections, such as dengue virus (DENV) and human immunodeficiency virus-1 (HIV-1), still prevail due to a lack of specific therapeutics, while new pathogenic viral strains or variants are emerging because of high genetic recombination or cross-species transmission. Consequently, to combat emerging viral infections, bioinformatics-based strategies have been developed for characterizing viruses and developing new effective therapeutics for their eradication or management. This review attempts to provide a single platform for the wide range of available bioinformatics-based approaches, including bioinformatics methods for the identification and management of emerging or evolved viral strains, genome analysis concerning pathogenicity and epidemiological analysis, computational methods for designing viral therapeutics, and consolidated information in the form of databases on known pathogenic viruses. This review of generally applicable viral informatics approaches aims to provide an overview of available resources capable of carrying out the desired task and may be used to develop additional strategies to improve the quality of translational viral informatics research.
Affiliation(s)
- Sanjay Kumar
  - School of Biotechnology, Jawaharlal Nehru University, New Delhi, India
  - Center for Bioinformatics, Computational and Systems Biology, Pathfinder Research and Training Foundation, Greater Noida, India
- Geethu S Kumar
  - Department of Life Science, School of Basic Science and Research, Sharda University, Greater Noida, Uttar Pradesh, India
  - Center for Bioinformatics, Computational and Systems Biology, Pathfinder Research and Training Foundation, Greater Noida, India
- Petr Malý
  - Laboratory of Ligand Engineering, Institute of Biotechnology of the Czech Academy of Sciences v.v.i., BIOCEV Research Center, Vestec, Czech Republic
- Shiv Bharadwaj
  - Laboratory of Ligand Engineering, Institute of Biotechnology of the Czech Academy of Sciences v.v.i., BIOCEV Research Center, Vestec, Czech Republic
- Pradeep Sharma
  - Department of Biophysics, All India Institute of Medical Sciences, New Delhi, India
- Vivek Dhar Dwivedi
  - Center for Bioinformatics, Computational and Systems Biology, Pathfinder Research and Training Foundation, Greater Noida, India
  - Institute of Advanced Materials, IAAM, 59053 Ulrika, Sweden
7
Guang A, Howison M, Ledingham L, D’Antuono M, Chan PA, Lawrence C, Dunn CW, Kantor R. Incorporating Within-Host Diversity in Phylogenetic Analyses for Detecting Clusters of New HIV Diagnoses. Front Microbiol 2022; 12:803190. [PMID: 35250908 PMCID: PMC8891961 DOI: 10.3389/fmicb.2021.803190] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/27/2021] [Accepted: 12/22/2021] [Indexed: 11/29/2022] Open
Abstract
Background: Phylogenetic analyses of HIV sequences are used to detect clusters and inform public health interventions. Conventional approaches summarize within-host HIV diversity with a single consensus sequence per host of the pol gene, obtained from Sanger or next-generation sequencing (NGS). There is growing recognition that this approach discards potentially important information about within-host sequence variation, which can impact phylogenetic inference. However, whether alternative summary methods that incorporate intra-host variation impact phylogenetic inference of transmission network features is unknown.
Methods: We introduce profile sampling, a method to incorporate within-host NGS sequence diversity into phylogenetic HIV cluster inference. We compare this approach to Sanger- and NGS-derived pol and near-whole-genome consensus sequences and evaluate its potential benefits in identifying molecular clusters among all newly-HIV-diagnosed individuals over six months at the largest HIV center in Rhode Island.
Results: Profile sampling cluster inference demonstrated that within-host viral diversity impacts phylogenetic inference across individuals, and that consensus sequence approaches can obscure both magnitude and effect of these impacts. Clustering differed between Sanger- and NGS-derived consensus and profile sampling sequences, and across gene regions.
Discussion: Profile sampling can incorporate within-host HIV diversity captured by NGS into phylogenetic analyses. This additional information can improve robustness of cluster detection.
Affiliation(s)
- August Guang
  - Center for Computational Biology of Human Disease, Brown University, Providence, RI, United States
  - Center for Computation and Visualization, Brown University, Providence, RI, United States
  - *Correspondence: August Guang,
- Mark Howison
  - Research Improving People’s Lives, Providence, RI, United States
- Lauren Ledingham
  - Division of Infectious Diseases, The Alpert Medical School, Brown University, Providence, RI, United States
- Matthew D’Antuono
  - Division of Infectious Diseases, The Alpert Medical School, Brown University, Providence, RI, United States
- Philip A. Chan
  - Division of Infectious Diseases, The Alpert Medical School, Brown University, Providence, RI, United States
- Charles Lawrence
  - Division of Applied Mathematics, Brown University, Providence, RI, United States
- Casey W. Dunn
  - Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, United States
- Rami Kantor
  - Division of Infectious Diseases, The Alpert Medical School, Brown University, Providence, RI, United States
8
Gibson KM, Steiner MC, Rentia U, Bendall ML, Pérez-Losada M, Crandall KA. Validation of Variant Assembly Using HAPHPIPE with Next-Generation Sequence Data from Viruses. Viruses 2020; 12:E758. [PMID: 32674515 PMCID: PMC7412389 DOI: 10.3390/v12070758] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 05/28/2020] [Revised: 07/03/2020] [Accepted: 07/06/2020] [Indexed: 01/04/2023] Open
Abstract
Next-generation sequencing (NGS) offers a powerful opportunity to identify low-abundance, intra-host viral sequence variants, yet the focus of many bioinformatic tools on consensus sequence construction has precluded a thorough analysis of intra-host diversity. To take full advantage of the resolution of NGS data, we developed HAplotype PHylodynamics PIPEline (HAPHPIPE), an open-source tool for the de novo and reference-based assembly of viral NGS data, with both consensus sequence assembly and a focus on the quantification of intra-host variation through haplotype reconstruction. We validate and compare the consensus sequence assembly methods of HAPHPIPE to those of two alternative software packages, HyDRA and Geneious, using simulated HIV and empirical HIV, HCV, and SARS-CoV-2 datasets. Our validation methods included read mapping, genetic distance, and genetic diversity metrics. In simulated NGS data, HAPHPIPE generated pol consensus sequences significantly closer to the true consensus sequence than those produced by HyDRA and Geneious and performed comparably to Geneious for HIV gp120 sequences. Furthermore, using empirical data from multiple viruses, we demonstrate that HAPHPIPE can analyze larger sequence datasets due to its greater computational speed. Therefore, we contend that HAPHPIPE provides a more user-friendly platform for users with and without bioinformatics experience to implement current best practices for viral NGS assembly than other currently available options.
Affiliation(s)
- Keylie M. Gibson
  - Computational Biology Institute, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052, USA
- Margaret C. Steiner
  - Computational Biology Institute, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052, USA
- Uzma Rentia
  - Computational Biology Institute, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052, USA
- Matthew L. Bendall
  - Computational Biology Institute, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052, USA
- Marcos Pérez-Losada
  - Computational Biology Institute, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052, USA
  - Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052, USA
  - CIBIO-InBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, Universidade do Porto, Campus Agrário de Vairão, 4169-007 Vairão, Portugal
- Keith A. Crandall
  - Computational Biology Institute, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052, USA
  - Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052, USA
9
Gibson KM, Jair K, Castel AD, Bendall ML, Wilbourn B, Jordan JA, Crandall KA, Pérez-Losada M. A cross-sectional study to characterize local HIV-1 dynamics in Washington, DC using next-generation sequencing. Sci Rep 2020; 10:1989. [PMID: 32029767 PMCID: PMC7004982 DOI: 10.1038/s41598-020-58410-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 08/23/2019] [Accepted: 12/31/2019] [Indexed: 11/08/2022] Open
Abstract
Washington, DC continues to experience a generalized HIV-1 epidemic. We characterized the local phylodynamics of HIV-1 in DC using next-generation sequencing (NGS) data. Viral samples from 68 participants from 2016 through 2017 were sequenced and paired with epidemiological data. Phylogenetic and network inferences, drug resistant mutations (DRMs), subtypes and HIV-1 diversity estimations were completed. Haplotypes were reconstructed to infer transmission clusters. Phylodynamic inferences based on the HIV-1 polymerase (pol) and envelope genes (env) were compared. Higher HIV-1 diversity (n.s.) was seen in men who have sex with men, heterosexual, and male participants in DC. 54.0% of the participants contained at least one DRM. The 40-49 year-olds showed the highest prevalence of DRMs (22.9%). Phylogenetic analysis of pol and env sequences grouped 31.9-33.8% of the participants into clusters. HIV-TRACE grouped 2.9-12.8% of participants when using consensus sequences and 9.0-64.2% when using haplotypes. NGS allowed us to characterize the local phylodynamics of HIV-1 in DC more broadly and accurately, given a better representation of its diversity and dynamics. Reconstructed haplotypes provided novel and deeper phylodynamic insights, which led to networks linking a higher number of participants. Our understanding of the HIV-1 epidemic was expanded with the powerful coupling of HIV-1 NGS data with epidemiological data.
Grants
- P30 AI117970 NIAID NIH HHS
- U01 AI069503 NIAID NIH HHS
- UM1 AI069503 NIAID NIH HHS
- This study was supported by the DC Cohort Study (U01 AI69503-03S2), a supplement from the Women’s Interagency Study for HIV-1 (410722_GR410708), a DC D-CFAR pilot award, and a 2015 HIV-1 Phylodynamics Supplement award from the District of Columbia for AIDS Research, an NIH funded program (AI117970), which is supported by the following NIH Co-Funding and Participating Institutes and Centers: NIAID, NCI, NICHD, NHLBI, NIDA, NIMH, NIA, FIC, NIGMS, NIDDK and OAR. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.
Affiliation(s)
- Keylie M Gibson
  - Computational Biology Institute, The Milken Institute School of Public Health, The George Washington University, Washington, DC, 20052, USA
- Kamwing Jair
  - Department of Epidemiology, The Milken Institute School of Public Health, The George Washington University, Washington, DC, 20052, USA
- Amanda D Castel
  - Department of Epidemiology, The Milken Institute School of Public Health, The George Washington University, Washington, DC, 20052, USA
- Matthew L Bendall
  - Computational Biology Institute, The Milken Institute School of Public Health, The George Washington University, Washington, DC, 20052, USA
- Brittany Wilbourn
  - Department of Epidemiology, The Milken Institute School of Public Health, The George Washington University, Washington, DC, 20052, USA
- Jeanne A Jordan
  - Department of Epidemiology, The Milken Institute School of Public Health, The George Washington University, Washington, DC, 20052, USA
- Keith A Crandall
  - Computational Biology Institute, The Milken Institute School of Public Health, The George Washington University, Washington, DC, 20052, USA
  - Department of Biostatistics and Bioinformatics, The Milken Institute School of Public Health, The George Washington University, Washington, DC, 20052, USA
- Marcos Pérez-Losada
  - Computational Biology Institute, The Milken Institute School of Public Health, The George Washington University, Washington, DC, 20052, USA
  - Department of Biostatistics and Bioinformatics, The Milken Institute School of Public Health, The George Washington University, Washington, DC, 20052, USA
  - CIBIO-InBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, Universidade do Porto, Campus Agrário de Vairão, Vairão, Portugal