1
Weaver S, Dávila Conn VM, Ji D, Verdonk H, Ávila-Ríos S, Leigh Brown AJ, Wertheim JO, Kosakovsky Pond SL. AUTO-TUNE: selecting the distance threshold for inferring HIV transmission clusters. Frontiers in Bioinformatics 2024; 4:1400003. [PMID: 39086842 PMCID: PMC11289888 DOI: 10.3389/fbinf.2024.1400003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2024] [Accepted: 05/17/2024] [Indexed: 08/02/2024]
Abstract
Molecular surveillance of viral pathogens and inference of transmission networks from genomic data play an increasingly important role in public health efforts, especially for HIV-1. For many methods, the genetic distance threshold used to connect sequences in the transmission network is a key parameter informing the properties of inferred networks. Using a distance threshold that is too high can result in a network with many spurious links, making it difficult to interpret. Conversely, a distance threshold that is too low can result in a network with too few links, which may not capture key insights into clusters of public health concern. Published research using the HIV-TRACE software package frequently uses the default threshold of 0.015 substitutions/site for HIV pol gene sequences, but in many cases, investigators heuristically select other threshold parameters to better capture the underlying dynamics of the epidemic they are studying. Here, we present a general heuristic scoring approach for tuning a distance threshold adaptively, which seeks to prevent the formation of giant clusters. We prioritize the ratio of the sizes of the largest and the second largest cluster, maximizing the number of clusters present in the network. We apply our scoring heuristic to outbreaks with different characteristics, such as regional or temporal variability, and demonstrate the utility of using the scoring mechanism's suggested distance threshold to identify clusters exhibiting risk factors that would have otherwise been more difficult to identify. For example, while we found that a 0.015 substitutions/site distance threshold is typical for US-like epidemics, recent outbreaks like the CRF07_BC subtype among men who have sex with men (MSM) in China have been found to have a lower optimal threshold of 0.005 to better capture the transition from injected drug use (IDU) to MSM as the primary risk factor. 
Alternatively, in communities surrounding Lake Victoria in Uganda, where there has been sustained heterosexual transmission for many years, we found that a larger distance threshold is necessary to capture a more risk factor-diverse population with sparse sampling over a longer period of time. Such identification may allow for more informed intervention action by respective public health officials.
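The two quantities the abstract names (the number of clusters, and the ratio of the largest to the second-largest cluster) can be sketched for a range of candidate thresholds as follows. This is a minimal illustration and not the AUTO-TUNE implementation: the abstract does not state how the two quantities are combined into a score, so the sketch only reports them. The function names, the `<=` linking rule, the exclusion of singletons, and the toy distances are all assumptions made for the example.

```python
from collections import Counter


def cluster_sizes(n_nodes, edges, threshold):
    """Sizes of connected components, largest first, singletons excluded.

    edges is a list of (i, j, distance) tuples; a pair is linked when its
    distance is <= threshold (an assumed convention for this sketch).
    """
    parent = list(range(n_nodes))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j, d in edges:
        if d <= threshold:
            parent[find(i)] = find(j)

    counts = Counter(find(i) for i in range(n_nodes))
    return sorted((s for s in counts.values() if s > 1), reverse=True)


def summarize(n_nodes, edges, thresholds):
    """Return (threshold, number of clusters, largest/second-largest ratio)."""
    rows = []
    for t in thresholds:
        sizes = cluster_sizes(n_nodes, edges, t)
        # The ratio is undefined when fewer than two clusters exist.
        ratio = sizes[0] / sizes[1] if len(sizes) >= 2 else None
        rows.append((t, len(sizes), ratio))
    return rows


# Toy pairwise distances (substitutions/site) among six sequences; made up.
edges = [(0, 1, 0.004), (1, 2, 0.012), (3, 4, 0.006), (4, 5, 0.020), (2, 3, 0.018)]

for row in summarize(6, edges, [0.005, 0.010, 0.015, 0.020]):
    print(row)
```

In this toy input the largest threshold merges every sequence into a single cluster, which is the giant-cluster behavior the heuristic is described as trying to avoid.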
Affiliation(s)
- Steven Weaver
- Center for Viral Evolution, Temple University, Philadelphia, PA, United States
- Vanessa M. Dávila Conn
- Center for Research in Infectious Diseases, National Institute of Respiratory Diseases, Mexico City, Mexico
- Daniel Ji
- Department of Medicine, University of California San Diego, La Jolla, CA, United States
- Hannah Verdonk
- Center for Viral Evolution, Temple University, Philadelphia, PA, United States
- Andrew J. Leigh Brown
- Department of Medicine, University of California San Diego, La Jolla, CA, United States
- Joel O. Wertheim
- Department of Medicine, University of California San Diego, La Jolla, CA, United States
2
Weaver S, Dávila-Conn V, Ji D, Verdonk H, Ávila-Ríos S, Leigh Brown AJ, Wertheim JO, Kosakovsky Pond SL. AUTO-TUNE: selecting the distance threshold for inferring HIV transmission clusters. bioRxiv: The Preprint Server for Biology 2024:2024.03.11.584522. [PMID: 38559140 PMCID: PMC10979987 DOI: 10.1101/2024.03.11.584522] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Molecular surveillance of viral pathogens and inference of transmission networks from genomic data play an increasingly important role in public health efforts, especially for HIV-1. For many methods, the genetic distance threshold used to connect sequences in the transmission network is a key parameter informing the properties of inferred networks. Using a distance threshold that is too high can result in a network with many spurious links, making it difficult to interpret. Conversely, a distance threshold that is too low can result in a network with too few links, which may not capture key insights into clusters of public health concern. Published research using the HIV-TRACE software package frequently uses the default threshold of 0.015 substitutions/site for HIV pol gene sequences, but in many cases, investigators heuristically select other threshold parameters to better capture the underlying dynamics of the epidemic they are studying. Here, we present a general heuristic scoring approach for tuning a distance threshold adaptively, which seeks to prevent the formation of giant clusters. We prioritize the ratio of the sizes of the largest and the second largest cluster, maximizing the number of clusters present in the network. We apply our scoring heuristic to outbreaks with different characteristics, such as regional or temporal variability, and demonstrate the utility of using the scoring mechanism's suggested distance threshold to identify clusters exhibiting risk factors that would have otherwise been more difficult to identify. For example, while we found that a 0.015 substitutions/site distance threshold is typical for US-like epidemics, recent outbreaks like the CRF07_BC subtype among men who have sex with men (MSM) in China have been found to have a lower optimal threshold of 0.005 to better capture the transition from injected drug use (IDU) to MSM as the primary risk factor. 
Alternatively, in communities surrounding Lake Victoria in Uganda, where there has been sustained heterosexual transmission for many years, we found that a larger distance threshold is necessary to capture a more risk factor-diverse population with sparse sampling over a longer period of time. Such identification may allow for more informed intervention action by respective public health officials.
Affiliation(s)
- Steven Weaver
- Center for Viral Evolution, Temple University, Philadelphia, PA, USA
- Vanessa Dávila-Conn
- Center for Research in Infectious Diseases, National Institute of Respiratory Diseases, Mexico City, Mexico
- Daniel Ji
- Department of Computer Science & Engineering, UC San Diego, La Jolla, CA 92093, USA
- Hannah Verdonk
- Center for Viral Evolution, Temple University, Philadelphia, PA, USA
- Santiago Ávila-Ríos
- Center for Research in Infectious Diseases, National Institute of Respiratory Diseases, Mexico City, Mexico
- Andrew J Leigh Brown
- School of Biological Sciences, University of Edinburgh, Edinburgh, Scotland, United Kingdom
- Joel O Wertheim
- Department of Medicine, University of California San Diego, La Jolla, CA 92093, USA
3
Optimized phylogenetic clustering of HIV-1 sequence data for public health applications. PLoS Comput Biol 2022; 18:e1010745. [PMID: 36449514 PMCID: PMC9744331 DOI: 10.1371/journal.pcbi.1010745] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2022] [Revised: 12/12/2022] [Accepted: 11/17/2022] [Indexed: 12/02/2022]
Abstract
Clusters of genetically similar infections suggest rapid transmission and may indicate priorities for public health action or reveal underlying epidemiological processes. However, clusters often require user-defined thresholds and are sensitive to non-epidemiological factors, such as non-random sampling. Consequently, the ideal threshold for public health applications varies substantially across settings. Here, we show a method which selects optimal thresholds for phylogenetic (subset tree) clustering based on population. We evaluated this method on HIV-1 pol datasets (n = 14,221 sequences) from four sites in USA (Tennessee, Washington), Canada (Northern Alberta) and China (Beijing). Clusters were defined by tips descending from an ancestral node (with a minimum bootstrap support of 95%) through a series of branches, each with a length below a given threshold. Next, we used pplacer to graft new cases to the fixed tree by maximum likelihood. We evaluated the effect of varying branch-length thresholds on cluster growth as a count outcome by fitting two Poisson regression models: a null model that predicts growth from cluster size, and an alternative model that includes mean collection date as an additional covariate. The alternative model was favoured by AIC across most thresholds, with optimal (greatest difference in AIC) thresholds ranging from 0.007 to 0.013 across sites. The range of optimal thresholds was more variable when re-sampling 80% of the data by location (IQR 0.008 - 0.016, n = 100 replicates). Our results use prospective phylogenetic cluster growth and suggest that there is more variation in effective thresholds for public health than those typically used in clustering studies.
4
Avila-Rios S, García-Morales C, Reyes-Terán G, González-Rodríguez A, Matías-Florentino M, Mehta SR, Chaillon A. Phylodynamics of HIV in the Mexico City Metropolitan Region. J Virol 2022; 96:e0070822. [PMID: 35762759 PMCID: PMC9327710 DOI: 10.1128/jvi.00708-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2022] [Accepted: 06/06/2022] [Indexed: 12/30/2022]
Abstract
Evolutionary analyses of viral sequences can provide insights into transmission dynamics, which in turn can optimize prevention interventions. Here, we characterized the dynamics of HIV transmission within the Mexico City metropolitan area. HIV pol sequences from persons recently diagnosed at the largest HIV clinic in Mexico City (between 2016 and 2021) were annotated with demographic/geographic metadata. A multistep phylogenetic approach was applied to identify putative transmission clades. A data set of publicly available sequences was used to assess international introductions. Clades were analyzed with a discrete phylogeographic model to evaluate the timing and intensity of HIV introductions and transmission dynamics among municipalities in the region. A total of 6,802 sequences across 96 municipalities (5,192 from Mexico City and 1,610 from the neighboring State of Mexico) were included (93.6% cisgender men, 5.0% cisgender women, and 1.3% transgender women); 3,971 of these sequences formed 1,206 clusters, involving 78 municipalities, including 89 clusters of ≥10 sequences. Discrete phylogeographic analysis revealed (i) 1,032 viral introductions into the region, over one-half of which were from the United States, and (ii) 354 migration events between municipalities with high support (adjusted Bayes factor of ≥3). The most frequent viral migrations occurred between northern municipalities within Mexico City, i.e., Cuauhtémoc to Iztapalapa (5.2% of events), Iztapalapa to Gustavo A. Madero (5.4%), and Gustavo A. Madero to Cuauhtémoc (6.5%). Our analysis illustrates the complexity of HIV transmission within the Mexico City metropolitan area but also identifies a spatially active transmission area involving a few municipalities in the north of the city, where targeted interventions could have a more pronounced effect on the entire regional epidemic. 
IMPORTANCE Phylogeographic investigation of the Mexico City HIV epidemic illustrates the complexity of HIV transmission in the region. An active transmission area involving a few municipalities in the north of the city, with transmission links throughout the region, is identified and could be a location where targeted interventions could have a more pronounced effect on the entire regional epidemic, compared with those dispersed in other manners.
Affiliation(s)
- Santiago Avila-Rios
- Center for Research in Infectious Diseases, National Institute of Respiratory Diseases, Mexico City, Mexico
- Claudia García-Morales
- Center for Research in Infectious Diseases, National Institute of Respiratory Diseases, Mexico City, Mexico
- Gustavo Reyes-Terán
- Coordinating Commission of the National Institutes of Health and High Specialty Hospitals, Ministry of Health, Mexico City, Mexico
- Sanjay R. Mehta
- Division of Infectious Diseases and Global Public Health, University of California, San Diego, San Diego, California, USA
- Veterans Affairs Health System, San Diego, California, USA
- Antoine Chaillon
- Division of Infectious Diseases and Global Public Health, University of California, San Diego, San Diego, California, USA
5
Louca S, McLaughlin A, MacPherson A, Joy JB, Pennell MW. Fundamental Identifiability Limits in Molecular Epidemiology. Mol Biol Evol 2021; 38:4010-4024. [PMID: 34009339 PMCID: PMC8382926 DOI: 10.1093/molbev/msab149] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 12/23/2022]
Abstract
Viral phylogenies provide crucial information on the spread of infectious diseases, and many studies fit mathematical models to phylogenetic data to estimate epidemiological parameters such as the effective reproduction ratio (Re) over time. Such phylodynamic inferences often complement or even substitute for conventional surveillance data, particularly when sampling is poor or delayed. It remains generally unknown, however, how robust phylodynamic epidemiological inferences are, especially when there is uncertainty regarding pathogen prevalence and sampling intensity. Here, we use recently developed mathematical techniques to fully characterize the information that can possibly be extracted from serially collected viral phylogenetic data, in the context of the commonly used birth-death-sampling model. We show that for any candidate epidemiological scenario, there exists a myriad of alternative, markedly different, and yet plausible "congruent" scenarios that cannot be distinguished using phylogenetic data alone, no matter how large the data set. In the absence of strong constraints or rate priors across the entire study period, neither maximum-likelihood fitting nor Bayesian inference can reliably reconstruct the true epidemiological dynamics from phylogenetic data alone; rather, estimators can only converge to the "congruence class" of the true dynamics. We propose concrete and feasible strategies for making more robust epidemiological inferences from viral phylogenetic data.
Affiliation(s)
- Stilianos Louca
- Department of Biology, University of Oregon, Eugene, OR, USA
- Institute of Ecology and Evolution, University of Oregon, Eugene, OR, USA
- Angela McLaughlin
- British Columbia Centre for Excellence in HIV/AIDS, Vancouver, BC, Canada
- Bioinformatics, University of British Columbia, Vancouver, BC, Canada
- Ailene MacPherson
- Biodiversity Research Centre, University of British Columbia, Vancouver, BC, Canada
- Department of Zoology, University of British Columbia, Vancouver, BC, Canada
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON, Canada
- Jeffrey B Joy
- British Columbia Centre for Excellence in HIV/AIDS, Vancouver, BC, Canada
- Bioinformatics, University of British Columbia, Vancouver, BC, Canada
- Department of Medicine, University of British Columbia, Vancouver, BC, Canada
- Matthew W Pennell
- Biodiversity Research Centre, University of British Columbia, Vancouver, BC, Canada
- Department of Zoology, University of British Columbia, Vancouver, BC, Canada
6
Chaillon A, Smith DM. Phylogenetic analyses of SARS-CoV-2 B.1.1.7 lineage suggest a single origin followed by multiple exportation events versus convergent evolution. Clin Infect Dis 2021; 73:2314-2317. [PMID: 33772259 PMCID: PMC8083653 DOI: 10.1093/cid/ciab265] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 01/20/2021] [Indexed: 12/21/2022]
Abstract
The emergence of new variants of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) heralds a new phase of the pandemic. This study used state-of-the-art phylodynamic methods to ascertain that the rapid rise of B.1.1.7 “Variant of Concern” most likely occurred by global dispersal rather than convergent evolution from multiple sources.
Affiliation(s)
- A Chaillon
- Division of Infectious Diseases and Global Public Health, University of California San Diego, CA, USA
- D M Smith
- Division of Infectious Diseases and Global Public Health, University of California San Diego, CA, USA
7
Vrancken B, Zhao B, Li X, Han X, Liu H, Zhao J, Zhong P, Lin Y, Zai J, Liu M, Smith DM, Dellicour S, Chaillon A. Comparative Circulation Dynamics of the Five Main HIV Types in China. J Virol 2020; 94:e00683-20. [PMID: 32938762 PMCID: PMC7654276 DOI: 10.1128/jvi.00683-20] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 05/14/2020] [Accepted: 09/02/2020] [Indexed: 01/17/2023]
Abstract
The HIV epidemic in China accounts for 3% of the global HIV incidence. We compared the patterns and determinants of interprovincial spread of the five most prevalent circulating types. HIV pol sequences sampled across China were used to identify relevant transmission networks of the five most relevant HIV-1 types (B and circulating recombinant forms [CRFs] CRF01_AE, CRF07_BC, CRF08_BC, and CRF55_01B) in China. From these, the dispersal history across provinces was inferred. A generalized linear model (GLM) was used to test the association between migration rates among provinces and several measures of human mobility. A total of 10,707 sequences were collected between 2004 and 2017 across 26 provinces, among which 1,962 are newly reported here. A mean of 18 (minimum and maximum, 1 and 54) independent transmission networks involving up to 17 provinces were identified. Discrete phylogeographic analysis largely recapitulates the documented spread of the HIV types, which in turn, mirrors within-China population migration flows to a large extent. In line with the different spatiotemporal spread dynamics, the identified drivers thereof were also heterogeneous but are consistent with a central role of human mobility. The comparative analysis of the dispersal dynamics of the five main HIV types circulating in China suggests a key role of large population centers and developed transportation infrastructures as hubs of HIV dispersal. This advocates for coordinated public health efforts in addition to local targeted interventions.
IMPORTANCE While traditional epidemiological studies are of great interest in describing the dynamics of epidemics, they struggle to fully capture the geospatial dynamics and factors driving the dispersal of pathogens like HIV as they have difficulties capturing linkages between infections.
To overcome this, we used a discrete phylogeographic approach coupled to a generalized linear model extension to characterize the dynamics and drivers of the across-province spread of the five main HIV types circulating in China. Our results indicate that large urbanized areas with dense populations and developed transportation infrastructures are facilitators of HIV dispersal throughout China and highlight the need to consider harmonized country-wide public policies to control local HIV epidemics.
Affiliation(s)
- Bram Vrancken
- Department of Microbiology, Immunology and Transplantation, Rega Institute, Laboratory for Computational and Evolutionary Virology, KU Leuven, Leuven, Belgium
- Bin Zhao
- NHC Key Laboratory of AIDS Immunology (China Medical University), National Clinical Research Center for Laboratory Medicine, The First Affiliated Hospital of China Medical University, Shenyang, China
- Xingguang Li
- Department of Hospital Office, The First People's Hospital of Fangchenggang, Fangchenggang, China
- Xiaoxu Han
- NHC Key Laboratory of AIDS Immunology (China Medical University), National Clinical Research Center for Laboratory Medicine, The First Affiliated Hospital of China Medical University, Shenyang, China
- Haizhou Liu
- Centre for Emerging Infectious Diseases, The State Key Laboratory of Virology, Wuhan Institute of Virology, University of Chinese Academy of Sciences, Wuhan, China
- Jin Zhao
- Shenzhen Center for Disease Control and Prevention, Shenzhen, China
- Ping Zhong
- Department of AIDS and STD, Shanghai Municipal Center for Disease Control and Prevention; Shanghai Municipal Institutes for Preventive Medicine, Shanghai, China
- Yi Lin
- Department of AIDS and STD, Shanghai Municipal Center for Disease Control and Prevention; Shanghai Municipal Institutes for Preventive Medicine, Shanghai, China
- Junjie Zai
- Immunology Innovation Team, School of Medicine, Ningbo University, Ningbo, Zhejiang, China
- Mingchen Liu
- NHC Key Laboratory of AIDS Immunology (China Medical University), National Clinical Research Center for Laboratory Medicine, The First Affiliated Hospital of China Medical University, Shenyang, China
- Davey M Smith
- Division of Infectious Diseases and Global Public Health, University of California San Diego, California, USA
- Simon Dellicour
- Department of Microbiology, Immunology and Transplantation, Rega Institute, Laboratory for Computational and Evolutionary Virology, KU Leuven, Leuven, Belgium
- Spatial Epidemiology Lab (SpELL), Université Libre de Bruxelles, Brussels, Belgium
- Antoine Chaillon
- Division of Infectious Diseases and Global Public Health, University of California San Diego, California, USA
8
Abstract
PURPOSE OF REVIEW A major goal of public health in relation to HIV/AIDS is to prevent new transmissions in communities. Phylogenetic techniques have improved our understanding of the structure and dynamics of HIV transmissions. However, there is still no consensus about phylogenetic methodology, sampling coverage, gene target and/or minimum fragment size. RECENT FINDINGS Several studies use a combined methodology, which includes both a genetic or patristic distance cut-off and a branching support threshold to identify phylogenetic clusters. However, the choice about these thresholds remains an inherently subjective process, which affects the results of these studies. There is still a lack of consensus about the genomic region and the size of fragments that should be used, although there seems to be emerging a consensus that using longer segments, allied with the use of a realistic model of evolution and a codon alignment, increases the likelihood of inferring true transmission clusters. The pol gene is still the most used genomic region, but recent studies have suggested that whole genomes and/or sequences from nef and gp41 are also good targets for cluster reconstruction. SUMMARY The development and application of standard methodologies for phylogenetic clustering analysis will advance our understanding of factors associated with HIV transmission. This will lead to the design of more precise public health interventions.
9
Hong SL, Dellicour S, Vrancken B, Suchard MA, Pyne MT, Hillyard DR, Lemey P, Baele G. In Search of Covariates of HIV-1 Subtype B Spread in the United States-A Cautionary Tale of Large-Scale Bayesian Phylogeography. Viruses 2020; 12:v12020182. [PMID: 32033422 PMCID: PMC7077180 DOI: 10.3390/v12020182] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 12/20/2019] [Revised: 01/24/2020] [Accepted: 01/28/2020] [Indexed: 12/21/2022]
Abstract
Infections with HIV-1 group M subtype B viruses account for the majority of the HIV epidemic in the Western world. Phylogeographic studies have placed the introduction of subtype B in the United States in New York around 1970, where it grew into a major source of spread. Currently, it is estimated that over one million people are living with HIV in the US and that most are infected with subtype B variants. Here, we aim to identify the drivers of HIV-1 subtype B dispersal in the United States by analyzing a collection of 23,588 pol sequences, collected for drug resistance testing from 45 states during 2004-2011. To this end, we introduce a workflow to reduce this large collection of data to more computationally-manageable sample sizes and apply the BEAST framework to test which covariates associate with the spread of HIV-1 across state borders. Our results show that we are able to consistently identify certain predictors of spread under reasonable run times across datasets of up to 10,000 sequences. However, the general lack of phylogenetic structure and the high uncertainty associated with HIV trees make it difficult to interpret the epidemiological relevance of the drivers of spread we are able to identify. While the workflow we present here could be applied to other virus datasets of a similar scale, the characteristic star-like shape of HIV-1 phylogenies poses a serious obstacle to reconstructing a detailed evolutionary and spatial history for HIV-1 subtype B in the US.
Affiliation(s)
- Samuel L. Hong
- Department of Microbiology, Immunology and Transplantation, Rega Institute for Medical Research, KU Leuven, 3000 Leuven, Belgium
- Simon Dellicour
- Department of Microbiology, Immunology and Transplantation, Rega Institute for Medical Research, KU Leuven, 3000 Leuven, Belgium
- Spatial Epidemiology Lab (SpELL), Université Libre de Bruxelles, 1050 Brussels, Belgium
- Bram Vrancken
- Department of Microbiology, Immunology and Transplantation, Rega Institute for Medical Research, KU Leuven, 3000 Leuven, Belgium
- Marc A. Suchard
- Department of Biomathematics, David Geffen School of Medicine at UCLA, University of California, Los Angeles, CA 90095, USA
- Department of Human Genetics, David Geffen School of Medicine at UCLA, University of California, Los Angeles, CA 90095, USA
- Department of Biostatistics, Fielding School of Public Health, University of California, Los Angeles, CA 90095, USA
- Michael T. Pyne
- ARUP Institute for Clinical and Experimental Pathology, Salt Lake City, UT 84108, USA
- David R. Hillyard
- Department of Pathology, University of Utah, Salt Lake City, UT 84112, USA
- Philippe Lemey
- Department of Microbiology, Immunology and Transplantation, Rega Institute for Medical Research, KU Leuven, 3000 Leuven, Belgium
- Guy Baele
- Department of Microbiology, Immunology and Transplantation, Rega Institute for Medical Research, KU Leuven, 3000 Leuven, Belgium
10
Chato C, Kalish ML, Poon AFY. Public health in genetic spaces: a statistical framework to optimize cluster-based outbreak detection. Virus Evol 2020; 6:veaa011. [PMID: 32190349 PMCID: PMC7069216 DOI: 10.1093/ve/veaa011] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 12/24/2022]
Abstract
Genetic clustering is a popular method for characterizing variation in transmission rates for rapidly evolving viruses, and could potentially be used to detect outbreaks in 'near real time'. However, the statistical properties of clustering are poorly understood in this context, and there are no objective guidelines for setting clustering criteria. Here, we develop a new statistical framework to optimize a genetic clustering method based on the ability to forecast new cases. We analysed the pairwise Tamura-Nei (TN93) genetic distances for anonymized HIV-1 subtype B pol sequences from Seattle (n = 1,653) and Middle Tennessee, USA (n = 2,779), and northern Alberta, Canada (n = 809). Under varying TN93 thresholds, we fit two models to the distributions of new cases relative to clusters of known cases: 1, a null model that assumes cluster growth is strictly proportional to cluster size, i.e. no variation in transmission rates among individuals; and 2, a weighted model that incorporates individual-level covariates, such as recency of diagnosis. The optimal threshold maximizes the difference in information loss between models, where covariates are used most effectively. Optimal TN93 thresholds varied substantially between data sets, e.g. 0.0104 in Alberta and 0.016 in Seattle and Tennessee, such that the optimum for one population would potentially misdirect prevention efforts in another. For a given population, the range of thresholds where the weighted model conferred greater predictive accuracy tended to be narrow (±0.005 units), and the optimal threshold tended to be stable over time. Our framework also indicated that variation in the recency of HIV diagnosis among clusters was significantly more predictive of new cases than sample collection dates (ΔAIC > 50). 
These results suggest that one cannot rely on historical precedence or convention to configure genetic clustering methods for public health applications, especially when translating methods between settings of low-level and generalized epidemics. Our framework not only enables investigators to calibrate a clustering method to a specific public health setting, but also provides a variable selection procedure to evaluate different predictive models of cluster growth.
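The selection rule described in this abstract (pick the threshold where the covariate-weighted model most improves on the null model in AIC) can be sketched as below. This is only an illustration of the comparison step, not the authors' code: it assumes the two models have already been fitted at each threshold, and the function names, parameter counts, and log-likelihood values are hypothetical.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def pick_threshold(fits):
    """Pick the threshold with the largest AIC(null) - AIC(weighted).

    fits maps threshold -> (null_loglik, null_k, weighted_loglik, weighted_k).
    Returns the chosen threshold and the per-threshold AIC differences.
    """
    delta = {
        t: aic(ll0, k0) - aic(ll1, k1)
        for t, (ll0, k0, ll1, k1) in fits.items()
    }
    best = max(delta, key=delta.get)
    return best, delta


# Hypothetical fitted log-likelihoods at three candidate TN93 thresholds.
fits = {
    0.008: (-120.0, 1, -118.5, 2),
    0.012: (-110.0, 1, -101.0, 2),
    0.016: (-105.0, 1, -103.0, 2),
}

best, delta = pick_threshold(fits)
print(best, delta)
```

A positive difference means the weighted model loses less information than the null model at that threshold; the rule picks the threshold where that gap is widest.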
Affiliation(s)
- Connor Chato
- Department of Pathology and Laboratory Medicine, Western University, Dental Sciences Building DSB4044, London N6A 5C1, Canada
- Marcia L Kalish
- Division of Infectious Diseases, Department of Medicine, Vanderbilt University School of Medicine, 1161 21st Ave S, Nashville, TN 37232, USA
- Art F Y Poon
- Department of Pathology and Laboratory Medicine, Western University, Dental Sciences Building DSB4044, London N6A 5C1, Canada
- Department of Applied Mathematics, Western University, Middlesex College MC255, London N6A 5B7, Canada
- Department of Microbiology and Immunology, Western University, Dental Science Building DSB3014, London N6A 5C1, Canada
11
Pérez AB, Vrancken B, Chueca N, Aguilera A, Reina G, García-del Toro M, Vera F, Von Wichman MA, Arenas JI, Téllez F, Pineda JA, Omar M, Bernal E, Rivero-Juárez A, Fernández-Fuertes E, de la Iglesia A, Pascasio JM, Lemey P, Garcia F, Cuypers L. Increasing importance of European lineages in seeding the hepatitis C virus subtype 1a epidemic in Spain. Euro Surveill 2019; 24:1800227. [PMID: 30862327 PMCID: PMC6402173 DOI: 10.2807/1560-7917.es.2019.24.9.1800227] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Indexed: 12/29/2022]
Abstract
Background: Reducing the burden of the hepatitis C virus (HCV) requires large-scale deployment of intervention programmes, which can be informed by the dynamic pattern of HCV spread. In Spain, ongoing transmission of HCV is mostly fuelled by people who inject drugs (PWID) infected with subtype 1a (HCV1a).
Aim: Our aim was to map how infections spread within and between populations, which could help formulate more effective intervention programmes to halt the HCV1a epidemic in Spain.
Methods: Epidemiological links between HCV1a viruses from a convenience sample of 283 patients in Spain, mostly PWID, collected between 2014 and 2016, and 1,317, 1,291 and 1,009 samples collected abroad between 1989 and 2016 were reconstructed using sequences covering the NS3, NS5A and NS5B genes. To efficiently do so, fast maximum likelihood-based tree estimation was coupled to a flexible Bayesian discrete phylogeographic inference method.
Results: The transmission network structure of the Spanish HCV1a epidemic was shaped by continuous seeding of HCV1a into Spain, almost exclusively from North America and European countries. The latter became increasingly relevant and have dominated in recent times. Export from Spain to other countries in Europe was also strongly supported, although Spain was a net sink for European HCV1a lineages. Spatial reconstructions showed that the epidemic in Spain is diffuse, without large, dominant within-country networks.
Conclusion: To boost the effectiveness of local intervention efforts, concerted supra-national strategies to control HCV1a transmission are needed, with a strong focus on the most important drivers of ongoing transmission, i.e. PWID and other high-risk populations.
Affiliation(s)
- Ana Belen Pérez
  - Department of Microbiology, Institute of Bio Sanitary Research (IBIS), AIDS Research Network, University Hospital of Granada, Granada, Spain
  - These authors contributed equally to the article
- Bram Vrancken
  - KU Leuven, Department of Microbiology and Immunology, Rega Institute for Medical Research, Laboratory of Evolutionary and Computational Virology, Leuven, Belgium
  - These authors contributed equally to the article
- Natalia Chueca
  - Department of Microbiology, Institute of Bio Sanitary Research (IBIS), AIDS Research Network, University Hospital of Granada, Granada, Spain
- Antonio Aguilera
  - Department of Microbiology, University Hospital of Santiago, Santiago de Compostela, Spain
- Gabriel Reina
  - Department of Microbiology, University Hospital of Navarra, Institute for Health Research (IdisNA), Pamplona, Spain
- Francisco Vera
  - Unit of Infectious Diseases, Internal Medicine, General Hospital of Rosell, Cartagena, Murcia, Spain
- Juan Ignacio Arenas
  - Unit of Infectious Diseases, Hospital Universitario de San Sebastian, San Sebastian, Spain
- Francisco Téllez
  - Unit of Infectious Diseases and Microbiology, University Hospital of Puerto Real, Cádiz, Spain
- Juan A Pineda
  - Unit of Infectious Diseases, University Hospital of Valme, Sevilla, Spain (J.A. Pineda)
- Enrique Bernal
  - Unit of Infectious Diseases, General University Hospital, Murcia, Spain
- Antonio Rivero-Juárez
  - Unit of Infectious Diseases, University Hospital Reina Sofía of Córdoba, Maimonides Institute of Biomedical Research of Córdoba, University of Córdoba, Córdoba, Spain
- Juan Manuel Pascasio
  - Clinical Management Unit of Digestive Diseases, University Hospital of Virgen del Rocío, Sevilla, Spain
- Philippe Lemey
  - KU Leuven, Department of Microbiology and Immunology, Rega Institute for Medical Research, Laboratory of Evolutionary and Computational Virology, Leuven, Belgium
- Féderico Garcia
  - Department of Microbiology, Institute of Bio Sanitary Research (IBIS), AIDS Research Network, University Hospital of Granada, Granada, Spain
  - These authors contributed equally to the article
- Lize Cuypers
  - KU Leuven, Department of Microbiology and Immunology, Rega Institute for Medical Research, Laboratory of Clinical and Epidemiological Virology, Leuven, Belgium
  - These authors contributed equally to the article
12
Jovanović L, Šiljić M, Ćirković V, Salemović D, Pešić-Pavlović I, Todorović M, Ranin J, Jevtović D, Stanojević M. Exploring Evolutionary and Transmission Dynamics of HIV Epidemic in Serbia: Bridging Socio-Demographic With Phylogenetic Approach. Front Microbiol 2019; 10:287. [PMID: 30858834 PMCID: PMC6397891 DOI: 10.3389/fmicb.2019.00287] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 10/14/2018] [Accepted: 02/04/2019] [Indexed: 12/04/2022] Open
Abstract
Previous molecular studies of the Serbian HIV epidemic identified the dominance of subtype B and the presence of clusters related to HIV-1 transmission, in particular among men who have sex with men (MSM). To gain a deeper understanding of the complexities of HIV sub-epidemics in Serbia, epidemic trends, temporal origin and phylodynamic characteristics in the general population and in subpopulations were analyzed by means of mathematical modeling, phylogenetic analysis and latent class analysis (LCA). A logistic curve was fitted to the cumulative annual number of new HIV cases in 1984–2016, in the general population and in the MSM transmission group. Both datasets fitted the logistic growth model, showing the early exponential phase of the growth curve. According to the suggested model, the number of newly diagnosed HIV cases in Serbia will continue to grow until 2030, in particular in the MSM transmission group. Further, a detailed phylogenetic analysis was performed on 385 sequences from the period 1997–2015. Transmission clusters were identified, and population growth (Ne), the effective reproductive number (Re) and the time of the most recent common ancestor (tMRCA) were estimated, employing Bayesian and maximum likelihood methods. A substantial proportion (53%) of subtype B sequences was found within transmission clusters/networks. Phylodynamic analysis revealed Re above one during the whole period investigated, with the steepest slopes and a recent tMRCA for MSM transmission group subtype B clades, in line with a growing trend in the number of transmissions in the years approaching the end of the study period. In contrast, heterosexual clades in both studied subtypes, B and C, showed modest growth and stagnation.
LCA identified five latent classes, with transmission clusters predominantly present in two of the five classes, linked to MSM transmission, living in the capital city and a high prevalence of co-infection with HBV and/or other STIs. The presented findings imply that the HIV epidemic in Serbia is still in the exponential growth phase, in particular in relation to MSM transmission, with an estimated steep growth curve until 2030. The obtained results imply that an average new HIV patient in Serbia is a young man with a concomitant sexually transmitted infection.
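The logistic-curve fitting mentioned in this abstract can be illustrated with a small sketch. It assumes the common three-parameter logistic form K / (1 + exp(-r (t - t0))), since the abstract does not state the parameterization the authors used, and it swaps in a coarse grid search over candidate parameters for a proper nonlinear least-squares fit. All function names and parameter grids are hypothetical, and no real case counts are used.

```python
import math
from itertools import product


def logistic(t, K, r, t0):
    """Cumulative cases at time t under a three-parameter logistic curve:
    K is the plateau, r the growth rate, t0 the midpoint year."""
    return K / (1.0 + math.exp(-r * (t - t0)))


def fit_logistic_grid(years, cumulative, K_grid, r_grid, t0_grid):
    """Return the (K, r, t0) combination on the grid that minimizes the
    sum of squared errors against the observed cumulative counts."""
    def sse(params):
        K, r, t0 = params
        return sum((logistic(t, K, r, t0) - y) ** 2
                   for t, y in zip(years, cumulative))

    return min(product(K_grid, r_grid, t0_grid), key=sse)


def still_accelerating(last_year, t0):
    """Yearly increments of a logistic curve rise until the midpoint t0,
    so data ending before t0 lie in the early, near-exponential phase."""
    return last_year < t0
```

With data that stop well before the fitted midpoint, the plateau K and the midpoint t0 are only weakly constrained, so long-range projections from such a fit carry wide uncertainty.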
Affiliation(s)
- Luka Jovanović
  - Institute of Microbiology and Immunology, Faculty of Medicine, University of Belgrade, Belgrade, Serbia
- Marina Šiljić
  - Institute of Microbiology and Immunology, Faculty of Medicine, University of Belgrade, Belgrade, Serbia
- Valentina Ćirković
  - Institute of Microbiology and Immunology, Faculty of Medicine, University of Belgrade, Belgrade, Serbia
- Dubravka Salemović
  - Infectious and Tropical Diseases University Hospital, Clinical Centre of Serbia, Belgrade, Serbia
- Ivana Pešić-Pavlović
  - Virology Laboratory, Microbiology Department, Clinical Centre of Serbia, Belgrade, Serbia
- Marija Todorović
  - Institute of Microbiology and Immunology, Faculty of Medicine, University of Belgrade, Belgrade, Serbia
- Jovan Ranin
  - Infectious and Tropical Diseases University Hospital, Clinical Centre of Serbia, Belgrade, Serbia
- Djordje Jevtović
  - Infectious and Tropical Diseases University Hospital, Clinical Centre of Serbia, Belgrade, Serbia
- Maja Stanojević
  - Institute of Microbiology and Immunology, Faculty of Medicine, University of Belgrade, Belgrade, Serbia