1
Li H, Marin M, Farhat MR. Exploring gene content with pangene graphs. ARXIV 2024:arXiv:2402.16185v3. [PMID: 38463499 PMCID: PMC10925376] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]
Abstract
Motivation: The gene content regulates the biology of an organism. It varies between species and between individuals of the same species. Although tools have been developed to identify gene content changes in bacterial genomes, none is applicable to collections of large eukaryotic genomes such as the human pangenome. Results: We developed pangene, a computational tool to identify gene orientation, gene order and gene copy-number changes in a collection of genomes. Pangene aligns a set of input protein sequences to the genomes, resolves redundancies between protein sequences and constructs a gene graph with each genome represented as a walk in the graph. It additionally finds subgraphs, which we call bibubbles, that capture gene content changes. Applied to the human pangenome, pangene identifies known gene-level variations and reveals complex haplotypes that have not been well studied before. Pangene also works with high-quality bacterial pangenomes and reports similar numbers of core and accessory genes in comparison to existing tools. Availability and implementation: Source code at https://github.com/lh3/pangene; pre-built pangene graphs can be downloaded from https://zenodo.org/records/8118576 and visualized at https://pangene.bioinweb.org.
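Illustrative note: the idea of a gene graph in which each genome is a walk can be sketched in a few lines of Python. This is a toy simplification, not pangene's implementation: it skips protein-to-genome alignment, redundancy resolution and bibubble detection, and uses a plain directed graph. The function names (`build_gene_graph`, `split_core_accessory`) and the three toy genomes are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_gene_graph(walks):
    """Build a toy gene graph from per-genome walks.

    `walks` maps a genome name to an ordered list of oriented genes,
    each written as (gene_name, strand) with strand '+' or '-'.
    Returns (edges, gene_genomes): `edges` maps an arc between two
    consecutive oriented genes to the set of genomes traversing it;
    `gene_genomes` maps each gene name to the genomes containing it.
    """
    edges = defaultdict(set)
    gene_genomes = defaultdict(set)
    for genome, walk in walks.items():
        for gene, _strand in walk:
            gene_genomes[gene].add(genome)
        # Consecutive oriented genes in a walk define one arc.
        for left, right in zip(walk, walk[1:]):
            edges[(left, right)].add(genome)
    return dict(edges), dict(gene_genomes)


def split_core_accessory(gene_genomes, n_genomes):
    """Core genes occur in every genome; accessory genes in only some."""
    core = sorted(g for g, s in gene_genomes.items() if len(s) == n_genomes)
    accessory = sorted(g for g, s in gene_genomes.items() if len(s) < n_genomes)
    return core, accessory


# Hypothetical toy input: g2 carries gene B in the opposite orientation,
# g3 lacks B and has an extra gene D.
walks = {
    "g1": [("A", "+"), ("B", "+"), ("C", "+")],
    "g2": [("A", "+"), ("B", "-"), ("C", "+")],
    "g3": [("A", "+"), ("C", "+"), ("D", "+")],
}

edges, gene_genomes = build_gene_graph(walks)
core, accessory = split_core_accessory(gene_genomes, len(walks))
print("core:", core)
print("accessory:", accessory)
```

In this toy input the orientation change in g2 and the gene gain/loss in g3 each show up as arcs used by only one genome, which is the kind of local divergence between walks that a gene graph makes visible.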
Affiliation(s)
- Heng Li
- Dana-Farber Cancer Institute, 450 Brookline Ave, Boston, MA 02215, USA
- Harvard Medical School, 10 Shattuck St, Boston, MA 02215, USA
- Broad Institute of Harvard and MIT, 415 Main St, Cambridge, MA 02142, USA
- Maha Reda Farhat
- Harvard Medical School, 10 Shattuck St, Boston, MA 02215, USA
- Pulmonary and Critical Care Medicine, Massachusetts General Hospital, Boston, USA
2
Marin MG, Wippel C, Quinones-Olvera N, Behruznia M, Jeffrey BM, Harris M, Mann BC, Rosenthal A, Jacobson KR, Warren RM, Li H, Meehan CJ, Farhat MR. Analysis of the limited M. tuberculosis accessory genome reveals potential pitfalls of pan-genome analysis approaches. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.03.21.586149. [PMID: 38585972 PMCID: PMC10996470 DOI: 10.1101/2024.03.21.586149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/09/2024]
Abstract
Pan-genome analysis is a fundamental tool for studying bacterial genome evolution; however, the variety of methods used to define and measure the pan-genome poses challenges to the interpretation and reliability of results. To quantify sources of bias and error related to common pan-genome analysis approaches, we evaluated different approaches applied to a curated collection of 151 Mycobacterium tuberculosis (Mtb) isolates. Mtb is characterized by its clonal evolution, absence of horizontal gene transfer, and limited accessory genome, making it an ideal test case for this study. Using a state-of-the-art graph-genome approach, we found that a majority of the structural variation observed in Mtb originates from rearrangement, deletion, and duplication of redundant nucleotide sequences. In contrast, we found that pan-genome analyses that focus on comparison of coding sequences (at the amino acid level) can yield surprisingly variable results, driven by differences in assembly quality and the software used. Upon closer inspection, we found that coding sequence annotation discrepancies were a major contributor to inflated Mtb accessory genome estimates. To address this, we developed panqc, software that detects annotation discrepancies and collapses nucleotide redundancy in pan-genome estimates. When applied to Mtb and E. coli pan-genomes, panqc exposed distinct biases influenced by the genomic diversity of the population studied. Our findings underscore the need for careful methodological selection and quality control to accurately map the evolutionary dynamics of a bacterial species.
3
Deng MZ, Liu Q, Cui SJ, Fu H, Gan M, Xu YY, Cai X, Sha W, Zhao GP, Fortune SM, Lyu LD. Mycobacterial DnaQ is an Alternative Proofreader Ensuring DNA Replication Fidelity. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.10.24.563508. [PMID: 37961690 PMCID: PMC10634781 DOI: 10.1101/2023.10.24.563508] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
Removal of mis-incorporated nucleotides ensures replicative fidelity. Although the ε-exonuclease DnaQ is a well-established proofreader in the model organism Escherichia coli, proofreading in mycobacteria relies on the polymerase and histidinol phosphatase (PHP) domain of replicative polymerase despite the presence of an alternative DnaQ homolog. Here, we show that depletion of DnaQ in Mycolicibacterium smegmatis results in increased mutation rate, leading to AT-biased mutagenesis and elevated insertions/deletions in homopolymer tracts. We demonstrated that mycobacterial DnaQ binds to the β-clamp and functions synergistically with the PHP domain to correct replication errors. Further, we found that the mycobacterial DnaQ sustains replicative fidelity upon chromosome topological stress. Intriguingly, we showed that a naturally evolved DnaQ variant prevalent in clinical Mycobacterium tuberculosis isolates enables hypermutability and is associated with extensive drug resistance. These results collectively establish that the alternative DnaQ functions in proofreading, and thus reveal that mycobacteria deploy two proofreaders to maintain replicative fidelity.
Affiliation(s)
- Ming-Zhi Deng
- Key Laboratory of Medical Molecular Virology of the Ministry of Education/Ministry of Health (MOE/NHC), School of Basic Medical Sciences, Fudan University, Shanghai 200032, P.R.China
- These authors contributed equally
- Qingyun Liu
- Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA 02115
- These authors contributed equally
- Shu-Jun Cui
- Key Laboratory of Medical Molecular Virology of the Ministry of Education/Ministry of Health (MOE/NHC), School of Basic Medical Sciences, Fudan University, Shanghai 200032, P.R.China
- Department of Microbiology and Microbial Engineering, School of Life Sciences, Fudan University, Shanghai, 200433, P.R.China
- Han Fu
- Key Laboratory of Medical Molecular Virology of the Ministry of Education/Ministry of Health (MOE/NHC), School of Basic Medical Sciences, Fudan University, Shanghai 200032, P.R.China
- CAS Key Laboratory of Synthetic Biology, CAS Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences (CAS), Shanghai 200032, P.R.China
- University of Chinese Academy of Sciences, Beijing 100049, P.R.China
- Mingyu Gan
- Center for Molecular Medicine, Children’s Hospital of Fudan University, National Children’s Medical Center, Shanghai, 201102, P.R.China
- Yuan-Yuan Xu
- Key Laboratory of Medical Molecular Virology of the Ministry of Education/Ministry of Health (MOE/NHC), School of Basic Medical Sciences, Fudan University, Shanghai 200032, P.R.China
- Xia Cai
- Key Laboratory of Medical Molecular Virology of the Ministry of Education/Ministry of Health (MOE/NHC), School of Basic Medical Sciences, Fudan University, Shanghai 200032, P.R.China
- Wei Sha
- Shanghai Clinical Research Center for Tuberculosis, Shanghai Key Laboratory of Tuberculosis, Shanghai Pulmonary Hospital, Shanghai 200433, P.R.China
- Guo-Ping Zhao
- Department of Microbiology and Microbial Engineering, School of Life Sciences, Fudan University, Shanghai, 200433, P.R.China
- CAS Key Laboratory of Synthetic Biology, CAS Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences (CAS), Shanghai 200032, P.R.China
- University of Chinese Academy of Sciences, Beijing 100049, P.R.China
- Sarah M. Fortune
- Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA 02115
- Liang-Dong Lyu
- Key Laboratory of Medical Molecular Virology of the Ministry of Education/Ministry of Health (MOE/NHC), School of Basic Medical Sciences, Fudan University, Shanghai 200032, P.R.China
- Shanghai Clinical Research Center for Tuberculosis, Shanghai Key Laboratory of Tuberculosis, Shanghai Pulmonary Hospital, Shanghai 200433, P.R.China
4
Mejía-Ponce PM, Ramos-González EJ, Ramos-García AA, Lara-Ramírez EE, Soriano-Herrera AR, Medellín-Luna MF, Valdez-Salazar F, Castro-Garay CY, Núñez-Contreras JJ, De Donato-Capote M, Sharma A, Castañeda-Delgado JE, Zenteno-Cuevas R, Enciso-Moreno JA, Licona-Cassani C. Genomic epidemiology analysis of drug-resistant Mycobacterium tuberculosis distributed in Mexico. PLoS One 2023; 18:e0292965. [PMID: 37831695 PMCID: PMC10575498 DOI: 10.1371/journal.pone.0292965] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2023] [Accepted: 09/29/2023] [Indexed: 10/15/2023]
Abstract
Genomics has revolutionized pathogen surveillance, particularly in epidemiological studies, the detection of drug-resistant strains, and disease control. Despite its potential, the representation of Latin American countries in the genomic catalogues of Mycobacterium tuberculosis (Mtb), the bacterium responsible for tuberculosis (TB), remains limited. In this study, we present a whole genome sequencing (WGS)-based analysis of 85 Mtb clinical strains from 17 Mexican states, providing insights into local adaptations and drug resistance signatures in the region. Our results reveal that the Euro-American lineage (L4) accounts for 94% of our dataset, with sublineages 4.1.2.1 (Haarlem, n = 32) and 4.1.1.3 (X-type, n = 34) the most prevalent. We report the presence of the 4.1.1.3 sublineage, which is endemic to Mexico, in six additional locations beyond previous reports. Phenotypic drug resistance tests showed that 34 out of 85 Mtb samples were resistant, exhibiting a variety of resistance profiles to the first-line antibiotics tested. We observed high levels of discrepancy between phenotype and genotype associated with drug resistance in our dataset, including pyrazinamide-monoresistant Mtb strains lacking canonical variants of drug resistance. Expanding the Latin American Mtb genome databases will enhance our understanding of TB epidemiology and potentially provide new avenues for controlling the disease in the region.
Affiliation(s)
- Paulina M. Mejía-Ponce
- Centro de Biotecnología FEMSA, Escuela de Ingeniería y Ciencias, Tecnológico de Monterrey, Nuevo León, México
- Elsy J. Ramos-González
- Unidad de Investigación Biomédica de Zacatecas, Instituto Mexicano Del Seguro Social, Zacatecas, México
- Axel A. Ramos-García
- Centro de Biotecnología FEMSA, Escuela de Ingeniería y Ciencias, Tecnológico de Monterrey, Nuevo León, México
- Edgar E. Lara-Ramírez
- Unidad de Investigación Biomédica de Zacatecas, Instituto Mexicano Del Seguro Social, Zacatecas, México
- Alma R. Soriano-Herrera
- Unidad de Investigación Biomédica de Zacatecas, Instituto Mexicano Del Seguro Social, Zacatecas, México
- Mitzy F. Medellín-Luna
- Unidad de Investigación Biomédica de Zacatecas, Instituto Mexicano Del Seguro Social, Zacatecas, México
- Posgrado en Ciencias Farmacobiológicas, Universidad Autónoma de San Luis Potosí, San Luis Potosí, México
- Fernando Valdez-Salazar
- Unidad de Investigación Biomédica de Zacatecas, Instituto Mexicano Del Seguro Social, Zacatecas, México
- Claudia Y. Castro-Garay
- Unidad de Investigación Biomédica de Zacatecas, Instituto Mexicano Del Seguro Social, Zacatecas, México
- José J. Núñez-Contreras
- Unidad de Investigación Biomédica de Zacatecas, Instituto Mexicano Del Seguro Social, Zacatecas, México
- Ashutosh Sharma
- Escuela de Ingeniería y Ciencias, Tecnológico de Monterrey, Querétaro, México
- Julio E. Castañeda-Delgado
- Unidad de Investigación Biomédica de Zacatecas, Instituto Mexicano Del Seguro Social, Zacatecas, México
- Consejo Nacional de Ciencia y Tecnología, CONACYT, Ciudad de México, México
- Roberto Zenteno-Cuevas
- Instituto de Salud Pública, Universidad Veracruzana, Veracruz, México
- Red Multidisciplinaria de Investigación en Tuberculosis, Ciudad de México, México
- Jose Antonio Enciso-Moreno
- Unidad de Investigación Biomédica de Zacatecas, Instituto Mexicano Del Seguro Social, Zacatecas, México
- Facultad de Química, Universidad Autónoma de Querétaro, Querétaro, México
- Cuauhtémoc Licona-Cassani
- Centro de Biotecnología FEMSA, Escuela de Ingeniería y Ciencias, Tecnológico de Monterrey, Nuevo León, México
- Red Multidisciplinaria de Investigación en Tuberculosis, Ciudad de México, México
- Division of Integrative Biology, The Institute for Obesity Research, Tecnológico de Monterrey, Nuevo León, México
5
Gómez-González PJ, Grabowska AD, Tientcheu LD, Tsolaki AG, Hibberd ML, Campino S, Phelan JE, Clark TG. Functional genetic variation in pe/ppe genes contributes to diversity in Mycobacterium tuberculosis lineages and potential interactions with the human host. Front Microbiol 2023; 14:1244319. [PMID: 37876785 PMCID: PMC10591178 DOI: 10.3389/fmicb.2023.1244319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2023] [Accepted: 09/21/2023] [Indexed: 10/26/2023]
Abstract
Introduction: Around 10% of the coding potential of Mycobacterium tuberculosis is constituted by two poorly understood gene families, the pe and ppe loci, thought to be involved in host-pathogen interactions. Their repetitive nature and high GC content have hindered sequence analysis, leading to exclusion from whole-genome studies. Understanding the genetic diversity of pe/ppe families is essential to facilitate their potential translation into tools for tuberculosis prevention and treatment. Methods: To investigate the genetic diversity of the 169 pe/ppe genes, we performed a sequence analysis across 73 long-read assemblies representing seven different lineages of M. tuberculosis and M. bovis BCG. Individual pe/ppe gene alignments were extracted, and diversity and conservation across the different lineages were studied. Results: The pe/ppe genes were classified into three groups based on the level of protein sequence conservation relative to H37Rv, finding that >50% were conserved, with indels in pe_pgrs and ppe_mptr sub-families being major drivers of structural variation. Gene rearrangements, such as duplications and gene fusions, were observed between pe and pe_pgrs genes. Inter-lineage diversity revealed lineage-specific SNPs and indels. Discussion: The high level of pe/ppe gene conservation, together with the lineage-specific findings, suggests their phylogenetic informativeness. However, structural variants and gene rearrangements differing from the reference were also identified, with potential implications for pathogenicity. Overall, improving our knowledge of these complex gene families may provide insights into pathogenicity and inform the development of much-needed tools for tuberculosis control.
Affiliation(s)
- Anna D. Grabowska
- Department of Biophysics, Physiology and Pathophysiology, Medical University of Warsaw, Warsaw, Poland
- Leopold D. Tientcheu
- MRC Unit, The Gambia at the London School of Hygiene and Tropical Medicine, Vaccines and Immunity Theme, Fajara, The Gambia
- Anthony G. Tsolaki
- Department of Life Sciences, Brunel University London, Uxbridge, United Kingdom
- Martin L. Hibberd
- Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Susana Campino
- Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Jody E. Phelan
- Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Taane G. Clark
- Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, London, United Kingdom
6
Commins N, Sullivan MR, McGowen K, Koch EM, Rubin EJ, Farhat M. Mutation rates and adaptive variation among the clinically dominant clusters of Mycobacterium abscessus. Proc Natl Acad Sci U S A 2023; 120:e2302033120. [PMID: 37216535 PMCID: PMC10235944 DOI: 10.1073/pnas.2302033120] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/09/2023] [Accepted: 04/13/2023] [Indexed: 05/24/2023]
Abstract
Mycobacterium abscessus (Mab) is a multidrug-resistant pathogen increasingly responsible for severe pulmonary infections. Analysis of whole-genome sequences (WGS) of Mab demonstrates dense genetic clustering of clinical isolates collected from disparate geographic locations. This has been interpreted as supporting patient-to-patient transmission, but epidemiological studies have contradicted this interpretation. Here, we present evidence for a slowing of the Mab molecular clock rate coincident with the emergence of phylogenetic clusters. We performed phylogenetic inference using publicly available WGS from 483 Mab patient isolates. We implement a subsampling approach in combination with coalescent analysis to estimate the molecular clock rate along the long internal branches of the tree, indicating a faster long-term molecular clock rate compared to branches within phylogenetic clusters. We used ancestry simulation to predict the effects of clock rate variation on phylogenetic clustering and found that the degree of clustering in the observed phylogeny is more easily explained by a clock rate slowdown than by transmission. We also find that phylogenetic clusters are enriched in mutations affecting DNA repair machinery and report that clustered isolates have lower spontaneous mutation rates in vitro. We propose that Mab adaptation to the host environment through variation in DNA repair genes affects the organism's mutation rate and that this manifests as phylogenetic clustering. These results challenge the model that phylogenetic clustering in Mab is explained by person-to-person transmission and inform our understanding of transmission inference in emerging, facultative pathogens.
Affiliation(s)
- Nicoletta Commins
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115
- Mark R. Sullivan
- Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA 02115
- Kerry McGowen
- Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA 02115
- Evan M. Koch
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115
- Eric J. Rubin
- Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA 02115
- Department of Microbiology, Harvard Medical School, Boston, MA 02115
- Maha Farhat
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115
- Division of Pulmonary and Critical Care Medicine, Massachusetts General Hospital, Boston, MA 02114
7
Di Marco F, Spitaleri A, Battaglia S, Batignani V, Cabibbe AM, Cirillo DM. Advantages of long- and short-reads sequencing for the hybrid investigation of the Mycobacterium tuberculosis genome. Front Microbiol 2023; 14:1104456. [PMID: 36819039 PMCID: PMC9932330 DOI: 10.3389/fmicb.2023.1104456] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Accepted: 01/16/2023] [Indexed: 02/05/2023]
Abstract
Introduction: In the fight to limit the global spread of antibiotic resistance, computational challenges associated with sequencing technology can affect the accuracy of downstream analyses, including drug resistance identification, transmission, and genome resolution. About 10% of the Mycobacterium tuberculosis (MTB) genome consists of the PE/PPE family, a GC-rich, repetitive genome region. Although short-read sequencing is widely used, its limitations in the PE/PPE regions are well recognized, owing to ambiguous read mapping onto the reference genome. The aim of this study was to compare the performance of short-read (SRS), long-read (LRS) and hybrid-read (HYBR) based analyses across common investigative tasks: genome coverage estimation, variant calling and cluster analysis, drug resistance detection, and de novo assembly. Methods: For the study, 13 model MTB clinical isolates were sequenced with both SRS and LRS. HYBR were produced by correcting the long reads with the short reads. The FASTQ files from the three approaches were then processed using a customized version of MTBseq for genome coverage estimation and variant calling, and using two different assemblers for de novo assembly evaluation. Results: Genome coverage estimation showed lower 8X breadth of coverage for SRS than for LRS and HYBR. For the PE/PPE genes, SRS performed poorly in the PE_PGRS family but obtained acceptable coverage in PE and PPE genes, whereas LRS and HYBR reached optimal coverage in PE/PPE genes. For variant calling, HYBR showed the highest resolution, detecting the highest percentage of uniquely identified mutations compared to LRS and SRS. All three approaches agreed on the identification of two major clusters, with HYBR identifying a higher number of SNPs between the two clusters. Comparing the quality of the assemblies, HYBR and LRS obtained better results than SRS.
Discussion: In conclusion, depending on the aim of the investigation, SRS and LRS present complementary advantages and limitations. For a full resolution of MTB genomes, where all the mentioned analyses and both technologies are needed, the HYBR approach represents a valid option and a well-rounded strategy.
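Illustrative note: the "8X breadth" metric in the abstract above can be read as the fraction of reference positions covered by at least 8 reads. That reading is an assumption here; the abstract does not define the metric, and the sketch below is not MTBseq code. The function name `breadth_of_coverage` and the depth values are hypothetical.

```python
def breadth_of_coverage(depths, min_depth=8):
    """Return the fraction of positions whose read depth is >= min_depth.

    `depths` is a sequence of per-position read depths over a region
    (for example a gene or a whole genome).
    """
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for d in depths if d >= min_depth)
    return covered / len(depths)


# Hypothetical per-position depths over a 10-bp region.
region_depths = [0, 3, 8, 12, 20, 7, 9, 8, 1, 30]
breadth_8x = breadth_of_coverage(region_depths, min_depth=8)
print(f"8X breadth: {breadth_8x:.2f}")  # prints "8X breadth: 0.60"
```

Computing the same fraction separately over repetitive regions such as PE_PGRS genes and over the rest of the genome is one simple way to compare how sequencing approaches differ by region.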
Affiliation(s)
- Federico Di Marco
- Emerging Bacterial Pathogens Unit, IRCCS San Raffaele Scientific Institute, Milan, Italy; Fondazione Centro San Raffaele, Milan, Italy
- Andrea Spitaleri
- Emerging Bacterial Pathogens Unit, IRCCS San Raffaele Scientific Institute, Milan, Italy; Università Vita Salute San Raffaele, Milan, Italy
- Simone Battaglia
- Emerging Bacterial Pathogens Unit, IRCCS San Raffaele Scientific Institute, Milan, Italy
- Virginia Batignani
- Emerging Bacterial Pathogens Unit, IRCCS San Raffaele Scientific Institute, Milan, Italy
- Daniela Maria Cirillo
- Emerging Bacterial Pathogens Unit, IRCCS San Raffaele Scientific Institute, Milan, Italy; *Correspondence: Daniela Maria Cirillo
8
Liu Q, Zhu J, Dulberger CL, Stanley S, Wilson S, Chung ES, Wang X, Culviner P, Liu YJ, Hicks ND, Babunovic GH, Giffen SR, Aldridge BB, Garner EC, Rubin EJ, Chao MC, Fortune SM. Tuberculosis treatment failure associated with evolution of antibiotic resilience. Science 2022; 378:1111-1118. [PMID: 36480634 PMCID: PMC9968493 DOI: 10.1126/science.abq2787] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Indexed: 12/13/2022]
Abstract
The widespread use of antibiotics has placed bacterial pathogens under intense pressure to evolve new survival mechanisms. Genomic analysis of 51,229 Mycobacterium tuberculosis (Mtb) clinical isolates has identified an essential transcriptional regulator, Rv1830, herein called resR for resilience regulator, as a frequent target of positive (adaptive) selection. resR mutants do not show canonical drug resistance or drug tolerance but instead shorten the post-antibiotic effect, meaning that they enable Mtb to resume growth after drug exposure substantially faster than wild-type strains. We refer to this phenotype as antibiotic resilience. ResR acts in a regulatory cascade with other transcription factors controlling cell growth and division, which are also under positive selection in clinical isolates of Mtb. Mutations of these genes are associated with treatment failure and the acquisition of canonical drug resistance.
Affiliation(s)
- Qingyun Liu
- Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA
- Junhao Zhu
- Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA
- Charles L. Dulberger
- Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA; Department of Molecular and Cellular Biology, Harvard University, Boston, MA, USA
- Sydney Stanley
- Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA
- Sean Wilson
- Department of Molecular and Cellular Biology, Harvard University, Boston, MA, USA
- Eun Seon Chung
- Department of Molecular Biology and Microbiology, Tufts University School of Medicine, Boston, MA 02111, USA; Department of Biomedical Engineering, Tufts University School of Engineering, Medford, MA 02115, USA
- Xin Wang
- Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA
- Peter Culviner
- Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA
- Yue J. Liu
- Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA
- Nathan D. Hicks
- Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA
- Gregory H. Babunovic
- Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA
- Samantha R. Giffen
- Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA
- Bree B. Aldridge
- Department of Molecular Biology and Microbiology, Tufts University School of Medicine, Boston, MA 02111, USA; Department of Biomedical Engineering, Tufts University School of Engineering, Medford, MA 02115, USA
- Ethan C. Garner
- Department of Molecular and Cellular Biology, Harvard University, Boston, MA, USA
- Eric J. Rubin
- Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA
- Michael C. Chao
- Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA
- Sarah M. Fortune
- Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA; Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA. Corresponding author.
9
Gómez-González PJ, Campino S, Phelan JE, Clark TG. Portable sequencing of Mycobacterium tuberculosis for clinical and epidemiological applications. Brief Bioinform 2022; 23:6650479. [PMID: 35894606 PMCID: PMC9487601 DOI: 10.1093/bib/bbac256] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2022] [Revised: 05/23/2022] [Accepted: 06/01/2022] [Indexed: 11/14/2022]
Abstract
With >1 million associated deaths in 2020, human tuberculosis (TB) caused by the bacterium Mycobacterium tuberculosis remains one of the deadliest infectious diseases. A plethora of genomic tools and bioinformatics pipelines have become available in recent years to assist the whole genome sequencing of M. tuberculosis. The Oxford Nanopore Technologies (ONT) portable sequencer is a promising platform for cost-effective application in clinics, including personalizing treatment through detection of drug resistance-associated mutations, or in the field, to assist epidemiological and transmission investigations. In this study, we performed a comparison of 10 clinical isolates with DNA sequenced on both long-read ONT and (gold standard) short-read Illumina HiSeq platforms. Our analysis demonstrates the robustness of the ONT variant calling for single nucleotide polymorphisms, despite the high error rate. Moreover, because of improved coverage in repetitive regions where short sequencing reads fail to align accurately, ONT data analysis can incorporate additional regions of the genome usually excluded (e.g. pe/ppe genes). The resulting extra resolution can improve the characterization of transmission clusters and dynamics based on inferring closely related isolates. High concordance in variants in loci associated with drug resistance supports its use for the rapid detection of resistance mutations. Overall, ONT sequencing is a promising tool for TB genomic investigations, particularly to inform clinical and surveillance decision-making to reduce the disease burden.
Affiliation(s)
- Paula J Gómez-González
- Faculty of Infectious and Tropical Diseases, London School of Hygiene & Tropical Medicine, WC1E 7HT London, UK
- Susana Campino
- Faculty of Infectious and Tropical Diseases, London School of Hygiene & Tropical Medicine, WC1E 7HT London, UK
- Jody E Phelan
- Faculty of Infectious and Tropical Diseases, London School of Hygiene & Tropical Medicine, WC1E 7HT London, UK
- Taane G Clark
- Faculty of Infectious and Tropical Diseases, London School of Hygiene & Tropical Medicine, WC1E 7HT London, UK; Faculty of Epidemiology and Population Health, London School of Hygiene & Tropical Medicine, WC1E 7HT London, UK
10
Green AG, Yoon CH, Chen ML, Ektefaie Y, Fina M, Freschi L, Gröschel MI, Kohane I, Beam A, Farhat M. A convolutional neural network highlights mutations relevant to antimicrobial resistance in Mycobacterium tuberculosis. Nat Commun 2022; 13:3817. [PMID: 35780211 PMCID: PMC9250494 DOI: 10.1038/s41467-022-31236-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/13/2021] [Accepted: 06/10/2022] [Indexed: 11/30/2022]
Abstract
Long diagnostic wait times hinder international efforts to address antibiotic resistance in M. tuberculosis. Pathogen whole genome sequencing, coupled with statistical and machine learning models, offers a promising solution. However, generalizability and clinical adoption have been limited by a lack of interpretability, especially in deep learning methods. Here, we present two deep convolutional neural networks that predict antibiotic resistance phenotypes of M. tuberculosis isolates: a multi-drug CNN (MD-CNN), that predicts resistance to 13 antibiotics based on 18 genomic loci, with AUCs 82.6-99.5% and higher sensitivity than state-of-the-art methods; and a set of 13 single-drug CNNs (SD-CNN) with AUCs 80.1-97.1% and higher specificity than the previous state-of-the-art. Using saliency methods to evaluate the contribution of input sequence features to the SD-CNN predictions, we identify 18 sites in the genome not previously associated with resistance. The CNN models permit functional variant discovery, biologically meaningful interpretation, and clinical applicability.
Affiliation(s)
- Anna G Green
- Department of Biomedical Informatics, Harvard Medical School, 25 Shattuck St, Boston, MA, 02115, USA
- Chang Ho Yoon
- Department of Biomedical Informatics, Harvard Medical School, 25 Shattuck St, Boston, MA, 02115, USA
- Big Data Institute, Nuffield Department of Population Health, University of Oxford, Oxford, OX37LF, UK
- Michael L Chen
- Department of Biomedical Informatics, Harvard Medical School, 25 Shattuck St, Boston, MA, 02115, USA
- Stanford University School of Medicine, 291 Campus Dr, Stanford, CA, 94305, USA
- Yasha Ektefaie
- Department of Biomedical Informatics, Harvard Medical School, 25 Shattuck St, Boston, MA, 02115, USA
- Mack Fina
- Harvard College, Cambridge, MA, 02138, USA
- Luca Freschi
- Department of Biomedical Informatics, Harvard Medical School, 25 Shattuck St, Boston, MA, 02115, USA
- Matthias I Gröschel
- Department of Biomedical Informatics, Harvard Medical School, 25 Shattuck St, Boston, MA, 02115, USA
- Isaac Kohane
- Department of Biomedical Informatics, Harvard Medical School, 25 Shattuck St, Boston, MA, 02115, USA
- Andrew Beam
- Department of Biomedical Informatics, Harvard Medical School, 25 Shattuck St, Boston, MA, 02115, USA
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, 677 Huntington Ave, Boston, MA, 02115, USA
- Maha Farhat
- Department of Biomedical Informatics, Harvard Medical School, 25 Shattuck St, Boston, MA, 02115, USA
- Division of Pulmonary & Critical Care, Massachusetts General Hospital, 55 Fruit St, Boston, MA, 02114, USA