1
Gurdasani D, Carstensen T, Fatumo S, Chen G, Franklin CS, Prado-Martinez J, Bouman H, Abascal F, Haber M, Tachmazidou I, Mathieson I, Ekoru K, DeGorter MK, Nsubuga RN, Finan C, Wheeler E, Chen L, Cooper DN, Schiffels S, Chen Y, Ritchie GRS, Pollard MO, Fortune MD, Mentzer AJ, Garrison E, Bergström A, Hatzikotoulas K, Adeyemo A, Doumatey A, Elding H, Wain LV, Ehret G, Auer PL, Kooperberg CL, Reiner AP, Franceschini N, Maher D, Montgomery SB, Kadie C, Widmer C, Xue Y, Seeley J, Asiki G, Kamali A, Young EH, Pomilla C, Soranzo N, Zeggini E, Pirie F, Morris AP, Heckerman D, Tyler-Smith C, Motala AA, Rotimi C, Kaleebu P, Barroso I, Sandhu MS. Uganda Genome Resource Enables Insights into Population History and Genomic Discovery in Africa. Cell 2020; 179:984-1002.e36. [PMID: 31675503 DOI: 10.1016/j.cell.2019.10.004] [Citation(s) in RCA: 112] [Impact Index Per Article: 28.0] [Received: 09/13/2018] [Revised: 04/03/2019] [Accepted: 10/02/2019] [Indexed: 12/19/2022]
Abstract
Genomic studies in African populations provide unique opportunities to understand disease etiology, human diversity, and population history. In the largest study of its kind, comprising genome-wide data from 6,400 individuals and whole-genome sequences from 1,978 individuals from rural Uganda, we find evidence of geographically correlated fine-scale population substructure. Historically, the ancestry of modern Ugandans was best represented by a mixture of ancient East African pastoralists. We demonstrate the value of the largest sequence panel from Africa to date as an imputation resource. Examining 34 cardiometabolic traits, we show systematic differences in trait heritability between European and African populations, probably reflecting the differential impact of genes and environment. In a multi-trait pan-African GWAS of up to 14,126 individuals, we identify novel loci associated with anthropometric, hematological, lipid, and glycemic traits. We find that several functionally important signals are driven by Africa-specific variants, highlighting the value of studying diverse populations across the region.
Affiliation(s)
- Deepti Gurdasani
- William Harvey Research Institute, Queen Mary's University of London, London, UK
- Segun Fatumo
- London School of Hygiene and Tropical Medicine, London, UK; Uganda Medical Informatics Centre (UMIC), MRC/UVRI and LSHTM (Uganda Research Unit), Entebbe, Uganda; H3Africa Bioinformatics Network (H3ABioNet) Node, Center for Genomics Research and Innovation (CGRI)/National Biotechnology Development Agency CGRI/NABDA, Abuja, Nigeria
- Guanjie Chen
- Center for Research on Genomics and Global Health, National Institute of Health, Bethesda, MD, USA
- Marc Haber
- Wellcome Sanger Institute, Hinxton, Cambridge, UK
- Ioanna Tachmazidou
- GSK Medicines Research Centre, Gunnels Wood Road, Stevenage Hertfordshire SG1 2NY, UK
- Iain Mathieson
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Kenneth Ekoru
- Medical Research Council/Uganda Virus Research Institute (MRC/UVRI) and London School of Hygiene & Tropical Medicine Uganda Research Unit on AIDS, Entebbe, Uganda; Department of Medicine, University of Cambridge, Cambridge, UK
- Marianne K DeGorter
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Rebecca N Nsubuga
- Medical Research Council/Uganda Virus Research Institute (MRC/UVRI) and London School of Hygiene & Tropical Medicine Uganda Research Unit on AIDS, Entebbe, Uganda
- Chris Finan
- Wellcome Sanger Institute, Hinxton, Cambridge, UK
- Eleanor Wheeler
- Wellcome Sanger Institute, Hinxton, Cambridge, UK; MRC Epidemiology Unit, University of Cambridge, Cambridge, UK
- Li Chen
- Wellcome Sanger Institute, Hinxton, Cambridge, UK
- David N Cooper
- Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff, UK
- Stephan Schiffels
- Department of Archaeogenetics, Max Planck Institute for the Science of Human History, Jena, Germany
- Yuan Chen
- Wellcome Sanger Institute, Hinxton, Cambridge, UK
- Alex J Mentzer
- The Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
- Konstantinos Hatzikotoulas
- Wellcome Sanger Institute, Hinxton, Cambridge, UK; Institute of Translational Genomics, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
- Adebowale Adeyemo
- Center for Research on Genomics and Global Health, National Institute of Health, Bethesda, MD, USA
- Ayo Doumatey
- Center for Research on Genomics and Global Health, National Institute of Health, Bethesda, MD, USA
- Louise V Wain
- Department of Health Sciences, University of Leicester, Leicester, UK; National Institute for Health Research, Leicester Biomedical Research Centre, Glenfield Hospital, Leicester, UK
- Georg Ehret
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Geneva University Hospitals, Rue Gabrielle-Perret-Gentil 4, 1211 Genève 14, Switzerland
- Paul L Auer
- Zilber School of Public Health, University of Wisconsin-Milwaukee, Milwaukee, WI, USA
- Charles L Kooperberg
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Alexander P Reiner
- Department of Epidemiology, University of Washington, Seattle, WA, USA; Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Nora Franceschini
- Department of Epidemiology, University of North Carolina, Chapel Hill, NC, USA
- Dermot Maher
- Medical Research Council/Uganda Virus Research Institute (MRC/UVRI) and London School of Hygiene & Tropical Medicine Uganda Research Unit on AIDS, Entebbe, Uganda
- Stephen B Montgomery
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Yali Xue
- Wellcome Sanger Institute, Hinxton, Cambridge, UK
- Janet Seeley
- London School of Hygiene and Tropical Medicine, London, UK; Medical Research Council/Uganda Virus Research Institute (MRC/UVRI) and London School of Hygiene & Tropical Medicine Uganda Research Unit on AIDS, Entebbe, Uganda
- Gershim Asiki
- Medical Research Council/Uganda Virus Research Institute (MRC/UVRI) and London School of Hygiene & Tropical Medicine Uganda Research Unit on AIDS, Entebbe, Uganda
- Anatoli Kamali
- Medical Research Council/Uganda Virus Research Institute (MRC/UVRI) and London School of Hygiene & Tropical Medicine Uganda Research Unit on AIDS, Entebbe, Uganda
- Elizabeth H Young
- Wellcome Sanger Institute, Hinxton, Cambridge, UK; Department of Medicine, University of Cambridge, Cambridge, UK
- Cristina Pomilla
- Wellcome Sanger Institute, Hinxton, Cambridge, UK; Department of Medicine, University of Cambridge, Cambridge, UK
- Nicole Soranzo
- Wellcome Sanger Institute, Hinxton, Cambridge, UK; Department of Haematology, University of Cambridge, Cambridge, UK; The National Institute for Health Research Blood and Transplant Unit (NIHR BTRU) in Donor Health and Genomics, University of Cambridge, Cambridge, UK
- Eleftheria Zeggini
- Institute of Translational Genomics, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
- Fraser Pirie
- Department of Diabetes and Endocrinology, University of KwaZulu-Natal, Durban, South Africa
- Andrew P Morris
- The Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK; Department of Biostatistics, University of Liverpool, Liverpool, UK
- Ayesha A Motala
- Department of Diabetes and Endocrinology, University of KwaZulu-Natal, Durban, South Africa
- Charles Rotimi
- Center for Research on Genomics and Global Health, National Institute of Health, Bethesda, MD, USA
- Pontiano Kaleebu
- London School of Hygiene and Tropical Medicine, London, UK; Uganda Medical Informatics Centre (UMIC), MRC/UVRI and LSHTM (Uganda Research Unit), Entebbe, Uganda; Medical Research Council/Uganda Virus Research Institute (MRC/UVRI) and London School of Hygiene & Tropical Medicine Uganda Research Unit on AIDS, Entebbe, Uganda
- Inês Barroso
- Wellcome Sanger Institute, Hinxton, Cambridge, UK; MRC Epidemiology Unit, University of Cambridge, Cambridge, UK
- Manj S Sandhu
- Department of Medicine, University of Cambridge, Cambridge, UK
2
Widmer C, Lippert C, Weissbrod O, Fusi N, Kadie C, Davidson R, Listgarten J, Heckerman D. Further improvements to linear mixed models for genome-wide association studies. Sci Rep 2014; 4:6874. [PMID: 25387525 PMCID: PMC4230738 DOI: 10.1038/srep06874] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.5] [Received: 08/09/2013] [Accepted: 10/14/2014] [Indexed: 11/09/2022]
Abstract
We examine improvements to the linear mixed model (LMM) that better correct for population structure and family relatedness in genome-wide association studies (GWAS). LMMs rely on the estimation of a genetic similarity matrix (GSM), which encodes the pairwise similarity between every two individuals in a cohort. These similarities are estimated from single nucleotide polymorphisms (SNPs) or other genetic variants. Traditionally, all available SNPs are used to estimate the GSM. In empirical studies across a wide range of synthetic and real data, we find that modifications to this approach improve GWAS performance as measured by type I error control and power. Specifically, when only population structure is present, a GSM constructed from SNPs that well predict the phenotype in combination with principal components as covariates controls type I error and yields more power than the traditional LMM. In any setting, with or without population structure or family relatedness, a GSM consisting of a mixture of two component GSMs, one constructed from all SNPs and another constructed from SNPs that well predict the phenotype again controls type I error and yields more power than the traditional LMM. Software implementing these improvements and the experimental comparisons are available at http://microsoft.com/science.
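As a rough illustration of the two ideas in this abstract, the sketch below builds a GSM from standardized SNP columns and then mixes two component GSMs. It is not the authors' implementation: the abstract does not give the estimator, so the standardized-genotype estimator used here is an assumption, and the function names, the toy genotypes, the selected SNP indices, and the mixing weight are all hypothetical.

```python
from statistics import mean, pstdev


def gsm(genotypes):
    """Genetic similarity matrix from a genotype table.

    genotypes: list of rows (individuals), each a list of SNP values (0/1/2).
    Assumed estimator: K = Z Z^T / m, where Z holds the standardized SNP columns.
    """
    n = len(genotypes)
    columns = []
    for j in range(len(genotypes[0])):
        col = [row[j] for row in genotypes]
        mu, sd = mean(col), pstdev(col)
        if sd == 0:
            continue  # a SNP with no variation carries no similarity signal
        columns.append([(x - mu) / sd for x in col])
    if not columns:
        raise ValueError("no polymorphic SNPs available")
    m = len(columns)
    return [[sum(c[i] * c[j] for c in columns) / m for j in range(n)]
            for i in range(n)]


def mix_gsm(k_all, k_selected, weight):
    """Weighted mixture of two GSMs; weight is the share given to k_selected."""
    return [[(1 - weight) * a + weight * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(k_all, k_selected)]


# Toy data: 4 individuals, 5 SNPs (hypothetical values).
genotypes = [
    [0, 1, 2, 0, 1],
    [1, 1, 0, 2, 0],
    [2, 0, 1, 1, 1],
    [0, 2, 1, 0, 2],
]
# Hypothetical indices of SNPs assumed to predict the phenotype well;
# how such SNPs are chosen is not described in the abstract.
selected = [0, 2]

k_all = gsm(genotypes)
k_sel = gsm([[row[j] for j in selected] for row in genotypes])
k_mix = mix_gsm(k_all, k_sel, weight=0.5)  # weight is an arbitrary choice here
```

In practice the mixing weight would be estimated from the data rather than fixed by hand; the fixed value only keeps the example short.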
Affiliation(s)
- Christian Widmer
- eScience Group, Microsoft Research, 1100 Glendon Avenue, Suite PH1, Los Angeles, CA, 90024, United States
- Christoph Lippert
- eScience Group, Microsoft Research, 1100 Glendon Avenue, Suite PH1, Los Angeles, CA, 90024, United States
- Omer Weissbrod
- Computer Science Department, Technion - Israel Institute of Technology, Haifa 32000, Israel
- Nicolo Fusi
- eScience Group, Microsoft Research, 1100 Glendon Avenue, Suite PH1, Los Angeles, CA, 90024, United States
- Carl Kadie
- eScience Group, Microsoft Research, One Microsoft Way, Redmond, WA, 98052, United States
- Robert Davidson
- eScience Group, Microsoft Research, One Microsoft Way, Redmond, WA, 98052, United States
- Jennifer Listgarten
- eScience Group, Microsoft Research, 1100 Glendon Avenue, Suite PH1, Los Angeles, CA, 90024, United States
- David Heckerman
- eScience Group, Microsoft Research, 1100 Glendon Avenue, Suite PH1, Los Angeles, CA, 90024, United States
3
Lippert C, Xiang J, Horta D, Widmer C, Kadie C, Heckerman D, Listgarten J. Greater power and computational efficiency for kernel-based association testing of sets of genetic variants. Bioinformatics 2014; 30:3206-14. [PMID: 25075117 PMCID: PMC4221116 DOI: 10.1093/bioinformatics/btu504] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Indexed: 11/12/2022]
Abstract
Motivation: Set-based variance component tests have been identified as a way to increase power in association studies by aggregating weak individual effects. However, the choice of test statistic has been largely ignored even though it may play an important role in obtaining optimal power. We compared a standard statistical test—a score test—with a recently developed likelihood ratio (LR) test. Further, when correction for hidden structure is needed, or gene–gene interactions are sought, state-of-the-art algorithms for both the score and LR tests can be computationally impractical. Thus we develop new computationally efficient methods.
Results: After reviewing theoretical differences in performance between the score and LR tests, we find empirically on real data that the LR test generally has more power. In particular, on 15 of 17 real datasets, the LR test yielded at least as many associations as the score test—up to 23 more associations—whereas the score test yielded at most one more association than the LR test in the two remaining datasets. On synthetic data, we find that the LR test yielded up to 12% more associations, consistent with our results on real data, but also observe a regime of extremely small signal where the score test yielded up to 25% more associations than the LR test, consistent with theory. Finally, our computational speedups now enable (i) efficient LR testing when the background kernel is full rank, and (ii) efficient score testing when the background kernel changes with each test, as for gene–gene interaction tests. The latter yielded a factor of 2000 speedup on a cohort of size 13 500.
Availability: Software available at http://research.microsoft.com/en-us/um/redmond/projects/MSCompBio/Fastlmm/.
Contact: heckerma@microsoft.com
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Christoph Lippert
- eScience Research Group, Microsoft Research, Los Angeles, CA, 90024 and eScience Research Group, Microsoft Research, Redmond, WA, 98052, USA
- Jing Xiang
- eScience Research Group, Microsoft Research, Los Angeles, CA, 90024 and eScience Research Group, Microsoft Research, Redmond, WA, 98052, USA
- Danilo Horta
- eScience Research Group, Microsoft Research, Los Angeles, CA, 90024 and eScience Research Group, Microsoft Research, Redmond, WA, 98052, USA
- Christian Widmer
- eScience Research Group, Microsoft Research, Los Angeles, CA, 90024 and eScience Research Group, Microsoft Research, Redmond, WA, 98052, USA
- Carl Kadie
- eScience Research Group, Microsoft Research, Los Angeles, CA, 90024 and eScience Research Group, Microsoft Research, Redmond, WA, 98052, USA
- David Heckerman
- eScience Research Group, Microsoft Research, Los Angeles, CA, 90024 and eScience Research Group, Microsoft Research, Redmond, WA, 98052, USA
- Jennifer Listgarten
- eScience Research Group, Microsoft Research, Los Angeles, CA, 90024 and eScience Research Group, Microsoft Research, Redmond, WA, 98052, USA
4
Zhang X, Cheng W, Listgarten J, Kadie C, Huang S, Wang W, Heckerman D. Learning transcriptional regulatory relationships using sparse graphical models. PLoS One 2012; 7:e35762. [PMID: 22586449 PMCID: PMC3346750 DOI: 10.1371/journal.pone.0035762] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 10/21/2011] [Accepted: 03/21/2012] [Indexed: 11/19/2022]
Abstract
Understanding the organization and function of transcriptional regulatory networks by analyzing high-throughput gene expression profiles is a key problem in computational biology. The challenges in this work are 1) the lack of complete knowledge of the regulatory relationship between the regulators and the associated genes, 2) the potential for spurious associations due to confounding factors, and 3) the number of parameters to learn is usually larger than the number of available microarray experiments. We present a sparse (L1 regularized) graphical model to address these challenges. Our model incorporates known transcription factors and introduces hidden variables to represent possible unknown transcription and confounding factors. The expression level of a gene is modeled as a linear combination of the expression levels of known transcription factors and hidden factors. Using gene expression data covering 39,296 oligonucleotide probes from 1109 human liver samples, we demonstrate that our model better predicts out-of-sample data than a model with no hidden variables. We also show that some of the gene sets associated with hidden variables are strongly correlated with Gene Ontology categories. The software including source code is available at http://grnl1.codeplex.com.
Affiliation(s)
- Xiang Zhang
- Microsoft Research, Los Angeles, California, United States of America
- Case Western Reserve University, Cleveland, Ohio, United States of America
- University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Wei Cheng
- University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Carl Kadie
- Microsoft Research, Los Angeles, California, United States of America
- Shunping Huang
- University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Wei Wang
- University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- David Heckerman
- Microsoft Research, Los Angeles, California, United States of America
5
Carlson JM, Listgarten J, Pfeifer N, Tan V, Kadie C, Walker BD, Ndung'u T, Shapiro R, Frater J, Brumme ZL, Goulder PJR, Heckerman D. Widespread impact of HLA restriction on immune control and escape pathways of HIV-1. J Virol 2012; 86:5230-43. [PMID: 22379086 PMCID: PMC3347390 DOI: 10.1128/jvi.06728-11] [Citation(s) in RCA: 99] [Impact Index Per Article: 8.3] [Received: 11/04/2011] [Accepted: 02/20/2012] [Indexed: 11/20/2022]
Abstract
The promiscuous presentation of epitopes by similar HLA class I alleles holds promise for a universal T-cell-based HIV-1 vaccine. However, in some instances, cytotoxic T lymphocytes (CTL) restricted by HLA alleles with similar or identical binding motifs are known to target epitopes at different frequencies, with different functional avidities and with different apparent clinical outcomes. Such differences may be illuminated by the association of similar HLA alleles with distinctive escape pathways. Using a novel computational method featuring phylogenetically corrected odds ratios, we systematically analyzed differential patterns of immune escape across all optimally defined epitopes in Gag, Pol, and Nef in 2,126 HIV-1 clade C-infected adults. Overall, we identified 301 polymorphisms in 90 epitopes associated with HLA alleles belonging to shared supertypes. We detected differential escape in 37 of 38 epitopes restricted by more than one allele, which included 278 instances of differential escape at the polymorphism level. The majority (66 to 97%) of these resulted from the selection of unique HLA-specific polymorphisms rather than differential epitope targeting rates, as confirmed by gamma interferon (IFN-γ) enzyme-linked immunosorbent spot assay (ELISPOT) data. Discordant associations between HLA alleles and viral load were frequently observed between allele pairs that selected for differential escape. Furthermore, the total number of associated polymorphisms strongly correlated with average viral load. These studies confirm that differential escape is a widespread phenomenon and may be the norm when two alleles present the same epitope. Given the clinical correlates of immune escape, such heterogeneity suggests that certain epitopes will lead to discordant outcomes if applied universally in a vaccine.
6
Almeida CAM, Bronke C, Roberts SG, McKinnon E, Keane NM, Chopra A, Kadie C, Carlson J, Haas DW, Riddler SA, Haubrich R, Heckerman D, Mallal S, John M. Translation of HLA-HIV associations to the cellular level: HIV adapts to inflate CD8 T cell responses against Nef and HLA-adapted variant epitopes. J Immunol 2011; 187:2502-13. [PMID: 21821798 DOI: 10.4049/jimmunol.1100691] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Indexed: 02/07/2023]
Abstract
Strong statistical associations between polymorphisms in HIV-1 population sequences and carriage of HLA class I alleles have been widely used to identify possible sites of CD8 T cell immune selection in vivo. However, there have been few attempts to prospectively and systematically test these genetic hypotheses arising from population-based studies at a cellular, functional level. We assayed CD8 T cell epitope-specific IFN-γ responses in 290 individuals from the same cohort, which gave rise to 874 HLA-HIV associations in genetic analyses, taking into account autologous viral sequences and individual HLA genotypes. We found immunological evidence for 58% of 374 associations tested as sites of primary immune selection and identified up to 50 novel HIV-1 epitopes using this reverse-genomics approach. Many HLA-adapted epitopes elicited equivalent or higher-magnitude IFN-γ responses than did the nonadapted epitopes, particularly in Nef. At a population level, inclusion of all of the immunoreactive variant CD8 T cell epitopes in Gag, Pol, Nef, and Env suggested that HIV adaptation leads to an inflation of Nef-directed immune responses relative to other proteins. We concluded that HLA-HIV associations mark viral epitopes subject to CD8 T cell selection. These results can be used to guide functional studies of specific epitopes and escape mutations, as well as to test, train, and evaluate analytical models of viral escape and fitness. The inflation of Nef and HLA-adapted variant responses may have negative effects on natural and vaccine immunity against HIV and, therefore, has implications for diversity coverage approaches in HIV vaccine design.
Affiliation(s)
- Coral-Ann M Almeida
- Centre for Clinical Immunology and Biomedical Statistics, Institute for Immunology and Infectious Diseases, Murdoch University, Murdoch, Western Australia 6150, Australia
7
Huang JC, Meek C, Kadie C, Heckerman D. Conditional random fields for fast, large-scale genome-wide association studies. PLoS One 2011; 6:e21591. [PMID: 21765897 PMCID: PMC3134455 DOI: 10.1371/journal.pone.0021591] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2011] [Accepted: 06/03/2011] [Indexed: 11/18/2022]
Abstract
Understanding the role of genetic variation in human diseases remains an important problem to be solved in genomics. An important component of such variation consists of variations at single sites in DNA, or single nucleotide polymorphisms (SNPs). Typically, the problem of associating particular SNPs to phenotypes has been confounded by hidden factors such as the presence of population structure, family structure or cryptic relatedness in the sample of individuals being analyzed. Such confounding factors lead to a large number of spurious associations and missed associations. Various statistical methods, such as linear mixed-effect models (LMMs) or methods that adjust data based on a principal components analysis (PCA), have been proposed to account for such confounding factors, but these methods either suffer from low power or cease to be tractable for larger numbers of individuals in the sample. Here we present a statistical model for conducting genome-wide association studies (GWAS) that accounts for such confounding factors. Our method scales in runtime quadratic in the number of individuals being studied with only a modest loss in statistical power as compared to LMM-based and PCA-based methods when testing on synthetic data that was generated from a generalized LMM. Applying our method to both real and synthetic human genotype/phenotype data, we demonstrate the ability of our model to correct for confounding factors while requiring significantly less runtime relative to LMMs. We have implemented methods for fitting these models, which are available at http://www.microsoft.com/science.
Affiliation(s)
- Jim C. Huang
- Microsoft Research, Redmond, Washington, United States of America
- Christopher Meek
- Microsoft Research, Redmond, Washington, United States of America
- Carl Kadie
- Microsoft Research, Redmond, Washington, United States of America
- David Heckerman
- Microsoft Research, Redmond, Washington, United States of America
8
Lazaro E, Kadie C, Stamegna P, Zhang SC, Gourdain P, Lai NY, Zhang M, Martinez SA, Heckerman D, Le Gall S. Variable HIV peptide stability in human cytosol is critical to epitope presentation and immune escape. J Clin Invest 2011; 121:2480-92. [PMID: 21555856 DOI: 10.1172/jci44932] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.0] [Received: 08/27/2010] [Accepted: 03/16/2011] [Indexed: 11/17/2022]
Abstract
Induction of virus-specific CD8⁺ T cell responses is critical for the success of vaccines against chronic viral infections. Despite the large number of potential MHC-I-restricted epitopes located in viral proteins, MHC-I-restricted epitope generation is inefficient, and factors defining the production and presentation of MHC-I-restricted viral epitopes are poorly understood. Here, we have demonstrated that the half-lives of HIV-derived peptides in cytosol from primary human cells were highly variable and sequence dependent, and significantly affected the efficiency of cell recognition by CD8⁺ T cells. Furthermore, multiple clinical isolates of HLA-associated HIV epitope variants displayed reduced half-lives relative to consensus sequence. This decreased cytosolic peptide stability diminished epitope presentation and CTL recognition, illustrating a mechanism of immune escape. Chaperone complexes including Hsp90 and histone deacetylase HDAC6 enhanced peptide stability by transient protection from peptidase degradation. Based on empirical results with 166 peptides, we developed a computational approach utilizing a sequence-based algorithm to estimate the cytosolic stability of antigenic peptides. Our results identify sequence motifs able to alter the amount of peptide available for loading onto MHC-I, suggesting potential new strategies to modulate epitope production from vaccine immunogens.
Affiliation(s)
- Estibaliz Lazaro
- Ragon Institute of MGH, MIT and Harvard, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts 02129, USA
9
Almeida CAM, Roberts SG, Laird R, McKinnon E, Ahmad I, Keane NM, Chopra A, Kadie C, Heckerman D, Mallal S, John M. Exploiting knowledge of immune selection in HIV-1 to detect HIV-specific CD8 T-cell responses. Vaccine 2010; 28:6052-7. [PMID: 20619380 DOI: 10.1016/j.vaccine.2010.06.091] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 04/15/2010] [Revised: 06/23/2010] [Accepted: 06/25/2010] [Indexed: 02/09/2023]
Abstract
Since HLA-restricted cytotoxic T-cell responses select specific polymorphisms in HIV-1 sequences and HLA diversity is relatively static in human populations, we investigated the use of peptide epitopes based on sites of HLA-associated adaptation in HIV-1 sequences to stimulate and detect T-cell responses ex vivo. These "HLA-optimised" peptides captured more HIV-1 Nef-specific responses compared with overlapping peptides of a single consensus sequence, in interferon-gamma enzyme linked immunospot assays. Sites of immune selection can reveal more immunogenic epitopes in HLA-diverse populations and offer insights into the nature of HLA-epitope targeting, which could be applied in vaccine design.
Affiliation(s)
- Coral-Ann M Almeida
- Centre for Clinical Immunology and Biomedical Statistics, Institute of Immunology and Infectious Diseases, Murdoch University, Perth, Western Australia, Australia
10
Berger CT, Carlson J, Brumme CJ, Brumme ZL, Hartman K, Henry LM, Kadie C, Brockman MA, Harrigan R, Heckerman D, Brander C. P16-41. Evidence for in vivo immune selection pressure exerted by HLA class I restricted CTL responses to anti-sense encoded HIV sequences. Retrovirology 2009. [PMCID: PMC2767771 DOI: 10.1186/1742-4690-6-s3-p270] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022]
11
Rousseau CM, Lockhart DW, Listgarten J, Maley SN, Kadie C, Learn GH, Nickle DC, Heckerman DE, Deng W, Brander C, Ndung'u T, Coovadia H, Goulder PJ, Korber BT, Walker BD, Mullins JI. Rare HLA drive additional HIV evolution compared to more frequent alleles. AIDS Res Hum Retroviruses 2009; 25:297-303. [PMID: 19327049 DOI: 10.1089/aid.2008.0208] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 12/15/2022]
Abstract
HIV-1 can evolve HLA-specific escape variants in response to HLA-mediated cellular immunity. HLA alleles that are common in the host population may increase the frequency of such escape variants at the population level. When loss of viral fitness is caused by immune escape variation, these variants may revert upon infection of a new host who does not have the corresponding HLA allele. Furthermore, additional escape variants may appear in response to the nonconcordant HLA alleles. Because individuals with rare HLA alleles are less likely to be infected by a partner with concordant HLA alleles, viral populations infecting hosts with rare HLA alleles may undergo a greater amount of evolution than those infecting hosts with common alleles due to the loss of preexisting escape variants followed by new immune escape. This hypothesis was evaluated using maximum likelihood phylogenetic trees of each gene from 272 full-length HIV-1 sequences. Recent viral evolution, as measured by the external branch length, was found to be inversely associated with HLA frequency in nef (p < 0.02), env (p < 0.03), and pol (p ≤ 0.05), suggesting that rare HLA alleles provide a disproportionate force driving viral evolution compared to common alleles, likely due to the loss of preexisting escape variants during early stages postinfection.
Affiliation(s)
- David W. Lockhart
  - Department of Biostatistics, University of Washington, Seattle, Washington 98103
- Stephen N. Maley
  - Department of Microbiology, University of Washington, Seattle, Washington 98103
- Carl Kadie
  - eScience Research Group, Microsoft Research, Redmond, Washington 98052
- Gerald H. Learn
  - Department of Microbiology, University of Washington, Seattle, Washington 98103
- David C. Nickle
  - Department of Microbiology, University of Washington, Seattle, Washington 98103
- Wenjie Deng
  - Department of Microbiology, University of Washington, Seattle, Washington 98103
- Christian Brander
  - Partners AIDS Research Center, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02114
- Thumbi Ndung'u
  - Partners AIDS Research Center, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02114
  - HIV Pathogenesis Program, Nelson R. Mandela School of Medicine, University of KwaZulu-Natal, Durban, South Africa
- Hoosen Coovadia
  - HIV Pathogenesis Program, Nelson R. Mandela School of Medicine, University of KwaZulu-Natal, Durban, South Africa
- Philip J.R. Goulder
  - Partners AIDS Research Center, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02114
  - HIV Pathogenesis Program, Nelson R. Mandela School of Medicine, University of KwaZulu-Natal, Durban, South Africa
  - Department of Pediatrics, Nuffield Department of Medicine, Oxford, England
- Bette T. Korber
  - Los Alamos National Laboratory, Los Alamos, New Mexico 87544
  - Santa Fe Institute, Santa Fe, New Mexico 87501
- Bruce D. Walker
  - Partners AIDS Research Center, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02114
  - HIV Pathogenesis Program, Nelson R. Mandela School of Medicine, University of KwaZulu-Natal, Durban, South Africa
  - Howard Hughes Medical Institute, Chevy Chase, Maryland 20815
- James I. Mullins
  - Department of Microbiology, University of Washington, Seattle, Washington 98103
  - Department of Medicine, University of Washington, Seattle, Washington 98103
|
12
|
Yerly D, Heckerman D, Allen T, Suscovich TJ, Jojic N, Kadie C, Pichler WJ, Cerny A, Brander C. Design, expression, and processing of epitomized hepatitis C virus-encoded CTL epitopes. J Immunol 2009; 181:6361-70. [PMID: 18941227 DOI: 10.4049/jimmunol.181.9.6361] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Indexed: 12/30/2022]
Abstract
Hepatitis C virus (HCV) vaccine efficacy may crucially depend on immunogen length and coverage of viral sequence diversity. However, covering a considerable proportion of the circulating viral sequence variants would likely require long immunogens, which, for the conserved portions of the viral genome, would contain unnecessarily redundant sequence information. In this study, we present the design and in vitro performance analysis of a novel "epitome" approach that compresses frequent immune targets of the cellular immune response against HCV into a shorter immunogen sequence. Compression of immunological information is achieved by partially overlapping shared sequence motifs between individual epitopes. At the same time, sequence diversity coverage is provided by taking advantage of emerging cross-reactivity patterns among epitope variants so that epitope variants associated with the broadest variant cross-recognition are preferentially included. The processing and presentation analysis of specific epitopes included in such a compressed, in vitro-expressed HCV epitome indicated effective processing of a majority of tested epitopes, although re-presentation of some epitopes may require refined sequence design. Together, the present study establishes the epitome approach as a potentially powerful tool for vaccine immunogen design, especially suitable for the induction of cellular immune responses against highly variable pathogens.
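The compression idea in this abstract, overlapping shared sequence motifs between epitopes, can be illustrated with a greedy overlap merge. This is only a sketch under stated assumptions: the greedy shortest-superstring heuristic, the function names, and the toy peptide strings are all hypothetical and are not taken from the paper, whose design also weighs cross-reactivity among epitope variants.

```python
from typing import List


def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_epitome(epitopes: List[str]) -> str:
    """Greedily merge peptides by largest overlap into one string containing them all."""
    # Drop duplicates and peptides already contained in another peptide.
    seqs = [s for s in dict.fromkeys(epitopes)
            if not any(s != t and s in t for t in epitopes)]
    while len(seqs) > 1:
        best_k, best_i, best_j = -1, 0, 1
        for i, a in enumerate(seqs):
            for j, b in enumerate(seqs):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        # Append the non-overlapping tail of the second peptide to the first.
        merged = seqs[best_i] + seqs[best_j][best_k:]
        seqs = [s for idx, s in enumerate(seqs) if idx not in (best_i, best_j)]
        seqs.append(merged)
    return seqs[0] if seqs else ""


# Hypothetical toy peptides (not real HCV epitopes).
peptides = ["ACDEFGHI", "FGHIKLMN", "KLMNPQRS"]
epitome = greedy_epitome(peptides)
```

Here three 8-residue toy peptides (24 residues in total) collapse into a single 16-residue string that still contains each peptide intact, which is the sense in which an epitome is shorter than a concatenation of its targets.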
Affiliation(s)
- Daniel Yerly
  - Clinic for Rheumatology and Clinical Immunology, University of Bern, Bern, Switzerland
|
13
|
Carlson JM, Brumme ZL, Rousseau CM, Brumme CJ, Matthews P, Kadie C, Mullins JI, Walker BD, Harrigan PR, Goulder PJR, Heckerman D. Phylogenetic dependency networks: inferring patterns of CTL escape and codon covariation in HIV-1 Gag. PLoS Comput Biol 2008; 4:e1000225. [PMID: 19023406 PMCID: PMC2579584 DOI: 10.1371/journal.pcbi.1000225] [Citation(s) in RCA: 96] [Impact Index Per Article: 6.0] [Received: 04/29/2008] [Accepted: 10/09/2008] [Indexed: 11/18/2022] Open
Abstract
HIV avoids elimination by cytotoxic T-lymphocytes (CTLs) through the evolution of escape mutations. Although there is mounting evidence that these escape pathways are broadly consistent among individuals with similar human leukocyte antigen (HLA) class I alleles, previous population-based studies have been limited by the inability to simultaneously account for HIV codon covariation, linkage disequilibrium among HLA alleles, and the confounding effects of HIV phylogeny when attempting to identify HLA-associated viral evolution. We have developed a statistical model of evolution, called a phylogenetic dependency network, that accounts for these three sources of confounding and identifies the primary sources of selection pressure acting on each HIV codon. Using synthetic data, we demonstrate the utility of this approach for identifying sites of HLA-mediated selection pressure and codon evolution as well as the deleterious effects of failing to account for all three sources of confounding. We then apply our approach to a large, clinically-derived dataset of Gag p17 and p24 sequences from a multicenter cohort of 1144 HIV-infected individuals from British Columbia, Canada (predominantly HIV-1 clade B) and Durban, South Africa (predominantly HIV-1 clade C). The resulting phylogenetic dependency network is dense, containing 149 associations between HLA alleles and HIV codons and 1386 associations among HIV codons. These associations include the complete reconstruction of several recently defined escape and compensatory mutation pathways and agree with emerging data on patterns of epitope targeting. The phylogenetic dependency network adds to the growing body of literature suggesting that sites of escape, order of escape, and compensatory mutations are largely consistent even across different clades, although we also identify several differences between clades. 
As recent case studies have demonstrated, understanding both the complexity and the consistency of immune escape has important implications for CTL-based vaccine design. Phylogenetic dependency networks represent a major step toward systematically expanding our understanding of CTL escape to diverse populations and whole viral genes.
Affiliation(s)
- Jonathan M. Carlson
  - eScience Group, Microsoft Research, Redmond, Washington, United States of America
  - Department of Computer Science and Engineering, University of Washington, Seattle, Washington, United States of America
  - * E-mail: (JMC); (DH)
- Zabrina L. Brumme
  - Partners AIDS Research Center, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Christine M. Rousseau
  - Department of Microbiology, University of Washington, Seattle, Washington, United States of America
- Chanson J. Brumme
  - Partners AIDS Research Center, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Philippa Matthews
  - Department of Paediatrics, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
- Carl Kadie
  - eScience Group, Microsoft Research, Redmond, Washington, United States of America
- James I. Mullins
  - Department of Microbiology, University of Washington, Seattle, Washington, United States of America
  - Department of Medicine, University of Washington, Seattle, Washington, United States of America
- Bruce D. Walker
  - Partners AIDS Research Center, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
  - Howard Hughes Medical Institute, Chevy Chase, Maryland, United States of America
- P. Richard Harrigan
  - B.C. Centre for Excellence in HIV/AIDS, Vancouver, British Columbia, Canada
  - Department of Medicine, University of British Columbia, Vancouver, British Columbia, Canada
- Philip J. R. Goulder
  - Partners AIDS Research Center, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
  - Department of Paediatrics, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
  - HIV Pathogenesis Programme, The Doris Duke Medical Research Institute, University of KwaZulu-Natal, Durban, South Africa
- David Heckerman
  - eScience Group, Microsoft Research, Redmond, Washington, United States of America
  - * E-mail: (JMC); (DH)
|
14
|
Listgarten J, Brumme Z, Kadie C, Xiaojiang G, Walker B, Carrington M, Goulder P, Heckerman D. 168-P: In silico resolution of ambiguous HLA typing data. Hum Immunol 2008. [DOI: 10.1016/j.humimm.2008.08.187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
|
15
|
Rousseau CM, Daniels MG, Carlson JM, Kadie C, Crawford H, Prendergast A, Matthews P, Payne R, Rolland M, Raugi DN, Maust BS, Learn GH, Nickle DC, Coovadia H, Ndung'u T, Frahm N, Brander C, Walker BD, Goulder PJR, Bhattacharya T, Heckerman DE, Korber BT, Mullins JI. HLA class I-driven evolution of human immunodeficiency virus type 1 subtype C proteome: immune escape and viral load. J Virol 2008; 82:6434-46. [PMID: 18434400 PMCID: PMC2447109 DOI: 10.1128/jvi.02455-07] [Citation(s) in RCA: 115] [Impact Index Per Article: 7.2] [Received: 11/14/2007] [Accepted: 04/11/2008] [Indexed: 01/02/2023] Open
Abstract
Human immunodeficiency virus type 1 (HIV-1) mutations that confer escape from cytotoxic T-lymphocyte (CTL) recognition can sometimes result in lower viral fitness. These mutations can then revert upon transmission to a new host in the absence of CTL-mediated immune selection pressure restricted by the HLA alleles of the prior host. To identify these potentially critical recognition points on the virus, we assessed HLA-driven viral evolution using three phylogenetic correction methods across full HIV-1 subtype C proteomes from a cohort of 261 South Africans and identified amino acids conferring either susceptibility or resistance to CTLs. A total of 558 CTL-susceptible and -resistant HLA-amino acid associations were identified and organized into 310 immunological sets (groups of individual associations related to a single HLA/epitope combination). Mutations away from seven susceptible residues, including four in Gag, were associated with lower plasma viral-RNA loads (q < 0.2 [where q is the expected false-discovery rate]) in individuals with the corresponding HLA alleles. The ratio of susceptible to resistant residues among those without the corresponding HLA alleles varied in the order Vpr > Gag > Rev > Pol > Nef > Vif > Tat > Env > Vpu (Fisher's exact test; P ≤ 0.0009 for each comparison), suggesting the same ranking of fitness costs by genes associated with CTL escape. Significantly more HLA-B (χ²; P = 3.59 × 10⁻⁵) and HLA-C (χ²; P = 4.71 × 10⁻⁶) alleles were associated with amino acid changes than HLA-A, highlighting their importance in driving viral evolution. In conclusion, specific HIV-1 residues (enriched in Vpr, Gag, and Rev) and HLA alleles (particularly B and C) confer susceptibility to the CTL response and are likely to be important in the development of vaccines targeted to decrease the viral load.
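The abstract reports associations at q < 0.2, where q is the expected false-discovery rate. The listing does not say which q-value estimator the authors used, so the sketch below substitutes the Benjamini-Hochberg adjustment, a standard and widely used procedure; the function name and the example p-values are hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values, in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four candidate associations.
p = [0.01, 0.04, 0.03, 0.50]
q = benjamini_hochberg(p)
# Keep the associations whose adjusted value falls below the 0.2 threshold.
selected = [i for i, value in enumerate(q) if value < 0.2]
```

With these made-up inputs the first three associations pass the 0.2 threshold and the fourth does not; a threshold of 0.2 means roughly one in five reported associations is expected to be a false positive.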
Affiliation(s)
- Christine M Rousseau
  - Department of Microbiology, University of Washington, 1959 NE Pacific Street, Box 358070, Seattle, WA 98195-8070, USA.
|
16
|
Brumme ZL, Brumme CJ, Heckerman D, Korber BT, Daniels M, Carlson J, Kadie C, Bhattacharya T, Chui C, Szinger J, Mo T, Hogg RS, Montaner JSG, Frahm N, Brander C, Walker BD, Harrigan PR. Evidence of differential HLA class I-mediated viral evolution in functional and accessory/regulatory genes of HIV-1. PLoS Pathog 2008; 3:e94. [PMID: 17616974 PMCID: PMC1904471 DOI: 10.1371/journal.ppat.0030094] [Citation(s) in RCA: 141] [Impact Index Per Article: 8.8] [Received: 03/12/2007] [Accepted: 05/17/2007] [Indexed: 12/20/2022] Open
Abstract
Despite the formidable mutational capacity and sequence diversity of HIV-1, evidence suggests that viral evolution in response to specific selective pressures follows generally predictable mutational pathways. Population-based analyses of clinically derived HIV sequences may be used to identify immune escape mutations in viral genes; however, prior attempts to identify such mutations have been complicated by the inability to discriminate active immune selection from virus founder effects. Furthermore, the association between mutations arising under in vivo immune selection and disease progression for highly variable pathogens such as HIV-1 remains incompletely understood. We applied a viral lineage-corrected analytical method to investigate HLA class I-associated sequence imprinting in HIV protease, reverse transcriptase (RT), Vpr, and Nef in a large cohort of chronically infected, antiretrovirally naïve individuals. A total of 478 unique HLA-associated polymorphisms were observed and organized into a series of "escape maps," which identify known and putative cytotoxic T lymphocyte (CTL) epitopes under selection pressure in vivo. Our data indicate that pathways to immune escape are predictable based on host HLA class I profile, and that epitope anchor residues are not the preferred sites of CTL escape. Results reveal differential contributions of immune imprinting to viral gene diversity, with Nef exhibiting far greater evidence for HLA class I-mediated selection compared to other genes. Moreover, these data reveal a significant, dose-dependent inverse correlation between HLA-associated polymorphisms and HIV disease stage as estimated by CD4+ T cell count. Identification of specific sites and patterns of HLA-associated polymorphisms across HIV protease, RT, Vpr, and Nef illuminates regions of the genes encoding these products under active immune selection pressure in vivo.
The high density of HLA-associated polymorphisms in Nef compared to other genes investigated indicates differential HLA class I-driven evolution in different viral genes. The relationship between HLA class I-associated polymorphisms and lower CD4+ cell count suggests that immune escape correlates with disease status, supporting an essential role of maintenance of effective CTL responses in immune control of HIV-1. The design of preventative and therapeutic CTL-based vaccine approaches could incorporate information on predictable escape pathways.
MESH Headings
- Amino Acid Sequence
- CD4 Lymphocyte Count
- CD4-Positive T-Lymphocytes/immunology
- CD4-Positive T-Lymphocytes/metabolism
- Epitopes, T-Lymphocyte/genetics
- Epitopes, T-Lymphocyte/immunology
- Evolution, Molecular
- Gene Expression Regulation, Viral
- Genes, MHC Class I/physiology
- HIV-1/genetics
- HIV-1/immunology
- Histocompatibility Antigens Class I/genetics
- Human Immunodeficiency Virus Proteins/genetics
- Human Immunodeficiency Virus Proteins/metabolism
- Humans
- Minor Histocompatibility Antigens
- Molecular Sequence Data
- Mutation
- Phylogeny
- Polymorphism, Genetic
- Selection, Genetic
- Viral Regulatory and Accessory Proteins/genetics
- Viral Regulatory and Accessory Proteins/metabolism
- nef Gene Products, Human Immunodeficiency Virus/genetics
- nef Gene Products, Human Immunodeficiency Virus/metabolism
- vpr Gene Products, Human Immunodeficiency Virus/genetics
- vpr Gene Products, Human Immunodeficiency Virus/metabolism
Affiliation(s)
- Zabrina L Brumme
  - Partners AIDS Research Center, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
  - Howard Hughes Medical Institute, Chevy Chase, Maryland, United States of America
  - British Columbia Centre for Excellence in HIV/AIDS, Vancouver, British Columbia, Canada
  - * To whom correspondence should be addressed. E-mail:
- Chanson J Brumme
  - Partners AIDS Research Center, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
  - British Columbia Centre for Excellence in HIV/AIDS, Vancouver, British Columbia, Canada
- David Heckerman
  - Microsoft Research, Redmond, Washington, United States of America
- Bette T Korber
  - Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
  - Santa Fe Institute, Santa Fe, New Mexico, United States of America
- Marcus Daniels
  - Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
- Jonathan Carlson
  - Microsoft Research, Redmond, Washington, United States of America
  - Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
- Carl Kadie
  - Microsoft Research, Redmond, Washington, United States of America
- Tanmoy Bhattacharya
  - Santa Fe Institute, Santa Fe, New Mexico, United States of America
  - Department of Computer Science and Engineering, University of Washington, Seattle, Washington, United States of America
- Celia Chui
  - British Columbia Centre for Excellence in HIV/AIDS, Vancouver, British Columbia, Canada
- James Szinger
  - Santa Fe Institute, Santa Fe, New Mexico, United States of America
- Theresa Mo
  - British Columbia Centre for Excellence in HIV/AIDS, Vancouver, British Columbia, Canada
- Robert S Hogg
  - British Columbia Centre for Excellence in HIV/AIDS, Vancouver, British Columbia, Canada
  - Faculty of Health Sciences, Simon Fraser University, Burnaby, British Columbia, Canada
- Julio S. G Montaner
  - British Columbia Centre for Excellence in HIV/AIDS, Vancouver, British Columbia, Canada
  - Department of Medicine, University of British Columbia, Vancouver, British Columbia, Canada
- Nicole Frahm
  - Partners AIDS Research Center, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Christian Brander
  - Partners AIDS Research Center, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Bruce D Walker
  - Partners AIDS Research Center, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
  - Howard Hughes Medical Institute, Chevy Chase, Maryland, United States of America
- P. Richard Harrigan
  - British Columbia Centre for Excellence in HIV/AIDS, Vancouver, British Columbia, Canada
  - Department of Medicine, University of British Columbia, Vancouver, British Columbia, Canada
|
17
|
Listgarten J, Brumme Z, Kadie C, Xiaojiang G, Walker B, Carrington M, Goulder P, Heckerman D. Statistical resolution of ambiguous HLA typing data. PLoS Comput Biol 2008; 4:e1000016. [PMID: 18392148 PMCID: PMC2289775 DOI: 10.1371/journal.pcbi.1000016] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.8] [Received: 10/23/2007] [Accepted: 01/30/2008] [Indexed: 11/18/2022] Open
Abstract
High-resolution HLA typing plays a central role in many areas of immunology, such as in identifying immunogenetic risk factors for disease, in studying how the genomes of pathogens evolve in response to immune selection pressures, and also in vaccine design, where identification of HLA-restricted epitopes may be used to guide the selection of vaccine immunogens. Perhaps one of the most immediate applications is in direct medical decisions concerning the matching of stem cell transplant donors to unrelated recipients. However, high-resolution HLA typing is frequently unavailable due to its high cost or the inability to re-type historical data. In this paper, we introduce and evaluate a method for statistical, in silico refinement of ambiguous and/or low-resolution HLA data. Our method, which requires an independent, high-resolution training data set drawn from the same population as the data to be refined, uses linkage disequilibrium in HLA haplotypes as well as four-digit allele frequency data to probabilistically refine HLA typings. Central to our approach is the use of haplotype inference. We introduce new methodology to this area, improving upon the Expectation-Maximization (EM)-based approaches currently used within the HLA community. Our improvements are achieved by using a parsimonious parameterization for haplotype distributions and by smoothing the maximum likelihood (ML) solution. These improvements make it possible to scale the refinement to a larger number of alleles and loci in a more computationally efficient and stable manner. We also show how to augment our method in order to incorporate ethnicity information (as HLA allele distributions vary widely according to race/ethnicity as well as geographic area), and demonstrate the potential utility of this experimentally. A tool based on our approach is freely available for research purposes at http://microsoft.com/science. 
At the core of the human adaptive immune response is the train-to-kill mechanism in which specialized immune cells are sensitized to recognize small peptides from foreign sources (e.g., from HIV or bacteria). Following this sensitization, these immune cells are then activated to kill other cells which display this same peptide (and which contain this same foreign peptide). However, in order for sensitization and killing to occur, the foreign peptide must be “paired up” with one of the infected person's other specialized immune molecules—an HLA molecule. The way in which peptides interact with these HLA molecules defines if and how an immune response will be generated. There is a huge repertoire of such HLA molecules, with almost no two people having the same set. Furthermore, a person's HLA type can determine their susceptibility to disease, or the success of a transplant, for example. However, obtaining high quality HLA data for patients is often difficult because of the great cost and specialized laboratories required, or because the data are historical and cannot be retyped with modern methods. Therefore, we introduce a statistical model which can make use of existing high-quality HLA data, to infer higher-quality HLA data from lower-quality data.
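The refinement described in this abstract rests on the fact that haplotype frequencies, which encode linkage disequilibrium between loci, constrain which high-resolution alleles are plausible for a low-resolution typing. The sketch below shows only that core idea: it renormalizes a frequency table over the haplotypes compatible with an ambiguous observation. It is a deliberate simplification and an assumption throughout: it handles a single two-locus haplotype rather than a diplotype, it skips the EM-style haplotype inference and smoothing the abstract mentions, and the allele names and frequencies are made up.

```python
from typing import Dict, Tuple

Haplotype = Tuple[str, str]


def refine(observed: Tuple[str, str],
           haplotype_freq: Dict[Haplotype, float]) -> Dict[Haplotype, float]:
    """Posterior over high-resolution haplotypes compatible with a low-resolution typing."""
    # A haplotype is compatible if each allele starts with the observed low-resolution prefix.
    compatible = {
        hap: freq for hap, freq in haplotype_freq.items()
        if all(allele.startswith(prefix) for allele, prefix in zip(hap, observed))
    }
    total = sum(compatible.values())
    if total == 0:
        return {}
    return {hap: freq / total for hap, freq in compatible.items()}


# Hypothetical training-set haplotype frequencies (HLA-A, HLA-B).
freqs = {
    ("A*02:01", "B*07:02"): 0.06,
    ("A*02:01", "B*44:02"): 0.03,
    ("A*02:05", "B*07:02"): 0.01,
    ("A*01:01", "B*08:01"): 0.10,
}
# Low-resolution typing: only the allele group is known at each locus.
posterior = refine(("A*02", "B*07"), freqs)
```

In this toy table, the typing A*02 with B*07 resolves to A*02:01 with probability 6/7 and to A*02:05 with probability 1/7, because only two of the four haplotypes are compatible and their frequencies are 0.06 and 0.01.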
Affiliation(s)
- Jennifer Listgarten
  - Microsoft Research, Redmond, Washington, United States of America
  - * E-mail: (JL); (DH)
- Zabrina Brumme
  - Partners AIDS Research Center, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Carl Kadie
  - Microsoft Research, Redmond, Washington, United States of America
- Gao Xiaojiang
  - SAIC-Frederick, National Cancer Institute, Frederick, Maryland, United States of America
- Bruce Walker
  - Partners AIDS Research Center, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
  - Howard Hughes Medical Institute, Frederick, Maryland, United States of America
- Mary Carrington
  - SAIC-Frederick, National Cancer Institute, Frederick, Maryland, United States of America
- Philip Goulder
  - Partners AIDS Research Center, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
  - Department of Paediatrics, University of Oxford, Oxford, United Kingdom
- David Heckerman
  - Microsoft Research, Redmond, Washington, United States of America
  - * E-mail: (JL); (DH)
|
18
|
Listgarten J, Frahm N, Kadie C, Brander C, Heckerman D. A statistical framework for modeling HLA-dependent T cell response data. PLoS Comput Biol 2007; 3:1879-86. [PMID: 17937494 PMCID: PMC2014793 DOI: 10.1371/journal.pcbi.0030188] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 07/18/2007] [Accepted: 08/14/2007] [Indexed: 12/18/2022] Open
Abstract
The identification of T cell epitopes and their HLA (human leukocyte antigen) restrictions is important for applications such as the design of cellular vaccines for HIV. Traditional methods for such identification are costly and time-consuming. Recently, a more expeditious laboratory technique using ELISpot assays has been developed that allows for rapid screening of specific responses. However, this assay does not directly provide information concerning the HLA restriction of a response, a critical piece of information for vaccine design. Thus, we introduce, apply, and validate a statistical model for identifying HLA-restricted epitopes from ELISpot data. By looking at patterns across a broad range of donors, in conjunction with our statistical model, we can determine (probabilistically) which of the HLA alleles are likely to be responsible for the observed reactivities. Additionally, we can provide a good estimate of the number of false positives generated by our analysis (i.e., the false discovery rate). This model allows us to learn about new HLA-restricted epitopes from ELISpot data in an efficient, cost-effective, and high-throughput manner. We applied our approach to data from donors infected with HIV and identified many potential new HLA restrictions. Among 134 such predictions, six were confirmed in the lab and the remainder could not be ruled as invalid. These results shed light on the extent of HLA class I promiscuity, which has significant implications for the understanding of HLA class I antigen presentation and vaccine development.
At the core of the human adaptive immune response is the train-to-kill mechanism in which specialized immune cells are sensitized to recognize small peptides from foreign pathogens (e.g., HIV virus). Following this sensitization, these cells are then activated to kill other cells that display this same peptide (and that are infected by this same pathogen). However, for sensitization and killing to occur, the pathogen peptide must be “paired up” with one of the infected person's other specialized immune molecules, an HLA (human leukocyte antigen) molecule. The way in which pathogen peptides interact with these HLA molecules defines if and how an immune response will be generated, which has implications for vaccine design where one may artificially introduce select peptides to pre-train the immune system. Furthermore, there is a huge repertoire of such HLA molecules, with almost no two people having the same set. We introduce a statistical approach for identifying which HLA molecules interact with which pathogen peptides, given a particular kind of laboratory data. Our approach takes as input, data that tells us only which pathogen peptides generate a response, but not which HLA molecules support the response. Our statistical approach fills in this missing information.
Affiliation(s)
- Nicole Frahm
  - Partners AIDS Research Center, Massachusetts General Hospital, Charlestown, Massachusetts, United States of America
- Carl Kadie
  - Microsoft Research, Redmond, Washington, United States of America
- Christian Brander
  - Partners AIDS Research Center, Massachusetts General Hospital, Charlestown, Massachusetts, United States of America
- David Heckerman
  - Microsoft Research, Redmond, Washington, United States of America
  - * To whom correspondence should be addressed. E-mail:
|
19
|
Frahm N, Yusim K, Suscovich TJ, Adams S, Sidney J, Hraber P, Hewitt HS, Linde CH, Kavanagh DG, Woodberry T, Henry LM, Faircloth K, Listgarten J, Kadie C, Jojic N, Sango K, Brown NV, Pae E, Zaman MT, Bihl F, Khatri A, John M, Mallal S, Marincola FM, Walker BD, Sette A, Heckerman D, Korber BT, Brander C. Extensive HLA class I allele promiscuity among viral CTL epitopes. Eur J Immunol 2007; 37:2419-33. [PMID: 17705138 PMCID: PMC2628559 DOI: 10.1002/eji.200737365] [Citation(s) in RCA: 106] [Impact Index Per Article: 6.2] [Indexed: 11/07/2022]
Abstract
Promiscuous binding of T helper epitopes to MHC class II molecules has been well established, but few examples of promiscuous class I-restricted epitopes exist. To address the extent of promiscuity of HLA class I peptides, responses to 242 well-defined viral epitopes were tested in 100 subjects regardless of the individuals' HLA type. Surprisingly, half of all detected responses were seen in the absence of the originally reported restricting HLA class I allele, and only 3% of epitopes were recognized exclusively in the presence of their original allele. Functional assays confirmed the frequent recognition of HLA class I-restricted T cell epitopes on several alternative alleles across HLA class I supertypes and encoded on different class I loci. These data have significant implications for the understanding of MHC class I-restricted antigen presentation and vaccine development.
Collapse
Affiliation(s)
- Nicole Frahm
  - Partners AIDS Research Center, Massachusetts General Hospital and Division of AIDS, Harvard Medical School, Boston, MA
- Karina Yusim
  - Theoretical Biophysics, Los Alamos National Laboratory, Los Alamos, NM
- Todd J. Suscovich
  - Partners AIDS Research Center, Massachusetts General Hospital and Division of AIDS, Harvard Medical School, Boston, MA
- John Sidney
  - La Jolla Institute of Allergy and Immunology, San Diego, CA
- Peter Hraber
  - Theoretical Biophysics, Los Alamos National Laboratory, Los Alamos, NM
- Hannah S. Hewitt
  - Partners AIDS Research Center, Massachusetts General Hospital and Division of AIDS, Harvard Medical School, Boston, MA
- Caitlyn H. Linde
  - Partners AIDS Research Center, Massachusetts General Hospital and Division of AIDS, Harvard Medical School, Boston, MA
- Daniel G. Kavanagh
  - Partners AIDS Research Center, Massachusetts General Hospital and Division of AIDS, Harvard Medical School, Boston, MA
- Tonia Woodberry
  - Partners AIDS Research Center, Massachusetts General Hospital and Division of AIDS, Harvard Medical School, Boston, MA
- Leah M. Henry
  - Partners AIDS Research Center, Massachusetts General Hospital and Division of AIDS, Harvard Medical School, Boston, MA
- Kellie Faircloth
  - Partners AIDS Research Center, Massachusetts General Hospital and Division of AIDS, Harvard Medical School, Boston, MA
- Kaori Sango
  - Partners AIDS Research Center, Massachusetts General Hospital and Division of AIDS, Harvard Medical School, Boston, MA
- Nancy V. Brown
  - Partners AIDS Research Center, Massachusetts General Hospital and Division of AIDS, Harvard Medical School, Boston, MA
- Eunice Pae
  - Fenway Community Health Center, Boston, MA
- Florian Bihl
  - Partners AIDS Research Center, Massachusetts General Hospital and Division of AIDS, Harvard Medical School, Boston, MA
- Ashok Khatri
  - Endocrine Unit, Massachusetts General Hospital and Division of AIDS, Harvard Medical School, Boston, MA
- Mina John
  - Centre for Clinical Immunology and Biomedical Statistics, Royal Perth Hospital and Murdoch University, Perth, Australia
- Simon Mallal
  - Centre for Clinical Immunology and Biomedical Statistics, Royal Perth Hospital and Murdoch University, Perth, Australia
- Bruce D. Walker
  - Partners AIDS Research Center, Massachusetts General Hospital and Division of AIDS, Harvard Medical School, Boston, MA
  - Howard Hughes Medical Institute, Massachusetts General Hospital and Division of AIDS, Harvard Medical School, Boston, MA
- Bette T. Korber
  - Theoretical Biophysics, Los Alamos National Laboratory, Los Alamos, NM
  - Santa Fe Institute, Santa Fe, NM, USA
- Christian Brander
  - Partners AIDS Research Center, Massachusetts General Hospital and Division of AIDS, Harvard Medical School, Boston, MA
|
20
|
Abstract
We present a model for predicting HLA class I restricted CTL epitopes. In contrast to almost all other work in this area, we train a single model on epitopes from all HLA alleles and supertypes, yet retain the ability to make epitope predictions for specific HLA alleles. We are therefore able to leverage data across all HLA alleles and/or their supertypes, automatically learning what information should be shared and also how to combine allele-specific, supertype-specific, and global information in a principled way. We show that this leveraging can improve prediction of epitopes having HLA alleles with known supertypes, and dramatically increases our ability to predict epitopes having alleles which do not fall into any of the known supertypes. Our model, which is based on logistic regression, is simple to implement and understand, is solved by finding a single global maximum, and is more accurate (to our knowledge) than any other model.
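The idea of sharing information across alleles can be illustrated with a minimal sketch. It assumes one common way to get this behaviour from a single logistic regression: each peptide feature is emitted three times, once globally, once tagged with the supertype, and once tagged with the allele. This is an illustration only, not the authors' implementation; the supertype map, the toy peptides, and the names `SUPERTYPE`, `features`, `predict`, and `train` are all hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical allele-to-supertype map; real assignments would come from a curated table.
SUPERTYPE = {"A*0201": "A2", "A*0206": "A2", "B*0702": "B7"}

def features(peptide, allele):
    """Emit each position/residue feature at three levels: global, supertype, allele."""
    feats = []
    supertype = SUPERTYPE.get(allele)  # None when the allele has no known supertype
    for pos, aa in enumerate(peptide):
        base = f"{pos}:{aa}"
        feats.append(("global", base))
        if supertype is not None:
            feats.append(("supertype", supertype, base))
        feats.append(("allele", allele, base))
    return feats

def predict(weights, peptide, allele):
    """Logistic regression probability that the peptide is an epitope for the allele."""
    z = sum(weights[f] for f in features(peptide, allele))
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, epochs=200, lr=0.1, l2=0.01):
    """Stochastic gradient ascent on the L2-penalised log-likelihood."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for peptide, allele, label in examples:
            err = label - predict(weights, peptide, allele)
            for f in features(peptide, allele):
                weights[f] += lr * (err - l2 * weights[f])
    return weights

# Toy, made-up training data (not real epitopes): (peptide, allele, is_epitope).
examples = [
    ("KLMNPQRST", "A*0201", 1),
    ("AAAAAAAAA", "A*0201", 0),
    ("DEFGHIKLM", "B*0702", 1),
    ("CCCCCCCCC", "B*0702", 0),
]
weights = train(examples)

# A*0206 never appears in training, but shares the global and A2 supertype weights.
print(predict(weights, "KLMNPQRST", "A*0206"))
```

Because the penalised log-likelihood is concave, training has a single global maximum, in line with the property the abstract highlights. In this sketch an allele with no known supertype falls back on the global weights alone.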
|
21
|
Abstract
Population structure can confound the identification of correlations in biological data. Such confounding has been recognized in multiple biological disciplines, resulting in a disparate collection of proposed solutions. We examine several methods that correct for confounding on discrete data with hierarchical population structure and identify two distinct confounding processes, which we call coevolution and conditional influence. We describe these processes in terms of generative models and show that these generative models can be used to correct for the confounding effects. Finally, we apply the models to three applications: identification of escape mutations in HIV-1 in response to specific HLA-mediated immune pressure, prediction of coevolving residues in an HIV-1 peptide, and a search for genotypes that are associated with bacterial resistance traits in Arabidopsis thaliana. We show that coevolution is a better description of confounding in some applications and conditional influence is better in others. That is, we show that no single method is best for addressing all forms of confounding. Analysis tools based on these models are available on the internet as both web based applications and downloadable source code at http://atom.research.microsoft.com/bio/phylod.aspx.
Affiliation(s)
- Jonathan Carlson
  - Machine Learning and Applied Statistics Group, Microsoft Research, Redmond, Washington, United States of America
  - Department of Computer Science and Engineering, University of Washington, Seattle, Washington, United States of America
- Carl Kadie
  - Machine Learning and Applied Statistics Group, Microsoft Research, Redmond, Washington, United States of America
- Simon Mallal
  - Center for Clinical Immunology and Biomedical Statistics, Royal Perth Hospital, Perth, Australia
- David Heckerman
  - Machine Learning and Applied Statistics Group, Microsoft Research, Redmond, Washington, United States of America
|
22
|
Bhattacharya T, Daniels M, Heckerman D, Foley B, Frahm N, Kadie C, Carlson J, Yusim K, McMahon B, Gaschen B, Mallal S, Mullins JI, Nickle DC, Herbeck J, Rousseau C, Learn GH, Miura T, Brander C, Walker B, Korber B. Founder effects in the assessment of HIV polymorphisms and HLA allele associations. Science 2007; 315:1583-6. [PMID: 17363674 DOI: 10.1126/science.1131528] [Citation(s) in RCA: 210] [Impact Index Per Article: 12.4] [Indexed: 12/20/2022]
Abstract
Escape from T cell-mediated immune responses affects the ongoing evolution of rapidly evolving viruses such as HIV. By applying statistical approaches that account for phylogenetic relationships among viral sequences, we show that viral lineage effects rather than immune escape often explain apparent human leukocyte antigen (HLA)-mediated immune-escape mutations defined by older analysis methods. Phylogenetically informed methods identified immune-susceptible locations with greatly improved accuracy, and the associations we identified with these methods were experimentally validated. This approach has practical implications for understanding the impact of host immunity on pathogen evolution and for defining relevant variants for inclusion in vaccine antigens.
|
23
|
Abstract
MOTIVATION AND RESULTS: Motivated by the ability of a simple threading approach to predict MHC I-peptide binding, we developed a new and improved structure-based model for which parameters can be estimated from additional sources of data about MHC-peptide binding. In addition to the known 3D structures of a small number of MHC-peptide complexes that were used in the original threading approach, we included three other sources of information on peptide-MHC binding: (1) MHC class I sequences; (2) known binding energies for a large number of MHC-peptide complexes; and (3) an even larger binary dataset that contains information about strong binders (epitopes) and non-binders (peptides that have a low affinity for a particular MHC molecule). Our model significantly outperforms the standard threading approach in binding energy prediction. In our approach, which we call adaptive double threading, the parameters of the threading model are learnable, and both MHC and peptide sequences can be threaded onto structures of other alleles. These two properties make our model appropriate for predicting binding for alleles for which very little data (if any) is available beyond just their sequence, including prediction for alleles for which 3D structures are not available. The ability of our model to generalize beyond the MHC types for which training data is available also separates our approach from epitope prediction methods which treat MHC alleles as symbolic types, rather than biological sequences. We used the trained binding energy predictor to study viral infections in 246 HIV patients from the West Australian cohort, and over 1000 sequences in HIV clade B from Los Alamos National Laboratory database, capturing the course of HIV evolution over the last 20 years. Finally, we illustrate short-, medium-, and long-term adaptation of HIV to the human immune system. AVAILABILITY: http://www.research.microsoft.com/~jojic/hlaBinding.html.
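The general shape of a threading-style energy can be shown with a short sketch, under strong simplifying assumptions. It sums a pairwise residue potential over MHC-peptide contacts taken from a template structure, and it threads a different MHC sequence onto the same contacts. The hard distance cutoff, the contact list, the potential values, and the names `distance_weight` and `threading_energy` are hypothetical and are not taken from the paper; in the approach the abstract describes, such parameters would be learned from data rather than fixed by hand.

```python
from typing import Dict, List, Tuple

# (MHC residue index, peptide position, distance in angstroms) from a template structure.
Contact = Tuple[int, int, float]

def distance_weight(distance: float, cutoff: float = 4.5) -> float:
    """Hypothetical hard cutoff: a contact counts fully within the cutoff, else not at all."""
    return 1.0 if distance <= cutoff else 0.0

def threading_energy(
    mhc_seq: str,
    peptide: str,
    contacts: List[Contact],
    potential: Dict[Tuple[str, str], float],
) -> float:
    """Sum the pairwise potential over template contacts for the threaded sequences."""
    energy = 0.0
    for mhc_idx, pep_idx, distance in contacts:
        pair = (mhc_seq[mhc_idx], peptide[pep_idx])
        # Residue pairs missing from the potential table contribute nothing.
        energy += distance_weight(distance) * potential.get(pair, 0.0)
    return energy

# Toy, made-up values for illustration only.
contacts = [(0, 0, 3.8), (1, 1, 4.0), (3, 2, 6.0)]
potential = {("Y", "K"): -1.0, ("F", "L"): -0.5, ("M", "V"): -2.0}

# Same template contacts, two different MHC sequences threaded onto them.
print(threading_energy("YFAM", "KLV", contacts, potential))
print(threading_energy("HFAM", "KLV", contacts, potential))
```

Keeping the contact geometry separate from the sequences is what allows a sequence without a solved structure to be scored against a template from another allele.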
|