1
Di Francia R, Crisci S, De Monaco A, Cafiero C, Re A, Iaccarino G, De Filippi R, Frigeri F, Corazzelli G, Micera A, Pinto A. Response and Toxicity to Cytarabine Therapy in Leukemia and Lymphoma: From Dose Puzzle to Pharmacogenomic Biomarkers. Cancers (Basel) 2021; 13:cancers13050966. [PMID: 33669053 PMCID: PMC7956511 DOI: 10.3390/cancers13050966] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 01/17/2021] [Revised: 02/18/2021] [Accepted: 02/19/2021] [Indexed: 01/04/2023]
Abstract
Simple Summary: In this review, the authors propose a cross-cutting examination of cytarabine-related issues ranging from the spectrum of clinical activity and severe toxicities, through updated cellular pharmacology and drug formulations, to the genetic variants associated with drug-induced phenotypes. Cytarabine (cytosine arabinoside; Ara-C) is often used in multiagent chemotherapy regimens to treat leukemia and lymphoma, as well as neoplastic meningitis. These regimens can induce a suboptimal clinical outcome in a fraction of patients. The individual variability in clinical response to leukemia and lymphoma treatments appears to be associated with intracellular accumulation of Ara-CTP due to genetic variants related to metabolic enzymes. The review provides comprehensive information on the effects of Ara-C-based therapies, including adverse drug reactions such as bone pain, ocular toxicity (corneal pain, keratoconjunctivitis, and blurred vision), maculopapular rash, and occasional chest pain. Evidence for predicting the response to cytarabine-based treatments is highlighted, pointing to its significant impact on the routine management of blood cancers.
Abstract: Cytarabine is a pyrimidine nucleoside analog commonly used in multiagent chemotherapy regimens for the treatment of leukemia and lymphoma, as well as for neoplastic meningitis. Ara-C-based chemotherapy regimens can induce a suboptimal clinical outcome in a fraction of patients. Several studies suggest that the individual variability in clinical response to leukemia and lymphoma treatments, underlying either resistance to Ara-C or toxicity, appears to be associated with the intracellular accumulation and retention of Ara-CTP due to genetic variants related to metabolic enzymes. Herein, we report (a) the latest pharmacogenomic biomarkers associated with the response to cytarabine and (b) the new drug formulations with optimized pharmacokinetics. The purpose of this review is to provide readers with detailed and comprehensive information on the effects of Ara-C-based therapies, from biology to clinical practice, sustaining the interest of both researchers and clinical hematologists. This review could help clinicians predict the response to cytarabine-based treatments.
Affiliation(s)
- Raffaele Di Francia
  - Italian Association of Pharmacogenomics and Molecular Diagnostics, 60126 Ancona, Italy
- Stefania Crisci
  - Hematology-Oncology and Stem Cell transplantation Unit, National Cancer Institute, Fondazione “G. Pascale” IRCCS, 80131 Naples, Italy; (S.C.); (G.I.); (R.D.F.); (G.C.); (A.P.)
- Angela De Monaco
  - Clinical Pathology, ASL Napoli 2 Nord, “S.M. delle Grazie Hospital”, 80078 Pozzuoli, Italy
- Concetta Cafiero
  - Medical Oncology, S.G. Moscati, Statte, 74010 Taranto, Italy
  - Correspondence: (C.C.); (A.M.); Tel.: +39-34-0101-2002 (C.C.); +39-06-4554-1191 (A.M.)
- Agnese Re
  - Università Cattolica del Sacro Cuore, 00168 Rome, Italy
- Giancarla Iaccarino
  - Hematology-Oncology and Stem Cell transplantation Unit, National Cancer Institute, Fondazione “G. Pascale” IRCCS, 80131 Naples, Italy; (S.C.); (G.I.); (R.D.F.); (G.C.); (A.P.)
- Rosaria De Filippi
  - Hematology-Oncology and Stem Cell transplantation Unit, National Cancer Institute, Fondazione “G. Pascale” IRCCS, 80131 Naples, Italy; (S.C.); (G.I.); (R.D.F.); (G.C.); (A.P.)
  - Department of Clinical Medicine and Surgery, Federico II University, 80131 Naples, Italy
- Gaetano Corazzelli
  - Hematology-Oncology and Stem Cell transplantation Unit, National Cancer Institute, Fondazione “G. Pascale” IRCCS, 80131 Naples, Italy; (S.C.); (G.I.); (R.D.F.); (G.C.); (A.P.)
- Alessandra Micera
  - Research and Development Laboratory for Biochemical, Molecular and Cellular Applications in Ophthalmological Sciences, IRCCS—Fondazione Bietti, 00184 Rome, Italy
  - Correspondence: (C.C.); (A.M.); Tel.: +39-34-0101-2002 (C.C.); +39-06-4554-1191 (A.M.)
- Antonio Pinto
  - Hematology-Oncology and Stem Cell transplantation Unit, National Cancer Institute, Fondazione “G. Pascale” IRCCS, 80131 Naples, Italy; (S.C.); (G.I.); (R.D.F.); (G.C.); (A.P.)
2
Abstract
Since the initial success of genome-wide association studies (GWAS) in 2005, tens of thousands of genetic variants have been identified for hundreds of human diseases and traits. In a GWAS, genotype information at up to millions of genetic markers is collected from up to hundreds of thousands of individuals, together with their phenotype information. Several scientific goals can be accomplished through the analysis of GWAS data, including the identification of variants, genes, and pathways associated with diseases and traits of interest; the inference of the genetic architecture of these traits; and the development of genetic risk prediction models. In this review, we provide an overview of the statistical challenges in achieving these goals and recent progress in statistical methodology to address these challenges.
Affiliation(s)
- Ning Sun
  - Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut 06520, USA
- Hongyu Zhao
  - Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut 06520, USA
3
Leveraging genome characteristics to improve gene discovery for putamen subcortical brain structure. Sci Rep 2017; 7:15736. [PMID: 29147026 PMCID: PMC5691156 DOI: 10.1038/s41598-017-15705-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 07/24/2017] [Accepted: 10/31/2017] [Indexed: 12/21/2022]
Abstract
Discovering genetic variants associated with human brain structures is an ongoing effort. The ENIGMA consortium conducted genome-wide association studies (GWAS) with standard multi-study analytical methodology and identified several significant single nucleotide polymorphisms (SNPs). Here we employ a novel analytical approach that incorporates functional genome annotations (e.g., exon or 5′UTR), total linkage disequilibrium (LD) scores and heterozygosity to construct enrichment scores for improved identification of relevant SNPs. The method provides increased power to detect associated SNPs by estimating stratum-specific false discovery rate (FDR), where strata are classified according to enrichment scores. Applying this approach to the GWAS summary statistics of putamen volume in the ENIGMA cohort, a total of 15 independent significant SNPs were identified (conditional FDR < 0.05). In contrast, 4 SNPs were found based on standard GWAS analysis (P < 5 × 10⁻⁸). The 11 novel loci include GATAD2B, ASCC3, DSCAML1, and HELZ, which have previously been implicated in various neural-related phenotypes. The current findings demonstrate the boost in power with the annotation-informed FDR method, and provide insight into the genetic architecture of the putamen.
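The idea of controlling FDR separately within enrichment strata can be illustrated with a small sketch. This is not the authors' conditional FDR estimator: it simply applies the Benjamini-Hochberg step-up rule inside each stratum, and the p-values, stratum labels, and function names (`bh_reject`, `stratified_bh`) are made up for illustration.

```python
from collections import defaultdict


def bh_reject(pvalues, alpha=0.05):
    """Return the set of indices rejected by the Benjamini-Hochberg step-up rule."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0  # number of smallest p-values that end up rejected
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= alpha * rank / m:
            cutoff = rank
    return set(order[:cutoff])


def stratified_bh(pvalues, strata, alpha=0.05):
    """Apply BH separately inside each stratum; return rejected global indices."""
    groups = defaultdict(list)
    for i, label in enumerate(strata):
        groups[label].append(i)
    rejected = set()
    for members in groups.values():
        local = bh_reject([pvalues[i] for i in members], alpha)
        rejected.update(members[j] for j in local)
    return rejected


# Toy data (hypothetical): the "high" enrichment stratum holds most of the signal.
pvals = [0.001, 0.012, 0.030, 0.20, 0.50, 0.60, 0.70, 0.80, 0.90, 0.95]
strata = ["high"] * 3 + ["low"] * 7

pooled = bh_reject(pvals)                # all SNPs tested as one family
stratified = stratified_bh(pvals, strata)  # each stratum tested on its own
print(sorted(pooled), sorted(stratified))
```

On this toy input the pooled analysis rejects only the first SNP, while the stratified analysis rejects all three SNPs in the enriched stratum, which is the kind of power gain the abstract describes.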
4
Zhang J, Feng JY, Ni YL, Wen YJ, Niu Y, Tamba CL, Yue C, Song Q, Zhang YM. pLARmEB: integration of least angle regression with empirical Bayes for multilocus genome-wide association studies. Heredity (Edinb) 2017; 118:517-524. [PMID: 28295030 PMCID: PMC5436030 DOI: 10.1038/hdy.2017.8] [Citation(s) in RCA: 117] [Impact Index Per Article: 16.7] [Received: 10/05/2016] [Revised: 01/14/2017] [Accepted: 01/20/2017] [Indexed: 02/06/2023]
Abstract
Multilocus genome-wide association studies (GWAS) have become the state-of-the-art procedure to identify quantitative trait nucleotides (QTNs) associated with complex traits. However, implementing a multilocus model in GWAS is still difficult. In this study, we integrated least angle regression with empirical Bayes to perform multilocus GWAS under polygenic background control. We used a model-transformation algorithm that whitened the covariance matrix of the polygenic matrix K and environmental noise. Markers on one chromosome were included simultaneously in a multilocus model, and least angle regression was used to select the most potentially associated single-nucleotide polymorphisms (SNPs), whereas the markers on the other chromosomes were used to calculate the kinship matrix as polygenic background control. The SNPs selected in the multilocus model were then tested for association with the trait by empirical Bayes and a likelihood ratio test. We herein refer to this method as pLARmEB (polygenic-background-control-based least angle regression plus empirical Bayes). Results from simulation studies showed that pLARmEB was more powerful in QTN detection and more accurate in QTN effect estimation, had a lower false positive rate, and required less computing time than the Bayesian hierarchical generalized linear model, efficient mixed model association (EMMA), and least angle regression plus empirical Bayes. pLARmEB, the multilocus random-SNP-effect mixed linear model, and fast multilocus random-SNP-effect EMMA methods had almost equal power of QTN detection in simulation experiments. However, only pLARmEB identified 48 previously reported genes for 7 flowering time-related traits in Arabidopsis thaliana.
Affiliation(s)
- J Zhang
  - State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing, China
- J-Y Feng
  - State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing, China
- Y-L Ni
  - State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing, China
- Y-J Wen
  - State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing, China
- Y Niu
  - State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing, China
- C L Tamba
  - State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing, China
- C Yue
  - State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing, China
- Q Song
  - Soybean Genomics and Improvement Laboratory, Agricultural Research Service, United States Department of Agriculture, Beltsville, MD, USA
- Y-M Zhang
  - State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing, China
  - Statistical Genomics Lab, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan, China
5
Affiliation(s)
- Ang Li
  - Department of Statistics, University of Chicago, Chicago, IL
6
Pereira M, Thompson JR, Weichenberger CX, Thomas DC, Minelli C. Inclusion of biological knowledge in a Bayesian shrinkage model for joint estimation of SNP effects. Genet Epidemiol 2017; 41:320-331. [PMID: 28393391 DOI: 10.1002/gepi.22038] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 09/02/2016] [Revised: 12/18/2016] [Accepted: 12/26/2016] [Indexed: 01/04/2023]
Abstract
With the aim of improving detection of novel single-nucleotide polymorphisms (SNPs) in genetic association studies, we propose a method of including prior biological information in a Bayesian shrinkage model that jointly estimates SNP effects. We assume that the SNP effects follow a normal distribution centered at zero with variance controlled by a shrinkage hyperparameter. We use biological information to define the amount of shrinkage applied on the SNP effects distribution, so that the effects of SNPs with more biological support are less shrunk toward zero, thus being more likely detected. The performance of the method was tested in a simulation study (1,000 datasets, 500 subjects with ∼200 SNPs in 10 linkage disequilibrium (LD) blocks) using a continuous and a binary outcome. It was further tested in an empirical example on body mass index (continuous) and overweight (binary) in a dataset of 1,829 subjects and 2,614 SNPs from 30 blocks. Biological knowledge was retrieved using the bioinformatics tool Dintor, which queried various databases. The joint Bayesian model with inclusion of prior information outperformed the standard analysis: in the simulation study, the mean ranking of the true LD block was 2.8 for the Bayesian model versus 3.6 for the standard analysis of individual SNPs; in the empirical example, the mean ranking of the six true blocks was 8.5 versus 9.3 in the standard analysis. These results suggest that our method is more powerful than the standard analysis. We expect its performance to improve further as more biological information about SNPs becomes available.
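The effect of a zero-centered normal prior whose variance depends on biological support can be seen in a one-SNP simplification. The sketch below assumes a single SNP with estimate `beta_hat` and standard error `se`, treated independently; the paper's model estimates SNP effects jointly, so this is only an illustration of the shrinkage mechanism, and all numbers and names are hypothetical.

```python
def posterior_effect(beta_hat, se, prior_var):
    """Posterior mean and variance of one SNP effect under a N(0, prior_var) prior.

    Assumes beta_hat | beta ~ N(beta, se**2), the standard normal-normal model.
    """
    shrink = prior_var / (prior_var + se ** 2)  # in (0, 1); closer to 1 = less shrinkage
    post_mean = shrink * beta_hat
    post_var = shrink * se ** 2
    return post_mean, post_var


beta_hat, se = 0.30, 0.10  # hypothetical estimate and standard error

# Little biological support: small prior variance, strong shrinkage toward zero.
weak_support = posterior_effect(beta_hat, se, prior_var=0.01)
# More biological support: larger prior variance, weaker shrinkage.
strong_support = posterior_effect(beta_hat, se, prior_var=0.09)

print(weak_support, strong_support)
```

With these inputs the weakly supported SNP is shrunk to about half of its estimate (roughly 0.15), while the strongly supported SNP keeps about 90% of it (roughly 0.27), so the latter is more likely to stand out.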
Affiliation(s)
- Miguel Pereira
  - National Heart and Lung Institute, Imperial College London, London, United Kingdom
- John R Thompson
  - Department of Health Sciences, University of Leicester, Leicester, United Kingdom
- Christian X Weichenberger
  - Center for Biomedicine, European Academy of Bolzano/Bozen (EURAC), Bolzano, Italy, Affiliated to the University of Lübeck, Lübeck, Germany
- Duncan C Thomas
  - Biostatistics Division, Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Cosetta Minelli
  - National Heart and Lung Institute, Imperial College London, London, United Kingdom
7
O'Brien KM, Cole SR, Poole C, Bensen JT, Herring AH, Engel LS, Millikan RC. Replication of breast cancer susceptibility loci in whites and African Americans using a Bayesian approach. Am J Epidemiol 2014; 179:382-94. [PMID: 24218030 DOI: 10.1093/aje/kwt258] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Indexed: 11/14/2022]
Abstract
Genome-wide association studies (GWAS) and candidate gene analyses have led to the discovery of several dozen genetic polymorphisms associated with breast cancer susceptibility, many of which are considered well-established risk factors for the disease. Despite attempts to replicate these same variant-disease associations in African Americans, the evaluable populations are often too small to produce precise or consistent results. We estimated the associations between 83 previously identified single nucleotide polymorphisms (SNPs) and breast cancer among Carolina Breast Cancer Study (1993-2001) participants using maximum likelihood, Bayesian, and hierarchical methods. The selected SNPs were previous GWAS hits (n = 22), near-hits (n = 19), otherwise well-established risk loci (n = 5), or located in the same genes as selected variants (n = 37). We successfully replicated 18 GWAS-identified SNPs in whites (n = 2,352) and 10 in African Americans (n = 1,447). SNPs in the fibroblast growth factor receptor 2 gene (FGFR2) and the TOX high mobility group box family member 3 gene (TOX3) were strongly associated with breast cancer in both races. SNPs in the mitochondrial ribosomal protein S30 gene (MRPS30), mitogen-activated protein kinase kinase kinase 1 gene (MAP3K1), zinc finger, MIZ-type containing 1 gene (ZMIZ1), and H19, imprinted maternally expressed transcript gene (H19) were associated with breast cancer in whites, and SNPs in the estrogen receptor 1 gene (ESR1) and H19 gene were associated with breast cancer in African Americans. We provide precise and well-informed race-stratified odds ratios for key breast cancer-related SNPs. Our results demonstrate the utility of Bayesian methods in genetic epidemiology and provide support for their application in small, etiologically driven investigations.
8
Carbonetto P, Stephens M. Integrated enrichment analysis of variants and pathways in genome-wide association studies indicates central role for IL-2 signaling genes in type 1 diabetes, and cytokine signaling genes in Crohn's disease. PLoS Genet 2013; 9:e1003770. [PMID: 24098138 PMCID: PMC3789883 DOI: 10.1371/journal.pgen.1003770] [Citation(s) in RCA: 56] [Impact Index Per Article: 5.1] [Received: 08/21/2012] [Accepted: 07/22/2013] [Indexed: 12/17/2022]
Abstract
Pathway analyses of genome-wide association studies aggregate information over sets of related genes, such as genes in common pathways, to identify gene sets that are enriched for variants associated with disease. We develop a model-based approach to pathway analysis, and apply this approach to data from the Wellcome Trust Case Control Consortium (WTCCC) studies. Our method offers several benefits over existing approaches. First, our method not only interrogates pathways for enrichment of disease associations, but also estimates the level of enrichment, which yields a coherent way to promote variants in enriched pathways, enhancing discovery of genes underlying disease. Second, our approach allows for multiple enriched pathways, a feature that leads to novel findings in two diseases where the major histocompatibility complex (MHC) is a major determinant of disease susceptibility. Third, by modeling disease as the combined effect of multiple markers, our method automatically accounts for linkage disequilibrium among variants. Interrogation of pathways from eight pathway databases yields strong support for enriched pathways, indicating links between Crohn's disease (CD) and cytokine-driven networks that modulate immune responses; between rheumatoid arthritis (RA) and “Measles” pathway genes involved in immune responses triggered by measles infection; and between type 1 diabetes (T1D) and IL2-mediated signaling genes. Prioritizing variants in these enriched pathways yields many additional putative disease associations compared to analyses without enrichment. For CD and RA, 7 of 8 additional non-MHC associations are corroborated by other studies, providing validation for our approach. For T1D, prioritization of IL-2 signaling genes yields strong evidence for 7 additional non-MHC candidate disease loci, as well as suggestive evidence for several more. Of the 7 strongest associations, 4 are validated by other studies, and 3 (near IL-2 signaling genes RAF1, MAPK14, and FYN) constitute novel putative T1D loci for further study.
Genome-wide association studies have helped locate gene variants that affect our susceptibility to diseases. The analysis of these studies is typically straightforward: test whether each genetic variant is correlated with predisposition to disease. This approach often works well for identifying commonly occurring variants with moderate effects on disease risk. However, the effects of many variants are so small that they fail to register statistically significant correlations. This is a concern because many diseases are modulated by many genetic factors with small effects on disease risk. An alternative is to examine groups of variants, such as variants sharing a common pathway, and assess whether these groups are “enriched” for correlations with disease. This can be a more effective approach to identifying genetic factors relevant to disease. However, it does not tell us which genes are associated with disease. To address this limitation, we describe an approach that integrates enrichment analysis with tests for disease-variant correlations within a single framework. We illustrate this approach in genome-wide studies of seven complex diseases. We show that our approach supports enriched pathways in several diseases, and uncovers disease-susceptibility genes in these pathways not identified in conventional analyses of the same data.
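The conventional enrichment question mentioned above (is a pathway over-represented among disease-associated genes?) is often answered with a hypergeometric tail probability. The sketch below shows that generic test, not the model-based method of this paper; the gene counts and the function name `enrichment_pvalue` are hypothetical.

```python
from math import comb


def enrichment_pvalue(total, associated, pathway_size, overlap):
    """P(X >= overlap) when X ~ Hypergeometric(total, associated, pathway_size).

    total        -- number of genes considered
    associated   -- number of genes flagged as disease-associated
    pathway_size -- number of genes in the pathway
    overlap      -- number of pathway genes that are also associated
    """
    upper = min(associated, pathway_size)
    tail = sum(
        comb(associated, k) * comb(total - associated, pathway_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(total, pathway_size)


# Toy example: 20 genes, 5 associated, a pathway of 4 genes of which 3 are associated.
p = enrichment_pvalue(total=20, associated=5, pathway_size=4, overlap=3)
print(round(p, 4))
```

A small tail probability suggests enrichment, but, as the text notes, it does not by itself say which genes in the pathway drive the signal.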
Affiliation(s)
- Peter Carbonetto
  - Dept. of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- Matthew Stephens
  - Dept. of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
  - Dept. of Statistics, University of Chicago, Chicago, Illinois, United States of America
9
Abstract
Summary: Recent technological developments in measuring genetic variation have ushered in an era of genome-wide association studies which have discovered many genes involved in human disease. Current methods to perform association studies collect genetic information and compare the frequency of variants in individuals with and without the disease. Standard approaches do not take into account any information on whether or not a given variant is likely to have an effect on the disease. We propose a novel method for computing an association statistic which takes into account prior information. Our method improves both power and resolution by 8% and 27%, respectively, over traditional methods for performing association studies when applied to simulations using the HapMap data. Advantages of our method are that it is as simple to apply to association studies as standard methods, the results of the method are interpretable as the method reports p-values, and the method is optimal in its use of prior information with regard to statistical power.
Availability: The method presented herein is available at http://masa.cs.ucla.edu
Contact: eeskin@cs.ucla.edu
Affiliation(s)
- Gregory Darnell
  - Department of Computer Science, University of California, Los Angeles, CA 90095, USA
10
Abstract
Genetic variation influences the response of an individual to drug treatments. Understanding this variation has the potential to make therapy safer and more effective by determining selection and dosing of drugs for an individual patient. In the context of cancer, tumours may have specific disease-defining mutations, but a patient's germline genetic variation will also affect drug response (both efficacy and toxicity), and here we focus on how to study this variation. Advances in sequencing technologies, statistical genetics analysis methods and clinical trial designs have shown promise for the discovery of variants associated with drug response. We discuss the application of germline genetics analysis methods to cancer pharmacogenomics with a focus on the special considerations for study design.
11
Zheng G, Yuan A, Jeffries N. Hybrid Bayes factors for genome-wide association studies when a robust test is used. Comput Stat Data Anal 2011. [DOI: 10.1016/j.csda.2011.03.021] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/17/2022]
12
Cabras S, Castellanos ME, Biino G, Persico I, Sassu A, Casula L, Del Giacco S, Bertolino F, Pirastu M, Pirastu N. A strategy analysis for genetic association studies with known inbreeding. BMC Genet 2011; 12:63. [PMID: 21767363 PMCID: PMC3155486 DOI: 10.1186/1471-2156-12-63] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 03/16/2011] [Accepted: 07/18/2011] [Indexed: 11/10/2022]
Abstract
Background: Association studies aim to identify the genetic variants related to a specific disease through statistical multiple hypothesis testing or segregation analysis in pedigrees. Such studies have been very successful for Mendelian monogenic disorders but less successful in identifying genetic variants related to complex diseases, whose onset depends on interactions between different genes and the environment. Current technology allows more than a million markers to be genotyped, and this number has been increasing rapidly in recent years with imputation based on template sets and whole-genome sequencing. This type of data introduces a great amount of noise into the statistical analysis and usually requires a large number of samples. Current methods seldom take into account gene-gene and gene-environment interactions, which are fundamental, especially in complex diseases. In this paper we propose to use a non-parametric additive model, which accounts for interactions of unknown order, to detect the genetic variants related to diseases. Although this is not new to the current literature, we show that in an isolated population, where the most closely related subjects also share most of their genetic code, the use of additive models may be improved if the available genealogical tree is taken into account. Specifically, we form a sample of cases and controls with the highest inbreeding by means of the Hungarian method, and estimate the set of genes and environmental variables associated with the disease by means of Random Forest.
Results: We have evidence, from statistical theory, simulations and two applications, that we have built a suitable procedure to eliminate stratification between cases and controls and that it also has enough precision in identifying genetic variants responsible for a disease. This procedure has been successfully applied to beta-thalassemia, a well-known Mendelian disease, and to common asthma, for which we identified candidate genes underlying susceptibility. Some of these candidate genes have also been reported as related to common asthma in the current literature.
Conclusions: The data analysis approach, based on selecting the most related cases and controls along with the Random Forest model, is a powerful tool for detecting genetic variants associated with a disease in isolated populations. Moreover, this method also provides a prediction model that estimates the unknown disease status accurately and that can be generally used to build kit tests for a wide class of Mendelian diseases.
Affiliation(s)
- Stefano Cabras
  - Department of Mathematics and Informatics, University of Cagliari, Cagliari, Italy.
13
Huang H, Chanda P, Alonso A, Bader JS, Arking DE. Gene-based tests of association. PLoS Genet 2011; 7:e1002177. [PMID: 21829371 PMCID: PMC3145613 DOI: 10.1371/journal.pgen.1002177] [Citation(s) in RCA: 80] [Impact Index Per Article: 6.2] [Received: 05/26/2010] [Accepted: 05/25/2011] [Indexed: 11/19/2022]
Abstract
Genome-wide association studies (GWAS) are now used routinely to identify SNPs associated with complex human phenotypes. In several cases, multiple variants within a gene contribute independently to disease risk. Here we introduce a novel Gene-Wide Significance (GWiS) test that uses greedy Bayesian model selection to identify the independent effects within a gene, which are combined to generate a stronger statistical signal. Permutation tests provide p-values that correct for the number of independent tests genome-wide and within each genetic locus. When applied to a dataset comprising 2.5 million SNPs in up to 8,000 individuals measured for various electrocardiography (ECG) parameters, this method identifies more validated associations than conventional GWAS approaches. The method also provides, for the first time, systematic assessments of the number of independent effects within a gene and the fraction of disease-associated genes housing multiple independent effects, observed at 35%-50% of loci in our study. This method can be generalized to other study designs, retains power for low-frequency alleles, and provides gene-based p-values that are directly compatible for pathway-based meta-analysis.
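The role of permutation in gene-based testing can be sketched as follows. This is not the GWiS statistic: as a stand-in, the gene-level statistic here is the largest squared correlation between any SNP in the gene and the trait, and the genotypes, phenotypes, and function names are hypothetical. Permuting the phenotype keeps the correlation structure among SNPs intact, so the resulting p-value accounts for testing several correlated SNPs within the gene.

```python
import random


def correlation(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def gene_statistic(genotypes, phenotype):
    """Largest squared correlation between any SNP in the gene and the trait."""
    return max(correlation(snp, phenotype) ** 2 for snp in genotypes)


def permutation_pvalue(genotypes, phenotype, n_perm=999, seed=0):
    """Gene-level p-value obtained by shuffling the phenotype across individuals."""
    rng = random.Random(seed)
    observed = gene_statistic(genotypes, phenotype)
    shuffled = list(phenotype)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if gene_statistic(genotypes, shuffled) >= observed:
            hits += 1
    # Adding 1 to numerator and denominator keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Toy gene with two SNPs (allele counts 0/1/2) in 12 individuals.
snp1 = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
snp2 = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
trait = [float(g) for g in snp1]  # trait driven entirely by snp1

p_gene = permutation_pvalue([snp1, snp2], trait)
print(p_gene)
```

The smallest attainable p-value is 1 / (n_perm + 1), so the number of permutations has to be chosen with the intended significance threshold in mind.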
Affiliation(s)
- Hailiang Huang
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland, United States of America
  - High Throughput Biology Center, Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
- Pritam Chanda
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland, United States of America
  - High Throughput Biology Center, Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
- Alvaro Alonso
  - Division of Epidemiology and Community Health, School of Public Health, University of Minnesota, Minneapolis, Minnesota, United States of America
- Joel S. Bader
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland, United States of America
  - High Throughput Biology Center, Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
- Dan E. Arking
  - McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America