1
Liu L, Ren D, Li K, Ji L, Feng M, Li Z, Meng L, He G, Shi Y. Unraveling schizophrenia's genetic complexity through advanced causal inference and chromatin 3D conformation. Schizophr Res 2024; 270:476-485. [PMID: 38996525 DOI: 10.1016/j.schres.2024.07.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2024] [Revised: 07/01/2024] [Accepted: 07/03/2024] [Indexed: 07/14/2024]
Abstract
Schizophrenia is a polygenic complex disease with a heritability as high as 80%, yet the mechanism of polygenic interaction in its pathogenesis remains unclear. Studying the interaction and regulation of schizophrenia susceptibility genes is crucial for unraveling the pathogenesis of schizophrenia and developing antipsychotic drugs. Therefore, we developed a bioinformatics method named GRACI (Gene Regulation Analysis based on Causal Inference), built on the principles of information theory, a causal inference model, and high-order chromatin 3D conformation. GRACI captures the interaction and regulatory relationships between schizophrenia susceptibility genes by analyzing genotyping data. Two datasets, comprising 1459 and 2065 samples respectively, were analyzed, and gene networks were constructed from both. GRACI showed superior accuracy compared with widely adopted methods for detecting gene-gene interactions and intergenic regulation. Its results were further substantiated by their correlation with high-order chromatin conformation patterns. Using GRACI, we identified three potential genes (KCNN3, KCNH1, and KCND3) that are directly associated with schizophrenia pathogenesis. Furthermore, the results of GRACI on the standalone dataset illustrated the method's applicability to other complex diseases. GRACI download: https://github.com/liuliangjie19/GRACI.
Affiliation(s)
- Liangjie Liu
  - Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Shanghai Jiao Tong University, 1954 Huashan Road, Shanghai 200030, China; Shanghai Key Laboratory of Psychotic Disorders, and Brain Science and Technology Research Center, Shanghai Jiao Tong University, 1954 Huashan Road, Shanghai 200030, China
- Decheng Ren
  - Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Shanghai Jiao Tong University, 1954 Huashan Road, Shanghai 200030, China; Shanghai Key Laboratory of Psychotic Disorders, and Brain Science and Technology Research Center, Shanghai Jiao Tong University, 1954 Huashan Road, Shanghai 200030, China
- Keyi Li
  - Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Shanghai Jiao Tong University, 1954 Huashan Road, Shanghai 200030, China; Shanghai Key Laboratory of Psychotic Disorders, and Brain Science and Technology Research Center, Shanghai Jiao Tong University, 1954 Huashan Road, Shanghai 200030, China
- Lei Ji
  - Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Shanghai Jiao Tong University, 1954 Huashan Road, Shanghai 200030, China; Shanghai Key Laboratory of Psychotic Disorders, and Brain Science and Technology Research Center, Shanghai Jiao Tong University, 1954 Huashan Road, Shanghai 200030, China
- Mofan Feng
  - Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Shanghai Jiao Tong University, 1954 Huashan Road, Shanghai 200030, China; Shanghai Key Laboratory of Psychotic Disorders, and Brain Science and Technology Research Center, Shanghai Jiao Tong University, 1954 Huashan Road, Shanghai 200030, China
- Zhuoheng Li
  - Department of Electrical Engineering and Computer Science, University of Michigan, 1301 Beal Avenue, Ann Arbor, MI 48109, USA
- Luming Meng
  - Key Laboratory for Biobased Materials and Energy of Ministry of Education, College of Materials and Energy, South China Agricultural University, Guangzhou 510630, China
- Guang He
  - Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Shanghai Jiao Tong University, 1954 Huashan Road, Shanghai 200030, China; Shanghai Key Laboratory of Psychotic Disorders, and Brain Science and Technology Research Center, Shanghai Jiao Tong University, 1954 Huashan Road, Shanghai 200030, China
- Yi Shi
  - Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Shanghai Jiao Tong University, 1954 Huashan Road, Shanghai 200030, China; Shanghai Key Laboratory of Psychotic Disorders, and Brain Science and Technology Research Center, Shanghai Jiao Tong University, 1954 Huashan Road, Shanghai 200030, China; Research Institute for Doping Control, Shanghai University of Sport, Shanghai 200438, China.
2
Chongtham J, Pandey N, Sharma LK, Mohan A, Srivastava T. SNP rs9387478 at ROS1-DCBLD1 Locus is Significantly Associated with Lung Cancer Risk and Poor Survival in Indian Population. Asian Pac J Cancer Prev 2022; 23:3553-3561. [PMID: 36308382 PMCID: PMC9924343 DOI: 10.31557/apjcp.2022.23.10.3553] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2022] [Indexed: 02/18/2023] Open
Abstract
OBJECTIVE Receptor tyrosine kinases (RTKs) are relevant therapeutic targets in the treatment of lung cancer. Germline susceptibility variants that influence these RTKs may provide new insights into their regulation. rs9387478 is located in the genomic interval between two RTK genes, ROS1 and DCBLD1; ROS1 alterations are implicated in lung carcinogenesis and treatment response, while DCBLD1 remains poorly understood. MATERIALS AND METHODS Venous blood was drawn from 100 control and 231 case subjects. Genotype was scored by restriction fragment length polymorphism (RFLP) analysis: PCR amplification followed by HindIII digestion. Logistic regression was applied to assess the association between variables. Survival curves were plotted to relate genotype to overall survival. In addition, eQTL and chromatin state changes were analyzed and correlated with patient survival using available datasets. RESULTS In our population, smoking correlated significantly with lung cancer [OR = 2.607], with the presence of the minor allele 'A' enhancing nicotine dependence [CA (OR = 3.23)]. Individuals homozygous for the risk allele 'A' had a higher chance of developing lung cancer [OR = 2.65] than individuals with CA/CC, implying a recessive model of association. Patients with the CC/CA genotype had better overall survival than patients with the AA genotype [161 days/142 days vs 54 days, p = 0.005]. The homozygous risk allele was significantly associated with increased DCBLD1 and ROS1 expression in lung cancer, with enriched active histone marks due to the polymorphism. Interestingly, increased DCBLD1 expression was associated with poor outcomes in lung cancer. CONCLUSION Overall, our study provides strong evidence that rs9387478 is significantly associated with both nicotine dependence and lung cancer in our North Indian cohort. The association of the SNP with the prognostic genes DCBLD1 and ROS1 makes rs9387478 a promising prognostic marker in the North Indian population. The results obtained are significant; however, the study needs to be repeated with a larger sample size.
Affiliation(s)
- Jonita Chongtham
  - Department of Genetics, University of Delhi South Campus, New Delhi, India.
- Namita Pandey
  - Department of Genetics, University of Delhi South Campus, New Delhi, India; current affiliation: Clinical Genomic Knowledgebase, PerianDx, Pune, Maharashtra, India.
- Anant Mohan
  - Department of Pulmonary, Critical Care and Sleep Medicine, All India Institute of Medical Sciences (AIIMS), New Delhi, India.
- Tapasya Srivastava
  - Department of Genetics, University of Delhi South Campus, New Delhi, India (corresponding author).
3
Fishman CE, Mohebnasab M, van Setten J, Zanoni F, Wang C, Deaglio S, Amoroso A, Callans L, van Gelder T, Lee S, Kiryluk K, Lanktree MB, Keating BJ. Genome-Wide Study Updates in the International Genetics and Translational Research in Transplantation Network (iGeneTRAiN). Front Genet 2019; 10:1084. [PMID: 31803228 PMCID: PMC6873800 DOI: 10.3389/fgene.2019.01084] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 03/26/2019] [Accepted: 10/09/2019] [Indexed: 12/14/2022] Open
Abstract
The prevalence of end-stage renal disease (ESRD) and the number of kidney transplants performed continue to rise every year, straining health systems and the procurement of deceased and living kidney allografts. Genome-wide genotyping and sequencing of diseased populations have uncovered genetic contributors in substantial proportions of ESRD patients. A number of these discoveries are beginning to be utilized in risk stratification and clinical management of patients. Specifically, genetics can provide insight into the primary cause of chronic kidney disease (CKD), the risk of progression to ESRD, and post-transplant outcomes, including various forms of allograft rejection. The International Genetics & Translational Research in Transplantation Network (iGeneTRAiN) is a multi-site consortium that encompasses >45 genetic studies with genome-wide genotyping from over 51,000 transplant samples, including genome-wide data from >30 kidney transplant cohorts (n = 28,015). iGeneTRAiN is statistically powered to capture both rare and common genetic contributions to ESRD and post-transplant outcomes. The primary cause of ESRD is often difficult to ascertain, especially where formal biopsy diagnosis is not performed, and is unavailable in ∼2% to >20% of kidney transplant recipients in iGeneTRAiN studies. We give an overview of our current copy number variant (CNV) screening approaches for genome-wide genotyping datasets in iGeneTRAiN, which aim to discover and validate genetic contributors to CKD and ESRD. Greater aggregation and analysis of well-phenotyped patients with genome-wide datasets will undoubtedly yield insights into the underlying pathophysiological mechanisms of CKD, leading the way to improved diagnostic precision in nephrology.
Affiliation(s)
- Claire E Fishman
  - Division of Transplantation, Department of Surgery, University of Pennsylvania, Philadelphia, PA, United States
- Maede Mohebnasab
  - Division of Transplantation, Department of Surgery, University of Pennsylvania, Philadelphia, PA, United States
- Jessica van Setten
  - Department of Cardiology, University Medical Center Utrecht, University of Utrecht, Utrecht, Netherlands
- Francesca Zanoni
  - Department of Medicine, Division of Nephrology, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY, United States
- Chen Wang
  - Department of Medicine, Division of Nephrology, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY, United States
- Silvia Deaglio
  - Immunogenetics and Biology of Transplantation, Città della Salute e della Scienza, University Hospital of Turin, Turin, Italy; Medical Genetics, Department of Medical Sciences, University of Turin, Turin, Italy
- Antonio Amoroso
  - Immunogenetics and Biology of Transplantation, Città della Salute e della Scienza, University Hospital of Turin, Turin, Italy; Medical Genetics, Department of Medical Sciences, University of Turin, Turin, Italy
- Lauren Callans
  - Division of Transplantation, Department of Surgery, University of Pennsylvania, Philadelphia, PA, United States
- Teun van Gelder
  - Department of Hospital Pharmacy, University Medical Center Rotterdam, Rotterdam, Netherlands
- Sangho Lee
  - Department of Nephrology, Kyung Hee University, Seoul, South Korea
- Krzysztof Kiryluk
  - Department of Medicine, Division of Nephrology, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY, United States
- Matthew B Lanktree
  - Division of Nephrology, St. Joseph's Healthcare Hamilton, McMaster University, Hamilton, ON, Canada
- Brendan J Keating
  - Division of Transplantation, Department of Surgery, University of Pennsylvania, Philadelphia, PA, United States
4
Burgess S, Zuber V, Valdes-Marquez E, Sun BB, Hopewell JC. Mendelian randomization with fine-mapped genetic data: Choosing from large numbers of correlated instrumental variables. Genet Epidemiol 2017; 41:714-725. [PMID: 28944551 PMCID: PMC5725678 DOI: 10.1002/gepi.22077] [Citation(s) in RCA: 122] [Impact Index Per Article: 15.3] [Received: 05/26/2017] [Revised: 08/11/2017] [Accepted: 08/16/2017] [Indexed: 11/08/2022]
Abstract
Mendelian randomization uses genetic variants to make causal inferences about the effect of a risk factor on an outcome. With fine-mapped genetic data, there may be hundreds of genetic variants in a single gene region, any of which could be used to assess this causal relationship. However, using too many genetic variants in the analysis can lead to spurious estimates and inflated Type 1 error rates. Conversely, if only a few genetic variants are used, then the majority of the data is ignored and estimates are highly sensitive to the particular choice of variants. We propose an approach based on summarized data only (genetic association and correlation estimates) that uses principal components analysis to form instruments. This approach has desirable theoretical properties: it takes the totality of data into account and does not suffer from numerical instabilities. It also has good properties in simulation studies: it is not particularly sensitive to varying the genetic variants included in the analysis or the genetic correlation matrix, and it does not have greatly inflated Type 1 error rates. Overall, the method gives estimates that are less precise than those from variable selection approaches (such as using a conditional analysis or pruning approach to select variants), but are more robust to seemingly arbitrary choices in the variable selection step. Methods are illustrated by an example using genetic associations with testosterone for 320 genetic variants to assess the effect of sex hormone-related pathways on coronary artery disease risk, in which variable selection approaches give inconsistent inferences.
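The idea of forming instruments from principal components of summarized data can be illustrated with a short Python sketch. It is not the authors' code: the abstract gives no formulas, so the weighting matrix `psi`, the 99% variance threshold, the function name `pca_ivw`, and the direct eigendecomposition of `psi` (in place of a specific PCA routine) are all assumptions made here for illustration.

```python
import numpy as np

def pca_ivw(beta_x, beta_y, se_y, rho, var_explained=0.99):
    """Hypothetical helper: inverse-variance weighted estimate on PCA-transformed data.

    beta_x, beta_y: per-variant associations with risk factor and outcome
    se_y:           standard errors of beta_y
    rho:            variant correlation (LD) matrix
    """
    beta_x, beta_y, se_y = (np.asarray(v, dtype=float) for v in (beta_x, beta_y, se_y))
    rho = np.asarray(rho, dtype=float)

    # Assumed weighting: correlations scaled by precision-weighted risk-factor associations.
    w = beta_x / se_y
    psi = np.outer(w, w) * rho

    # Eigendecomposition of the symmetric matrix; sort components by decreasing variance.
    eigval, eigvec = np.linalg.eigh(psi)
    order = np.argsort(eigval)[::-1]
    eigval, eigvec = eigval[order], eigvec[:, order]

    # Keep the smallest number of components reaching the variance threshold.
    frac = np.cumsum(eigval) / eigval.sum()
    k = min(int(np.searchsorted(frac, var_explained)) + 1, len(eigval))
    W = eigvec[:, :k]

    # Transform the summarized associations and their covariance.
    bx, by = W.T @ beta_x, W.T @ beta_y
    omega = W.T @ (np.outer(se_y, se_y) * rho) @ W
    omega_inv = np.linalg.inv(omega)

    # Generalized least squares (IVW) estimate and its standard error.
    denom = bx @ omega_inv @ bx
    estimate = (bx @ omega_inv @ by) / denom
    return float(estimate), float(np.sqrt(1.0 / denom)), k

if __name__ == "__main__":
    idx = np.arange(20)
    rho = 0.9 ** np.abs(idx[:, None] - idx[None, :])   # toy LD structure
    beta_x = np.linspace(0.05, 0.2, 20)
    print(pca_ivw(beta_x, 0.3 * beta_x, np.full(20, 0.05), rho))
```

Because the components are orthogonal combinations of all variants, no individual variant has to be dropped, which is the robustness to variant selection that the abstract emphasizes.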
Affiliation(s)
- Stephen Burgess
  - MRC Biostatistics Unit, Cambridge, United Kingdom; Department of Public Health and Primary Care, University of Cambridge, Cambridge, United Kingdom
- Verena Zuber
  - MRC Biostatistics Unit, Cambridge, United Kingdom; European Bioinformatics Institute, Hinxton, nr Duxford, United Kingdom
- Elsa Valdes-Marquez
  - Clinical Trial Service Unit and Epidemiological Studies Unit, Nuffield Department of Population Health, University of Oxford, Oxford, United Kingdom
- Benjamin B Sun
  - Department of Public Health and Primary Care, University of Cambridge, Cambridge, United Kingdom
- Jemma C Hopewell
  - Clinical Trial Service Unit and Epidemiological Studies Unit, Nuffield Department of Population Health, University of Oxford, Oxford, United Kingdom
5
Liu L, Zhang L, Li HM, Wang ZR, Xie XF, Mei JP, Jin JL, Shi J, Sun L, Li SC, Tan YL, Yang L, Wang J, Yang HM, Qian QJ, Wang YF. The SNP-set based association study identifies ITGA1 as a susceptibility gene of attention-deficit/hyperactivity disorder in Han Chinese. Transl Psychiatry 2017; 7:e1201. [PMID: 28809852 PMCID: PMC5611725 DOI: 10.1038/tp.2017.156] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 03/20/2017] [Revised: 05/20/2017] [Accepted: 06/07/2017] [Indexed: 01/02/2023] Open
Abstract
Genome-wide association studies, which detect the association between single-nucleotide polymorphisms (SNPs) and disease susceptibility, have been extensively applied to study attention-deficit/hyperactivity disorder (ADHD), but genome-wide significant associations have not been found yet. Genetic heterogeneity and insufficient genomic coverage may account for the missing heritability. We performed a two-stage association study for ADHD in the Han Chinese population. In the discovery stage, 1033 ADHD patients and 950 healthy controls were genotyped using both the Affymetrix Genome-Wide Human SNP Array 6.0 and the Illumina Infinium HumanExome BeadChip. The genotyped SNPs were combined to generate a powerful SNP set with better genomic coverage especially for the nonsynonymous variants. In addition to the association of single SNPs, we collected adjacent SNPs as SNP sets, which were determined by either genes or successive sliding windows, to evaluate their synergetic effect. The candidate susceptibility SNPs were further replicated in an independent cohort of 1441 ADHD patients and 1447 healthy controls. No genome-wide significant SNPs or gene-based SNP sets were found to be associated with ADHD. However, two continuous sliding windows located in ITGA1 (P-value=8.33E-7 and P-value=8.43E-7) were genome-wide significant. The quantitative trait analyses also demonstrated their association with ADHD core symptoms and executive functions. The association was further validated by follow-up replications for four selected SNPs: rs1979398 (P-value=2.64E-6), rs16880453 (P-value=3.58E-4), rs1531545 (P-value=7.62E-4) and rs4074793 (P-value=2.03E-4). Our results suggest that genetic variants in ITGA1 may be involved in the etiology of ADHD and the SNP-set based analysis is a promising strategy for the detection of underlying genetic risk factors.
Affiliation(s)
- L Liu
  - Department of Child Psychiatry, Peking University Sixth Hospital/Institute of Mental Health, Beijing, China; National Clinical Research Center for Mental Disorders & Key Laboratory of Mental Health, Ministry of Health (Peking University), Beijing, China
- L Zhang
  - BGI Genomics, BGI-Shenzhen, Shenzhen, China; Department of Computer Science, City University of Hong Kong, Hong Kong, China; Department of Computer Science, Stanford University, Stanford, CA, USA
- H M Li
  - Department of Child Psychiatry, Peking University Sixth Hospital/Institute of Mental Health, Beijing, China; National Clinical Research Center for Mental Disorders & Key Laboratory of Mental Health, Ministry of Health (Peking University), Beijing, China
- Z R Wang
  - Psychiatry Research Center, Beijing HuiLongGuan Hospital, Peking University, Beijing, China
- X F Xie
  - BGI Genomics, BGI-Shenzhen, Shenzhen, China
- J P Mei
  - BGI Genomics, BGI-Shenzhen, Shenzhen, China
- J L Jin
  - Department of Child Psychiatry, Peking University Sixth Hospital/Institute of Mental Health, Beijing, China; National Clinical Research Center for Mental Disorders & Key Laboratory of Mental Health, Ministry of Health (Peking University), Beijing, China
- J Shi
  - Psychiatry Research Center, Beijing HuiLongGuan Hospital, Peking University, Beijing, China
- L Sun
  - Department of Child Psychiatry, Peking University Sixth Hospital/Institute of Mental Health, Beijing, China; National Clinical Research Center for Mental Disorders & Key Laboratory of Mental Health, Ministry of Health (Peking University), Beijing, China
- S C Li
  - Department of Computer Science, City University of Hong Kong, Hong Kong, China
- Y L Tan
  - Psychiatry Research Center, Beijing HuiLongGuan Hospital, Peking University, Beijing, China
- L Yang
  - Department of Child Psychiatry, Peking University Sixth Hospital/Institute of Mental Health, Beijing, China; National Clinical Research Center for Mental Disorders & Key Laboratory of Mental Health, Ministry of Health (Peking University), Beijing, China
- J Wang
  - BGI Genomics, BGI-Shenzhen, Shenzhen, China; James D. Watson Institute of Genome Sciences, Hangzhou, China
- H M Yang
  - BGI Genomics, BGI-Shenzhen, Shenzhen, China; James D. Watson Institute of Genome Sciences, Hangzhou, China
- Q J Qian
  - Department of Child Psychiatry, Peking University Sixth Hospital/Institute of Mental Health, Beijing, China; National Clinical Research Center for Mental Disorders & Key Laboratory of Mental Health, Ministry of Health (Peking University), Beijing, China; Peking University Sixth Hospital/Institute of Mental Health, No. 51, Hua Yuan Bei Lu, Haidian District, Beijing 100191, China
- Y F Wang
  - Department of Child Psychiatry, Peking University Sixth Hospital/Institute of Mental Health, Beijing, China; National Clinical Research Center for Mental Disorders & Key Laboratory of Mental Health, Ministry of Health (Peking University), Beijing, China; Peking University Sixth Hospital/Institute of Mental Health, No. 51, Hua Yuan Bei Lu, Haidian District, Beijing 100191, China
6
Woo HJ, Yu C, Kumar K, Gold B, Reifman J. Genotype distribution-based inference of collective effects in genome-wide association studies: insights to age-related macular degeneration disease mechanism. BMC Genomics 2016; 17:695. [PMID: 27576376 PMCID: PMC5006276 DOI: 10.1186/s12864-016-2871-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 11/18/2015] [Accepted: 07/01/2016] [Indexed: 12/18/2022] Open
Abstract
BACKGROUND Genome-wide association studies provide important insights into the genetic component of disease risks. However, an existing challenge is how to incorporate collective effects of interactions beyond the level of independent single nucleotide polymorphism (SNP) tests. While methods considering each SNP pair separately have provided insights, a large portion of expected heritability may reside in higher-order interaction effects. RESULTS We describe an inference approach (discrete discriminant analysis; DDA) designed to probe collective interactions while treating both genotypes and phenotypes as random variables. The genotype distributions in case and control groups are modeled separately based on empirical allele frequency and covariance data, whose differences yield disease risk parameters. We compared pairwise tests and collective inference methods, the latter based both on DDA and logistic regression. Analyses using simulated data demonstrated that significantly higher sensitivity and specificity can be achieved with collective inference in comparison to pairwise tests, and with DDA in comparison to logistic regression. Using age-related macular degeneration (AMD) data, we demonstrated two possible applications of DDA. In the first application, a genome-wide SNP set is reduced to a small number (∼100) of variants via filtering, and SNP pairs with significant interactions are identified. We found that interactions between SNPs with the highest AMD association were epigenetically active in the liver, adipocytes, and mesenchymal stem cells. In the other application, multiple groups of SNPs were formed from the genome-wide data and their relative strengths of association were compared using cross-validation. This analysis allowed us to discover novel collections of loci for which interactions between SNPs play significant roles in their disease association. In particular, we considered pathway-based groups of SNPs containing up to ∼10,000 variants in each group. In addition to pathways related to complement activation, our collective inference pointed to pathway groups involved in phospholipid synthesis, oxidative stress, and apoptosis, consistent with the AMD pathogenesis mechanism in which the dysfunction of retinal pigment epithelium cells plays a central role. CONCLUSIONS The simultaneous inference of collective interaction effects within a set of SNPs has the potential to reveal novel aspects of disease association.
Affiliation(s)
- Hyung Jun Woo
  - Biotechnology High Performance Computing Software Applications Institute, Telemedicine and Advanced Technology Research Center, U.S. Army Medical Research and Materiel Command, Fort Detrick, Maryland, USA
- Chenggang Yu
  - Biotechnology High Performance Computing Software Applications Institute, Telemedicine and Advanced Technology Research Center, U.S. Army Medical Research and Materiel Command, Fort Detrick, Maryland, USA
- Kamal Kumar
  - Biotechnology High Performance Computing Software Applications Institute, Telemedicine and Advanced Technology Research Center, U.S. Army Medical Research and Materiel Command, Fort Detrick, Maryland, USA
- Bert Gold
  - Laboratory of Genomic Diversity, National Cancer Institute, Frederick, Maryland, USA
- Jaques Reifman
  - Biotechnology High Performance Computing Software Applications Institute, Telemedicine and Advanced Technology Research Center, U.S. Army Medical Research and Materiel Command, Fort Detrick, Maryland, USA.
7
Zhang Q, Zhao Y, Zhang R, Wei Y, Yi H, Shao F, Chen F. A Comparative Study of Five Association Tests Based on CpG Set for Epigenome-Wide Association Studies. PLoS One 2016; 11:e0156895. [PMID: 27258058 PMCID: PMC4892473 DOI: 10.1371/journal.pone.0156895] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 02/13/2016] [Accepted: 05/20/2016] [Indexed: 11/19/2022] Open
Abstract
An epigenome-wide association study (EWAS) is a large-scale study of human disease-associated epigenetic variation, specifically variation in DNA methylation. High-throughput technologies enable simultaneous epigenetic profiling of DNA methylation at hundreds of thousands of CpGs across the genome. The clustering of correlated DNA methylation at CpGs is reportedly similar to that of linkage-disequilibrium (LD) correlation in genetic single nucleotide polymorphism (SNP) variation. However, current analysis methods, such as the t-test and rank-sum test, may be underpowered to detect differentially methylated markers. We propose to test the association between the outcome (e.g., case or control) and a set of CpG sites jointly. Here, we compared the performance of five CpG set analysis approaches, namely principal component analysis (PCA), supervised principal component analysis (SPCA), kernel principal component analysis (KPCA), sequence kernel association test (SKAT), and sliced inverse regression (SIR), with Hotelling's T² test and the t-test using Bonferroni correction. The simulation results revealed that the first six methods can control the type I error at the significance level, while the t-test is conservative. SPCA and SKAT performed better than other approaches when the correlation among CpG sites was strong. For illustration, these methods were also applied to a real methylation dataset.
Affiliation(s)
- Qiuyi Zhang
  - Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, China, 211166
- Yang Zhao
  - Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, China, 211166
- Ruyang Zhang
  - Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, China, 211166
- Yongyue Wei
  - Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, China, 211166
- Honggang Yi
  - Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, China, 211166
- Fang Shao
  - Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, China, 211166
- Feng Chen
  - Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, China, 211166
8
Polymorphism of rs9387478 correlates with overall survival in female nonsmoking patients with lung cancer. Int J Biol Markers 2016; 31:e144-52. [PMID: 26689248 DOI: 10.5301/jbm.5000180] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Accepted: 10/08/2015] [Indexed: 12/25/2022]
Abstract
BACKGROUND Our previous study identified rs9387478 as a new susceptibility locus associated with lung cancer in never-smoking women in Asia; however, the clinical and prognostic significance of this finding is not known. METHODS We analyzed the relationship between the rs9387478 single nucleotide polymorphism and i) clinical parameters and ii) overall survival time in 505 female nonsmoking lung cancer patients, using the chi-square test and Kaplan-Meier analysis with the log-rank test, respectively. We further established the epidermal growth factor receptor (EGFR) mutation status and assessed its association with rs9387478 genotypes as well as the efficacy of EGFR tyrosine kinase inhibitors. RESULTS The frequency of the AA genotype was significantly higher in the EGFR-mutation-negative group than in the EGFR-mutation-positive group (32% vs. 16%, χ2 = 13.025, p = 0.011). Patients with the CC genotype had a better overall survival time than patients with the AA/AC genotype (median survival time: 54.2 vs. 32.9 months, χ2 = 4.593, p = 0.032). The distribution of rs9387478 genotypes differed according to the clinical disease stage. CONCLUSIONS This study indicates that the rs9387478 genotype was associated with overall survival in nonsmoking female patients with lung cancer, although it was not significant after adjusting for multiple testing. The identification of the location of the rs9387478 single nucleotide polymorphism in the genomic interval containing the DCBLD1 and ROS1 genes, together with the finding that the rs9387478 polymorphism correlates with EGFR mutation status, may have important implications for therapeutic approaches targeting EGFR or ROS1 in patients with lung cancer.
9
Su YC, Gauderman WJ, Berhane K, Lewinger JP. Adaptive Set-Based Methods for Association Testing. Genet Epidemiol 2015; 40:113-22. [PMID: 26707371 DOI: 10.1002/gepi.21950] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 07/15/2014] [Revised: 11/02/2015] [Accepted: 11/17/2015] [Indexed: 12/31/2022]
Abstract
With a typical sample size of a few thousand subjects, a single genome-wide association study (GWAS) using traditional one single nucleotide polymorphism (SNP)-at-a-time methods can only detect genetic variants conferring a sizable effect on disease risk. Set-based methods, which analyze sets of SNPs jointly, can detect variants with smaller effects acting within a gene, a pathway, or other biologically relevant sets. Although self-contained set-based methods (those that test sets of variants without regard to variants not in the set) are generally more powerful than competitive set-based approaches (those that rely on comparison of variants in the set of interest with variants not in the set), there is no consensus as to which self-contained methods are best. In particular, several self-contained set tests have been proposed to directly or indirectly "adapt" to the a priori unknown proportion and distribution of effects of the truly associated SNPs in the set, which is a major determinant of their power. A popular adaptive set-based test is the adaptive rank truncated product (ARTP), which seeks the set of SNPs that yields the best-combined evidence of association. We compared the standard ARTP, several ARTP variations we introduced, and other adaptive methods in a comprehensive simulation study to evaluate their performance. We used permutations to assess significance for all the methods and thus provide a level playing field for comparison. We found the standard ARTP test to have the highest power across our simulations followed closely by the global model of random effects (GMRE) and a least absolute shrinkage and selection operator (LASSO)-based test.
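The adaptive rank truncated product described in this abstract can be sketched in Python as follows. This is an illustrative single-layer permutation version, not the implementation evaluated in the paper: it assumes per-SNP p-values have already been computed for the observed data and for each phenotype permutation, and the function name `artp_pvalue` and the truncation points in the demo are hypothetical choices.

```python
import numpy as np

def artp_pvalue(p_obs, p_perm, trunc_points):
    """Hypothetical helper: set-level p-value from an adaptive rank truncated product.

    p_obs:        (L,) per-SNP p-values from the observed data
    p_perm:       (B, L) per-SNP p-values from B phenotype permutations
    trunc_points: candidate numbers K of smallest p-values to combine (each <= L)
    """
    p_obs = np.asarray(p_obs, dtype=float)
    p_perm = np.asarray(p_perm, dtype=float)
    all_p = np.vstack([p_obs, p_perm])               # row 0 holds the observed data

    # Truncated product on the log scale: sum of the K smallest log p-values.
    cum_log = np.cumsum(np.sort(np.log(all_p), axis=1), axis=1)
    stats = cum_log[:, np.asarray(trunc_points) - 1]

    # For each truncation point, the empirical p-value of every row
    # (smaller statistic means stronger evidence).
    n = stats.shape[0]
    level_p = np.empty_like(stats)
    for j in range(stats.shape[1]):
        col = stats[:, j]
        level_p[:, j] = np.searchsorted(np.sort(col), col, side="right") / n

    # Adapt over truncation points, then calibrate the minimum with the same permutations.
    min_p = level_p.min(axis=1)
    return float(np.mean(min_p <= min_p[0]))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    null_perm = rng.uniform(size=(999, 25))          # stand-in for permutation p-values
    observed = rng.uniform(size=25)
    observed[:3] = [1e-4, 5e-4, 2e-3]                # a few associated SNPs
    print(artp_pvalue(observed, null_perm, trunc_points=[1, 5, 10, 25]))
```

Reusing the same permutations both to rank each truncated product and to calibrate the minimum keeps the cost at a single layer of permutations, which matters when every method in a comparison is assessed by permutation.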
Affiliation(s)
- Yu-Chen Su, Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America
- William James Gauderman, Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America
- Kiros Berhane, Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America
- Juan Pablo Lewinger, Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America
|
10
|
Mehla K, Ramana J. DBDiaSNP: An Open-Source Knowledgebase of Genetic Polymorphisms and Resistance Genes Related to Diarrheal Pathogens. OMICS: A Journal of Integrative Biology 2015; 19:354-60. [PMID: 25978092 PMCID: PMC4486150 DOI: 10.1089/omi.2015.0030] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 12/17/2022]
Abstract
Diarrhea is a highly common infection among children, responsible for significant morbidity and mortality worldwide. After pneumonia, diarrhea remains the second leading cause of neonatal deaths. Numerous viral, bacterial, and parasitic enteric pathogens are associated with diarrhea. With increasing antibiotic resistance among enteric pathogens, there is an urgent need for global surveillance of the mutations and resistance genes primarily responsible for resistance to antibiotic treatment. Single nucleotide polymorphisms are important in this regard, as they have vast potential to be used as molecular diagnostics for gene-disease or pharmacogenomics association studies linking genotype to phenotype. DBDiaSNP is a comprehensive repository of mutations and resistance genes among various diarrheal pathogens and hosts, intended to advance breakthroughs with applications ranging from the development of sequence-based diagnostic tools to drug discovery. It contains information about 946 mutations and 326 resistance genes compiled from the literature and various web resources. As of March 2015, it houses various pathogen genes and the mutations responsible for antibiotic resistance. The pathogens include, for example, DEC (diarrheagenic E. coli), Salmonella spp., Campylobacter spp., Shigella spp., Clostridium difficile, Aeromonas spp., Helicobacter pylori, Entamoeba histolytica, Vibrio cholerae, and viruses. It also includes mutations from hosts (e.g., humans, pigs, others) that render them either susceptible or resistant to a certain type of diarrhea. DBDiaSNP is therefore intended as an integrated open access database for researchers and clinicians working on diarrheal diseases. Additionally, we note that DBDiaSNP is one of the first antibiotic resistance databases for diarrheal pathogens covering mutations and resistance genes that have clinical relevance from a broad range of pathogens and hosts.
For future translational research involving integrative biology and global health, the database offers considerable potential, particularly for developing countries and for worldwide monitoring and personalized, effective treatment of pathogens associated with diarrhea. The database is publicly accessible at http://www.juit.ac.in/attachments/dbdiasnp/.
Affiliation(s)
- Kusum Mehla, Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology, Solan, Himachal Pradesh, India
- Jayashree Ramana, Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology, Solan, Himachal Pradesh, India
|
11
|
A strategy to identify dominant point mutant modifiers of a quantitative trait. G3-GENES GENOMES GENETICS 2014; 4:1113-21. [PMID: 24747760 PMCID: PMC4065254 DOI: 10.1534/g3.114.010595] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 12/31/2022]
Abstract
A central goal in the analysis of complex traits is to identify genes that modify a phenotype. Modifiers of a cancer phenotype may act either intrinsically or extrinsically on the salient cell lineage. Germline point mutagenesis by ethylnitrosourea can provide alleles for a gene of interest that include loss-, gain-, or alteration-of-function. Unlike strain polymorphisms, point mutations with heterozygous quantitative phenotypes are detectable in both essential and nonessential genes and are unlinked from other variants that might confound their identification and analysis. This report analyzes strategies seeking quantitative mutational modifiers of ApcMin in the mouse. To identify a quantitative modifier of a phenotype of interest, a cluster of test progeny is needed. The cluster size can be increased as necessary for statistical significance if the founder is a male whose sperm is cryopreserved. A second critical element in this identification is a mapping panel free of polymorphic modifiers of the phenotype, to enable low-resolution mapping followed by targeted resequencing to identify the causative mutation. Here, we describe the development of a panel of six “isogenic mapping partner lines” for C57BL/6J, carrying single-nucleotide markers introduced by mutagenesis. One such derivative, B6.SNVg, shown to be phenotypically neutral in combination with ApcMin, is an appropriate mapping partner to locate induced mutant modifiers of the ApcMin phenotype. The evolved strategy can complement four current major initiatives in the genetic analysis of complex systems: the Genome-wide Association Study; the Collaborative Cross; the Knockout Mouse Project; and The Cancer Genome Atlas.
|