151. Selecting variants of unknown significance through network-based gene-association significantly improves risk prediction for disease-control cohorts. Sci Rep 2019; 9:3266. [PMID: 30824863 PMCID: PMC6397233 DOI: 10.1038/s41598-019-39796-w] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 04/17/2018] [Accepted: 01/31/2019] [Indexed: 12/12/2022] Open
Abstract
Variants of unknown/uncertain significance (VUS) pose a huge dilemma in current genetic variation screening methods and genetic counselling. Driven by methods of next generation sequencing (NGS) such as whole exome sequencing (WES), a plethora of VUS are being detected in research laboratories as well as in the health sector. Motivated by this overabundance of VUS, we propose a novel computational methodology, termed VariantClassifier (VarClass), which utilizes gene-association networks and polygenic risk prediction models to shed light on this grey area of genetic variation in association with disease. VarClass has been evaluated using numerous validation steps and proves to be very successful in assigning significance to VUS in association with specific diseases of interest. Notably, using VUS that are deemed significant by VarClass, we improved risk prediction accuracy in four large case-studies involving disease-control cohorts from GWAS as well as WES, when compared to traditional odds ratio analysis. Biological interpretation of selected high scoring VUS revealed interesting biological themes relevant to the diseases under investigation. VarClass is available as a standalone tool for large-scale data analyses, as well as a web-server with additional functionalities through a user-friendly graphical interface.
152. Palmer M, Venter SN, Coetzee MP, Steenkamp ET. Prokaryotic species are sui generis evolutionary units. Syst Appl Microbiol 2019; 42:145-158. [DOI: 10.1016/j.syapm.2018.10.002] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 07/16/2018] [Revised: 10/02/2018] [Accepted: 10/03/2018] [Indexed: 12/25/2022]
153. Yu G, Miller DJ, Wu CT, Hoffman EP, Liu C, Herrington DM, Wang Y. Asymmetric independence modeling identifies novel gene-environment interactions. Sci Rep 2019; 9:2455. [PMID: 30792419 PMCID: PMC6385186 DOI: 10.1038/s41598-019-38983-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2018] [Accepted: 01/04/2019] [Indexed: 11/09/2022] Open
Abstract
Most genetic or environmental factors work together in determining complex disease risk. Detecting gene-environment interactions may allow us to elucidate novel and targetable molecular mechanisms by which environmental exposures modify genetic effects. Unfortunately, standard logistic regression (LR) assumes a convenient mathematical structure for the null hypothesis that nevertheless results in both poor detection power and poor type 1 error control, and is also susceptible to the confounding effects of missing factors, imperfect surrogates, and disease heterogeneity. Here we describe a new baseline framework, the asymmetric independence model (AIM) in case-control studies, and provide mathematical proofs and simulation studies verifying its validity across a wide range of conditions. We show that AIM, unlike LR, mathematically preserves the asymmetric nature of maintaining health versus acquiring a disease, and thus is more powerful and robust in detecting synergistic interactions. We present examples from four clinically discrete domains where AIM identified interactions that were previously either inconsistent or recognized with less statistical certainty.
Affiliation(s)
- Guoqiang Yu
  - Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA, 22203, USA.
- David J Miller
  - Department of Electrical Engineering, The Pennsylvania State University, University Park, PA, 16802, USA
- Chiung-Ting Wu
  - Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA, 22203, USA
- Eric P Hoffman
  - School of Pharmacy and Pharmaceutical Sciences, State University of New York, Binghamton, NY, 13902, USA
- Chunyu Liu
  - Psychiatry and Behavioral Sciences, Upstate Medical University, Syracuse, NY, 13210, USA
- David M Herrington
  - Department of Medicine, Wake Forest University, Winston-Salem, NC, 27157, USA
- Yue Wang
  - Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA, 22203, USA
154.
Abstract
Identifying gene-gene and gene-environment interactions may help us to better describe the genetic architecture of complex traits. While advances have been made in identifying genetic variants associated with complex traits through denser panels of genetic variants and larger sample sizes, genome-wide interaction analyses are still limited in power to detect interactions with small effect sizes, rare frequencies, and higher order interactions. This chapter outlines methods for detecting both gene-gene and gene-environment interactions through explicit tests for interactions (i.e., ones in which the interaction is tested directly) and non-explicit tests (i.e., ones in which an interaction is allowed for in the test, but the test does not assess the interaction directly), as well as approaches for increasing power by reducing the search space. Issues relating to multiple test correction, replication, and the reporting of interaction results in publications are also discussed.
Affiliation(s)
- Andrew T DeWan
  - Department of Chronic Disease Epidemiology, Yale School of Public Health, New Haven, CT, USA.
155. Guan B, Zhao Y. Self-Adjusting Ant Colony Optimization Based on Information Entropy for Detecting Epistatic Interactions. Genes (Basel) 2019; 10:genes10020114. [PMID: 30717303 PMCID: PMC6409693 DOI: 10.3390/genes10020114] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 11/14/2018] [Revised: 01/21/2019] [Accepted: 01/28/2019] [Indexed: 12/15/2022] Open
Abstract
The epistatic interactions of single nucleotide polymorphisms (SNPs) are considered to be an important factor in determining the susceptibility of individuals to complex diseases. Although many methods have been proposed to detect such interactions, the development of detection algorithms is still ongoing due to the computational burden in large-scale association studies. In this paper, to deal with the intensive computing problem of detecting epistatic interactions in large-scale datasets, a self-adjusting ant colony optimization based on information entropy (IEACO) is proposed. The algorithm can automatically self-adjust the path selection strategy according to the real-time information entropy. The performance of IEACO is compared with that of ant colony optimization (ACO), AntEpiSeeker, AntMiner, and epiACO on a set of simulated datasets and a real genome-wide dataset. The results of extensive experiments show that the proposed method is superior to the other methods.
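As a rough illustration of the quantity the abstract refers to, the sketch below computes the Shannon entropy of a selection distribution. The abstract does not say which distribution IEACO measures or how the entropy feeds back into path selection, so the idea of normalising per-SNP pheromone weights, and the function names `shannon_entropy` and `normalized_entropy`, are assumptions made purely for illustration.

```python
import math


def shannon_entropy(weights):
    """Shannon entropy (in bits) of the distribution obtained by
    normalising a list of non-negative weights.

    The weights stand in for hypothetical per-SNP pheromone values;
    this is an assumption, not a detail taken from the abstract.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    # Zero-weight entries contribute nothing (p * log p -> 0 as p -> 0).
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log2(p) for p in probs)


def normalized_entropy(weights):
    """Entropy scaled to [0, 1] by the maximum log2(n) for n choices."""
    n = len(weights)
    if n < 2:
        return 0.0
    return shannon_entropy(weights) / math.log2(n)
```

A value near 1 means the weights are spread almost uniformly (the search is still exploring), while a value near 0 means they are concentrated on a few choices. An entropy-driven strategy could switch behaviour on such a signal, though the specific rule used by IEACO is not given here.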
Affiliation(s)
- Boxin Guan
  - Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, and School of Computer Science and Engineering, Northeastern University, Shenyang 110819, China.
- Yuhai Zhao
  - Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, and School of Computer Science and Engineering, Northeastern University, Shenyang 110819, China.
156. Sun L, Liu G, Su L, Wang R. SEE: a novel multi-objective evolutionary algorithm for identifying SNP epistasis in genome-wide association studies. BIOTECHNOL BIOTEC EQ 2019. [DOI: 10.1080/13102818.2019.1593052] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 10/27/2022] Open
Affiliation(s)
- Liyan Sun
  - Department of Computational Intelligence, College of Computer Science and Technology, Jilin University, Changchun, P.R. China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, P.R. China
- Guixia Liu
  - Department of Computational Intelligence, College of Computer Science and Technology, Jilin University, Changchun, P.R. China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, P.R. China
- Lingtao Su
  - Department of Computational Intelligence, College of Computer Science and Technology, Jilin University, Changchun, P.R. China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, P.R. China
- Rongquan Wang
  - Department of Computational Intelligence, College of Computer Science and Technology, Jilin University, Changchun, P.R. China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, P.R. China
157. Bhattacharya D, Bhattacharya S. A Bayesian semiparametric approach to learning about gene–gene interactions in case-control studies. J Appl Stat 2018. [DOI: 10.1080/02664763.2018.1444741] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
Affiliation(s)
- Durba Bhattacharya
  - St. Xavier's College, Kolkata, India
  - Interdisciplinary Statistical Research Unit, Indian Statistical Institute, Kolkata, India
- Sourabh Bhattacharya
  - Interdisciplinary Statistical Research Unit, Indian Statistical Institute, Kolkata, India
158. Guan B, Zhao Y, Sun W. Ant colony optimization with an automatic adjustment mechanism for detecting epistatic interactions. Comput Biol Chem 2018; 77:354-362. [DOI: 10.1016/j.compbiolchem.2018.11.001] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 04/26/2018] [Revised: 10/01/2018] [Accepted: 11/05/2018] [Indexed: 12/13/2022]
159. Dasmeh P, Serohijos AWR. Estimating the contribution of folding stability to nonspecific epistasis in protein evolution. Proteins 2018; 86:1242-1250. [DOI: 10.1002/prot.25588] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 03/28/2018] [Revised: 06/28/2018] [Accepted: 07/18/2018] [Indexed: 12/28/2022]
Affiliation(s)
- Pouria Dasmeh
  - Department of Biochemistry, University of Montreal, Montreal, Quebec, Canada
  - Cedergren Center for Bioinformatics and Genomics, University of Montreal, Montreal, Quebec, Canada
  - Department of Biochemistry and Institute for Data Valorization (IVADO), University of Montreal, Montreal, Quebec, Canada
- Adrian W. R. Serohijos
  - Department of Biochemistry, University of Montreal, Montreal, Quebec, Canada
  - Cedergren Center for Bioinformatics and Genomics, University of Montreal, Montreal, Quebec, Canada
160. Campbell RF, McGrath PT, Paaby AB. Analysis of Epistasis in Natural Traits Using Model Organisms. Trends Genet 2018; 34:883-898. [PMID: 30166071 PMCID: PMC6541385 DOI: 10.1016/j.tig.2018.08.002] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 01/17/2018] [Revised: 06/06/2018] [Accepted: 08/03/2018] [Indexed: 12/16/2022]
Abstract
The ability to detect and understand epistasis in natural populations is important for understanding how biological traits are influenced by genetic variation. However, identification and characterization of epistasis in natural populations remains difficult due to statistical issues that arise as a result of multiple comparisons, and the fact that most genetic variants segregate at low allele frequencies. In this review, we discuss how model organisms may be used to manipulate genotypic combinations to power the detection of epistasis as well as test interactions between specific genes. Findings from a number of species indicate that statistical epistasis is pervasive between natural genetic variants. However, the properties of experimental systems that enable analysis of epistasis also constrain extrapolation of these results back into natural populations.
Affiliation(s)
- Richard F Campbell
  - Department of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, 30332 USA
- Patrick T McGrath
  - Department of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, 30332 USA
  - Department of Physics, Georgia Institute of Technology, Atlanta, GA, 30332 USA.
- Annalise B Paaby
  - Department of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, 30332 USA
161. Zhou R, Wang M, Li W, Wang S, Zhou Z, Li J, Wu T, Zhu H, Beaty TH. Gene-Gene Interactions among SPRYs for Nonsyndromic Cleft Lip/Palate. J Dent Res 2018; 98:180-185. [PMID: 30273098 DOI: 10.1177/0022034518801537] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 11/17/2022] Open
Abstract
Nonsyndromic cleft lip with or without cleft palate (NSCL/P) is a common birth defect with a complex genetic architecture. Gene-gene interactions have been increasingly regarded as contributing to the etiology of NSCL/P. A recent genome-wide association study revealed that a novel single-nucleotide polymorphism at SPRY1 in 4q28.1 showed a significant association with NSCL/P. In the current study, we explored the role of 3 SPRY genes in the etiology of NSCL/P by detecting gene-gene interactions: SPRY1, SPRY2, and SPRY4, with SPRY3 excluded due to its special location on the X chromosome. We selected markers in the 3 SPRY genes to test for gene-gene interactions using 1,908 case-parent trios recruited from an international consortium established for a genome-wide association study of nonsyndromic oral clefts. As the trios came from populations with different ancestries, subgroup analyses were conducted among Europeans and Asians. Cordell's method based on conditional logistic regression models was applied to test for potential gene-gene interactions via the statistical package TRIO in R software. Gene-gene interaction analyses yielded 10 pairs of SNPs in Europeans and 6 pairs in Asians that achieved significance after Bonferroni correction. The significant interactions were confirmed in the 10,000-permutation tests (empirical P = 0.003 for the most significant interaction). The study identified gene-gene interactions among SPRY genes in 1,908 NSCL/P trios, which revealed the importance of potential gene-gene interactions for understanding the genetic architecture of NSCL/P. The evidence of gene-gene interactions in this study also provided clues for future biological studies to further investigate the mechanism of how SPRY genes participate in the development of NSCL/P.
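The two multiple-testing devices named in the abstract, Bonferroni correction and permutation-based empirical P values, can be sketched generically as follows. This is an illustration in Python, not the TRIO package's R interface. The function names are hypothetical, and the "+1" in the empirical estimator is a common convention that the abstract does not confirm was used.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise
    error rate at or below alpha across n_tests comparisons."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def empirical_p_value(observed, permuted_stats):
    """Empirical P value of an observed test statistic against
    statistics recomputed on permuted data.

    Counts permutations at least as extreme as the observation and
    adds one to numerator and denominator so the estimate is never
    exactly zero (an assumed convention, see the lead-in).
    """
    exceed = sum(1 for s in permuted_stats if s >= observed)
    return (exceed + 1) / (len(permuted_stats) + 1)
```

With 10,000 permutations, the smallest P value this estimator can report is 1/10,001, which is why the number of permutations bounds the resolution of an empirical P value.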
Affiliation(s)
- R Zhou
  - School of Public Health, Peking University, Beijing, China
- M Wang
  - School of Public Health, Peking University, Beijing, China
- W Li
  - School of Public Health, Peking University, Beijing, China
- S Wang
  - School of Public Health, Peking University, Beijing, China
- Z Zhou
  - School of Stomatology, Peking University, Beijing, China
- J Li
  - School of Stomatology, Peking University, Beijing, China
- T Wu
  - School of Public Health, Peking University, Beijing, China
  - Key Laboratory of Reproductive Health, Ministry of Health, Beijing, China
- H Zhu
  - School of Stomatology, Peking University, Beijing, China
- T H Beaty
  - School of Public Health, Johns Hopkins University, Baltimore, MD, USA
162. Zhou X, Chan KCC. Detecting gene-gene interactions for complex quantitative traits using generalized fuzzy classification. BMC Bioinformatics 2018; 19:329. [PMID: 30227829 PMCID: PMC6145205 DOI: 10.1186/s12859-018-2361-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 09/27/2017] [Accepted: 09/09/2018] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Quantitative traits or continuous outcomes related to complex diseases can provide more information, and therefore more accurate analysis, for identifying gene-gene and gene-environment interactions associated with complex diseases. Multifactor Dimensionality Reduction (MDR) was originally proposed to identify gene-gene and gene-environment interactions associated with the binary status of complex diseases. Some efforts have been made to extend it to quantitative traits (QTs) and ordinal traits. However, these and other methods are still not computationally efficient or effective. RESULTS: Generalized Fuzzy Quantitative trait MDR (GFQMDR) is proposed in this paper to strengthen identification of gene-gene interactions associated with a quantitative trait by first transforming it to an ordinal trait and then selecting the best sets of genetic markers, mainly single nucleotide polymorphisms (SNPs) or simple sequence length polymorphic markers (SSLPs), as having strong association with the trait through generalized fuzzy classification using extended member functions. Experimental results on simulated datasets and real datasets show that our algorithm has a better success rate, classification accuracy and consistency in identifying gene-gene interactions associated with QTs. CONCLUSION: The proposed algorithm provides a more effective way to identify gene-gene interactions associated with quantitative traits.
Affiliation(s)
- Xiangdong Zhou
  - College of Mathematics and Computer Science, Fuzhou University, Fuzhou, Fujian China
- Keith C. C. Chan
  - Department of Computing, the Hong Kong Polytechnic University, Kowloon, Hong Kong China
163. Pedruzzi G, Barlukova A, Rouzine IM. Evolutionary footprint of epistasis. PLoS Comput Biol 2018; 14:e1006426. [PMID: 30222748 PMCID: PMC6177197 DOI: 10.1371/journal.pcbi.1006426] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 04/17/2018] [Revised: 10/09/2018] [Accepted: 08/09/2018] [Indexed: 11/18/2022] Open
Abstract
Variation of an inherited trait across a population cannot be explained by additive contributions of relevant genes, due to epigenetic effects and biochemical interactions (epistasis). Detecting epistasis in genomic data still represents a significant challenge that requires a better understanding of epistasis from the mechanistic point of view. Using a standard Wright-Fisher model of a bi-allelic asexual population, we study how compensatory epistasis affects the process of adaptation. The main result is a universal relationship between the four haplotype frequencies of a single site pair in a genome, which depends only on the epistasis strength of the pair defined in terms of Darwinian fitness. We demonstrate the existence, at any time point, of a quasi-equilibrium between epistasis and disorder (entropy) caused by random genetic drift and mutation. We verify the accuracy of these analytic results by Monte-Carlo simulation over a broad range of parameters, including the topology of the interacting network. Thus, epistasis assists the evolutionary transit through evolutionary hurdles, leaving marks at the level of haplotype disequilibrium. The method allows determining the selection coefficient for each site and the epistasis strength of each pair from a sequence set. The resulting ability to detect clusters of deleterious mutations close to full compensation is essential for biomedical applications. These findings help to understand the role of epistasis in multiple compensatory mutations in viral resistance to antivirals and immune response.
Affiliation(s)
- Gabriele Pedruzzi
  - Sorbonne Université, Institute de Biologie Paris-Seine, Laboratoire de Biologie Computationelle et Quantitative, Paris, France
- Ayuna Barlukova
  - Sorbonne Université, Institute de Biologie Paris-Seine, Laboratoire de Biologie Computationelle et Quantitative, Paris, France
- Igor M. Rouzine
  - Sorbonne Université, Institute de Biologie Paris-Seine, Laboratoire de Biologie Computationelle et Quantitative, Paris, France
164. The classification, genetic diagnosis and modelling of monogenic autoinflammatory disorders. Clin Sci (Lond) 2018; 132:1901-1924. [PMID: 30185613 PMCID: PMC6123071 DOI: 10.1042/cs20171498] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 06/19/2018] [Revised: 07/30/2018] [Accepted: 08/07/2018] [Indexed: 12/13/2022]
Abstract
Monogenic autoinflammatory disorders are an increasingly heterogeneous group of conditions characterised by innate immune dysregulation. Improved genetic sequencing in recent years has led not only to the discovery of a plethora of conditions considered to be 'autoinflammatory', but also the broadening of the clinical and immunological phenotypic spectra seen in these disorders. This review outlines the classification strategies that have been employed for monogenic autoinflammatory disorders to date, including the primary innate immune pathway or the dominant cytokine implicated in disease pathogenesis, and highlights some of the advantages of these models. Furthermore, the use of the term 'autoinflammatory' is discussed in relation to disorders that cross the innate and adaptive immune divide. The utilisation of next-generation sequencing (NGS) in this population is examined, as are potential in vivo and in vitro methods of modelling to determine pathogenicity of novel genetic findings. Finally, areas where our understanding can be improved are highlighted, such as phenotypic variability and genotype-phenotype correlations, with the aim of identifying areas of future research.
165. Urbanowicz RJ, Olson RS, Schmitt P, Meeker M, Moore JH. Benchmarking relief-based feature selection methods for bioinformatics data mining. J Biomed Inform 2018; 85:168-188. [PMID: 30030120 PMCID: PMC6299838 DOI: 10.1016/j.jbi.2018.07.015] [Citation(s) in RCA: 80] [Impact Index Per Article: 11.4] [Received: 01/15/2018] [Revised: 06/30/2018] [Accepted: 07/14/2018] [Indexed: 11/23/2022]
Abstract
Modern biomedical data mining requires feature selection methods that can (1) be applied to large scale feature spaces (e.g. 'omics' data), (2) function in noisy problems, (3) detect complex patterns of association (e.g. gene-gene interactions), (4) be flexibly adapted to various problem domains and data types (e.g. genetic variants, gene expression, and clinical data) and (5) be computationally tractable. To that end, this work examines a set of filter-style feature selection algorithms inspired by the 'Relief' algorithm, i.e. Relief-Based algorithms (RBAs). We implement and expand these RBAs in an open source framework called ReBATE (Relief-Based Algorithm Training Environment). We apply a comprehensive genetic simulation study comparing existing RBAs, a proposed RBA called MultiSURF, and other established feature selection methods, over a variety of problems. The results of this study (1) support the assertion that RBAs are particularly flexible, efficient, and powerful feature selection methods that differentiate relevant features having univariate, multivariate, epistatic, or heterogeneous associations, (2) confirm the efficacy of expansions for classification vs. regression, discrete vs. continuous features, missing data, multiple classes, or class imbalance, (3) identify previously unknown limitations of specific RBAs, and (4) suggest that while MultiSURF* performs best for explicitly identifying pure 2-way interactions, MultiSURF yields the most reliable feature selection performance across a wide range of problem types.
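For readers unfamiliar with the family of methods being benchmarked, the core Relief idea can be sketched in a few lines: a feature gains weight when it differs between an instance and its nearest neighbour of another class (nearest miss) and loses weight when it differs from its nearest neighbour of the same class (nearest hit). The sketch below is a deliberately simplified, assumed variant. It visits every instance rather than a random sample, uses a single nearest hit and miss, and assumes discrete features with a Hamming distance. It is not the ReBATE API, and `relief_weights` is a hypothetical name.

```python
def hamming(a, b):
    """Number of positions at which two feature vectors differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def relief_weights(X, y):
    """Simplified Relief feature weights for discrete features.

    X is a list of equal-length feature lists and y the class labels.
    Returns one weight per feature; larger means more relevant.
    """
    n, d = len(X), len(X[0])
    weights = [0.0] * d
    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        if not hits or not misses:
            continue  # no usable neighbours for this instance
        hit = min(hits, key=lambda j: hamming(X[i], X[j]))
        miss = min(misses, key=lambda j: hamming(X[i], X[j]))
        for f in range(d):
            diff_hit = 1.0 if X[i][f] != X[hit][f] else 0.0
            diff_miss = 1.0 if X[i][f] != X[miss][f] else 0.0
            # Reward separation from the miss, penalise it from the hit.
            weights[f] += (diff_miss - diff_hit) / n
    return weights
```

On a toy dataset where the first feature alone determines the class, for example `relief_weights([[0, 0], [0, 1], [1, 0], [1, 1]], [0, 0, 1, 1])`, the first feature receives the higher weight. The variants compared in the paper differ mainly in how neighbours are chosen and weighted, details this sketch does not attempt to reproduce.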
Affiliation(s)
- Ryan J Urbanowicz
  - Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA 19104, USA.
- Randal S Olson
  - Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA 19104, USA.
- Peter Schmitt
  - Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA 19104, USA.
- Jason H Moore
  - Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA 19104, USA.
166. Vélez JI, Lopera F, Creagh PK, Piñeros LB, Das D, Cervantes-Henríquez ML, Acosta-López JE, Isaza-Ruget MA, Espinosa LG, Easteal S, Quintero GA, Silva CT, Mastronardi CA, Arcos-Burgos M. Targeting Neuroplasticity, Cardiovascular, and Cognitive-Associated Genomic Variants in Familial Alzheimer's Disease. Mol Neurobiol 2018; 56:3235-3243. [PMID: 30112632 PMCID: PMC6476862 DOI: 10.1007/s12035-018-1298-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 05/28/2018] [Accepted: 08/02/2018] [Indexed: 11/24/2022]
Abstract
The identification of novel genetic variants contributing to the wide spread in the age of onset (AOO) of Alzheimer’s disease (AD) could aid in the prognosis and/or development of new therapeutic strategies focused on early interventions. We recruited 78 individuals with AD from the Paisa genetic isolate in Antioquia, Colombia. These individuals belong to the world's largest multigenerational and extended pedigree segregating AD as a consequence of a dominant fully penetrant mutation in the PSEN1 gene and exhibit an AOO ranging from the early 30s to the late 70s. To shed light on the genetic underpinning that could explain the large spread of the AOO of AD, 64 single nucleotide polymorphisms (SNPs) associated with neuroanatomical, cardiovascular, and cognitive measures in AD were genotyped. Standard quality control and filtering procedures were applied, and single- and multi-locus linear mixed-effects models were used to identify AOO-associated SNPs. A full two-locus interaction model was fitted to define how identified SNPs interact to modulate AOO. We identified two key epistatic interactions between the APOE*E2 allele and SNPs ASTN2-rs7852878 and SNTG1-rs16914781 that delay AOO by up to ~8 years (95% CI 3.2–12.7, P = 1.83 × 10^−3) and ~7.6 years (95% CI 3.3–11.8, P = 8.69 × 10^−4), respectively, and validated our previous finding indicating that APOE*E2 delays AOO of AD in PSEN1 E280 mutation carriers. This new evidence involving APOE*E2 as an AOO delayer could be used for developing precision medicine approaches and predictive genomics models to potentially determine AOO in individuals genetically predisposed to AD.
Affiliation(s)
- Jorge I. Vélez
  - Genomics and Predictive Medicine Group, Department of Genome Sciences, John Curtin School of Medical Research, The Australian National University, Canberra, ACT 2600 Australia
  - Universidad del Norte, Barranquilla, Colombia
- Francisco Lopera
  - Neuroscience Research Group, University of Antioquia, Medellín, Colombia
- Penelope K. Creagh
  - Genomics and Predictive Medicine Group, Department of Genome Sciences, John Curtin School of Medical Research, The Australian National University, Canberra, ACT 2600 Australia
- Laura B. Piñeros
  - GENIUROS, Center for Research in Genetics and Genomics, Institute of Translational Medicine, School of Medicine and Health Sciences, Universidad del Rosario, Bogotá, Colombia
- Debjani Das
  - Genome Diversity and Health Group, Department of Genome Sciences, John Curtin School of Medical Research, The Australian National University, ACT, Canberra, 2600 Australia
- Martha L. Cervantes-Henríquez
  - Universidad del Norte, Barranquilla, Colombia
  - Grupo de Neurociencias del Caribe, Universidad Simón Bolívar, Barranquilla, Colombia
- Johan E. Acosta-López
  - Grupo de Neurociencias del Caribe, Universidad Simón Bolívar, Barranquilla, Colombia
- Lady G. Espinosa
  - INPAC Research Group, Fundación Universitaria Sanitas, Bogotá, Colombia
- Simon Easteal
  - Genome Diversity and Health Group, Department of Genome Sciences, John Curtin School of Medical Research, The Australian National University, ACT, Canberra, 2600 Australia
- Gustavo A. Quintero
  - Studies in Translational Microbiology and Emerging Diseases (MICROS) Research Group, School of Medicine and Health Sciences, Universidad del Rosario, Bogotá, Colombia
- Claudia Tamar Silva
  - GENIUROS, Center for Research in Genetics and Genomics, Institute of Translational Medicine, School of Medicine and Health Sciences, Universidad del Rosario, Bogotá, Colombia
- Claudio A. Mastronardi
  - Genomics and Predictive Medicine Group, Department of Genome Sciences, John Curtin School of Medical Research, The Australian National University, Canberra, ACT 2600 Australia
  - Neuroscience Group (NeUROS), Institute of Translational Medicine, School of Medicine and Health Sciences, Universidad del Rosario, Bogotá, Colombia
- Mauricio Arcos-Burgos
  - Genomics and Predictive Medicine Group, Department of Genome Sciences, John Curtin School of Medical Research, The Australian National University, Canberra, ACT 2600 Australia
  - GENIUROS, Center for Research in Genetics and Genomics, Institute of Translational Medicine, School of Medicine and Health Sciences, Universidad del Rosario, Bogotá, Colombia
167. Schrauwen I, Chakchouk I, Acharya A, Liaqat K, Irfanullah, Nickerson DA, Bamshad MJ, Shah K, Ahmad W, Leal SM. Novel digenic inheritance of PCDH15 and USH1G underlies profound non-syndromic hearing impairment. BMC MEDICAL GENETICS 2018; 19:122. [PMID: 30029624 PMCID: PMC6053831 DOI: 10.1186/s12881-018-0618-5] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 03/01/2018] [Accepted: 05/24/2018] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Digenic inheritance is the simplest model of oligogenic disease. It can be observed when there is a strong epistatic interaction between two loci. For both syndromic and non-syndromic hearing impairment, several forms of digenic inheritance have been reported. METHODS: We performed exome sequencing in a Pakistani family with profound non-syndromic hereditary hearing impairment to identify the genetic cause of disease. RESULTS: We found that this family displays digenic inheritance for two trans heterozygous missense mutations, one in PCDH15 [p.(Arg1034His)] and another in USH1G [p.(Asp365Asn)]. Both of these genes are known to cause autosomal recessive non-syndromic hearing impairment and Usher syndrome. The protein products of PCDH15 and USH1G function together at the stereocilia tips in the hair cells and are necessary for proper mechanotransduction. Epistasis between Pcdh15 and Ush1G has been previously reported in digenic heterozygous mice. The digenic mice displayed a significant decrease in hearing compared to age-matched heterozygous animals. Until now no human examples have been reported. CONCLUSIONS: The discovery of novel digenic inheritance mechanisms in hereditary hearing impairment will aid in understanding the interaction between defective proteins and further define inner ear function and its interactome.
Affiliation(s)
- Isabelle Schrauwen
- Center for Statistical Genetics, Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza 700D, Houston, TX, 77030, USA
| | - Imen Chakchouk
- Center for Statistical Genetics, Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza 700D, Houston, TX, 77030, USA
| | - Anushree Acharya
- Center for Statistical Genetics, Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza 700D, Houston, TX, 77030, USA
| | - Khurram Liaqat
- Department of Biotechnology, Faculty of Biological Sciences, Quaid-i-Azam University, Islamabad, Pakistan
| | - Irfanullah
- Department of Biochemistry, Faculty of Biological Sciences, Quaid-i-Azam University, Islamabad, Pakistan
| | - Deborah A Nickerson
- Department of Genome Sciences, University of Washington, Seattle, Washington, USA
| | - Michael J Bamshad
- Department of Genome Sciences, University of Washington, Seattle, Washington, USA
- Department of Pediatrics, University of Washington, Seattle, Washington, USA
| | - Khadim Shah
- Department of Biochemistry, Faculty of Biological Sciences, Quaid-i-Azam University, Islamabad, Pakistan
| | - Wasim Ahmad
- Department of Biochemistry, Faculty of Biological Sciences, Quaid-i-Azam University, Islamabad, Pakistan
| | - Suzanne M Leal
- Center for Statistical Genetics, Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza 700D, Houston, TX, 77030, USA
| |
|
168
|
The Contributions of ‘Diet’, ‘Genes’, and Physical Activity to the Etiology of Obesity: Contrary Evidence and Consilience. Prog Cardiovasc Dis 2018; 61:89-102. [DOI: 10.1016/j.pcad.2018.06.002] [Citation(s) in RCA: 39] [Impact Index Per Article: 5.6] [Received: 06/10/2018] [Accepted: 06/10/2018] [Indexed: 12/12/2022]
|
169
|
Chatelain C, Durand G, Thuillier V, Augé F. Performance of epistasis detection methods in semi-simulated GWAS. BMC Bioinformatics 2018; 19:231. [PMID: 29914375 PMCID: PMC6006572 DOI: 10.1186/s12859-018-2229-8] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 08/26/2017] [Accepted: 06/04/2018] [Indexed: 01/23/2023] Open
Abstract
BACKGROUND Part of the missing heritability in Genome Wide Association Studies (GWAS) is expected to be explained by interactions between genetic variants, also called epistasis. Various statistical methods have been developed to detect epistasis in case-control GWAS. These methods face major statistical challenges due to the number of tests required, the complexity of the Linkage Disequilibrium (LD) structure, and the lack of consensus regarding the definition of epistasis. Their limited impact in terms of uncovering new biological knowledge might be explained in part by the limited amount of experimental data available to validate their statistical performances in a realistic GWAS context. In this paper, we introduce a simulation pipeline for generating real-scale GWAS data, including epistasis and realistic LD structure. We evaluate five exhaustive bivariate interaction methods: fastepi, GBOOST, SHEsisEpi, DSS, and IndOR. Two hundred thirty-four different disease scenarios are considered in extensive simulations. We report the performance of each method in terms of false positive rate control, power, area under the ROC curve (AUC), and computation time using a GPU. Finally, we compare the results of each method on a real GWAS of type 2 diabetes from the Wellcome Trust Case Control Consortium. RESULTS GBOOST, SHEsisEpi and DSS allow a satisfactory control of the false positive rate. fastepi and IndOR present an increase in false positive rate in the presence of LD between causal SNPs, with our definition of epistasis. DSS performs best in terms of power and AUC in most scenarios with no or weak LD between causal SNPs. All methods can exhaustively analyze a GWAS with 6 × 10⁵ SNPs and 15,000 samples in a couple of hours using a GPU. CONCLUSION This study confirms that computation time is no longer a limiting factor for performing an exhaustive search of epistasis in large GWAS. For this task, using DSS on SNP pairs with limited LD seems to be a good strategy to achieve the best statistical performance. A combination approach using both DSS and GBOOST is supported by the simulation results, and the analysis of the WTCCC dataset demonstrated that this approach can detect distinct genes in epistasis. Finally, weak epistasis between common variants will be detectable with existing methods when GWAS of a few tens of thousands of cases and controls are available.
Affiliation(s)
| | - Guillermo Durand
- Laboratoire de Probabilités et Modèles Aléatoires, Université Pierre et Marie Curie, 4, place Jussieu, Paris Cedex 05, 75252 France
| | - Vincent Thuillier
- SANOFI R&D, Biostatistics & Programming, Chilly Mazarin, 91385 France
| | - Franck Augé
- SANOFI R&D, Translational Sciences, Chilly Mazarin, 91385 France
| |
|
170
|
Dutta S, Eckmann JP, Libchaber A, Tlusty T. Green function of correlated genes in a minimal mechanical model of protein evolution. Proc Natl Acad Sci U S A 2018; 115:E4559-E4568. [PMID: 29712824 PMCID: PMC5960285 DOI: 10.1073/pnas.1716215115] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Indexed: 11/18/2022] Open
Abstract
The function of proteins arises from cooperative interactions and rearrangements of their amino acids, which exhibit large-scale dynamical modes. Long-range correlations have also been revealed in protein sequences, and this has motivated the search for physical links between the observed genetic and dynamic cooperativity. We outline here a simplified theory of protein, which relates sequence correlations to physical interactions and to the emergence of mechanical function. Our protein is modeled as a strongly coupled amino acid network with interactions and motions that are captured by the mechanical propagator, the Green function. The propagator describes how the gene determines the connectivity of the amino acids and thereby, the transmission of forces. Mutations introduce localized perturbations to the propagator that scatter the force field. The emergence of function is manifested by a topological transition when a band of such perturbations divides the protein into subdomains. We find that epistasis, the interaction among mutations in the gene, is related to the nonlinearity of the Green function, which can be interpreted as a sum over multiple scattering paths. We apply this mechanical framework to simulations of protein evolution and observe long-range epistasis, which facilitates collective functional modes.
Affiliation(s)
- Sandipan Dutta
- Center for Soft and Living Matter, Institute for Basic Science, Ulsan 44919, Korea
| | - Jean-Pierre Eckmann
- Département de Physique Théorique and Section de Mathématiques, Université de Genève, CH-1211 Geneva 4, Switzerland
| | - Albert Libchaber
- Center for Studies in Physics and Biology, The Rockefeller University, New York, NY 10021
| | - Tsvi Tlusty
- Center for Soft and Living Matter, Institute for Basic Science, Ulsan 44919, Korea
- Department of Physics, Ulsan National Institute of Science and Technology, Ulsan 44919, Korea
| |
|
171
|
Pecanka J, Jonker MA, Bochdanovits Z, Van Der Vaart AW. A powerful and efficient two-stage method for detecting gene-to-gene interactions in GWAS. Biostatistics 2018; 18:477-494. [PMID: 28334077 DOI: 10.1093/biostatistics/kxw060] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 06/13/2016] [Accepted: 11/05/2016] [Indexed: 11/13/2022] Open
Abstract
For over a decade functional gene-to-gene interaction (epistasis) has been suspected to be a determinant in the "missing heritability" of complex traits. However, searching for epistasis on the genome-wide scale has been challenging due to the prohibitively large number of tests which result in a serious loss of statistical power as well as computational challenges. In this article, we propose a two-stage method applicable to existing case-control data sets, which aims to lessen both of these problems by pre-assessing whether a candidate pair of genetic loci is involved in epistasis before it is actually tested for interaction with respect to a complex phenotype. The pre-assessment is based on a two-locus genotype independence test performed in the sample of cases. Only the pairs of loci that exhibit non-equilibrium frequencies are analyzed via a logistic regression score test, thereby reducing the multiple testing burden. Since only the computationally simple independence tests are performed for all pairs of loci while the more demanding score tests are restricted to the most promising pairs, genome-wide association study (GWAS) for epistasis becomes feasible. By design our method provides strong control of the type I error. Its favourable power properties especially under the practically relevant misspecification of the interaction model are illustrated. Ready-to-use software is available. Using the method we analyzed Parkinson's disease in four cohorts and identified possible interactions within several SNP pairs in multiple cohorts.
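The screening idea in this abstract can be illustrated with a minimal Python sketch. It covers only the first stage (a two-locus genotype independence test in cases) and assumes a plain Pearson chi-square test on genotypes coded 0/1/2; the function names, the threshold `alpha1`, and the data layout are hypothetical and are not taken from the authors' software. The second-stage logistic regression score test is not shown.

```python
import math
from collections import Counter
from itertools import combinations


def chi2_sf(x, df):
    """Upper-tail chi-square probability; closed forms for df in {1, 2, 4}."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    if df == 4:
        return math.exp(-x / 2.0) * (1.0 + x / 2.0)
    raise ValueError("unsupported degrees of freedom")


def independence_p(geno_a, geno_b):
    """Pearson chi-square p-value for independence of two loci.

    geno_a and geno_b hold genotype codes (0, 1 or 2) for the same
    individuals, here assumed to be cases only.
    """
    n = len(geno_a)
    joint = Counter(zip(geno_a, geno_b))
    row = Counter(geno_a)
    col = Counter(geno_b)
    # Only observed genotype classes count towards the degrees of freedom.
    df = (len(row) - 1) * (len(col) - 1)
    if df == 0:
        return 1.0  # a monomorphic locus carries no information
    stat = 0.0
    for a in row:
        for b in col:
            expected = row[a] * col[b] / n
            stat += (joint[(a, b)] - expected) ** 2 / expected
    return chi2_sf(stat, df)


def stage_one(case_genotypes, alpha1):
    """Keep SNP pairs whose case-only independence test has p < alpha1."""
    kept = []
    pairs = combinations(case_genotypes.items(), 2)
    for (name_a, g_a), (name_b, g_b) in pairs:
        if independence_p(g_a, g_b) < alpha1:
            kept.append((name_a, name_b))
    return kept
```

Only the pairs returned by `stage_one` would go on to the more expensive interaction test, which is how a two-stage design of this kind reduces the multiple-testing burden.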
Affiliation(s)
- Jakub Pecanka
- Leiden University Medical Center, Department of Medical Statistics and Bioinformatics, Leiden, The Netherlands and VU University, Department of Mathematics, Amsterdam, the Netherlands
| | - Marianne A Jonker
- VU University Medical Center, Department of Epidemiology and Biostatistics, Amsterdam, The Netherlands and Radboud University medical center, Radboud Institute for Health Sciences, Nijmegen, The Netherlands
| | | | - Zoltan Bochdanovits
- VU University Medical Center, Department of Clinical Genetics, Amsterdam, The Netherlands
| | | |
|
172
|
Awdeh A, Phenix H, Karn M, Perkins TJ. Dynamics in Epistasis Analysis. IEEE/ACM Trans Comput Biol Bioinform 2018; 15:878-891. [PMID: 28092574 DOI: 10.1109/tcbb.2017.2653110] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/06/2023]
Abstract
Finding regulatory relationships between genes, including the direction and nature of influence between them, is a fundamental challenge in the field of molecular genetics. One classical approach to this problem is epistasis analysis. Broadly speaking, epistasis analysis infers the regulatory relationships between a pair of genes in a genetic pathway by considering the patterns of change in an observable trait resulting from single and double deletion of genes. While classical epistasis analysis has yielded deep insights on numerous genetic pathways, it is not without limitations. Here, we explore the possibility of dynamic epistasis analysis, in which, in addition to performing genetic perturbations of a pathway, we drive the pathway by a time-varying upstream signal. We explore the theoretical power of dynamical epistasis analysis by conducting an identifiability analysis of Boolean models of genetic pathways, comparing static and dynamic approaches. We find that even relatively simple input dynamics greatly increases the power of epistasis analysis to discriminate alternative network structures. Further, we explore the question of experiment design, and show that a subset of short time-varying signals, which we call dynamic primitives, allow maximum discriminative power with a reduced number of experiments.
|
173
|
Suazo J, Santos JL, Colombo A, Pardo R. Gene-gene interaction for nonsyndromic cleft lip with or without cleft palate in Chilean case-parent trios. Arch Oral Biol 2018; 91:91-95. [PMID: 29694940 DOI: 10.1016/j.archoralbio.2018.04.009] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 01/02/2018] [Revised: 04/12/2018] [Accepted: 04/13/2018] [Indexed: 01/07/2023]
Abstract
OBJECTIVE Nonsyndromic cleft lip with or without cleft palate (NSCL/P) is a birth defect for which several susceptibility genes have been proposed. Consequently, it has been suggested that many of these genes belong to common inter-related pathways during craniofacial development, which implies gene-gene interaction. We evaluated the presence of gene-gene interaction for single nucleotide polymorphisms within the interferon regulatory factor 6 (IRF6), muscle segment homeobox 1 (MSX1), bone morphogenetic protein 4 (BMP4) and transforming growth factor beta 3 (TGFB3) genes in NSCL/P risk in Chilean case-parent trios. DESIGN From previous studies, we retrieved genotypes for 13 polymorphic variants within these four genes in 152 case-parent trios. Using the trio package (R), we evaluated gene-gene interaction in pairs of genetic markers, applying a 1-degree-of-freedom (1df) test and a confirmatory 4-degree-of-freedom (4df) test for epistasis, followed by both a permutation test and a Benjamini-Hochberg adjustment for multiple comparisons. RESULTS We found evidence of gene-gene interaction for rs6446693 (MSX1) and rs2268625 (TGFB3) (4df p = 0.024; permutation p = 0.015, Benjamini-Hochberg p = 0.001). CONCLUSIONS A significant gene-gene interaction was detected for rs6446693 (MSX1) and rs2268625 (TGFB3). This finding is concordant with research in animal models showing that MSX1 and TGFB3 are expressed in common molecular pathways acting in an epistatic manner during maxillofacial development.
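As an illustration of the multiple-comparison step mentioned in this abstract, the following Python sketch computes Benjamini-Hochberg adjusted p-values with the standard step-up procedure. It is not the code path of the R trio package, and the function name and any input p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay
    # monotone: each one is capped by the adjusted value ranked above it.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A pair of markers would then be reported as interacting when its adjusted value falls below the chosen false discovery rate, for example 0.05.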
Affiliation(s)
- José Suazo
- Instituto de Investigación en Ciencias Odontológicas, Facultad de Odontología, Universidad de Chile, Sergio Livingstone #943, Santiago, Chile.
| | - José Luis Santos
- Departamento de Nutrición, Diabetes y Metabolismo, Escuela de Medicina, Pontificia Universidad Católica de Chile, Lira #44, Santiago, Chile
| | - Alicia Colombo
- Programa de Anatomía y Biología del Desarrollo, Instituto de Ciencias Biomédicas, Facultad de Medicina, Universidad de Chile, Independencia #1027, Santiago, Chile; Servicio de Anatomía Patológica, Hospital Clínico de la Universidad de Chile, Santos Dumont #999, Santiago, Chile
| | - Rosa Pardo
- Sección de Genética, Hospital Clínico Universidad de Chile, Santos Dumont #999, Santiago, Chile; Unidad de Neonatología, Hospital Clínico Universidad de Chile, Santos Dumont #999, Santiago, Chile; Unidad de Genética, Hospital Dr. Sótero del Río, Concha y Toro #3459, Santiago, Chile
| |
|
174
|
Wang M, Liu D, Schwender H, Wang H, Wang P, Zhou Z, Li J, Wu T, Zhu H, Beaty TH. Evaluating the effect of nicotinic cholinergic receptor genes on the risk of nonsyndromic cleft lip with or without cleft palate. Oral Dis 2018; 24:1068-1072. [PMID: 29688589 DOI: 10.1111/odi.12879] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2018] [Revised: 03/31/2018] [Accepted: 04/16/2018] [Indexed: 11/26/2022]
Abstract
OBJECTIVE Multiple studies have suggested that nonsyndromic cleft lip with or without cleft palate (NSCL/P) and lung cancer may have a common genetic etiology. Previous studies have shown that genetic variants in nicotinic cholinergic receptor genes (CHRNs) may influence the risk of lung cancer. We aimed to explore the effect of CHRNs on the risk of NSCL/P, considering gene-gene (GxG) interaction for these genes. SUBJECTS AND METHODS We selected 120 markers in 14 CHRNs to test for GxG interaction using 806 Chinese case-parent trios recruited from an international consortium established for a GWAS of oral clefts. RESULTS In total, two pairs of SNPs yielded significant GxG interactions after Bonferroni correction (rs935865 and rs2337980 with p = 4.04 × 10⁻⁵; rs2741335 and rs3743077 with p = 4.80 × 10⁻⁴), and these pairwise interactions were confirmed in permutation tests. In addition, the relative risk (RR) of the putative interaction between rs935865 and rs2337980 was 1.10 (95% CI: 0.92-1.31). CONCLUSIONS While the single SNP association and the gene-environment interaction analysis of 14 CHRN genes yielded no signal, this study did demonstrate the importance of considering potential GxG interaction for exploring the etiology of NSCL/P. This study suggests an important role for particular combinations of SNPs in CHRN genes in influencing risk of NSCL/P, which needs further study.
Affiliation(s)
- Mengying Wang
- School of Public Health, Peking University, Beijing, China
| | - Dongjing Liu
- School of Public Health, Peking University, Beijing, China
| | - Holger Schwender
- Mathematical Institute, Heinrich Heine University Duesseldorf, Duesseldorf, Germany
| | - Hong Wang
- School of Public Health, Peking University, Beijing, China
| | - Ping Wang
- Beijing Center for Disease Prevention and Control, Beijing, China
| | - Zhibo Zhou
- School of Stomatology, Peking University, Beijing, China
| | - Jing Li
- School of Stomatology, Peking University, Beijing, China
| | - Tao Wu
- School of Public Health, Peking University, Beijing, China.,Key Laboratory of Reproductive Health, Ministry of Health, Beijing, China
| | - Hongping Zhu
- School of Stomatology, Peking University, Beijing, China
| | - Terri H Beaty
- School of Public Health, Johns Hopkins University, Baltimore, Maryland
| |
|
175
|
Wang P, Wu T, Schwender H, Wang H, Shi B, Wang ZQ, Yuan Y, Liu DJ, Wang MY, Li J, Zhou ZB, Zhu HP, Beaty TH. Evidence of interaction between genes in the folate/homocysteine metabolic pathway in controlling risk of non-syndromic oral cleft. Oral Dis 2018; 24:820-828. [PMID: 29356306 DOI: 10.1111/odi.12831] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 10/09/2017] [Revised: 12/20/2017] [Accepted: 01/09/2018] [Indexed: 01/09/2023]
Abstract
OBJECTIVE Little consistent evidence is available for the association between the risk of non-syndromic cleft lip with or without cleft palate (NSCL/P) and any of the individual genes in the folate/homocysteine metabolic pathway. We investigated the genes in the folate pathway to further clarify its potential influence on the risk of NSCL/P considering gene-gene (G×G) interaction. SUBJECTS AND METHODS We selected markers in 18 genes from the pathway and applied Cordell's method to test for G×G interaction using 1,908 NSCL/P case-parent trios ascertained in an international consortium where a genomewide association study (GWAS) of oral clefts was conducted. RESULTS We found intriguing signals among Asian and European ancestry groups for G×G interaction between markers in betaine-homocysteine methyltransferase gene (BHMT/BHMT2) and dimethylglycine dehydrogenase gene (DMGDH) attaining genomewide significance. In the pooled data, the top significant interaction was found between rs13158309 (BHMT) and rs10514154 (DMGDH, p = 1.45 × 10⁻¹²). CONCLUSIONS Our study illustrated the importance of taking into account potential G×G interaction for genetic association analysis in NSCL/P, and this study suggested both BHMT/BHMT2 and DMGDH should be considered as candidate genes for NSCL/P in future studies.
Affiliation(s)
- P Wang
- School of Public Health, Peking University, Beijing, China.,Department of Statistics and Information, Beijing Center for Disease Prevention and Control & Beijing Research Center for Preventive Medicine, Beijing, China
| | - T Wu
- School of Public Health, Peking University, Beijing, China.,Key Laboratory of Reproductive Health, Ministry of Health, Beijing, China
| | - H Schwender
- Mathematical Institute, Heinrich Heine University Duesseldorf, Duesseldorf, Germany
| | - H Wang
- School of Public Health, Peking University, Beijing, China
| | - B Shi
- State Key Laboratory of Oral Disease, West China School of Stomatology, Sichuan University, Chengdu, China
| | - Z Q Wang
- School of Public Health, Peking University, Beijing, China
| | - Y Yuan
- School of Public Health, Peking University, Beijing, China
| | - D J Liu
- School of Public Health, Peking University, Beijing, China
| | - M Y Wang
- School of Public Health, Peking University, Beijing, China
| | - J Li
- Pediatric Dentistry, Peking University School of Stomatology, Beijing, China
| | - Z B Zhou
- Oral and Maxillofacial Surgery, Peking University School of Stomatology, Beijing, China
| | - H P Zhu
- Oral and Maxillofacial Surgery, Peking University School of Stomatology, Beijing, China
| | - T H Beaty
- School of Public Health, Johns Hopkins University, Baltimore, MD, USA
| |
|
176
|
Verma SS, Lucas A, Zhang X, Veturi Y, Dudek S, Li B, Li R, Urbanowicz R, Moore JH, Kim D, Ritchie MD. Collective feature selection to identify crucial epistatic variants. BioData Min 2018; 11:5. [PMID: 29713383 PMCID: PMC5907720 DOI: 10.1186/s13040-018-0168-6] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 11/29/2017] [Accepted: 04/04/2018] [Indexed: 01/17/2023] Open
Abstract
Background Machine learning methods have gained popularity and practicality in identifying linear and non-linear effects of variants associated with complex disease/traits. Detection of epistatic interactions still remains a challenge due to the large number of features and relatively small sample size as input, thus leading to the so-called "short fat data" problem. The efficiency of machine learning methods can be increased by limiting the number of input features. Thus, it is very important to perform variable selection before searching for epistasis. Many methods have been evaluated and proposed to perform feature selection, but no single method works best in all scenarios. We demonstrate this by conducting two separate simulation analyses to evaluate the proposed collective feature selection approach. Results Through our simulation study we propose a collective feature selection approach to select features that are in the "union" of the best performing methods. We explored various parametric, non-parametric, and data mining approaches to perform feature selection. We choose our top performing methods to select the union of the resulting variables based on a user-defined percentage of variants selected from each method to take to downstream analysis. Our simulation analysis shows that non-parametric data mining approaches, such as MDR, may work best under one simulation criteria for the high effect size (penetrance) datasets, while non-parametric methods designed for feature selection, such as Ranger and Gradient boosting, work best under other simulation criteria. Thus, using a collective approach proves to be more beneficial for selecting variables with epistatic effects also in low effect size datasets and different genetic architectures. 
Following this, we applied our proposed collective feature selection approach to select the top 1% of variables to identify potential interacting variables associated with Body Mass Index (BMI) in ~ 44,000 samples obtained from Geisinger's MyCode Community Health Initiative (on behalf of DiscovEHR collaboration). Conclusions In this study, we were able to show that selecting variables using a collective feature selection approach could help in selecting true positive epistatic variables more frequently than applying any single method for feature selection via simulation studies. We were able to demonstrate the effectiveness of collective feature selection along with a comparison of many methods in our simulation analysis. We also applied our method to identify non-linear networks associated with obesity.
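The "union of the best performing methods" described in this abstract can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: each method is assumed to yield one importance score per variant (higher is better), and the function names, method labels, and score layout are hypothetical rather than taken from the authors' pipeline.

```python
import math


def top_fraction(scores, fraction):
    """Names of the highest-scoring features, keeping at least one."""
    k = max(1, math.ceil(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def collective_selection(scores_by_method, fraction):
    """Union of the top `fraction` of features chosen by every method."""
    selected = set()
    for scores in scores_by_method.values():
        selected |= top_fraction(scores, fraction)
    return selected
```

Taking the union rather than the intersection means that a variant ranked highly by only one method still reaches the downstream analysis, which fits the abstract's observation that no single method works best in all scenarios.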
Affiliation(s)
- Shefali S Verma
- Biomedical and Translational Bioinformatics Institute, Geisinger Health System, 100 N Academy Avenue, Danville, PA 17822 USA.,Huck Institute of Life Sciences, The Pennsylvania State University, University Park, PA USA.,Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Richards Building, 3700 Hamilton Walk, Philadelphia, PA 19104 USA
| | - Anastasia Lucas
- Biomedical and Translational Bioinformatics Institute, Geisinger Health System, 100 N Academy Avenue, Danville, PA 17822 USA.,Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Richards Building, 3700 Hamilton Walk, Philadelphia, PA 19104 USA
| | - Xinyuan Zhang
- Huck Institute of Life Sciences, The Pennsylvania State University, University Park, PA USA.,Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Richards Building, 3700 Hamilton Walk, Philadelphia, PA 19104 USA
| | - Yogasudha Veturi
- Biomedical and Translational Bioinformatics Institute, Geisinger Health System, 100 N Academy Avenue, Danville, PA 17822 USA.,Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Richards Building, 3700 Hamilton Walk, Philadelphia, PA 19104 USA
| | - Scott Dudek
- Biomedical and Translational Bioinformatics Institute, Geisinger Health System, 100 N Academy Avenue, Danville, PA 17822 USA.,Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Richards Building, 3700 Hamilton Walk, Philadelphia, PA 19104 USA
| | - Binglan Li
- Huck Institute of Life Sciences, The Pennsylvania State University, University Park, PA USA.,Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Richards Building, 3700 Hamilton Walk, Philadelphia, PA 19104 USA
| | - Ruowang Li
- Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Richards Building, 3700 Hamilton Walk, Philadelphia, PA 19104 USA
| | - Ryan Urbanowicz
- Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Richards Building, 3700 Hamilton Walk, Philadelphia, PA 19104 USA
| | - Jason H Moore
- Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Richards Building, 3700 Hamilton Walk, Philadelphia, PA 19104 USA
| | - Dokyoon Kim
- Biomedical and Translational Bioinformatics Institute, Geisinger Health System, 100 N Academy Avenue, Danville, PA 17822 USA
| | - Marylyn D Ritchie
- Biomedical and Translational Bioinformatics Institute, Geisinger Health System, 100 N Academy Avenue, Danville, PA 17822 USA.,Huck Institute of Life Sciences, The Pennsylvania State University, University Park, PA USA.,Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Richards Building, 3700 Hamilton Walk, Philadelphia, PA 19104 USA
| |
|
177
|
Davoli R, Gaffo E, Zappaterra M, Bortoluzzi S, Zambonelli P. Identification of differentially expressed small RNAs and prediction of target genes in Italian Large White pigs with divergent backfat deposition. Anim Genet 2018; 49:205-214. [DOI: 10.1111/age.12646] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Accepted: 01/10/2018] [Indexed: 01/21/2023]
Affiliation(s)
- R. Davoli
- Department of Agricultural and Food Sciences (DISTAL); University of Bologna; Viale G. Fanin 46 40127 Bologna Italy
| | - E. Gaffo
- Department of Molecular Medicine; University of Padova; Viale G. Colombo 3 35131 Padova Italy
| | - M. Zappaterra
- Department of Agricultural and Food Sciences (DISTAL); University of Bologna; Viale G. Fanin 46 40127 Bologna Italy
| | - S. Bortoluzzi
- Department of Molecular Medicine; University of Padova; Viale G. Colombo 3 35131 Padova Italy
| | - P. Zambonelli
- Department of Agricultural and Food Sciences (DISTAL); University of Bologna; Viale G. Fanin 46 40127 Bologna Italy
| |
|
178
|
Stanfill AG, Starlard-Davenport A. Primer in Genetics and Genomics, Article 7-Multifactorial Concepts: Gene-Gene Interactions. Biol Res Nurs 2018. [PMID: 29514459 DOI: 10.1177/1099800418761098] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 11/15/2022]
Abstract
Most common disorders affecting human health are not attributable to simple Mendelian (single-gene) inheritance patterns. Rather, the risk of developing a complex disease is often the result of interactions across genes, whereby one gene modifies the phenotype of another gene. These types of interactions can occur between two or more genes and are referred to as epistasis. There are five major types of epistatic interactions, but in human genetics, additive epistasis is most often discussed and includes both positive and negative subtypes. Detecting epistatic interactions can be quite difficult because seemingly unrelated genes can interact with and influence each other. As a result of this complexity, statistical geneticists are constantly developing new methods to enhance detection, but there are disadvantages to each proposed method. In this article, we explore the concept of epistasis, discuss different types of epistatic interactions, and provide a brief introduction to statistical methods researchers use to uncover sets of epistatic interactions. Then, we consider Alzheimer's disease as an exemplar for a disease with epistatic effects. Finally, we provide helpful resources, where nurses can learn more about epistasis in order to incorporate these methods into their own program of research.
Affiliation(s)
- Ansley Grimes Stanfill
- Department of Acute and Tertiary Care, College of Nursing, University of Tennessee Health Science Center, Memphis, TN, USA.,Department of Genetics, Genomics, and Informatics, College of Medicine, University of Tennessee Health Science Center, Memphis, TN, USA
| | - Athena Starlard-Davenport
- Department of Genetics, Genomics, and Informatics, College of Medicine, University of Tennessee Health Science Center, Memphis, TN, USA
| |
|
179
|
van der Linden D, Schermer JA, de Zeeuw E, Dunkel CS, Pekaar KA, Bakker AB, Vernon PA, Petrides KV. Overlap Between the General Factor of Personality and Trait Emotional Intelligence: A Genetic Correlation Study. Behav Genet 2018; 48:147-154. [PMID: 29264815 PMCID: PMC5846839 DOI: 10.1007/s10519-017-9885-8] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Received: 05/03/2017] [Accepted: 12/08/2017] [Indexed: 12/31/2022]
Abstract
A previous meta-analysis (Van der Linden et al., Psychol Bull 143:36-52, 2017) showed that the General Factor of Personality (GFP) overlaps with ability as well as trait emotional intelligence (EI). The correlation between trait EI and the GFP was so high (ρ = 0.88) in that meta-analysis that these two may be considered virtually identical constructs. The present study builds on these findings by examining whether the strong phenotypic correlation between the GFP and trait EI has a genetic component. In a sample of monozygotic and dizygotic twins, the heritability estimates for the GFP and trait EI were 53 and 45%, respectively. Moreover, there was a strong genetic correlation of r = .90 between the GFP and trait EI. Additional analyses suggested that a substantial proportion of the genetic correlations reflects non-additive genetic effects (e.g., dominance and epistasis). These findings are discussed in light of evolutionary accounts of the GFP.
Affiliation(s)
- Dimitri van der Linden
- Department of Psychology, Education, and Child Studies, Erasmus University Rotterdam, P.O. Box 9104, 3000 DR, Rotterdam, The Netherlands.
| | - Julie A Schermer
- Management and Organizational Studies, University of Western Ontario, London, Canada
| | - Eveline de Zeeuw
- Department of Biological Psychology, Free University Amsterdam, Amsterdam, The Netherlands
| | - Curtis S Dunkel
- Department of Psychology, Western Illinois University, Macomb, USA
| | - Keri A Pekaar
- Department of Psychology, Education, and Child Studies, Erasmus University Rotterdam, P.O. Box 9104, 3000 DR, Rotterdam, The Netherlands
| | - Arnold B Bakker
- Department of Psychology, Education, and Child Studies, Erasmus University Rotterdam, P.O. Box 9104, 3000 DR, Rotterdam, The Netherlands
| | - Philip A Vernon
- Department of Psychology, University of Western Ontario, London, Canada
| | - K V Petrides
- London Psychometric Laboratory, University College London, London, UK
| |
|
180
|
Primary ovarian insufficiency associated with autosomal abnormalities: from chromosome to genome-wide and beyond. Menopause 2018; 23:806-15. [PMID: 27045702 DOI: 10.1097/gme.0000000000000603] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 01/23/2023]
Abstract
OBJECTIVE The pathophysiology of primary ovarian insufficiency (POI) is not well elucidated. Many candidate genetic aberrations are on the X chromosome; however, many genetic perturbations are also on the autosomes. The aim of this review is to summarize the knowledge of genetic aberrations on autosomes from chromosomal rearrangement, gene abnormality, genome-wide association studies and epigenetics. METHODS Searches of electronic databases were performed. Articles and abstracts relevant to POI and genetic studies associated with autosomes were summarized in this interpretive literature review. RESULTS Various genetic aberrations located on the autosomes were found. These abnormalities arise from chromosomal rearrangements, which might disrupt critical regions at chromosomal loci or disturb the meiotic process. Specific gene aberrations are also identified. Gene association studies suggest the involvement of autosomal genes that function in ovarian development, folliculogenesis, and steroidogenesis. Gene-to-gene interaction or epistasis also might play a role in POI occurrence. Recently, genetic techniques to study the whole genome have emerged. Although no specific conclusion has been reached, genome-wide association studies searching for specific aberrations throughout the genome in POI have been published. Epigenetic mechanisms might also take part in the pathogenesis of POI. CONCLUSIONS The considerably complex process of POI is still not well understood. Further research is needed: gene functional validation studies to confirm the contribution of genes in POI, or additional genome-wide association studies using the novel clustered regularly interspaced short palindromic repeat/Cas9 technique, might make these mechanisms more comprehensible.
|
181
|
Cole BS, Hall MA, Urbanowicz RJ, Gilbert‐Diamond D, Moore JH. Analysis of Gene‐Gene Interactions. Curr Protoc Hum Genet 2018; 95:1.14.1-1.14.10. [DOI: 10.1002/cphg.45] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 12/12/2022]
Affiliation(s)
- Brian S. Cole
- Department of Biostatistics and Epidemiology, Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania Philadelphia Pennsylvania
| | - Molly A. Hall
- Department of Biostatistics and Epidemiology, Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania Philadelphia Pennsylvania
- The Center for Systems Genomics, The Pennsylvania State University, University Park Pennsylvania
| | - Ryan J. Urbanowicz
- Department of Biostatistics and Epidemiology, Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania Philadelphia Pennsylvania
| | - Diane Gilbert‐Diamond
- Institute for Quantitative Biomedical Sciences at Dartmouth Hanover New Hampshire
- Department of Epidemiology, Geisel School of Medicine at Dartmouth Hanover New Hampshire
| | - Jason H. Moore
- Department of Biostatistics and Epidemiology, Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania Philadelphia Pennsylvania
| |
|
182
|
Han SS, Chatterjee N. Review of Statistical Methods for Gene-Environment Interaction Analysis. CURR EPIDEMIOL REP 2018. [DOI: 10.1007/s40471-018-0135-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 01/21/2023]
|
183
|
Ameratunga R, Woon ST, Bryant VL, Steele R, Slade C, Leung EY, Lehnert K. Clinical Implications of Digenic Inheritance and Epistasis in Primary Immunodeficiency Disorders. Front Immunol 2018; 8:1965. [PMID: 29434582 PMCID: PMC5790765 DOI: 10.3389/fimmu.2017.01965] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Received: 11/03/2017] [Accepted: 12/19/2017] [Indexed: 12/16/2022] Open
Abstract
The existence of epistasis in humans was first predicted by Bateson in 1909. Epistasis describes the non-linear, synergistic interaction of two or more genetic loci, which can substantially modify disease severity or result in entirely new phenotypes. The concept has remained controversial in human genetics because of the lack of well-characterized examples. In humans, it is only possible to demonstrate epistasis if two or more genes are mutated. In most cases of epistasis, the mutated gene products are likely to be constituents of the same physiological pathway leading to severe disruption of a cellular function such as antibody production. We have recently described a digenic family, who carry mutations of TNFRSF13B/TACI as well as TCF3 genes. Both genes lie in tandem along the immunoglobulin isotype switching and secretion pathway. We have shown they interact in an epistatic way causing severe immunodeficiency and autoimmunity in the digenic proband. With the advent of next generation sequencing, it is likely other families with digenic inheritance will be identified. Since digenic inheritance does not always cause epistasis, we propose an epistasis index which may help quantify the effects of the two mutations. We also discuss the clinical implications of digenic inheritance and epistasis in humans with primary immunodeficiency disorders.
Affiliation(s)
- Rohan Ameratunga
- Department of Virology and Immunology, Auckland City Hospital, Auckland, New Zealand; Department of Clinical Immunology, Auckland City Hospital, Auckland, New Zealand
| | - See-Tarn Woon
- Department of Virology and Immunology, Auckland City Hospital, Auckland, New Zealand
| | - Vanessa L Bryant
- Department of Immunology, Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia; Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
| | - Richard Steele
- Department of Virology and Immunology, Auckland City Hospital, Auckland, New Zealand
| | - Charlotte Slade
- Department of Immunology, Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia; Department of Allergy and Clinical Immunology, Royal Melbourne Hospital, Parkville, VIC, Australia
| | - Euphemia Yee Leung
- Auckland Cancer Society Research Centre, University of Auckland, Auckland, New Zealand
| | - Klaus Lehnert
- School of Biological Sciences, University of Auckland, Auckland, New Zealand
| |
|
184
|
Verma SS, Ritchie MD. Another Round of "Clue" to Uncover the Mystery of Complex Traits. Genes (Basel) 2018; 9:E61. [PMID: 29370075 PMCID: PMC5852557 DOI: 10.3390/genes9020061] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 11/01/2017] [Revised: 12/19/2017] [Accepted: 01/15/2018] [Indexed: 12/13/2022] Open
Abstract
A plethora of genetic association analyses have identified several genetic risk loci. Technological and statistical advancements have now led to the identification of not only common genetic variants, but also low-frequency variants, structural variants, and environmental factors, as well as multi-omics variations that affect the phenotypic variance of complex traits in a population, thus referred to as complex trait architecture. The concept of heritability, or the proportion of phenotypic variance due to genetic inheritance, has been studied for several decades, but its application is mainly in addressing the narrow sense heritability (or additive genetic component) from Genome-Wide Association Studies (GWAS). In this commentary, we reflect on our perspective on the complexity of understanding heritability for human traits in comparison to model organisms, highlighting another round of clues beyond GWAS and an alternative approach, investigating these clues comprehensively to help in elucidating the genetic architecture of complex traits.
Affiliation(s)
- Shefali Setia Verma
- The Huck Institute of Life Sciences, The Pennsylvania State University, University Park, PA 16802, USA.
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA.
| | - Marylyn D Ritchie
- The Huck Institute of Life Sciences, The Pennsylvania State University, University Park, PA 16802, USA.
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA.
| |
|
185
|
Liu D, Schwender H, Wang M, Wang H, Wang P, Zhu H, Zhou Z, Li J, Wu T, Beaty TH. Gene-gene interaction between MSX1 and TP63 in Asian case-parent trios with nonsyndromic cleft lip with or without cleft palate. Birth Defects Res 2018; 110:317-324. [PMID: 29341488 DOI: 10.1002/bdr2.1139] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 04/19/2017] [Revised: 08/27/2017] [Accepted: 09/06/2017] [Indexed: 01/10/2023]
Abstract
BACKGROUND Small ubiquitin-like modification, also known as sumoylation, is a crucial post-translational regulatory mechanism involved in development of the lip and palate. Recent studies reported two sumoylation target genes, MSX1 and TP63, to have achieved genome-wide level significance in tests of association with nonsyndromic clefts. Here, we performed a candidate gene analysis considering gene-gene and gene-environment interaction for SUMO1, MSX1, and TP63 to further explore the etiology of nonsyndromic cleft lip with or without cleft palate (NSCL/P). METHODS A total of 130 single-nucleotide polymorphisms (SNPs) in or near SUMO1, MSX1, and TP63 were analyzed among 1,038 Asian NSCL/P trios ascertained through an international consortium. Conditional logistic regression models were used to explore gene-gene (G × G) and gene-environment (G × E) interaction involving maternal environmental tobacco smoke and multivitamin supplementation. Bonferroni correction was used for G × E analysis and permutation tests were used for G × G analysis. RESULTS While transmission disequilibrium tests and gene-environment interaction analysis showed no significant results, we did find signals of gene-gene interaction between SNPs near MSX1 and TP63. Three pairwise interactions yielded significant p values in permutation tests (rs884690 and rs9290890 with p = 9.34 × 10⁻⁵ and empirical p = 1.00 × 10⁻⁴; rs1022136 and rs4687098 with p = 2.41 × 10⁻⁴ and empirical p = 2.95 × 10⁻⁴; rs6819546 and rs9681004 with p = 5.15 × 10⁻⁴ and empirical p = 3.02 × 10⁻⁴). CONCLUSION Gene-gene interaction between MSX1 and TP63 may influence the risk of NSCL/P in Asian populations. Our study provided additional understanding of the genetic etiology of NSCL/P and underlined the importance of considering gene-gene interaction in the etiology of this common craniofacial malformation.
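The two multiple-testing devices named in the METHODS (Bonferroni correction and permutation-based empirical p values) can be illustrated with a generic sketch. This is not the authors' trio-based procedure, which relies on conditional logistic regression; the function names, the toy statistic, and the add-one estimator below are all assumptions chosen for illustration.

```python
import random


def empirical_p(observed, null_stats):
    """Add-one empirical p value: share of null statistics >= the observed one."""
    exceed = sum(1 for s in null_stats if s >= observed)
    return (exceed + 1) / (len(null_stats) + 1)


def permutation_p(x, y, stat, n_perm=999, seed=0):
    """Shuffle y to break any x-y association and compare against the observed statistic."""
    rng = random.Random(seed)
    observed = stat(x, y)
    y_perm = list(y)
    null_stats = []
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        null_stats.append(stat(x, y_perm))
    return empirical_p(observed, null_stats)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def abs_covariance(x, y):
    """Toy association statistic (hypothetical stand-in for an interaction test statistic)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    return abs(sum((a - mx) * (b - my) for a, b in zip(x, y)))


if __name__ == "__main__":
    x = list(range(20))
    y = list(range(20))  # perfectly associated toy data
    print(permutation_p(x, y, abs_covariance))
    print(bonferroni_threshold(0.05, 130))  # 130 tests is only an illustrative count
```

With the add-one estimator, the smallest attainable empirical p value is 1 / (n_perm + 1), so the number of permutations bounds the resolution of the reported values.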
Affiliation(s)
- Dongjing Liu
- Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
| | - Holger Schwender
- Mathematical Institute, Heinrich Heine University Duesseldorf, Duesseldorf, Germany
| | - Mengying Wang
- Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
| | - Hong Wang
- Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
| | - Ping Wang
- Department of Statistics and Information, Beijing Center for Disease Prevention and Control, Beijing, China
| | - Hongping Zhu
- School of Stomatology, Peking University, Beijing, China
| | - Zhibo Zhou
- School of Stomatology, Peking University, Beijing, China
| | - Jing Li
- School of Stomatology, Peking University, Beijing, China
| | - Tao Wu
- Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China; Key Laboratory of Reproductive Health, Ministry of Health, Beijing, China
| | - Terri H Beaty
- Department of Epidemiology, School of Public Health, Johns Hopkins University, Baltimore, Maryland
| |
|
186
|
Nowak S, Neidhart J, Szendro IG, Rzezonka J, Marathe R, Krug J. Interaction Analysis of Longevity Interventions Using Survival Curves. BIOLOGY 2018; 7:biology7010006. [PMID: 29316622 PMCID: PMC5872032 DOI: 10.3390/biology7010006] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 11/24/2017] [Revised: 12/30/2017] [Accepted: 01/03/2018] [Indexed: 01/05/2023]
Abstract
A long-standing problem in ageing research is to understand how different factors contributing to longevity should be expected to act in combination under the assumption that they are independent. Standard interaction analysis compares the extension of mean lifespan achieved by a combination of interventions to the prediction under an additive or multiplicative null model, but neither model is fundamentally justified. Moreover, the target of longevity interventions is not mean life span but the entire survival curve. Here we formulate a mathematical approach for predicting the survival curve resulting from a combination of two independent interventions based on the survival curves of the individual treatments, and quantify interaction between interventions as the deviation from this prediction. We test the method on a published data set comprising survival curves for all combinations of four different longevity interventions in Caenorhabditis elegans. We find that interactions are generally weak even when the standard analysis indicates otherwise.
Affiliation(s)
- Stefan Nowak
- Systems Biology of Ageing Cologne (Sybacol), University of Cologne, 50931 Cologne, Germany.
- Institut für Theoretische Physik, Universität zu Köln, 50937 Cologne, Germany.
| | - Johannes Neidhart
- Systems Biology of Ageing Cologne (Sybacol), University of Cologne, 50931 Cologne, Germany.
- Institut für Theoretische Physik, Universität zu Köln, 50937 Cologne, Germany.
- MBR Optical Systems, 42279 Wuppertal, Germany.
| | - Ivan G Szendro
- Systems Biology of Ageing Cologne (Sybacol), University of Cologne, 50931 Cologne, Germany.
- Institut für Theoretische Physik, Universität zu Köln, 50937 Cologne, Germany.
| | - Jonas Rzezonka
- Systems Biology of Ageing Cologne (Sybacol), University of Cologne, 50931 Cologne, Germany.
- Institut für Theoretische Physik, Universität zu Köln, 50937 Cologne, Germany.
| | - Rahul Marathe
- Systems Biology of Ageing Cologne (Sybacol), University of Cologne, 50931 Cologne, Germany.
- Institut für Theoretische Physik, Universität zu Köln, 50937 Cologne, Germany.
- Department of Physics, Indian Institute of Technology Delhi, Hauz Khas, 110016 New Delhi, India.
| | - Joachim Krug
- Systems Biology of Ageing Cologne (Sybacol), University of Cologne, 50931 Cologne, Germany.
- Institut für Theoretische Physik, Universität zu Köln, 50937 Cologne, Germany.
| |
|
187
|
|
188
|
Obolski U, Ram Y, Hadany L. Key issues review: evolution on rugged adaptive landscapes. REPORTS ON PROGRESS IN PHYSICS. PHYSICAL SOCIETY (GREAT BRITAIN) 2018; 81:012602. [PMID: 29051394 DOI: 10.1088/1361-6633/aa94d4] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 06/07/2023]
Abstract
Adaptive landscapes represent a mapping between genotype and fitness. Rugged adaptive landscapes contain two or more adaptive peaks: allele combinations with higher fitness than any of their neighbors in the genetic space. How do populations evolve on such rugged landscapes? Evolutionary biologists have struggled with this question since it was first introduced in the 1930s by Sewall Wright. Discoveries in the fields of genetics and biochemistry inspired various mathematical models of adaptive landscapes. The development of landscape models led to numerous theoretical studies analyzing evolution on rugged landscapes under different biological conditions. The large body of theoretical work suggests that adaptive landscapes are major determinants of the progress and outcome of evolutionary processes. Recent technological advances in molecular biology and microbiology allow experimenters to measure adaptive values of large sets of allele combinations and construct empirical adaptive landscapes for the first time. Such empirical landscapes have already been generated in bacteria, yeast, viruses, and fungi, and are contributing to new insights about evolution on adaptive landscapes. In this Key Issues Review we will: (i) introduce the concept of adaptive landscapes; (ii) review the major theoretical studies of evolution on rugged landscapes; (iii) review some of the recently obtained empirical adaptive landscapes; (iv) discuss recent mathematical and statistical analyses motivated by empirical adaptive landscapes, as well as provide the reader with instructions and source code to implement simulations of evolution on adaptive landscapes; and (v) discuss possible future directions for this exciting field.
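The definition of an adaptive peak given in the abstract (an allele combination fitter than all of its neighbors in genetic space) can be made concrete with a small sketch. It assumes biallelic loci encoded as 0/1 tuples and takes neighbors to be genotypes differing at exactly one locus; the fitness values are invented for illustration and are not taken from the review or its source code.

```python
def neighbors(genotype):
    """Yield every genotype that differs from `genotype` at exactly one biallelic locus."""
    for i in range(len(genotype)):
        yield genotype[:i] + (1 - genotype[i],) + genotype[i + 1:]


def find_peaks(fitness):
    """Return genotypes whose fitness strictly exceeds that of all one-mutation neighbors."""
    return sorted(
        g for g, w in fitness.items()
        if all(w > fitness[n] for n in neighbors(g))
    )


# Hypothetical two-locus landscape with reciprocal sign epistasis: two peaks, hence rugged.
rugged = {(0, 0): 1.0, (0, 1): 0.8, (1, 0): 0.7, (1, 1): 1.2}

# Hypothetical additive landscape: a single peak, hence smooth.
smooth = {(0, 0): 1.0, (0, 1): 1.1, (1, 0): 1.2, (1, 1): 1.3}

if __name__ == "__main__":
    print(find_peaks(rugged))
    print(find_peaks(smooth))
```

In the rugged example a population fixed at (0, 0) cannot reach the higher peak at (1, 1) through single beneficial mutations, which is the situation the review's central question addresses.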
|
189
|
Gumpinger AC, Roqueiro D, Grimm DG, Borgwardt KM. Methods and Tools in Genome-wide Association Studies. Methods Mol Biol 2018; 1819:93-136. [PMID: 30421401 DOI: 10.1007/978-1-4939-8618-7_5] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 12/27/2022]
Abstract
Many traits, such as height, the response to a given drug, or the susceptibility to certain diseases, are presumably co-determined by genetics. Especially in the field of medicine, it is of major interest to identify genetic aberrations that alter an individual's risk of developing a certain phenotypic trait. Addressing this question requires the availability of comprehensive, high-quality genetic datasets. The technological advancements and the decreasing cost of genotyping in the last decade led to an increase in such datasets. Parallel to and in line with this technological progress, an analysis framework under the name of genome-wide association studies was developed to properly collect and analyze these data. Genome-wide association studies aim at finding statistical dependencies (or associations) between a trait of interest and point mutations in the DNA. The statistical models used to detect such associations are diverse, spanning the whole range from the frequentist to the Bayesian setting. Since genetic datasets are inherently high-dimensional, the search for associations poses not only a statistical but also a computational challenge. As a result, a variety of toolboxes and software packages have been developed, each implementing different statistical methods while using various optimizations and mathematical techniques to enhance the computations. This chapter is devoted to the discussion of widely used methods and tools in genome-wide association studies. We present the different statistical models and the assumptions on which they are based, explain peculiarities of the data that have to be accounted for and, most importantly, introduce commonly used tools and software packages for the different tasks in a genome-wide association study, complemented with examples for their application.
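As one simple frequentist instance of the "statistical dependencies between a trait and point mutations" described above, a single-SNP allelic chi-square test can be sketched as follows. It is a minimal illustration, not a tool or method taken from the chapter; the function name and the allele counts are hypothetical.

```python
from math import erfc, sqrt


def allelic_chi2(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square (1 df) on a 2x2 table of allele counts; returns (chi2, p)."""
    a, b, c, d = case_alt, case_ref, ctrl_alt, ctrl_ref
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column of the table is empty")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For one degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical allele counts for one SNP in cases and controls.
    chi2, p = allelic_chi2(case_alt=30, case_ref=10, ctrl_alt=10, ctrl_ref=30)
    print(chi2, p)
```

In a genome-wide setting this test would be repeated for every SNP, which is why the multiple-testing burden and the computational cost mentioned in the abstract become central concerns.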
Affiliation(s)
- Anja C Gumpinger
- Machine Learning and Computational Biology Lab, D-BSSE, ETH Zurich, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Damian Roqueiro
- Machine Learning and Computational Biology Lab, D-BSSE, ETH Zurich, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Dominik G Grimm
- Machine Learning and Computational Biology Lab, D-BSSE, ETH Zurich, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Karsten M Borgwardt
- Machine Learning and Computational Biology Lab, D-BSSE, ETH Zurich, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
| |
|
190
|
Mielniczuk J, Teisseyre P. A deeper look at two concepts of measuring gene-gene interactions: logistic regression and interaction information revisited. Genet Epidemiol 2017; 42:187-200. [PMID: 29265411 DOI: 10.1002/gepi.22108] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 09/11/2017] [Revised: 10/23/2017] [Accepted: 11/15/2017] [Indexed: 11/09/2022]
Abstract
Detection of gene-gene interactions is one of the most important challenges in genome-wide case-control studies. Besides traditional logistic regression analysis, entropy-based methods have recently attracted significant attention. Among entropy-based methods, interaction information is one of the most promising measures, having many desirable properties. Although both logistic regression and interaction information have been used in several genome-wide association studies, the relationship between them has not been thoroughly investigated theoretically. The present paper attempts to fill this gap. We show that although certain connections between the two methods exist, in general they refer to two different concepts of dependence, and looking for interactions in those two senses leads to different approaches to interaction detection. We introduce an ordering between interaction measures and specify conditions for independent and dependent genes under which interaction information is a more discriminative measure than logistic regression. Moreover, we show that for so-called perfect distributions those measures are equivalent. The numerical experiments illustrate the theoretical findings, indicating that interaction information and its modified version are more universal tools for detecting various types of interaction than logistic regression and linkage disequilibrium measures.
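For readers unfamiliar with interaction information, a plug-in estimate from samples can be sketched as below. The sign convention used here, II(X1; X2; Y) = I((X1, X2); Y) − I(X1; Y) − I(X2; Y), is an assumption (conventions differ across the literature), and the function names and toy genotype data are illustrative rather than taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(samples):
    """Plug-in Shannon entropy (in bits) of a sequence of hashable observations."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def mutual_information(a, b):
    """I(A; B) = H(A) + H(B) - H(A, B), estimated from paired samples."""
    return entropy(a) + entropy(b) - entropy(list(zip(a, b)))


def interaction_information(x1, x2, y):
    """Information about y carried jointly by (x1, x2) beyond their separate contributions."""
    joint = list(zip(x1, x2))
    return (mutual_information(joint, y)
            - mutual_information(x1, y)
            - mutual_information(x2, y))


if __name__ == "__main__":
    # Toy data: the phenotype is the XOR of two binary markers (pure interaction).
    x1 = [0, 0, 1, 1]
    x2 = [0, 1, 0, 1]
    y_xor = [0, 1, 1, 0]
    print(interaction_information(x1, x2, y_xor))
```

In the XOR toy case neither marker alone carries information about the phenotype, yet together they determine it, which is the kind of dependence a purely marginal analysis would miss.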
Affiliation(s)
- Jan Mielniczuk
- Institute of Computer Science, Polish Academy of Sciences, Poland; Faculty of Mathematics and Information Science, Warsaw University of Technology, Poland
| | - Paweł Teisseyre
- Institute of Computer Science, Polish Academy of Sciences, Poland
| |
|
191
|
Mazaya M, Trinh HC, Kwon YK. Construction and analysis of gene-gene dynamics influence networks based on a Boolean model. BMC SYSTEMS BIOLOGY 2017; 11:133. [PMID: 29322926 PMCID: PMC5763298 DOI: 10.1186/s12918-017-0509-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 01/03/2023]
Abstract
BACKGROUND Identification of novel gene-gene relations is a crucial issue to understand system-level biological phenomena. To this end, many methods based on a correlation analysis of gene expressions or structural analysis of molecular interaction networks have been proposed. They have a limitation in identifying more complicated gene-gene dynamical relations, though. RESULTS To overcome this limitation, we proposed a measure to quantify a gene-gene dynamical influence (GDI) using a Boolean network model and constructed a GDI network to indicate existence of a dynamical influence for every ordered pair of genes. It represents how much a state trajectory of a target gene is changed by a knockout mutation subject to a source gene in a gene-gene molecular interaction (GMI) network. Through a topological comparison between GDI and GMI networks, we observed that the former network is denser than the latter network, which implies that there exist many gene pairs of dynamically influencing but molecularly non-interacting relations. In addition, a larger number of hub genes were generated in the GDI network. On the other hand, there was a correlation between these networks such that the degree value of a node was positively correlated to each other. We further investigated the relationships of the GDI value with structural properties and found that there are negative and positive correlations with the length of a shortest path and the number of paths, respectively. In addition, a GDI network could predict a set of genes whose steady-state expression is affected in E. coli gene-knockout experiments. More interestingly, we found that the drug-targets with side-effects have a larger number of outgoing links than the other genes in the GDI network, which implies that they are more likely to influence the dynamics of other genes. 
Finally, we found biological evidence showing that the gene pairs which are not molecularly interacting but dynamically influential can be considered for novel gene-gene relationships. CONCLUSION Taken together, construction and analysis of the GDI network can be a useful approach to identify novel gene-gene relationships in terms of the dynamical influence.
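The idea of measuring how much a knockout of a source gene changes the state trajectory of a target gene can be sketched with a tiny synchronous Boolean network. The influence score below (the fraction of time steps at which the target differs between wild-type and knockout runs, averaged over all initial states) is a simplified stand-in and not the GDI measure defined in the paper; the network, gene names, and function names are hypothetical.

```python
from itertools import product


def simulate(rules, state, steps, knockout=None):
    """Synchronously update a Boolean network; a knocked-out gene is held at False."""
    state = dict(state)
    if knockout is not None:
        state[knockout] = False
    trajectory = [dict(state)]
    for _ in range(steps):
        nxt = {gene: bool(rule(state)) for gene, rule in rules.items()}
        if knockout is not None:
            nxt[knockout] = False
        state = nxt
        trajectory.append(dict(state))
    return trajectory


def influence(rules, source, target, steps=10):
    """Average fraction of time steps where target differs between wild type and knockout."""
    genes = sorted(rules)
    total = 0.0
    count = 0
    for values in product([False, True], repeat=len(genes)):
        init = dict(zip(genes, values))
        wild = simulate(rules, init, steps)
        mutant = simulate(rules, init, steps, knockout=source)
        differing = sum(w[target] != m[target] for w, m in zip(wild, mutant))
        total += differing / (steps + 1)
        count += 1
    return total / count


# Hypothetical network: A sustains itself, A activates B, B activates C, D is isolated.
rules = {
    "A": lambda s: s["A"],
    "B": lambda s: s["A"],
    "C": lambda s: s["B"],
    "D": lambda s: s["D"],
}

if __name__ == "__main__":
    print(influence(rules, "A", "C"))  # A has no direct link to C but still affects it
    print(influence(rules, "D", "C"))  # D is disconnected from C
```

In this toy network A and C share no direct interaction, yet knocking out A alters C's trajectory, mirroring the abstract's observation that dynamically influencing gene pairs need not interact molecularly.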
Affiliation(s)
- Maulida Mazaya
- Department of Electrical/Electronic and Computer Engineering, University of Ulsan, 93 Daehak-ro, Nam-gu, Ulsan, 44610 Republic of Korea
| | - Hung-Cuong Trinh
- Department of Electrical/Electronic and Computer Engineering, University of Ulsan, 93 Daehak-ro, Nam-gu, Ulsan, 44610 Republic of Korea
| | - Yung-Keun Kwon
- Department of Electrical/Electronic and Computer Engineering, University of Ulsan, 93 Daehak-ro, Nam-gu, Ulsan, 44610 Republic of Korea
| |
|
192
|
Chihuri S, Li G, Chen Q. Interaction of marijuana and alcohol on fatal motor vehicle crash risk: a case-control study. Inj Epidemiol 2017; 4:8. [PMID: 28286930 PMCID: PMC5357617 DOI: 10.1186/s40621-017-0105-z] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.0] [Received: 10/14/2016] [Accepted: 02/20/2017] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Concurrent use of marijuana and alcohol in drivers is of increasing concern but its role in crash causation has not been well understood. METHODS Using a case-control design, we assessed the individual and joint effects of marijuana and alcohol use on fatal crash risk. Cases (n = 1944) were drivers fatally injured in motor vehicle crashes in the United States at specific times in 2006, 2007 and 2008. Controls (n = 7719) were drivers who participated in the 2007 National Roadside Survey of Alcohol and Drug Use by Drivers. RESULTS Overall, cases were significantly more likely than controls to test positive for marijuana (12.2% vs. 5.9%, p < 0.0001), alcohol (57.8% vs. 7.7%, p < 0.0001) and both marijuana and alcohol (8.9% vs. 0.8%, p < 0.0001). Compared to drivers testing negative for alcohol and marijuana, the adjusted odds ratios of fatal crash involvement were 16.33 [95% confidence interval (CI): 14.23, 18.75] for those testing positive for alcohol and negative for marijuana, 1.54 (95% CI: 1.16, 2.03) for those testing positive for marijuana and negative for alcohol, and 25.09 (95% CI: 17.97, 35.03) for those testing positive for both alcohol and marijuana. CONCLUSIONS Alcohol use and marijuana use are each associated with significantly increased risks of fatal crash involvement. When alcohol and marijuana are used together, there exists a positive synergistic effect on fatal crash risk on the additive scale.
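The additive-scale synergy stated in the CONCLUSIONS can be checked against the reported adjusted odds ratios. The sketch below uses the relative excess risk due to interaction (RERI), a common additive-scale measure; whether the authors used exactly this measure is an assumption, and odds ratios are treated here as stand-ins for relative risks.

```python
def reri(or_both, or_a_only, or_b_only):
    """Relative excess risk due to interaction; a value above 0 suggests additive synergy."""
    return or_both - or_a_only - or_b_only + 1.0


def multiplicative_ratio(or_both, or_a_only, or_b_only):
    """Joint effect divided by the product of single effects; 1 means no multiplicative interaction."""
    return or_both / (or_a_only * or_b_only)


# Adjusted odds ratios quoted in the abstract (reference: negative for both substances).
OR_ALCOHOL_ONLY = 16.33
OR_MARIJUANA_ONLY = 1.54
OR_BOTH = 25.09

additive = reri(OR_BOTH, OR_ALCOHOL_ONLY, OR_MARIJUANA_ONLY)
ratio = multiplicative_ratio(OR_BOTH, OR_ALCOHOL_ONLY, OR_MARIJUANA_ONLY)

if __name__ == "__main__":
    print(f"RERI = {additive:.2f}")                  # 25.09 - 16.33 - 1.54 + 1 = 8.22
    print(f"multiplicative ratio = {ratio:.3f}")
```

The positive RERI is consistent with the abstract's claim of synergy on the additive scale, while a multiplicative ratio close to 1 would indicate little departure from a multiplicative model; the two scales can therefore tell different stories for the same data.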
Affiliation(s)
- Stanford Chihuri
- Center for Injury Epidemiology and Prevention, Columbia University Medical Center, 622 West 168th St, PH5-505, New York, NY 10032 USA
- Department of Anesthesiology, College of Physicians and Surgeons, Columbia University, New York, NY USA
| | - Guohua Li
- Center for Injury Epidemiology and Prevention, Columbia University Medical Center, 622 West 168th St, PH5-505, New York, NY 10032 USA
- Department of Anesthesiology, College of Physicians and Surgeons, Columbia University, New York, NY USA
- Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, NY USA
| | - Qixuan Chen
- Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, NY USA
| |
|
193
|
Vahdati AR, Wagner A. Population Size Affects Adaptation in Complex Ways: Simulations on Empirical Adaptive Landscapes. Evol Biol 2017. [DOI: 10.1007/s11692-017-9440-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
|
194
|
Fuku N, Díaz-Peña R, Arai Y, Abe Y, Zempo H, Naito H, Murakami H, Miyachi M, Spuch C, Serra-Rexach JA, Emanuele E, Hirose N, Lucia A. Epistasis, physical capacity-related genes and exceptional longevity: FNDC5 gene interactions with candidate genes FOXOA3 and APOE. BMC Genomics 2017; 18:803. [PMID: 29143599 PMCID: PMC5688477 DOI: 10.1186/s12864-017-4194-4] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Indexed: 02/03/2023] Open
Abstract
BACKGROUND Forkhead box O3A (FOXOA3) and apolipoprotein E (APOE) are arguably the strongest gene candidates to influence human exceptional longevity (EL, i.e., being a centenarian), but inconsistency exists among cohorts. Epistasis, defined as the effect of one locus being dependent on the presence of 'modifier genes', may help explain the missing heritability of complex phenotypes such as EL. We assessed the potential association of epistasis among candidate polymorphisms related to physical capacity, as well as antioxidant defense and cardiometabolic traits, and EL in the Japanese population. A total of 1565 individuals were studied, subdivided into 822 middle-aged controls and 743 centenarians. RESULTS We found a FOXOA3 rs2802292 T-allele-dependent association of fibronectin type III domain-containing 5 (FNDC5) rs16835198 with EL: the frequency of carriers of the FOXOA3 rs2802292 T-allele among individuals with the rs16835198 GG genotype was significantly higher in cases than in controls (P < 0.05). On the other hand, among non-carriers of the APOE 'risk' ε4-allele, the frequency of the FNDC5 rs16835198 G-allele was higher in cases than in controls (48.4% vs. 43.6%, P < 0.05). Among carriers of the 'non-risk' APOE ε2-allele, the frequency of the rs16835198 G-allele was higher in cases than in controls (49% vs. 37.3%, P < 0.05). CONCLUSIONS The association of FNDC5 rs16835198 with EL seems to depend on the presence of the FOXOA3 rs2802292 T-allele and we report a novel association between FNDC5 rs16835198 stratified by the presence of the APOE ε2/ε4-allele and EL. More research on 'gene*gene' and 'gene*environment' effects is needed in the field of EL.
Affiliation(s)
- Noriyuki Fuku
- Graduate School of Health and Sports Science, Juntendo University, Chiba, Japan
- Roberto Díaz-Peña
- Hospital Universitari Institut Pere Mata, IISPV, URV. CIBERSAM, Reus, Spain; Facultad de Ciencias de la Salud, Universidad Autónoma de Chile, Talca, Chile
- Yasumichi Arai
- Center for Supercentenarian Medical Research, Keio University School of Medicine, Tokyo, Japan
- Yukiko Abe
- Center for Supercentenarian Medical Research, Keio University School of Medicine, Tokyo, Japan
- Hirofumi Zempo
- Graduate School of Health and Sports Science, Juntendo University, Chiba, Japan
- Hisashi Naito
- Graduate School of Health and Sports Science, Juntendo University, Chiba, Japan
- Haruka Murakami
- Department of Physical Activity Research; National Institutes of Biomedical Innovation, Health and Nutrition, Tokyo, Japan
- Motohiko Miyachi
- Department of Physical Activity Research; National Institutes of Biomedical Innovation, Health and Nutrition, Tokyo, Japan
- Carlos Spuch
- Neurology Group, Galicia Sur Health Research Institute (IIS Galicia Sur), Centro de investigación biomédica en red del área de salud mental (CIBERSAM), Vigo, Spain
- José A Serra-Rexach
- Centro de investigación biomédica en Envejecimiento y Fragilidad (CIBERFES), Madrid, Spain
- Nobuyoshi Hirose
- Graduate School of Health and Sports Science, Juntendo University, Chiba, Japan
- Alejandro Lucia
- European University and Research Institute i+12, Madrid, Spain
|
195
|
Fischer D. The R-package GenomicTools for multifactor dimensionality reduction and the analysis of (exploratory) Quantitative Trait Loci. Computer Methods and Programs in Biomedicine 2017; 151:171-177. [PMID: 28946999 DOI: 10.1016/j.cmpb.2017.08.012] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 08/18/2016] [Revised: 07/11/2017] [Accepted: 08/21/2017] [Indexed: 06/07/2023]
Abstract
BACKGROUND AND OBJECTIVES We introduce the R-package GenomicTools to perform, among other analyses, a Multifactor Dimensionality Reduction (MDR) for the identification of SNP-SNP interactions. The package further provides a new class of tests for an (exploratory) Quantitative Trait Loci analysis that overcomes some of the limitations of other popular (e)QTL approaches. Popular (e)QTL approaches that use linear models or ANOVA are often based on over-simplified models that have weak statistical properties and are not robust against outlying observations. METHOD The algorithm to calculate the MDR is well established. To speed up its calculation in R, we implemented it in C++. Further, our implementation also supports the combination of several MDR results into an MDR ensemble classifier. The (e)QTL test procedure is based on a generalized Mann-Whitney test that is tailored for directional alternatives, as they are present in an (e)QTL analysis. RESULTS Our package GenomicTools provides functions to determine SNP combinations that have the highest accuracy for an MDR classification problem. It also provides functions to combine the best MDR results into a joint ensemble classifier for improved classification results. Further, the (e)QTL analysis is based on a solid statistical theory. In addition, informative visualizations of the results are provided. CONCLUSION The new class of tests and methods presented here has an easy-to-apply syntax, so that researchers inexperienced in R can also apply our proposed methods and implementations. The package creates publication-ready figures and hence could be a valuable tool for genomic data analysis.
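The MDR idea mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the GenomicTools implementation or its API: the function names, the 0/1 genotype coding, and the toy data are all assumptions made for illustration. Each genotype combination of a SNP pair is pooled into a "high-risk" or "low-risk" group according to its case:control ratio, and the pair whose pooled groups classify the samples best is reported.

```python
from collections import Counter
from itertools import combinations


def mdr_accuracy(genotypes, labels, snp_pair):
    """Training accuracy of a two-SNP MDR classifier (illustrative only).

    Assumes both cases (label 1) and controls (label 0) are present.
    """
    i, j = snp_pair
    cases, controls = Counter(), Counter()
    for row, y in zip(genotypes, labels):
        cell = (row[i], row[j])
        if y == 1:
            cases[cell] += 1
        else:
            controls[cell] += 1
    # Overall case:control ratio, used as the risk threshold for every cell.
    threshold = sum(cases.values()) / sum(controls.values())
    # A genotype cell is "high risk" when its own ratio exceeds the threshold.
    high_risk = {
        cell
        for cell in set(cases) | set(controls)
        if cases[cell] > threshold * controls[cell]
    }
    correct = 0
    for row, y in zip(genotypes, labels):
        predicted = 1 if (row[i], row[j]) in high_risk else 0
        correct += predicted == y
    return correct / len(labels)


def best_pair(genotypes, labels):
    """Exhaustively search all SNP pairs for the highest MDR accuracy."""
    n_snps = len(genotypes[0])
    return max(
        combinations(range(n_snps), 2),
        key=lambda pair: mdr_accuracy(genotypes, labels, pair),
    )


# Hypothetical toy data: the label is the XOR of SNP 0 and SNP 1,
# so neither SNP has a marginal effect; SNP 2 is noise.
toy_genotypes = [
    (0, 0, 0), (0, 1, 1), (1, 0, 0), (1, 1, 1),
    (0, 0, 1), (0, 1, 0), (1, 0, 1), (1, 1, 0),
]
toy_labels = [0, 1, 1, 0, 0, 1, 1, 0]

print(best_pair(toy_genotypes, toy_labels))
```

In this toy example only the interacting pair separates cases from controls; a real analysis would add cross-validation rather than rely on training accuracy.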
Affiliation(s)
- Daniel Fischer
- Natural Resources Institute Finland (Luke), Myllytie 1, Jokioinen, Finland; University of Tampere, School of Health Sciences, Tampere, Finland. http://genomictools.danielfischer.name
|
196
|
Cowman T, Koyutürk M. Prioritizing tests of epistasis through hierarchical representation of genomic redundancies. Nucleic Acids Res 2017; 45:e131. [PMID: 28605458 PMCID: PMC5737499 DOI: 10.1093/nar/gkx505] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 08/09/2016] [Accepted: 05/29/2017] [Indexed: 11/14/2022] Open
Abstract
Epistasis is defined as a statistical interaction between two or more genomic loci in terms of their association with a phenotype of interest. Epistatic loci that are identified using data from Genome-Wide Association Studies (GWAS) provide insights into the interplay among multiple genetic factors, with applications including assessment of susceptibility to complex diseases, decision making in precision medicine, and gaining insights into disease mechanisms. Since the number of genomic loci assayed by GWAS is extremely large (usually in the order of millions), identification of epistatic loci is a statistically difficult and computationally intensive problem. Even when only pairwise interactions are considered, the size of the search space ranges from hundreds of millions to billions of locus pairs. The large number of statistical tests performed also makes sufficient type one error correction imperative. Consequently, efficient algorithms are required to filter the tests that are performed and evaluate large GWAS data sets in a reasonable amount of computation time. It has been observed that many pairwise tests are redundant due to correlations in their genotype values across samples, known as linkage disequilibrium. However, algorithms that have been developed for efficient identification of epistatic loci do not systematically exploit linkage disequilibrium. Here, we propose a new algorithm for fast epistasis detection based on hierarchical representation of linkage disequilibrium (LinDen). We utilize redundancies in genotype patterns between neighboring loci to generate a hierarchical structure and execute a branch-and-bound search to prioritize loci testing based on approximations of a test statistic for pairs of locus groups. The hierarchical organization of tests performed by LinDen allows for efficient scaling based on the screened loci. 
We test LinDen comprehensively on three data sets obtained from the Wellcome Trust Case Control Consortium: type two diabetes, psoriasis, and hypertension. Our results show that, compared to other state-of-the-art tools for fast epistasis detection, LinDen drastically reduces the number of tests performed while discovering statistically significant locus pairs. LinDen is implemented in C++ and is available as open source at http://compbio.case.edu/linden/.
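The scale of the pairwise search space described in this abstract can be checked with simple arithmetic. The Python sketch below is only an illustration: the locus counts are hypothetical, and Bonferroni correction is used as one familiar form of type one error control, not necessarily the correction applied by LinDen.

```python
from math import comb


def pairwise_tests(n_loci):
    """Number of distinct unordered locus pairs among n_loci loci."""
    return comb(n_loci, 2)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error at alpha."""
    return alpha / n_tests


# Hypothetical numbers of screened loci, chosen only to show the growth rate.
for n_loci in (20_000, 100_000, 1_000_000):
    n_tests = pairwise_tests(n_loci)
    threshold = bonferroni_threshold(0.05, n_tests)
    print(f"{n_loci:>9,} loci -> {n_tests:>15,} pairs, per-test alpha {threshold:.2e}")
```

Because the number of pairs grows quadratically with the number of loci, both the computation and the per-test significance threshold become demanding quickly, which is the motivation for filtering redundant tests.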
Affiliation(s)
- Tyler Cowman
- Electrical Engineering & Computer Science, Case Western Reserve University, Cleveland, OH 44106, USA
- Mehmet Koyutürk
- Electrical Engineering & Computer Science, Case Western Reserve University, Cleveland, OH 44106, USA; Center for Proteomics & Bioinformatics, Case Western Reserve University, Cleveland, OH 44106, USA
|
197
|
Abstract
A long-standing goal in evolutionary biology is predicting evolution. Here, we show that the architecture of macromolecules fundamentally limits evolutionary predictability. Under physiological conditions, macromolecules, like proteins, flip between multiple structures, forming an ensemble of structures. A mutation affects all of these structures in slightly different ways, redistributing the relative probabilities of structures in the ensemble. As a result, mutations that follow the first mutation have a different effect than they would if introduced before. This implies that knowing the effects of every mutation in an ancestor would be insufficient to predict evolutionary trajectories past the first few steps, leading to profound unpredictability in evolution. We, therefore, conclude that detailed evolutionary predictions are not possible given the chemistry of macromolecules.
Evolutionary prediction is of deep practical and philosophical importance. Here we show, using a simple computational protein model, that protein evolution remains unpredictable, even if one knows the effects of all mutations in an ancestral protein background. We performed a virtual deep mutational scan, revealing the individual and pairwise epistatic effects of every mutation to our model protein, and then used this information to predict evolutionary trajectories. Our predictions were poor. This is a consequence of statistical thermodynamics. Proteins exist as ensembles of similar conformations. The effect of a mutation depends on the relative probabilities of conformations in the ensemble, which in turn, depend on the exact amino acid sequence of the protein. Accumulating substitutions alter the relative probabilities of conformations, thereby changing the effects of future mutations. This manifests itself as subtle but pervasive high-order epistasis. Uncertainty in the effect of each mutation accumulates and undermines prediction. 
Because conformational ensembles are an inevitable feature of proteins, this is likely universal.
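The thermodynamic argument in this abstract can be illustrated with a two-conformation toy model in Python. This is a sketch under stated assumptions, not the authors' protein model: the conformation energies, the per-conformation mutation effects, and the choice of ensemble free energy as the observable are all hypothetical.

```python
import math

RT = 0.593  # kcal/mol, roughly room temperature (assumed)


def ensemble_free_energy(energies):
    """Free energy of a Boltzmann-weighted ensemble: G = -RT ln sum_i exp(-G_i/RT)."""
    return -RT * math.log(sum(math.exp(-g / RT) for g in energies))


def apply_mutation(energies, effects):
    """Add a mutation's effect to each conformation's energy."""
    return [g + d for g, d in zip(energies, effects)]


# Hypothetical energies (kcal/mol) of two conformations of the ancestor.
ancestor = [0.0, 1.0]
# Hypothetical mutations that perturb the two conformations differently.
mutation_a = [2.0, 0.0]
mutation_b = [0.0, 1.5]
# Control: a mutation that shifts every conformation by the same amount.
uniform = [1.5, 1.5]


def effect(background, mutation):
    """Change in ensemble free energy caused by a mutation in a background."""
    return ensemble_free_energy(apply_mutation(background, mutation)) - \
        ensemble_free_energy(background)


after_a = apply_mutation(ancestor, mutation_a)
effect_b_in_ancestor = effect(ancestor, mutation_b)
effect_b_after_a = effect(after_a, mutation_b)

print(f"effect of B in ancestor: {effect_b_in_ancestor:.2f} kcal/mol")
print(f"effect of B after A:     {effect_b_after_a:.2f} kcal/mol")
```

In this toy model, mutation A shifts the population toward the second conformation, so mutation B, which mainly destabilizes that conformation, has a larger effect after A than in the ancestor. A mutation that shifts all conformations equally shows no such background dependence.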
|
198
|
Wu C, Jiang Y, Ren J, Cui Y, Ma S. Dissecting gene-environment interactions: A penalized robust approach accounting for hierarchical structures. Stat Med 2017; 37:437-456. [PMID: 29034484 DOI: 10.1002/sim.7518] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 12/06/2016] [Revised: 07/30/2017] [Accepted: 09/07/2017] [Indexed: 12/26/2022]
Abstract
Identification of gene-environment (G × E) interactions associated with disease phenotypes has posed a great challenge in high-throughput cancer studies. The existing marginal identification methods have suffered from not being able to accommodate the joint effects of a large number of genetic variants, while some of the joint-effect methods have been limited by failing to respect the "main effects, interactions" hierarchy, by ignoring data contamination, and by using inefficient selection techniques under complex structural sparsity. In this article, we develop an effective penalization approach to identify important G × E interactions and main effects, which can account for the hierarchical structures of the 2 types of effects. Possible data contamination is accommodated by adopting the least absolute deviation loss function. The advantage of the proposed approach over the alternatives is convincingly demonstrated in both simulation and a case study on lung cancer prognosis with gene expression measurements and clinical covariates under the accelerated failure time model.
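Why a least absolute deviation (LAD) loss accommodates contaminated data can be seen in a small Python sketch. It is only an illustration with hypothetical values: the penalty shown is a plain lasso term, not the hierarchical penalty developed in the article, and the location-only example ignores covariates and censoring.

```python
import statistics


def lad_loss(values, center):
    """Least absolute deviation loss of a single location estimate."""
    return sum(abs(v - center) for v in values)


def squared_loss(values, center):
    """Least squares loss of a single location estimate."""
    return sum((v - center) ** 2 for v in values)


def lasso_penalized_lad(residuals, coefficients, lam):
    """LAD loss plus a plain L1 penalty (a stand-in, not the article's penalty)."""
    return sum(abs(r) for r in residuals) + lam * sum(abs(b) for b in coefficients)


# Hypothetical measurements; the last value of the second list is contaminated.
clean = [1.0, 2.0, 3.0, 4.0, 5.0]
contaminated = [1.0, 2.0, 3.0, 4.0, 50.0]

# The median minimizes the LAD loss and the mean minimizes the squared loss.
print("median:", statistics.median(clean), "->", statistics.median(contaminated))
print("mean:  ", statistics.mean(clean), "->", statistics.mean(contaminated))
```

A single outlier leaves the LAD minimizer (the median) unchanged while pulling the least squares minimizer (the mean) far from the bulk of the data, which is the sense in which the LAD loss is robust.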
Affiliation(s)
- Cen Wu
- Department of Statistics, Kansas State University, Manhattan, KS 66506, USA
- Yu Jiang
- Division of Epidemiology, Biostatistics, and Environmental Health, University of Memphis, Memphis, TN 38111, USA
- Jie Ren
- Department of Statistics, Kansas State University, Manhattan, KS 66506, USA
- Yuehua Cui
- Department of Statistics and Probability, Michigan State University, 619 Red Cedar Rd, East Lansing, MI 48824, USA
- Shuangge Ma
- Department of Biostatistics, Yale University, 60 College Street, New Haven, CT 06520, USA
|
199
|
Stankovic M, Nikolic A, Nagorni-Obradovic L, Petrovic-Stanojevic N, Radojkovic D. Gene–Gene Interactions Between Glutathione S-Transferase M1 and Matrix Metalloproteinases 1, 9, and 12 in Chronic Obstructive Pulmonary Disease in Serbians. COPD 2017; 14:581-589. [DOI: 10.1080/15412555.2017.1369022] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 12/15/2022]
Affiliation(s)
- Marija Stankovic
- Institute of Molecular Genetics and Genetic Engineering, University of Belgrade, Belgrade, Serbia
- Aleksandra Nikolic
- Institute of Molecular Genetics and Genetic Engineering, University of Belgrade, Belgrade, Serbia
- Ljudmila Nagorni-Obradovic
- Clinic for Pulmonary Diseases, Clinical Centre of Serbia, Belgrade, Serbia
- School of Medicine, University of Belgrade, Belgrade, Serbia
- Natasa Petrovic-Stanojevic
- Department of Pulmonology, Zvezdara University Medical Center, Belgrade, Serbia
- School of Dentistry, University of Belgrade, Belgrade, Serbia
- Dragica Radojkovic
- Institute of Molecular Genetics and Genetic Engineering, University of Belgrade, Belgrade, Serbia
|
200
|
Niche harmony search algorithm for detecting complex disease associated high-order SNP combinations. Sci Rep 2017; 7:11529. [PMID: 28912584 PMCID: PMC5599559 DOI: 10.1038/s41598-017-11064-9] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 11/04/2016] [Accepted: 08/17/2017] [Indexed: 02/01/2023] Open
Abstract
Detecting high-order disease-causing models in genome-wide association studies is especially challenging due to model diversity, the possibly low or even absent marginal effects of a model, and the extraordinary search and computation required. In this paper, we propose a niche harmony search algorithm in which joint entropy is used as a heuristic factor to guide the search for models with low or no marginal effect, and two computationally lightweight scores are selected to evaluate and adapt to diverse disease models. In order to obtain all possible suspected pathogenic models, a niche technique is merged with harmony search (HS), serving as a taboo region that keeps HS from becoming trapped in local search. From the resultant set of candidate SNP combinations, we use the G-test statistic to test for true positives. Experiments were performed on twenty typical simulation datasets, in which 12 models have marginal effects and eight have none. Our results indicate that the proposed algorithm has very high detection power for searching suspected disease models in the first stage, and that it is superior to some typical existing approaches in both detection power and CPU runtime for all these datasets. Application to age-related macular degeneration (AMD) demonstrates that our method is promising for detecting high-order disease-causing models.
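The G-test used in the second stage of this approach can be sketched in a few lines of Python. The statistic is the standard likelihood-ratio form, G = 2 Σ O ln(O/E), computed on a contingency table of genotype combinations against case/control status. The function names and the counts below are hypothetical, and this is not the authors' code.

```python
import math


def g_test_statistic(table):
    """G = 2 * sum(O * ln(O / E)) over the cells of a contingency table.

    Rows are genotype combinations; columns are (cases, controls).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    g = 0.0
    for r, row in enumerate(table):
        for c, observed in enumerate(row):
            if observed == 0:
                continue  # a cell with zero count contributes nothing
            expected = row_totals[r] * col_totals[c] / total
            g += observed * math.log(observed / expected)
    return 2.0 * g


def degrees_of_freedom(table):
    """Degrees of freedom for a test of independence."""
    return (len(table) - 1) * (len(table[0]) - 1)


# Hypothetical counts for three genotype combinations.
associated = [[30, 10], [20, 20], [10, 30]]
independent = [[10, 10], [20, 20]]

print(round(g_test_statistic(associated), 2), degrees_of_freedom(associated))
print(round(g_test_statistic(independent), 2), degrees_of_freedom(independent))
```

Under the null hypothesis of independence, G approximately follows a chi-squared distribution with the degrees of freedom shown, so the statistic for the first table would be compared against the critical value 5.991 for two degrees of freedom at the 0.05 level.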
|