1
Zuo Y, Pei Y, Li Y, Wen S, Ren X, Li L, Wu Y, Hu Z. The synergism between metabolic and target-site resistance enhances the intensity of resistance to pyrethroids in Spodoptera exigua. INSECT BIOCHEMISTRY AND MOLECULAR BIOLOGY 2025; 180:104313. [PMID: 40233841 DOI: 10.1016/j.ibmb.2025.104313] [Received: 02/13/2025] [Revised: 03/20/2025] [Accepted: 04/13/2025] [Indexed: 04/17/2025]
Abstract
The widespread application of insecticides imposes intense selective pressure on pest populations, driving the evolution of high-level resistance and leading to frequent pest control failures. Insecticide resistance is mediated primarily through two mechanisms: target-site insensitivity and enhanced metabolic detoxification. However, the potential interactions and synergistic effects between these mechanisms remain largely unexplored. In this study, we demonstrate a striking cooperative interaction between these two major resistance mechanisms in a field-derived strain of Spodoptera exigua (QP19) exhibiting extreme resistance (631-fold) to the pyrethroid insecticide lambda-cyhalothrin. Through genetic mapping and linkage analysis, we identified that this resistance phenotype is conferred by the combined effects of overexpression of the P450 CYP9A9 (two copies: CYP9A9a and CYP9A9b) and a target-site mutation (L1014F, kdr) in the voltage-gated sodium channel. Using an introgression approach, we generated two near-isogenic strains: WH-kdr, carrying only the target-site resistance allele (6.2-fold resistance), and WH-CYP9A, harboring only the metabolic resistance genes (79-fold resistance), both compared to the susceptible WH-S strain. CRISPR/Cas9-mediated knockout of both CYP9A9 copies in the QP19 strain dramatically reduced resistance from 631-fold to 19-fold, while transgenic expression of the CYP9A9a variant (containing three amino acid substitutions) from the QP19 strain in Helicoverpa armigera conferred 39-fold resistance to lambda-cyhalothrin. Together, these findings provide compelling evidence that target-site resistance can significantly potentiate metabolic resistance, resulting in substantially higher resistance levels than either mechanism alone in S. exigua. They also advance the understanding of high-level resistance mediated by interactions between resistance genes and provide a theoretical basis for devising insecticide resistance management strategies.
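As a rough check on the fold-resistance figures quoted in this abstract, the Python sketch below compares the observed 631-fold resistance with two simple null expectations built from the single-mechanism strains. The additive and multiplicative null models and all variable names are illustrative assumptions, not the authors' analysis; the near-isogenic strains and QP19 also differ in genetic background, so the comparison is only indicative.

```python
# Fold-resistance values quoted in the abstract (relative to the susceptible WH-S strain).
rr_kdr = 6.2         # WH-kdr: target-site mutation only
rr_cyp9a = 79.0      # WH-CYP9A: metabolic resistance genes only
rr_observed = 631.0  # QP19: both mechanisms present

# Two simple null expectations for a strain carrying both mechanisms.
# These are assumptions made for illustration, not the paper's model.
# Additive: excess resistance over the baseline (RR = 1) adds up.
rr_additive = 1.0 + (rr_kdr - 1.0) + (rr_cyp9a - 1.0)
# Multiplicative: fold changes multiply.
rr_multiplicative = rr_kdr * rr_cyp9a

print(f"additive expectation:       {rr_additive:.1f}-fold")
print(f"multiplicative expectation: {rr_multiplicative:.1f}-fold")
print(f"observed / multiplicative:  {rr_observed / rr_multiplicative:.2f}")
```

Under these assumed null models the observed value exceeds both expectations, which is consistent with the abstract's description of the two mechanisms as cooperating.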
Affiliation(s)
- Yayun Zuo
- State Key Laboratory of Crop Stress Biology for Arid Areas, Northwest A&F University, Yangling, 712100, Shaanxi, China; Key Laboratory for Botanical Pesticide R&D of Shaanxi Province, Yangling, 712100, Shaanxi, China
- Yakun Pei
- Key Laboratory of Integrated Pest Management on the Loess Plateau of Ministry of Agriculture and Rural Affairs, Key Laboratory of Plant Protection Resources and Pest Management of Ministry of Education, College of Plant Protection, Northwest A&F University, Yangling, 712100, Shaanxi, China; Key Laboratory for Botanical Pesticide R&D of Shaanxi Province, Yangling, 712100, Shaanxi, China
- Yuan Li
- State Key Laboratory of Crop Stress Biology for Arid Areas, Northwest A&F University, Yangling, 712100, Shaanxi, China; Key Laboratory of Integrated Pest Management on the Loess Plateau of Ministry of Agriculture and Rural Affairs, Key Laboratory of Plant Protection Resources and Pest Management of Ministry of Education, College of Plant Protection, Northwest A&F University, Yangling, 712100, Shaanxi, China
- Shuang Wen
- State Key Laboratory of Crop Stress Biology for Arid Areas, Northwest A&F University, Yangling, 712100, Shaanxi, China; Key Laboratory of Integrated Pest Management on the Loess Plateau of Ministry of Agriculture and Rural Affairs, Key Laboratory of Plant Protection Resources and Pest Management of Ministry of Education, College of Plant Protection, Northwest A&F University, Yangling, 712100, Shaanxi, China
- Xuan Ren
- State Key Laboratory of Crop Stress Biology for Arid Areas, Northwest A&F University, Yangling, 712100, Shaanxi, China; Key Laboratory of Integrated Pest Management on the Loess Plateau of Ministry of Agriculture and Rural Affairs, Key Laboratory of Plant Protection Resources and Pest Management of Ministry of Education, College of Plant Protection, Northwest A&F University, Yangling, 712100, Shaanxi, China
- Lin Li
- The Key Laboratory of Plant Immunity and College of Plant Protection, Nanjing Agricultural University, Nanjing, 210095, China
- Yidong Wu
- The Key Laboratory of Plant Immunity and College of Plant Protection, Nanjing Agricultural University, Nanjing, 210095, China.
- Zhaonong Hu
- Key Laboratory of Integrated Pest Management on the Loess Plateau of Ministry of Agriculture and Rural Affairs, Key Laboratory of Plant Protection Resources and Pest Management of Ministry of Education, College of Plant Protection, Northwest A&F University, Yangling, 712100, Shaanxi, China; Key Laboratory for Botanical Pesticide R&D of Shaanxi Province, Yangling, 712100, Shaanxi, China.

2
Li S, Arora S, Attaoua R, Hamet P, Tremblay J, Bihlo A, Liu B, Rutter G. Leveraging hierarchical structures for genetic block interaction studies using the hierarchical transformer. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2025:2024.11.18.24317486. [PMID: 39606365 PMCID: PMC11601704 DOI: 10.1101/2024.11.18.24317486] [Indexed: 11/29/2024]
Abstract
Initially introduced in 1909 by William Bateson, classic epistasis (genetic variant interaction) refers to the phenomenon in which one variant prevents a variant at a different locus from manifesting its effects. The potential effects of genetic variant interactions on complex diseases have been recognized for decades. Moreover, it has been demonstrated that leveraging the combined SNP effects within a genetic block can significantly increase calculation power and reduce background noise, ultimately leading to novel epistasis discoveries that single-SNP statistical epistasis studies might overlook. However, it remains an open question how best to combine gene structure representation modelling and interaction learning into an end-to-end model for gene interaction searching. In the current study, we developed a neural genetic block interaction searching model that can effectively process large SNP chip inputs and output a heatmap of potential genetic block interactions. Our model augments a previously published hierarchical transformer architecture (Liu and Lapata, 2019) with the ability to model genetic blocks. The cross-block relationship mapping was achieved via a hierarchical attention mechanism that allows the sharing of information regarding specific phenotypes, as opposed to simple unsupervised dimensionality reduction methods such as PCA. Results on both simulation and UK Biobank studies show that our model brings substantial improvements compared to traditional exhaustive searching and neural network methods.
Affiliation(s)
- Shiying Li
- Centre de Recherche du CHUM, and Faculty of Medicine, University of Montreal, QC, Canada
- Shivam Arora
- Department of Mathematics and Statistics, Memorial University of Newfoundland, NL, Canada
- Redha Attaoua
- Centre de Recherche du CHUM, and Faculty of Medicine, University of Montreal, QC, Canada
- Pavel Hamet
- Centre de Recherche du CHUM, and Faculty of Medicine, University of Montreal, QC, Canada
- Johanne Tremblay
- Centre de Recherche du CHUM, and Faculty of Medicine, University of Montreal, QC, Canada
- Alexander Bihlo
- Department of Mathematics and Statistics, Memorial University of Newfoundland, NL, Canada
- Bang Liu
- Département d’informatique et de recherche opérationnelle, Université de Montréal, QC, Canada
- Guy Rutter
- Centre de Recherche du CHUM, and Faculty of Medicine, University of Montreal, QC, Canada
- Section of Cell Biology and Functional Genomics, Department of Metabolism, Diabetes and Reproduction, Imperial College London, Du Cane Road, London W12 0NN, United Kingdom
- Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore

3
Sha Z, Freda PJ, Bhandary P, Ghosh A, Matsumoto N, Moore JH, Hu T. Distinct network patterns emerge from Cartesian and XOR epistasis models: a comparative network science analysis. BioData Min 2024; 17:61. [PMID: 39732697 DOI: 10.1186/s13040-024-00413-w] [Received: 05/09/2024] [Accepted: 12/09/2024] [Indexed: 12/30/2024] Open
Abstract
BACKGROUND Epistasis, the phenomenon where the effect of one gene (or variant) is masked or modified by one or more other genes, significantly contributes to the phenotypic variance of complex traits. Traditionally, epistasis has been modeled using the Cartesian epistatic model, a multiplicative approach based on standard statistical regression. However, a recent study investigating epistasis in obesity-related traits has identified potential limitations of the Cartesian epistatic model, revealing that it likely only detects a fraction of the genetic interactions occurring in natural systems. In contrast, the exclusive-or (XOR) epistatic model has shown promise in detecting a broader range of epistatic interactions and revealing more biologically relevant functions associated with interacting variants. To investigate whether the XOR epistatic model also forms distinct network structures compared to the Cartesian model, we applied network science to examine genetic interactions underlying body mass index (BMI) in rats (Rattus norvegicus).
RESULTS Our comparative analysis of XOR and Cartesian epistatic models in rats reveals distinct topological characteristics. The XOR model exhibits enhanced sensitivity to epistatic interactions between the network communities found in the Cartesian epistatic network, facilitating the identification of novel trait-related biological functions via community-based enrichment analysis. Additionally, the XOR network features triangle network motifs, indicative of higher-order epistatic interactions. This research also evaluates the impact of linkage disequilibrium (LD)-based edge pruning on network-based epistasis analysis, finding that LD-based edge pruning may lead to increased network fragmentation, which may hinder the effectiveness of network analysis for the investigation of epistasis. We confirmed through network permutation analysis that most XOR and Cartesian epistatic networks derived from the data display distinct structural properties compared to randomly shuffled networks.
CONCLUSIONS Collectively, these findings highlight the XOR model's ability to uncover meaningful biological associations and higher-order epistasis derived from lower-order network topologies. The introduction of community-based enrichment analysis and motif-based epistatic discovery emphasizes network science as a critical approach for advancing epistasis research and understanding complex genetic architectures.
Affiliation(s)
- Zhendong Sha
- School of Computing, Queen's University, 557 Goodwin Hall, 21-25 Union St, Kingston, K7L 2N8, Ontario, Canada
- Philip J Freda
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, 700 N. San Vicente Blvd., Pacific Design Center, Suite G540, West Hollywood, 90069, CA, USA
- Priyanka Bhandary
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, 700 N. San Vicente Blvd., Pacific Design Center, Suite G540, West Hollywood, 90069, CA, USA
- Attri Ghosh
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, 700 N. San Vicente Blvd., Pacific Design Center, Suite G540, West Hollywood, 90069, CA, USA
- Nicholas Matsumoto
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, 700 N. San Vicente Blvd., Pacific Design Center, Suite G540, West Hollywood, 90069, CA, USA
- Jason H Moore
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, 700 N. San Vicente Blvd., Pacific Design Center, Suite G540, West Hollywood, 90069, CA, USA.
- Ting Hu
- School of Computing, Queen's University, 557 Goodwin Hall, 21-25 Union St, Kingston, K7L 2N8, Ontario, Canada.

4
Tutaj H, Tomala K, Pirog A, Marszałek M, Korona R. Extreme positive epistasis for fitness in monosomic yeast strains. eLife 2024; 12:RP87455. [PMID: 39417696 PMCID: PMC11486488 DOI: 10.7554/elife.87455] [Indexed: 10/19/2024] Open
Abstract
The loss of a single chromosome in a diploid organism halves the dosage of many genes and is usually accompanied by a substantial decrease in fitness. We asked whether this decrease simply reflects the joint damage caused by individual gene dosage deficiencies. We measured the fitness effects of single heterozygous gene deletions in yeast and combined them for each chromosome. This predicted a negative growth rate, that is, lethality, for multiple monosomies. However, monosomic strains remained alive and grew as if much (often most) of the damage caused by single mutations had disappeared, revealing an exceptionally large and positive epistatic component of fitness. We looked for functional explanations by analyzing the transcriptomes. There was no evidence of increased (compensatory) gene expression on the monosomic chromosomes. Nor were there signs of the cellular stress response that would be expected if monosomy led to protein destabilization and thus cytotoxicity. Instead, all monosomic strains showed extensive upregulation of genes encoding ribosomal proteins, but in an indiscriminate manner that did not correspond to their altered dosage. This response did not restore the stoichiometry required for efficient biosynthesis, which probably became growth limiting, making all other mutation-induced metabolic defects much less important. In general, the modular structure of the cell leads to an effective fragmentation of the total mutational load. Defects outside the module(s) currently defining fitness lose at least some of their relevance, producing the epiphenomenon of positive interactions between individually negative effects.
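The comparison described in this abstract, a prediction built by combining single-deletion effects set against the growth actually observed, can be sketched in a few lines of Python. The additive combination rule, the function names, and every number below are hypothetical and chosen only to show how summed deficits can predict a negative growth rate; the abstract does not give the authors' exact procedure or data.

```python
def additive_expectation(deficits, wild_type_rate=1.0):
    """Predicted relative growth rate if single-deletion deficits simply add up.

    Assumption: each deficit is expressed as a fraction of the wild-type rate.
    """
    return wild_type_rate - sum(deficits)


def epistasis(observed_rate, deficits, wild_type_rate=1.0):
    """Deviation of the observed growth rate from the additive expectation."""
    return observed_rate - additive_expectation(deficits, wild_type_rate)


# Hypothetical chromosome: 60 genes, each costing 2% of wild-type growth when hemizygous.
deficits = [0.02] * 60
observed = 0.6  # hypothetical growth rate of the monosomic strain

expected = additive_expectation(deficits)  # below zero, i.e. predicted lethality
eps = epistasis(observed, deficits)        # positive value = positive epistasis

print(f"expected: {expected:.2f}, observed: {observed:.2f}, epistasis: {eps:+.2f}")
```

With these made-up values the additive prediction falls below zero while the strain still grows, so the epistatic term is large and positive, which is the qualitative pattern the abstract reports.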
Affiliation(s)
- Hanna Tutaj
- Institute of Environmental Sciences, Faculty of Biology, Jagiellonian University, Cracow, Poland
- Katarzyna Tomala
- Institute of Environmental Sciences, Faculty of Biology, Jagiellonian University, Cracow, Poland
- Adrian Pirog
- Institute of Environmental Sciences, Faculty of Biology, Jagiellonian University, Cracow, Poland
- Marzena Marszałek
- Institute of Environmental Sciences, Faculty of Biology, Jagiellonian University, Cracow, Poland
- Doctoral School of Exact and Natural Sciences, Jagiellonian University, Cracow, Poland
- Ryszard Korona
- Institute of Environmental Sciences, Faculty of Biology, Jagiellonian University, Cracow, Poland

5
Freda PJ, Ye S, Zhang R, Moore JH, Urbanowicz RJ. Assessing the limitations of relief-based algorithms in detecting higher-order interactions. BioData Min 2024; 17:37. [PMID: 39354639 PMCID: PMC11443793 DOI: 10.1186/s13040-024-00390-0] [Received: 08/06/2024] [Accepted: 09/04/2024] [Indexed: 10/03/2024] Open
Abstract
BACKGROUND Epistasis, the interaction between genetic loci where the effect of one locus is influenced by one or more other loci, plays a crucial role in the genetic architecture of complex traits. However, as the number of loci considered increases, the investigation of epistasis becomes exponentially more complex, making the selection of key features vital for effective downstream analyses. Relief-Based Algorithms (RBAs) are often employed for this purpose due to their reputation as "interaction-sensitive" algorithms and uniquely non-exhaustive approach. However, the limitations of RBAs in detecting interactions, particularly those involving multiple loci, have not been thoroughly defined. This study seeks to address this gap by evaluating the efficiency of RBAs in detecting higher-order epistatic interactions. Motivated by previous findings that suggest some RBAs may rank predictive features involved in higher-order epistasis negatively, we explore the potential of absolute value ranking of RBA feature weights as an alternative approach for capturing complex interactions. In this study, we assess the performance of ReliefF, MultiSURF, and MultiSURFstar on simulated genetic datasets that model various patterns of genotype-phenotype associations, including 2-way to 5-way genetic interactions, and compare their performance to two control methods: a random shuffle and mutual information.
RESULTS Our findings indicate that while RBAs effectively identify lower-order (2- to 3-way) interactions, their capability to detect higher-order interactions is significantly limited, primarily by large feature count but also by signal noise. Specifically, we observe that RBAs are successful in detecting fully penetrant 4-way XOR interactions using an absolute value ranking approach, but this is restricted to datasets with only 20 total features.
CONCLUSIONS These results highlight the inherent limitations of current RBAs and underscore the need for the development of Relief-based approaches with enhanced detection capabilities for the investigation of epistasis, particularly in datasets with large feature counts and complex higher-order interactions.
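The absolute value ranking idea mentioned in this abstract can be illustrated with a small Python sketch. The feature names and weights below are invented for illustration and are not output from ReliefF, MultiSURF, or MultiSURFstar; the point is only that a feature with a strongly negative weight sits at the bottom of a signed ranking but at the top of an absolute-value ranking.

```python
# Hypothetical Relief-style feature weights (not produced by any real run).
weights = {
    "snp_a": 0.12,
    "snp_b": -0.30,  # strongly negative: last when ranked by signed weight
    "snp_c": 0.01,
    "snp_d": -0.02,
    "snp_e": 0.25,
}


def rank_features(weights, use_absolute=False):
    """Return feature names ordered from highest to lowest score.

    With use_absolute=True the score is |weight|, so strongly negative
    weights are treated as informative rather than pushed to the bottom.
    """
    if use_absolute:
        return sorted(weights, key=lambda name: abs(weights[name]), reverse=True)
    return sorted(weights, key=lambda name: weights[name], reverse=True)


signed_ranking = rank_features(weights)
absolute_ranking = rank_features(weights, use_absolute=True)

print("signed:  ", signed_ranking)
print("absolute:", absolute_ranking)
```

In this toy case `snp_b` moves from last place to first once weights are ranked by magnitude.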
Affiliation(s)
- Philip J Freda
- Computational Biomedicine, Cedars-Sinai Medical Center, 700 N. San Vicente Blvd., Pacific Design Center, Suite G540, West Hollywood, 90069, CA, USA
- Suyu Ye
- Whiting School of Engineering, Johns Hopkins University, 3400 N. Charles St., Baltimore, 21218, MD, USA
- Robert Zhang
- University of Pennsylvania, Philadelphia, 19104, PA, USA
- Jason H Moore
- Computational Biomedicine, Cedars-Sinai Medical Center, 700 N. San Vicente Blvd., Pacific Design Center, Suite G540, West Hollywood, 90069, CA, USA
- Ryan J Urbanowicz
- Computational Biomedicine, Cedars-Sinai Medical Center, 700 N. San Vicente Blvd., Pacific Design Center, Suite G540, West Hollywood, 90069, CA, USA.

6
Toral-Rios D, Pichardo-Rojas P, Ruiz-Sánchez E, Rosas-Carrasco Ó, Carvajal-García R, Gálvez-Coutiño DC, Martínez-Rodríguez NL, Rubio-Chávez AD, Alcántara-Flores M, López-Ramírez A, Martínez-Rosas AR, Ruiz-Chow ÁA, Alonso-Vanegas M, Campos-Peña V. Synergistic Effect between the APOE ε4 Allele with Genetic Variants of GSK3B and MAPT: Differential Profile between Refractory Epilepsy and Alzheimer Disease. Int J Mol Sci 2024; 25:10228. [PMID: 39337715 PMCID: PMC11432663 DOI: 10.3390/ijms251810228] [Received: 08/14/2024] [Revised: 09/13/2024] [Accepted: 09/18/2024] [Indexed: 09/30/2024] Open
Abstract
Temporal Lobe Epilepsy (TLE) is a chronic neurological disorder characterized by recurrent focal seizures originating in the temporal lobe. Despite the variety of antiseizure drugs currently available to treat TLE, about 30% of cases continue to have seizures. The etiology of TLE is complex and multifactorial. Increasing evidence indicates that Alzheimer's disease (AD) and drug-resistant TLE present common pathological features that may induce hyperexcitability, especially aberrant hyperphosphorylation of tau protein. Genetic polymorphic variants located in genes of the microtubule-associated protein tau (MAPT) and glycogen synthase kinase-3β (GSK3B) have been associated with the risk of developing AD. The APOE ε4 allele is a major genetic risk factor for AD. Likewise, a gene-dose-dependent effect of ε4 seems to influence TLE. The present study aimed to investigate whether the APOE ε4 allele and genetic variants located in the MAPT and GSK3B genes are associated with the risk of developing AD and drug-resistant TLE in a cohort of the Mexican population. A significant association with the APOE ε4 allele was observed in patients with AD and TLE. Additional genetic interactions were identified between this allele and variants of the MAPT and GSK3B genes.
Affiliation(s)
- Danira Toral-Rios
- Department of Psychiatry, School of Medicine, Washington University, St. Louis, MO 63110, USA
- Pavel Pichardo-Rojas
- The Vivian L. Smith Department of Neurosurgery, McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Elizabeth Ruiz-Sánchez
- Neurochemistry Laboratory, National Institute of Neurology and Neurosurgery "Manuel Velasco Suárez", Mexico City 14269, Mexico
- Óscar Rosas-Carrasco
- Geriatric Assessment Center, Department of Health, Iberoamerican University, Mexico City 01219, Mexico
- Dey Carol Gálvez-Coutiño
- Experimental Laboratory of Neurodegenerative Diseases, National Institute of Neurology and Neurosurgery "Manuel Velasco Suárez", Mexico City 14269, Mexico
- Nancy Lucero Martínez-Rodríguez
- Epidemiological Research Unit in Endocrinology and Nutrition, Children's Hospital of Mexico Federico Gómez, Mexico City 06720, Mexico
- Ana Daniela Rubio-Chávez
- High Specialty Medical Unit (UMAE), Specialty Hospital, National Medical Center (CMN), XXI Century, Mexico City 06720, Mexico
- Myr Alcántara-Flores
- Department of Psychiatry, National Institute of Neurology and Neurosurgery "Manuel Velasco Suárez", Mexico City 14269, Mexico
- Arely López-Ramírez
- Experimental Laboratory of Neurodegenerative Diseases, National Institute of Neurology and Neurosurgery "Manuel Velasco Suárez", Mexico City 14269, Mexico
- Alma Rosa Martínez-Rosas
- Cognition and Behavior Unit, National Institute of Neurology and Neurosurgery "Manuel Velasco Suárez", Mexico City 14269, Mexico
- Ángel Alberto Ruiz-Chow
- Department of Psychiatry, National Institute of Neurology and Neurosurgery "Manuel Velasco Suárez", Mexico City 14269, Mexico
- Mario Alonso-Vanegas
- Director of the International Center for Epilepsy Surgery, HMG-Coyoacan Hospital, Mexico City 04380, Mexico
- Victoria Campos-Peña
- Experimental Laboratory of Neurodegenerative Diseases, National Institute of Neurology and Neurosurgery "Manuel Velasco Suárez", Mexico City 14269, Mexico

7
Jang MJ, Tan LJ, Park MY, Shin S, Kim JM. Identification of interactions between genetic risk scores and dietary patterns for personalized prevention of kidney dysfunction in a population-based cohort. Nutr Diabetes 2024; 14:62. [PMID: 39143076 PMCID: PMC11325018 DOI: 10.1038/s41387-024-00316-z] [Received: 10/16/2023] [Revised: 07/09/2024] [Accepted: 07/16/2024] [Indexed: 08/16/2024] Open
Abstract
BACKGROUND & AIM Chronic kidney disease (CKD) is a heterogeneous disorder that affects the kidney structure and function. This study investigated the effect of the interaction between genetic factors and dietary pattern on kidney dysfunction in Korean adults.
METHODS Baseline data were obtained from the Ansan and Ansung Study of the Korean Genome and Epidemiology Study involving 8230 participants aged 40-69 years. Kidney dysfunction was defined as an estimated glomerular filtration rate < 90 mL/minute/1.73 m². Genomic DNA isolated from peripheral blood was genotyped on the Affymetrix® Genome-Wide Human SNP Array 5.0. A genome-wide association study using a generalized linear model was performed on 1,590,162 single-nucleotide polymorphisms (SNPs). To select significant SNPs, the threshold criterion was set at P-value < 5 × 10⁻⁸. Linkage disequilibrium clumping was performed based on the R² value, and 94 SNPs had a significant effect. Participants were divided into two groups based on their genetic risk score (GRS): the low-GR group had GRS > 0, while the high-GR group had GRS ≤ 0.
RESULTS Three distinct dietary patterns were extracted, namely, the "prudent pattern," "flour-based and animal food pattern," and "white rice pattern," to analyze the effect of dietary pattern on kidney function. In the "flour-based and animal food pattern," higher pattern scores were associated with a higher prevalence of kidney dysfunction in both the low and high GR groups (P for trend < 0.0001 in the low-, high-GR groups of model 1; 0.0050 and 0.0065 in the low-, high-GR groups of model 2, respectively).
CONCLUSIONS The results highlight a significant association between the "flour-based and animal food pattern" and higher kidney dysfunction prevalence in individuals with both low and high GR. These findings suggest that personalized nutritional interventions based on GR profiles may become the basis for presenting GR-based individual dietary patterns for kidney dysfunction.
Affiliation(s)
- Min-Jae Jang
- Department of Animal Science and Technology, Chung-Ang University, Gyeonggi-do, 17546, Korea
- Li-Juan Tan
- Department of Food and Nutrition, Chung-Ang University, Gyeonggi-do, 17546, Korea
- Min Young Park
- Department of Molecular Pathobiology, NYU College of Dentistry, New York, USA
- Sangah Shin
- Department of Food and Nutrition, Chung-Ang University, Gyeonggi-do, 17546, Korea.
- Jun-Mo Kim
- Department of Animal Science and Technology, Chung-Ang University, Gyeonggi-do, 17546, Korea.

8
Batista S, Madar VS, Freda PJ, Bhandary P, Ghosh A, Matsumoto N, Chitre AS, Palmer AA, Moore JH. Interaction models matter: an efficient, flexible computational framework for model-specific investigation of epistasis. BioData Min 2024; 17:7. [PMID: 38419006 PMCID: PMC10900690 DOI: 10.1186/s13040-024-00358-0] [Received: 10/06/2023] [Accepted: 02/20/2024] [Indexed: 03/02/2024] Open
Abstract
PURPOSE Epistasis, the interaction between two or more genes, is integral to the study of genetics and is present throughout nature. Yet, it is seldom fully explored as most approaches primarily focus on single-locus effects, partly because analyzing all pairwise and higher-order interactions requires significant computational resources. Furthermore, existing methods for epistasis detection only consider a Cartesian (multiplicative) model for interaction terms. This is likely limiting as epistatic interactions can evolve to produce varied relationships between genetic loci, some complex and not linearly separable.
METHODS We present new algorithms for the interaction coefficients for standard regression models for epistasis that permit many varied models for the interaction terms for loci and efficient memory usage. The algorithms are given for two-way and three-way epistasis and may be generalized to higher-order epistasis. Statistical tests for the interaction coefficients are also provided. We also present an efficient matrix-based algorithm for permutation testing for two-way epistasis. We offer a proof and experimental evidence that methods that look for epistasis only at loci that have main effects may not be justified. Given the computational efficiency of the algorithm, we applied the method to a rat data set and mouse data set, with at least 10,000 loci and 1,000 samples each, using the standard Cartesian model and the XOR model to explore body mass index.
RESULTS This study reveals that although many of the loci found to exhibit significant statistical epistasis overlap between models in rats, the pairs are mostly distinct. Further, the XOR model found greater evidence for statistical epistasis in many more pairs of loci in both data sets with almost all significant epistasis in mice identified using XOR. In the rat data set, loci involved in epistasis under the XOR model are enriched for biologically relevant pathways.
CONCLUSION Our results in both species show that many biologically relevant epistatic relationships would have been undetected if only one interaction model was applied, providing evidence that varied interaction models should be implemented to explore epistatic interactions that occur in living systems.
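To make the contrast between the two interaction models named in this abstract concrete, the Python sketch below tabulates an interaction term for every pair of genotypes coded 0, 1, or 2. The Cartesian term is the usual product of the two codes. The XOR term shown here uses a parity coding, `(g1 + g2) % 2`, which is an assumption made for illustration; the abstract does not spell out the exact encoding, and the function names are hypothetical.

```python
from itertools import product


def cartesian_term(g1, g2):
    """Multiplicative (Cartesian) interaction term for genotypes coded 0/1/2."""
    return g1 * g2


def xor_term(g1, g2):
    """Parity-based XOR interaction term.

    Assumption: XOR is applied to the parity of the genotype codes; the
    paper's actual encoding may differ.
    """
    return (g1 + g2) % 2


# Interaction value under each model for all nine genotype combinations.
table = {
    (g1, g2): (cartesian_term(g1, g2), xor_term(g1, g2))
    for g1, g2 in product(range(3), repeat=2)
}

print("g1 g2 cartesian xor")
for (g1, g2), (cart, xor) in table.items():
    print(f"{g1:>2} {g2:>2} {cart:>9} {xor:>3}")
```

Under this coding the two models disagree on several genotype pairs: for example, (0, 1) has a Cartesian term of 0 but an XOR term of 1, while (1, 1) has a Cartesian term of 1 but an XOR term of 0. That kind of disagreement is one way distinct sets of locus pairs could reach significance under each model.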
Affiliation(s)
- Sandra Batista
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, 700 N San Vicente Blvd., Pacific Design Center, Suite G540, West Hollywood, CA, 90069, USA.
- Philip J Freda
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, 700 N San Vicente Blvd., Pacific Design Center, Suite G540, West Hollywood, CA, 90069, USA
- Priyanka Bhandary
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, 700 N San Vicente Blvd., Pacific Design Center, Suite G540, West Hollywood, CA, 90069, USA
- Attri Ghosh
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, 700 N San Vicente Blvd., Pacific Design Center, Suite G540, West Hollywood, CA, 90069, USA
- Nicholas Matsumoto
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, 700 N San Vicente Blvd., Pacific Design Center, Suite G540, West Hollywood, CA, 90069, USA
- Apurva S Chitre
- Department of Psychiatry, University of California, San Diego, 9500 Gilman Dr., Mailcode: 0667, La Jolla, CA, 92093-0667, USA
- Abraham A Palmer
- Department of Psychiatry, University of California, San Diego, 9500 Gilman Dr., Mailcode: 0667, La Jolla, CA, 92093-0667, USA
- Institute for Genomic Medicine, University of California, San Diego, 9500 Gilman Dr., Mailcode: 0667, La Jolla, CA, 92093-0667, USA
- Jason H Moore
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, 700 N San Vicente Blvd., Pacific Design Center, Suite G540, West Hollywood, CA, 90069, USA.

9
Yang X, Li X, Bao Q, Wang Z, He S, Qu X, Tang Y, Song B, Huang J, Yi G. Uncovering Evolutionary Adaptations in Common Warthogs through Genomic Analyses. Genes (Basel) 2024; 15:166. [PMID: 38397156 PMCID: PMC10888464 DOI: 10.3390/genes15020166] [Received: 11/20/2023] [Revised: 01/15/2024] [Accepted: 01/20/2024] [Indexed: 02/25/2024] Open
Abstract
In the Suidae family, warthogs show significant survival adaptability and trait specificity. This study offers a comparative genomic analysis between the warthog and other Suidae species, including the Luchuan pig (LC), Duroc pig (DRC), and Red River hog. By integrating the four genomes with sequences from the other four species, we identified 8868 single-copy orthologous genes. Based on the 8868 orthologous protein sequences, phylogenetic assessments highlighted divergence timelines and unique evolutionary branches within suid species. Warthogs occupy a different evolutionary branch from DRC and LC, and their divergence precedes the split between DRC and LC. Contraction and expansion analyses of warthog gene families were conducted to elucidate the mechanisms of their evolutionary adaptations. Based on annotation with the GO, KEGG, and MGI databases, warthogs showed preferential expansion of sensory genes and contraction of metabolic genes, underscoring phenotypic diversity and the direction of adaptive evolution. Associating genes with the QTLdb-pigSS11 database revealed links between gene families and immunity traits. The overlap of olfactory genes with immune-related QTL regions highlighted their importance in evolutionary adaptations. This work highlights the unique evolutionary strategies and adaptive mechanisms of warthogs, guiding future research into the distinct adaptability and disease resistance in pigs, particularly focusing on traits such as resistance to African Swine Fever Virus.
Affiliation(s)
- Xintong Yang
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China; (X.Y.); (X.L.); (Q.B.); (Z.W.); (S.H.); (X.Q.); (Y.T.); (B.S.)
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangxi University, Nanning 530005, China;
- Xingzheng Li
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China; (X.Y.); (X.L.); (Q.B.); (Z.W.); (S.H.); (X.Q.); (Y.T.); (B.S.)
- Qi Bao
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China; (X.Y.); (X.L.); (Q.B.); (Z.W.); (S.H.); (X.Q.); (Y.T.); (B.S.)
- Zhen Wang
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China; (X.Y.); (X.L.); (Q.B.); (Z.W.); (S.H.); (X.Q.); (Y.T.); (B.S.)
- Sang He
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China; (X.Y.); (X.L.); (Q.B.); (Z.W.); (S.H.); (X.Q.); (Y.T.); (B.S.)
- Xiaolu Qu
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China; (X.Y.); (X.L.); (Q.B.); (Z.W.); (S.H.); (X.Q.); (Y.T.); (B.S.)
- Yueting Tang
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China; (X.Y.); (X.L.); (Q.B.); (Z.W.); (S.H.); (X.Q.); (Y.T.); (B.S.)
- School of Life Sciences, Henan University, Kaifeng 475004, China
- Bangmin Song
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China; (X.Y.); (X.L.); (Q.B.); (Z.W.); (S.H.); (X.Q.); (Y.T.); (B.S.)
- School of Life Sciences, Henan University, Kaifeng 475004, China
- Jieping Huang
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangxi University, Nanning 530005, China;
- Guoqiang Yi
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China; (X.Y.); (X.L.); (Q.B.); (Z.W.); (S.H.); (X.Q.); (Y.T.); (B.S.)
- Kunpeng Institute of Modern Agriculture at Foshan, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Foshan 528226, China
- Bama Yao Autonomous County Rural Revitalization Research Institute, Bama 547500, China
10
Abstract
Despite monumental advances in molecular technology to generate genome sequence data at scale, there is still a considerable proportion of heritability in most complex diseases that remains unexplained. Because many of the discoveries have been single-nucleotide variants with small to moderate effects on disease, the functional implication of many of the variants is still unknown and, thus, we have limited new drug targets and therapeutics. We, and many others, posit that one primary factor that has limited our ability to identify novel drug targets from genome-wide association studies may be due to gene interactions (epistasis), gene-environment interactions, network/pathway effects, or multiomic relationships. We propose that many of these complex models explain much of the underlying genetic architecture of complex disease. In this review, we discuss the evidence from multiple research avenues, ranging from pairs of alleles to multiomic integration studies and pharmacogenomics, that supports the need for further investigation of gene interactions (or epistasis) in genetic and genomic studies of human disease. Our goal is to catalog the mounting evidence for epistasis in genetic studies and the connections between genetic interactions and human health and disease that could enable precision medicine of the future.
Affiliation(s)
- Pankhuri Singhal
- Genetics and Epigenetics Graduate Group, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
- Shefali Setia Verma
- Department of Pathology and Laboratory Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
- Marylyn D Ritchie
- Department of Genetics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA;
- Penn Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
11
Melograna F, Li Z, Galazzo G, van Best N, Mommers M, Penders J, Stella F, Van Steen K. Edge and modular significance assessment in individual-specific networks. Sci Rep 2023; 13:7868. [PMID: 37188794 PMCID: PMC10185658 DOI: 10.1038/s41598-023-34759-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2022] [Accepted: 05/07/2023] [Indexed: 05/17/2023]
Abstract
Individual-specific networks, defined as networks of nodes and connecting edges that are specific to an individual, are promising tools for precision medicine. When such networks are biological, interpretation of functional modules at an individual level becomes possible. An under-investigated problem is relevance or "significance" assessment of each individual-specific network. This paper proposes novel edge and module significance assessment procedures for weighted and unweighted individual-specific networks. Specifically, we propose a modular Cook's distance using a method that involves iterative modeling of one edge versus all the others within a module. Two procedures assessing changes between using all individuals and using all individuals but leaving one individual out (LOO) are proposed as well (LOO-ISN, MultiLOO-ISN), relying on empirically derived edges. We compare our proposals to competitors, including adaptations of OPTICS, kNN, and Spoutlier methods, by an extensive simulation study, templated on real-life scenarios for gene co-expression and microbial interaction networks. Results show the advantages of performing modular versus edge-wise significance assessments for individual-specific networks. Furthermore, modular Cook's distance is among the top performers across all considered simulation settings. Finally, the identification of outlying individuals with respect to their individual-specific networks is meaningful for precision medicine purposes, as confirmed by network analysis of microbiome abundance profiles.
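The leave-one-out idea mentioned in this abstract can be illustrated with a minimal sketch. This is not the LOO-ISN or MultiLOO-ISN procedure from the paper; it only shows, under assumed choices, how one edge weight changes when a single individual is removed. Pearson correlation as the edge weight, the function names `pearson` and `loo_edge_shifts`, and the toy data are all assumptions made for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def loo_edge_shifts(x, y):
    """For each individual q, return r(all) - r(all but q) for one edge."""
    r_all = pearson(x, y)
    shifts = []
    for q in range(len(x)):
        x_loo = x[:q] + x[q + 1:]  # drop individual q from both node profiles
        y_loo = y[:q] + y[q + 1:]
        shifts.append(r_all - pearson(x_loo, y_loo))
    return shifts

# Toy data (hypothetical): five individuals follow y = x, the sixth does not.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gene_b = [1.0, 2.0, 3.0, 4.0, 5.0, 0.0]

shifts = loo_edge_shifts(gene_a, gene_b)
# Index of the individual whose removal changes the edge weight the most.
most_influential = max(range(len(shifts)), key=lambda q: abs(shifts[q]))
print(most_influential, [round(s, 3) for s in shifts])
```

In this toy setting the sixth individual produces by far the largest shift, which is the kind of signal an edge-wise assessment would look at; a module-level assessment would combine such information across several edges.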
Affiliation(s)
- Federico Melograna
- BIO3 - Laboratory for Systems Medicine, Department of Human Genetics, KU Leuven, Leuven, Belgium.
- Zuqi Li
- BIO3 - Laboratory for Systems Medicine, Department of Human Genetics, KU Leuven, Leuven, Belgium
- Gianluca Galazzo
- School of Nutrition and Translational Research in Metabolism (NUTRIM), Department of Medical Microbiology Infectious Diseases and Infection Prevention, Maastricht University Medical Center+, Maastricht, The Netherlands
- Niels van Best
- Institute of Medical Microbiology, RWTH University Hospital Aachen, RWTH University, Aachen, Germany
- Department of Epidemiology, Care and Public Health Research Institute (CAPHRI), Maastricht University, Maastricht, The Netherlands
- Monique Mommers
- Department of Epidemiology, Care and Public Health Research Institute (CAPHRI), Maastricht University, Maastricht, The Netherlands
- John Penders
- School of Nutrition and Translational Research in Metabolism (NUTRIM), Department of Medical Microbiology Infectious Diseases and Infection Prevention, Maastricht University Medical Center+, Maastricht, The Netherlands
- Care and Public Health Research Institute (CAPHRI), Maastricht University, Maastricht, The Netherlands
- Fabio Stella
- Department of Informatics, Systems and Communication, University of Milano-Bicocca, 20126, Milan, Italy
- Kristel Van Steen
- BIO3 - Laboratory for Systems Medicine, Department of Human Genetics, KU Leuven, Leuven, Belgium
- BIO3 - Laboratory for Systems Genetics, GIGA-R Medical Genomics, University of Liège, Liège, Belgium
12
Abstract
BACKGROUND: Autoimmune hepatitis has an unknown cause and genetic associations that are not disease-specific or always present. Clarification of its missing causality and heritability could improve prevention and management strategies. AIMS: Describe the key epigenetic and genetic mechanisms that could account for missing causality and heritability in autoimmune hepatitis; indicate the prospects of these mechanisms as pivotal factors; and encourage investigations of their pathogenic role and therapeutic potential. METHODS: English abstracts were identified in PubMed using multiple key search phrases. Several hundred abstracts and 210 full-length articles were reviewed. RESULTS: Environmental induction of epigenetic changes is the prime candidate for explaining the missing causality of autoimmune hepatitis. Environmental factors (diet, toxic exposures) can alter chromatin structure and the production of micro-ribonucleic acids that affect gene expression. Epistatic interaction between unsuspected genes is the prime candidate for explaining the missing heritability. The non-additive, interactive effects of multiple genes could enhance their impact on the propensity and phenotype of autoimmune hepatitis. Transgenerational inheritance of acquired epigenetic marks constitutes another mechanism of transmitting parental adaptations that could affect susceptibility. Management strategies could range from lifestyle adjustments and nutritional supplements to precision editing of the epigenetic landscape. CONCLUSIONS: Autoimmune hepatitis has a missing causality that might be explained by epigenetic changes induced by environmental factors and a missing heritability that might reflect epistatic gene interactions or transgenerational transmission of acquired epigenetic marks. These unassessed or under-evaluated areas warrant investigation.
Affiliation(s)
- Albert J Czaja
- Mayo Clinic College of Medicine and Science, Rochester, MN, USA.
- Professor Emeritus of Medicine, Mayo Clinic College of Medicine and Science, 200 First Street SW, Rochester, MN, 55905, USA.
13
Learning high-order interactions for polygenic risk prediction. PLoS One 2023; 18:e0281618. [PMID: 36763605 PMCID: PMC9916647 DOI: 10.1371/journal.pone.0281618] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/29/2022] [Accepted: 01/27/2023] [Indexed: 02/11/2023]
Abstract
Within the framework of precision medicine, the stratification of individual genetic susceptibility based on inherited DNA variation has paramount relevance. However, one of the most relevant pitfalls of traditional Polygenic Risk Score (PRS) approaches is their inability to model complex high-order non-linear SNP-SNP interactions and their effect on the phenotype (e.g. epistasis). Indeed, they incur a computational challenge, as the number of possible interactions grows exponentially with the number of SNPs considered, affecting the statistical reliability of the model parameters as well. In this work, we address this issue by proposing a novel PRS approach, called High-order Interactions-aware Polygenic Risk Score (hiPRS), that incorporates high-order interactions in modeling polygenic risk. The latter combines an interaction search routine based on frequent itemsets mining and a novel interaction selection algorithm based on Mutual Information, to construct a simple and interpretable weighted model of user-specified dimensionality that can predict a given binary phenotype. Compared to traditional PRS methods, hiPRS does not rely on GWAS summary statistics or any external information. Moreover, hiPRS differs from Machine Learning-based approaches that can include complex interactions in that it provides a readable and interpretable model and is able to control overfitting, even on small samples. In the present work we demonstrate through a comprehensive simulation study the superior performance of hiPRS w.r.t. state-of-the-art methods, both in terms of scoring performance and interpretability of the resulting model. We also test hiPRS against small sample size, class imbalance and the presence of noise, showcasing its robustness to extreme experimental settings. Finally, we apply hiPRS to a case study on real data from the DACHS cohort, defining an interaction-aware scoring model to predict mortality of stage II-III Colon-Rectal Cancer patients treated with oxaliplatin.
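As a rough illustration of why a mutual-information criterion can be useful for ranking candidate interactions against a binary phenotype, the sketch below scores a two-SNP conjunction and each SNP alone. It is not the hiPRS algorithm: the itemset search and the paper's selection procedure are not reproduced, and the function name `mutual_information`, the binary SNP coding, and the toy data are assumptions.

```python
from collections import Counter
from math import log2

def mutual_information(feature, phenotype):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(feature)
    joint = Counter(zip(feature, phenotype))
    count_f = Counter(feature)
    count_y = Counter(phenotype)
    mi = 0.0
    for (f, y), count in joint.items():
        p_joint = count / n
        p_indep = (count_f[f] / n) * (count_y[y] / n)
        mi += p_joint * log2(p_joint / p_indep)
    return mi

# Toy data (hypothetical): the phenotype is 1 only when both SNPs carry the
# risk genotype, i.e. a purely interactive effect.
snp_a = [0, 0, 1, 1] * 2
snp_b = [0, 1, 0, 1] * 2
phenotype = [a & b for a, b in zip(snp_a, snp_b)]

# Candidate interaction: "snp_a = 1 AND snp_b = 1" as a single binary feature.
interaction = [int(a == 1 and b == 1) for a, b in zip(snp_a, snp_b)]

mi_a = mutual_information(snp_a, phenotype)
mi_b = mutual_information(snp_b, phenotype)
mi_pair = mutual_information(interaction, phenotype)
print(round(mi_a, 3), round(mi_b, 3), round(mi_pair, 3))
```

Here the conjunction carries more information about the phenotype than either SNP on its own, so a ranking by mutual information would place the interaction term first.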
14
Saha S, Perrin L, Röder L, Brun C, Spinelli L. Epi-MEIF: detecting higher order epistatic interactions for complex traits using mixed effect conditional inference forests. Nucleic Acids Res 2022; 50:e114. [PMID: 36107776 PMCID: PMC9639209 DOI: 10.1093/nar/gkac715] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/06/2022] [Revised: 07/29/2022] [Accepted: 09/12/2022] [Indexed: 12/04/2022]
Abstract
Understanding the relationship between genetic variations and variations in complex and quantitative phenotypes remains an ongoing challenge. While Genome-wide association studies (GWAS) have become a vital tool for identifying single-locus associations, we lack methods for identifying epistatic interactions. In this article, we propose a novel method for higher-order epistasis detection using mixed effect conditional inference forest (epiMEIF). The proposed method is fitted on a group of single nucleotide polymorphisms (SNPs) potentially associated with the phenotype and the tree structure in the forest facilitates the identification of n-way interactions between the SNPs. Additional testing strategies further improve the robustness of the method. We demonstrate its ability to detect true n-way interactions via extensive simulations in both cross-sectional and longitudinal synthetic datasets. This is further illustrated in an application to reveal epistatic interactions from natural variations of cardiac traits in flies (Drosophila). Overall, the method provides a generalized way to identify higher-order interactions from any GWAS data, thereby greatly improving the detection of the genetic architecture underlying complex phenotypes.
Affiliation(s)
- Saswati Saha
- Aix Marseille Univ, INSERM, TAGC (UMR1090), Turing Centre for Living systems, Marseille, France
- Laurent Perrin
- Aix Marseille Univ, INSERM, TAGC (UMR1090), Turing Centre for Living systems, Marseille, France
- CNRS, Marseille, France
- Laurence Röder
- Aix Marseille Univ, INSERM, TAGC (UMR1090), Turing Centre for Living systems, Marseille, France
- Christine Brun
- Aix Marseille Univ, INSERM, TAGC (UMR1090), Turing Centre for Living systems, Marseille, France
- CNRS, Marseille, France
- Lionel Spinelli
- Aix Marseille Univ, INSERM, TAGC (UMR1090), Turing Centre for Living systems, Marseille, France
15
Abstract
Genetic studies of human traits have revolutionized our understanding of the variation between individuals, and yet, the genetics of most traits is still poorly understood. In this review, we highlight the major open problems that need to be solved, and by discussing these challenges provide a primer to the field. We cover general issues such as population structure, epistasis and gene-environment interactions, data-related issues such as ancestry diversity and rare genetic variants, and specific challenges related to heritability estimates, genetic association studies, and polygenic risk scores. We emphasize the interconnectedness of these problems and suggest promising avenues to address them.
Affiliation(s)
- Nadav Brandes
- School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel.
- Omer Weissbrod
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Michal Linial
- Department of Biological Chemistry, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
16
D'Silva S, Chakraborty S, Kahali B. Concurrent outcomes from multiple approaches of epistasis analysis for human body mass index associated loci provide insights into obesity biology. Sci Rep 2022; 12:7306. [PMID: 35508500 PMCID: PMC9068779 DOI: 10.1038/s41598-022-11270-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/13/2021] [Accepted: 04/18/2022] [Indexed: 12/13/2022]
Abstract
Genome-wide association studies (GWAS) have focused on elucidating the genetic architecture of complex traits by assessing single-variant effects in additive genetic models, albeit explaining only a fraction of the trait heritability. Epistasis has recently emerged as one of the intrinsic mechanisms that could explain part of this missing heritability. We conducted epistasis analysis for genome-wide body mass index (BMI) associated SNPs in the Alzheimer's Disease Neuroimaging Initiative (ADNI) and followed up the top significant interacting SNPs for replication in the UK Biobank imputed genotype dataset. We report two pairwise epistatic interactions, between rs2177596 (RHBDD1) and rs17759796 (MAPK1), and between rs1121980 (FTO) and rs6567160 (MC4R), obtained from a consensus of nine different epistatic approaches. Gene interaction maps and tissue expression profiles constructed for these interacting loci highlight co-expression, co-localisation, physical interaction, genetic interaction, and shared pathways, emphasising the neuronal influence in obesity and implicating concerted expression of associated genes in liver, pancreas, and adipose tissues, which points to the metabolic abnormalities that characterize obesity. Detecting epistasis could thus be a promising approach to understand the effect of simultaneously interacting multiple genetic loci in disease aetiology, beyond single-locus effects.
Affiliation(s)
- Sheldon D'Silva
- Centre for Brain Research, Indian Institute of Science, Bangalore, 560012, India
- Shreya Chakraborty
- Centre for Brain Research, Indian Institute of Science, Bangalore, 560012, India
- Interdisciplinary Mathematical Sciences, Indian Institute of Science, Bangalore, 560012, India
- Bratati Kahali
- Centre for Brain Research, Indian Institute of Science, Bangalore, 560012, India.
17
Slim L, Chatelain C, Foucauld HD, Azencott CA. A systematic analysis of gene-gene interaction in multiple sclerosis. BMC Med Genomics 2022; 15:100. [PMID: 35501860 PMCID: PMC9063218 DOI: 10.1186/s12920-022-01247-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/30/2021] [Accepted: 03/28/2022] [Indexed: 12/03/2022]
Abstract
BACKGROUND: For the most part, genome-wide association studies (GWAS) have only partially explained the heritability of complex diseases. One of their limitations is to assume independent contributions of individual variants to the phenotype. Many tools have therefore been developed to investigate the interactions between distant loci, or epistasis. Among them, the recently proposed EpiGWAS models the interactions between a target variant and the rest of the genome. However, applying this approach to studying interactions along all genes of a disease map is not straightforward. Here, we propose a pipeline to that effect, which we illustrate by investigating a multiple sclerosis GWAS dataset from the Wellcome Trust Case Control Consortium 2 through 19 disease maps from the MetaCore pathway database. RESULTS: For each disease map, we build an epistatic network by connecting the genes that are deemed to interact. These networks tend to be connected, complementary to the disease maps and contain hubs. In addition, we report 4 epistatic gene pairs involving missense variants, and 25 gene pairs with a deleterious epistatic effect mediated by eQTLs. Among these, we highlight the interaction of GLI-1 and SUFU, and of IP10 and NF-κB, as they both match known biological interactions. The latter pair is particularly promising for therapeutic development, as both genes have known inhibitors. CONCLUSIONS: Our study showcases the ability of EpiGWAS to uncover biologically interpretable epistatic interactions that are potentially actionable for the development of combination therapy.
Affiliation(s)
- Lotfi Slim
- CBIO, MINES ParisTech, PSL Research University, 75006 Paris, France
- Translational Sciences, SANOFI R&D, 91385 Chilly-Mazarin, France
- NVIDIA Corporation, Santa Clara, 95051 USA
- Chloé-Agathe Azencott
- CBIO, MINES ParisTech, PSL Research University, 75006 Paris, France
- Institut Curie, PSL Research University, 75005 Paris, France
- U900, Inserm, 75005 Paris, France
18
Maculewicz E, Antkowiak B, Antkowiak O, Borecka A, Mastalerz A, Leońska-Duniec A, Humińska-Lisowska K, Michałowska-Sawczyn M, Garbacz A, Lorenz K, Szarska E, Dziuda Ł, Cywińska A, Cięszczyk P. The interactions between interleukin-1 family genes: IL1A, IL1B, IL1RN, and obesity parameters. BMC Genomics 2022; 23:112. [PMID: 35139823 PMCID: PMC8830010 DOI: 10.1186/s12864-021-08258-x] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 07/06/2021] [Accepted: 12/14/2021] [Indexed: 12/23/2022]
Abstract
Background: Obesity has been recognized as a growing worldwide problem, producing many pathologies including the promotion of a “proinflammatory state.” The etiology of human obesity is still only partially understood; however, a genetic background has been established. Its nature is complex, and currently, it appears that the combined effects of the interactions among multiple genes should receive more attention. Because obesity promotes proinflammatory conditions, in this study we investigated the genetic polymorphism of IL-1 family genes in healthy people with normal and elevated body mass index (BMI) and fat %. Results: The single-nucleotide polymorphisms (SNPs) within the IL1A -889C > T (rs1800587), IL1B +3954T > C (rs1143634), and IL1RN -87G > A (rs2234677) genes alone were associated neither with BMI nor with fat % values in the tested group. The associations between SNP–SNP interaction and BMI for the IL1B × IL1RN interactions were significant for the dominant model (p = 0.02) and the codominant model (p = 0.03). The same SNP–SNP interaction (IL1B × IL1RN) was also associated with fat % for the codominant (p = 0.01) and recessive (p = 0.002) models. Conclusions: This study further confirmed that IL-1 family genes are involved in the genetic background of obesity. It has been shown that the IL1B × IL1RN interaction was associated with both BMI and fat %, with the rare T allele protecting from higher values. Thus, even if certain polymorphisms in single genes of the IL-1 family cannot be defined as related to obesity in the examined population, the genetic interrelationships should be analyzed.
Affiliation(s)
- Ewelina Maculewicz
- Faculty of Physical Education, Jozef Pilsudski University of Physical Education in Warsaw, 00-809, Warsaw, Poland
- Military Institute of Hygiene and Epidemiology, 01-163, Warsaw, Poland
- Bożena Antkowiak
- Military Institute of Hygiene and Epidemiology, 01-163, Warsaw, Poland
- Anna Borecka
- Military Institute of Hygiene and Epidemiology, 01-163, Warsaw, Poland
- Andrzej Mastalerz
- Faculty of Physical Education, Jozef Pilsudski University of Physical Education in Warsaw, 00-809, Warsaw, Poland
- Agata Leońska-Duniec
- Faculty of Physical Education, Gdansk University of Physical Education and Sport, 80-336, Gdansk, Poland
- Kinga Humińska-Lisowska
- Faculty of Physical Education, Gdansk University of Physical Education and Sport, 80-336, Gdansk, Poland
- Monika Michałowska-Sawczyn
- Faculty of Physical Education, Gdansk University of Physical Education and Sport, 80-336, Gdansk, Poland
- Katarzyna Lorenz
- Faculty of Physical Education, Jozef Pilsudski University of Physical Education in Warsaw, 00-809, Warsaw, Poland
- Ewa Szarska
- Military Institute of Hygiene and Epidemiology, 01-163, Warsaw, Poland
- Łukasz Dziuda
- Military Institute of Aviation Medicine, 01-755, Warsaw, Poland
- Anna Cywińska
- Faculty of Biological and Veterinary Sciences, Nicolaus Copernicus University in Torun, 87-100, Torun, Poland.
- Paweł Cięszczyk
- Faculty of Physical Education, Gdansk University of Physical Education and Sport, 80-336, Gdansk, Poland
19
Duroux D, Climente-González H, Azencott CA, Van Steen K. Interpretable network-guided epistasis detection. Gigascience 2022; 11:giab093. [PMID: 35134928 PMCID: PMC8848319 DOI: 10.1093/gigascience/giab093] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/04/2021] [Revised: 10/12/2021] [Accepted: 12/13/2021] [Indexed: 11/15/2022]
Abstract
BACKGROUND: Detecting epistatic interactions at the gene level is essential to understanding the biological mechanisms of complex diseases. Unfortunately, genome-wide interaction association studies involve many statistical challenges that make such detection hard. We propose a multi-step protocol for epistasis detection along the edges of a gene-gene co-function network. Such an approach reduces the number of tests performed and provides interpretable interactions while keeping type I error controlled. Yet, mapping gene interactions into testable single-nucleotide polymorphism (SNP)-interaction hypotheses, as well as computing gene pair association scores from SNP pair ones, is not trivial. RESULTS: Here we compare 3 SNP-gene mappings (positional overlap, expression quantitative trait loci, and proximity in 3D structure) and use the adaptive truncated product method to compute gene pair scores. This method is non-parametric, does not require a known null distribution, and is fast to compute. We apply multiple variants of this protocol to a genome-wide association study dataset on inflammatory bowel disease. Different configurations produced different results, highlighting that various mechanisms are implicated in inflammatory bowel disease, while at the same time, results overlapped with known disease characteristics. Importantly, the proposed pipeline also differs from a conventional approach where no network is used, showing the potential for additional discoveries when prior biological knowledge is incorporated into epistasis detection.
Affiliation(s)
- Diane Duroux
- BIO3 - Systems Genetics, GIGA-R Medical Genomics, University of Liège, 4000 Liège, Belgium, 11 Liège 4000, Belgium
- Héctor Climente-González
- Institut Curie, PSL Research University, F-75005 Paris, France
- INSERM, U900, F-75005 Paris, France
- CBIO-Centre for Computational Biology, Mines ParisTech, PSL Research University, 75006 Paris, France
- High-Dimensional Statistical Modeling Team, RIKEN Center for Advanced Intelligence Project, Chuo-ku, Tokyo 103-0027, Japan
- Chloé-Agathe Azencott
- Institut Curie, PSL Research University, F-75005 Paris, France
- INSERM, U900, F-75005 Paris, France
- CBIO-Centre for Computational Biology, Mines ParisTech, PSL Research University, 75006 Paris, France
- Kristel Van Steen
- BIO3 - Systems Genetics, GIGA-R Medical Genomics, University of Liège, 4000 Liège, Belgium, 11 Liège 4000, Belgium
- BIO3 - Systems Medicine, Department of Human Genetics, KU Leuven, 3000 Leuven, Belgium, 49 3000 Leuven, Belgium
20
Walakira A, Ocira J, Duroux D, Fouladi R, Moškon M, Rozman D, Van Steen K. Detecting gene-gene interactions from GWAS using diffusion kernel principal components. BMC Bioinformatics 2022; 23:57. [PMID: 35105309 PMCID: PMC8805268 DOI: 10.1186/s12859-022-04580-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/30/2021] [Accepted: 01/18/2022] [Indexed: 11/10/2022]
Abstract
Genes and gene products do not function in isolation but as components of complex networks of macromolecules through physical or biochemical interactions. Dependencies of gene mutations on genetic background (i.e., epistasis) are believed to play a role in understanding the molecular underpinnings of complex diseases such as inflammatory bowel disease (IBD). However, the process of identifying such interactions is complex due to, for instance, the curse of high dimensionality, dependencies in the data and non-linearity. Here, we propose a novel approach for robust and computationally efficient epistasis detection. We do so by first reducing dimensionality per gene via diffusion kernel principal components (kpc). Subsequently, kpc gene summaries are used for downstream analysis, including the construction of a gene-based epistasis network. We show that our approach is not only able to recover known IBD-associated genes but also additional genes of interest linked to this difficult gastrointestinal disease.
Affiliation(s)
- Andrew Walakira
- Centre for Functional Genomics and Bio-Chips, Institute for Biochemistry and Molecular Genetics, Faculty of Medicine, University of Ljubljana, Ljubljana, Slovenia
- Junior Ocira
- BIO3 - Laboratory for Systems Genetics, GIGA-R Medical Genomics, University of Liège, Liège, Belgium
- Diane Duroux
- BIO3 - Laboratory for Systems Genetics, GIGA-R Medical Genomics, University of Liège, Liège, Belgium
- Ramouna Fouladi
- BIO3 - Laboratory for Systems Genetics, GIGA-R Medical Genomics, University of Liège, Liège, Belgium
- Miha Moškon
- Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia
- Damjana Rozman
- Centre for Functional Genomics and Bio-Chips, Institute for Biochemistry and Molecular Genetics, Faculty of Medicine, University of Ljubljana, Ljubljana, Slovenia
- Kristel Van Steen
- BIO3 - Laboratory for Systems Genetics, GIGA-R Medical Genomics, University of Liège, Liège, Belgium
- BIO3 - Laboratory for Systems Medicine, Department of Human Genetics, KU Leuven, Leuven, Belgium
| |
Collapse
|
21
|
Abstract
Aging has provided fruitful challenges for evolutionary theory, and evolutionary theory has deepened our understanding of aging. A great deal of genetic and molecular data now exists concerning mortality regulation and there is a growing body of knowledge concerning the life histories of diverse species. Assimilating all relevant data into a framework for the evolution of aging promises to significantly advance the field. We propose extensions of some key concepts to provide greater precision when applying these concepts to age-structured contexts. Secondary or byproduct effects of mutations are proposed as an important factor affecting survival patterns, including effects that may operate in small populations subject to genetic drift, widening the possibilities for mutation accumulation and pleiotropy. Molecular and genetic studies have indicated a diverse array of mechanisms that can modify aging and mortality rates, while transcriptome data indicate a high level of tissue and species specificity for genes affected by aging. The diversity of mechanisms and gene effects that can contribute to the pattern of aging in different organisms may mirror the complex evolutionary processes behind aging.
Affiliation(s)
- Stewart Frankel
  - Biology Department, University of Hartford, West Hartford, CT, United States
- Blanka Rogina
  - Genetics and Genome Sciences, Institute for Systems Genomics, School of Medicine, University of Connecticut Health Center, Farmington, CT, United States
|
22
|
Liu D, Ban HJ, El Sergani AM, Lee MK, Hecht JT, Wehby GL, Moreno LM, Feingold E, Marazita ML, Cha S, Szabo-Rogers HL, Weinberg SM, Shaffer JR. PRICKLE1 × FOCAD Interaction Revealed by Genome-Wide vQTL Analysis of Human Facial Traits. Front Genet 2021; 12:674642. [PMID: 34434215 PMCID: PMC8381734 DOI: 10.3389/fgene.2021.674642] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 03/01/2021] [Accepted: 06/03/2021] [Indexed: 12/14/2022] Open
Abstract
The human face is a highly complex and variable structure resulting from the intricate coordination of numerous genetic and non-genetic factors. Hundreds of genomic loci impacting quantitative facial features have been identified. While these associations have been shown to influence morphology by altering the mean size and shape of facial measures, their effect on trait variance remains unclear. We conducted a genome-wide association analysis for the variance of 20 quantitative facial measurements in 2,447 European individuals and identified several suggestive variance quantitative trait loci (vQTLs). These vQTLs guided an efficient search for gene-by-gene (G × G) interactions, which uncovered an interaction between PRICKLE1 and FOCAD affecting cranial base width. We replicated this G × G interaction signal at the locus level in an additional 5,128 Korean individuals. We used the hypomorphic Prickle1 Beetlejuice (Prickle1^Bj) mouse line to directly test the function of Prickle1 on the cranial base and observed wider cranial bases in Prickle1^Bj/Bj mice. Importantly, we observed that the Prickle1 and Focadhesin proteins co-localize in murine cranial base chondrocytes, and this co-localization is abnormal in the Prickle1^Bj/Bj mutants. Taken together, our findings uncovered a novel G × G interaction effect in humans with strong support from both epidemiological and molecular studies. These results highlight the potential of studying measures of phenotypic variability in gene mapping studies of facial morphology.
Affiliation(s)
- Dongjing Liu
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Hyo-Jeong Ban
  - Future Medicine Division, Korea Institute of Oriental Medicine, Daejeon, South Korea
- Ahmed M. El Sergani
  - Center for Craniofacial and Dental Genetics, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States
  - Department of Oral and Craniofacial Sciences, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States
- Myoung Keun Lee
  - Center for Craniofacial and Dental Genetics, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States
- Jacqueline T. Hecht
  - Department of Pediatrics, McGovern Medical Center, The University of Texas Health Science Center at Houston, Houston, TX, United States
- George L. Wehby
  - Department of Health Management and Policy, The University of Iowa, Iowa City, IA, United States
- Lina M. Moreno
  - Department of Orthodontics, The University of Iowa, Iowa City, IA, United States
- Eleanor Feingold
  - Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, United States
  - Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, United States
- Mary L. Marazita
  - Center for Craniofacial and Dental Genetics, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States
  - Department of Oral and Craniofacial Sciences, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States
  - Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, United States
  - Department of Psychiatry, Clinical and Translational Science Institute, School of Medicine, University of Pittsburgh, Pittsburgh, PA, United States
- Seongwon Cha
  - Future Medicine Division, Korea Institute of Oriental Medicine, Daejeon, South Korea
- Heather L. Szabo-Rogers
  - Department of Oral and Craniofacial Sciences, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States
  - Department of Developmental Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, United States
  - Regenerative Medicine at the McGowan Institute, University of Pittsburgh, Pittsburgh, PA, United States
  - Center for Craniofacial Regeneration, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States
- Seth M. Weinberg
  - Center for Craniofacial and Dental Genetics, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States
  - Department of Oral and Craniofacial Sciences, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States
  - Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, United States
- John R. Shaffer
  - Center for Craniofacial and Dental Genetics, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States
  - Department of Oral and Craniofacial Sciences, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States
  - Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, United States
|
23
|
Odenkirk MT, Reif DM, Baker ES. Multiomic Big Data Analysis Challenges: Increasing Confidence in the Interpretation of Artificial Intelligence Assessments. Anal Chem 2021; 93:7763-7773. [PMID: 34029068 PMCID: PMC8465926 DOI: 10.1021/acs.analchem.0c04850] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Indexed: 02/08/2023]
Abstract
The need for holistic molecular measurements to better understand disease initiation, development, diagnosis, and therapy has led to an increasing number of multiomic analyses. The wealth of information available from multiomic assessments, however, requires both the evaluation and interpretation of extremely large data sets, limiting analysis throughput and ease of adoption. Computational methods utilizing artificial intelligence (AI) provide the most promising way to address these challenges, yet despite the conceptual benefits of AI and its successful application in singular omic studies, the widespread use of AI in multiomic studies remains limited. Here, we discuss present and future capabilities of AI techniques in multiomic studies while introducing analytical checks and balances to validate the computational conclusions.
Affiliation(s)
- Melanie T Odenkirk
  - Department of Chemistry, North Carolina State University, Raleigh, North Carolina 27606, United States
- David M Reif
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27606, United States
  - Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina 27606, United States
- Erin S Baker
  - Department of Chemistry, North Carolina State University, Raleigh, North Carolina 27606, United States
|
24
|
Bruger EL, Chubiz LM, Rojas Echenique JI, Renshaw CJ, Espericueta NV, Draghi JA, Marx CJ. Genetic Context Significantly Influences the Maintenance and Evolution of Degenerate Pathways. Genome Biol Evol 2021; 13:6245841. [PMID: 33885815 PMCID: PMC8214414 DOI: 10.1093/gbe/evab082] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/19/2021] [Indexed: 11/16/2022] Open
Abstract
Understanding the evolution of novel physiological traits is highly relevant for expanding the characterization and manipulation of biological systems. Acquisition of new traits can be achieved through horizontal gene transfer (HGT). Here, we investigate drivers that promote or deter the maintenance of HGT-driven degeneracy, occurring when processes accomplish identical functions through nonidentical components. Subsequent evolution can optimize newly acquired functions; for example, beneficial alleles identified in an engineered Methylorubrum extorquens strain allowed it to utilize a “Foreign” formaldehyde oxidation pathway substituted for its Native pathway for methylotrophic growth. We examined the fitness consequences of interactions between these alleles when they were combined with the Native pathway or both (Dual) pathways. Unlike the Foreign pathway context where they evolved, these alleles were often neutral or deleterious when moved into these alternative genetic backgrounds. However, there were instances where combinations of multiple alleles resulted in higher fitness outcomes than individual allelic substitutions could provide. Importantly, the genetic context accompanying these allelic substitutions significantly altered the fitness landscape, shifting local fitness peaks and restricting the set of accessible evolutionary trajectories. These findings highlight how genetic context can negatively impact the probability of maintaining native and HGT-introduced functions together, making it difficult for degeneracy to evolve. However, in cases where the cost of maintaining degeneracy was mitigated by adding evolved alleles impacting the function of these pathways, we observed rare opportunities for pathway coevolution to occur. Together, our results highlight the importance of genetic context and resulting epistasis in retaining or losing HGT-acquired degenerate functions.
Affiliation(s)
- Eric L Bruger
  - Department of Biological Sciences, University of Idaho, Moscow, Idaho, USA
  - Institute for Modeling Collaboration and Innovation, University of Idaho, Moscow, Idaho, USA
  - Institute for Bioinformatics and Evolutionary Studies, University of Idaho, Moscow, Idaho, USA
  - The BEACON Center for the Study of Evolution in Action, University of Idaho, Moscow, Idaho, USA
- Lon M Chubiz
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, USA
  - Department of Biology, University of Missouri, St. Louis, Missouri, USA
- José I Rojas Echenique
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, USA
  - Department of Molecular Genetics, University of Toronto, Ontario, Canada
- Caleb J Renshaw
  - Department of Biological Sciences, University of Idaho, Moscow, Idaho, USA
  - Institute for Modeling Collaboration and Innovation, University of Idaho, Moscow, Idaho, USA
- Nora Victoria Espericueta
  - Department of Biological Sciences, University of Idaho, Moscow, Idaho, USA
  - Department of Biological Sciences, California State University, Long Beach, California, USA
- Jeremy A Draghi
  - Department of Biological Sciences, Virginia Institute of Technology, Blacksburg, Virginia, USA
- Christopher J Marx
  - Department of Biological Sciences, University of Idaho, Moscow, Idaho, USA
  - Institute for Modeling Collaboration and Innovation, University of Idaho, Moscow, Idaho, USA
  - Institute for Bioinformatics and Evolutionary Studies, University of Idaho, Moscow, Idaho, USA
  - The BEACON Center for the Study of Evolution in Action, University of Idaho, Moscow, Idaho, USA
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, USA
|
25
|
Sheppard B, Rappoport N, Loh PR, Sanders SJ, Zaitlen N, Dahl A. A model and test for coordinated polygenic epistasis in complex traits. Proc Natl Acad Sci U S A 2021; 118:e1922305118. [PMID: 33833052 PMCID: PMC8053945 DOI: 10.1073/pnas.1922305118] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 12/11/2022] Open
Abstract
Interaction between genetic variants (epistasis) is pervasive in model systems and can profoundly impact evolutionary adaptation, population disease dynamics, genetic mapping, and precision medicine efforts. In this work, we develop a model for structured polygenic epistasis, called coordinated epistasis (CE), and prove that several recent theories of genetic architecture fall under the formal umbrella of CE. Unlike standard epistasis models that assume epistasis and main effects are independent, CE captures systematic correlations between epistasis and main effects that result from pathway-level epistasis, on balance skewing the penetrance of genetic effects. To test for the existence of CE, we propose the even-odd (EO) test and prove it is calibrated in a range of realistic biological models. Applying the EO test in the UK Biobank, we find evidence of CE in 18 of 26 traits spanning disease, anthropometric, and blood categories. Finally, we extend the EO test to tissue-specific enrichment and identify several plausible tissue-trait pairs. Overall, CE is a dimension of genetic architecture that can capture structured, systemic forms of epistasis in complex human traits.
Affiliation(s)
- Brooke Sheppard
  - Department of Psychiatry and Behavioral Sciences, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, CA 94143
- Nadav Rappoport
  - Department of Psychiatry and Behavioral Sciences, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, CA 94143
  - Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA 94143
- Po-Ru Loh
  - Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115
- Stephan J Sanders
  - Department of Psychiatry and Behavioral Sciences, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, CA 94143
- Noah Zaitlen
  - Department of Psychiatry and Behavioral Sciences, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, CA 94143
  - Department of Neurology, University of California Los Angeles, Los Angeles, CA 90095
  - Department of Computational Medicine, University of California Los Angeles, Los Angeles, CA 90095
- Andy Dahl
  - Department of Neurology, University of California Los Angeles, Los Angeles, CA 90095
  - Department of Computational Medicine, University of California Los Angeles, Los Angeles, CA 90095
  - Section of Genetic Medicine, University of Chicago, Chicago, IL 60637
|
26
|
Orlenko A, Moore JH. A comparison of methods for interpreting random forest models of genetic association in the presence of non-additive interactions. BioData Min 2021; 14:9. [PMID: 33514397 PMCID: PMC7847145 DOI: 10.1186/s13040-021-00243-0] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 07/01/2020] [Accepted: 01/13/2021] [Indexed: 01/19/2023] Open
Abstract
BACKGROUND: Non-additive interactions among genes are frequently associated with a number of phenotypes, including known complex diseases such as Alzheimer's, diabetes, and cardiovascular disease. Detecting interactions requires careful selection of analytical methods, and some machine learning algorithms are unable or underpowered to detect or model feature interactions that exhibit non-additivity. The Random Forest method is often employed in these efforts due to its ability to detect and model non-additive interactions. In addition, Random Forest has the built-in ability to estimate feature importance scores, a characteristic that allows the model to be interpreted in terms of the order and effect size of each feature's association with the outcome. This characteristic is very important for epidemiological and clinical studies where results of predictive modeling could be used to define the future direction of the research efforts. Alternative ways to interpret the model are the permutation feature importance metric, which uses a permutation approach to calculate a feature contribution coefficient in units of the decrease in the model's performance, and Shapley additive explanations, which employ a cooperative game theory approach. Currently, it is unclear which Random Forest feature importance metric provides a superior estimation of the true informative contribution of features in genetic association analysis.
RESULTS: To address this issue, and to improve interpretability of Random Forest predictions, we compared different methods for feature importance estimation in real and simulated datasets with non-additive interactions. As a result, we detected a discrepancy between the metrics for the real-world datasets and further established that the permutation feature importance metric provides more precise feature importance rank estimation for the simulated datasets with non-additive interactions.
CONCLUSIONS: By analyzing both real and simulated data, we established that the permutation feature importance metric provides more precise feature importance rank estimation in the presence of non-additive interactions.
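The permutation feature importance idea described in this abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the data are synthetic, the names `xor_model`, `accuracy`, and `permutation_importance` are hypothetical, and a hand-written XOR rule stands in for a fitted Random Forest so that the example needs only the standard library. It is not the authors' pipeline.

```python
import random

def xor_model(row):
    # Stand-in for a fitted classifier (assumption): predicts the XOR of
    # features 0 and 1, a purely non-additive interaction; feature 2 is ignored.
    return row[0] ^ row[1]

def accuracy(model, X, y):
    # Fraction of rows where the model's prediction matches the label.
    return sum(model(r) == t for r, t in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=20, seed=0):
    # Importance of feature j = mean drop in accuracy after shuffling column j.
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            col = [r[j] for r in X]
            rng.shuffle(col)
            X_perm = [r[:j] + [v] + r[j + 1:] for r, v in zip(X, col)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Synthetic binary features; the outcome depends only on the 0-1 interaction.
data_rng = random.Random(1)
X = [[data_rng.randint(0, 1) for _ in range(3)] for _ in range(400)]
y = [r[0] ^ r[1] for r in X]
imp = permutation_importance(xor_model, X, y)
```

In this toy setting neither interacting feature has a marginal effect on the outcome, yet shuffling either one breaks the interaction and lowers accuracy, so both receive a large importance while the ignored feature receives exactly zero.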
Affiliation(s)
- Alena Orlenko
  - Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA, USA
- Jason H Moore
  - Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA, USA
|
27
|
Taha K, Davuluri R, Yoo P, Spencer J. Personizing the prediction of future susceptibility to a specific disease. PLoS One 2021; 16:e0243127. [PMID: 33406077 PMCID: PMC7787538 DOI: 10.1371/journal.pone.0243127] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 10/25/2019] [Accepted: 11/17/2020] [Indexed: 01/22/2023] Open
Abstract
A traceable biomarker is a member of a disease's molecular pathway. A disease may be associated with several molecular pathways. Each different combination of these molecular pathways, to which detected traceable biomarkers belong, may serve as an indicator of the onset of the disease at a different time frame in the future. Based on this notion, we introduce a novel methodology for personalizing an individual's degree of future susceptibility to a specific disease. We implemented the methodology in a working system called Susceptibility Degree to a Disease Predictor (SDDP). For a specific disease d, let S be the set of molecular pathways to which traceable biomarkers detected from most patients of d belong. For the same disease d, let S' be the set of molecular pathways to which traceable biomarkers detected from a certain individual belong. SDDP is able to infer the subset S'' ⊆ {S − S'} of undetected molecular pathways for the individual. Thus, SDDP can infer undetected molecular pathways of a disease for an individual based on a few molecular pathways detected from the individual. SDDP can also help in inferring the combination of molecular pathways in the set {S' + S''} whose traceable biomarkers are collectively indicative of the disease. SDDP is composed of the following four components: information extractor, interrelationship between molecular pathways modeler, logic inferencer, and risk indicator. The information extractor takes advantage of the exponential increase of biomedical literature to automatically extract the common traceable biomarkers for a specific disease. The interrelationship between molecular pathways modeler models the hierarchical interrelationships between the molecular pathways of the traceable biomarkers. The logic inferencer transforms the hierarchical interrelationships between the molecular pathways into rule-based specifications. It employs the specification rules and the inference rules for predicate logic to infer as many undetected molecular pathways of a disease as possible for an individual. The risk indicator outputs a risk indicator value that reflects the individual's degree of future susceptibility to the disease. We evaluated SDDP by comparing it experimentally with other methods. Results revealed marked improvement.
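The set relationship S'' ⊆ {S − S'} and the rule-based inference step described above can be sketched in Python as simple forward chaining. This is only an illustration under stated assumptions: the pathway labels `P1` to `P6`, the rules, and the function `infer_undetected` are hypothetical, and the abstract does not specify SDDP's actual rule format or inference engine.

```python
def infer_undetected(common, detected, rules):
    # Forward chaining: repeatedly fire rules whose premises are all known,
    # adding conclusions that belong to the disease's common pathway set S.
    known = set(detected)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion in common and conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True
    # S'' contains only pathways that were inferred, not those detected directly.
    return known - set(detected)

# Hypothetical pathway labels (assumptions, not taken from the paper).
S = {"P1", "P2", "P3", "P4", "P5"}      # pathways common to patients of disease d
S_detected = {"P1", "P2"}               # S': pathways detected in one individual
rules = [
    (frozenset({"P1", "P2"}), "P3"),    # P1 and P2 together imply P3
    (frozenset({"P3"}), "P4"),          # P3 implies P4
    (frozenset({"P6"}), "P5"),          # never fires: P6 is not known
]
S_inferred = infer_undetected(S, S_detected, rules)  # S''
```

Here the inferred set contains `P3` and `P4`, both of which lie in S but not in S', which is the containment the abstract states.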
Affiliation(s)
- Kamal Taha
  - Department of Electrical and Computer Science, Khalifa University, Abu Dhabi, UAE
- Ramana Davuluri
  - Department of Biomedical Informatics, School of Medicine and College of Engineering and Applied Sciences, Stony Brook University, Stony Brook, New York, United States of America
- Paul Yoo
  - Department of Computer Science & Information Systems, University of London, Birkbeck College, London, United Kingdom
- Jesse Spencer
  - Department of Pathology, University of Utah, Salt Lake City, Utah, United States of America
|
28
|
Aristodimou A, Antoniades A, Dardiotis E, Loizidou E, Spyrou G, Votsi C, Kyproula C, Pantzaris M, Grigoriadis N, Hadjigeorgiou G, Kyriakides T, Pattichi C. A Framework for Efficient N-Way Interaction Testing in Case/Control Studies With Categorical Data. IEEE Open Journal of Engineering in Medicine and Biology 2021; 2:256-262. [PMID: 35402966 PMCID: PMC8901013 DOI: 10.1109/ojemb.2021.3100416] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2021] [Revised: 07/08/2021] [Accepted: 07/22/2021] [Indexed: 11/26/2022] Open
Abstract
Goal: Most common diseases are influenced by multiple gene interactions and interactions with the environment. Performing an exhaustive search to identify such interactions is computationally expensive and needs to address the multiple testing problem. A four-step framework is proposed for the efficient identification of n-way interactions. Methods: The framework was applied to a Multiple Sclerosis dataset with 725 subjects and 147 tagging SNPs. The first two steps of the framework are quality control and feature selection. The next step applies clustering and binary-encodes the features. The final step performs the n-way interaction testing. Results: The feature space was reduced to 7 SNPs, and with the proposed binary encoding more 2-SNP and 3-SNP interactions were identified than with the initial encoding. Conclusions: The framework selects informative features, and with the proposed binary encoding it is able to identify more n-way interactions by increasing the power of the statistical analysis.
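To make the final step concrete, the following Python sketch scans all n-feature subsets of binary-encoded data and scores each with a Pearson chi-square statistic against case/control status. This is an assumed stand-in, since the abstract does not name the test the framework uses; the function names and the toy data are hypothetical. The statistic measures joint association of a feature combination with the outcome and does not by itself separate interaction from main effects.

```python
from collections import Counter
from itertools import combinations

def chi_square_stat(configs, labels):
    # Pearson chi-square statistic for a (configuration x status) table.
    n = len(labels)
    cell = Counter(zip(configs, labels))
    row = Counter(configs)
    col = Counter(labels)
    stat = 0.0
    for r in row:
        for c in col:
            expected = row[r] * col[c] / n
            stat += (cell[(r, c)] - expected) ** 2 / expected
    dof = (len(row) - 1) * (len(col) - 1)
    return stat, dof

def n_way_scan(X, y, n):
    # Exhaustively score every n-feature subset; the number of subsets grows
    # combinatorially, which is why feature selection comes first.
    results = []
    for idx in combinations(range(len(X[0])), n):
        configs = [tuple(r[i] for i in idx) for r in X]
        stat, dof = chi_square_stat(configs, y)
        results.append((idx, stat, dof))
    return sorted(results, key=lambda t: -t[1])

# Toy balanced design (assumption): status is the XOR of features 0 and 1.
X = [[a, b, c] for a in (0, 1) for b in (0, 1) for c in (0, 1)] * 10
y = [r[0] ^ r[1] for r in X]
ranked = n_way_scan(X, y, 2)
```

In this toy design only the pair of features 0 and 1 is associated with status, so it ranks first and the other two pairs score zero. In practice the resulting p-values would still need a multiple-testing correction, as the abstract notes.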
Affiliation(s)
- Efthimios Dardiotis
  - Department of Neurology, Faculty of Medicine, University of Thessaly, Volos 38221, Greece
- Eleni Loizidou
  - Department of Hygiene and Epidemiology, University of Ioannina, Ioannina 451 10, Greece
  - Institute for Bioinnovation, Biomedical Sciences Research Center Alexander Fleming, Athens 16672, Greece
- George Spyrou
  - Bioinformatics Department and Cyprus School of Molecular Medicine, Cyprus Institute of Neurology and Genetics, Nicosia 2371, Cyprus
- Christina Votsi
  - Neurogenetics Department and Cyprus School of Molecular Medicine, Cyprus Institute of Neurology and Genetics, Nicosia 2371, Cyprus
- Christodoulou Kyproula
  - Neurogenetics Department and Cyprus School of Molecular Medicine, Cyprus Institute of Neurology and Genetics, Nicosia 2371, Cyprus
- Marios Pantzaris
  - Department of Neurology and Cyprus School of Molecular Medicine, Cyprus Institute of Neurology and Genetics, Nicosia 2371, Cyprus
- Nikolaos Grigoriadis
  - Department of Neurology II, Aristotle University of Thessaloniki, Thessaloniki 541 24, Greece
- Theodoros Kyriakides
  - Department of Basic and Clinical Sciences, Medical School, University of Nicosia, Nicosia 1678, Cyprus
- Constantinos Pattichi
  - Department of Computer Science, University of Cyprus, Nicosia 1678, Cyprus
  - Biomedical Engineering Research Centre, University of Cyprus, Nicosia 1678, Cyprus
|
29
|
Sierksma A, Lu A, Mancuso R, Fattorelli N, Thrupp N, Salta E, Zoco J, Blum D, Buée L, De Strooper B, Fiers M. Novel Alzheimer risk genes determine the microglia response to amyloid-β but not to TAU pathology. EMBO Mol Med 2020; 12:e10606. [PMID: 31951107 PMCID: PMC7059012 DOI: 10.15252/emmm.201910606] [Citation(s) in RCA: 189] [Impact Index Per Article: 37.8] [Received: 03/15/2019] [Revised: 12/20/2019] [Accepted: 12/20/2019] [Indexed: 12/20/2022] Open
Abstract
Polygenic risk scores have identified that genetic variants without genome-wide significance still add to the genetic risk of developing Alzheimer's disease (AD). Whether and how subthreshold risk loci translate into relevant disease pathways is unknown. We investigate here the involvement of AD risk variants in the transcriptional responses of two mouse models: APPswe/PS1L166P and Thy-TAU22. A unique gene expression module, highly enriched for AD risk genes, is specifically responsive to Aβ but not TAU pathology. We identify in this module 7 established AD risk genes (APOE, CLU, INPP5D, CD33, PLCG2, SPI1, and FCER1G) and 11 AD GWAS genes below the genome-wide significance threshold (GPC2, TREML2, SYK, GRN, SLC2A5, SAMSN1, PYDC1, HEXB, RRBP1, LYN, and BLNK), that become significantly upregulated when exposed to Aβ. Single microglia sequencing confirms that Aβ, not TAU, pathology induces marked transcriptional changes in microglia, including increased proportions of activated microglia. We conclude that genetic risk of AD functionally translates into different microglia pathway responses to Aβ pathology, placing AD genetic risk downstream of the amyloid pathway but upstream of TAU pathology.
Affiliation(s)
- Annerieke Sierksma
  - VIB Center for Brain & Disease Research, Leuven, Belgium
  - Laboratory for the Research of Neurodegenerative Diseases, Department of Neurosciences, Leuven Brain Institute (LBI), KU Leuven (University of Leuven), Leuven, Belgium
- Ashley Lu
  - VIB Center for Brain & Disease Research, Leuven, Belgium
  - Laboratory for the Research of Neurodegenerative Diseases, Department of Neurosciences, Leuven Brain Institute (LBI), KU Leuven (University of Leuven), Leuven, Belgium
- Renzo Mancuso
  - VIB Center for Brain & Disease Research, Leuven, Belgium
  - Laboratory for the Research of Neurodegenerative Diseases, Department of Neurosciences, Leuven Brain Institute (LBI), KU Leuven (University of Leuven), Leuven, Belgium
- Nicola Fattorelli
  - VIB Center for Brain & Disease Research, Leuven, Belgium
  - Laboratory for the Research of Neurodegenerative Diseases, Department of Neurosciences, Leuven Brain Institute (LBI), KU Leuven (University of Leuven), Leuven, Belgium
- Nicola Thrupp
  - VIB Center for Brain & Disease Research, Leuven, Belgium
  - Laboratory for the Research of Neurodegenerative Diseases, Department of Neurosciences, Leuven Brain Institute (LBI), KU Leuven (University of Leuven), Leuven, Belgium
- Evgenia Salta
  - VIB Center for Brain & Disease Research, Leuven, Belgium
  - Laboratory for the Research of Neurodegenerative Diseases, Department of Neurosciences, Leuven Brain Institute (LBI), KU Leuven (University of Leuven), Leuven, Belgium
- Jesus Zoco
  - VIB Center for Brain & Disease Research, Leuven, Belgium
  - Laboratory for the Research of Neurodegenerative Diseases, Department of Neurosciences, Leuven Brain Institute (LBI), KU Leuven (University of Leuven), Leuven, Belgium
- David Blum
  - INSERM, CHU Lille, LabEx DISTALZ, UMR‐S 1172, Alzheimer & Tauopathies, Université Lille, Lille, France
- Luc Buée
  - INSERM, CHU Lille, LabEx DISTALZ, UMR‐S 1172, Alzheimer & Tauopathies, Université Lille, Lille, France
- Bart De Strooper
  - VIB Center for Brain & Disease Research, Leuven, Belgium
  - Laboratory for the Research of Neurodegenerative Diseases, Department of Neurosciences, Leuven Brain Institute (LBI), KU Leuven (University of Leuven), Leuven, Belgium
  - UK Dementia Research Institute, University College London, London, UK
- Mark Fiers
  - VIB Center for Brain & Disease Research, Leuven, Belgium
  - Laboratory for the Research of Neurodegenerative Diseases, Department of Neurosciences, Leuven Brain Institute (LBI), KU Leuven (University of Leuven), Leuven, Belgium
|
30
|
Rahit KMTH, Tarailo-Graovac M. Genetic Modifiers and Rare Mendelian Disease. Genes (Basel) 2020; 11:E239. [PMID: 32106447 PMCID: PMC7140819 DOI: 10.3390/genes11030239] [Citation(s) in RCA: 105] [Impact Index Per Article: 21.0] [Received: 01/24/2020] [Accepted: 02/21/2020] [Indexed: 12/11/2022] Open
Abstract
Despite advances in high-throughput sequencing that have revolutionized the discovery of gene defects in rare Mendelian diseases, there are still gaps in translating individual genome variation to observed phenotypic outcomes. While we continue to improve genomics approaches to identify primary disease-causing variants, it is evident that no genetic variant acts alone. In other words, some other variants in the genome (genetic modifiers) may alleviate (suppress) or exacerbate (enhance) the severity of the disease, resulting in the variability of phenotypic outcomes. Thus, to truly understand the disease, we need to consider how the disease-causing variants interact with the rest of the genome in an individual. Here, we review the current state-of-the-field in the identification of genetic modifiers in rare Mendelian diseases and discuss the potential for future approaches that could bridge the existing gap.
Collapse
Affiliation(s)
- K. M. Tahsin Hassan Rahit
- Departments of Biochemistry, Molecular Biology and Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB T2N 4N1, Canada
- Alberta Children’s Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada
- Maja Tarailo-Graovac
- Departments of Biochemistry, Molecular Biology and Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB T2N 4N1, Canada
- Alberta Children’s Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada
31
Moore JH, Olson RS, Schmitt P, Chen Y, Manduchi E. How Computational Experiments Can Improve Our Understanding of the Genetic Architecture of Common Human Diseases. Artificial Life 2020; 26:23-37. [PMID: 32027528 DOI: 10.1162/artl_a_00308] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 06/10/2023]
Abstract
Susceptibility to common human diseases such as cancer is influenced by many genetic and environmental factors that work together in a complex manner. The state of the art is to perform a genome-wide association study (GWAS) that measures millions of single-nucleotide polymorphisms (SNPs) throughout the genome followed by a one-SNP-at-a-time statistical analysis to detect univariate associations. This approach has identified thousands of genetic risk factors for hundreds of diseases. However, the genetic risk factors detected have very small effect sizes and collectively explain very little of the overall heritability of the disease. Nonetheless, it is assumed that the genetic component of risk is due to many independent risk factors that contribute additively. The fact that many genetic risk factors with small effects can be detected is taken as evidence to support this notion. It is our working hypothesis that the genetic architecture of common diseases is partly driven by non-additive interactions. To test this hypothesis, we developed a heuristic simulation-based method for conducting experiments about the complexity of genetic architecture. We show that a genetic architecture driven by complex interactions is highly consistent with the magnitude and distribution of univariate effects seen in real data. We compare our results with measures of univariate and interaction effects from two large-scale GWASs of sporadic breast cancer and find evidence to support our hypothesis that is consistent with the results of our computational experiment.
Affiliation(s)
- Jason H Moore
- University of Pennsylvania, Institute for Biomedical Informatics, Perelman School of Medicine
- Randal S Olson
- University of Pennsylvania, Institute for Biomedical Informatics, Perelman School of Medicine
- Peter Schmitt
- University of Pennsylvania, Institute for Biomedical Informatics, Perelman School of Medicine
- Yong Chen
- University of Pennsylvania, Institute for Biomedical Informatics, Perelman School of Medicine
- Elisabetta Manduchi
- University of Pennsylvania, Institute for Biomedical Informatics, Perelman School of Medicine
32
Sio YY, Matta SA, Ng YT, Chew FT. Epistasis between phenylethanolamine N-methyltransferase and β2-adrenergic receptor influences extracellular epinephrine level and associates with the susceptibility to allergic asthma. Clin Exp Allergy 2020; 50:352-363. [PMID: 31855300 DOI: 10.1111/cea.13552] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 06/27/2019] [Revised: 12/03/2019] [Accepted: 12/08/2019] [Indexed: 12/14/2022]
Abstract
BACKGROUND: Reduced extracellular epinephrine level is often associated with asthma-related symptoms; however, the correlation between asthma and genetic variants in genes participating in the epinephrine signalling pathway remains unclear. OBJECTIVE: To characterize the functions of single nucleotide polymorphisms (SNPs) in phenylethanolamine N-methyltransferase (PNMT) and β2-adrenergic receptor (ADRB2), and to study the effects, both direct and epistatic, of these SNPs on serum epinephrine level and asthma susceptibility. METHODS: SNP functions were characterized through in vitro luciferase assay. ADRB2 gene expression level in peripheral blood mononuclear cells (PBMCs) was measured by transcriptome sequencing and expression microarray in two separate Asian cohorts (NUS-UTAR, n = 278 and NUS-TA, n = 58). Serum epinephrine level was assessed in a Singapore Chinese cohort (NUS-SH, n = 314) with 155 asthmatic and 159 non-asthmatic subjects. A separate Singapore Chinese cohort (NUS-G, n = 3009) was genotyped to show disease association (direct and epistatic effect) of functional SNPs in PNMT and ADRB2. RESULTS: Reduced serum epinephrine level was associated with increased asthma risk in Singapore Chinese. The minor allele of rs876493 was shown to increase PNMT promoter activity and reduce asthma risk. Multiple SNPs in ADRB2 form a haplotype that was associated with the differential promoter activity of this gene. In this haplotype, rs11168070 was associated directly with ADRB2 expression in PBMCs. The minor alleles of rs876493 and rs11168070 contribute synergistically to reduce asthma risk and increase serum epinephrine level. CONCLUSION AND CLINICAL RELEVANCE: Epistatic interaction between genetic variants from PNMT (rs876493) and ADRB2 (rs11168070) is associated with serum epinephrine level and the susceptibility to asthma. Our findings improve the current understanding of the genetic basis of this disease, and the genotypic states of these SNPs may serve as potential biomarkers to predict susceptibility to the disease.
Affiliation(s)
- Yang Yie Sio
- Department of Biological Sciences, National University of Singapore, Singapore City, Singapore
- Sri Anusha Matta
- Department of Biological Sciences, National University of Singapore, Singapore City, Singapore
- Yu Ting Ng
- Department of Biological Sciences, National University of Singapore, Singapore City, Singapore
- Fook Tim Chew
- Department of Biological Sciences, National University of Singapore, Singapore City, Singapore
33
Selecting Closely-Linked SNPs Based on Local Epistatic Effects for Haplotype Construction Improves Power of Association Mapping. G3: Genes, Genomes, Genetics 2019; 9:4115-4126. [PMID: 31604824 PMCID: PMC6893203 DOI: 10.1534/g3.119.400451] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 01/14/2023]
Abstract
Genome-wide association studies (GWAS) have gained central importance for the identification of candidate loci underlying complex traits. Single nucleotide polymorphism (SNP) markers are mostly used as genetic variants for the analysis of genotype-phenotype associations in populations, but closely linked SNPs that are grouped into haplotypes are also exploited. The benefit of haplotype-based GWAS approaches vs. SNP-based approaches is still under debate because SNPs in high linkage disequilibrium provide redundant information. To overcome some constraints of the commonly used haplotype-based GWAS, in which only consecutive SNPs are considered for haplotype construction, we propose a new method called functional haplotype-based GWAS (FH GWAS). FH GWAS combines SNPs into haplotypes based on the additive and epistatic effects among SNPs. Such haplotypes were termed functional haplotypes (FH). As shown by simulation studies, the FH GWAS approach clearly outperformed the SNP-based approach unless the minor allele frequency of the SNPs making up the haplotypes is low and the linkage disequilibrium between them is high. Applying FH GWAS to the trait flowering time in a large Arabidopsis thaliana population with whole-genome sequencing data revealed its potential empirically. FH GWAS identified all candidate regions that were detected in SNP-based and two other haplotype-based GWAS approaches. In addition, a novel region on chromosome 4 was detected solely by FH GWAS. Thus, the results of both our simulation and empirical studies demonstrate that FH GWAS is a promising method and superior to the SNP-based approach even if almost complete genotype information is available.
34
Chattopadhyay A, Lu TP. Gene-gene interaction: the curse of dimensionality. Annals of Translational Medicine 2019; 7:813. [PMID: 32042829 DOI: 10.21037/atm.2019.12.87] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Indexed: 12/11/2022]
Abstract
Genetic variants identified from genome-wide association studies frequently show only modest effects on disease risk, leading to the "missing heritability" problem. One avenue to account for part of this "missingness" is to evaluate gene-gene interactions (epistasis), thereby elucidating their effect on complex diseases. This can potentially help with identifying gene functions, pathways, and drug targets. However, the exhaustive evaluation of all possible genetic interactions among millions of single nucleotide polymorphisms (SNPs) raises several issues, otherwise known as the "curse of dimensionality". The dimensionality involved in the epistatic analysis of such an exponentially growing number of SNP combinations diminishes the usefulness of traditional, parametric statistical methods. The immense popularity of multifactor dimensionality reduction (MDR), a non-parametric method proposed in 2001 that classifies multi-dimensional genotypes into a one-dimensional binary variable, led to the emergence of a fast-growing collection of methods based on the MDR approach. Moreover, machine-learning (ML) methods such as random forests and neural networks (NNs), deep-learning (DL) approaches, and hybrid approaches have also been applied extensively in recent years to tackle the dimensionality issue associated with whole-genome gene-gene interaction studies. However, exhaustive searching in MDR-based approaches and variable selection in ML methods still pose the risk of missing relevant SNPs. Furthermore, interpretability issues are a major hindrance for DL methods. To minimize this loss of information, Python-based tools such as PySpark can potentially take advantage of distributed computing resources in the cloud to bring back smaller subsets of data for further local analysis. Parallel computing can be a powerful resource that stands to fight this "curse". PySpark supports all standard Python libraries and C extensions, thus making it convenient to write code that delivers dramatic improvements in processing speed for extraordinarily large sets of data.
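The MDR idea mentioned in this abstract, pooling multi-locus genotype cells into a single high-risk/low-risk variable, can be pictured with a minimal sketch. This is an illustration under stated assumptions, not the published implementation: it handles SNP pairs only, skips the cross-validation and balanced-accuracy steps that MDR normally uses, and all function names and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def mdr_high_risk_cells(genotypes, labels, snp_pair):
    """Return the two-locus genotype cells labelled 'high risk'.

    A cell is high risk when its case:control ratio exceeds the
    overall case:control ratio of the data set (assumed > 0 controls).
    """
    i, j = snp_pair
    cases, controls = Counter(), Counter()
    for row, y in zip(genotypes, labels):
        cell = (row[i], row[j])
        if y == 1:
            cases[cell] += 1
        else:
            controls[cell] += 1
    threshold = sum(cases.values()) / sum(controls.values())
    return {
        cell
        for cell in set(cases) | set(controls)
        if cases[cell] > threshold * controls[cell]
    }


def mdr_accuracy(genotypes, labels, snp_pair):
    """Training accuracy of the pooled high/low-risk variable."""
    high = mdr_high_risk_cells(genotypes, labels, snp_pair)
    i, j = snp_pair
    correct = sum(
        ((row[i], row[j]) in high) == (y == 1)
        for row, y in zip(genotypes, labels)
    )
    return correct / len(labels)


def best_pair(genotypes, labels):
    """Exhaustively score every SNP pair and return the best one."""
    n_snps = len(genotypes[0])
    return max(
        combinations(range(n_snps), 2),
        key=lambda pair: mdr_accuracy(genotypes, labels, pair),
    )


# Hypothetical toy data: the label is the XOR of SNP 0 and SNP 1
# (a purely epistatic signal); SNP 2 is noise.
toy_genotypes = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
toy_labels = [a ^ b for a, b, _ in toy_genotypes]

print(best_pair(toy_genotypes, toy_labels))
```

The exhaustive loop in `best_pair` is the part that scales poorly: the number of pairs grows quadratically with the number of SNPs, and higher-order combinations grow faster still, which is the dimensionality problem the abstract describes.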
Affiliation(s)
- Amrita Chattopadhyay
- Institute of Epidemiology and Preventive Medicine, Department of Public Health, National Taiwan University, Taipei
- Tzu-Pin Lu
- Institute of Epidemiology and Preventive Medicine, Department of Public Health, National Taiwan University, Taipei
35
Cao X, Yu G, Ren W, Guo M, Wang J. DualWMDR: Detecting epistatic interaction with dual screening and multifactor dimensionality reduction. Hum Mutat 2019; 41:719-734. [DOI: 10.1002/humu.23951] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 04/19/2019] [Revised: 09/10/2019] [Accepted: 11/07/2019] [Indexed: 12/14/2022]
Affiliation(s)
- Xia Cao
- College of Computer and Information Science, Southwest University, Chongqing, China
- Guoxian Yu
- College of Computer and Information Science, Southwest University, Chongqing, China
- Wei Ren
- College of Computer and Information Science, Southwest University, Chongqing, China
- Maozu Guo
- School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing, China
- Beijing Key Laboratory of Intelligent Processing for Building Big Data, Beijing, China
- Jun Wang
- College of Computer and Information Science, Southwest University, Chongqing, China
36
Alzoubi D, Desouki AA, Lercher MJ. Flux balance analysis with or without molecular crowding fails to predict two thirds of experimentally observed epistasis in yeast. Sci Rep 2019; 9:11837. [PMID: 31413270 PMCID: PMC6694147 DOI: 10.1038/s41598-019-47935-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 11/01/2018] [Accepted: 07/08/2019] [Indexed: 12/15/2022] Open
Abstract
Computational predictions of double gene knockout effects by flux balance analysis (FBA) have been used to characterize genome-wide patterns of epistasis in microorganisms. However, it is unclear how in silico predictions are related to in vivo epistasis, as FBA predicted only a minority of experimentally observed genetic interactions between non-essential metabolic genes in yeast. Here, we perform a detailed comparison of yeast experimental epistasis data to predictions generated with different constraint-based metabolic modeling algorithms. The tested methods comprise standard FBA; a variant of MOMA, which was specifically designed to predict fitness effects of non-essential gene knockouts; and two alternative implementations of FBA with macro-molecular crowding, which account approximately for enzyme kinetics. The number of interactions uniquely predicted by one method is typically larger than its overlap with any alternative method. Only 20% of negative and 10% of positive interactions jointly predicted by all methods are confirmed by the experimental data; almost all unique predictions appear to be false. More than two thirds of epistatic interactions are undetectable by any of the tested methods. The low prediction accuracies indicate that the physiology of yeast double metabolic gene knockouts is dominated by processes not captured by current constraint-based analysis methods.
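To make the terms "negative" and "positive" interaction concrete, the sketch below scores a double knockout against a multiplicative null model, a common convention in which the expected double-mutant fitness is the product of the two single-mutant fitnesses. The abstract does not state which definition or cutoff the authors used, so the model, the tolerance `tol`, and the function names are all assumptions made for illustration.

```python
def epistasis(w_a, w_b, w_ab):
    """Deviation of the double-knockout fitness from the multiplicative
    expectation (fitness values are relative to wild type = 1.0)."""
    return w_ab - w_a * w_b


def classify_interaction(w_a, w_b, w_ab, tol=0.05):
    """Label a gene pair; `tol` is a hypothetical noise tolerance."""
    eps = epistasis(w_a, w_b, w_ab)
    if eps > tol:
        return "positive"   # double mutant fitter than expected
    if eps < -tol:
        return "negative"   # double mutant sicker than expected
    return "none"


# Hypothetical fitness values, not data from the study.
print(classify_interaction(0.8, 0.9, 0.72))  # matches expectation
print(classify_interaction(0.8, 0.9, 0.40))  # aggravating pair
print(classify_interaction(0.5, 0.5, 0.50))  # buffering pair
```

The same scoring can be applied to fitness values predicted in silico and to those measured in vivo, which is the kind of comparison the abstract reports.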
Affiliation(s)
- Deya Alzoubi
- Institute for Computer Science and Department of Biology, Heinrich Heine University, Universitätsstraße 1, Düsseldorf, D-40221, Germany
- Abdelmoneim Amer Desouki
- Institute for Computer Science and Department of Biology, Heinrich Heine University, Universitätsstraße 1, Düsseldorf, D-40221, Germany
- Martin J Lercher
- Institute for Computer Science and Department of Biology, Heinrich Heine University, Universitätsstraße 1, Düsseldorf, D-40221, Germany
37
Joiret M, Mahachie John JM, Gusareva ES, Van Steen K. Confounding of linkage disequilibrium patterns in large scale DNA based gene-gene interaction studies. BioData Min 2019; 12:11. [PMID: 31198442 PMCID: PMC6558841 DOI: 10.1186/s13040-019-0199-7] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 01/17/2019] [Accepted: 05/09/2019] [Indexed: 01/07/2023] Open
Abstract
Background: In Genome-Wide Association Studies (GWAS), the concept of linkage disequilibrium is important as it allows identifying genetic markers that tag the actual causal variants. In Genome-Wide Association Interaction Studies (GWAIS), similar principles hold for pairs of causal variants. However, Linkage Disequilibrium (LD) may also interfere with the detection of genuine epistasis signals in that there may be complete confounding between Gametic Phase Disequilibrium (GPD) and interaction. GPD may involve unlinked genetic markers, even residing on different chromosomes. Often GPD is eliminated in GWAIS, via feature selection schemes or so-called pruning algorithms, to obtain unconfounded epistasis results. However, little is known about the optimal degree of GPD/LD-pruning that gives a balance between false positive control and sufficient power of epistasis detection statistics. Here, we focus on Model-Based Multifactor Dimensionality Reduction as one large-scale epistasis detection tool. Its performance has been thoroughly investigated in terms of false positive control and power, under a variety of scenarios involving different trait types and study designs, as well as error-free and noisy data, but never with respect to multicollinear SNPs. Results: Using real-life human LD patterns from a homogeneous subpopulation of British ancestry, we investigated the impact of LD-pruning on the statistical sensitivity of MB-MDR. We considered three different non-fully penetrant epistasis models with varying effect sizes. There is a clear advantage in pre-analysis pruning using sliding windows at r² of 0.75 or lower, but using a threshold of 0.20 has a detrimental effect on the power to detect a functional interactive SNP pair (power < 25%). Signal sensitivity, directly using LD-block information to determine whether an epistasis signal is present or not, benefits from LD-pruning as well (average power across scenarios: 87%), but is largely hampered by functional loci residing at the boundaries of an LD-block. Conclusions: Our results confirm that LD patterns and the position of causal variants in LD blocks do have an impact on epistasis detection, and that pruning strategies and LD-blocks definitions combined need careful attention, if we wish to maximize the power of large-scale epistasis screenings.
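The sliding-window pruning discussed in this abstract can be pictured with a minimal sketch. It assumes genotypes coded as 0/1/2 allele counts, uses the squared Pearson correlation of those counts as r², and applies a simple greedy rule. The function names, the window expressed as a number of SNPs, and the toy data are hypothetical; this is not the procedure of MB-MDR or of any specific pruning tool.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return sxy * sxy / (sxx * syy)


def ld_prune(snps, window=50, threshold=0.75):
    """Greedy pruning: keep a SNP only if its r² with every
    already-kept SNP inside the window is at or below `threshold`."""
    kept = []
    for idx, genotype in enumerate(snps):
        nearby = [k for k in kept if idx - k < window]
        if all(r_squared(genotype, snps[k]) <= threshold for k in nearby):
            kept.append(idx)
    return kept


# Hypothetical toy data: one row per SNP, one column per individual.
toy_snps = [
    [0, 1, 2, 1, 0, 2],
    [0, 1, 2, 1, 0, 2],  # duplicate of SNP 0 (r² = 1)
    [0, 1, 2, 1, 1, 2],  # strongly correlated with SNP 0
    [2, 0, 1, 0, 2, 1],  # weakly correlated with SNP 0 (r² = 0.25)
]

print(ld_prune(toy_snps, threshold=0.75))
print(ld_prune(toy_snps, threshold=0.20))
```

Lowering the threshold discards more SNPs, so a functional SNP is more likely to be removed along with the redundant ones. That trade-off between controlling confounding and retaining causal variants is what the abstract evaluates.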
Affiliation(s)
- Marc Joiret
- BIO3, GIGA-R Medical Genomics, Avenue de l'Hôpital 1-B34-CHU, Liège, 4000 Belgium; Biomechanics Research Unit, GIGA-R in-silico medicine, Liège, Avenue de l'Hôpital 1-B34-CHU, Liège, 4000 Belgium
- Elena S Gusareva
- BIO3, GIGA-R Medical Genomics, Avenue de l'Hôpital 1-B34-CHU, Liège, 4000 Belgium
- Kristel Van Steen
- BIO3, GIGA-R Medical Genomics, Avenue de l'Hôpital 1-B34-CHU, Liège, 4000 Belgium; WELBIO researcher, Avenue de l'Hôpital 1-B34-CHU, Liège, 4000 Belgium
38
Hill GE, Havird JC, Sloan DB, Burton RS, Greening C, Dowling DK. Assessing the fitness consequences of mitonuclear interactions in natural populations. Biol Rev Camb Philos Soc 2019; 94:1089-1104. [PMID: 30588726 PMCID: PMC6613652 DOI: 10.1111/brv.12493] [Citation(s) in RCA: 74] [Impact Index Per Article: 12.3] [Received: 05/12/2018] [Revised: 11/27/2018] [Accepted: 11/30/2018] [Indexed: 12/22/2022]
Abstract
Metazoans exist only with a continuous and rich supply of chemical energy from oxidative phosphorylation in mitochondria. The oxidative phosphorylation machinery that mediates energy conservation is encoded by both mitochondrial and nuclear genes, and hence the products of these two genomes must interact closely to achieve coordinated function of core respiratory processes. It follows that selection for efficient respiration will lead to selection for compatible combinations of mitochondrial and nuclear genotypes, and this should facilitate coadaptation between mitochondrial and nuclear genomes (mitonuclear coadaptation). Herein, we outline the modes by which mitochondrial and nuclear genomes may coevolve within natural populations, and we discuss the implications of mitonuclear coadaptation for diverse fields of study in the biological sciences. We identify five themes in the study of mitonuclear interactions that provide a roadmap for both ecological and biomedical studies seeking to measure the contribution of intergenomic coadaptation to the evolution of natural populations. We also explore the wider implications of the fitness consequences of mitonuclear interactions, focusing on central debates within the fields of ecology and biomedicine.
Affiliation(s)
- Geoffrey E. Hill
- Department of Biological Sciences, Auburn University, United States of America
- Justin C. Havird
- Department of Biology, Colorado State University, United States of America
- Daniel B. Sloan
- Department of Biology, Colorado State University, United States of America
- Ronald S. Burton
- Scripps Institution of Oceanography, University of California, San Diego, United States of America
- Chris Greening
- School of Biological Sciences, Monash University, Clayton, Victoria 3800, Australia
- Damian K. Dowling
- School of Biological Sciences, Monash University, Clayton, Victoria 3800, Australia
39
Van Steen K, Moore JH. How to increase our belief in discovered statistical interactions via large-scale association studies? Hum Genet 2019; 138:293-305. [PMID: 30840129 PMCID: PMC6483943 DOI: 10.1007/s00439-019-01987-w] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 07/26/2018] [Accepted: 02/20/2019] [Indexed: 12/31/2022]
Abstract
The understanding that differences in biological epistasis may impact disease risk, diagnosis, or disease management stands in wide contrast to the unavailability of widely accepted large-scale epistasis analysis protocols. Several choices in the analysis workflow will impact false-positive and false-negative rates. One of these choices relates to the exploitation of particular modelling or testing strategies. The strengths and limitations of these need to be well understood, as well as the contexts in which these hold. This will contribute to determining the potentially complementary value of epistasis detection workflows and is expected to increase replication success with biological relevance. In this contribution, we take a recently introduced regression-based epistasis detection tool as a leading example to review the key elements that need to be considered to fully appreciate the value of analytical epistasis detection performance assessments. We point out unresolved hurdles and give our perspectives towards overcoming these.
Affiliation(s)
- K Van Steen
- WELBIO, GIGA-R Medical Genomics-BIO3, University of Liège, Liège, Belgium
- Department of Human Genetics, University of Leuven, Leuven, Belgium
- J H Moore
- Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, USA
40
Hou TT, Lin F, Bai S, Cleves MA, Xu HM, Lou XY. Generalized multifactor dimensionality reduction approaches to identification of genetic interactions underlying ordinal traits. Genet Epidemiol 2018; 43:24-36. [PMID: 30387901 DOI: 10.1002/gepi.22169] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 05/18/2018] [Revised: 08/31/2018] [Accepted: 09/21/2018] [Indexed: 12/11/2022]
Abstract
The manifestation of complex traits is influenced by gene-gene and gene-environment interactions, and the identification of multifactor interactions is an important but challenging undertaking for genetic studies. Many complex phenotypes such as disease severity are measured on an ordinal scale with more than two categories. A proportional odds model can improve statistical power for these outcomes, when compared to a logit model either collapsing the categories into two mutually exclusive groups or limiting the analysis to pairs of categories. In this study, we propose a proportional odds model-based generalized multifactor dimensionality reduction (GMDR) method for detection of interactions underlying polytomous ordinal phenotypes. Computer simulations demonstrated that this new GMDR method has a higher power and more accurate predictive ability than the GMDR methods based on a logit model and a multinomial logit model. We applied this new method to the genetic analysis of low-density lipoprotein (LDL) cholesterol, a causal risk factor for coronary artery disease, in the Multi-Ethnic Study of Atherosclerosis, and identified a significant joint action of the CELSR2, SERPINA12, HPGD, and APOB genes. This finding provides new information to advance the limited knowledge about genetic regulation and gene interactions in metabolic pathways of LDL cholesterol. In conclusion, the proportional odds model-based GMDR is a useful tool that can boost statistical power and prediction accuracy in studying multifactor interactions underlying ordinal traits.
Affiliation(s)
- Ting-Ting Hou
- Biostatistics Program, Department of Pediatrics, University of Arkansas for Medical Sciences, Little Rock, Arkansas; Arkansas Children's Research Institute, Little Rock, Arkansas; Institute of Bioinformatics and Institute of Crop Science, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China
- Feng Lin
- Institute of Bioinformatics and Institute of Crop Science, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China
- Shasha Bai
- Biostatistics Program, Department of Pediatrics, University of Arkansas for Medical Sciences, Little Rock, Arkansas; Arkansas Children's Research Institute, Little Rock, Arkansas
- Mario A Cleves
- Biostatistics Program, Department of Pediatrics, University of Arkansas for Medical Sciences, Little Rock, Arkansas; Arkansas Children's Research Institute, Little Rock, Arkansas
- Hai-Ming Xu
- Biostatistics Program, Department of Pediatrics, University of Arkansas for Medical Sciences, Little Rock, Arkansas; Arkansas Children's Research Institute, Little Rock, Arkansas; Institute of Bioinformatics and Institute of Crop Science, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China
- Xiang-Yang Lou
- Biostatistics Program, Department of Pediatrics, University of Arkansas for Medical Sciences, Little Rock, Arkansas; Arkansas Children's Research Institute, Little Rock, Arkansas; Arkansas Children's Nutrition Center, Little Rock, Arkansas
41
Campbell RF, McGrath PT, Paaby AB. Analysis of Epistasis in Natural Traits Using Model Organisms. Trends Genet 2018; 34:883-898. [PMID: 30166071 PMCID: PMC6541385 DOI: 10.1016/j.tig.2018.08.002] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 01/17/2018] [Revised: 06/06/2018] [Accepted: 08/03/2018] [Indexed: 12/16/2022]
Abstract
The ability to detect and understand epistasis in natural populations is important for understanding how biological traits are influenced by genetic variation. However, identification and characterization of epistasis in natural populations remains difficult due to statistical issues that arise as a result of multiple comparisons, and the fact that most genetic variants segregate at low allele frequencies. In this review, we discuss how model organisms may be used to manipulate genotypic combinations to power the detection of epistasis as well as test interactions between specific genes. Findings from a number of species indicate that statistical epistasis is pervasive between natural genetic variants. However, the properties of experimental systems that enable analysis of epistasis also constrain extrapolation of these results back into natural populations.
Affiliation(s)
- Richard F Campbell
- Department of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, 30332 USA
- Patrick T McGrath
- Department of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, 30332 USA; Department of Physics, Georgia Institute of Technology, Atlanta, GA, 30332 USA
- Annalise B Paaby
- Department of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, 30332 USA
42
Urbanowicz RJ, Olson RS, Schmitt P, Meeker M, Moore JH. Benchmarking relief-based feature selection methods for bioinformatics data mining. J Biomed Inform 2018; 85:168-188. [PMID: 30030120 PMCID: PMC6299838 DOI: 10.1016/j.jbi.2018.07.015] [Citation(s) in RCA: 96] [Impact Index Per Article: 13.7] [Received: 01/15/2018] [Revised: 06/30/2018] [Accepted: 07/14/2018] [Indexed: 11/23/2022]
Abstract
Modern biomedical data mining requires feature selection methods that can (1) be applied to large scale feature spaces (e.g. 'omics' data), (2) function in noisy problems, (3) detect complex patterns of association (e.g. gene-gene interactions), (4) be flexibly adapted to various problem domains and data types (e.g. genetic variants, gene expression, and clinical data) and (5) remain computationally tractable. To that end, this work examines a set of filter-style feature selection algorithms inspired by the 'Relief' algorithm, i.e. Relief-Based algorithms (RBAs). We implement and expand these RBAs in an open source framework called ReBATE (Relief-Based Algorithm Training Environment). We apply a comprehensive genetic simulation study comparing existing RBAs, a proposed RBA called MultiSURF, and other established feature selection methods, over a variety of problems. The results of this study (1) support the assertion that RBAs are particularly flexible, efficient, and powerful feature selection methods that differentiate relevant features having univariate, multivariate, epistatic, or heterogeneous associations, (2) confirm the efficacy of expansions for classification vs. regression, discrete vs. continuous features, missing data, multiple classes, or class imbalance, (3) identify previously unknown limitations of specific RBAs, and (4) suggest that while MultiSURF* performs best for explicitly identifying pure 2-way interactions, MultiSURF yields the most reliable feature selection performance across a wide range of problem types.
Affiliation(s)
- Ryan J Urbanowicz
- Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA 19104, USA
- Randal S Olson
- Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA 19104, USA
- Peter Schmitt
- Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA 19104, USA
- Jason H Moore
- Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA 19104, USA
43
Manduchi E, Williams SM, Chesi A, Johnson ME, Wells AD, Grant SFA, Moore JH. Leveraging epigenomics and contactomics data to investigate SNP pairs in GWAS. Hum Genet 2018; 137:413-425. [PMID: 29797095 PMCID: PMC5996751 DOI: 10.1007/s00439-018-1893-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 02/06/2018] [Accepted: 05/20/2018] [Indexed: 12/29/2022]
Abstract
Although Genome Wide Association Studies (GWAS) have led to many valuable insights into the genetic bases of common diseases over the past decade, the issue of missing heritability has surfaced, as the discovered main effect genetic variants found to date do not account for much of a trait's predicted genetic component. We present a workflow, integrating epigenomics and topologically associating domain data, aimed at discovering trait-associated SNP pairs from GWAS where neither SNP achieved independent genome-wide significance. Each analyzed SNP pair consists of one SNP in a putative active enhancer and another SNP in a putative physically interacting gene promoter in a trait-relevant tissue. As a proof-of-principle case study, we used this approach to identify focused collections of SNP pairs that we analyzed in three independent Type 2 diabetes (T2D) GWAS. This approach led us to discover 35 significant SNP pairs, encompassing both novel signals and signals for which we have found orthogonal support from other sources. Nine of these pairs are consistent with eQTL results, two are consistent with our own capture C experiments, and seven involve signals supported by recent T2D literature.
Affiliation(s)
- Elisabetta Manduchi
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, PA, USA.
- Division of Human Genetics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA.
- Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA.
| | - Scott M Williams
- Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH, USA
| | - Alessandra Chesi
- Division of Human Genetics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Matthew E Johnson
- Division of Human Genetics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Andrew D Wells
- Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA
| | - Struan F A Grant
- Division of Human Genetics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Genetics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA
| | - Jason H Moore
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, PA, USA.
| |
|
44
|
Sharbrough J, Havird JC, Noe GR, Warren JM, Sloan DB. The Mitonuclear Dimension of Neanderthal and Denisovan Ancestry in Modern Human Genomes. Genome Biol Evol 2018; 9:1567-1581. [PMID: 28854627 PMCID: PMC5509035 DOI: 10.1093/gbe/evx114] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Accepted: 06/21/2017] [Indexed: 12/15/2022]
Abstract
Some human populations interbred with Neanderthals and Denisovans, resulting in substantial contributions to modern-human genomes. Therefore, it is now possible to use genomic data to investigate mechanisms that shaped historical gene flow between humans and our closest hominin relatives. More generally, in eukaryotes, mitonuclear interactions have been argued to play a disproportionate role in generating reproductive isolation. There is no evidence of mtDNA introgression into modern human populations, which means that all introgressed nuclear alleles from archaic hominins must function on a modern-human mitochondrial background. Therefore, mitonuclear interactions are also potentially relevant to hominin evolution. We performed a detailed accounting of mtDNA divergence among hominin lineages and used population-genomic data to test the hypothesis that mitonuclear incompatibilities have preferentially restricted the introgression of nuclear genes with mitochondrial functions. We found a small but significant underrepresentation of introgressed Neanderthal alleles at such nuclear loci. Structural analyses of mitochondrial enzyme complexes revealed that these effects are unlikely to be mediated by physically interacting sites in mitochondrial and nuclear gene products. We did not detect any underrepresentation of introgressed Denisovan alleles at mitochondrial-targeted loci, but this may reflect reduced power because locus-specific estimates of Denisovan introgression are more conservative. Overall, we conclude that genes involved in mitochondrial function may have been subject to distinct selection pressures during the history of introgression from archaic hominins but that mitonuclear incompatibilities have had, at most, a small role in shaping genome-wide introgression patterns, perhaps because of limited functional divergence in mtDNA and interacting nuclear genes.
Affiliation(s)
- Joel Sharbrough
- Department of Biology, Colorado State University, Fort Collins, CO
| | - Justin C Havird
- Department of Biology, Colorado State University, Fort Collins, CO
| | - Gregory R Noe
- Department of Biology, Colorado State University, Fort Collins, CO
| | - Jessica M Warren
- Department of Biology, Colorado State University, Fort Collins, CO
| | - Daniel B Sloan
- Department of Biology, Colorado State University, Fort Collins, CO
| |
|
45
|
Fang L, Tang BS, Fan K, Wan CM, Yan XX, Guo JF. Alzheimer's disease susceptibility genes modify the risk of Parkinson disease and Parkinson's disease-associated cognitive impairment. Neurosci Lett 2018; 677:55-59. [PMID: 29698690 DOI: 10.1016/j.neulet.2018.04.042] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 02/11/2018] [Revised: 04/19/2018] [Accepted: 04/22/2018] [Indexed: 12/16/2022]
Abstract
The pathogenic mechanism underlying Parkinson's disease (PD) and PD-cognitive impairment (CI) remains elusive. Its potential link to the risk factors in Alzheimer's disease (AD) is unclear. In this study, we analyzed 16 AD-associated single nucleotide polymorphisms (SNPs) in twelve genes in a Chinese cohort of 450 PD cases and 449 controls. Among our 298 cases clinically evaluated for CI, 113 cases did not show CI signs (PD-NC), 86 cases had mild cognitive impairment (PD-MCI) and 99 cases had dementia (PD-D). We found that the APOE ε4 allele is associated with a higher risk for PD-D. Gene-gene interaction analysis revealed three significant gene-gene interactions (BDNF and CLU, APOE and CR1, and DYRK1A and CD2AP) that increase the risk for PD. Because these SNPs are known genetic risk factors for AD, their contribution to PD and PD-D shown in this study suggests that PD/PD-D and AD may share convergent pathways in their pathogenesis through gene-gene interactions.
Affiliation(s)
- Lu Fang
- Beijing Institute for Brain Disorders, Center for Brain Disorders Research, Capital Medical University, Beijing 100069, China
| | - Bei-Sha Tang
- Beijing Institute for Brain Disorders, Center for Brain Disorders Research, Capital Medical University, Beijing 100069, China; Department of Neurology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China; Laboratory of Medical Genetics, Central South University Changsha, Hunan 410078, China; National Clinical Research Center for Geriatric Disorders, Changsha, Hunan 410078, China; Key Laboratory of Hunan Province in Neurodegenerative Disorders, Central South University, Changsha, Hunan 410008, China; Collaborative Innovation Center for Brain Science, Shanghai 200032, China; Collaborative Innovation Center for Genetics and Development, Shanghai 200433, China
| | - Kuan Fan
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
| | - Chang-Min Wan
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
| | - Xin-Xiang Yan
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China; National Clinical Research Center for Geriatric Disorders, Changsha, Hunan 410078, China; Key Laboratory of Hunan Province in Neurodegenerative Disorders, Central South University, Changsha, Hunan 410008, China
| | - Ji-Feng Guo
- Beijing Institute for Brain Disorders, Center for Brain Disorders Research, Capital Medical University, Beijing 100069, China; Department of Neurology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China; Laboratory of Medical Genetics, Central South University Changsha, Hunan 410078, China; National Clinical Research Center for Geriatric Disorders, Changsha, Hunan 410078, China; Key Laboratory of Hunan Province in Neurodegenerative Disorders, Central South University, Changsha, Hunan 410008, China.
| |
|
46
|
Piette ER, Moore JH. Improving machine learning reproducibility in genetic association studies with proportional instance cross validation (PICV). BioData Min 2018; 11:6. [PMID: 29713384 PMCID: PMC5907739 DOI: 10.1186/s13040-018-0167-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 10/09/2017] [Accepted: 04/03/2018] [Indexed: 11/10/2022]
Abstract
Background: Machine learning methods and conventions are increasingly employed for the analysis of large, complex biomedical data sets, including genome-wide association studies (GWAS). Reproducibility of machine learning analyses of GWAS can be hampered by biological and statistical factors, particularly so for the investigation of non-additive genetic interactions. Application of traditional cross validation to a GWAS data set may result in poor consistency between the training and testing data set splits due to an imbalance of the interaction genotypes relative to the data as a whole. We propose a new cross validation method, proportional instance cross validation (PICV), that preserves the original distribution of an independent variable when splitting the data set into training and testing partitions. Results: We apply PICV to simulated GWAS data with epistatic interactions of varying minor allele frequencies and prevalences and compare performance to that of a traditional cross validation procedure in which individuals are randomly allocated to training and testing partitions. Sensitivity and positive predictive value are significantly improved across all tested scenarios for PICV compared to traditional cross validation. We also apply PICV to GWAS data from a study of primary open-angle glaucoma to investigate a previously-reported interaction, which fails to significantly replicate; PICV however improves the consistency of testing and training results. Conclusions: Application of traditional machine learning procedures to biomedical data may require modifications to better suit intrinsic characteristics of the data, such as the potential for highly imbalanced genotype distributions in the case of epistasis detection. The reproducibility of genetic interaction findings can be improved by considering this variable imbalance in cross validation implementation, such as with PICV. This approach may be extended to problems in other domains in which imbalanced variable distributions are a concern.
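The idea of preserving an independent variable's distribution across partitions can be illustrated with a minimal sketch. This is not the authors' PICV implementation, which the abstract does not specify; the function name `proportional_folds` and the choice to stratify on the joint genotype of two SNPs are assumptions made for illustration only.

```python
import random
from collections import defaultdict


def proportional_folds(strata, k, seed=0):
    """Hypothetical helper: split sample indices into k folds so that each
    distinct value in `strata` is spread as evenly as possible across folds."""
    rng = random.Random(seed)

    # Group sample indices by the value of the variable to be preserved.
    by_value = defaultdict(list)
    for idx, value in enumerate(strata):
        by_value[value].append(idx)

    folds = [[] for _ in range(k)]
    offset = 0
    for value in sorted(by_value):
        members = by_value[value]
        rng.shuffle(members)  # random order within a stratum
        # Deal members round-robin; the running offset keeps small strata
        # from always landing in the first folds.
        for j, idx in enumerate(members):
            folds[(offset + j) % k].append(idx)
        offset += len(members)
    return folds


# Assumed example: joint genotypes (SNP A, SNP B) coded as minor-allele counts.
joint_genotypes = [(0, 0)] * 10 + [(1, 2)] * 5 + [(2, 2)] * 5
folds = proportional_folds(joint_genotypes, k=5)
```

Each fold can then serve once as the testing partition while the remaining folds form the training partition, so rare genotype combinations appear on both sides of every split whenever enough instances exist.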
Affiliation(s)
- Elizabeth R Piette
- Graduate Group in Genomics and Computational Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Jason H Moore
- Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| |
|
47
|
Ritchie MD, Van Steen K. The search for gene-gene interactions in genome-wide association studies: challenges in abundance of methods, practical considerations, and biological interpretation. ANNALS OF TRANSLATIONAL MEDICINE 2018; 6:157. [PMID: 29862246 DOI: 10.21037/atm.2018.04.05] [Citation(s) in RCA: 49] [Impact Index Per Article: 7.0] [Indexed: 12/18/2022]
Abstract
One of the primary goals in this era of precision medicine is to understand the biology of human diseases and their treatment, such that each individual patient receives the best possible treatment for their disease based on their genetic and environmental exposures. One way to work towards achieving this goal is to identify the environmental exposures and genetic variants that are relevant to each disease in question, as well as the complex interplay between genes and environment. Genome-wide association studies (GWAS) have allowed for a greater understanding of the genetic component of many complex traits. However, these genetic effects are largely small and thus our ability to use these GWAS findings for precision medicine is limited. As more and more GWAS have been performed, rather than focusing only on common single nucleotide polymorphisms (SNPs) and additive genetic models, many researchers have begun to explore alternative heritable components of complex traits including rare variants, structural variants, epigenetics, and genetic interactions. While genetic interactions are a plausible reality that could explain some of the heritability that has not yet been identified, especially when one considers the identification of genetic interactions in model organisms as well as our understanding of biological complexity, there are still significant challenges and considerations in identifying these genetic interactions. Broadly, these can be summarized in three categories: abundance of methods, practical considerations, and biological interpretation. In this review, we will discuss these important elements in the search for genetic interactions along with some potential solutions. While genetic interactions are theoretically understood to be important for complex human disease, the body of evidence is still building to support this component of the underlying genetic architecture of complex human traits. Our hope is that more sophisticated modeling approaches and more robust computational techniques will enable the community to identify these important genetic interactions and improve our ability to implement precision medicine in the future.
Affiliation(s)
- Marylyn D Ritchie
- Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA
| | - Kristel Van Steen
- WELBIO, GIGA-R Medical Genomics Unit - BIO3, University of Liège, Liège, Belgium; Department of Human Genetics, University of Leuven, Leuven, Belgium
| |
|
48
|
Uppu S, Krishna A, Gopalan RP. A Review on Methods for Detecting SNP Interactions in High-Dimensional Genomic Data. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2018; 15:599-612. [PMID: 28060710 DOI: 10.1109/tcbb.2016.2635125] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Indexed: 06/06/2023]
Abstract
In this era of genome-wide association studies (GWAS), the quest to understand the genetic architecture of complex diseases is advancing more rapidly than ever before. The development of high throughput genotyping and next generation sequencing technologies enables genetic epidemiological analysis of large scale data. These advances have led to the identification of a number of single nucleotide polymorphisms (SNPs) responsible for disease susceptibility. The interactions between SNPs associated with complex diseases are increasingly being explored in the current literature. These interaction studies are mathematically challenging and computationally complex. These challenges have been addressed by a number of data mining and machine learning approaches. This paper reviews the current methods and the related software packages to detect the SNP interactions that contribute to diseases. The issues that need to be considered when developing these models are addressed in this review. The paper also reviews the achievements in data simulation to evaluate the performance of these models. Further, it discusses the future of SNP interaction analysis.
|
49
|
Cole BS, Hall MA, Urbanowicz RJ, Gilbert‐Diamond D, Moore JH. Analysis of Gene‐Gene Interactions. Curr Protoc Hum Genet 2018; 95:1.14.1-1.14.10. [DOI: 10.1002/cphg.45] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 12/12/2022]
Affiliation(s)
- Brian S. Cole
- Department of Biostatistics and Epidemiology, Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania
| | - Molly A. Hall
- Department of Biostatistics and Epidemiology, Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania
- The Center for Systems Genomics, The Pennsylvania State University, University Park, Pennsylvania
| | - Ryan J. Urbanowicz
- Department of Biostatistics and Epidemiology, Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania
| | - Diane Gilbert‐Diamond
- Institute for Quantitative Biomedical Sciences at Dartmouth, Hanover, New Hampshire
- Department of Epidemiology, Geisel School of Medicine at Dartmouth, Hanover, New Hampshire
| | - Jason H. Moore
- Department of Biostatistics and Epidemiology, Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania
| |
|
50
|
Verma SS, Ritchie MD. Another Round of "Clue" to Uncover the Mystery of Complex Traits. Genes (Basel) 2018; 9:E61. [PMID: 29370075 PMCID: PMC5852557 DOI: 10.3390/genes9020061] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 11/01/2017] [Revised: 12/19/2017] [Accepted: 01/15/2018] [Indexed: 12/13/2022]
Abstract
A plethora of genetic association analyses have identified several genetic risk loci. Technological and statistical advancements have now led to the identification of not only common genetic variants, but also low-frequency variants, structural variants, and environmental factors, as well as multi-omics variations that affect the phenotypic variance of complex traits in a population, thus referred to as complex trait architecture. The concept of heritability, or the proportion of phenotypic variance due to genetic inheritance, has been studied for several decades, but its application is mainly in addressing the narrow sense heritability (or additive genetic component) from Genome-Wide Association Studies (GWAS). In this commentary, we reflect on our perspective on the complexity of understanding heritability for human traits in comparison to model organisms, highlighting another round of clues beyond GWAS and an alternative approach, investigating these clues comprehensively to help in elucidating the genetic architecture of complex traits.
Affiliation(s)
- Shefali Setia Verma
- The Huck Institute of Life Sciences, The Pennsylvania State University, University Park, PA 16802, USA.
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA.
| | - Marylyn D Ritchie
- The Huck Institute of Life Sciences, The Pennsylvania State University, University Park, PA 16802, USA.
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA.
| |
|