1
Chen K, Alexander LE, Mahgoub U, Okazaki Y, Higashi Y, Perera AM, Showman LJ, Loneman D, Dennison TS, Lopez M, Claussen R, Peddicord L, Saito K, Lauter N, Dorman KS, Nikolau BJ, Yandeau-Nelson MD. Dynamic relationships among pathways producing hydrocarbons and fatty acids of maize silk cuticular waxes. PLANT PHYSIOLOGY 2024; 195:2234-2255. [PMID: 38537616 PMCID: PMC11213258 DOI: 10.1093/plphys/kiae150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Accepted: 02/06/2024] [Indexed: 06/30/2024]
Abstract
The hydrophobic cuticle is the first line of defense between aerial portions of plants and the external environment. On maize (Zea mays L.) silks, the cuticular cutin matrix is infused with cuticular waxes, consisting of a homologous series of very long-chain fatty acids (VLCFAs), aldehydes, and hydrocarbons. Together with VLC fatty-acyl-CoAs (VLCFA-CoAs), these metabolites serve as precursors, intermediates, and end-products of the cuticular wax biosynthetic pathway. To deconvolute the potentially confounding impacts of the change in silk microenvironment and silk development on this pathway, we profiled cuticular waxes on the silks of the inbreds B73 and Mo17, and their reciprocal hybrids. Multivariate interrogation of these metabolite abundance data demonstrates that VLCFA-CoAs and total free VLCFAs are positively correlated with the cuticular wax metabolome, and this metabolome is primarily affected by changes in the silk microenvironment and plant genotype. Moreover, the genotype effect on the pathway explains the increased accumulation of cuticular hydrocarbons with a concomitant reduction in cuticular VLCFA accumulation on B73 silks, suggesting that the conversion of VLCFA-CoAs to hydrocarbons is more effective in B73 than Mo17. Statistical modeling of the ratios between cuticular hydrocarbons and cuticular VLCFAs reveals a significant role of precursor chain length in determining this ratio. This study establishes the complexity of the product-precursor relationships within the silk cuticular wax-producing network by dissecting both the impact of genotype and the allocation of VLCFA-CoA precursors to different biological processes and demonstrates that longer chain VLCFA-CoAs are preferentially utilized for hydrocarbon biosynthesis.
Affiliation(s)
- Keting Chen
- Department of Genetics, Development & Cell Biology, Iowa State University, Ames, IA 50011, USA
- Bioinformatics & Computational Biology Graduate Program, Iowa State University, Ames, IA 50011, USA
- Liza E Alexander
- Roy J. Carver Department of Biochemistry, Biophysics & Molecular Biology, Iowa State University, Ames, IA 50011, USA
- Umnia Mahgoub
- Department of Genetics, Development & Cell Biology, Iowa State University, Ames, IA 50011, USA
- Yozo Okazaki
- Metabolomics Research Group, RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa 230-0045, Japan
- Graduate School of Bioresources, Mie University, Tsu, Mie 514-8507, Japan
- Yasuhiro Higashi
- Metabolomics Research Group, RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa 230-0045, Japan
- Ann M Perera
- W.M. Keck Metabolomics Research Laboratory, Iowa State University, Ames, IA 50011, USA
- Lucas J Showman
- W.M. Keck Metabolomics Research Laboratory, Iowa State University, Ames, IA 50011, USA
- Derek Loneman
- Department of Genetics, Development & Cell Biology, Iowa State University, Ames, IA 50011, USA
- Tesia S Dennison
- Department of Plant Pathology & Microbiology, Iowa State University, Ames, IA 50011, USA
- Interdepartmental Genetics & Genomics Graduate Program, Iowa State University, Ames, IA 50011, USA
- Miriam Lopez
- Corn Insects and Crop Genetics Research Unit, USDA-ARS, Ames, IA 50011, USA
- Reid Claussen
- Department of Genetics, Development & Cell Biology, Iowa State University, Ames, IA 50011, USA
- Layton Peddicord
- Department of Plant Pathology & Microbiology, Iowa State University, Ames, IA 50011, USA
- Interdepartmental Genetics & Genomics Graduate Program, Iowa State University, Ames, IA 50011, USA
- Kazuki Saito
- Metabolomics Research Group, RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa 230-0045, Japan
- Nick Lauter
- Department of Plant Pathology & Microbiology, Iowa State University, Ames, IA 50011, USA
- Interdepartmental Genetics & Genomics Graduate Program, Iowa State University, Ames, IA 50011, USA
- Corn Insects and Crop Genetics Research Unit, USDA-ARS, Ames, IA 50011, USA
- Karin S Dorman
- Department of Genetics, Development & Cell Biology, Iowa State University, Ames, IA 50011, USA
- Bioinformatics & Computational Biology Graduate Program, Iowa State University, Ames, IA 50011, USA
- Department of Statistics, Iowa State University, Ames, IA 50011, USA
- Basil J Nikolau
- Bioinformatics & Computational Biology Graduate Program, Iowa State University, Ames, IA 50011, USA
- Roy J. Carver Department of Biochemistry, Biophysics & Molecular Biology, Iowa State University, Ames, IA 50011, USA
- Interdepartmental Genetics & Genomics Graduate Program, Iowa State University, Ames, IA 50011, USA
- Center for Metabolic Biology, Iowa State University, Ames, IA 50011, USA
- Marna D Yandeau-Nelson
- Department of Genetics, Development & Cell Biology, Iowa State University, Ames, IA 50011, USA
- Bioinformatics & Computational Biology Graduate Program, Iowa State University, Ames, IA 50011, USA
- Interdepartmental Genetics & Genomics Graduate Program, Iowa State University, Ames, IA 50011, USA
- Center for Metabolic Biology, Iowa State University, Ames, IA 50011, USA
2
Cui T, El Mekkaoui K, Reinvall J, Havulinna AS, Marttinen P, Kaski S. Gene-gene interaction detection with deep learning. Commun Biol 2022; 5:1238. [PMID: 36371468 PMCID: PMC9653457 DOI: 10.1038/s42003-022-04186-y] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 01/02/2022] [Accepted: 10/27/2022] [Indexed: 11/13/2022]
Abstract
The extent to which genetic interactions affect observed phenotypes is generally unknown because current interaction detection approaches only consider simple interactions between top SNPs of genes. We introduce an open-source framework for increasing the power of interaction detection by considering all SNPs within a selected set of genes and complex interactions between them, beyond only the currently considered multiplicative relationships. In brief, the relation between SNPs and a phenotype is captured by a neural network, and the interactions are quantified by Shapley scores between hidden nodes, which are gene representations that optimally combine information from the corresponding SNPs. Additionally, we design a permutation procedure tailored for neural networks to assess the significance of interactions, which outperformed existing alternatives on simulated datasets with complex interactions, and in a cholesterol study on the UK Biobank it detected nine interactions which replicated on an independent FINRISK dataset.
Affiliation(s)
- Tianyu Cui
- Department of Computer Science, Aalto University, Espoo, Finland.
- Jaakko Reinvall
- Department of Computer Science, Aalto University, Espoo, Finland
- Aki S Havulinna
- Finnish Institute for Health and Welfare (THL), Helsinki, Finland
- Institute for Molecular Medicine Finland, FIMM-HiLIFE, Helsinki, Finland
- Pekka Marttinen
- Department of Computer Science, Aalto University, Espoo, Finland
- Finnish Institute for Health and Welfare (THL), Helsinki, Finland
- Samuel Kaski
- Department of Computer Science, Aalto University, Espoo, Finland
- Department of Computer Science, University of Manchester, Manchester, UK
3
Schaffner SL, Kobor MS. DNA methylation as a mediator of genetic and environmental influences on Parkinson's disease susceptibility: Impacts of alpha-Synuclein, physical activity, and pesticide exposure on the epigenome. Front Genet 2022; 13:971298. [PMID: 36061205 PMCID: PMC9437223 DOI: 10.3389/fgene.2022.971298] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 06/16/2022] [Accepted: 07/25/2022] [Indexed: 12/15/2022]
Abstract
Parkinson's disease (PD) is a neurodegenerative disorder with a complex etiology and increasing prevalence worldwide. As PD is influenced by a combination of genetic and environment/lifestyle factors in approximately 90% of cases, there is increasing interest in identification of the interindividual mechanisms underlying the development of PD as well as actionable lifestyle factors that can influence risk. This narrative review presents an outline of the genetic and environmental factors contributing to PD risk and explores the possible roles of cytosine methylation and hydroxymethylation in the etiology and/or as early-stage biomarkers of PD, with an emphasis on epigenome-wide association studies (EWAS) of PD conducted over the past decade. Specifically, we focused on variants in the SNCA gene, exposure to pesticides, and physical activity as key contributors to PD risk. Current research indicates that these factors individually impact the epigenome, particularly at the level of CpG methylation. There is also emerging evidence for interaction effects between genetic and environmental contributions to PD risk, possibly acting across multiple omics layers. We speculated that this may be one reason for the poor replicability of the results of EWAS for PD reported to date. Our goal is to provide direction for future epigenetics studies of PD to build upon existing foundations and leverage large datasets, new technologies, and relevant statistical approaches to further elucidate the etiology of this disease.
Affiliation(s)
- Samantha L. Schaffner
- Edwin S. H. Leong Healthy Aging Program, Faculty of Medicine, University of British Columbia, Vancouver, BC, Canada
- Department of Medical Genetics, British Columbia Children’s Hospital Research Institute, University of British Columbia, Vancouver, BC, Canada
- Michael S. Kobor
- Edwin S. H. Leong Healthy Aging Program, Faculty of Medicine, University of British Columbia, Vancouver, BC, Canada
- Department of Medical Genetics, British Columbia Children’s Hospital Research Institute, University of British Columbia, Vancouver, BC, Canada
4
Bottolo L, Banterle M, Richardson S, Ala-Korpela M, Järvelin MR, Lewin A. A computationally efficient Bayesian seemingly unrelated regressions model for high-dimensional quantitative trait loci discovery. J R Stat Soc Ser C Appl Stat 2021; 70:886-908. [PMID: 35001978 PMCID: PMC7612194 DOI: 10.1111/rssc.12490] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/02/2023]
Abstract
Our work is motivated by the search for metabolite quantitative trait loci (QTL) in a cohort of more than 5000 people. There are 158 metabolites measured by NMR spectroscopy in the 31-year follow-up of the Northern Finland Birth Cohort 1966 (NFBC66). These metabolites, as with many multivariate phenotypes produced by high-throughput biomarker technology, exhibit strong correlation structures. Existing approaches for combining such data with genetic variants for multivariate QTL analysis generally ignore phenotypic correlations or make restrictive assumptions about the associations between phenotypes and genetic loci. We present a computationally efficient Bayesian seemingly unrelated regressions model for high-dimensional data, with cell-sparse variable selection and sparse graphical structure for covariance selection. Cell sparsity allows different phenotype responses to be associated with different genetic predictors and the graphical structure is used to represent the conditional dependencies between phenotype variables. To achieve feasible computation of the large model space, we exploit a factorisation of the covariance matrix. Applying the model to the NFBC66 data with 9000 directly genotyped single nucleotide polymorphisms, we are able to simultaneously estimate genotype-phenotype associations and the residual dependence structure among the metabolites. The R package BayesSUR with full documentation is available at https://cran.r-project.org/web/packages/BayesSUR/.
Affiliation(s)
- Leonardo Bottolo
- Department of Medical Genetics, University of Cambridge, Cambridge, UK
- The Alan Turing Institute, London, UK
- MRC Biostatistics Unit, Cambridge, UK
- Marco Banterle
- Department of Medical Statistics, London School of Hygiene and Tropical Medicine, London, UK
- Sylvia Richardson
- The Alan Turing Institute, London, UK
- MRC Biostatistics Unit, Cambridge, UK
- Mika Ala-Korpela
- Computational Medicine, Faculty of Medicine, University of Oulu and Biocenter Oulu, Oulu, Finland
- NMR Metabolomics Laboratory, School of Pharmacy, University of Eastern Finland, Kuopio, Finland
- Marjo-Riitta Järvelin
- Center for Life Course Health Research, University of Oulu, Oulu, Finland
- Biocenter Oulu, University of Oulu, Oulu, Finland
- Department of Epidemiology and Biostatistics, Imperial College London, London, UK
- MRC-PHE Centre for Environment and Health, Imperial College London, London, UK
- Department of Life Sciences, Brunel University London, Uxbridge, UK
- Alex Lewin
- Department of Medical Statistics, London School of Hygiene and Tropical Medicine, London, UK
5
Yang D, Goh G, Wang H. A fully Bayesian approach to sparse reduced-rank multivariate regression. STAT MODEL 2020. [DOI: 10.1177/1471082x20948697] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
In the context of high-dimensional multivariate linear regression, sparse reduced-rank regression (SRRR) provides a way to handle both variable selection and low-rank estimation problems. Although there has been extensive research on SRRR, statistical inference procedures that deal with the uncertainty due to variable selection and rank reduction are still limited. To fill this research gap, we develop a fully Bayesian approach to SRRR. A major difficulty that occurs in a fully Bayesian framework is that the dimension of parameter space varies with the selected variables and the reduced-rank. Due to the varying-dimensional problems, traditional Markov chain Monte Carlo (MCMC) methods such as Gibbs sampler and Metropolis-Hastings algorithm are inapplicable in our Bayesian framework. To address this issue, we propose a new posterior computation procedure based on the Laplace approximation within the collapsed Gibbs sampler. A key feature of our fully Bayesian method is that the model uncertainty is automatically integrated out by the proposed MCMC computation. The proposed method is examined via simulation study and real data analysis.
Affiliation(s)
- Dunfu Yang
- Department of Statistics, Kansas State University, Manhattan, KS, USA
- Gyuhyeong Goh
- Department of Statistics, Kansas State University, Manhattan, KS, USA
- Haiyan Wang
- Department of Statistics, Kansas State University, Manhattan, KS, USA
6
Nath AP, Ritchie SC, Grinberg NF, Tang HHF, Huang QQ, Teo SM, Ahola-Olli AV, Würtz P, Havulinna AS, Santalahti K, Pitkänen N, Lehtimäki T, Kähönen M, Lyytikäinen LP, Raitoharju E, Seppälä I, Sarin AP, Ripatti S, Palotie A, Perola M, Viikari JS, Jalkanen S, Maksimow M, Salmi M, Wallace C, Raitakari OT, Salomaa V, Abraham G, Kettunen J, Inouye M. Multivariate Genome-wide Association Analysis of a Cytokine Network Reveals Variants with Widespread Immune, Haematological, and Cardiometabolic Pleiotropy. Am J Hum Genet 2019; 105:1076-1090. [PMID: 31679650 DOI: 10.1016/j.ajhg.2019.10.001] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 05/14/2019] [Accepted: 09/30/2019] [Indexed: 01/18/2023]
Abstract
Cytokines are essential regulatory components of the immune system, and their aberrant levels have been linked to many disease states. Despite increasing evidence that cytokines operate in concert, many of the physiological interactions between cytokines, and the shared genetic architecture that underlies them, remain unknown. Here, we aimed to identify and characterize genetic variants with pleiotropic effects on cytokines. Using three population-based cohorts (n = 9,263), we performed multivariate genome-wide association studies (GWAS) for a correlation network of 11 circulating cytokines, then combined our results in meta-analysis. We identified a total of eight loci significantly associated with the cytokine network, of which two (PDGFRB and ABO) had not been detected previously. In addition, conditional analyses revealed a further four secondary signals at three known cytokine loci. Integration, through the use of Bayesian colocalization analysis, of publicly available GWAS summary statistics with the cytokine network associations revealed shared causal variants between the eight cytokine loci and other traits; in particular, cytokine network variants at the ABO, SERPINE2, and ZFPM2 loci showed pleiotropic effects on the production of immune-related proteins, on metabolic traits such as lipoprotein and lipid levels, on blood-cell-related traits such as platelet count, and on disease traits such as coronary artery disease and type 2 diabetes.
Affiliation(s)
- Artika P Nath
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, Victoria 3004, Australia; Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge CB1 8RN, United Kingdom; Department of Microbiology and Immunology, University of Melbourne, Parkville, Victoria 3010, Australia.
- Scott C Ritchie
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, Victoria 3004, Australia; Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge CB1 8RN, United Kingdom
- Nastasiya F Grinberg
- Cambridge Institute of Therapeutic Immunology and Infectious Disease, Department of Medicine, University of Cambridge, Cambridge CB2 0AW, United Kingdom
- Howard Ho-Fung Tang
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, Victoria 3004, Australia
- Qin Qin Huang
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, Victoria 3004, Australia; Department of Clinical Pathology, University of Melbourne, Parkville, Victoria 3010, Australia
- Shu Mei Teo
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, Victoria 3004, Australia; Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge CB1 8RN, United Kingdom
- Ari V Ahola-Olli
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA; Institute for Molecular Medicine Finland, University of Helsinki, Helsinki 00014, Finland
- Peter Würtz
- Research Programs Unit, Diabetes and Obesity, University of Helsinki, Helsinki 00014, Finland; Nightingale Health Ltd., Helsinki 00300, Finland
- Aki S Havulinna
- Institute for Molecular Medicine Finland, University of Helsinki, Helsinki 00014, Finland; National Institute of Health and Welfare, Helsinki 00271, Finland
- Kristiina Santalahti
- Medicity Research Laboratory, Department of Medical Microbiology and Immunology, University of Turku, Turku 20520, Finland
- Niina Pitkänen
- Research Centre of Applied and Preventive Cardiovascular Medicine, University of Turku, Turku 20520, Finland
- Terho Lehtimäki
- Department of Clinical Chemistry, Fimlab Laboratories, Tampere 33520, Finland; Finnish Cardiovascular Research Center Tampere, Faculty of Medicine and Health Technology, Tampere University, Tampere 33520, Finland
- Mika Kähönen
- Finnish Cardiovascular Research Center Tampere, Faculty of Medicine and Health Technology, Tampere University, Tampere 33520, Finland; Department of Clinical Physiology, Tampere University Hospital, Tampere 33521, Finland
- Leo-Pekka Lyytikäinen
- Department of Clinical Chemistry, Fimlab Laboratories, Tampere 33520, Finland; Finnish Cardiovascular Research Center Tampere, Faculty of Medicine and Health Technology, Tampere University, Tampere 33520, Finland
- Emma Raitoharju
- Department of Clinical Chemistry, Fimlab Laboratories, Tampere 33520, Finland; Finnish Cardiovascular Research Center Tampere, Faculty of Medicine and Health Technology, Tampere University, Tampere 33520, Finland
- Ilkka Seppälä
- Department of Clinical Chemistry, Fimlab Laboratories, Tampere 33520, Finland; Finnish Cardiovascular Research Center Tampere, Faculty of Medicine and Health Technology, Tampere University, Tampere 33520, Finland
- Antti-Pekka Sarin
- Institute for Molecular Medicine Finland, University of Helsinki, Helsinki 00014, Finland; National Institute of Health and Welfare, Helsinki 00271, Finland
- Samuli Ripatti
- Institute for Molecular Medicine Finland, University of Helsinki, Helsinki 00014, Finland; Department of Public Health, University of Helsinki, Helsinki 00014, Finland; Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Aarno Palotie
- Institute for Molecular Medicine Finland, University of Helsinki, Helsinki 00014, Finland; Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA; Analytic and Translational Genetics Unit, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02114, USA; Department of Psychiatry, Massachusetts General Hospital, Boston, Massachusetts 02114, USA; Department of Neurology, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
- Markus Perola
- Institute for Molecular Medicine Finland, University of Helsinki, Helsinki 00014, Finland; National Institute of Health and Welfare, Helsinki 00271, Finland
- Jorma S Viikari
- Department of Medicine, University of Turku, Turku 20520, Finland; Division of Medicine, Turku University Hospital, Turku 20520, Finland
- Sirpa Jalkanen
- Medicity Research Laboratory, Department of Medical Microbiology and Immunology, University of Turku, Turku 20520, Finland
- Mikael Maksimow
- Medicity Research Laboratory, Department of Medical Microbiology and Immunology, University of Turku, Turku 20520, Finland
- Marko Salmi
- Medicity Research Laboratory and Institute of Biomedicine, University of Turku, Turku 20520, Finland
- Chris Wallace
- Cambridge Institute of Therapeutic Immunology and Infectious Disease, Department of Medicine, University of Cambridge, Cambridge CB2 0AW, United Kingdom; MRC Biostatistics Unit, Institute of Public Health, Cambridge CB2 0SR, United Kingdom
- Olli T Raitakari
- Research Centre of Applied and Preventive Cardiovascular Medicine, University of Turku, Turku 20520, Finland; The Department of Clinical Physiology and Nuclear Medicine, Turku University Hospital, Turku 20520, Finland
- Veikko Salomaa
- Medicity Research Laboratory, Department of Medical Microbiology and Immunology, University of Turku, Turku 20520, Finland
- Gad Abraham
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, Victoria 3004, Australia; Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge CB1 8RN, United Kingdom; Department of Clinical Pathology, University of Melbourne, Parkville, Victoria 3010, Australia
- Johannes Kettunen
- Medicity Research Laboratory, Department of Medical Microbiology and Immunology, University of Turku, Turku 20520, Finland; Computational Medicine, Centre for Life Course Health Research, University of Oulu, Oulu 90014, Finland; NMR Metabolomics Laboratory, School of Pharmacy, University of Eastern Finland, Kuopio 70211, Finland; Biocenter Oulu, University of Oulu, Oulu 90014, Finland
- Michael Inouye
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, Victoria 3004, Australia; Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge CB1 8RN, United Kingdom; Department of Clinical Pathology, University of Melbourne, Parkville, Victoria 3010, Australia; The Alan Turing Institute, London, United Kingdom.
7
Effect of non-normality and low count variants on cross-phenotype association tests in GWAS. Eur J Hum Genet 2019; 28:300-312. [PMID: 31582815 DOI: 10.1038/s41431-019-0514-2] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 12/03/2018] [Revised: 09/01/2019] [Accepted: 09/05/2019] [Indexed: 01/21/2023]
Abstract
Many complex human diseases, such as type 2 diabetes, are characterized by multiple underlying traits/phenotypes that have substantially shared genetic architecture. Multivariate analysis of correlated traits has the potential to increase the power of detecting underlying common genetic loci. Several cross-phenotype association methods have been proposed; some require individual-level data on traits and genotypes, while the others require only summary-level data. In this article, we explore whether non-normality of multivariate trait distribution affects the inference from some of the existing multi-trait methods and how that effect is dependent on the allele count of the genetic variant being tested. We find that most of these tests are susceptible to biases that lead to spurious association signals. Even after controlling for confounders that may contribute to non-normality and then applying inverse normal transformation on the residuals of each trait, these tests may have inflated type I errors for variants with low minor allele counts (MACs). A likelihood ratio test of association based on the ordinal regression of individual-level genotype conditional on the traits seems to be the least biased and can maintain type I error when the MAC is reasonably large (e.g., MAC > 30). Application of these methods to publicly available summary statistics of eight amino acid traits on European samples seems to exhibit systematic inflation (especially for variants with low MAC), which is consistent with our findings from simulation experiments.
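The abstract above mentions applying an inverse normal transformation to the residuals of each trait. As a rough illustration only, the following sketch shows one common rank-based form of that transformation. The function name `inverse_normal_transform`, the Blom offset `c = 3/8`, and the average-rank handling of ties are assumptions made for this example; the abstract does not state which variant the authors used.

```python
from statistics import NormalDist


def inverse_normal_transform(values, c=3 / 8):
    """Rank-based inverse normal transformation (hypothetical helper).

    Each value is replaced by the standard normal quantile of
    (rank - c) / (n - 2c + 1). The offset c = 3/8 (Blom) is an assumed default.
    """
    n = len(values)
    # Indices of the input, ordered from smallest to largest value.
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        # Tied values share the average of their 1-based ranks.
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf((r - c) / (n - 2 * c + 1)) for r in ranks]


# Example: skewed trait residuals, including one tie and one outlier.
residuals = [2.5, -1.0, 0.3, 10.0, 0.3]
transformed = inverse_normal_transform(residuals)
```

The transformation depends only on ranks, so the outlier at 10.0 is pulled in to an ordinary normal quantile. As the abstract notes, this step alone may not prevent inflated type I error for variants with low minor allele counts.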
8
Leppäaho E, Renvall H, Salmela E, Kere J, Salmelin R, Kaski S. Discovering heritable modes of MEG spectral power. Hum Brain Mapp 2019; 40:1391-1402. [PMID: 30600573 PMCID: PMC6590382 DOI: 10.1002/hbm.24454] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 07/11/2018] [Revised: 09/27/2018] [Accepted: 10/19/2018] [Indexed: 12/14/2022]
Abstract
Brain structure and many brain functions are known to be genetically controlled, but direct links between neuroimaging measures and their underlying cellular-level determinants remain largely undiscovered. Here, we adopt a novel computational method for examining potential similarities in high-dimensional brain imaging data between siblings. We examine oscillatory brain activity measured with magnetoencephalography (MEG) in 201 healthy siblings and apply Bayesian reduced-rank regression to extract a low-dimensional representation of familial features in the participants' spectral power structure. Our results show that the structure of the overall spectral power at 1-90 Hz is a highly conspicuous feature that not only relates siblings to each other but also has very high consistency within participants' own data, irrespective of the exact experimental state of the participant. The analysis is extended by seeking genetic associations for low-dimensional descriptions of the oscillatory brain activity. The observed variability in the MEG spectral power structure was associated with SDK1 (sidekick cell adhesion molecule 1) and suggestively with several other genes that function, for example, in brain development. The current results highlight the potential of sophisticated computational methods in combining molecular and neuroimaging levels for exploring brain functions, even for high-dimensional data limited to a few hundred participants.
Affiliation(s)
- Eemeli Leppäaho
- Department of Computer Science, Helsinki Institute for Information Technology HIIT, Aalto University, Helsinki, Finland
- Hanna Renvall
- Department of Neuroscience and Biomedical Engineering, Aalto University, Helsinki, Finland; Aalto NeuroImaging, Aalto University, Helsinki, Finland
- Elina Salmela
- Department of Biosciences, University of Helsinki, Helsinki, Finland
- Juha Kere
- Molecular Neurology Research Program, University of Helsinki, Folkhälsan Institute of Genetics, Helsinki, Finland; Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden; School of Basic and Medical Biosciences, King's College London, Guy's Hospital, London, United Kingdom
- Riitta Salmelin
- Department of Neuroscience and Biomedical Engineering, Aalto University, Helsinki, Finland; Aalto NeuroImaging, Aalto University, Helsinki, Finland
- Samuel Kaski
- Department of Computer Science, Helsinki Institute for Information Technology HIIT, Aalto University, Helsinki, Finland
9
Sundin I, Peltola T, Micallef L, Afrabandpey H, Soare M, Mamun Majumder M, Daee P, He C, Serim B, Havulinna A, Heckman C, Jacucci G, Marttinen P, Kaski S. Improving genomics-based predictions for precision medicine through active elicitation of expert knowledge. Bioinformatics 2018; 34:i395-i403. [PMID: 29949984 PMCID: PMC6022689 DOI: 10.1093/bioinformatics/bty257] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 12/03/2022]
Abstract
Motivation: Precision medicine requires the ability to predict the efficacies of different treatments for a given individual using high-dimensional genomic measurements. However, identifying predictive features remains a challenge when the sample size is small. Incorporating expert knowledge offers a promising approach to improve predictions, but collecting such knowledge is laborious if the number of candidate features is very large. Results: We introduce a probabilistic framework to incorporate expert feedback about the impact of genomic measurements on the outcome of interest and present a novel approach to collect the feedback efficiently, based on Bayesian experimental design. The new approach outperformed other recent alternatives in two medical applications: prediction of metabolic traits and prediction of sensitivity of cancer cells to different drugs, both using genomic features as predictors. Furthermore, the intelligent approach to collect feedback reduced the workload of the expert to approximately 11%, compared to a baseline approach. Availability and implementation: Source code implementing the introduced computational methods is freely available at https://github.com/AaltoPML/knowledge-elicitation-for-precision-medicine. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Iiris Sundin
- Department of Computer Science, Helsinki Institute for Information Technology HIIT, Aalto University, Helsinki, Finland
- Tomi Peltola
- Department of Computer Science, Helsinki Institute for Information Technology HIIT, Aalto University, Helsinki, Finland
- Luana Micallef
- Department of Computer Science, Helsinki Institute for Information Technology HIIT, Aalto University, Helsinki, Finland
- Homayun Afrabandpey
- Department of Computer Science, Helsinki Institute for Information Technology HIIT, Aalto University, Helsinki, Finland
- Marta Soare
- Department of Computer Science, Helsinki Institute for Information Technology HIIT, Aalto University, Helsinki, Finland
- Muntasir Mamun Majumder
- Institute for Molecular Medicine Finland FIMM, Helsinki Institute of Life Science, University of Helsinki, Helsinki, Finland
- Pedram Daee
- Department of Computer Science, Helsinki Institute for Information Technology HIIT, Aalto University, Helsinki, Finland
- Chen He
- Department of Computer Science, Helsinki Institute for Information Technology HIIT, University of Helsinki, Helsinki, Finland
- Baris Serim
- Department of Computer Science, Helsinki Institute for Information Technology HIIT, University of Helsinki, Helsinki, Finland
- Aki Havulinna
- Institute for Molecular Medicine Finland FIMM, Helsinki Institute of Life Science, University of Helsinki, Helsinki, Finland; National Institute for Health and Welfare THL, Helsinki, Finland
- Caroline Heckman
- Institute for Molecular Medicine Finland FIMM, Helsinki Institute of Life Science, University of Helsinki, Helsinki, Finland
- Giulio Jacucci
- Department of Computer Science, Helsinki Institute for Information Technology HIIT, University of Helsinki, Helsinki, Finland
- Pekka Marttinen
- Department of Computer Science, Helsinki Institute for Information Technology HIIT, Aalto University, Helsinki, Finland
- Samuel Kaski
- Department of Computer Science, Helsinki Institute for Information Technology HIIT, Aalto University, Helsinki, Finland
10
Würtz P, Kangas AJ, Soininen P, Lawlor DA, Davey Smith G, Ala-Korpela M. Quantitative Serum Nuclear Magnetic Resonance Metabolomics in Large-Scale Epidemiology: A Primer on -Omic Technologies. Am J Epidemiol 2017; 186:1084-1096. [PMID: 29106475 PMCID: PMC5860146 DOI: 10.1093/aje/kwx016] [Citation(s) in RCA: 371] [Impact Index Per Article: 46.4] [Received: 07/16/2016] [Accepted: 01/19/2017] [Indexed: 12/13/2022]
Abstract
Detailed metabolic profiling in large-scale epidemiologic studies has uncovered novel biomarkers for cardiometabolic diseases and clarified the molecular associations of established risk factors. A quantitative metabolomics platform based on nuclear magnetic resonance spectroscopy has found widespread use, already profiling over 400,000 blood samples. Over 200 metabolic measures are quantified per sample; in addition to many biomarkers routinely used in epidemiology, the method simultaneously provides fine-grained lipoprotein subclass profiling and quantification of circulating fatty acids, amino acids, gluconeogenesis-related metabolites, and many other molecules from multiple metabolic pathways. Here we focus on applications of magnetic resonance metabolomics for quantifying circulating biomarkers in large-scale epidemiology. We highlight the molecular characterization of risk factors, use of Mendelian randomization, and the key issues of study design and analyses of metabolic profiling for epidemiology. We also detail how integration of metabolic profiling data with genetics can enhance drug development. We discuss why quantitative metabolic profiling is becoming widespread in epidemiology and biobanking. Although large-scale applications of metabolic profiling are still novel, it seems likely that comprehensive biomarker data will contribute to etiologic understanding of various diseases and abilities to predict disease risks, with the potential to translate into multiple clinical settings.
Affiliation(s)
- Peter Würtz
- Correspondence to Dr. Peter Würtz, Computational Medicine, Faculty of Medicine, Aapistie 5A, P.O. Box 5000, FI-90014 University of Oulu, Finland (e-mail: ); or Dr. Mika Ala-Korpela, Computational Medicine, Faculty of Medicine, Aapistie 5A, P.O. Box 5000, FI-90014 University of Oulu, Finland (e-mail: )
- Mika Ala-Korpela
- Correspondence to Dr. Peter Würtz, Computational Medicine, Faculty of Medicine, Aapistie 5A, P.O. Box 5000, FI-90014 University of Oulu, Finland (e-mail: ); or Dr. Mika Ala-Korpela, Computational Medicine, Faculty of Medicine, Aapistie 5A, P.O. Box 5000, FI-90014 University of Oulu, Finland (e-mail: )
11
Greenlaw K, Szefer E, Graham J, Lesperance M, Nathoo FS. A Bayesian group sparse multi-task regression model for imaging genetics. Bioinformatics 2017; 33:2513-2522. [PMID: 28419235 PMCID: PMC5870710 DOI: 10.1093/bioinformatics/btx215] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 04/28/2016] [Revised: 02/20/2017] [Accepted: 04/12/2017] [Indexed: 11/12/2022]
Abstract
MOTIVATION: Recent advances in technology for brain imaging and high-throughput genotyping have motivated studies examining the influence of genetic variation on brain structure. Wang et al. have developed an approach for the analysis of imaging genomic studies using penalized multi-task regression with regularization based on a novel group $\ell_{2,1}$-norm penalty which encourages structured sparsity at both the gene level and SNP level. While incorporating a number of useful features, the proposed method only furnishes a point estimate of the regression coefficients; techniques for conducting statistical inference are not provided. A new Bayesian method is proposed here to overcome this limitation. RESULTS: We develop a Bayesian hierarchical modeling formulation where the posterior mode corresponds to the estimator proposed by Wang et al. and an approach that allows for full posterior inference including the construction of interval estimates for the regression parameters. We show that the proposed hierarchical model can be expressed as a three-level Gaussian scale mixture and this representation facilitates the use of a Gibbs sampling algorithm for posterior simulation. Simulation studies demonstrate that the interval estimates obtained using our approach achieve adequate coverage probabilities that outperform those obtained from the nonparametric bootstrap. Our proposed methodology is applied to the analysis of neuroimaging and genetic data collected as part of the Alzheimer's Disease Neuroimaging Initiative (ADNI), and this analysis of the ADNI cohort demonstrates clearly the value added of incorporating interval estimation beyond only point estimation when relating SNPs to brain imaging endophenotypes. AVAILABILITY AND IMPLEMENTATION: Software and sample data is available as an R package 'bgsmtr' that can be downloaded from The Comprehensive R Archive Network (CRAN). CONTACT: nathoo@uvic.ca. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Keelin Greenlaw
- Mathematics and Statistics, University of Victoria, Victoria, BC, Canada
- Elena Szefer
- Statistics and Actuarial Science, Simon Fraser University, Burnaby, BC, Canada
- Jinko Graham
- Statistics and Actuarial Science, Simon Fraser University, Burnaby, BC, Canada
- Mary Lesperance
- Mathematics and Statistics, University of Victoria, Victoria, BC, Canada
- Farouk S Nathoo
- Mathematics and Statistics, University of Victoria, Victoria, BC, Canada
12
Kim DS, Jackson AU, Li YK, Stringham HM, Kuusisto J, Kangas AJ, Soininen P, Ala-Korpela M, Burant CF, Salomaa V, Boehnke M, Laakso M, Speliotes EK. Novel association of TM6SF2 rs58542926 genotype with increased serum tyrosine levels and decreased apoB-100 particles in Finns. J Lipid Res 2017; 58:1471-1481. [PMID: 28539357 PMCID: PMC5496043 DOI: 10.1194/jlr.p076034] [Citation(s) in RCA: 43] [Impact Index Per Article: 5.4] [Received: 03/03/2017] [Revised: 05/12/2017] [Indexed: 02/06/2023]
Abstract
A glutamate-to-lysine variant (rs58542926-T) in transmembrane 6 superfamily member 2 (TM6SF2) is associated with increased fatty liver disease and diabetes in conjunction with decreased cardiovascular disease risk. To identify mediators of the effects of TM6SF2, we tested for associations between rs58542926-T and serum lipoprotein/metabolite measures in cross-sectional data from nondiabetic statin-naïve participants. We identified independent associations between rs58542926-T and apoB-100 particles (β = -0.057 g/l, P = 1.99 × 10⁻¹⁴) and tyrosine levels (β = 0.0020 mmol/l, P = 1.10 × 10⁻⁸), controlling for potential confounders, in 6,929 Finnish men. The association between rs58542926-T and apoB-100 was confirmed in an independent sample of 2,196 Finnish individuals from the FINRISK study (β_replication = -0.029, P_replication = 0.029). Secondary analyses demonstrated an rs58542926-T dose-dependent decrease in particle concentration, cholesterol, and triglyceride (TG) content for VLDL and LDL particles (P < 0.001 for all). No significant associations between rs58542926-T and HDL measures were observed. TM6SF2 SNP rs58542926-T and tyrosine levels were associated with increased incident T2D risk in both METSIM and FINRISK. Decreased liver production/secretion of VLDL, decreased cholesterol and TGs in VLDL/LDL particles in serum, and increased tyrosine levels identify possible mechanisms by which rs58542926-T exerts its effects on increasing risk of fatty liver disease, decreasing cardiovascular disease, and increasing diabetes risk, respectively.
Affiliation(s)
- Daniel Seung Kim
- Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, MI
- Anne U. Jackson
- Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, MI
- Yatong K. Li
- Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, MI
- Heather M. Stringham
- Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, MI
- FinMetSeq Investigators
- Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, MI
- Division of Metabolism, Endocrinology, and Diabetes, Department of Medicine, University of Michigan, Ann Arbor, MI
- Center for Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI
- Division of Gastroenterology, Department of Medicine, University of Michigan, Ann Arbor, MI
- Institute of Clinical Medicine, Internal Medicine, University of Eastern Finland, Kuopio, Finland
- Nuclear Magnetic Resonance Metabolomics Laboratory, School of Pharmacy, University of Eastern Finland, Kuopio, Finland
- Department of Medicine, Kuopio University Hospital, Kuopio, Finland
- Computational Medicine, Faculty of Medicine, University of Oulu and Biocenter Oulu, Oulu, Finland
- Computational Medicine, School of Social and Community Medicine and Medical Research Council Integrative Epidemiology Unit, University of Bristol, Bristol, United Kingdom
- National Institute for Health and Welfare, Helsinki, Finland
- Johanna Kuusisto
- Institute of Clinical Medicine, Internal Medicine, University of Eastern Finland, Kuopio, Finland
- Department of Medicine, Kuopio University Hospital, Kuopio, Finland
- Antti J. Kangas
- Computational Medicine, Faculty of Medicine, University of Oulu and Biocenter Oulu, Oulu, Finland
- Pasi Soininen
- Nuclear Magnetic Resonance Metabolomics Laboratory, School of Pharmacy, University of Eastern Finland, Kuopio, Finland
- Computational Medicine, Faculty of Medicine, University of Oulu and Biocenter Oulu, Oulu, Finland
- Mika Ala-Korpela
- Nuclear Magnetic Resonance Metabolomics Laboratory, School of Pharmacy, University of Eastern Finland, Kuopio, Finland
- Computational Medicine, Faculty of Medicine, University of Oulu and Biocenter Oulu, Oulu, Finland
- Computational Medicine, School of Social and Community Medicine and Medical Research Council Integrative Epidemiology Unit, University of Bristol, Bristol, United Kingdom
- Charles F. Burant
- Division of Metabolism, Endocrinology, and Diabetes, Department of Medicine, University of Michigan, Ann Arbor, MI
- Center for Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI
- Veikko Salomaa
- National Institute for Health and Welfare, Helsinki, Finland
- Michael Boehnke
- Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, MI
- Markku Laakso
- Institute of Clinical Medicine, Internal Medicine, University of Eastern Finland, Kuopio, Finland
- Department of Medicine, Kuopio University Hospital, Kuopio, Finland
- Elizabeth K. Speliotes
- Center for Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI
- Division of Gastroenterology, Department of Medicine, University of Michigan, Ann Arbor, MI
13
Kaakinen M, Mägi R, Fischer K, Heikkinen J, Järvelin MR, Morris AP, Prokopenko I. A rare-variant test for high-dimensional data. Eur J Hum Genet 2017; 25:988-994. [PMID: 28537275 PMCID: PMC5513099 DOI: 10.1038/ejhg.2017.90] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 08/05/2016] [Revised: 02/17/2017] [Accepted: 03/28/2017] [Indexed: 12/22/2022]
Abstract
Genome-wide association studies have facilitated the discovery of thousands of loci for hundreds of phenotypes. However, the issue of missing heritability remains unsolved for most complex traits. Locus discovery could be enhanced with both improved power through multi-phenotype analysis (MPA) and use of a wider allele frequency range, including rare variants (RVs). MPA methods for single-variant association have been proposed, but given their low power for RVs, more efficient approaches are required. We propose multi-phenotype analysis of rare variants (MARV), a burden test-based method for RVs extended to the joint analysis of multiple phenotypes through a powerful reverse regression technique. Specifically, MARV models the proportion of RVs at which minor alleles are carried by individuals within a genomic region as a linear combination of multiple phenotypes, which can be both binary and continuous, and the method accommodates directly the genotyped and imputed data. The full model, including all phenotypes, is tested for association for discovery, and a more thorough dissection of the phenotype combinations for any set of RVs is also enabled. We show, via simulations, that the type I error rate is well controlled under various correlations between two continuous phenotypes, and that the method outperforms a univariate burden test in all considered scenarios. Application of MARV to 4876 individuals from the Northern Finland Birth Cohort 1966 for triglycerides, high- and low-density lipoprotein cholesterols highlights known loci with stronger signals of association than those observed in univariate RV analyses and suggests novel RV effects for these lipid traits.
Affiliation(s)
- Marika Kaakinen
- Department of Genomics of Common Disease, Imperial College London, London, UK
- Reedik Mägi
- Estonian Genome Center, University of Tartu, Tartu, Estonia
- Krista Fischer
- Estonian Genome Center, University of Tartu, Tartu, Estonia
- Jani Heikkinen
- Department of Genomics of Common Disease, Imperial College London, London, UK; Neuroepidemiology and Ageing (NEA) Research Unit, Imperial College London, London, UK
- Marjo-Riitta Järvelin
- Department of Epidemiology and Biostatistics, MRC-PHE Centre for Environment and Health, School of Public Health, Imperial College London, London, UK; Center for Life Course Health Research, University of Oulu, Oulu, Finland; Unit of Primary Care, Oulu University Hospital, Oulu, Finland; Biocenter Oulu, University of Oulu, Oulu, Finland
- Andrew P Morris
- Department of Biostatistics, University of Liverpool, Liverpool, UK
- Inga Prokopenko
- Department of Genomics of Common Disease, Imperial College London, London, UK
14
Lu ZH, Khondker Z, Ibrahim JG, Wang Y, Zhu H. Bayesian longitudinal low-rank regression models for imaging genetic data from longitudinal studies. Neuroimage 2017; 149:305-322. [PMID: 28143775 DOI: 10.1016/j.neuroimage.2017.01.052] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 10/18/2016] [Revised: 12/27/2016] [Accepted: 01/22/2017] [Indexed: 12/29/2022]
Abstract
To perform a joint analysis of multivariate neuroimaging phenotypes and candidate genetic markers obtained from longitudinal studies, we develop a Bayesian longitudinal low-rank regression (L2R2) model. The L2R2 model integrates three key methodologies: a low-rank matrix for approximating the high-dimensional regression coefficient matrices corresponding to the genetic main effects and their interactions with time, penalized splines for characterizing the overall time effect, and a sparse factor analysis model coupled with random effects for capturing within-subject spatio-temporal correlations of longitudinal phenotypes. Posterior computation proceeds via an efficient Markov chain Monte Carlo algorithm. Simulations show that the L2R2 model outperforms several other competing methods. We apply the L2R2 model to investigate the effect of single nucleotide polymorphisms (SNPs) on the top 10 and top 40 previously reported Alzheimer disease-associated genes. We also identify associations between the interactions of these SNPs with patient age and the tissue volumes of 93 regions of interest from patients' brain images obtained from the Alzheimer's Disease Neuroimaging Initiative.
Affiliation(s)
- Zhao-Hua Lu
- Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, TN, USA
- Zakaria Khondker
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Joseph G Ibrahim
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Yue Wang
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Hongtu Zhu
- Department of Biostatistics, University of Texas MD Anderson Cancer Center, Houston, TX, USA
15
Cichonska A, Rousu J, Marttinen P, Kangas AJ, Soininen P, Lehtimäki T, Raitakari OT, Järvelin MR, Salomaa V, Ala-Korpela M, Ripatti S, Pirinen M. metaCCA: summary statistics-based multivariate meta-analysis of genome-wide association studies using canonical correlation analysis. Bioinformatics 2016; 32:1981-9. [PMID: 27153689 PMCID: PMC4920109 DOI: 10.1093/bioinformatics/btw052] [Citation(s) in RCA: 81] [Impact Index Per Article: 9.0] [Received: 07/16/2015] [Revised: 12/04/2015] [Accepted: 01/19/2016] [Indexed: 01/22/2023]
Abstract
MOTIVATION: A dominant approach to genetic association studies is to perform univariate tests between genotype-phenotype pairs. However, analyzing related traits together increases statistical power, and certain complex associations become detectable only when several variants are tested jointly. Currently, modest sample sizes of individual cohorts, and restricted availability of individual-level genotype-phenotype data across the cohorts limit conducting multivariate tests. RESULTS: We introduce metaCCA, a computational framework for summary statistics-based analysis of a single or multiple studies that allows multivariate representation of both genotype and phenotype. It extends the statistical technique of canonical correlation analysis to the setting where original individual-level records are not available, and employs a covariance shrinkage algorithm to achieve robustness. Multivariate meta-analysis of two Finnish studies of nuclear magnetic resonance metabolomics by metaCCA, using standard univariate output from the program SNPTEST, shows an excellent agreement with the pooled individual-level analysis of original data. Motivated by strong multivariate signals in the lipid genes tested, we envision that multivariate association testing using metaCCA has a great potential to provide novel insights from already published summary statistics from high-throughput phenotyping technologies. AVAILABILITY AND IMPLEMENTATION: Code is available at https://github.com/aalto-ics-kepaco. CONTACTS: anna.cichonska@helsinki.fi or matti.pirinen@helsinki.fi. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Anna Cichonska
- Institute for Molecular Medicine Finland FIMM, University of Helsinki, Helsinki, Finland, Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University, Espoo, Finland
- Juho Rousu
- Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University, Espoo, Finland
- Pekka Marttinen
- Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University, Espoo, Finland
- Antti J Kangas
- Computational Medicine, University of Oulu, Oulu University Hospital and Biocenter Oulu, Oulu, Finland
- Pasi Soininen
- Computational Medicine, University of Oulu, Oulu University Hospital and Biocenter Oulu, Oulu, Finland, NMR Metabolomics Laboratory, School of Pharmacy, University of Eastern Finland, Kuopio, Finland
- Terho Lehtimäki
- Department of Clinical Chemistry, Fimlab Laboratories, University of Tampere School of Medicine, Tampere, Finland
- Olli T Raitakari
- Department of Clinical Physiology and Nuclear Medicine, University of Turku and Turku University Hospital, Turku, Finland, Research Centre of Applied and Preventive Cardiovascular Medicine, University of Turku and Department of Clinical Physiology and Nuclear Medicine, Turku University Hospital, Turku, Finland
- Marjo-Riitta Järvelin
- Department of Epidemiology and Biostatistics, MRC-PHE Centre for Environment & Health, School of Public Health, Imperial College London, London, UK, Centre for Life Course Epidemiology, Faculty of Medicine, University of Oulu, Oulu, Finland, Biocenter Oulu, University of Oulu, Oulu, Finland, Unit of Primary Care, Oulu University Hospital, Oulu, Finland
- Veikko Salomaa
- National Institute for Health and Welfare, Helsinki, Finland
- Mika Ala-Korpela
- Computational Medicine, University of Oulu, Oulu University Hospital and Biocenter Oulu, Oulu, Finland, NMR Metabolomics Laboratory, School of Pharmacy, University of Eastern Finland, Kuopio, Finland, Computational Medicine, School of Social and Community Medicine and the Medical Research Council Integrative Epidemiology Unit, University of Bristol, Bristol, UK
- Samuli Ripatti
- Institute for Molecular Medicine Finland FIMM, University of Helsinki, Helsinki, Finland, Public Health, University of Helsinki, Helsinki, Finland and Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK
- Matti Pirinen
- Institute for Molecular Medicine Finland FIMM, University of Helsinki, Helsinki, Finland
16
Richardson S, Tseng GC, Sun W. Statistical Methods in Integrative Genomics. Annual Review of Statistics and Its Application 2016; 3:181-209. [PMID: 27482531 PMCID: PMC4963036 DOI: 10.1146/annurev-statistics-041715-033506] [Citation(s) in RCA: 55] [Impact Index Per Article: 6.1] [Indexed: 05/28/2023]
Abstract
Statistical methods in integrative genomics aim to answer important biology questions by jointly analyzing multiple types of genomic data (vertical integration) or aggregating the same type of data across multiple studies (horizontal integration). In this article, we introduce different types of genomic data and data resources, and then review statistical methods of integrative genomics, with emphasis on the motivation and rationale of these methods. We conclude with some summary points and future research directions.
Affiliation(s)
- Sylvia Richardson
- MRC Biostatistics Unit, Cambridge Institute of Public Health, University of Cambridge, CB2 0SR, United Kingdom
- George C. Tseng
- Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA 15261
- Wei Sun
- Department of Biostatistics, Department of Genetics, University of North Carolina, Chapel Hill, NC 27599
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 27516
17
Misra BB, van der Hooft JJJ. Updates in metabolomics tools and resources: 2014-2015. Electrophoresis 2015; 37:86-110. [DOI: 10.1002/elps.201500417] [Citation(s) in RCA: 100] [Impact Index Per Article: 10.0] [Received: 09/12/2015] [Revised: 10/04/2015] [Accepted: 10/05/2015] [Indexed: 12/12/2022]
Affiliation(s)
- Biswapriya B. Misra
- Department of Biology, Genetics Institute; University of Florida; Gainesville FL USA
18
Lewin A, Saadi H, Peters JE, Moreno-Moral A, Lee JC, Smith KGC, Petretto E, Bottolo L, Richardson S. MT-HESS: an efficient Bayesian approach for simultaneous association detection in OMICS datasets, with application to eQTL mapping in multiple tissues. Bioinformatics 2015; 32:523-32. [PMID: 26504141 PMCID: PMC4743623 DOI: 10.1093/bioinformatics/btv568] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 12/05/2014] [Accepted: 09/03/2015] [Indexed: 01/22/2023]
Abstract
MOTIVATION: Analysing the joint association between a large set of responses and predictors is a fundamental statistical task in integrative genomics, exemplified by numerous expression Quantitative Trait Loci (eQTL) studies. Of particular interest are the so-called 'hotspots', important genetic variants that regulate the expression of many genes. Recently, attention has focussed on whether eQTLs are common to several tissues, cell-types or, more generally, conditions or whether they are specific to a particular condition. RESULTS: We have implemented MT-HESS, a Bayesian hierarchical model that analyses the association between a large set of predictors, e.g. SNPs, and many responses, e.g. gene expression, in multiple tissues, cells or conditions. Our Bayesian sparse regression algorithm goes beyond 'one-at-a-time' association tests between SNPs and responses and uses a fully multivariate model search across all linear combinations of SNPs, coupled with a model of the correlation between condition/tissue-specific responses. In addition, we use a hierarchical structure to leverage shared information across different genes, thus improving the detection of hotspots. We show the increase of power resulting from our new approach in an extensive simulation study. Our analysis of two case studies highlights new hotspots that would remain undetected by standard approaches and shows how greater prediction power can be achieved when several tissues are jointly considered. AVAILABILITY AND IMPLEMENTATION: C++ source code and documentation including compilation instructions are available under GNU licence at http://www.mrc-bsu.cam.ac.uk/software/.
Affiliation(s)
- Alex Lewin
- Department of Mathematics, Brunel University London
- Habib Saadi
- Department of Epidemiology and Biostatistics, Imperial College London, London
- James E Peters
- Cambridge Institute for Medical Research, University of Cambridge, Cambridge, MRC Biostatistics Unit, Cambridge Institute of Public Health, Cambridge
- James C Lee
- Cambridge Institute for Medical Research, University of Cambridge, Cambridge
- Kenneth G C Smith
- Cambridge Institute for Medical Research, University of Cambridge, Cambridge
- Enrico Petretto
- MRC Clinical Sciences Centre, Imperial College London, London, UK, Duke-NUS Graduate Medical School, Singapore, Singapore
- Leonardo Bottolo
- Department of Mathematics, Imperial College London, London, UK and Department of Medical Genetics, University of Cambridge
- Sylvia Richardson
- MRC Biostatistics Unit, Cambridge Institute of Public Health, Cambridge
19
Okser S, Pahikkala T, Airola A, Salakoski T, Ripatti S, Aittokallio T. Regularized machine learning in the genetic prediction of complex traits. PLoS Genet 2014; 10:e1004754. [PMID: 25393026 PMCID: PMC4230844 DOI: 10.1371/journal.pgen.1004754] [Citation(s) in RCA: 94] [Impact Index Per Article: 8.5] [Indexed: 01/14/2023]
Affiliation(s)
- Sebastian Okser
- Department of Information Technology, University of Turku, Turku, Finland
- Turku Centre for Computer Science (TUCS), University of Turku and Åbo Akademi University, Turku, Finland
- Tapio Pahikkala
- Department of Information Technology, University of Turku, Turku, Finland
- Turku Centre for Computer Science (TUCS), University of Turku and Åbo Akademi University, Turku, Finland
- Antti Airola
- Department of Information Technology, University of Turku, Turku, Finland
- Turku Centre for Computer Science (TUCS), University of Turku and Åbo Akademi University, Turku, Finland
- Tapio Salakoski
- Department of Information Technology, University of Turku, Turku, Finland
- Turku Centre for Computer Science (TUCS), University of Turku and Åbo Akademi University, Turku, Finland
- Samuli Ripatti
- Hjelt Institute, University of Helsinki, Helsinki, Finland
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
- Wellcome Trust Sanger Institute, Hinxton, United Kingdom
- Tero Aittokallio
- Turku Centre for Computer Science (TUCS), University of Turku and Åbo Akademi University, Turku, Finland
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland