1
Mudappathi R, Patton T, Chen H, Yang P, Sun Z, Wang P, Shi CX, Wang J, Liu L. reg-eQTL: Integrating transcription factor effects to unveil regulatory variants. Am J Hum Genet 2025; 112:659-674. [PMID: 39922197 PMCID: PMC11947170 DOI: 10.1016/j.ajhg.2025.01.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2024] [Revised: 01/09/2025] [Accepted: 01/15/2025] [Indexed: 02/10/2025]
Abstract
Regulatory single-nucleotide variants (rSNVs) in noncoding regions of the genome play a crucial role in gene transcription by altering transcription factor (TF) binding, chromatin states, and other epigenetic modifications. Existing expression quantitative trait locus (eQTL) methods identify genomic loci associated with gene-expression changes, but they often fall short in pinpointing causal variants. We introduce reg-eQTL, a computational method that incorporates TF effects and interactions with genetic variants into eQTL analysis. This approach provides deeper insights into the regulatory mechanisms, bringing us one step closer to identifying potential causal variants by uncovering how TFs interact with SNVs to influence gene expression. This method defines a trio consisting of a genetic variant, a target gene, and a TF and tests its impact on gene transcription. In comprehensive simulations, reg-eQTL shows improved power of detecting rSNVs with low population frequency, weak effects, and synergetic interaction with TF as compared to traditional eQTL methods. Application of reg-eQTL to GTEx data from lung, brain, and whole-blood tissues uncovered regulatory trios that include eQTLs and increased the number of eQTLs shared across tissue types. Regulatory networks constructed on the basis of these trios reveal intricate gene regulation across tissue types.
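The trio idea in this abstract (variant, target gene, TF) can be illustrated with an ordinary least-squares fit that includes a variant-by-TF interaction term. This is a minimal sketch under stated assumptions, not the reg-eQTL implementation: the function names `fit_trio`, `simulate`, and `solve_linear`, the simulated effect sizes, and the choice of plain OLS are all hypothetical and chosen only for illustration.

```python
import random


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_trio(genotype, tf_level, expression):
    """OLS fit of expression ~ 1 + genotype + tf + genotype * tf.

    Returns [intercept, variant effect, TF effect, interaction effect].
    """
    rows = [[1.0, g, t, g * t] for g, t in zip(genotype, tf_level)]
    p = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, expression)) for i in range(p)]
    return solve_linear(xtx, xty)


def simulate(n=2000, seed=0):
    """Simulate one trio with made-up effect sizes (assumption, not from the paper)."""
    rng = random.Random(seed)
    maf = 0.3  # hypothetical minor-allele frequency
    genotype = [float((rng.random() < maf) + (rng.random() < maf)) for _ in range(n)]
    tf_level = [rng.gauss(0.0, 1.0) for _ in range(n)]
    expression = [
        1.0 + 0.2 * g + 0.5 * t + 0.4 * g * t + rng.gauss(0.0, 0.5)
        for g, t in zip(genotype, tf_level)
    ]
    return genotype, tf_level, expression


g, t, y = simulate()
b0, b_snv, b_tf, b_int = fit_trio(g, t, y)
print(f"variant={b_snv:.3f} tf={b_tf:.3f} interaction={b_int:.3f}")
```

A model without the interaction column would attribute only the average variant effect to the SNV, which is one way to see why a variant whose effect depends on TF level could be missed by a marginal eQTL test.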
Affiliation(s)
- Rekha Mudappathi
- College of Health Solutions, Arizona State University, Phoenix, AZ 85004, USA; Biodesign Institute, Arizona State University, Tempe, AZ 85281, USA; Division of Epidemiology, Department of Quantitative Health Sciences, Mayo Clinic, Scottsdale, AZ 85259, USA
- Tatiana Patton
- College of Health Solutions, Arizona State University, Phoenix, AZ 85004, USA; Biodesign Institute, Arizona State University, Tempe, AZ 85281, USA
- Hai Chen
- College of Health Solutions, Arizona State University, Phoenix, AZ 85004, USA; Biodesign Institute, Arizona State University, Tempe, AZ 85281, USA; Division of Epidemiology, Department of Quantitative Health Sciences, Mayo Clinic, Scottsdale, AZ 85259, USA
- Ping Yang
- Division of Epidemiology, Department of Quantitative Health Sciences, Mayo Clinic, Scottsdale, AZ 85259, USA
- Zhifu Sun
- Division of Computational Biology, Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN 55905, USA
- Panwen Wang
- Department of Quantitative Health Sciences and Center for Individualized Medicine, Mayo Clinic, Scottsdale, AZ, USA
- Chang-Xin Shi
- Division of Hematology/Oncology, Department of Medicine, Mayo Clinic, Scottsdale, AZ 85259, USA
- Junwen Wang
- Department of Quantitative Health Sciences and Center for Individualized Medicine, Mayo Clinic, Scottsdale, AZ, USA; Division of Applied Oral Sciences & Community Dental Care, Faculty of Dentistry, The University of Hong Kong, 34 Hospital Road, Hong Kong SAR, China
- Li Liu
- College of Health Solutions, Arizona State University, Phoenix, AZ 85004, USA; Biodesign Institute, Arizona State University, Tempe, AZ 85281, USA.
2
Huang C, Butterly CR, Moody D, Pourkheirandish M. Mini review: Targeting below-ground plant performance to improve nitrogen use efficiency (NUE) in barley. Front Genet 2023; 13:1060304. [PMID: 36935938 PMCID: PMC10017981 DOI: 10.3389/fgene.2022.1060304] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2022] [Accepted: 12/19/2022] [Indexed: 03/06/2023]
Abstract
Nitrogen (N) fertilizer is one of the major inputs for grain crops including barley and its usage is increasing globally. However, N use efficiency (NUE) is low in cereal crops, leading to higher production costs, unfulfilled grain yield potential and environmental hazards. N uptake is initiated from plant root tips but a very limited number of studies have been conducted on roots relevant to NUE specifically. In this review, we used barley, the fourth most important cereal crop, as the primary study plant to investigate this topic. We first highlighted the recent progress and study gaps in genetic analysis results, primarily, the genome-wide association study (GWAS) regarding both biological and statistical considerations. In addition, different factors contributing to NUE are discussed in terms of root morphological and anatomical traits, as well as physiological mechanisms such as N transporter activities and hormonal regulation.
Affiliation(s)
- Claire Huang
- Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Melbourne, VIC, Australia
- Clayton R. Butterly
- Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Melbourne, VIC, Australia
- David Moody
- InterGrain Pty Ltd., Bibra Lake, WA, Australia
- Mohammad Pourkheirandish
- Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Melbourne, VIC, Australia
3
Bykova M, Hou Y, Eng C, Cheng F. Quantitative trait locus (xQTL) approaches identify risk genes and drug targets from human non-coding genomes. Hum Mol Genet 2022; 31:R105-R113. [PMID: 36018824 PMCID: PMC9989738 DOI: 10.1093/hmg/ddac208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2022] [Revised: 08/18/2022] [Accepted: 08/19/2022] [Indexed: 11/13/2022]
Abstract
Advances and reduction of costs in various sequencing technologies allow for a closer look at variations present in the non-coding regions of the human genome. Correlating non-coding variants with large-scale multi-omic data holds the promise not only of a better understanding of likely causal connections between non-coding DNA and expression of traits but also identifying potential disease-modifying medicines. Genome-phenome association studies have created large datasets of DNA variants that are associated with multiple traits or diseases, such as Alzheimer's disease; yet, the functional consequences of variants, in particular of non-coding variants, remain largely unknown. Recent advances in functional genomics and computational approaches have led to the identification of potential roles of DNA variants, such as various quantitative trait locus (xQTL) techniques. Multi-omic assays and analytic approaches toward xQTL have identified links between genetic loci and human transcriptomic, epigenomic, proteomic and metabolomic data. In this review, we first discuss the recent development of xQTL from multi-omic findings. We then highlight multimodal analysis of xQTL and genetic data for identification of risk genes and drug targets using Alzheimer's disease as an example. We finally discuss challenges and future research directions (e.g. artificial intelligence) for annotation of non-coding variants in complex diseases.
Affiliation(s)
- Marina Bykova
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
- Yuan Hou
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
- Charis Eng
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
- Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH 44195, USA
- Feixiong Cheng
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
- Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH 44195, USA
4
Genome and transcriptome profiling of spontaneous preterm birth phenotypes. Sci Rep 2022; 12:1003. [PMID: 35046466 PMCID: PMC8770724 DOI: 10.1038/s41598-022-04881-0] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 05/11/2021] [Accepted: 12/23/2021] [Indexed: 12/27/2022]
Abstract
Preterm birth (PTB) occurs before 37 weeks of gestation. Risk factors include genetics and infection/inflammation. Different mechanisms have been reported for spontaneous preterm birth (SPTB) and preterm birth following preterm premature rupture of membranes (PPROM). This study aimed to identify early pregnancy biomarkers of SPTB and PPROM from the maternal genome and transcriptome. Pregnant women were recruited at the Liverpool Women’s Hospital. Pregnancy outcomes were categorised as SPTB, PPROM (≤ 34 weeks gestation, n = 53), high-risk term (HTERM, ≥ 37 weeks, n = 126) or low-risk (no history of SPTB/PPROM) term (LTERM, ≥ 39 weeks, n = 188). Blood samples were collected at 16 and 20 weeks gestation, from which genome (UK Biobank Axiom array) and transcriptome (Clariom D Human assay) data were acquired. PLINK and R were used to perform genetic association and differential expression analyses and expression quantitative trait loci (eQTL) mapping. Several significant molecular signatures were identified across the analyses in preterm cases. Genome-wide significant SNP rs14675645 (ASTN1) was associated with SPTB, whereas the microRNA-142 transcript and the PPARG1-FOXP3 gene set were associated with PPROM at week 20 of gestation and are related to inflammation and immune response. This study has determined genomic and transcriptomic candidate biomarkers of SPTB and PPROM that require validation in diverse populations.
5
Abstract
Expression quantitative trait locus (eQTL) analysis has proven to be a powerful method to describe how variation in phenotypes may be attributed to a given genotype. While the field of bioinformatics and genomics has experienced exponential growth with modern technological advances, an unintended consequence arises as a lack of a gold standard for many applications and methods, which may be compounded with ever-improving computational capabilities. Researchers working on eQTL analysis have at their disposal a multitude of bioinformatics software, each with different assumptions and algorithms, which may produce confusion as to their respective applicability. In this chapter, we will introduce eQTLs, survey commonly used software to conduct a mapping study, as well as provide data correction methods to avoid the pitfalls of such analyses.
Affiliation(s)
- Conor Nodzak
- Department of Bioinformatics and Genomics, College of Computing and Informatics, University of North Carolina at Charlotte, Charlotte, NC, USA.
6
Li G, Shabalin AA, Rusyn I, Wright FA, Nobel AB. An empirical Bayes approach for multiple tissue eQTL analysis. Biostatistics 2019; 19:391-406. [PMID: 29029013 DOI: 10.1093/biostatistics/kxx048] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Received: 12/28/2016] [Accepted: 08/23/2017] [Indexed: 12/18/2022]
Abstract
Expression quantitative trait locus (eQTL) analyses identify genetic markers associated with the expression of a gene. Most up-to-date eQTL studies consider the connection between genetic variation and expression in a single tissue. Multi-tissue analyses have the potential to improve findings in a single tissue, and elucidate the genotypic basis of differences between tissues. In this article, we develop a hierarchical Bayesian model (MT-eQTL) for multi-tissue eQTL analysis. MT-eQTL explicitly captures patterns of variation in the presence or absence of eQTL, as well as the heterogeneity of effect sizes across tissues. We devise an efficient Expectation-Maximization (EM) algorithm for model fitting. Inferences concerning eQTL detection and the configuration of eQTL across tissues are derived from the adaptive thresholding of local false discovery rates, and maximum a posteriori estimation, respectively. We also provide theoretical justification of the adaptive procedure. We investigate the MT-eQTL model through an extensive analysis of a 9-tissue data set from the GTEx initiative.
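The abstract mentions adaptive thresholding of local false discovery rates (lfdr). One widely used rule of this kind rejects the largest set of tests, taken in order of increasing lfdr, whose average lfdr stays at or below a target level. The sketch below illustrates that generic rule only; whether MT-eQTL uses exactly this procedure is not stated here, and the function name `adaptive_lfdr_threshold` and the example values are hypothetical.

```python
def adaptive_lfdr_threshold(lfdr, alpha=0.05):
    """Return sorted indices of tests rejected by an average-lfdr rule.

    Tests are taken in order of increasing lfdr; the rejection set grows
    while the running mean of the included lfdr values is <= alpha.
    """
    order = sorted(range(len(lfdr)), key=lambda i: lfdr[i])
    rejected = []
    running_sum = 0.0
    for k, idx in enumerate(order, start=1):
        running_sum += lfdr[idx]
        if running_sum / k <= alpha:
            rejected = order[:k]
        else:
            # The running mean of sorted values never decreases, so stop here.
            break
    return sorted(rejected)


# Hypothetical lfdr values for six gene-SNP pairs.
example_lfdr = [0.01, 0.2, 0.03, 0.9, 0.04, 0.11]
selected = adaptive_lfdr_threshold(example_lfdr, alpha=0.05)
print(selected)
```

Note that a pair with lfdr above the target (0.11 in the example) can still be included, because the criterion is the average over the rejection set rather than a fixed per-test cutoff.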
Affiliation(s)
- Gen Li
- Department of Biostatistics, Mailman School of Public Health, Columbia University, 722 W 168th St, New York, NY, 10032 USA
- Andrey A Shabalin
- Center for Biomarker Research and Personalized Medicine, Virginia Commonwealth University, 1112 East Clay Street, Richmond, VA, 23298 USA
- Ivan Rusyn
- Texas Veterinary Medical Center, Texas A & M University, 660 Raymond Stotzer Pkwy, College Station, TX, 77843 USA
- Fred A Wright
- Department of Statistics and Biological Sciences, North Carolina State University, 1 Lampe Drive, Raleigh, NC, 27695 USA
- Andrew B Nobel
- Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, 318 E Cameron Ave, Chapel Hill, NC, 27599 USA
7
Beretta S, Castelli M, Gonçalves I, Kel I, Giansanti V, Merelli I. Improving eQTL Analysis Using a Machine Learning Approach for Data Integration: A Logistic Model Tree Solution. J Comput Biol 2018; 25:1091-1105. [DOI: 10.1089/cmb.2017.0167] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 02/01/2023]
Affiliation(s)
- Stefano Beretta
- Dipartimento di Informatica Sistemistica e Comunicazione, Università degli Studi di Milano-Bicocca, Milan, Italy
- Istituto di Tecnologie Biomediche, Consiglio Nazionale Ricerche, Segrate, Italy
- Mauro Castelli
- NOVA Information Management School (NOVA IMS), Universidade Nova de Lisboa, Campus de Campolide, 1070-312, Lisboa, Portugal
- Ivo Gonçalves
- NOVA Information Management School (NOVA IMS), Universidade Nova de Lisboa, Campus de Campolide, 1070-312, Lisboa, Portugal
- INESC Coimbra, DEEC, University of Coimbra, Polo 2, 3030-290, Coimbra, Portugal
- Ivan Kel
- Istituto di Tecnologie Biomediche, Consiglio Nazionale Ricerche, Segrate, Italy
- Valentina Giansanti
- Istituto di Tecnologie Biomediche, Consiglio Nazionale Ricerche, Segrate, Italy
- Ivan Merelli
- Istituto di Tecnologie Biomediche, Consiglio Nazionale Ricerche, Segrate, Italy
8
Palowitch J, Shabalin A, Zhou YH, Nobel AB, Wright FA. Estimation of cis-eQTL effect sizes using a log of linear model. Biometrics 2018; 74:616-625. [PMID: 29073327 PMCID: PMC5920774 DOI: 10.1111/biom.12810] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 08/01/2016] [Revised: 09/01/2017] [Accepted: 09/01/2017] [Indexed: 11/29/2022]
Abstract
The study of expression Quantitative Trait Loci (eQTL) is an important problem in genomics and biomedicine. While detection (testing) of eQTL associations has been widely studied, less work has been devoted to the estimation of eQTL effect size. To reduce false positives, detection methods frequently rely on linear modeling of rank-based normalized or log-transformed gene expression data. Unfortunately, these approaches do not correspond to the simplest model of eQTL action, and thus yield estimates of eQTL association that can be uninterpretable and inaccurate. In this article, we propose a new, log-of-linear model for eQTL action, termed ACME, that captures allelic contributions to cis-acting eQTLs in an additive fashion, yielding effect size estimates that correspond to a biologically coherent model of cis-eQTLs. We describe a non-linear least-squares algorithm to fit the model by maximum likelihood, and obtain corresponding p-values. We perform careful investigation of the model using a combination of simulated data and data from the Genotype Tissue Expression (GTEx) project. Our results reveal little evidence for dominance effects, a parsimonious result that accords with a simple biological model for allele-specific expression and supports use of the ACME model. We show that Type-I error is well-controlled under our approach in a realistic setting, so that rank-based normalizations are unnecessary. Furthermore, we show that such normalizations can be detrimental to power and estimation accuracy under the proposed model. We then show, through effect size analyses of whole-genome cis-eQTLs in the GTEx data, that using standard normalizations instead of ACME noticeably affects the ranking and sign of estimates.
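The contrast this abstract draws can be written out side by side. The notation below is assumed for illustration and is not quoted from the paper: $y_i$ is the expression of a gene in sample $i$, $s_i \in \{0, 1, 2\}$ is the allele count at the cis variant, $z_i$ is a covariate vector, and $\varepsilon_i$ is noise. It is a sketch of what a log-of-linear form might look like next to the conventional linear model on log-transformed expression.

```latex
% Conventional model: linear in genotype on the log scale,
% so each allele copy acts multiplicatively on expression.
\log y_i = \alpha_0 + \alpha_1 s_i + z_i^{\top}\gamma + \varepsilon_i

% Log-of-linear form: allele copies contribute additively on the raw scale,
% while covariates and noise remain multiplicative.
\log y_i = \log\left(\beta_0 + \beta_1 s_i\right) + z_i^{\top}\gamma + \varepsilon_i
```

Under the second form, the ratio $\beta_1 / \beta_0$ reads as the relative change in expression per allele copy. The model is nonlinear in $(\beta_0, \beta_1)$, which is consistent with the abstract's mention of a non-linear least-squares fit.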
Affiliation(s)
- John Palowitch
- Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
- Andrey Shabalin
- Department of Psychiatry, University of Utah, Salt Lake City, Utah 84108, U.S.A
- Yi-Hui Zhou
- Bioinformatics Research Center and Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina, U.S.A
- Andrew B Nobel
- Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
- Fred A Wright
- Bioinformatics Research Center and Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina, U.S.A
- Department of Statistics, North Carolina State University, Raleigh, North Carolina, U.S.A
9
Cote I, Andersen ME, Ankley GT, Barone S, Birnbaum LS, Boekelheide K, Bois FY, Burgoon LD, Chiu WA, Crawford-Brown D, Crofton KM, DeVito M, Devlin RB, Edwards SW, Guyton KZ, Hattis D, Judson RS, Knight D, Krewski D, Lambert J, Maull EA, Mendrick D, Paoli GM, Patel CJ, Perkins EJ, Poje G, Portier CJ, Rusyn I, Schulte PA, Simeonov A, Smith MT, Thayer KA, Thomas RS, Thomas R, Tice RR, Vandenberg JJ, Villeneuve DL, Wesselkamper S, Whelan M, Whittaker C, White R, Xia M, Yauk C, Zeise L, Zhao J, DeWoskin RS. The Next Generation of Risk Assessment Multi-Year Study-Highlights of Findings, Applications to Risk Assessment, and Future Directions. Environ Health Perspect 2016; 124:1671-1682. [PMID: 27091369 PMCID: PMC5089888 DOI: 10.1289/ehp233] [Citation(s) in RCA: 68] [Impact Index Per Article: 7.6] [Received: 06/12/2015] [Revised: 10/30/2015] [Accepted: 03/29/2016] [Indexed: 05/19/2023]
Abstract
BACKGROUND: The Next Generation (NexGen) of Risk Assessment effort is a multi-year collaboration among several organizations evaluating new, potentially more efficient molecular, computational, and systems biology approaches to risk assessment. This article summarizes our findings, suggests applications to risk assessment, and identifies strategic research directions. OBJECTIVE: Our specific objectives were to test whether advanced biological data and methods could better inform our understanding of public health risks posed by environmental exposures. METHODS: New data and methods were applied and evaluated for use in hazard identification and dose-response assessment. Biomarkers of exposure and effect, and risk characterization were also examined. Consideration was given to various decision contexts with increasing regulatory and public health impacts. Data types included transcriptomics, genomics, and proteomics. Methods included molecular epidemiology and clinical studies, bioinformatic knowledge mining, pathway and network analyses, short-duration in vivo and in vitro bioassays, and quantitative structure activity relationship modeling. DISCUSSION: NexGen has advanced our ability to apply new science by more rapidly identifying chemicals and exposures of potential concern, helping characterize mechanisms of action that influence conclusions about causality, exposure-response relationships, susceptibility and cumulative risk, and by elucidating new biomarkers of exposure and effects. Additionally, NexGen has fostered extensive discussion among risk scientists and managers and improved confidence in interpreting and applying new data streams. CONCLUSIONS: While considerable uncertainties remain, thoughtful application of new knowledge to risk assessment appears reasonable for augmenting major scope assessments, forming the basis for or augmenting limited scope assessments, and for prioritization and screening of very data limited chemicals.
Affiliation(s)
- Ila Cote
- National Center for Environmental Assessment, U.S. Environmental Protection Agency (EPA), Washington, District of Columbia, USA
- Address correspondence to I. Cote, U.S. Environmental Protection Agency, Region 8, Room 8152, 1595 Wynkoop St., Denver, CO 80202-1129 USA. Telephone: (202) 288-9539. E-mail:
- Gerald T. Ankley
- National Health and Environmental Effects Research Laboratory, U.S. EPA, Duluth, Minnesota, USA
- Stanley Barone
- Office of Chemical Safety and Pollution Prevention, U.S. EPA, Washington, District of Columbia, USA
- Linda S. Birnbaum
- National Institute of Environmental Health Sciences, and
- National Toxicology Program, National Institutes of Health (NIH), Department of Health and Human Services (DHHS), Research Triangle Park, North Carolina, USA
- Kim Boekelheide
- Department of Pathology and Laboratory Medicine, Brown University, Providence, Rhode Island, USA
- Frederic Y. Bois
- Unité Modèles pour l’Écotoxicologie et la Toxicologie, Institut National de l’Environnement Industriel et des Risques, Verneuil en Halatte, France
- Lyle D. Burgoon
- U.S. Army Engineer Research and Development Center, Research Triangle Park, North Carolina, USA
- Weihsueh A. Chiu
- Department of Veterinary Integrative Biosciences, College of Veterinary Medicine and Biomedical Sciences, Texas A&M University, College Station, Texas, USA
- Michael DeVito
- National Institute of Environmental Health Sciences, and
- National Toxicology Program, National Institutes of Health (NIH), Department of Health and Human Services (DHHS), Research Triangle Park, North Carolina, USA
- Robert B. Devlin
- National Health and Environmental Effects Research Laboratory, U.S. EPA, Research Triangle Park, North Carolina, USA
- Stephen W. Edwards
- National Health and Environmental Effects Research Laboratory, U.S. EPA, Research Triangle Park, North Carolina, USA
- Dale Hattis
- George Perkins Marsh Institute, Clark University, Worcester, Massachusetts, USA
- Derek Knight
- European Chemicals Agency, Annankatu, Helsinki, Finland
- Daniel Krewski
- McLaughlin Centre for Population Health Risk Assessment, University of Ottawa, Ottawa, Ontario, Canada
- Jason Lambert
- National Center for Environmental Assessment, U.S. EPA, Cincinnati, Ohio, USA
- Elizabeth Anne Maull
- National Institute of Environmental Health Sciences, and
- National Toxicology Program, National Institutes of Health (NIH), Department of Health and Human Services (DHHS), Research Triangle Park, North Carolina, USA
- Donna Mendrick
- National Center for Toxicological Research, Food and Drug Administration, Jefferson, Arkansas, USA
- Chirag Jagdish Patel
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
- Edward J. Perkins
- U.S. Army Engineer Research and Development Center, Vicksburg, Mississippi, USA
- Gerald Poje
- Grant Consulting Group, Washington, District of Columbia, USA
- Ivan Rusyn
- Department of Veterinary Integrative Biosciences, College of Veterinary Medicine and Biomedical Sciences, Texas A&M University, College Station, Texas, USA
- Paul A. Schulte
- Education and Information Division, National Institute for Occupational Safety and Health, Centers for Disease Control and Prevention, Cincinnati, Ohio, USA
- Anton Simeonov
- National Center for Advancing Translational Sciences, NIH, DHHS, Bethesda, Maryland, USA
- Martyn T. Smith
- Division of Environmental Health Sciences, School of Public Health, University of California, Berkeley, Berkeley, California, USA
- Kristina A. Thayer
- National Institute of Environmental Health Sciences, and
- National Toxicology Program, National Institutes of Health (NIH), Department of Health and Human Services (DHHS), Research Triangle Park, North Carolina, USA
- Reuben Thomas
- Gladstone Institutes, University of California, San Francisco, San Francisco, California, USA
- Raymond R. Tice
- National Institute of Environmental Health Sciences, and
- National Toxicology Program, National Institutes of Health (NIH), Department of Health and Human Services (DHHS), Research Triangle Park, North Carolina, USA
- John J. Vandenberg
- National Center for Environmental Assessment, U.S. Environmental Protection Agency (EPA), Washington, District of Columbia, USA
- Daniel L. Villeneuve
- National Health and Environmental Effects Research Laboratory, U.S. EPA, Duluth, Minnesota, USA
- Scott Wesselkamper
- National Center for Environmental Assessment, U.S. EPA, Cincinnati, Ohio, USA
- Maurice Whelan
- Systems Toxicology Unit, European Commission Joint Research Centre, Ispra, Italy
- Christine Whittaker
- Education and Information Division, National Institute for Occupational Safety and Health, Centers for Disease Control and Prevention, Cincinnati, Ohio, USA
- Ronald White
- Center for Effective Government, Washington, District of Columbia, USA
- Menghang Xia
- National Center for Advancing Translational Sciences, NIH, DHHS, Bethesda, Maryland, USA
- Carole Yauk
- Environmental Health Science and Research Bureau, Health Canada, Ottawa, Ontario, Canada
- Lauren Zeise
- Office of Environmental Health Hazard Assessment, California EPA, Oakland, California, USA
- Jay Zhao
- National Center for Environmental Assessment, U.S. EPA, Cincinnati, Ohio, USA
- Robert S. DeWoskin
- National Center for Environmental Assessment, U.S. Environmental Protection Agency (EPA), Washington, District of Columbia, USA
10
Danjou F, Zoledziewska M, Sidore C, Steri M, Busonero F, Maschio A, Mulas A, Perseu L, Barella S, Porcu E, Pistis G, Pitzalis M, Pala M, Menzel S, Metrustry S, Spector TD, Leoni L, Angius A, Uda M, Moi P, Thein SL, Galanello R, Abecasis GR, Schlessinger D, Sanna S, Cucca F. Genome-wide association analyses based on whole-genome sequencing in Sardinia provide insights into regulation of hemoglobin levels. Nat Genet 2015; 47:1264-71. [PMID: 26366553 PMCID: PMC4627580 DOI: 10.1038/ng.3307] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.0] [Received: 12/09/2014] [Accepted: 04/23/2015] [Indexed: 12/20/2022]
Abstract
We report genome-wide association study results for the levels of A1, A2 and fetal hemoglobins, analyzed for the first time concurrently. Integrating high-density array genotyping and whole-genome sequencing in a large general population cohort from Sardinia, we detected 23 associations at 10 loci. Five signals are due to variants at previously undetected loci: MPHOSPH9, PLTP-PCIF1, ZFPM1 (FOG1), NFIX and CCND3. Among the signals at known loci, ten are new lead variants and four are new independent signals. Half of all variants also showed pleiotropic associations with different hemoglobins, which further corroborated some of the detected associations and identified features of coordinated hemoglobin species production.
Affiliation(s)
- Fabrice Danjou
- Istituto di Ricerca Genetica e Biomedica, CNR, Monserrato, Cagliari, Italy
- Carlo Sidore
- Istituto di Ricerca Genetica e Biomedica, CNR, Monserrato, Cagliari, Italy
- Center for Statistical Genetics, Ann Arbor, University of Michigan, MI, USA
- Università degli Studi di Sassari, Sassari, Italy
- Maristella Steri
- Istituto di Ricerca Genetica e Biomedica, CNR, Monserrato, Cagliari, Italy
- Fabio Busonero
- Istituto di Ricerca Genetica e Biomedica, CNR, Monserrato, Cagliari, Italy
- Center for Statistical Genetics, Ann Arbor, University of Michigan, MI, USA
- University of Michigan, DNA Sequencing Core, Ann Arbor, MI, USA
- Andrea Maschio
- Istituto di Ricerca Genetica e Biomedica, CNR, Monserrato, Cagliari, Italy
- Center for Statistical Genetics, Ann Arbor, University of Michigan, MI, USA
- University of Michigan, DNA Sequencing Core, Ann Arbor, MI, USA
- Antonella Mulas
- Istituto di Ricerca Genetica e Biomedica, CNR, Monserrato, Cagliari, Italy
- Università degli Studi di Sassari, Sassari, Italy
- Lucia Perseu
- Istituto di Ricerca Genetica e Biomedica, CNR, Monserrato, Cagliari, Italy
- Susanna Barella
- Ospedale Regionale per le Microcitemie, ASL8, Cagliari, Italy
- Eleonora Porcu
- Istituto di Ricerca Genetica e Biomedica, CNR, Monserrato, Cagliari, Italy
- Center for Statistical Genetics, Ann Arbor, University of Michigan, MI, USA
- Università degli Studi di Sassari, Sassari, Italy
- Giorgio Pistis
- Istituto di Ricerca Genetica e Biomedica, CNR, Monserrato, Cagliari, Italy
- Center for Statistical Genetics, Ann Arbor, University of Michigan, MI, USA
- Università degli Studi di Sassari, Sassari, Italy
- Mauro Pala
- Istituto di Ricerca Genetica e Biomedica, CNR, Monserrato, Cagliari, Italy
- Stephan Menzel
- Department of Molecular Hematology, King’s College London, London, UK
- Sarah Metrustry
- Department of Twin Research and Genetic Epidemiology, King’s College London, UK
- Timothy D. Spector
- Department of Twin Research and Genetic Epidemiology, King’s College London, UK
- Lidia Leoni
- Center for Advanced Studies, Research, and Development in Sardinia (CRS4), AGCT Program, Parco Scientifico e tecnologico della Sardegna, Pula, Italy
- Andrea Angius
- Istituto di Ricerca Genetica e Biomedica, CNR, Monserrato, Cagliari, Italy
- Center for Advanced Studies, Research, and Development in Sardinia (CRS4), AGCT Program, Parco Scientifico e tecnologico della Sardegna, Pula, Italy
- Manuela Uda
- Istituto di Ricerca Genetica e Biomedica, CNR, Monserrato, Cagliari, Italy
- Paolo Moi
- Ospedale Regionale per le Microcitemie, ASL8, Cagliari, Italy
- Department of Public Health and Clinical and Molecular Medicine, University of Cagliari, Cagliari, Italy
- Swee Lay Thein
- Department of Molecular Hematology, King’s College London, London, UK
- Department of Hematological Medicine, King’s College Hospital NHS Foundation Trust, London, UK
| | - Renzo Galanello
- Ospedale Regionale per le Microcitemie, ASL8, Cagliari, Italy
- Department of Public Health and Clinical and Molecular Medicine, University of Cagliari, Cagliari, Italy
- Renzo Galanello prematurely passed away on May, 13 2013
| | | | - David Schlessinger
- Laboratory of Genetics, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA
| | - Serena Sanna
- Istituto di Ricerca Genetica e Biomedica, CNR, Monserrato, Cagliari, Italy
| | - Francesco Cucca
- Istituto di Ricerca Genetica e Biomedica, CNR, Monserrato, Cagliari, Italy
- Università degli Studi di Sassari, Sassari, Italy
| |
Collapse
|
11
|
Statistical and Computational Methods for Genetic Diseases: An Overview. Computational and Mathematical Methods in Medicine 2015; 2015:954598. [PMID: 26106440 PMCID: PMC4464008 DOI: 10.1155/2015/954598] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 09/16/2014] [Accepted: 04/23/2015] [Indexed: 12/19/2022]
Abstract
The identification of causes of genetic diseases has been carried out by several approaches with increasing complexity. Innovation in genetic methodologies leads to the production of large amounts of data that need the support of statistical and computational methods to be correctly processed. The aim of the paper is to provide an overview of statistical and computational methods, paying attention to methods for sequence analysis and complex diseases.
|
12
|
Hancock DB, Gaddis NC, Levy JL, Bierut LJ, Kral AH, Johnson EO. Associations of common variants in the BST2 region with HIV-1 acquisition in African American and European American people who inject drugs. AIDS 2015; 29:767-77. [PMID: 25985399 PMCID: PMC4439198 DOI: 10.1097/qad.0000000000000604] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Indexed: 12/27/2022]
Abstract
OBJECTIVE: The bone marrow stromal cell antigen 2 (BST2) gene encodes a host restriction factor that acts as an innate immune sensor of HIV-1 exposure and suppresses release of HIV-1 particles. We aimed to identify associations of variants in the BST2 gene region with HIV-1 acquisition and disease progression. DESIGN/METHODS: Using HIV+ cases and HIV- controls from the Urban Health Study (n=3136 African Americans and European Americans who inject drugs), we tested 470 variants in BST2 and its flanking regions for association with HIV-1 acquisition and log-transformed viral load. RESULTS: We found that the single nucleotide polymorphism (SNP) rs113189798 surpassed the P value threshold corrected for multiple testing. The rs113189798-G allele (frequency=16% in African Americans, 4% in European Americans) was associated with increased HIV-1 acquisition risk (meta-analysis P=1.43 × 10): odds ratio (95% confidence interval) of 1.22 (1.01-1.49) in African Americans and 2.17 (1.43-3.33) in European Americans. We also found that the previously reported rs12609479-A allele (frequency=35% in African Americans, 81% in European Americans) was nominally associated with decreased risk of acquiring HIV-1 in our study (meta-analysis P=0.036). Rs12609479-A is predicted to increase BST2 expression and thereby decrease risk of acquiring HIV-1. Rs113189798 and rs12609479 were only weakly correlated [square of the correlation coefficient (r²)=0.2-0.4] and represented distinct association signals. None of our tested variants were significantly associated with log-transformed viral load among the HIV-infected cases. CONCLUSION: Our findings support BST2 as a genetic susceptibility factor for HIV-1 acquisition: identifying a novel SNP association for rs113189798 and linking the previously reported regulatory SNP rs12609479 to HIV-1 acquisition.
Affiliation(s)
- Dana B Hancock
- (a) Behavioral Health Epidemiology Program, Behavioral Health and Criminal Justice Division; (b) Research Computing Division, Research Triangle Institute (RTI) International, Research Triangle Park, North Carolina; (c) Department of Psychiatry, Washington University School of Medicine, St. Louis, Missouri; (d) Urban Health Program, Behavioral Health and Criminal Justice Division, RTI International, San Francisco, California; (e) Fellow Program and Behavioral Health and Criminal Justice Division, RTI International, Research Triangle Park, North Carolina, USA
|
13
|
Yin Z, Xia K, Chung W, Sullivan PF, Zou F. Fast eQTL Analysis for Twin Studies. Genet Epidemiol 2015; 39:357-65. [PMID: 25865703 DOI: 10.1002/gepi.21900] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 09/25/2014] [Revised: 01/17/2015] [Accepted: 02/23/2015] [Indexed: 12/29/2022]
Abstract
Twin data are commonly used for studying complex psychiatric disorders, and mixed effects models are one of the most popular tools for modeling dependence structures between twin pairs. However, for eQTL (expression quantitative trait loci) data where associations between thousands of transcripts and millions of single nucleotide polymorphisms need to be tested, mixed effects models are computationally inefficient and often impractical. In this paper, we propose a fast eQTL analysis approach for twin eQTL data where we randomly split twin pairs into two groups, so that within each group the samples are unrelated, and we then apply a multiple linear regression analysis separately to each group. A score statistic that automatically adjusts for the (hidden) correlation between the two groups is constructed for combining the results from the two groups. The proposed method has well-controlled type I error. Compared to mixed effects models, the proposed method has similar power but drastically improved computational efficiency. We demonstrate the computational advantage of the proposed method via extensive simulations. The proposed method is also applied to a large twin eQTL data set from the Netherlands Twin Register.
Affiliation(s)
- Zhaoyu Yin
- Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina, United States of America
- Kai Xia
- Department of Psychiatry, University of North Carolina, Chapel Hill, North Carolina, United States of America
- Wonil Chung
- School of Public Health, Harvard, Boston, Massachusetts, United States of America
- Patrick F Sullivan
- Department of Genetics, University of North Carolina, Chapel Hill, North Carolina, United States of America
- Fei Zou
- Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina, United States of America
|
14
|
Lipka AE, Kandianis CB, Hudson ME, Yu J, Drnevich J, Bradbury PJ, Gore MA. From association to prediction: statistical methods for the dissection and selection of complex traits in plants. Current Opinion in Plant Biology 2015; 24:110-8. [PMID: 25795170 DOI: 10.1016/j.pbi.2015.02.010] [Citation(s) in RCA: 87] [Impact Index Per Article: 8.7] [Received: 11/15/2014] [Revised: 02/24/2015] [Accepted: 02/27/2015] [Indexed: 05/02/2023]
Abstract
Quantification of genotype-to-phenotype associations is central to many scientific investigations, yet the ability to obtain consistent results may be thwarted without appropriate statistical analyses. Models for association can consider confounding effects in the materials and complex genetic interactions. Selecting optimal models enables accurate evaluation of associations between marker loci and numerous phenotypes including gene expression. Significant improvements in QTL discovery via association mapping and acceleration of breeding cycles through genomic selection are two successful applications of models using genome-wide markers. Given recent advances in genotyping and phenotyping technologies, further refinement of these approaches is needed to model genetic architecture more accurately and run analyses in a computationally efficient manner, all while accounting for false positives and maximizing statistical power.
Affiliation(s)
- Alexander E Lipka
- University of Illinois, Department of Crop Sciences, Urbana, IL 61801, USA
- Catherine B Kandianis
- Michigan State University, Department of Biochemistry and Molecular Biology, East Lansing, MI 48824, USA; Cornell University, Plant Breeding and Genetics Section, School of Integrative Plant Science, Ithaca, NY 14853, USA
- Matthew E Hudson
- University of Illinois, Department of Crop Sciences, Urbana, IL 61801, USA
- Jianming Yu
- Iowa State University, Department of Agronomy, Ames, IA 50011, USA
- Jenny Drnevich
- University of Illinois, High Performance Biological Computing Group and the Carver Biotechnology Center, Urbana, IL 61801, USA
- Peter J Bradbury
- United States Department of Agriculture (USDA) - Agricultural Research Service (ARS), Robert W. Holley Center for Agriculture and Health, Ithaca, NY 14853, USA
- Michael A Gore
- Cornell University, Plant Breeding and Genetics Section, School of Integrative Plant Science, Ithaca, NY 14853, USA
|
15
|
Ajjuri RR, Hall M, Reiter LT, O’Donnell JM. Drosophila. Mov Disord 2015. [DOI: 10.1016/b978-0-12-405195-9.00005-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022] Open
|
16
|
Das SK, Sharma NK. Expression quantitative trait analyses to identify causal genetic variants for type 2 diabetes susceptibility. World J Diabetes 2014; 5:97-114. [PMID: 24748924 PMCID: PMC3990322 DOI: 10.4239/wjd.v5.i2.97] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 11/27/2013] [Revised: 02/21/2014] [Accepted: 03/14/2014] [Indexed: 02/05/2023] Open
Abstract
Type 2 diabetes (T2D) is a common metabolic disorder which is caused by multiple genetic perturbations affecting different biological pathways. Identifying genetic factors modulating the susceptibility of this complex heterogeneous metabolic phenotype in different ethnic and racial groups remains challenging. Despite recent success, the functional role of the T2D susceptibility variants implicated by genome-wide association studies (GWAS) remains largely unknown. Genetic dissection of transcript abundance or expression quantitative trait (eQTL) analysis unravels the genomic architecture of regulatory variants. Availability of eQTL information from tissues relevant for glucose homeostasis in humans opens a new avenue to prioritize GWAS-implicated variants that may be involved in triggering a causal chain of events leading to T2D. In this article, we review the progress made in the field of eQTL research and knowledge gained from those studies in understanding transcription regulatory mechanisms in human subjects. We highlight several novel approaches that can integrate eQTL analysis with multiple layers of biological information to identify ethnic-specific causal variants and gene-environment interactions relevant to T2D pathogenesis. Finally, we discuss how the eQTL analysis mediated search for “missing heritability” may lead us to novel biological and molecular mechanisms involved in susceptibility to T2D.
|