1
Hayat U, Ke C, Wang L, Zhu G, Fang W, Wang X, Chen C, Li Y, Wu J. Using Quantitative Trait Locus Mapping and Genomic Resources to Improve Breeding Precision in Peaches: Current Insights and Future Prospects. PLANTS (BASEL, SWITZERLAND) 2025; 14:175. [PMID: 39861529 PMCID: PMC11768884 DOI: 10.3390/plants14020175] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2024] [Revised: 12/23/2024] [Accepted: 01/06/2025] [Indexed: 01/27/2025]
Abstract
Modern breeding technologies and the development of quantitative trait locus (QTL) mapping have brought about a new era in peach breeding. This study examines the complex genetic structure that underlies the morphology of peach fruits, paying special attention to the interaction between genome editing, genomic selection, and marker-assisted selection. Breeders now have access to precise tools that enhance crop resilience, productivity, and quality, facilitated by QTL mapping, which has significantly advanced our understanding of the genetic determinants underlying essential traits such as fruit shape, size, and firmness. New technologies like CRISPR/Cas9 and genomic selection enable the development of cultivars that can withstand climate change and satisfy consumer demands with unprecedented precision in trait modification. Genotype-environment interactions remain a critical challenge for modern breeding efforts, which can be addressed through high-throughput phenotyping and multi-environment trials. This work shows how combining genome-wide association studies and machine learning can improve the synthesis of multi-omics data and result in faster breeding cycles while preserving genetic diversity. This study outlines a roadmap that prioritizes the development of superior cultivars utilizing cutting-edge methods and technologies in order to address evolving agricultural and environmental challenges.
Affiliation(s)
- Umar Hayat
- The Key Laboratory of the Gene Resources Evaluation and Utilization of Horticultural Crop [Fruit Tree], Ministry of Agriculture, Zhengzhou Fruit Research Institute, Chinese Academy of Agricultural Sciences, Zhengzhou 450009, China
- Zhongyuan Research Center, Chinese Academy of Agricultural Sciences, Xinxiang 453003, China
| | - Cao Ke
- The Key Laboratory of the Gene Resources Evaluation and Utilization of Horticultural Crop [Fruit Tree], Ministry of Agriculture, Zhengzhou Fruit Research Institute, Chinese Academy of Agricultural Sciences, Zhengzhou 450009, China
- Zhongyuan Research Center, Chinese Academy of Agricultural Sciences, Xinxiang 453003, China
| | - Lirong Wang
- The Key Laboratory of the Gene Resources Evaluation and Utilization of Horticultural Crop [Fruit Tree], Ministry of Agriculture, Zhengzhou Fruit Research Institute, Chinese Academy of Agricultural Sciences, Zhengzhou 450009, China
| | - Gengrui Zhu
- The Key Laboratory of the Gene Resources Evaluation and Utilization of Horticultural Crop [Fruit Tree], Ministry of Agriculture, Zhengzhou Fruit Research Institute, Chinese Academy of Agricultural Sciences, Zhengzhou 450009, China
| | - Weichao Fang
- The Key Laboratory of the Gene Resources Evaluation and Utilization of Horticultural Crop [Fruit Tree], Ministry of Agriculture, Zhengzhou Fruit Research Institute, Chinese Academy of Agricultural Sciences, Zhengzhou 450009, China
| | - Xinwei Wang
- The Key Laboratory of the Gene Resources Evaluation and Utilization of Horticultural Crop [Fruit Tree], Ministry of Agriculture, Zhengzhou Fruit Research Institute, Chinese Academy of Agricultural Sciences, Zhengzhou 450009, China
| | - Changwen Chen
- The Key Laboratory of the Gene Resources Evaluation and Utilization of Horticultural Crop [Fruit Tree], Ministry of Agriculture, Zhengzhou Fruit Research Institute, Chinese Academy of Agricultural Sciences, Zhengzhou 450009, China
| | - Yong Li
- The Key Laboratory of the Gene Resources Evaluation and Utilization of Horticultural Crop [Fruit Tree], Ministry of Agriculture, Zhengzhou Fruit Research Institute, Chinese Academy of Agricultural Sciences, Zhengzhou 450009, China
| | - Jinlong Wu
- The Key Laboratory of the Gene Resources Evaluation and Utilization of Horticultural Crop [Fruit Tree], Ministry of Agriculture, Zhengzhou Fruit Research Institute, Chinese Academy of Agricultural Sciences, Zhengzhou 450009, China
2
Jasmine F, Almazan A, Khamkevych Y, Argos M, Shahriar M, Islam T, Shea CR, Ahsan H, Kibriya MG. Gene-Environment Interaction: Small Deletions (DELs) and Transcriptomic Profiles in Non-Melanoma Skin Cancer (NMSC) and Potential Implications for Therapy. Cells 2025; 14:95. [PMID: 39851523 PMCID: PMC11764317 DOI: 10.3390/cells14020095] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2024] [Revised: 01/09/2025] [Accepted: 01/09/2025] [Indexed: 01/26/2025]
Abstract
Arsenic (As) is a risk factor for non-melanoma skin cancer (NMSC). From a six-year follow-up study on 7000 adults exposed to As, we reported the associations of single-nucleotide variation in tumor tissue and gene expression. Here, we identify the associations of small deletions (DELs) and transcriptomic profiles in NMSC. Comparing the (a) NMSC tissue (n = 32) and corresponding blood samples from each patient, and (b) an independent set of non-lesional, healthy skin (n = 16) and paired blood, we identified NMSC-associated DELs. Differential expressions of certain gene pathways (TGF-β signaling pathway, IL-17 pathway, PD-L1 pathway, etc.) showed significant interactions with these somatic DELs and As exposure. In low-As-exposure cases, the DELs in APC were associated with the up-regulation of inflamed T-Cell-associated genes by a fold change (FC) of 8.9 (95% CI 4.5-17.6), compared to 5.7 (95% CI 2.9-10.8) without APC DELs; in high-As-exposure cases, the APC DELs were associated with an FC of 5.8 (95% CI 3.5-9.8) compared to 1.2 (95% CI -1.3 to 1.8) without APC DELs. We report, for the first time, the significant associations of somatic DELs (many in STR regions) in NMSC tissue and As exposure with many dysregulated gene pathways. These findings may help in selecting groups of patients for potential targeted therapy like PD-L1 inhibitors, IL-17 inhibitors, and TGF-β inhibitors in the future.
Affiliation(s)
- Farzana Jasmine
- Institute for Population and Precision Health (IPPH), University of Chicago, Chicago, IL 60637, USA; (F.J.); (A.A.); (Y.K.); (M.S.); (H.A.)
| | - Armando Almazan
- Institute for Population and Precision Health (IPPH), University of Chicago, Chicago, IL 60637, USA; (F.J.); (A.A.); (Y.K.); (M.S.); (H.A.)
| | - Yuliia Khamkevych
- Institute for Population and Precision Health (IPPH), University of Chicago, Chicago, IL 60637, USA; (F.J.); (A.A.); (Y.K.); (M.S.); (H.A.)
| | - Maria Argos
- Department of Environmental Health, School of Public Health, Boston University, Boston, MA 02118, USA;
| | - Mohammad Shahriar
- Institute for Population and Precision Health (IPPH), University of Chicago, Chicago, IL 60637, USA; (F.J.); (A.A.); (Y.K.); (M.S.); (H.A.)
| | - Tariqul Islam
- UChicago Research Bangladesh (URB), University of Chicago, Dhaka 1230, Bangladesh;
| | - Christopher R. Shea
- Division of Dermatology, Department of Medicine, University of Chicago, Chicago, IL 60637, USA;
| | - Habibul Ahsan
- Institute for Population and Precision Health (IPPH), University of Chicago, Chicago, IL 60637, USA; (F.J.); (A.A.); (Y.K.); (M.S.); (H.A.)
- Department of Public Health Sciences, Biological Sciences Division, University of Chicago, Chicago, IL 60637, USA
| | - Muhammad G. Kibriya
- Institute for Population and Precision Health (IPPH), University of Chicago, Chicago, IL 60637, USA; (F.J.); (A.A.); (Y.K.); (M.S.); (H.A.)
- Department of Public Health Sciences, Biological Sciences Division, University of Chicago, Chicago, IL 60637, USA
3
Chen L, Wang X, Xie N, Zhang Z, Xu X, Xue M, Yang Y, Liu L, Su L, Bjaanæs M, Karlsson A, Planck M, Staaf J, Helland Å, Esteller M, Christiani DC, Chen F, Zhang R. A two-phase epigenome-wide four-way gene-smoking interaction study of overall survival for early-stage non-small cell lung cancer. Mol Oncol 2025; 19:173-187. [PMID: 39630602 PMCID: PMC11705728 DOI: 10.1002/1878-0261.13766] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2024] [Revised: 10/05/2024] [Accepted: 11/07/2024] [Indexed: 12/07/2024]
Abstract
High-order interactions associated with non-small cell lung cancer (NSCLC) survival may elucidate underlying molecular mechanisms and identify potential therapeutic targets. Our previous work identified a three-way interaction among pack-years of smoking (the number of packs of cigarettes smoked per day multiplied by the number of years the person has smoked) and two DNA methylation probes, cg05293407 (TRIM27) and cg00060500 (KIAA0226). However, whether a four-way interaction exists remains unclear. Therefore, we adopted a two-phase design to identify four-way gene-smoking interactions by a hill-climbing strategy on the basis of the previously detected three-way interaction. One CpG probe, cg16658473 (SHISA9), was identified with FDR-q ≤ 0.05 in the discovery phase and P ≤ 0.05 in the validation phase. Meanwhile, the four-way interaction improved the discrimination ability of the prognostic prediction model, as indicated by the area under the receiver operating characteristic curve (AUC) for both 3- and 5-year survival. In summary, we identified a four-way interaction associated with NSCLC survival among pack-years of smoking, cg05293407 (TRIM27), cg00060500 (KIAA0226) and cg16658473 (SHISA9), providing novel insights into the complex mechanisms underlying NSCLC progression.
Affiliation(s)
- Leyi Chen
- Department of Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, China
| | - Xiang Wang
- Department of Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, China
| | - Ning Xie
- Department of Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, China
| | - Zhongwen Zhang
- Department of Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, China
| | - Xiaowen Xu
- Department of Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, China
| | - Maojie Xue
- Department of Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, China
- Department of Health Inspection and Quarantine, Center for Global Health, School of Public Health, Nanjing Medical University, China
| | - Yuqing Yang
- Department of Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, China
| | - Liya Liu
- School of Public Health, Health Science Center, Ningbo University, China
| | - Li Su
- Department of Environmental Health, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Pulmonary and Critical Care Division, Department of Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| | - Maria Bjaanæs
- Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital, Norway
| | - Anna Karlsson
- Division of Oncology, Department of Clinical Sciences Lund and CREATE Health Strategic Center for Translational Cancer Research, Lund University, Sweden
| | - Maria Planck
- Division of Oncology, Department of Clinical Sciences Lund and CREATE Health Strategic Center for Translational Cancer Research, Lund University, Sweden
| | - Johan Staaf
- Division of Oncology, Department of Clinical Sciences Lund and CREATE Health Strategic Center for Translational Cancer Research, Lund University, Sweden
| | - Åslaug Helland
- Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital, Norway
- Institute of Clinical Medicine, University of Oslo, Norway
| | - Manel Esteller
- Josep Carreras Leukaemia Research Institute, Barcelona, Spain
- Centro de Investigacion Biomedica en Red Cancer, Madrid, Spain
- Institucio Catalana de Recerca i Estudis Avançats, Barcelona, Spain
- Physiological Sciences Department, School of Medicine and Health Sciences, University of Barcelona, Spain
| | - David C. Christiani
- Department of Environmental Health, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Pulmonary and Critical Care Division, Department of Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| | - Feng Chen
- Department of Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, China
| | - Ruyang Zhang
- Department of Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, China
- China International Cooperation Center for Environment and Human Health, Nanjing Medical University, China
- Changzhou Medical Center, Nanjing Medical University, Changzhou, China
- Information Center, The Affiliated Changzhou Second People's Hospital of Nanjing Medical University, Changzhou, China
4
Liu Y, Ren J, Ma S, Wu C. The spike-and-slab quantile LASSO for robust variable selection in cancer genomics studies. Stat Med 2024; 43:4928-4983. [PMID: 39260448 PMCID: PMC11585335 DOI: 10.1002/sim.10196] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Revised: 05/28/2024] [Accepted: 07/31/2024] [Indexed: 09/13/2024]
Abstract
Data irregularity in cancer genomics studies has been widely observed in the form of outliers and heavy-tailed distributions in the complex traits. In the past decade, robust variable selection methods have emerged as powerful alternatives to the nonrobust ones to identify important genes associated with heterogeneous disease traits and build superior predictive models. In this study, to keep the remarkable features of the quantile LASSO and fully Bayesian regularized quantile regression while overcoming their disadvantage in the analysis of high-dimensional genomics data, we propose the spike-and-slab quantile LASSO through a fully Bayesian spike-and-slab formulation under the robust likelihood by adopting the asymmetric Laplace distribution (ALD). The proposed robust method has inherited the prominent properties of selective shrinkage and self-adaptivity to the sparsity pattern from the spike-and-slab LASSO (Ročková and George, J Am Stat Assoc, 2018, 113(521): 431-444). Furthermore, the spike-and-slab quantile LASSO has a computational advantage to locate the posterior modes via soft-thresholding rule guided Expectation-Maximization (EM) steps in the coordinate descent framework, a phenomenon rarely observed for robust regularization with nondifferentiable loss functions. We have conducted comprehensive simulation studies with a variety of heavy-tailed errors in both homogeneous and heterogeneous model settings to demonstrate the superiority of the spike-and-slab quantile LASSO over its competing methods. The advantage of the proposed method has been further demonstrated in case studies of the lung adenocarcinomas (LUAD) and skin cutaneous melanoma (SKCM) data from The Cancer Genome Atlas (TCGA).
Affiliation(s)
- Yuwen Liu
- Department of Statistics, Kansas State University, Manhattan, KS
| | - Jie Ren
- Department of Biostatistics and Health Data Sciences, Indiana University School of Medicine, Indianapolis, IN
| | - Shuangge Ma
- Department of Biostatistics, Yale University, New Haven, CT
| | - Cen Wu
- Department of Statistics, Kansas State University, Manhattan, KS
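The abstract of this entry refers to a robust likelihood based on the asymmetric Laplace distribution, whose kernel is the quantile "check" loss. The sketch below only illustrates why that loss resists outliers; it is not the authors' spike-and-slab method. The function names, the toy data, and the grid search are all assumptions made for illustration.

```python
def check_loss(residual, tau):
    """Quantile check loss: rho_tau(u) = u * (tau - 1[u < 0])."""
    indicator = 1.0 if residual < 0 else 0.0
    return residual * (tau - indicator)


def total_check_loss(values, center, tau):
    """Sum of check losses of the residuals (value - center)."""
    return sum(check_loss(v - center, tau) for v in values)


def best_center(values, tau, candidates):
    """Grid search (illustrative only) for the candidate minimizing the loss."""
    return min(candidates, key=lambda c: total_check_loss(values, c, tau))


# Hypothetical toy data with one gross outlier.
data = [1.0, 2.0, 3.0, 4.0, 100.0]
candidates = [x / 10 for x in range(0, 1001)]

# At tau = 0.5 the check loss is half the absolute error, so the minimizer
# is the sample median rather than the outlier-sensitive mean.
median_like = best_center(data, 0.5, candidates)
mean_value = sum(data) / len(data)
```

With the toy data above, the minimizer stays near the bulk of the observations while the mean is pulled toward the outlier, which is the kind of heavy-tailed behavior the abstract describes.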
5
Pośpiech E, Rudnicka J, Noroozi R, Pisarek-Pacek A, Wysocka B, Masny A, Boroń M, Migacz-Gruszka K, Pruszkowska-Przybylska P, Kobus M, Lisman D, Zielińska G, Cytacka S, Iljin A, Wiktorska JA, Michalczyk M, Kaczka P, Krzysztofik M, Sitek A, Spólnicka M, Ossowski A, Branicki W. DNA methylation at AHRR as a master predictor of smoke exposure and a biomarker for sleep and exercise. Clin Epigenetics 2024; 16:147. [PMID: 39425209 PMCID: PMC11490037 DOI: 10.1186/s13148-024-01757-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2024] [Accepted: 10/01/2024] [Indexed: 10/21/2024]
Abstract
BACKGROUND DNA methylation profiling may provide a more accurate measure of smoking status than self-report and may be useful in guiding clinical interventions and forensic investigations. In the current study, blood DNA methylation profiles of nearly 800 Polish individuals were assayed using Illumina EPIC, and the inference of smoking from epigenetic data was explored. In addition, we focused on the role of the AHRR gene as a top marker for smoking and investigated its responsiveness to other lifestyle behaviors. RESULTS We found > 450 significant CpGs associated with cigarette consumption and overrepresented in various biological functions including cell communication, response to stress, blood vessel development, cell death, and atherosclerosis. The model consisting of cg05575921 in AHRR (p = 4.5 × 10⁻³²) and three additional CpGs (cg09594361, cg21322436 in CNTNAP2 and cg09842685) was able to predict smoking status with a high accuracy of AUC = 0.8 in the test set. Importantly, a gradual increase in the probability of smoking was observed, starting from occasional smokers to regular heavy smokers. Furthermore, former smokers displayed intermediate DNA methylation profiles compared to current and never smokers, and thus our results indicate the potential reversibility of DNA methylation after smoking cessation. AHRR played a key role in the predictive analysis, explaining 21.5% of the variation in smoking. In addition, AHRR methylation was analyzed for association with other modifiable lifestyle factors, and showed significance for sleep and physical activity. We also showed that the epigenetic score for smoking was significantly correlated with most of the epigenetic clocks tested, except for two first-generation clocks. CONCLUSIONS Our study suggests that a more rapid return to never-smoker methylation levels after smoking cessation may be achievable in people who change their lifestyle in terms of physical activity and sleep duration. As cigarette smoking has been implicated in the literature as a leading cause of epigenetic aging and AHRR appears to be modifiable by multiple exogenous factors, it emerges as a promising target for intervention and investment.
Affiliation(s)
- Ewelina Pośpiech
- Department of Forensic Genetics, Pomeranian Medical University in Szczecin, Powstańców Wlkp. 72, 70-111, Szczecin, Poland.
| | - Joanna Rudnicka
- Doctoral School of Exact and Natural Sciences, Jagiellonian University, Krakow, Poland
| | - Rezvan Noroozi
- Doctoral School of Exact and Natural Sciences, Jagiellonian University, Krakow, Poland
- Johns Hopkins University School of Medicine, Baltimore, USA
| | - Aleksandra Pisarek-Pacek
- Malopolska Centre of Biotechnology, Jagiellonian University, Krakow, Poland
- Institute of Zoology and Biomedical Research of the Jagiellonian University, Krakow, Poland
| | - Bożena Wysocka
- Central Forensic Laboratory of the Police, Warsaw, Poland
| | | | - Michał Boroń
- Central Forensic Laboratory of the Police, Warsaw, Poland
| | | | | | - Magdalena Kobus
- Institute of Biological Sciences, Faculty of Biology and Environmental Sciences, Cardinal Stefan Wyszynski University in Warsaw, Warsaw, Poland
| | - Dagmara Lisman
- Department of Forensic Genetics, Pomeranian Medical University in Szczecin, Powstańców Wlkp. 72, 70-111, Szczecin, Poland
| | - Grażyna Zielińska
- Department of Forensic Genetics, Pomeranian Medical University in Szczecin, Powstańców Wlkp. 72, 70-111, Szczecin, Poland
| | - Sandra Cytacka
- Department of Forensic Genetics, Pomeranian Medical University in Szczecin, Powstańców Wlkp. 72, 70-111, Szczecin, Poland
| | - Aleksandra Iljin
- Department of Plastic, Reconstructive and Aesthetic Surgery, Medical University of Lodz, Lodz, Poland
| | | | - Małgorzata Michalczyk
- Department of Sport Nutrition, The Jerzy Kukuczka Academy of Physical Education in Katowice, Katowice, Poland
| | - Piotr Kaczka
- Department of Sport Nutrition, The Jerzy Kukuczka Academy of Physical Education in Katowice, Katowice, Poland
| | - Michał Krzysztofik
- Institute of Sports Sciences, The Jerzy Kukuczka Academy of Physical Education in Katowice, Katowice, Poland
| | - Aneta Sitek
- Department of Anthropology, Faculty of Biology and Environmental Protection, University of Lodz, Lodz, Poland
| | | | - Andrzej Ossowski
- Department of Forensic Genetics, Pomeranian Medical University in Szczecin, Powstańców Wlkp. 72, 70-111, Szczecin, Poland
| | - Wojciech Branicki
- Institute of Zoology and Biomedical Research of the Jagiellonian University, Krakow, Poland
- Institute of Forensic Research, Krakow, Poland
6
Fan K, Subedi S, Yang G, Lu X, Ren J, Wu C. Is Seeing Believing? A Practitioner's Perspective on High-Dimensional Statistical Inference in Cancer Genomics Studies. ENTROPY (BASEL, SWITZERLAND) 2024; 26:794. [PMID: 39330127 PMCID: PMC11430850 DOI: 10.3390/e26090794] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2024] [Revised: 08/23/2024] [Accepted: 09/06/2024] [Indexed: 09/28/2024]
Abstract
Variable selection methods have been extensively developed for and applied to cancer genomics data to identify important omics features associated with complex disease traits, including cancer outcomes. However, the reliability and reproducibility of the findings are in question if valid inferential procedures are not available to quantify the uncertainty of the findings. In this article, we provide a gentle but systematic review of high-dimensional frequentist and Bayesian inferential tools under sparse models which can yield uncertainty quantification measures, including confidence (or Bayesian credible) intervals, p values and false discovery rates (FDR). Connections in high-dimensional inferences between the two realms have been fully exploited under the "unpenalized loss function + penalty term" formulation for regularization methods and the "likelihood function × shrinkage prior" framework for regularized Bayesian analysis. In particular, we advocate for robust Bayesian variable selection in cancer genomics studies due to its ability to accommodate disease heterogeneity in the form of heavy-tailed errors and structured sparsity while providing valid statistical inference. The numerical results show that robust Bayesian analysis incorporating exact sparsity has yielded not only superior estimation and identification results but also valid Bayesian credible intervals under nominal coverage probabilities compared with alternative methods, especially in the presence of heavy-tailed model errors and outliers.
Affiliation(s)
- Kun Fan
- Department of Statistics, Kansas State University, Manhattan, KS 66506, USA
| | - Srijana Subedi
- Department of Statistics, Kansas State University, Manhattan, KS 66506, USA
| | - Gongshun Yang
- Department of Statistics, Kansas State University, Manhattan, KS 66506, USA
| | - Xi Lu
- Department of Pharmaceutical Health Outcomes and Policy, College of Pharmacy, University of Houston, Houston, TX 77204, USA
| | - Jie Ren
- Department of Biostatistics and Health Data Sciences, Indiana University School of Medicine, Indianapolis, IN 46202, USA
| | - Cen Wu
- Department of Statistics, Kansas State University, Manhattan, KS 66506, USA
7
Motsinger-Reif AA, Reif DM, Akhtari FS, House JS, Campbell CR, Messier KP, Fargo DC, Bowen TA, Nadadur SS, Schmitt CP, Pettibone KG, Balshaw DM, Lawler CP, Newton SA, Collman GW, Miller AK, Merrick BA, Cui Y, Anchang B, Harmon QE, McAllister KA, Woychik R. Gene-environment interactions within a precision environmental health framework. CELL GENOMICS 2024; 4:100591. [PMID: 38925123 PMCID: PMC11293590 DOI: 10.1016/j.xgen.2024.100591] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 11/30/2023] [Revised: 03/26/2024] [Accepted: 06/02/2024] [Indexed: 06/28/2024]
Abstract
Understanding the complex interplay of genetic and environmental factors in disease etiology and the role of gene-environment interactions (GEIs) across human development stages is important. We review the state of GEI research, including challenges in measuring environmental factors and advantages of GEI analysis in understanding disease mechanisms. We discuss the evolution of GEI studies from candidate gene-environment studies to genome-wide interaction studies (GWISs) and the role of multi-omics in mediating GEI effects. We review advancements in GEI analysis methods and the importance of large-scale datasets. We also address the translation of GEI findings into precision environmental health (PEH), showcasing real-world applications in healthcare and disease prevention. Additionally, we highlight societal considerations in GEI research, including environmental justice, the return of results to participants, and data privacy. Overall, we underscore the significance of GEI for disease prediction and prevention and advocate for integrating the exposome into PEH omics studies.
Affiliation(s)
- Alison A Motsinger-Reif
- Biostatistics and Computational Biology Branch, Division of Intramural Research, National Institute of Environmental Health Sciences, Durham, NC, USA.
| | - David M Reif
- Predictive Toxicology Branch, Division of Translational Toxicology, National Institute of Environmental Health Sciences, Durham, NC, USA
| | - Farida S Akhtari
- Biostatistics and Computational Biology Branch, Division of Intramural Research, National Institute of Environmental Health Sciences, Durham, NC, USA
| | - John S House
- Biostatistics and Computational Biology Branch, Division of Intramural Research, National Institute of Environmental Health Sciences, Durham, NC, USA
| | - C Ryan Campbell
- Biostatistics and Computational Biology Branch, Division of Intramural Research, National Institute of Environmental Health Sciences, Durham, NC, USA
| | - Kyle P Messier
- Biostatistics and Computational Biology Branch, Division of Intramural Research, National Institute of Environmental Health Sciences, Durham, NC, USA; Predictive Toxicology Branch, Division of Translational Toxicology, National Institute of Environmental Health Sciences, Durham, NC, USA
| | - David C Fargo
- Office of the Director, National Institute of Environmental Health Sciences, Durham, NC, USA
| | - Tiffany A Bowen
- Office of the Director, National Institute of Environmental Health Sciences, Durham, NC, USA
| | - Srikanth S Nadadur
- Exposure, Response, and Technology Branch, Division of Extramural Research and Training, National Institute of Environmental Health Sciences, Durham, NC, USA
| | - Charles P Schmitt
- Office of the Scientific Director, Office of Data Science, National Institute of Environmental Health Sciences, Durham, NC, USA
| | - Kristianna G Pettibone
- Program Analysis Branch, Division of Extramural Research and Training, National Institute of Environmental Health Sciences, Durham, NC, USA
| | - David M Balshaw
- Office of the Director, National Institute of Environmental Health Sciences, Durham, NC, USA; Division of Extramural Research and Training, National Institute of Environmental Health Sciences, Durham, NC, USA
| | - Cindy P Lawler
- Genes, Environment, and Health Branch, Division of Extramural Research and Training, National Institute of Environmental Health Sciences, Durham, NC, USA
| | - Shelia A Newton
- Office of Scientific Coordination, Planning and Evaluation, National Institute of Environmental Health Sciences, Durham, NC, USA
| | - Gwen W Collman
- Office of the Director, National Institute of Environmental Health Sciences, Durham, NC, USA; Office of Scientific Coordination, Planning and Evaluation, National Institute of Environmental Health Sciences, Durham, NC, USA
| | - Aubrey K Miller
- Office of Scientific Coordination, Planning and Evaluation, National Institute of Environmental Health Sciences, Durham, NC, USA
| | - B Alex Merrick
- Mechanistic Toxicology Branch, Division of Translational Toxicology, National Institute of Environmental Health Sciences, Durham, NC, USA
| | - Yuxia Cui
- Exposure, Response, and Technology Branch, Division of Extramural Research and Training, National Institute of Environmental Health Sciences, Durham, NC, USA
| | - Benedict Anchang
- Biostatistics and Computational Biology Branch, Division of Intramural Research, National Institute of Environmental Health Sciences, Durham, NC, USA
| | - Quaker E Harmon
- Epidemiology Branch, Division of Intramural Research, National Institute of Environmental Health Sciences, Durham, NC, USA
| | - Kimberly A McAllister
- Genes, Environment, and Health Branch, Division of Extramural Research and Training, National Institute of Environmental Health Sciences, Durham, NC, USA
| | - Rick Woychik
- Office of the Director, National Institute of Environmental Health Sciences, Durham, NC, USA
8
Ma J, Li J, Chen Y, Yang Z, He Y. Poor statistical power in population-based association study of gene interaction. BMC Med Genomics 2024; 17:111. [PMID: 38678264 PMCID: PMC11055307 DOI: 10.1186/s12920-024-01884-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2023] [Accepted: 04/19/2024] [Indexed: 04/29/2024]
Abstract
BACKGROUND Statistical epistasis, or "gene-gene interaction" in genetic association studies, refers to the nonadditive effects between polymorphic sites on two different genes affecting the same phenotype. In genetic association analyses of complex traits, however, researchers have so far found little evidence of statistical epistasis. METHODS We developed a statistical model in which statistical epistasis was represented as an extra linkage disequilibrium between the polymorphic sites of different risk genes. The power of the statistical test for identifying gene-gene interaction was calculated and then compared across different hypothesis scenarios. RESULTS Our results show that statistical power increases with increasing interaction coefficient, relative risk, and linkage disequilibrium with genetic markers. However, the power of interaction discovery is much lower than that of a regular single-site association test. When rigorous criteria were employed in statistical tests, the identification of gene-gene interaction became a very difficult task. When the significance criterion was set to p-value ≤ 5.0 × 10⁻⁸, the same as in many genome-wide association studies, there was little chance of identifying gene-gene interaction under any of the circumstances considered. CONCLUSIONS The lack of epistasis tends to be an inevitable result of the statistical principles of the methods used in genetic association studies and is therefore an inherent characteristic of the research itself.
Affiliation(s)
- Jiarui Ma
- Shanghai Key Laboratory of Medical Epigenetics, International Co-Laboratory of Medical Epigenetics and Metabolism (Ministry of Science and Technology), Institutes of Biomedical Sciences, Fudan University, Shanghai, 200032, China
- Jian Li
- Shanghai Key Laboratory of Medical Epigenetics, International Co-Laboratory of Medical Epigenetics and Metabolism (Ministry of Science and Technology), Institutes of Biomedical Sciences, Fudan University, Shanghai, 200032, China
- Yuqi Chen
- Shanghai Key Laboratory of Medical Epigenetics, International Co-Laboratory of Medical Epigenetics and Metabolism (Ministry of Science and Technology), Institutes of Biomedical Sciences, Fudan University, Shanghai, 200032, China
- Zhen Yang
- Center for Medical Research and Innovation of Pudong Hospital, Intelligent Medicine Institute, Fudan University, Shanghai, 200032, China
- Yungang He
- Shanghai Fifth People's Hospital, Intelligent Medicine Institute, Fudan University, Shanghai, 200032, PR China.
9
|
Jin X, Shi G. Cauchy combination methods for the detection of gene-environment interactions for rare variants related to quantitative phenotypes. Heredity (Edinb) 2023; 131:241-252. [PMID: 37481617 PMCID: PMC10539363 DOI: 10.1038/s41437-023-00640-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2022] [Revised: 07/09/2023] [Accepted: 07/12/2023] [Indexed: 07/24/2023]
Abstract
The characterization of gene-environment interactions (GEIs) can provide detailed insights into the biological mechanisms underlying complex diseases. Despite recent interest in GEIs for rare variants, published GEI tests are underpowered for an extremely small proportion of causal rare variants in a gene or a region. By extending the aggregated Cauchy association test (ACAT), we propose three GEI tests to address this issue: a Cauchy combination GEI test with fixed main effects (CCGEI-F), a Cauchy combination GEI test with random main effects (CCGEI-R), and an omnibus Cauchy combination GEI test (CCGEI-O). ACAT was applied to combine p values of single-variant GEI analyses to obtain CCGEI-F and CCGEI-R and p values of multiple GEI tests were combined in CCGEI-O. Through numerical simulations, for small numbers of causal variants, CCGEI-F, CCGEI-R and CCGEI-O provided approximately 5% higher power than the existing GEI tests INT-FIX and INT-RAN; however, they had slightly higher power than the existing GEI test TOW-GE. For large numbers of causal variants, although CCGEI-F and CCGEI-R exhibited comparable or slightly lower power values than the competing tests, the results were still satisfactory. Among all simulation conditions evaluated, CCGEI-O provided significantly higher power than that of competing GEI tests. We further applied our GEI tests in genome-wide analyses of systolic blood pressure or diastolic blood pressure to detect gene-body mass index (BMI) interactions, using whole-exome sequencing data from UK Biobank. At a suggestive significance level of 1.0 × 10⁻⁴, KCNC4, GAR1, FAM120AOS and NT5C3B showed interactions with BMI by our GEI tests.
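The p-value combination step that ACAT-style tests rely on can be sketched as follows. This is a minimal illustration of the general Cauchy combination rule only, not the CCGEI-F, CCGEI-R, or CCGEI-O procedures themselves; the function name `cauchy_combine` and the default equal weights are assumptions made for the example.

```python
import math


def cauchy_combine(p_values, weights=None):
    """Combine p-values with the Cauchy combination rule (illustrative sketch)."""
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p < 1.0) for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")
    if weights is None:
        # Assumption: equal weights when none are supplied.
        weights = [1.0] * len(p_values)
    total = sum(weights)
    # Map each p-value to a standard Cauchy variate and take the weighted mean.
    stat = sum(
        (w / total) * math.tan((0.5 - p) * math.pi)
        for w, p in zip(weights, p_values)
    )
    # Upper-tail probability of the standard Cauchy distribution.
    return 0.5 - math.atan(stat) / math.pi


combined = cauchy_combine([0.01, 0.2, 0.8])
```

Because the tangent transform grows very quickly as a p-value approaches zero, the combined value is dominated by the smallest p-values, which is why this kind of rule suits settings with few causal variants. For extremely small p-values a production implementation would need a numerically safer approximation than the direct tangent used here.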
Affiliation(s)
- Xiaoqin Jin
- State Key Laboratory of Integrated Services Networks, Xidian University, 2 South Taibai Road, Xi'an, Shaanxi, 710071, China.
- Gang Shi
- State Key Laboratory of Integrated Services Networks, Xidian University, 2 South Taibai Road, Xi'an, Shaanxi, 710071, China
10
|
Zhang W, Huang Q, Kang Y, Li H, Tan G. Which Factors Influence Healthy Aging? A Lesson from the Longevity Village of Bama in China. Aging Dis 2023; 14:825-839. [PMID: 37191421 PMCID: PMC10187713 DOI: 10.14336/ad.2022.1108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2022] [Accepted: 11/08/2022] [Indexed: 11/18/2022]
Abstract
A growing aging population is associated with increasing incidences of aging-related diseases and socioeconomic burdens. Hence, research into healthy longevity and aging is urgently needed. Longevity is an important phenomenon in healthy aging. The present review summarizes the characteristics of longevity in the elderly population in Bama, China, where the proportion of centenarians is 5.7-fold greater than the international standard. We examined the impact of genetic and environmental factors on longevity from multiple perspectives. We proposed that the phenomenon of longevity in this region is of high value for future investigations in healthy aging and aging-related disease and may provide guidance for fostering the establishment and maintenance of a healthy aging society.
Affiliation(s)
- Wei Zhang
- Department of Human Anatomy, Institute of Neuroscience and Guangxi Key Laboratory of Brain Science, Guangxi Health Commission Key Laboratory of Basic Research on Brain Function and Disease, School of Basic Medical Sciences, Guangxi Medical University, Nanning, Guangxi, China.
- Key Laboratory of Longevity and Aging-related Diseases of Chinese Ministry of Education, Nanning, Guangxi, China.
- China-ASEAN Research Center for Innovation and Development in Brain Science, Nanning, Guangxi, China.
- Qingyun Huang
- Department of Human Anatomy, Institute of Neuroscience and Guangxi Key Laboratory of Brain Science, Guangxi Health Commission Key Laboratory of Basic Research on Brain Function and Disease, School of Basic Medical Sciences, Guangxi Medical University, Nanning, Guangxi, China.
- Key Laboratory of Longevity and Aging-related Diseases of Chinese Ministry of Education, Nanning, Guangxi, China.
- China-ASEAN Research Center for Innovation and Development in Brain Science, Nanning, Guangxi, China.
- Yongxin Kang
- Department of Human Anatomy, Institute of Neuroscience and Guangxi Key Laboratory of Brain Science, Guangxi Health Commission Key Laboratory of Basic Research on Brain Function and Disease, School of Basic Medical Sciences, Guangxi Medical University, Nanning, Guangxi, China.
- Collaborative Innovation Centre of Regenerative Medicine and Medical BioResource Development and Application Co-constructed by the Province and Ministry, Guangxi Key Laboratory of Regenerative Medicine, Nanning, Guangxi, China.
- China-ASEAN Research Center for Innovation and Development in Brain Science, Nanning, Guangxi, China.
- Hao Li
- Department of Human Anatomy, Institute of Neuroscience and Guangxi Key Laboratory of Brain Science, Guangxi Health Commission Key Laboratory of Basic Research on Brain Function and Disease, School of Basic Medical Sciences, Guangxi Medical University, Nanning, Guangxi, China.
- Collaborative Innovation Centre of Regenerative Medicine and Medical BioResource Development and Application Co-constructed by the Province and Ministry, Guangxi Key Laboratory of Regenerative Medicine, Nanning, Guangxi, China.
- China-ASEAN Research Center for Innovation and Development in Brain Science, Nanning, Guangxi, China.
- Guohe Tan
- Department of Human Anatomy, Institute of Neuroscience and Guangxi Key Laboratory of Brain Science, Guangxi Health Commission Key Laboratory of Basic Research on Brain Function and Disease, School of Basic Medical Sciences, Guangxi Medical University, Nanning, Guangxi, China.
- Key Laboratory of Longevity and Aging-related Diseases of Chinese Ministry of Education, Nanning, Guangxi, China.
- Collaborative Innovation Centre of Regenerative Medicine and Medical BioResource Development and Application Co-constructed by the Province and Ministry, Guangxi Key Laboratory of Regenerative Medicine, Nanning, Guangxi, China.
- China-ASEAN Research Center for Innovation and Development in Brain Science, Nanning, Guangxi, China.
11
|
Ren J, Zhou F, Li X, Ma S, Jiang Y, Wu C. Robust Bayesian variable selection for gene-environment interactions. Biometrics 2023; 79:684-694. [PMID: 35394058 PMCID: PMC11086965 DOI: 10.1111/biom.13670] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/15/2020] [Revised: 03/23/2022] [Accepted: 03/28/2022] [Indexed: 11/30/2022]
Abstract
Gene-environment (G×E) interactions have important implications for elucidating the etiology of complex diseases beyond the main genetic and environmental effects. Outliers and data contamination in disease phenotypes of G×E studies have been commonly encountered, leading to the development of a broad spectrum of robust regularization methods. Nevertheless, within the Bayesian framework, the issue has not been addressed in existing studies. We develop a fully Bayesian robust variable selection method for G×E interaction studies. The proposed Bayesian method can effectively accommodate heavy-tailed errors and outliers in the response variable while conducting variable selection by accounting for structural sparsity. In particular, for the robust sparse group selection, the spike-and-slab priors have been imposed on both individual and group levels to identify important main and interaction effects robustly. An efficient Gibbs sampler has been developed to facilitate fast computation. Extensive simulation studies, analysis of diabetes data with single-nucleotide polymorphism measurements from the Nurses' Health Study, and The Cancer Genome Atlas melanoma data with gene expression measurements demonstrate the superior performance of the proposed method over multiple competing alternatives.
Affiliation(s)
- Jie Ren
- Department of Biostatistics and Health Data Science, Indiana University School of Medicine, Indianapolis, Indiana, USA
- Fei Zhou
- Department of Statistics, Kansas State University, Manhattan, Kansas, USA
- Xiaoxi Li
- Department of Statistics, Kansas State University, Manhattan, Kansas, USA
- Shuangge Ma
- Department of Biostatistics, Yale University, New Haven, Connecticut, USA
- Yu Jiang
- Division of Epidemiology, Biostatistics and Environmental Health, School of Public Health, University of Memphis, Memphis, Tennessee, USA
- Cen Wu
- Department of Statistics, Kansas State University, Manhattan, Kansas, USA
12
|
Lavezzi AM, Ramos-Molina B. Environmental Exposure Science and Human Health. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2023; 20:ijerph20105764. [PMID: 37239493 DOI: 10.3390/ijerph20105764] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2023] [Accepted: 05/05/2023] [Indexed: 05/28/2023]
Abstract
Human health and environmental exposure form an inseparable binomial [...].
Affiliation(s)
- Anna M Lavezzi
- "Lino Rossi" Research Center for the Study and Prevention of Unexpected Perinatal Death and SIDS, Department of Biomedical, Surgical and Dental Sciences, University of Milan, 20122 Milan, Italy
- Bruno Ramos-Molina
- Obesity and Metabolism Laboratory, Biomedical Research Institute of Murcia (IMIB), 30120 Murcia, Spain
13
|
Zhou F, Liu Y, Ren J, Wang W, Wu C. Springer: An R package for bi-level variable selection of high-dimensional longitudinal data. Front Genet 2023; 14:1088223. [PMID: 37091810 PMCID: PMC10117642 DOI: 10.3389/fgene.2023.1088223] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2022] [Accepted: 02/28/2023] [Indexed: 04/09/2023]
Abstract
In high-dimensional data analysis, the bi-level (or the sparse group) variable selection can simultaneously conduct penalization on the group level and within groups, which has been developed for continuous, binary, and survival responses in the literature. Zhou et al. (2022) (PMID: 35766061) has further extended it under the longitudinal response by proposing a quadratic inference function-based penalization method in gene-environment interaction studies. This study introduces "springer," an R package implementing the bi-level variable selection within the QIF framework developed in Zhou et al. (2022). In addition, R package "springer" has also implemented the generalized estimating equation-based sparse group penalization method. Alternative methods focusing only on the group level or individual level have also been provided by the package. In this study, we have systematically introduced the longitudinal penalization methods implemented in the "springer" package. We demonstrate the usage of the core and supporting functions, which is followed by the numerical examples and discussions. R package "springer" is available at https://cran.r-project.org/package=springer.
Affiliation(s)
- Fei Zhou
- Department of Statistics, Kansas State University, Manhattan, KS, United States
- Yuwen Liu
- Department of Statistics, Kansas State University, Manhattan, KS, United States
- Jie Ren
- Department of Biostatistics and Health Data Sciences, Indiana University School of Medicine, Indianapolis, IN, United States
- Weiqun Wang
- Department of Food, Nutrition, Dietetics and Health, Kansas State University, Manhattan, KS, United States
- Cen Wu
- Department of Statistics, Kansas State University, Manhattan, KS, United States
14
|
Wu S, Xu Y, Zhang Q, Ma S. Gene-environment interaction analysis via deep learning. Genet Epidemiol 2023; 47:261-286. [PMID: 36807383 PMCID: PMC10244912 DOI: 10.1002/gepi.22518] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2022] [Revised: 12/17/2022] [Accepted: 02/01/2023] [Indexed: 02/21/2023]
Abstract
Gene-environment (G-E) interaction analysis plays an important role in studying complex diseases. Extensive methodological research has been conducted on G-E interaction analysis, and the existing methods are mostly based on regression techniques. In many fields including biomedicine and omics, it has been increasingly recognized that deep learning may outperform regression with its unique flexibility (e.g., in accommodating unspecified nonlinear effects) and superior prediction performance. However, there has been a lack of development in deep learning for G-E interaction analysis. In this article, we fill this important knowledge gap and develop a new analysis approach based on deep neural network in conjunction with penalization. The proposed approach can simultaneously conduct model estimation and selection (of important main G effects and G-E interactions), while uniquely respecting the "main effects, interactions" variable selection hierarchy. Simulation shows that it has superior prediction and feature selection performance. The analysis of data on lung adenocarcinoma and skin cutaneous melanoma overall survival further establishes its practical utility. Overall, this study can advance G-E interaction analysis by delivering a powerful new analysis approach based on modern deep learning.
Affiliation(s)
- Shuni Wu
- The Wang Yanan Institute for Studies in Economics, Xiamen University, Xiamen, China
- Yaqing Xu
- Department of Epidemiology and Biostatistics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Qingzhao Zhang
- The Wang Yanan Institute for Studies in Economics, Xiamen University, Xiamen, China
- Department of Statistics and Data Science, School of Economics and Fujian Key Lab of Statistics, Xiamen University, Xiamen, China
- Shuangge Ma
- Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, USA
15
|
Polygenic Risk of Hypertriglyceridemia Is Modified by BMI. Int J Mol Sci 2022; 23:ijms23179837. [PMID: 36077235 PMCID: PMC9456481 DOI: 10.3390/ijms23179837] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/29/2022] [Revised: 08/23/2022] [Accepted: 08/24/2022] [Indexed: 12/03/2022]
Abstract
Background: Genetic risk scores (GRSs) have partially improved the understanding of the etiology of moderate hypertriglyceridemia (HTG), which until recently was mainly assessed by secondary predisposing causes. The main objective of this study was to assess whether this variability is due to the interaction between clinical variables and GRS. Methods: We analyzed 276 patients with suspected polygenic HTG. An unweighted GRS was developed with the following variants: c.724C > G (ZPR1 gene), c.56C > G (APOA5 gene), c.1337T > C (GCKR gene), g.19986711A > G (LPL gene), c.107 + 1647T > C (BAZ1B gene) and g.125478730A > T (TRIB gene). Interactions between the GRS and clinical variables (body mass index (BMI), diabetes mellitus, diet, physical activity, alcohol consumption, age and gender) were evaluated. Results: The GRS was associated with triglyceride (TG) concentrations. There was a significant interaction between BMI and GRS, with the intensity of the relationship between the number of alleles and the TG concentration being greater in individuals with a higher BMI. Conclusions: GRS is associated with plasma TG concentrations and is markedly influenced by BMI. This finding could improve the stratification of patients with a high genetic risk for HTG who could benefit from more intensive healthcare interventions.
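As a rough illustration of what an unweighted GRS and a GRS-by-BMI interaction term look like in practice, consider the sketch below. The abstract does not state how risk alleles were coded or which regression model was fitted, so the 0/1/2 allele coding, the function names, the design-row layout, and the example genotype and BMI value are all hypothetical.

```python
def unweighted_grs(risk_allele_counts):
    """Sum per-variant risk-allele counts (0, 1, or 2) into an unweighted score."""
    for variant, count in risk_allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError(f"{variant}: expected 0, 1, or 2 risk alleles, got {count}")
    return sum(risk_allele_counts.values())


def interaction_row(grs, bmi):
    """Build one design-matrix row: [intercept, GRS, BMI, GRS x BMI]."""
    return [1.0, float(grs), float(bmi), float(grs) * float(bmi)]


# Hypothetical patient: keys follow the gene labels in the abstract,
# the counts and the BMI value are made up for illustration.
patient = {"ZPR1": 1, "APOA5": 0, "GCKR": 2, "LPL": 2, "BAZ1B": 1, "TRIB": 0}
score = unweighted_grs(patient)
row = interaction_row(score, bmi=31.0)
```

In a linear model fitted on rows of this shape, a nonzero coefficient on the last column would mean that the slope of triglycerides on the GRS changes with BMI, which is the kind of effect modification the abstract reports.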
16
|
Zhou F, Lu X, Ren J, Fan K, Ma S, Wu C. Sparse group variable selection for gene-environment interactions in the longitudinal study. Genet Epidemiol 2022; 46:317-340. [PMID: 35766061 DOI: 10.1002/gepi.22461] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2021] [Revised: 01/31/2022] [Accepted: 03/15/2022] [Indexed: 11/06/2022]
Abstract
Penalized variable selection for high-dimensional longitudinal data has received much attention as it can account for the correlation among repeated measurements while providing additional and essential information for improved identification and prediction performance. Despite the success, in longitudinal studies, the potential of penalization methods is far from fully understood for accommodating structured sparsity. In this article, we develop a sparse group penalization method to conduct the bi-level gene-environment (G×E) interaction study under the repeatedly measured phenotype. Within the quadratic inference function framework, the proposed method can achieve simultaneous identification of main and interaction effects on both the group and individual levels. Simulation studies have shown that the proposed method outperforms major competitors. In the case study of asthma data from the Childhood Asthma Management Program, we conduct G×E study by using high-dimensional single nucleotide polymorphism data as genetic factors and the longitudinal trait, forced expiratory volume in 1 s, as the phenotype. Our method leads to improved prediction and identification of main and interaction effects with important implications.
Affiliation(s)
- Fei Zhou
- Department of Statistics, Kansas State University, Manhattan, Kansas, 66506, USA
- Xi Lu
- Department of Statistics, Kansas State University, Manhattan, Kansas, 66506, USA
- Jie Ren
- Department of Biostatistics and Health Data Sciences, Indiana University School of Medicine, Indianapolis, Indiana, 46202, USA
- Kun Fan
- Department of Statistics, Kansas State University, Manhattan, Kansas, 66506, USA
- Shuangge Ma
- Department of Biostatistics, Yale University, New Haven, Connecticut, 06520, USA
- Cen Wu
- Department of Statistics, Kansas State University, Manhattan, Kansas, 66506, USA
17
|
Wang JH, Wang KH, Chen YH. Overlapping group screening for detection of gene-environment interactions with application to TCGA high-dimensional survival genomic data. BMC Bioinformatics 2022; 23:202. [PMID: 35637439 PMCID: PMC9150322 DOI: 10.1186/s12859-022-04750-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2022] [Accepted: 05/25/2022] [Indexed: 11/10/2022]
Abstract
BACKGROUND: In the context of biomedical and epidemiological research, gene-environment (G-E) interaction is of great significance to the etiology and progression of many complex diseases. In high-dimensional genetic data, two general models, marginal and joint models, are proposed to identify important interaction factors. Most existing approaches for identifying G-E interactions are limited owing to the lack of robustness to outliers/contamination in response and predictor data. In particular, right-censored survival outcomes make the associated feature screening even more challenging. In this article, we utilize the overlapping group screening (OGS) approach to select important G-E interactions related to clinical survival outcomes by incorporating the gene pathway information under a joint modeling framework. RESULTS: Simulation studies under various scenarios are carried out to compare the performance of our proposed method with some commonly used methods. In the real data applications, we use our proposed method to identify G-E interactions related to the clinical survival outcomes of patients with head and neck squamous cell carcinoma, and esophageal carcinoma in The Cancer Genome Atlas clinical survival genetic data, and further establish corresponding survival prediction models. Both simulation and real data studies show that our method performs well and outperforms existing methods in the G-E interaction selection, effect estimation, and survival prediction accuracy. CONCLUSIONS: The OGS approach is useful for selecting important environmental factors, genes and G-E interactions in the ultra-high dimensional feature space. The prediction ability of OGS with the Lasso penalty is better than existing methods. The same idea of the OGS approach can apply to other outcome models, such as the proportional odds survival time model, the logistic regression model for binary outcomes, and the multinomial logistic regression model for multi-class outcomes.
Affiliation(s)
- Jie-Huei Wang
- Department of Statistics, Feng Chia University, Seatwen, Taichung, 40724, Taiwan.
- Kang-Hsin Wang
- Department of Statistics, Feng Chia University, Seatwen, Taichung, 40724, Taiwan
- Yi-Hau Chen
- Institute of Statistical Science, Academia Sinica, Nankang, Taipei, 11529, Taiwan
18
|
Pośpiech E, Karłowska-Pik J, Kukla-Bartoszek M, Woźniak A, Boroń M, Zubańska M, Jarosz A, Bronikowska A, Grzybowski T, Płoski R, Spólnicka M, Branicki W. Overlapping association signals in the genetics of hair-related phenotypes in humans and their relevance to predictive DNA analysis. Forensic Sci Int Genet 2022; 59:102693. [DOI: 10.1016/j.fsigen.2022.102693] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/08/2021] [Revised: 02/25/2022] [Accepted: 03/22/2022] [Indexed: 01/02/2023]
19
|
Graham DP, Harding MJ, Nielsen DA. Pharmacogenetics of Addiction Therapy. Methods Mol Biol 2022; 2547:437-490. [PMID: 36068473 DOI: 10.1007/978-1-0716-2573-6_16] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Drug addiction is a serious relapsing disease that has high costs to society and to the individual addicts. Treatment of these addictions is still in its nascency, with only a few examples of successful therapies. Therapeutic response depends upon genetic, biological, social, and environmental components. A role for genetic makeup in the response to treatment has been shown for several addiction pharmacotherapies with response to treatment based on individual genetic makeup. In this chapter, we will discuss the role of genetics in pharmacotherapies, specifically for cocaine, alcohol, and opioid dependences. The continued elucidation of the role of genetics should aid in the development of new treatments and increase the efficacy of existing treatments.
Affiliation(s)
- David P Graham
- Michael E. DeBakey Veterans Affairs Medical Center, and the Menninger Department of Psychiatry and Behavioral Sciences, Baylor College of Medicine, Houston, TX, USA
- Mark J Harding
- Michael E. DeBakey Veterans Affairs Medical Center, and the Menninger Department of Psychiatry and Behavioral Sciences, Baylor College of Medicine, Houston, TX, USA
- David A Nielsen
- Michael E. DeBakey Veterans Affairs Medical Center, and the Menninger Department of Psychiatry and Behavioral Sciences, Baylor College of Medicine, Houston, TX, USA.
20
|
Lu X, Fan K, Ren J, Wu C. Identifying Gene-Environment Interactions With Robust Marginal Bayesian Variable Selection. Front Genet 2021; 12:667074. [PMID: 34956304 PMCID: PMC8693717 DOI: 10.3389/fgene.2021.667074] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 02/11/2021] [Accepted: 07/13/2021] [Indexed: 01/02/2023]
Abstract
In high-throughput genetics studies, an important aim is to identify gene–environment interactions associated with the clinical outcomes. Recently, multiple marginal penalization methods have been developed and shown to be effective in G×E studies. However, within the Bayesian framework, marginal variable selection has not received much attention. In this study, we propose a novel marginal Bayesian variable selection method for G×E studies. In particular, our marginal Bayesian method is robust to data contamination and outliers in the outcome variables. With the incorporation of spike-and-slab priors, we have implemented the Gibbs sampler based on Markov Chain Monte Carlo (MCMC). The proposed method outperforms a number of alternatives in extensive simulation studies. The utility of the marginal robust Bayesian variable selection method has been further demonstrated in the case studies using data from the Nurse Health Study (NHS). Some of the identified main and interaction effects from the real data analysis have important biological implications.
Affiliation(s)
- Xi Lu
- Department of Statistics, Kansas State University, Manhattan, KS, United States
- Kun Fan
- Department of Statistics, Kansas State University, Manhattan, KS, United States
- Jie Ren
- Department of Biostatistics, Indiana University School of Medicine, Indianapolis, IN, United States
- Cen Wu
- Department of Statistics, Kansas State University, Manhattan, KS, United States
21
|
Yu QY, Lu TP, Hsiao TH, Lin CH, Wu CY, Tzeng JY, Hsiao CK. An Integrative Co-localization (INCO) Analysis for SNV and CNV Genomic Features With an Application to Taiwan Biobank Data. Front Genet 2021; 12:709555. [PMID: 34567069 PMCID: PMC8456116 DOI: 10.3389/fgene.2021.709555] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/14/2021] [Accepted: 08/23/2021] [Indexed: 11/13/2022]
Abstract
Genomic studies have been a major approach to elucidating disease etiology and to exploring potential targets for treatments of many complex diseases. Statistical analyses in these studies often face the challenges of multiplicity, weak signals, and the nature of dependence among genetic markers. This situation becomes even more complicated when multi-omics data are available. To integrate the data from different platforms, various integrative analyses have been adopted, ranging from the direct union or intersection operation on sets derived from different single-platform analysis to complex hierarchical multi-level models. The former ignores the biological relationship between molecules while the latter can be hard to interpret. We propose in this study an integrative approach that combines both single nucleotide variants (SNVs) and copy number variations (CNVs) in the same genomic unit to co-localize the concurrent effect and to deal with the sparsity due to rare variants. This approach is illustrated with simulation studies to evaluate its performance and is applied to low-density lipoprotein cholesterol and triglyceride measurements from Taiwan Biobank. The results show that the proposed method can more effectively detect the collective effect from both SNVs and CNVs compared to traditional methods. For the biobank analysis, the identified genetic regions including the gene VNN2 could be novel and deserve further investigation.
Affiliation(s)
- Qi-You Yu
- Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan University, Taipei, Taiwan
- Tzu-Pin Lu
- Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan University, Taipei, Taiwan
- Department of Public Health, National Taiwan University, Taipei, Taiwan
- Tzu-Hung Hsiao
- Department of Medical Research, Taichung Veterans General Hospital, Taichung, Taiwan
- Ching-Heng Lin
- Department of Medical Research, Taichung Veterans General Hospital, Taichung, Taiwan
- Chi-Yun Wu
- Graduate Group in Genomics and Computational Biology, University of Pennsylvania, Philadelphia, PA, United States
- Department of Statistics, University of Pennsylvania, Philadelphia, PA, United States
- Jung-Ying Tzeng
- Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan University, Taipei, Taiwan
- Department of Statistics and Bioinformatics Research Center, North Carolina State University, Raleigh, NC, United States
- Chuhsing Kate Hsiao
- Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan University, Taipei, Taiwan
- Department of Public Health, National Taiwan University, Taipei, Taiwan
22
|
Du Y, Fan K, Lu X, Wu C. Integrating Multi–Omics Data for Gene-Environment Interactions. BIOTECH 2021; 10:biotech10010003. [PMID: 35822775 PMCID: PMC9245467 DOI: 10.3390/biotech10010003] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 12/24/2020] [Revised: 01/22/2021] [Accepted: 01/22/2021] [Indexed: 01/05/2023]
Abstract
Gene-environment (G×E) interaction is critical for understanding the genetic basis of complex disease beyond genetic and environment main effects. In addition to existing tools for interaction studies, penalized variable selection emerges as a promising alternative for dissecting G×E interactions. Despite the success, variable selection is limited in terms of accounting for multidimensional measurements. Published variable selection methods cannot accommodate structured sparsity in the framework of integrating multiomics data for disease outcomes. In this paper, we have developed a novel variable selection method in order to integrate multi-omics measurements in G×E interaction studies. Extensive studies have already revealed that analyzing omics data across multi-platforms is not only sensible biologically, but also resulting in improved identification and prediction performance. Our integrative model can efficiently pinpoint important regulators of gene expressions through sparse dimensionality reduction, and link the disease outcomes to multiple effects in the integrative G×E studies through accommodating a sparse bi-level structure. The simulation studies show the integrative model leads to better identification of G×E interactions and regulators than alternative methods. In two G×E lung cancer studies with high dimensional multi-omics data, the integrative model leads to an improved prediction and findings with important biological implications.