1
Chentoufi FE, Redouane S, Barakat A, Benrahma H, Charoute H. Computational study of the potential impact of WHRN protein missense SNPs on WHRN-MYO15A protein complex interaction and their association with Usher syndrome. J Biomol Struct Dyn 2025:1-26. [PMID: 40389825 DOI: 10.1080/07391102.2025.2507152] [Received: 09/23/2024] [Accepted: 04/11/2025] [Indexed: 05/21/2025]
Abstract
Usher syndrome is a rare genetic condition characterized by both hearing and vision impairment that arises from mutations in multiple genes, including WHRN and MYO15A. In this computational work, we explore how missense SNPs in the WHRN protein affect its interaction with the MYO15A protein, a crucial component of the Usher interactome. Missense SNPs with a potential effect on the function of the WHRN protein were identified using various computational prediction tools, including VEP, SIFT, PolyPhen-2, CADD, REVEL, and Mutation Assessor. The stability of the mutated proteins was further evaluated with SDM2, mCSM, DeepDDG, and CUPSAT. We used the ConSurf web server to identify conserved regions in the WHRN protein. YASARA and HADDOCK were used to minimize the energy of the protein 3D structures and to dock the protein-protein complexes, respectively, and the binding energy of the complexes was then calculated with PRODIGY. The pathogenicity prediction tools predicted a total of 18 missense SNPs as deleterious. However, a comprehensive analysis revealed that only six SNPs were predicted to be the most deleterious, with high conservation and reduced stability. Furthermore, we conducted molecular dynamics analysis to characterize the impact of these variations on the dynamic behavior of the WHRN-MYO15A protein complex, which revealed the destabilizing effects of the deleterious SNPs on the binding affinity and stability of the complex.
Affiliation(s)
- Fatima Ezzahra Chentoufi
  - Research Unit of Epidemiology, Biostatistics and Bioinformatics, Institut Pasteur du Maroc, Casablanca, Morocco
  - Interdisciplinary Laboratory of Biotechnology and Health, Mohammed VI Higher Institute of Biosciences and Biotechnology, Mohammed VI University of Sciences and Health (UM6SS), Casablanca, Morocco
- Salaheddine Redouane
  - Laboratory of Genomics and Human Genetics, Institut Pasteur du Maroc, Casablanca, Morocco
- Abdelhamid Barakat
  - Laboratory of Genomics and Human Genetics, Institut Pasteur du Maroc, Casablanca, Morocco
- Houda Benrahma
  - Interdisciplinary Laboratory of Biotechnology and Health, Mohammed VI Higher Institute of Biosciences and Biotechnology, Mohammed VI University of Sciences and Health (UM6SS), Casablanca, Morocco
- Hicham Charoute
  - Research Unit of Epidemiology, Biostatistics and Bioinformatics, Institut Pasteur du Maroc, Casablanca, Morocco
2
Liu S, Bush WS, Akinyemi RO, Byrd GS, Caban-Holt AM, Rajabli F, Reitz C, Kunkle BW, Tosto G, Vance JM, Pericak-Vance M, Haines JL, Williams SM, Crawford DC. Alzheimer disease is (sometimes) highly heritable: Drivers of variation in heritability estimates for binary traits, a systematic review. medRxiv 2025:2025.04.29.25326648. [PMID: 40343016 PMCID: PMC12060970 DOI: 10.1101/2025.04.29.25326648] [Indexed: 05/11/2025]
Abstract
Estimating heritability has been fundamental in understanding the genetic contributions to complex disorders like late-onset Alzheimer's disease (LOAD), and provides a rationale for identifying genetic factors associated with disease susceptibility. While numerous studies have established substantial genetic contribution for LOAD, the interpretation of heritability estimates remains challenging. These challenges are further complicated by the binary nature of LOAD status, where estimation and interpretation require additional considerations. Through a systematic review, we identified LOAD heritability estimates from 6 twin studies and 17 genome-wide association studies, all conducted in populations of European ancestry. We demonstrate that these heritability estimates for LOAD vary considerably. The variation reflects not only differences in study design and methodological approaches but also the underlying study population characteristics. Our findings indicate that commonly cited heritability estimates, often treated as universal values, should be interpreted within specific population contexts and methodological frameworks.
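One reason estimates for a binary trait such as LOAD status are hard to compare is that a heritability estimated on the observed (0/1) scale is usually converted to the liability scale, and that conversion depends on the assumed population prevalence. The sketch below uses the standard liability-threshold transformation; it is not a method taken from the review itself, and the function name `liability_h2` and all numeric inputs are illustrative assumptions rather than values from the paper.

```python
from statistics import NormalDist


def liability_h2(h2_obs, prevalence, case_fraction=None):
    """Convert an observed-scale heritability to the liability scale.

    h2_obs        -- heritability estimated on the 0/1 (observed) scale
    prevalence    -- assumed population prevalence K, with 0 < K < 1
    case_fraction -- proportion of cases P in the sample; defaults to K
                     (i.e. no case-control ascertainment)
    """
    k = prevalence
    p = k if case_fraction is None else case_fraction
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1 - k)   # liability threshold for being a case
    z = std_normal.pdf(threshold)           # normal density at that threshold
    return h2_obs * (k * (1 - k)) ** 2 / (z ** 2 * p * (1 - p))


# Illustrative only: the same observed-scale value under different assumed prevalences.
for k in (0.02, 0.05, 0.10):
    print(k, round(liability_h2(0.10, k), 3))
```

The same observed-scale input maps to different liability-scale values as the assumed prevalence or the sample case fraction changes, which is one concrete way in which population characteristics and study design enter a reported estimate.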
Affiliation(s)
- Shiying Liu
  - Department of Population and Quantitative Health Sciences, Case Western Reserve University School of Medicine, Cleveland, OH, USA
- William S Bush
  - Department of Population and Quantitative Health Sciences, Case Western Reserve University School of Medicine, Cleveland, OH, USA
- Rufus Olusola Akinyemi
  - Neuroscience and Ageing Research Unit, Institute for Advanced Medical Research and Training, College of Medicine, University of Ibadan, Ibadan, Oyo, Nigeria
- Goldie S Byrd
  - Maya Angelou Center for Health Equity, Wake Forest School of Medicine, Winston-Salem, NC, USA
- Allison Mercedes Caban-Holt
  - Department of Behavioral Science, College of Medicine and Sanders-Brown Center on Aging, University of Kentucky, Lexington, KY, USA
- Farid Rajabli
  - John P. Hussman Institute for Human Genomics, Dr. John T. Macdonald Foundation Department of Human Genetics, Department of Neurology, University of Miami Miller School of Medicine, Miami, FL, USA
- Christiane Reitz
  - Gertrude H. Sergievsky Center, Taub Institute for Research on Alzheimer's Disease and the Aging Brain, Department of Neurology, Columbia University, Department of Epidemiology, Columbia University, New York, NY, USA
- Brian W Kunkle
  - John P. Hussman Institute for Human Genomics, Dr. John T. Macdonald Foundation Department of Human Genetics, Department of Neurology, University of Miami Miller School of Medicine, Miami, FL, USA
- Giuseppe Tosto
  - The Taub Institute for Research on Alzheimer's Disease and the Aging Brain, Columbia University Irving Medical Center, Columbia University; Department of Neurology, Columbia University Irving Medical Center, Columbia University; The Gertrude H. Sergievsky Center, College of Physicians and Surgeons, Columbia University Irving Medical Center, Columbia University, New York, NY, USA
- Jeffery M Vance
  - John P. Hussman Institute for Human Genomics, Dr. John T. Macdonald Foundation Department of Human Genetics, Department of Neurology, University of Miami Miller School of Medicine, Miami, FL, USA
  - Dr. John T. Macdonald Foundation Department of Human Genetics, University of Miami Miller School of Medicine, Miami, Florida, USA
- Margaret Pericak-Vance
  - John P. Hussman Institute for Human Genomics, Dr. John T. Macdonald Foundation Department of Human Genetics, Department of Neurology, University of Miami Miller School of Medicine, Miami, FL, USA
- Jonathan L Haines
  - Department of Population and Quantitative Health Sciences, Case Western Reserve University School of Medicine, Cleveland, OH, USA
- Scott M Williams
  - Department of Population and Quantitative Health Sciences, Case Western Reserve University School of Medicine, Cleveland, OH, USA
- Dana C Crawford
  - Department of Population and Quantitative Health Sciences, Case Western Reserve University School of Medicine, Cleveland, OH, USA
3
Lori A, Patel AV, Westmaas JL, Diver WR. A novel smoking cessation behavior based on quit attempts may identify new genes associated with long-term abstinence. Addict Behav 2025; 161:108192. [PMID: 39504611 DOI: 10.1016/j.addbeh.2024.108192] [Received: 04/19/2024] [Revised: 10/23/2024] [Accepted: 10/25/2024] [Indexed: 11/08/2024]
Abstract
BACKGROUND Smoking cessation at any age has been shown to improve quality of life, decrease illness, and reduce mortality. About half of smokers attempt to quit each year, but only ~7% maintain long-term abstinence unaided. Few genetic factors have been consistently associated with smoking cessation, possibly due to poor phenotype definition.
METHODS We performed a genome-wide association study (GWAS) with an alternative phenotype based on the difficulty of quitting smoking (DQS) in the Cancer Prevention Study-3 cohort. Difficult quitters were defined as having made at least ten quit attempts, whether successful or not, and easy quitters as having quit after only one attempt. Only individuals of European ancestry were selected for the study. Among 10,004 smokers (5,071 difficult quitters, 4,933 easy quitters), we assessed the genetic heritability of DQS and evaluated associations between DQS and each genome-wide variant using logistic regression while adjusting for confounders, including smoking intensity (cigarettes per day).
RESULTS The genetic heritability of the DQS phenotype was 13%, comparable to, or higher than, the reported heritability of other smoking behaviors (e.g., smoking intensity, cessation). Although no variants were genome-wide significant, several genes were identified at a subthreshold level (p < 10⁻⁴). A variant in MEGF9 (rs149760032), which encodes a transmembrane protein largely expressed in the central nervous system, showed the strongest association with DQS (OR = 0.60, p = 1.3 × 10⁻⁷). Additional variants associated with DQS independently of smoking intensity were also detected in GLRA3 (rs73006492, OR = 0.77, p = 5.6 × 10⁻⁷) and FOCAD (rs112251973, OR = 1.96, p = 1.8 × 10⁻⁶) and are plausibly related to smoking cessation through pathways in the brain and respiratory system.
CONCLUSIONS The use of an alternative cessation phenotype based on difficulty quitting smoking facilitated the identification of new pathways that could lead to unique smoking treatments.
Affiliation(s)
- Adriana Lori
  - Department of Population Science, American Cancer Society, Atlanta, GA, USA
- Alpa V Patel
  - Department of Population Science, American Cancer Society, Atlanta, GA, USA
- J Lee Westmaas
  - Department of Population Science, American Cancer Society, Atlanta, GA, USA
- W Ryan Diver
  - Department of Population Science, American Cancer Society, Atlanta, GA, USA
4
Grinde KE, Browning BL, Reiner AP, Thornton TA, Browning SR. Adjusting for principal components can induce collider bias in genome-wide association studies. PLoS Genet 2024; 20:e1011242. [PMID: 39680601 PMCID: PMC11684764 DOI: 10.1371/journal.pgen.1011242] [Received: 03/29/2024] [Revised: 12/30/2024] [Accepted: 11/14/2024] [Indexed: 12/18/2024]
Abstract
Principal component analysis (PCA) is widely used to control for population structure in genome-wide association studies (GWAS). Top principal components (PCs) typically reflect population structure, but challenges arise in deciding how many PCs are needed and ensuring that PCs do not capture other artifacts such as regions with atypical linkage disequilibrium (LD). In response to the latter, many groups suggest performing LD pruning or excluding known high LD regions prior to PCA. However, these suggestions are not universally implemented and the implications for GWAS are not fully understood, especially in the context of admixed populations. In this paper, we investigate the impact of pre-processing and the number of PCs included in GWAS models in African American samples from the Women's Health Initiative SNP Health Association Resource and two Trans-Omics for Precision Medicine Whole Genome Sequencing Project contributing studies (Jackson Heart Study and Genetic Epidemiology of Chronic Obstructive Pulmonary Disease Study). In all three samples, we find the first PC is highly correlated with genome-wide ancestry whereas later PCs often capture local genomic features. The pattern of which, and how many, genetic variants are highly correlated with individual PCs differs from what has been observed in prior studies focused on European populations and leads to distinct downstream consequences: adjusting for such PCs yields biased effect size estimates and elevated rates of spurious associations due to the phenomenon of collider bias. Excluding high LD regions identified in previous studies does not resolve these issues. LD pruning proves more effective, but the optimal choice of thresholds varies across datasets. 
Altogether, our work highlights unique issues that arise when using PCA to control for ancestral heterogeneity in admixed populations and demonstrates the importance of careful pre-processing and diagnostics to ensure that PCs capturing multiple local genomic features are not included in GWAS models.
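The collider-bias mechanism named in the abstract can be illustrated with a toy simulation. This is a minimal sketch under loudly simplified assumptions, not the authors' analysis or data: genotypes are drawn as standard normal variables, and the hypothetical "PC" is simply the sum of a tested variant `g1`, an independent causal variant `g2`, and noise, standing in for a component that captures a local genomic feature.

```python
import random


def slope_one(y, x):
    """OLS slope for y ~ x (with intercept)."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def slopes_two(y, x1, x2):
    """OLS slopes for y ~ x1 + x2 (with intercept), via centred normal equations."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


def simulate(n=20000, seed=1):
    rng = random.Random(seed)
    g1 = [rng.gauss(0, 1) for _ in range(n)]   # tested variant, no effect on the trait
    g2 = [rng.gauss(0, 1) for _ in range(n)]   # causal variant, independent of g1
    # Hypothetical PC that loads on both variants (a common effect of g1 and g2).
    pc = [a + b + rng.gauss(0, 1) for a, b in zip(g1, g2)]
    y = [b + rng.gauss(0, 1) for b in g2]      # trait depends on g2 only
    unadjusted = slope_one(y, g1)
    adjusted, _ = slopes_two(y, g1, pc)
    return unadjusted, adjusted


unadjusted, adjusted = simulate()
print("g1 effect without PC adjustment:", round(unadjusted, 3))
print("g1 effect with PC adjustment:   ", round(adjusted, 3))
```

Under these settings the population value of the unadjusted slope is 0, while conditioning on the PC gives the null variant a population slope of −0.5, because `g1` and `g2` become negatively correlated once their common effect is held fixed.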
Affiliation(s)
- Kelsey E. Grinde
  - Department of Mathematics, Statistics, and Computer Science, Macalester College, Saint Paul, Minnesota, United States of America
- Brian L. Browning
  - Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, Washington, United States of America
- Alexander P. Reiner
  - Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
  - Department of Epidemiology, University of Washington, Seattle, Washington, United States of America
- Timothy A. Thornton
  - Regeneron Genetics Center, Tarrytown, New York, United States of America
  - Department of Biostatistics, University of Washington, Seattle, Washington, United States of America
- Sharon R. Browning
  - Department of Biostatistics, University of Washington, Seattle, Washington, United States of America
5
Giles JB, Martinez KL, Steiner HE, Klein A, Ooi A, Pryor J, Sweitzer N, Fuchs D, Karnes JH. Association of Metal Cations with the Anti-PF4/Heparin Antibody Response in Heparin-Induced Thrombocytopenia. Cardiovasc Toxicol 2024; 24:968-981. [PMID: 39017812 DOI: 10.1007/s12012-024-09895-w] [Received: 05/07/2024] [Accepted: 07/08/2024] [Indexed: 07/18/2024]
Abstract
Heparin-induced thrombocytopenia (HIT) is an antibody-mediated immune response against complexes of heparin and platelet factor 4 (PF4). The electrostatic interaction between heparin and PF4 is critical for the anti-PF4/heparin antibody response seen in HIT. The binding of metal cations to heparin induces conformational changes and charge neutralization of the heparin molecule, and cation-heparin binding can modulate the specificity and affinity for heparin-binding partners. However, the effects of metal cation binding to heparin in the context of the anti-PF4/heparin antibody response have not been determined. Here, we utilized inductively coupled plasma mass spectrometry (ICP-MS) to quantify 16 metal cations in patient plasma and tested for correlation with anti-PF4/heparin IgG levels and platelet count after clinical suspicion of HIT in a cohort of heparin-treated patients. The cohort (n = 32) had a mean age of 60.53 (SD = 14.31) years and a mean anti-PF4/heparin antibody optical density (OD405) of 0.93 (SD = 1.21) units, and was primarily female (n = 23). Patients with positive anti-PF4/heparin antibody test results (OD405 ≥ 0.5 units) were younger, had increased weight and BMI, and were more likely to have a positive serotonin release assay (SRA) result compared to antibody-negative patients. We observed statistical differences between antibody-positive and -negative groups for sodium and aluminum and significant correlations of anti-PF4/heparin antibody levels with sodium and silver. While differences in sodium concentrations were associated with antibody-positive status and correlated with antibody levels, no replication was performed. Additional studies are warranted to confirm our observed association, including in vitro binding studies and larger observational cohorts.
Affiliation(s)
- Jason B Giles
  - Lifecourse Epidemiology of Adiposity and Diabetes (LEAD) Center, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Kiana L Martinez
  - Department of Pharmacy Practice and Science, University of Arizona College of Pharmacy, 1295 N Martin AVE, Tucson, AZ, 85721, USA
- Heidi E Steiner
  - Department of Pharmacy Practice and Science, University of Arizona College of Pharmacy, 1295 N Martin AVE, Tucson, AZ, 85721, USA
- Andrew Klein
  - Department of Pharmacy Practice and Science, University of Arizona College of Pharmacy, 1295 N Martin AVE, Tucson, AZ, 85721, USA
- Aikseng Ooi
  - Department of Pharmacology and Toxicology, University of Arizona College of Pharmacy, Tucson, AZ, USA
- Julie Pryor
  - Banner University Medical Center-Tucson, Tucson, AZ, USA
- Nancy Sweitzer
  - John T Milliken Department of Internal Medicine, Washington University School of Medicine, St Louis, MO, USA
- Deborah Fuchs
  - Banner University Medical Center-Tucson, Tucson, AZ, USA
- Jason H Karnes
  - Department of Pharmacy Practice and Science, University of Arizona College of Pharmacy, 1295 N Martin AVE, Tucson, AZ, 85721, USA
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
6
Lu S, Huang Y, Shen WX, Cao YL, Cai M, Chen Y, Tan Y, Jiang YY, Chen YZ. Raman spectroscopic deep learning with signal aggregated representations for enhanced cell phenotype and signature identification. PNAS Nexus 2024; 3:pgae268. [PMID: 39192845 PMCID: PMC11348106 DOI: 10.1093/pnasnexus/pgae268] [Received: 02/15/2024] [Accepted: 06/21/2024] [Indexed: 08/29/2024]
Abstract
Feature representation is critical for data learning, particularly in learning spectroscopic data. Machine learning (ML) and deep learning (DL) models learn Raman spectra for rapid, nondestructive, and label-free cell phenotype identification, which facilitates diagnostic, therapeutic, forensic, and microbiological applications. However, these models are challenged by high-dimensional, unordered, and low-sample spectroscopic data. Here, we introduce novel 2D image-like dual signal and component aggregated representations, built by restructuring Raman spectra and principal components, which enable spectroscopic DL for enhanced cell phenotype and signature identification. The new ConvNet models, DSCARNets, significantly outperformed the state-of-the-art (SOTA) ML and DL models on six benchmark datasets, mostly with >2% improvement over the SOTA accuracies of 85-97%. DSCARNets also performed well on four additional datasets against SOTA models with extremely high performance (>98%) and on two datasets without a published supervised phenotype classification model. Explainable DSCARNets identified Raman signatures consistent with experimental indications.
Affiliation(s)
- Songlin Lu
  - The State Key Laboratory of Chemical Oncogenomics, Key Laboratory of Chemical Biology, Tsinghua Shenzhen International Graduate School, Tsinghua University, 2279 Lishui Road, Nanshan District, Shenzhen 518055, Guangdong, P. R. China
  - Institute of Biomedical Health Technology and Engineering, Shenzhen Bay Laboratory, 9 Kexue Avenue, Guangming District, Shenzhen 518132, Guangdong, P. R. China
- Yuanfang Huang
  - The State Key Laboratory of Chemical Oncogenomics, Key Laboratory of Chemical Biology, Tsinghua Shenzhen International Graduate School, Tsinghua University, 2279 Lishui Road, Nanshan District, Shenzhen 518055, Guangdong, P. R. China
- Wan Xiang Shen
  - Bioinformatics and Drug Design Group, Department of Pharmacy, National University of Singapore, 18 Science Drive 4, Singapore 117543, Singapore
- Yu Lin Cao
  - Tangyi and Tsinghua Shenzhen International Graduate School Collaborative Program, Tsinghua University, 2279 Lishui Road, Nanshan District, Shenzhen 518055, Guangdong, P. R. China
- Mengna Cai
  - Tangyi and Tsinghua Shenzhen International Graduate School Collaborative Program, Tsinghua University, 2279 Lishui Road, Nanshan District, Shenzhen 518055, Guangdong, P. R. China
- Yan Chen
  - The State Key Laboratory of Chemical Oncogenomics, Key Laboratory of Chemical Biology, Tsinghua Shenzhen International Graduate School, Tsinghua University, 2279 Lishui Road, Nanshan District, Shenzhen 518055, Guangdong, P. R. China
  - Shenzhen Kivita Innovative Drug Discovery Institute, Shenzhen 518057, Guangdong, P. R. China
- Ying Tan
  - The State Key Laboratory of Chemical Oncogenomics, Key Laboratory of Chemical Biology, Tsinghua Shenzhen International Graduate School, Tsinghua University, 2279 Lishui Road, Nanshan District, Shenzhen 518055, Guangdong, P. R. China
  - Institute of Drug Discovery Technology, Ningbo University, 818 Fenghua Road, Ningbo 315211, Zhejiang, P. R. China
- Yu Yang Jiang
  - School of Pharmaceutical Sciences, Tsinghua University, 30 Shuangqing Road, Haidian District, Beijing 100084, P. R. China
- Yu Zong Chen
  - The State Key Laboratory of Chemical Oncogenomics, Key Laboratory of Chemical Biology, Tsinghua Shenzhen International Graduate School, Tsinghua University, 2279 Lishui Road, Nanshan District, Shenzhen 518055, Guangdong, P. R. China
  - Institute of Biomedical Health Technology and Engineering, Shenzhen Bay Laboratory, 9 Kexue Avenue, Guangming District, Shenzhen 518132, Guangdong, P. R. China
7
Briscoe Runquist R, Moeller DA. Isolation by environment and its consequences for range shifts with global change: Landscape genomics of the invasive plant common tansy. Mol Ecol 2024; 33:e17462. [PMID: 38993027 DOI: 10.1111/mec.17462] [Received: 02/06/2024] [Revised: 04/29/2024] [Accepted: 05/30/2024] [Indexed: 07/13/2024]
Abstract
Invasive species are a growing global economic and ecological problem. However, it is not well understood how environmental factors mediate invasive range expansion. In this study, we investigated the recent and rapid range expansion of common tansy across environmental gradients in Minnesota, USA. We densely sampled individuals across the expanding range and performed reduced representation sequencing to generate a dataset of 3071 polymorphic loci for 176 individuals. We used non-spatial and spatially explicit analyses to determine the relative influences of geographic distance and environmental variation on patterns of genomic variation. We found no evidence for isolation by distance but strong evidence for isolation by environment, indicating that environmental factors may have modulated patterns of range expansion. Land use classification and soils were particularly important variables related to population structure although they operated on different spatial scales; land use classification was related to broad-scale patterns and soils were related to fine-scale patterns. All analyses indicated a distinctive genetic cluster in the most recently invaded portion of the range. Individuals from the far northwestern range margin were separated from the remainder of the range by reduced migration, which was associated with environmental resistance. This portion of the range was invaded primarily in the last 15 years. Ecological niche models also indicated that this cluster was associated with the expansion of the niche. While invasion is often assumed to be primarily influenced by dispersal limitation, our results suggest that ongoing invasion and range shifts with climate change may be strongly affected by environmental heterogeneity.
Affiliation(s)
- Ryan Briscoe Runquist
  - Department of Plant and Microbial Biology, University of Minnesota, St. Paul, Minnesota, USA
- David A Moeller
  - Department of Plant and Microbial Biology, University of Minnesota, St. Paul, Minnesota, USA
8
Hou J, Ji X, Chu X, Shi Z, Wang B, Sun K, Wei H, Song Z, Wen F. Comprehensive lipidomic analysis revealed the effects of fermented Morus alba L. intake on lipid profile in backfat and muscle tissue of Yuxi black pigs. J Anim Physiol Anim Nutr (Berl) 2024; 108:764-777. [PMID: 38305489 DOI: 10.1111/jpn.13932] [Received: 05/23/2023] [Revised: 11/08/2023] [Accepted: 01/18/2024] [Indexed: 02/03/2024]
Abstract
Mulberry leaf is a widely used protein feed, often adopted in the livestock industry to reduce feed costs and improve meat quality. However, to date, there is a lack of research on the improvement of meat quality using mulberry leaves, and the exact mechanisms are not yet known. The results showed that fermented mulberry leaves significantly reduced backfat content but had no significant effect on intramuscular fat (IMF). Lipidomic analysis identified 98 and 303 differential lipid molecules (p < 0.05) in adipose and muscle tissues, respectively, including triglycerides (TG), phosphatidylcholine, phosphatidylethanolamine, and sphingolipids. Because TG were especially prominent among them, we analysed the acyl carbon atom number of TG. In adipose tissue, the acyl group containing 13 carbon atoms (C13) in TG was significantly upregulated, whereas C15, C16, C17, and C23 were significantly downregulated; in muscle tissue, C12, C19, C23, C25, and C26 in TG were significantly downregulated. Acyl changes in TG thus differed by carbon atom number and by tissue. We found that the correlations of C (14-18) were higher in adipose tissue, whereas the correlations of C (18-26) were higher in muscle tissue. Through pathway enrichment analysis, we identified six and four metabolic pathways with the highest contributions of differential lipid metabolites in adipose and muscle tissues, respectively. These findings suggest that fermented mulberry leaves improve meat quality mainly by inhibiting TG deposition by downregulating medium- and short-chain fatty acids in backfat tissue and long-chain fatty acids in muscle tissue.
Affiliation(s)
- Junjie Hou
  - College of Animal Science and Technology, Henan University of Science and Technology, Luoyang, China
- Xiang Ji
  - College of Animal Science and Technology, Henan University of Science and Technology, Luoyang, China
- Xiaoran Chu
  - College of Animal Science and Technology, Henan University of Science and Technology, Luoyang, China
- Zhuoyan Shi
  - College of Animal Science and Technology, Henan University of Science and Technology, Luoyang, China
- Binjie Wang
  - College of Animal Science and Technology, Henan University of Science and Technology, Luoyang, China
- Kangle Sun
  - College of Animal Science and Technology, Henan University of Science and Technology, Luoyang, China
- Haibo Wei
  - College of Animal Science and Technology, Henan University of Science and Technology, Luoyang, China
- Zhen Song
  - College of Animal Science and Technology, Henan University of Science and Technology, Luoyang, China
  - The Kay Laboratory of High Quality Livestock and Poultry Germplasm Resources and Genetic Breeding of Luoyang, Henan University of Science and Technology, Luoyang, China
- Fengyun Wen
  - College of Animal Science and Technology, Henan University of Science and Technology, Luoyang, China
  - The Kay Laboratory of High Quality Livestock and Poultry Germplasm Resources and Genetic Breeding of Luoyang, Henan University of Science and Technology, Luoyang, China
9
Grinde KE, Browning BL, Reiner AP, Thornton TA, Browning SR. Adjusting for principal components can induce spurious associations in genome-wide association studies in admixed populations. bioRxiv 2024:2024.04.02.587682. [PMID: 38617337 PMCID: PMC11014513 DOI: 10.1101/2024.04.02.587682] [Indexed: 04/24/2024]
Abstract
Principal component analysis (PCA) is widely used to control for population structure in genome-wide association studies (GWAS). Top principal components (PCs) typically reflect population structure, but challenges arise in deciding how many PCs are needed and ensuring that PCs do not capture other artifacts such as regions with atypical linkage disequilibrium (LD). In response to the latter, many groups suggest performing LD pruning or excluding known high LD regions prior to PCA. However, these suggestions are not universally implemented and the implications for GWAS are not fully understood, especially in the context of admixed populations. In this paper, we investigate the impact of pre-processing and the number of PCs included in GWAS models in African American samples from the Women's Health Initiative SNP Health Association Resource and two Trans-Omics for Precision Medicine Whole Genome Sequencing Project contributing studies (Jackson Heart Study and Genetic Epidemiology of Chronic Obstructive Pulmonary Disease Study). In all three samples, we find the first PC is highly correlated with genome-wide ancestry whereas later PCs often capture local genomic features. The pattern of which, and how many, genetic variants are highly correlated with individual PCs differs from what has been observed in prior studies focused on European populations and leads to distinct downstream consequences: adjusting for such PCs yields biased effect size estimates and elevated rates of spurious associations due to the phenomenon of collider bias. Excluding high LD regions identified in previous studies does not resolve these issues. LD pruning proves more effective, but the optimal choice of thresholds varies across datasets.
Altogether, our work highlights unique issues that arise when using PCA to control for ancestral heterogeneity in admixed populations and demonstrates the importance of careful pre-processing and diagnostics to ensure that PCs capturing multiple local genomic features are not included in GWAS models.
Affiliation(s)
- Kelsey E. Grinde
  - Department of Mathematics, Statistics, and Computer Science, Macalester College, Saint Paul, Minnesota, 55105, USA
- Brian L. Browning
  - Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, Washington, 98195, USA
- Alexander P. Reiner
  - Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, 98109, USA
  - Department of Epidemiology, University of Washington, Seattle, Washington, 98195, USA
- Timothy A. Thornton
  - Regeneron Genetics Center, Tarrytown, New York, 10591, USA
  - Department of Biostatistics, University of Washington, Seattle, Washington, 98195, USA
- Sharon R. Browning
  - Department of Biostatistics, University of Washington, Seattle, Washington, 98195, USA
10
|
Liu Z, Turkmen AS, Lin S. Bayesian LASSO for population stratification correction in rare haplotype association studies. Stat Appl Genet Mol Biol 2024; 23:sagmb-2022-0034. [PMID: 38235525 PMCID: PMC10794901 DOI: 10.1515/sagmb-2022-0034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2022] [Accepted: 12/19/2023] [Indexed: 01/19/2024]
Abstract
Population stratification (PS) is one major source of confounding in both single nucleotide polymorphism (SNP) and haplotype association studies. To address PS, principal component regression (PCR) and linear mixed model (LMM) are the current standards for SNP associations, which are also commonly borrowed for haplotype studies. However, the underfitting and overfitting problems introduced by PCR and LMM, respectively, have yet to be addressed. Furthermore, there have been only a few theoretical approaches proposed to address PS specifically for haplotypes. In this paper, we propose a new method under the Bayesian LASSO framework, QBLstrat, to account for PS in identifying rare and common haplotypes associated with a continuous trait of interest. QBLstrat utilizes a large number of principal components (PCs) with appropriate priors to sufficiently correct for PS, while shrinking the estimates of unassociated haplotypes and PCs. We compare the performance of QBLstrat with the Bayesian counterparts of PCR and LMM and a current method, haplo.stats. Extensive simulation studies and real data analyses show that QBLstrat is superior in controlling false positives while maintaining competitive power for identifying true positives under PS.
Affiliation(s)
- Zilu Liu
  - Department of Statistics, The Ohio State University, Columbus, OH 43210, USA
- Shili Lin
  - Department of Statistics, The Ohio State University, Columbus, OH 43210, USA
|
11
|
Liu Z, Turkmen AS, Lin S. Population stratification correction using Bayesian shrinkage priors for genetic association studies. Ann Hum Genet 2023; 87:302-315. [PMID: 37771252 PMCID: PMC11624906 DOI: 10.1111/ahg.12527] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/18/2022] [Revised: 08/20/2023] [Accepted: 08/24/2023] [Indexed: 09/30/2023]
Abstract
INTRODUCTION: Population stratification (PS) is a major source of confounding in population-based genetic association studies of quantitative traits. Principal component regression (PCR) and linear mixed model (LMM) are two commonly used approaches to account for PS in association studies. Previous studies have shown that LMM can be interpreted as including all principal components (PCs) as random-effect covariates. However, including all PCs in LMM may dilute the influence of relevant PCs in some scenarios, while including only a few preselected PCs in PCR may fail to fully capture the genetic diversity. MATERIALS AND METHODS: To address these shortcomings, we introduce Bayestrat, a method to detect associated variants with PS correction under the Bayesian LASSO framework. To adjust for PS, Bayestrat accommodates a large number of PCs and utilizes appropriate shrinkage priors to shrink the effects of nonassociated PCs. RESULTS: Simulation results show that Bayestrat consistently controls type I error rates and achieves higher power compared to its non-shrinkage counterparts, especially when the number of PCs included in the model is large. As a demonstration of the utility of Bayestrat, we apply it to the Multi-Ethnic Study of Atherosclerosis (MESA). Variants and genes associated with serum triglyceride or HDL cholesterol are identified in our analyses. DISCUSSION: The automatic and self-selection features of Bayestrat make it particularly suited to situations with complex underlying PS scenarios, where it is unknown a priori which PCs are potential confounders, yet the number that needs to be considered could be large in order to fully account for PS.
Affiliation(s)
- Zilu Liu
  - Department of Statistics, The Ohio State University, Columbus, OH 43210, USA
- Asuman S. Turkmen
  - Department of Statistics, The Ohio State University, Columbus, OH 43210, USA
- Shili Lin
  - Department of Statistics, The Ohio State University, Columbus, OH 43210, USA
|
12
|
Jiang C, Li M, Hu Y, Du X, Li X, He L, Lai Y, Chen T, Li Y, Guo X, Jiang C, Tang R, Sang C, Long D, Xie G, Dong J, Ma C. Identification of atrial fibrillation phenotypes at low risk of stroke in patients with CHA2DS2-VASc ≥2: Insight from the China-AF study. Pacing Clin Electrophysiol 2023; 46:1203-1211. [PMID: 37736697 DOI: 10.1111/pace.14829] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2023] [Revised: 08/07/2023] [Accepted: 09/07/2023] [Indexed: 09/23/2023]
Abstract
OBJECTIVE: Patients with atrial fibrillation (AF) are highly heterogeneous, and current risk stratification scores are only modestly good at predicting an individual's stroke risk. We aim to identify distinct AF clinical phenotypes with cluster analysis to optimize stroke prevention practices. METHODS: From the prospective Chinese Atrial Fibrillation Registry cohort study, we included 4337 AF patients with CHA2DS2-VASc ≥2 for males and ≥3 for females who were not treated with oral anticoagulation. We randomly split the patients into derivation and validation sets by a ratio of 7:3. In the derivation set, we used outcome-driven patient clustering with metric learning to group patients into clusters with different risk levels of ischemic stroke and systemic embolism, and to identify clusters of patients with low risks. Then we tested the results in the validation set, using the clustering rules generated from the derivation set. Finally, the survival decision tree was applied as a sensitivity analysis to confirm the results. RESULTS: Up to the follow-up of 1 year, 140 thromboembolic events (ischemic stroke or systemic embolism) occurred. After supervised metric learning from six variables involved in the CHA2DS2-VASc scheme, we identified a cluster of patients (255/3035, 8.4%) at an annual thromboembolism risk of 0.8% in the derivation set. None of the patients in the low-risk cluster had prior thromboembolism, heart failure, diabetes, or age older than 70 years. After applying the regularities from metric learning on the validation set, we also identified a cluster of patients (137/1302, 10.5%) with an incident thromboembolism rate of 0.7%. Sensitivity analysis based on the survival decision tree approach selected a subgroup of patients with the same phenotypes as the metric-learning algorithm. CONCLUSIONS: Cluster analysis identified a distinct clinical phenotype at low risk of stroke among high-risk [CHA2DS2-VASc ≥2 (≥3 for females)] patients with AF. The use of the novel analytic approach has the potential to spare a subset of AF patients unnecessary anticoagulation and avoid the associated risk of major bleeding.
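The 7:3 random split reported in this abstract can be sketched as follows. This is a minimal illustration, not the study's code: the function name `split_derivation_validation`, the seed, the integer patient identifiers, and the choice to round the derivation size down are all assumptions. Rounding down happens to reproduce the reported set sizes (3035 and 1302 of 4337).

```python
import random

def split_derivation_validation(ids, derivation_fraction=0.7, seed=0):
    """Randomly split identifiers into derivation and validation sets."""
    shuffled = list(ids)
    # A dedicated Random instance keeps the split reproducible for a given seed.
    random.Random(seed).shuffle(shuffled)
    # Assumption: the derivation size is rounded down.
    n_derivation = int(len(shuffled) * derivation_fraction)
    return shuffled[:n_derivation], shuffled[n_derivation:]

# Hypothetical identifiers standing in for the 4337 included patients.
patient_ids = list(range(4337))
derivation, validation = split_derivation_validation(patient_ids)
print(len(derivation), len(validation))
```

Clustering rules would then be learned on `derivation` only and applied unchanged to `validation`, as the abstract describes.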
Affiliation(s)
- Chao Jiang
  - Department of Cardiology, Beijing Anzhen Hospital, Capital Medical University, National Clinical Research Center for Cardiovascular Diseases, Beijing, China
- Mingxiao Li
  - Department of Cardiology, Beijing Anzhen Hospital, Capital Medical University, National Clinical Research Center for Cardiovascular Diseases, Beijing, China
- Yiying Hu
  - Ping An Health Technology, Beijing, China
- Xin Du
  - Department of Cardiology, Beijing Anzhen Hospital, Capital Medical University, National Clinical Research Center for Cardiovascular Diseases, Beijing, China
  - Heart Health Research Center, Beijing, China
- Xiang Li
  - Ping An Health Technology, Beijing, China
- Liu He
  - Department of Cardiology, Beijing Anzhen Hospital, Capital Medical University, National Clinical Research Center for Cardiovascular Diseases, Beijing, China
- Yiwei Lai
  - Department of Cardiology, Beijing Anzhen Hospital, Capital Medical University, National Clinical Research Center for Cardiovascular Diseases, Beijing, China
- Tiange Chen
  - School of Public Health, Peking University Health Science Center, Beijing, China
- Yingxue Li
  - Ping An Health Technology, Beijing, China
- Xueyuan Guo
  - Department of Cardiology, Beijing Anzhen Hospital, Capital Medical University, National Clinical Research Center for Cardiovascular Diseases, Beijing, China
- Chenxi Jiang
  - Department of Cardiology, Beijing Anzhen Hospital, Capital Medical University, National Clinical Research Center for Cardiovascular Diseases, Beijing, China
- Ribo Tang
  - Department of Cardiology, Beijing Anzhen Hospital, Capital Medical University, National Clinical Research Center for Cardiovascular Diseases, Beijing, China
- Caihua Sang
  - Department of Cardiology, Beijing Anzhen Hospital, Capital Medical University, National Clinical Research Center for Cardiovascular Diseases, Beijing, China
- Deyong Long
  - Department of Cardiology, Beijing Anzhen Hospital, Capital Medical University, National Clinical Research Center for Cardiovascular Diseases, Beijing, China
- Jianzeng Dong
  - Department of Cardiology, Beijing Anzhen Hospital, Capital Medical University, National Clinical Research Center for Cardiovascular Diseases, Beijing, China
  - Department of Cardiology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan Province, China
- Changsheng Ma
  - Department of Cardiology, Beijing Anzhen Hospital, Capital Medical University, National Clinical Research Center for Cardiovascular Diseases, Beijing, China
|
13
|
Zhang Z, Wei X. Artificial intelligence-assisted selection and efficacy prediction of antineoplastic strategies for precision cancer therapy. Semin Cancer Biol 2023; 90:57-72. [PMID: 36796530 DOI: 10.1016/j.semcancer.2023.02.005] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 10/19/2022] [Revised: 01/12/2023] [Accepted: 02/13/2023] [Indexed: 02/16/2023]
Abstract
The rapid development of artificial intelligence (AI) technologies in the context of the vast amount of collectable data obtained from high-throughput sequencing has led to an unprecedented understanding of cancer and accelerated the advent of a new era of clinical oncology with a tone of precision treatment and personalized medicine. However, the gains achieved by a variety of AI models in clinical oncology practice are far from what one would expect, and in particular, there are still many uncertainties in the selection of clinical treatment options that pose significant challenges to the application of AI in clinical oncology. In this review, we summarize emerging approaches, relevant datasets and open-source software of AI and show how to integrate them to address problems from clinical oncology and cancer research. We focus on the principles and procedures for identifying different antitumor strategies with the assistance of AI, including targeted cancer therapy, conventional cancer therapy, and cancer immunotherapy. In addition, we also highlight the current challenges and directions of AI in clinical oncology translation. Overall, we hope this article will provide researchers and clinicians with a deeper understanding of the role and implications of AI in precision cancer therapy, and help AI move more quickly into accepted cancer guidelines.
Affiliation(s)
- Zhe Zhang
  - Laboratory of Aging Research and Cancer Drug Target, State Key Laboratory of Biotherapy and Cancer Center, National Clinical Research Center for Geriatrics, West China Hospital, Sichuan University, Chengdu 610041, PR China; State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, and Collaborative Innovation Center for Biotherapy, Chengdu 610041, PR China
- Xiawei Wei
  - Laboratory of Aging Research and Cancer Drug Target, State Key Laboratory of Biotherapy and Cancer Center, National Clinical Research Center for Geriatrics, West China Hospital, Sichuan University, Chengdu 610041, PR China
|
14
|
Thia JA. Guidelines for standardizing the application of discriminant analysis of principal components to genotype data. Mol Ecol Resour 2023; 23:523-538. [PMID: 36039574 DOI: 10.1111/1755-0998.13706] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 01/31/2022] [Revised: 08/07/2022] [Accepted: 08/11/2022] [Indexed: 11/29/2022]
Abstract
Despite the popularity of discriminant analysis of principal components (DAPC) for studying population structure, there has been little discussion of best practice for this method. In this work, I provide guidelines for standardizing the application of DAPC to genotype data sets. An often overlooked fact is that DAPC generates a model describing genetic differences among a set of populations defined by a researcher. Appropriate parameterization of this model is critical for obtaining biologically meaningful results. I show that the number of leading PC axes used as predictors of among-population differences, p_axes, should not exceed the k-1 biologically informative PC axes that are expected for k effective populations in a genotype data set. This k-1 criterion for p_axes specification is more appropriate compared to the widely used proportional variance criterion, which often results in a choice of p_axes ≫ k-1. DAPC parameterized with no more than the leading k-1 PC axes: (i) is more parsimonious; (ii) captures maximal among-population variation on biologically relevant predictors; (iii) is less sensitive to unintended interpretations of population structure; and (iv) is more generally applicable to independent sample sets. Assessing model fit should be routine practice and aids interpretation of population structure. It is imperative that researchers articulate their study goals, that is, testing a priori expectations vs. studying de novo inferred populations, because this has implications on how their DAPC results should be interpreted. The discussion and practical recommendations in this work provide the molecular ecology community with a roadmap for using DAPC in population genetic investigations.
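The k-1 criterion stated in this abstract amounts to a simple cap on the number of retained PC axes. The helper below is a hypothetical illustration (the name `choose_p_axes` and its arguments are not from the paper or from any DAPC software); it only encodes the rule that no more than k-1 leading axes are used for k effective populations.

```python
def choose_p_axes(k_populations, n_available_axes):
    """Cap the number of leading PC axes at k-1 for k effective populations."""
    if k_populations < 2:
        raise ValueError("at least two populations are needed for discrimination")
    if n_available_axes < 1:
        raise ValueError("at least one PC axis must be available")
    # Never request more axes than the PCA actually produced.
    return min(k_populations - 1, n_available_axes)

# Example: 5 effective populations, 40 PC axes available from the PCA.
print(choose_p_axes(5, 40))
```

A proportional variance rule could select far more axes from the same 40, which is the situation the abstract argues against.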
Affiliation(s)
- Joshua A Thia
  - Bio21 Institute, School of BioSciences, The University of Melbourne, Melbourne, Victoria, Australia
|
15
|
Xiang Q, Xie Q, Liu Z, Mu G, Zhang H, Zhou S, Wang Z, Wang Z, Zhang Y, Zhao Z, Yuan D, Guo L, Wang N, Xiang J, Song H, Sun J, Jiang J, Cui Y. Genetic variations in relation to bleeding and pharmacodynamics of dabigatran in Chinese patients with nonvalvular atrial fibrillation: A nationwide multicentre prospective cohort study. Clin Transl Med 2022; 12:e1104. [PMID: 36453946 PMCID: PMC9714378 DOI: 10.1002/ctm2.1104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2022] [Revised: 10/19/2022] [Accepted: 10/21/2022] [Indexed: 12/05/2022]
Abstract
INTRODUCTION: To identify the potential factors responsible for the individual variability of dabigatran, we investigated the genetic variations associated with clinical outcomes and pharmacodynamics (PD) in Chinese patients with nonvalvular atrial fibrillation (NVAF). MATERIALS AND METHODS: Chinese patients with NVAF taking dabigatran etexilate at therapeutic doses were enrolled. The primary (bleeding events) and secondary (thromboembolic and major adverse cardiac events) outcomes for a 2-year follow-up were evaluated. Peak and trough PD parameters (anti-FIIa activity, activated partial thromboplastin time and prothrombin time) were measured. Whole-exome sequencing, genome-wide sequencing and candidate gene association analyses were performed. RESULTS: A total of 170 patients with NVAF treated with dabigatran (110 mg twice daily) were finally included. Two single-nucleotide polymorphisms (SNPs) were significantly associated with bleeding: UBASH3B rs2276408 (odds ratio [OR] = 8.79, 95% confidence interval [CI]: 2.99-25.83, p = 7.77 × 10⁻⁵ at the sixth month visit) and FBN2 rs3805625 (OR = 8.29, 95% CI: 2.87-23.89, p = 9.08 × 10⁻⁵ at the 12th month visit), both also showing increased trends at other visits (p < .05). Furthermore, minor allele carriers of 16 new SNPs had increased PD levels, and those of one new SNP had decreased PD values (p < 1.0 × 10⁻⁵). Lastly, 33 new SNPs were found to be associated with bleeding and PD among 14 candidate genes. Unfortunately, the low number of secondary outcomes precluded further association analyses. CONCLUSIONS: Genetic variations indeed affected bleeding and PD in Chinese patients with NVAF treated with dabigatran. The functions of these suggestive genes and SNPs might further be explored and verified in more in vivo and in vitro investigations.
Affiliation(s)
- Qian Xiang
  - Department of Pharmacy, Peking University First Hospital, Beijing, China
- Qiufen Xie
  - Department of Pharmacy, Peking University First Hospital, Beijing, China
- Zhiyan Liu
  - Department of Pharmacy, Peking University First Hospital, Beijing, China
- Guangyan Mu
  - Department of Pharmacy, Peking University First Hospital, Beijing, China
- Hanxu Zhang
  - Department of Pharmacy, Peking University First Hospital, Beijing, China
  - School of Pharmaceutical Sciences, Peking University Health Science Center, Beijing, China
- Shuang Zhou
  - Department of Pharmacy, Peking University First Hospital, Beijing, China
- Zhe Wang
  - Department of Pharmacy, Peking University First Hospital, Beijing, China
  - School of Pharmaceutical Sciences, Peking University Health Science Center, Beijing, China
- Zining Wang
  - Department of Pharmacy, Peking University First Hospital, Beijing, China
- Yatong Zhang
  - Department of Pharmacy, Beijing Hospital, Beijing, China
- Zinan Zhao
  - Department of Pharmacy, Beijing Hospital, Beijing, China
- Dongdong Yuan
  - Department of Pharmacy, Zhengzhou Seventh People's Hospital, Zhengzhou, China
- Liping Guo
  - Department of Pharmacy, Zhengzhou Seventh People's Hospital, Zhengzhou, China
- Na Wang
  - Department of Pharmacy, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Jing Xiang
  - Department of Pharmacy, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Hongtao Song
  - Department of Pharmacy, 900 Hospital of the Joint Logistics Team, Fuzhou, China
- Jianjun Sun
  - Department of Pharmacy, The Affiliated Hospital of Inner Mongolia Medical University, Huhehaote, China
- Jie Jiang
  - Department of Cardiology, Peking University First Hospital, Beijing, China
- Yimin Cui
  - Department of Pharmacy, Peking University First Hospital, Beijing, China
  - School of Pharmaceutical Sciences, Peking University Health Science Center, Beijing, China
  - Institute of Clinical Pharmacology, Peking University, Beijing, China
|
16
|
Correa R, Alonso-Pupo N, Hernández Rodríguez EW. Multi-omics data integration approaches for precision oncology. Mol Omics 2022; 18:469-479. [DOI: 10.1039/d1mo00411e] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/18/2023]
Abstract
Next-generation sequencing (NGS) has been pivotal to enhance the molecular characterization of human malignancies, allowing multiple omics data types to be available for cancer researchers and practitioners. In this context,...
|
17
|
Abegaz F, Van Lishout F, Mahachie John JM, Chiachoompu K, Bhardwaj A, Duroux D, Gusareva ES, Wei Z, Hakonarson H, Van Steen K. Performance of model-based multifactor dimensionality reduction methods for epistasis detection by controlling population structure. BioData Min 2021; 14:16. [PMID: 33608043 PMCID: PMC7893746 DOI: 10.1186/s13040-021-00247-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/29/2020] [Accepted: 02/07/2021] [Indexed: 12/15/2022]
Abstract
Background: In genome-wide association studies the extent and impact of confounding due to population structure have been well recognized. Inadequate handling of such confounding is likely to lead to spurious associations, hampering replication and the identification of causal variants. Several strategies have been developed for protecting associations against confounding; the most popular one is based on Principal Component Analysis. In contrast, the extent and impact of confounding due to population structure in gene-gene interaction (epistasis) association studies are much less investigated and understood. In particular, the role of nonlinear genetic population substructure in epistasis detection is largely under-investigated, especially outside a regression framework. Methods: To identify causal variants in synergy and to improve interpretability and replicability of epistasis results, we introduce three strategies based on a model-based multifactor dimensionality reduction approach for structured populations, namely MBMDR-PC, MBMDR-PG, and MBMDR-GC. Results: Simulation results comparing the performance of various approaches show that in the presence of population structure MBMDR-PC and MBMDR-PG consistently better control type I error rate at the nominal level than MBMDR-GC. Moreover, our proposed three methods of population structure correction outperform MDR-SP in terms of statistical power. Conclusion: We demonstrate through extensive simulation studies the effect of various degrees of genetic population structure and relatedness on epistasis detection and propose appropriate remedial measures based on linear and nonlinear sample genetic similarity. Supplementary Information: The online version contains supplementary material available at 10.1186/s13040-021-00247-w.
Affiliation(s)
- Fentaw Abegaz
  - GIGA-R, Medical Genomics - BIO3, University of Liège, Liège, Belgium
- Archana Bhardwaj
  - GIGA-R, Medical Genomics - BIO3, University of Liège, Liège, Belgium
- Diane Duroux
  - GIGA-R, Medical Genomics - BIO3, University of Liège, Liège, Belgium
- Elena S Gusareva
  - GIGA-R, Medical Genomics - BIO3, University of Liège, Liège, Belgium
- Zhi Wei
  - Department of Computer Science, New Jersey Institute of Technology, Newark, NJ, USA
- Hakon Hakonarson
  - Center for Applied Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Department of Pediatrics, Division of Human Genetics, The Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Kristel Van Steen
  - GIGA-R, Medical Genomics - BIO3, University of Liège, Liège, Belgium
  - WELBIO (Walloon Excellence in Lifesciences and Biotechnology), University of Liège, Liège, Belgium
|
18
|
Chen Y, Hong Y, Yang D, He Z, Lin X, Wang G, Yu W. Simultaneous determination of phenolic metabolites in Chinese citrus and grape cultivars. PeerJ 2020; 8:e9083. [PMID: 32547855 PMCID: PMC7275686 DOI: 10.7717/peerj.9083] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 01/15/2020] [Accepted: 04/08/2020] [Indexed: 11/22/2022]
Abstract
Background: Flavonoids and phenolic acids are the major bioactive compounds in citrus and grape, so their contents are useful quality evaluation criteria for classifying fruit quality and understanding the potential health benefits of each fruit variety. Methods: A total of 15 varieties of citrus and 12 varieties of grapes were collected from Fujian, China. A high-performance liquid chromatography method was used for the simultaneous determination of 17 phenolic compounds, including gallic acid, chlorogenic acid, caffeic acid, syringic acid, p-coumaric acid, ferulic acid, benzoic acid, salicylic acid, catechin, epicatechin, resveratrol, rutin, naringin, hesperidin, quercetin, nobiletin and tangeritin in the peels of citrus and grape cultivars. Further, the cultivars of citrus and grape were classified using principal component analysis (PCA) and hierarchical cluster analysis (HCA). Results: A thorough separation of the 17 compounds was achieved within 100 min. The tested method exhibited good linearity (the limits of detection and limits of quantification were in the range of 0.03–1.83 µg/mL and 0.09–5.55 µg/mL, respectively), precision (the relative standard deviations of repeatability were 1.02–1.97%), and recovery (92.2–102.82%) for all the compounds, and could be used for the simultaneous determination of phenolic compounds in citrus and grape. Hesperidin (12.93–26,160.98 µg/g DW) and salicylic acid (5.35–751.02 µg/g DW) were the main flavonoid and phenolic acid, respectively, in the 15 citrus varieties. In grapes, hesperidin (ND to 605.48 µg/g DW) and salicylic acid (ND to 1,461.79 µg/g DW) were likewise the most abundant flavonoid and phenolic acid, respectively. The 15 citrus and 12 grape samples were classified into two main groups by PCA and HCA with strong consistency.
Affiliation(s)
- Yuan Chen
  - Institute of Agricultural Engineering and Technology, Fujian Academy of Agricultural Sciences, Fuzhou, Fujian, China
  - Harbor Branch Oceanographic Institute, Florida Atlantic University, Fort Pierce, FL, USA
  - Fujian Key Laboratory of Agricultural Product Food Processing (FAAS), Fuzhou, Fujian, China
- Yanyun Hong
  - Hunan Provincial Key Laboratory for Biology and Control of Plant Pests, College of Plant Protection, Hunan Agricultural University, Changsha, Hunan, China
- Daofu Yang
  - Fujian Academy of Agricultural Sciences, Fuzhou, China
- Zhigang He
  - Institute of Agricultural Engineering and Technology, Fujian Academy of Agricultural Sciences, Fuzhou, Fujian, China
  - Fujian Key Laboratory of Agricultural Product Food Processing (FAAS), Fuzhou, Fujian, China
- Xiaozi Lin
  - Institute of Agricultural Engineering and Technology, Fujian Academy of Agricultural Sciences, Fuzhou, Fujian, China
  - Fujian Key Laboratory of Agricultural Product Food Processing (FAAS), Fuzhou, Fujian, China
- Guojun Wang
  - Harbor Branch Oceanographic Institute, Florida Atlantic University, Fort Pierce, FL, USA
- Wenquan Yu
  - Fujian Academy of Agricultural Sciences, Fuzhou, China
|
19
|
Mitchell BL, Cuéllar-Partida G, Grasby KL, Campos AI, Strike LT, Hwang LD, Okbay A, Thompson PM, Medland SE, Martin NG, Wright MJ, Rentería ME. Educational attainment polygenic scores are associated with cortical total surface area and regions important for language and memory. Neuroimage 2020; 212:116691. [PMID: 32126298 DOI: 10.1016/j.neuroimage.2020.116691] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 07/17/2019] [Revised: 02/06/2020] [Accepted: 02/26/2020] [Indexed: 02/01/2023]
Abstract
It is well established that higher cognitive ability is associated with larger brain size. However, individual variation in intelligence exists despite brain size and recent studies have shown that a simple unifactorial view of the neurobiology underpinning cognitive ability is probably unrealistic. Educational attainment (EA) is often used as a proxy for cognitive ability since it is easily measured, resulting in large sample sizes and, consequently, sufficient statistical power to detect small associations. This study investigates the association between three global (total surface area (TSA), intra-cranial volume (ICV) and average cortical thickness) and 34 regional cortical measures with educational attainment using a polygenic scoring (PGS) approach. Analyses were conducted on two independent target samples of young twin adults with neuroimaging data, from Australia (N = 1097) and the USA (N = 723), and found that higher EA-PGS were significantly associated with larger global brain size measures, ICV and TSA (R² = 0.006 and 0.016 respectively, p < 0.001) but not average thickness. At the regional level, we identified seven cortical regions, in the frontal and temporal lobes, that showed variation in surface area and average cortical thickness over and above the global effect. These regions have been robustly implicated in language, memory, visual recognition and cognitive processing. Additionally, we demonstrate that these identified brain regions partly mediate the association between EA-PGS and cognitive test performance. Altogether, these findings advance our understanding of the neurobiology that underpins educational attainment and cognitive ability, providing focus points for future research.
Affiliation(s)
- Brittany L Mitchell
  - Department of Genetics & Computational Biology, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia; School of Biomedical Sciences, Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane, QLD, Australia
- Gabriel Cuéllar-Partida
  - The University of Queensland Diamantina Institute, The University of Queensland, Brisbane, QLD, Australia
- Katrina L Grasby
  - Department of Genetics & Computational Biology, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
- Adrian I Campos
  - Department of Genetics & Computational Biology, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia; Faculty of Medicine, The University of Queensland, Brisbane, QLD, Australia
- Lachlan T Strike
  - Queensland Brain Institute, The University of Queensland, Brisbane, QLD, Australia
- Liang-Dar Hwang
  - The University of Queensland Diamantina Institute, The University of Queensland, Brisbane, QLD, Australia
- Aysu Okbay
  - Department of Economics, School of Business and Economics, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands
- Paul M Thompson
  - Imaging Genetics Center, Mark & Mary Stevens Institute for Neuroimaging & Informatics, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Sarah E Medland
  - Department of Genetics & Computational Biology, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
- Nicholas G Martin
  - Department of Genetics & Computational Biology, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia; School of Biomedical Sciences, Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane, QLD, Australia
- Margaret J Wright
  - Queensland Brain Institute, The University of Queensland, Brisbane, QLD, Australia; Centre for Advanced Imaging, The University of Queensland, Brisbane, QLD, Australia
- Miguel E Rentería
  - Department of Genetics & Computational Biology, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia; School of Biomedical Sciences, Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane, QLD, Australia
|
20
|
Chaichoompu K, Abegaz F, Cavadas B, Fernandes V, Müller-Myhsok B, Pereira L, Van Steen K. A different view on fine-scale population structure in Western African populations. Hum Genet 2020; 139:45-59. [PMID: 31630246 PMCID: PMC6942040 DOI: 10.1007/s00439-019-02069-7] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 11/16/2018] [Accepted: 10/09/2019] [Indexed: 01/03/2023]
Abstract
Owing to their long evolutionary history, Africans exhibit more genetic variation than any other population in the world. This diversity lends itself to subdividing Africans into groups of genetically similar individuals at varying degrees of granularity. It remains challenging to detect fine-scale structure in a computationally efficient and meaningful way. In this paper, we present a proof of concept of a novel fine-scale population structure detection tool using Western African samples. These samples comprise 1396 individuals from 25 ethnic groups (two groups are African American descendants). The strategy is based on a recently developed tool called IPCAPS. IPCAPS, or Iterative Pruning to CApture Population Structure, is a genetic divisive clustering strategy that enhances iterative pruning PCA, is robust to outliers, and does not require a priori computation of haplotypes. Our strategy identified 12 groups in total, and 6 groups were revealed as fine-scale structure detected in the samples from Cameroon, Gambia, Mali, Southwest USA, and Barbados. Our findings help to explain evolutionary processes in the analyzed West African samples and raise awareness of fine-scale structure resolution when conducting genome-wide association and interaction studies.
Affiliation(s)
- Kridsadakorn Chaichoompu
- GIGA-R Medical Genomics-BIO3, University of Liege, Avenue de l’Hôpital 11, 4000 Liege, Belgium
- Max Planck Institute of Psychiatry, 80804 Munich, Germany
| | - Fentaw Abegaz
- GIGA-R Medical Genomics-BIO3, University of Liege, Avenue de l’Hôpital 11, 4000 Liege, Belgium
| | - Bruno Cavadas
- Instituto de Investigação e Inovação em Saúde, Universidade do Porto (i3S), Rua Alfredo Allen, 208, 4200-135 Porto, Portugal
- Instituto de Patologia e Imunologia Molecular da Universidade do Porto (IPATIMUP), Rua Júlio Amaral de Carvalho, 45, 4200-135 Porto, Portugal
| | - Verónica Fernandes
- Instituto de Investigação e Inovação em Saúde, Universidade do Porto (i3S), Rua Alfredo Allen, 208, 4200-135 Porto, Portugal
- Instituto de Patologia e Imunologia Molecular da Universidade do Porto (IPATIMUP), Rua Júlio Amaral de Carvalho, 45, 4200-135 Porto, Portugal
| | | | - Luísa Pereira
- Instituto de Investigação e Inovação em Saúde, Universidade do Porto (i3S), Rua Alfredo Allen, 208, 4200-135 Porto, Portugal
- Instituto de Patologia e Imunologia Molecular da Universidade do Porto (IPATIMUP), Rua Júlio Amaral de Carvalho, 45, 4200-135 Porto, Portugal
| | - Kristel Van Steen
- GIGA-R Medical Genomics-BIO3, University of Liege, Avenue de l’Hôpital 11, 4000 Liege, Belgium
- WELBIO (Walloon Excellence in Lifesciences and Biotechnology), Avenue Pasteur 6, 1300 Wavre, Belgium
| |
|
21
|
Population Structure and Genetic Diversity of Italian Beef Breeds as a Tool for Planning Conservation and Selection Strategies. Animals (Basel) 2019; 9:ani9110880. [PMID: 31671823 PMCID: PMC6912484 DOI: 10.3390/ani9110880] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 08/20/2019] [Revised: 10/18/2019] [Accepted: 10/22/2019] [Indexed: 02/01/2023]
Abstract
The aim was to investigate the population structure of eight beef breeds: three local Tuscan breeds at risk of extinction, Calvana (CAL), Mucca Pisana (MUP), and Pontremolese (PON); three local unselected breeds reared in Sardinia, Sarda (SAR), Sardo Bruna (SAB), and Sardo Modicana (SAM); and two cosmopolitan breeds, Charolais (CHA) and Limousine (LIM), reared in the same regions. The effective population size ranged from 14.62 (PON) to 39.79 (SAM) in the local breeds and was 90.29 for CHA and 135.65 for LIM. The average inbreeding coefficients were higher in the Tuscan breeds (7.25%, 5.10%, and 3.64% for MUP, CAL, and PON, respectively) than in the Sardinian breeds (1.23%, 1.66%, and 1.90% for SAB, SAM, and SAR, respectively), while for CHA and LIM they were <1%. The highest rates of mating between half-siblings were observed for CAL and MUP (~9% and 6.5%, respectively), while the highest rate of parent-offspring mating was ~8% for MUP. Our findings describe the urgent situation of the three Tuscan breeds and support the application of conservation measures and/or the development of breeding programs. Development of breeding strategies is suggested for the Sardinian breeds.
|