1
Nikolitsa EK, Kontou PI, Bagos PG. metacp: a versatile software package for combining dependent or independent p-values. BMC Bioinformatics 2025; 26:109. [PMID: 40253343 PMCID: PMC12008841 DOI: 10.1186/s12859-025-06126-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2025] [Accepted: 04/01/2025] [Indexed: 04/21/2025]
Abstract
BACKGROUND: We present metacp, an open-source software package that implements a wide range of statistical methods for combining both independent p-values (with methods such as Fisher's, Stouffer's and Edgington's) and dependent p-values (with methods such as Brown's method and the Cauchy Combination Test). RESULTS: The tool is available in Python and STATA, is very fast, and is easy to use, requiring only minimal input. It offers a useful resource for combining both independent and dependent p-values, responding to the diverse analytical needs of practitioners performing meta-analyses and of bioinformaticians developing tools for a variety of applications. Depending on the input data, it can be used for gene-based testing, for the analysis of multiple traits in GWAS, or for combining diverse multi-omics data such as those of a TWAS, a colocalization or an RNA-seq study. CONCLUSIONS: Compared to other similar packages (like poolr or metap), metacp implements the largest collection of statistical methods for this problem, offering users the flexibility to choose from a wide variety of approaches. Being available both as a standalone Python tool and as a STATA command, metacp is accessible to a broad and diverse audience, including practitioners conducting meta-analyses across various fields and bioinformaticians developing new tools where p-value combination is a crucial component.
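For orientation, three of the methods named in this abstract can be sketched with the Python standard library alone. This is an illustrative sketch, not the metacp API: the function names `fisher`, `stouffer` and `cauchy_combination` are hypothetical, the inputs are assumed to be one-sided p-values strictly between 0 and 1, and Fisher's and Stouffer's methods are shown in their unweighted forms, which assume independent p-values. The Cauchy Combination Test is the one of the three designed to tolerate dependence.

```python
import math
from statistics import NormalDist


def fisher(pvalues):
    """Fisher's method: -2 * sum(ln p) follows chi-square with 2k df under H0."""
    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # equals the statistic / 2
    # Closed-form chi-square survival function for an even number of df (2k):
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


def stouffer(pvalues):
    """Stouffer's method (unweighted): sum of z-scores divided by sqrt(k)."""
    nd = NormalDist()
    z = sum(nd.inv_cdf(1.0 - p) for p in pvalues) / math.sqrt(len(pvalues))
    return 1.0 - nd.cdf(z)


def cauchy_combination(pvalues, weights=None):
    """Cauchy Combination Test; weights are non-negative and sum to 1."""
    k = len(pvalues)
    if weights is None:
        weights = [1.0 / k] * k  # equal weights by default (an assumption)
    t = sum(w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, pvalues))
    return 0.5 - math.atan(t) / math.pi
```

Each function takes a list of p-values and returns a single combined p-value, for example `fisher([0.01, 0.20, 0.03])`. With a single input, all three return that p-value unchanged, which is a convenient sanity check.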
Affiliation(s)
- Evgenia K Nikolitsa
- Department of Computer Science and Biomedical Informatics, University of Thessaly, 35100, Lamia, Greece
- Pantelis G Bagos
- Department of Computer Science and Biomedical Informatics, University of Thessaly, 35100, Lamia, Greece.
2
Tong H, Guo X, Jacques M, Luo Q, Eynon N, Teschendorff AE. Cell-type specific epigenetic clocks to quantify biological age at cell-type resolution. Aging (Albany NY) 2024; 16:13452-13504. [PMID: 39760516 PMCID: PMC11723652 DOI: 10.18632/aging.206184] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2024] [Accepted: 12/12/2024] [Indexed: 01/07/2025]
Abstract
The ability to accurately quantify biological age could help monitor and control healthy aging. Epigenetic clocks have emerged as promising tools for estimating biological age, yet they have been developed from heterogeneous bulk tissues, and are thus composites of two aging processes, one reflecting the change of cell-type composition with age and another reflecting the aging of individual cell-types. There is thus a need to dissect and quantify these two components of epigenetic clocks, and to develop epigenetic clocks that can yield biological age estimates at cell-type resolution. Here we demonstrate that in blood and brain, approximately 39% and 12% of an epigenetic clock's accuracy is driven by underlying shifts in lymphocyte and neuronal subsets, respectively. Using brain and liver tissue as prototypes, we build and validate neuron and hepatocyte specific DNA methylation clocks, and demonstrate that these cell-type specific clocks yield improved estimates of chronological age in the corresponding cell and tissue-types. We find that neuron and glia specific clocks display biological age acceleration in Alzheimer's Disease with the effect being strongest for glia in the temporal lobe. Moreover, CpGs from these clocks display a small but significant overlap with the causal DamAge-clock, mapping to key genes implicated in neurodegeneration. The hepatocyte clock is found accelerated in liver under various pathological conditions. In contrast, non-cell-type specific clocks do not display biological age-acceleration, or only do so marginally. In summary, this work highlights the importance of dissecting epigenetic clocks and quantifying biological age at cell-type resolution.
Affiliation(s)
- Huige Tong
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institute for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Xiaolong Guo
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institute for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Macsue Jacques
- Australian Regenerative Medicine Institute (ARMI), Faculty of Medicine, Nursing and Health Sciences, Monash University, Clayton, Victoria 3800, Australia
- Qi Luo
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institute for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Nir Eynon
- Australian Regenerative Medicine Institute (ARMI), Faculty of Medicine, Nursing and Health Sciences, Monash University, Clayton, Victoria 3800, Australia
- Andrew E. Teschendorff
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institute for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
3
He L, Sui Y, Che Y, Wang H, Rashid KY, Cloutier S, You FM. Genome-wide association studies using multi-models and multi-SNP datasets provide new insights into pasmo resistance in flax. Front Plant Sci 2023; 14:1229457. [PMID: 37954993 PMCID: PMC10634603 DOI: 10.3389/fpls.2023.1229457] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Accepted: 07/24/2023] [Indexed: 11/14/2023]
Abstract
Introduction: Flax (Linum usitatissimum L.) is an economically important crop due to its oil and fiber. However, it is prone to various diseases, including pasmo caused by the fungus Septoria linicola. Methods: In this study, we conducted field evaluations of 445 flax accessions over a five-year period (2012-2016) to assess their resistance to pasmo. A total of 246,035 single nucleotide polymorphisms (SNPs) were used for genetic analysis. Four statistical models, including the single-locus model GEMMA and the multi-locus models FarmCPU, mrMLM, and 3VmrMLM, were assessed to identify quantitative trait nucleotides (QTNs) associated with pasmo resistance. Results: We identified 372 significant QTNs or 132 tag QTNs associated with pasmo resistance from five pasmo resistance datasets (PAS2012-PAS2016 and the 5-year average, namely PASmean) and three genotypic datasets (all SNPs/ALL, gene-based SNPs/GB and RGA-based SNPs/RGAB). The tag QTNs had R² values of 0.66-16.98% from the ALL SNP dataset, 0.68-20.54% from the GB SNP dataset, and 0.52-22.42% from the RGAB SNP dataset. Of these tag QTNs, 93 were novel. Additionally, 37 resistance gene analogs (RGAs) co-localizing with 39 tag QTNs were considered potential candidates for controlling pasmo resistance in flax, and 50 QTN-by-environment interactions (QEIs) were identified to account for gene-by-environment interactions. Nine RGAs were predicted as candidate genes for ten QEIs. Discussion: Our results suggest that pasmo resistance in flax is polygenic and potentially influenced by environmental factors. The identified QTNs provide potential targets for improving pasmo resistance in flax breeding programs. This study sheds light on the genetic basis of pasmo resistance and highlights the importance of considering both genetic and environmental factors in breeding programs for flax.
Affiliation(s)
- Liqiang He
- Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, ON, Canada
- School of Tropical Agriculture and Forestry, School of Tropical Crops, Hainan University, Haikou, China
- Yao Sui
- School of Tropical Agriculture and Forestry, School of Tropical Crops, Hainan University, Haikou, China
- Yanru Che
- School of Tropical Agriculture and Forestry, School of Tropical Crops, Hainan University, Haikou, China
- Huixian Wang
- School of Tropical Agriculture and Forestry, School of Tropical Crops, Hainan University, Haikou, China
- Khalid Y. Rashid
- Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, ON, Canada
- Sylvie Cloutier
- Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, ON, Canada
- Frank M. You
- Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, ON, Canada
4
Sajal IH, Biswas S. Bivariate quantitative Bayesian LASSO for detecting association of rare haplotypes with two correlated continuous phenotypes. Front Genet 2023; 14:1104727. [PMID: 36968609 PMCID: PMC10033866 DOI: 10.3389/fgene.2023.1104727] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Accepted: 02/21/2023] [Indexed: 03/12/2023]
Abstract
In genetic association studies, the multivariate analysis of correlated phenotypes offers statistical and biological advantages compared to analyzing one phenotype at a time. The joint analysis utilizes additional information contained in the correlation and avoids multiple testing. It also provides an opportunity to investigate and understand shared genetic mechanisms of multiple phenotypes. Bivariate logistic Bayesian LASSO (LBL) was proposed earlier to detect rare haplotypes associated with two binary phenotypes or one binary and one continuous phenotype jointly. There is currently no haplotype association test available that can handle multiple continuous phenotypes. In this study, by employing the framework of bivariate LBL, we propose bivariate quantitative Bayesian LASSO (QBL) to detect rare haplotypes associated with two continuous phenotypes. Bivariate QBL removes unassociated haplotypes by regularizing the regression coefficients and utilizing a latent variable to model correlation between two phenotypes. We carry out extensive simulations to investigate the performance of bivariate QBL and compare it with that of a standard (univariate) haplotype association test, Haplo.score (applied twice to two phenotypes individually). Bivariate QBL performs better than Haplo.score in all simulations with varying degrees of power gain. We analyze Genetic Analysis Workshop 19 exome sequencing data on systolic and diastolic blood pressures and detect several rare haplotypes associated with the two phenotypes.
Affiliation(s)
- Swati Biswas
- Department of Mathematical Sciences, University of Texas at Dallas, Richardson, TX, United States
5
Belonogova NM, Svishcheva GR, Kirichenko AV, Zorkoltseva IV, Tsepilov YA, Axenovich TI. sumSTAAR: A flexible framework for gene-based association studies using GWAS summary statistics. PLoS Comput Biol 2022; 18:e1010172. [PMID: 35653402 PMCID: PMC9197066 DOI: 10.1371/journal.pcbi.1010172] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2021] [Revised: 06/14/2022] [Accepted: 05/05/2022] [Indexed: 11/19/2022]
Abstract
Gene-based association analysis is an effective gene-mapping tool. Many gene-based methods have been proposed recently. However, their power depends on the underlying genetic architecture, which is rarely known in complex traits, and so it is likely that a combination of such methods could serve as a universal approach. Several frameworks combining different gene-based methods have been developed. However, they all imply a fixed set of methods, weights and functional annotations. Moreover, most of them use individual phenotypes and genotypes as input data. Here, we introduce sumSTAAR, a framework for gene-based association analysis using summary statistics obtained from genome-wide association studies (GWAS). It is an extended and modified version of the STAAR framework proposed by Li and colleagues in 2020. The sumSTAAR framework offers a wider range of gene-based methods to combine. It allows the user to arbitrarily define a set of these methods, weighting functions and probabilities of genetic variants being causal. The methods used in the framework were adapted to analyse genes with a large number of SNPs in order to decrease the running time. The framework includes a polygene pruning procedure to guard against the influence of strong GWAS signals outside the gene. We also present new, improved matrices of correlations between the genotypes of variants within genes. These matrices, estimated on a sample of 265,000 individuals, are a state-of-the-art replacement for the widely used matrices based on the 1000 Genomes Project data.
Affiliation(s)
- Nadezhda M. Belonogova
- Laboratory of Segregation and Recombination Analyses, Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
- Gulnara R. Svishcheva
- Laboratory of Segregation and Recombination Analyses, Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
- Laboratory of Animal Genetics, Vavilov Institute of General Genetics, the Russian Academy of Sciences, Moscow, Russia
- Anatoly V. Kirichenko
- Laboratory of Segregation and Recombination Analyses, Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
- Irina V. Zorkoltseva
- Laboratory of Segregation and Recombination Analyses, Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
- Yakov A. Tsepilov
- Laboratory of Segregation and Recombination Analyses, Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
- Department of Natural Sciences, Novosibirsk State University, Novosibirsk, Russia
- Tatiana I. Axenovich
- Laboratory of Segregation and Recombination Analyses, Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
6
Adam Y, Samtal C, Brandenburg JT, Falola O, Adebiyi E. Performing post-genome-wide association study analysis: overview, challenges and recommendations. F1000Res 2021; 10:1002. [PMID: 35222990 PMCID: PMC8847724 DOI: 10.12688/f1000research.53962.1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 09/22/2021] [Indexed: 12/17/2022]
Abstract
Genome-wide association studies (GWAS) provide a wealth of information on statistically significant single-nucleotide polymorphisms (SNPs) associated with various human complex traits and diseases. By performing GWAS, scientists have successfully identified the association of hundreds of thousands to millions of SNPs with a single phenotype. Moreover, the association of some SNPs with rare diseases has been intensively tested. However, classic GWAS have not yet provided solid insight into the functional and biological mechanisms underlying phenotypes or diseases. Therefore, several post-GWAS (pGWAS) methods have been recommended. Currently, there is no simple scientific document that provides a quick guide for performing pGWAS analysis. pGWAS is a crucial step towards a better understanding of the biological machinery beyond the SNPs. Here, we provide an overview of performing pGWAS analysis and demonstrate the challenges behind each method. Furthermore, we direct readers to key articles for each pGWAS method and present the overall issues in pGWAS analysis. Finally, we include a custom pGWAS pipeline to guide new users when performing their research.
Affiliation(s)
- Yagoub Adam
- Covenant University Bioinformatics Research (CUBRe), Covenant University, Ota, Ogun, 112233, Nigeria
- Chaimae Samtal
- Laboratory of Biotechnology, Environment, Agri-food and Health, Sidi Mohammed Ben Abdellah University, Fez, Fez-Meknes, 30000, Morocco
- Jean-Tristan Brandenburg
- Sydney Brenner Institute for Molecular Bioscience (SBIMB), University of the Witwatersrand, Johannesburg, South Africa
- Oluwadamilare Falola
- Laboratory of Biotechnology, Environment, Agri-food and Health, Sidi Mohammed Ben Abdellah University, Fez, Fez-Meknes, 30000, Morocco
- Ezekiel Adebiyi
- Covenant University Bioinformatics Research (CUBRe), Covenant University, Ota, Ogun, 112233, Nigeria
- Computer & Information Sciences, Covenant University, Ota, Ogun, 112233, Nigeria
- Covenant Applied Informatics and Communication Africa Centre of Excellence, Covenant University, Ota, Ogun, 112233, Nigeria
- Applied Bioinformatics Division, German Cancer Center DKFZ - Heidelberg University, Heidelberg, Baden-Württemberg, 69120, Germany