1
Zheng Y, Lunetta KL, Liu C, Smith AK, Sherva R, Miller MW, Logue MW. A novel principal component based method for identifying differentially methylated regions in Illumina Infinium MethylationEPIC BeadChip data. Epigenetics 2023; 18:2207959. [PMID: 37196182 PMCID: PMC10193914 DOI: 10.1080/15592294.2023.2207959] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2022] [Revised: 03/22/2023] [Accepted: 04/19/2023] [Indexed: 05/19/2023] Open
Abstract
Differentially methylated regions (DMRs) are genomic regions with methylation patterns across multiple CpG sites that are associated with a phenotype. In this study, we proposed a Principal Component (PC) based DMR analysis method for use with data generated using the Illumina Infinium MethylationEPIC BeadChip (EPIC) array. We obtained methylation residuals by regressing the M-values of CpGs within a region on covariates, extracted PCs of the residuals, and then combined association information across PCs to obtain regional significance. Simulation-based genome-wide false positive (GFP) rates and true positive rates were estimated under a variety of conditions before determining the final version of our method, which we have named DMRPC. Then, DMRPC and another DMR method, coMethDMR, were used to perform epigenome-wide analyses of several phenotypes known to have multiple associated methylation loci (age, sex, and smoking) in a discovery and a replication cohort. Among regions that were analysed by both methods, DMRPC identified 50% more genome-wide significant age-associated DMRs than coMethDMR. The replication rate for the loci that were identified by only DMRPC was higher than the rate for those that were identified by only coMethDMR (90% for DMRPC vs. 76% for coMethDMR). Furthermore, DMRPC identified replicable associations in regions of moderate between-CpG correlation which are typically not analysed by coMethDMR. For the analyses of sex and smoking, the advantage of DMRPC was less clear. In conclusion, DMRPC is a new powerful DMR discovery tool that retains power in genomic regions with moderate correlation across CpGs.
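The last step described in this abstract, combining association evidence across PCs into a single regional p-value, can be sketched as follows. The abstract does not say which combination rule DMRPC uses, so this sketch substitutes Fisher's method purely as an illustration. It assumes the per-PC tests are roughly independent (PC scores are uncorrelated by construction); the function name `fisher_combine` and the example p-values are hypothetical.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null. For even degrees of freedom
    the upper tail has a closed form:
        P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    half_x = -sum(math.log(p) for p in p_values)  # equals X / 2

    # Accumulate the finite series term by term to avoid large factorials.
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return min(1.0, math.exp(-half_x) * total)


# Hypothetical per-PC association p-values for one region.
regional_p = fisher_combine([0.01, 0.20, 0.65])
```

A single p-value is returned unchanged by this rule, which is a quick sanity check on the implementation.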
Affiliation(s)
- Yuanchao Zheng
  - National Center for PTSD, VA Boston Healthcare System, Boston, MA, USA
  - Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- Kathryn L. Lunetta
  - Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- Chunyu Liu
  - Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- Alicia K. Smith
  - Department of Gynecology and Obstetrics, Emory University, Atlanta, GA, USA
  - Department of Psychiatry and Behavioral Sciences, Emory University School of Medicine, Atlanta, GA, USA
- Richard Sherva
  - National Center for PTSD, VA Boston Healthcare System, Boston, MA, USA
  - Department of Psychiatry, Boston University School of Medicine, Boston, MA, USA
- Mark W. Miller
  - National Center for PTSD, VA Boston Healthcare System, Boston, MA, USA
  - Biomedical Genetics, Boston University School of Medicine, Boston, MA, USA
- Mark W. Logue
  - National Center for PTSD, VA Boston Healthcare System, Boston, MA, USA
  - Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
  - Department of Psychiatry, Boston University School of Medicine, Boston, MA, USA
  - Biomedical Genetics, Boston University School of Medicine, Boston, MA, USA
2
Genetic and environment effects on structural neuroimaging endophenotype for bipolar disorder: a novel molecular approach. Transl Psychiatry 2022; 12:137. [PMID: 35379780 PMCID: PMC8980067 DOI: 10.1038/s41398-022-01892-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/05/2021] [Revised: 03/03/2022] [Accepted: 03/10/2022] [Indexed: 12/15/2022] Open
Abstract
We investigated gene-environment effects on structural brain endophenotype in bipolar disorder (BD) using a novel method that combines polygenic risk scores with epigenetic signatures, since traditional methods of examining family history and trauma effects have significant limitations. The study enrolled 119 subjects, including 55 BD spectrum (BDS) subjects diagnosed with BD or major depressive disorder (MDD) with subthreshold BD symptoms and 64 non-BDS subjects comprising 32 MDD subjects without BD symptoms and 32 healthy subjects. The blood samples underwent genome-wide genotyping and methylation quantification. We derived a polygenic risk score (PRS) and a methylation profile score (MPS) as weighted summations of risk single nucleotide polymorphisms and methylation probes, respectively, which were considered molecular measures of genetic and environmental risks for BD. Linear regression was used to relate PRS, MPS, and their interaction to 44 brain structure measures quantified from magnetic resonance imaging (MRI) on 47 BDS subjects, and the results were compared with those based on family history and childhood trauma. After multiplicity corrections using false discovery rate (FDR), MPS was found to be negatively associated with the volume of the medial geniculate thalamus (FDR = 0.059, partial R² = 0.208). Family history, trauma scale, and PRS were not associated with any brain measures. PRS and MPS showed significant interactions on whole putamen (FDR = 0.09, partial R² = 0.337). No significant gene-environment interactions were identified for family history and the trauma scale. PRS and MPS generally explained greater proportions of the variance of the brain measures (range of partial R² = [0.008, 0.337]) than the clinical risk factors (range = [0.004, 0.228]).
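The "weighted summation" used for both scores can be illustrated with a small sketch. The marker identifiers, weights, and measurements below are invented for illustration and are not taken from the study; the helper name `weighted_score` is likewise hypothetical.

```python
def weighted_score(measurements, weights):
    """Return sum(weight * measurement) over the markers listed in `weights`.

    `measurements` maps marker id -> observed value (risk-allele count for a
    PRS, methylation level for an MPS); `weights` maps marker id -> weight.
    """
    missing = [marker for marker in weights if marker not in measurements]
    if missing:
        raise KeyError(f"missing markers: {missing}")
    return sum(weights[marker] * measurements[marker] for marker in weights)


# Hypothetical subject: risk-allele counts (0, 1, or 2) at three SNPs.
genotypes = {"rsA": 2, "rsB": 0, "rsC": 1}
snp_weights = {"rsA": 0.10, "rsB": 0.25, "rsC": -0.05}
prs = weighted_score(genotypes, snp_weights)

# Same subject: methylation levels at two probes.
methylation = {"cgA": 0.8, "cgB": 0.3}
probe_weights = {"cgA": 1.5, "cgB": -2.0}
mps = weighted_score(methylation, probe_weights)
```

In practice the weights would come from an external discovery study, and the resulting scores would then enter the regression as predictors.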
3
Juvinao-Quintero DL, Cardenas A, Perron P, Bouchard L, Lutz SM, Hivert MF. Associations between an integrated component of maternal glycemic regulation in pregnancy and cord blood DNA methylation. Epigenomics 2021; 13:1459-1472. [PMID: 34596421 DOI: 10.2217/epi-2021-0220] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/21/2022] Open
Abstract
Background: Previous studies suggest that fetal programming to hyperglycemia in pregnancy is due to modulation of DNA methylation (DNAm), but they have been limited in their maternal glycemic characterization. Methods: In the Gen3G study, we used a principal component analysis to integrate multiple glucose and insulin values measured during the second trimester oral glucose tolerance test. We investigated associations between principal components and cord blood DNAm levels in an epigenome-wide analysis among 430 mother-child pairs. Results: The first principal component was robustly associated with lower DNAm at cg26974062 (TXNIP; p = 9.9 × 10⁻⁹) in cord blood. TXNIP is a well-known DNAm marker for type 2 diabetes in adults. Conclusion: We hypothesize that abnormal glucose metabolism in pregnancy may program dysregulation of TXNIP across the life course.
Affiliation(s)
- Diana L Juvinao-Quintero
  - Division of Chronic Disease Research Across the Life Course, Department of Population Medicine, Harvard Pilgrim Health Care Institute, Harvard Medical School, Boston, MA 02215, USA
- Andres Cardenas
  - Division of Environmental Health Sciences, School of Public Health & Center for Computational Biology, University of California, Berkeley, CA 94720-7360, USA
- Patrice Perron
  - Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, QC, J1H 5N4, Canada
  - Department of Medicine, Université de Sherbrooke, Sherbrooke, QC, J1H 5N4, Canada
- Luigi Bouchard
  - Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, QC, J1H 5N4, Canada
  - Department of Medical Biology, Centre Intégré Universitaire en Santé et Services Sociaux Saguenay-Lac-Saint-Jean, Hôpital Universitaire de Chicoutimi, Saguenay, QC, G7H 5H6, Canada
  - Department of Biochemistry & Functional Genomics, Université de Sherbrooke, Sherbrooke, QC, J1K 2R1, Canada
- Sharon M Lutz
  - Division of Chronic Disease Research Across the Life Course, Department of Population Medicine, Harvard Pilgrim Health Care Institute, Harvard Medical School, Boston, MA 02215, USA
  - Department of Biostatistics, Harvard TH Chan School of Public Health, Boston, MA 02215, USA
- Marie-France Hivert
  - Division of Chronic Disease Research Across the Life Course, Department of Population Medicine, Harvard Pilgrim Health Care Institute, Harvard Medical School, Boston, MA 02215, USA
  - Department of Medicine, Université de Sherbrooke, Sherbrooke, QC, J1H 5N4, Canada
  - Diabetes Unit, Massachusetts General Hospital, Boston, MA 02114, USA
4
Odom GJ, Ban Y, Colaprico A, Liu L, Silva TC, Sun X, Pico AR, Zhang B, Wang L, Chen X. PathwayPCA: an R/Bioconductor Package for Pathway Based Integrative Analysis of Multi-Omics Data. Proteomics 2020; 20:e1900409. [PMID: 32430990 PMCID: PMC7677175 DOI: 10.1002/pmic.201900409] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 01/31/2020] [Revised: 05/01/2020] [Indexed: 01/01/2023]
Abstract
The authors present pathwayPCA, an R/Bioconductor package for integrative pathway analysis that utilizes modern statistical methodology, including supervised and adaptive, elastic-net, sparse principal component analysis. pathwayPCA can be applied to continuous, binary, and survival outcomes in studies with multiple covariates and/or interaction effects. It outperforms several alternative methods at identifying disease-associated pathways in integrative analysis using both simulated and real datasets. In addition, several case studies are provided to illustrate pathwayPCA analysis with gene selection, estimating, and visualizing sample-specific pathway activities, identifying sex-specific pathway effects in kidney cancer, and building integrative models for predicting patient prognosis. pathwayPCA is an open-source R package, freely available through the Bioconductor repository. pathwayPCA is expected to be a useful tool for empowering the wider scientific community to analyze and interpret the wealth of available proteomics data, along with other types of molecular data recently made available by Clinical Proteomic Tumor Analysis Consortium and other large consortiums.
Affiliation(s)
- Gabriel J. Odom
  - Department of Biostatistics, Florida International University, Stempel College of Public Health, Miami, FL 33199, USA
  - Division of Biostatistics, Department of Public Health Sciences, University of Miami, Miller School of Medicine, Miami, FL 33136, USA
- Yuguang Ban
  - Sylvester Comprehensive Cancer Center, University of Miami, Miller School of Medicine, Miami, FL 33136, USA
- Antonio Colaprico
  - Division of Biostatistics, Department of Public Health Sciences, University of Miami, Miller School of Medicine, Miami, FL 33136, USA
- Lizhong Liu
  - Division of Biostatistics, Department of Public Health Sciences, University of Miami, Miller School of Medicine, Miami, FL 33136, USA
- Tiago Chedraoui Silva
  - Division of Biostatistics, Department of Public Health Sciences, University of Miami, Miller School of Medicine, Miami, FL 33136, USA
- Xiaodian Sun
  - Sylvester Comprehensive Cancer Center, University of Miami, Miller School of Medicine, Miami, FL 33136, USA
- Alexander R. Pico
  - Institute for Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA 94158, USA
- Bing Zhang
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston TX 77030, USA
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston TX 77030, USA
- Lily Wang
  - Division of Biostatistics, Department of Public Health Sciences, University of Miami, Miller School of Medicine, Miami, FL 33136, USA
  - Sylvester Comprehensive Cancer Center, University of Miami, Miller School of Medicine, Miami, FL 33136, USA
  - Dr. John T Macdonald Foundation Department of Human Genetics, University of Miami, Miller School of Medicine, Miami, FL 33136, USA
  - John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL 33136, USA
- Xi Chen
  - Division of Biostatistics, Department of Public Health Sciences, University of Miami, Miller School of Medicine, Miami, FL 33136, USA
  - Sylvester Comprehensive Cancer Center, University of Miami, Miller School of Medicine, Miami, FL 33136, USA
5
Cerniglia L, Cimino S, Bevilacqua A, Ballarotto G, Marzilli E, Adriani W, Tambelli R. Patterns of DNA methylation at specific loci of the dopamine transporter 1 gene and psychopathological risk in trios of mothers, fathers and children. European Journal of Developmental Psychology 2020. [DOI: 10.1080/17405629.2020.1816166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
Affiliation(s)
- Luca Cerniglia
  - Faculty of Psychology, International Telematic University Uninettuno, Rome, Italy
- Silvia Cimino
  - Department of Dynamic and Clinical Psychology, Sapienza University of Rome, Rome, Italy
- Arturo Bevilacqua
  - Department of Dynamic and Clinical Psychology, Sapienza University of Rome, Rome, Italy
  - Research Center in Neurobiology “Daniel Bovet” (Crin), Rome, Italy
  - Systems Biology Group Lab, Rome, Italy
- Giulia Ballarotto
  - Department of Dynamic and Clinical Psychology, Sapienza University of Rome, Rome, Italy
- Eleonora Marzilli
  - Department of Dynamic and Clinical Psychology, Sapienza University of Rome, Rome, Italy
- Walter Adriani
  - Center for Behavioral Sciences and Mental Health, Istituto Superiore Di Sanità, Rome, Italy
- Renata Tambelli
  - Department of Dynamic and Clinical Psychology, Sapienza University of Rome, Rome, Italy
6
Gomez L, Odom GJ, Young JI, Martin ER, Liu L, Chen X, Griswold AJ, Gao Z, Zhang L, Wang L. coMethDMR: accurate identification of co-methylated and differentially methylated regions in epigenome-wide association studies with continuous phenotypes. Nucleic Acids Res 2019; 47:e98. [PMID: 31291459 PMCID: PMC6753499 DOI: 10.1093/nar/gkz590] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 03/24/2019] [Revised: 06/09/2019] [Accepted: 07/08/2019] [Indexed: 12/12/2022] Open
Abstract
Recent technology has made it possible to measure DNA methylation profiles in a cost-effective and comprehensive genome-wide manner using array-based technology for epigenome-wide association studies. However, identifying differentially methylated regions (DMRs) remains a challenging task because of the complexities in DNA methylation data. Supervised methods typically focus on regions that contain consecutive, highly significant differentially methylated CpGs, but may lack power for detecting small but consistent changes when few CpGs pass a stringent significance threshold after multiple-comparison correction. Unsupervised methods group CpGs based on genomic annotations first and then test them against phenotype, but may lack specificity because the regional boundaries of methylation are often not well defined. We present coMethDMR, a flexible, powerful, and accurate tool for identifying DMRs. Instead of testing all CpGs within a genomic region, coMethDMR carries out an additional step that selects co-methylated sub-regions first. Next, coMethDMR tests association between methylation levels within the sub-region and phenotype via a random coefficient mixed effects model that models both variations between CpG sites within the region and differential methylation simultaneously. coMethDMR offers a well-controlled Type I error rate, improved specificity, and focused testing of targeted genomic regions, and is available as an open-source R package.
Affiliation(s)
- Lissette Gomez
  - John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL 33136, USA
- Gabriel J Odom
  - Division of Biostatistics, Department of Public Health Sciences, University of Miami Miller School of Medicine, Miami, FL 33136, USA
- Juan I Young
  - John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL 33136, USA
  - Dr. John T. Macdonald Foundation, Department of Human Genetics, University of Miami, Miami, FL 33136, USA
- Eden R Martin
  - John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL 33136, USA
  - Dr. John T. Macdonald Foundation, Department of Human Genetics, University of Miami, Miami, FL 33136, USA
- Lizhong Liu
  - Division of Biostatistics, Department of Public Health Sciences, University of Miami Miller School of Medicine, Miami, FL 33136, USA
- Xi Chen
  - Division of Biostatistics, Department of Public Health Sciences, University of Miami Miller School of Medicine, Miami, FL 33136, USA
  - Sylvester Comprehensive Cancer Center, University of Miami Miller School of Medicine, Miami, FL 33136, USA
- Anthony J Griswold
  - John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL 33136, USA
  - Dr. John T. Macdonald Foundation, Department of Human Genetics, University of Miami, Miami, FL 33136, USA
- Zhen Gao
  - Sylvester Comprehensive Cancer Center, University of Miami Miller School of Medicine, Miami, FL 33136, USA
- Lanyu Zhang
  - Division of Biostatistics, Department of Public Health Sciences, University of Miami Miller School of Medicine, Miami, FL 33136, USA
- Lily Wang
  - John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL 33136, USA
  - Division of Biostatistics, Department of Public Health Sciences, University of Miami Miller School of Medicine, Miami, FL 33136, USA
  - Dr. John T. Macdonald Foundation, Department of Human Genetics, University of Miami, Miami, FL 33136, USA
  - Sylvester Comprehensive Cancer Center, University of Miami Miller School of Medicine, Miami, FL 33136, USA
7
Wang B, Lunetta KL, Dupuis J, Lubitz SA, Trinquart L, Yao L, Ellinor PT, Benjamin EJ, Lin H. Integrative Omics Approach to Identifying Genes Associated With Atrial Fibrillation. Circ Res 2019; 126:350-360. [PMID: 31801406 DOI: 10.1161/circresaha.119.315179] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Indexed: 12/12/2022]
Abstract
Rationale: GWAS (Genome-Wide Association Studies) have identified hundreds of genetic loci associated with atrial fibrillation (AF). However, these loci explain only a small proportion of AF heritability. Objective: To develop an approach to identify additional AF-related genes by integrating multiple omics data. Methods and Results: Three types of omics data were integrated: (1) summary statistics from the AFGen 2017 GWAS; (2) a whole blood EWAS (Epigenome-Wide Association Study) of AF; and (3) a whole blood TWAS (Transcriptome-Wide Association Study) of AF. The variant-level GWAS results were collapsed into gene-level associations using fast set-based association analysis. The CpG-level EWAS results were also collapsed into gene-level associations by an adapted SNP-set Kernel Association Test approach. Both GWAS and EWAS gene-based associations were then meta-analyzed with TWAS using a fixed-effects model weighted by the sample size of each data set. A tissue-specific network was subsequently constructed using NetWAS (Network-Wide Association Study). The identified genes were then compared with the AFGen 2018 GWAS, which contained more than triple the number of AF cases in the AFGen 2017 GWAS. We observed that the multiomics approach identified many more relevant AF-related genes than using the AFGen 2018 GWAS alone (1931 versus 206 genes). Many of these genes are involved in the development and regulation of heart- and muscle-related biological processes. Moreover, the gene set identified by the multiomics approach explained much more AF variance than the set identified by GWAS alone (10.4% versus 3.5%). Conclusions: We developed a strategy to integrate multiple omics data to identify AF-related genes. Our integrative approach may be useful to improve the power of traditional GWAS, which might be particularly useful for rare traits and diseases with limited sample size.
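One common way to carry out a sample-size-weighted meta-analysis of gene-level results is the weighted Z-score scheme, sketched below. The abstract does not give the exact formula the authors used, so this is an assumption for illustration only; the function name and the example inputs are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_weighted_z(z_scores, sample_sizes):
    """Combine per-dataset Z-scores using weights w_i = sqrt(n_i).

    Combined Z = sum(w_i * z_i) / sqrt(sum(w_i ** 2)).
    Returns the combined Z and its two-sided normal p-value.
    """
    if not z_scores or len(z_scores) != len(sample_sizes):
        raise ValueError("need matching, non-empty z_scores and sample_sizes")
    if any(n <= 0 for n in sample_sizes):
        raise ValueError("sample sizes must be positive")

    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    combined = numerator / denominator
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(combined)))
    return combined, p_value


# Hypothetical gene-level Z-scores from GWAS, EWAS, and TWAS with the
# (made-up) number of samples behind each data set.
combined_z, combined_p = sample_size_weighted_z(
    z_scores=[2.1, 1.4, 2.6],
    sample_sizes=[60000, 2600, 2400],
)
```

With weights proportional to the square root of the sample size, the largest data set dominates the combined statistic, which is the intended behaviour of this scheme.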
Affiliation(s)
- Biqi Wang
  - Department of Biostatistics (B.W., K.L.L., J.D., L.T.), Boston University School of Public Health, MA
- Kathryn L Lunetta
  - Department of Biostatistics (B.W., K.L.L., J.D., L.T.), Boston University School of Public Health, MA
  - Boston University and National Heart, Lung and Blood Institute's Framingham Heart Study, Framingham, MA (K.L.L., J.D., L.T., E.J.B., H.L.)
- Josée Dupuis
  - Department of Biostatistics (B.W., K.L.L., J.D., L.T.), Boston University School of Public Health, MA
  - Boston University and National Heart, Lung and Blood Institute's Framingham Heart Study, Framingham, MA (K.L.L., J.D., L.T., E.J.B., H.L.)
- Steven A Lubitz
  - Department of Epidemiology (E.J.B.), Boston University School of Public Health, MA
  - Boston University and National Heart, Lung and Blood Institute's Framingham Heart Study, Framingham, MA (K.L.L., J.D., L.T., E.J.B., H.L.)
  - Cardiac Arrhythmia Service (S.A.L., P.T.E.), Massachusetts General Hospital, Boston
  - Program in Medical and Population Genetics, The Broad Institute of Harvard and MIT, Cambridge, MA (S.A.L., P.T.E.)
- Ludovic Trinquart
  - Department of Biostatistics (B.W., K.L.L., J.D., L.T.), Boston University School of Public Health, MA
  - Boston University and National Heart, Lung and Blood Institute's Framingham Heart Study, Framingham, MA (K.L.L., J.D., L.T., E.J.B., H.L.)
- Lixia Yao
  - Department of Health Sciences Research, Mayo Clinic, Rochester, MN (L.Y.), Boston University School of Medicine, MA
- Patrick T Ellinor
  - Boston University and National Heart, Lung and Blood Institute's Framingham Heart Study, Framingham, MA (K.L.L., J.D., L.T., E.J.B., H.L.)
  - Cardiovascular Research Center (S.A.L., P.T.E.), Massachusetts General Hospital, Boston
  - Cardiac Arrhythmia Service (S.A.L., P.T.E.), Massachusetts General Hospital, Boston
  - Program in Medical and Population Genetics, The Broad Institute of Harvard and MIT, Cambridge, MA (S.A.L., P.T.E.)
- Emelia J Benjamin
  - Department of Epidemiology (E.J.B.), Boston University School of Public Health, MA
  - Boston University and National Heart, Lung and Blood Institute's Framingham Heart Study, Framingham, MA (K.L.L., J.D., L.T., E.J.B., H.L.)
  - Department of Medicine, Sections of Preventive Medicine and Cardiovascular Medicine (E.J.B.), Boston University School of Medicine, MA
- Honghuang Lin
  - Boston University and National Heart, Lung and Blood Institute's Framingham Heart Study, Framingham, MA (K.L.L., J.D., L.T., E.J.B., H.L.)
  - Department of Medicine, Section of Computational Biomedicine (H.L.), Boston University School of Medicine, MA
8
Mallik S, Odom GJ, Gao Z, Gomez L, Chen X, Wang L. An evaluation of supervised methods for identifying differentially methylated regions in Illumina methylation arrays. Brief Bioinform 2019; 20:2224-2235. [PMID: 30239597 PMCID: PMC6954393 DOI: 10.1093/bib/bby085] [Citation(s) in RCA: 60] [Impact Index Per Article: 12.0] [Received: 05/10/2018] [Revised: 07/24/2018] [Accepted: 08/16/2018] [Indexed: 01/19/2023] Open
Abstract
Epigenome-wide association studies (EWASs) have become increasingly popular for studying DNA methylation (DNAm) variations in complex diseases. The Illumina methylation arrays provide an economical, high-throughput and comprehensive platform for measuring methylation status in EWASs. A number of software tools have been developed for identifying disease-associated differentially methylated regions (DMRs) in the epigenome. However, in practice, we found these tools typically had multiple parameter settings that needed to be specified and the performance of the software tools under different parameters was often unclear. To help users better understand and choose optimal parameter settings when using DNAm analysis tools, we conducted a comprehensive evaluation of 4 popular DMR analysis tools under 60 different parameter settings. In addition to evaluating power, precision, area under precision-recall curve, Matthews correlation coefficient, F1 score and type I error rate, we also compared several additional characteristics of the analysis results, including the size of the DMRs, overlap between the methods and execution time. The results showed that none of the software tools performed best under their default parameter settings, and power varied widely when parameters were changed. Overall, the precision of these software tools was good. In contrast, all methods lacked power when effect size was consistent but small. Across all simulation scenarios, comb-p consistently had the best sensitivity as well as good control of false-positive rate.
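Several of the evaluation measures named in this abstract follow directly from a confusion table of detected versus simulated DMRs. The sketch below uses the standard textbook definitions; how the authors counted true and false positives at the region level is not stated here, so the counts in the example are hypothetical.

```python
import math


def classification_metrics(tp, fp, fn, tn):
    """Compute precision, recall (power), F1, MCC and false-positive rate.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives and true negatives. Undefined ratios are reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2.0 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denominator if mcc_denominator else 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
        "false_positive_rate": false_positive_rate,
    }


# Hypothetical simulation: 10 true DMRs among 100 candidate regions.
metrics = classification_metrics(tp=8, fp=2, fn=2, tn=88)
```

Recall here corresponds to what the abstract calls power or sensitivity.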
Affiliation(s)
- Saurav Mallik
  - Division of Biostatistics, Department of Public Health Sciences, University of Miami, Miller School of Medicine, Miami, FL, USA
  - Joint First Authors
- Gabriel J Odom
  - Division of Biostatistics, Department of Public Health Sciences, University of Miami, Miller School of Medicine, Miami, FL, USA
  - Joint First Authors
- Zhen Gao
  - Sylvester Comprehensive Cancer Center, University of Miami Miller School of Medicine, Miami, FL, USA
- Lissette Gomez
  - Dr. John T. Macdonald Foundation, Department of Human Genetics, and John P. Hussman Institute for Human Genomics, University of Miami, Miami, FL, USA
- Xi Chen
  - Division of Biostatistics, Department of Public Health Sciences, University of Miami, Miller School of Medicine, Miami, FL, USA
  - Sylvester Comprehensive Cancer Center, University of Miami Miller School of Medicine, Miami, FL, USA
- Lily Wang
  - Division of Biostatistics, Department of Public Health Sciences, University of Miami, Miller School of Medicine, Miami, FL, USA
  - Sylvester Comprehensive Cancer Center, University of Miami Miller School of Medicine, Miami, FL, USA
  - Dr. John T. Macdonald Foundation, Department of Human Genetics, and John P. Hussman Institute for Human Genomics, University of Miami, Miami, FL, USA
9
Lent S, Xu H, Wang L, Wang Z, Sarnowski C, Hivert MF, Dupuis J. Comparison of novel and existing methods for detecting differentially methylated regions. BMC Genet 2018; 19:84. [PMID: 30255775 PMCID: PMC6156895 DOI: 10.1186/s12863-018-0637-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 02/02/2023] Open
Abstract
BACKGROUND Single-probe analyses in epigenome-wide association studies (EWAS) have identified associations between DNA methylation and many phenotypes, but do not take into account information from neighboring probes. Methods to detect differentially methylated regions (DMRs) (clusters of neighboring probes associated with a phenotype) may provide more power to detect associations between DNA methylation and diseases or phenotypes of interest. RESULTS We proposed a novel approach, GlobalP, and compared it with 3 methods (DMRcate, Bumphunter, and comb-p) for identifying DMRs associated with log triglycerides (TGs) in real GAW20 data before and after fenofibrate treatment. We applied these methods to the summary statistics from an EWAS performed on the methylation data. Comb-p, DMRcate, and GlobalP detected very similar DMRs near the gene CPT1A on chromosome 11 in both the pre- and posttreatment data. In addition, GlobalP detected 2 DMRs before fenofibrate treatment in the genes ETV6 and ABCG1. Bumphunter identified several DMRs on chromosomes 1 and 20, which did not overlap with DMRs detected by other methods. CONCLUSIONS Our novel method detected the same DMR identified by two existing methods and detected two additional DMRs not identified by any of the existing methods we compared.
Affiliation(s)
- Samantha Lent
  - Department of Biostatistics, Boston University School of Public Health, 801 Massachusetts Avenue, 3rd Floor, Boston, MA 02118 USA
- Hanfei Xu
  - Department of Biostatistics, Boston University School of Public Health, 801 Massachusetts Avenue, 3rd Floor, Boston, MA 02118 USA
- Lan Wang
  - Department of Biostatistics, Boston University School of Public Health, 801 Massachusetts Avenue, 3rd Floor, Boston, MA 02118 USA
- Zhe Wang
  - Bioinformatics Program, Boston University, 44 Cummington Mall, Boston, MA 02215 USA
- Chloé Sarnowski
  - Department of Biostatistics, Boston University School of Public Health, 801 Massachusetts Avenue, 3rd Floor, Boston, MA 02118 USA
- Marie-France Hivert
  - Obesity Prevention Program, Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, 401 Park Drive, Suite 401 East, Boston, MA 02215 USA
  - Diabetes Unit, Massachusetts General Hospital, 50 Staniford Street, Suite 340, Boston, MA 02144 USA
- Josée Dupuis
  - Department of Biostatistics, Boston University School of Public Health, 801 Massachusetts Avenue, 3rd Floor, Boston, MA 02118 USA
  - Bioinformatics Program, Boston University, 44 Cummington Mall, Boston, MA 02215 USA
10
Wang B, DeStefano AL, Lin H. Integrative methylation score to identify epigenetic modifications associated with lipid changes resulting from fenofibrate treatment in families. BMC Proc 2018; 12:28. [PMID: 30275882 PMCID: PMC6157127 DOI: 10.1186/s12919-018-0125-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/10/2022] Open
Abstract
Epigenome-wide association studies (EWAS) have traditionally focused on the association test of single epigenetic markers with complex traits. However, it is possible that multiple cytosine-phosphate-guanine (CpG) sites at the same locus could jointly exert their effects on human traits. Therefore, a region-based test that combines multiple markers could be more powerful. We used 2 different region-based tests to investigate the association between changes in DNA methylation and drug response: the median methylation level test (MMLT) and the sequence kernel association test (SKAT). No genes were found to be significantly associated with the drug response (for triglycerides, the false discovery rate ranged from 0.855 to 0.999; for high-density lipoprotein cholesterol, the false discovery rate ranged from 0.584 to 0.915). Further evidence is needed to explore potential application of gene-level methylation association analysis.
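The idea behind a median-based regional summary can be sketched as follows. The abstract names the median methylation level test but does not describe its details, so this sketch only shows the assumed first step: collapsing the CpGs of one region into a single per-sample median, which could then be tested against the trait. The function name and the data are hypothetical.

```python
from statistics import median


def region_median_levels(methylation_by_sample):
    """Summarise one region by the median methylation level per sample.

    `methylation_by_sample` is a list with one entry per sample; each entry
    is the list of methylation values for the CpGs in the region.
    """
    if any(len(values) == 0 for values in methylation_by_sample):
        raise ValueError("every sample needs at least one CpG value")
    return [median(values) for values in methylation_by_sample]


# Hypothetical region with three CpGs measured in two samples.
summaries = region_median_levels([
    [0.1, 0.5, 0.9],
    [0.2, 0.2, 0.8],
])
```

The median is less sensitive than the mean to a single outlying CpG, which is one plausible reason to prefer it for a regional summary.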
Affiliation(s)
- Biqi Wang
  - Department of Biostatistics, Boston University School of Public Health, 801 Massachusetts Avenue, Boston, MA 02118 USA
- Anita L DeStefano
  - Department of Biostatistics, Boston University School of Public Health, 801 Massachusetts Avenue, Boston, MA 02118 USA
- Honghuang Lin
  - National Heart, Lung, and Blood Institute's and Boston University's Framingham Heart Study, 73 Mount Wayte Avenue, Framingham, MA 01702 USA
  - Section of Computational Biomedicine, Department of Medicine, Boston University School of Medicine, 72 E Concord St, B-616, Boston, MA 02118 USA
11
Fuady AM, Lent S, Sarnowski C, Tintle NL. Application of novel and existing methods to identify genes with evidence of epigenetic association: results from GAW20. BMC Genet 2018; 19:72. [PMID: 30255777 PMCID: PMC6157126 DOI: 10.1186/s12863-018-0647-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND The rise in popularity and accessibility of DNA methylation data to evaluate epigenetic associations with disease has led to numerous methodological questions. As part of GAW20, our working group of 8 research groups focused on gene searching methods. RESULTS Although the methods were varied, we identified 3 main themes within our group. First, many groups tackled the question of how best to use pedigree information in downstream analyses, finding that (a) the use of kinship matrices is common practice, (b) ascertainment corrections may be necessary, and (c) pedigree information may be useful for identifying parent-of-origin effects. Second, many groups also considered multimarker versus single-marker tests. Multimarker tests had modestly improved power versus single-marker methods on simulated data, and on real data identified additional associations that were not identified with single-marker methods, including identification of a gene with a strong biological interpretation. Finally, some of the groups explored methods to combine single-nucleotide polymorphism (SNP) and DNA methylation into a single association analysis. CONCLUSIONS A causal inference method showed promise at discovering new mechanisms of SNP activity; gene-based methods of summarizing SNP and DNA methylation data also showed promise. Even though numerous questions still remain in the analysis of DNA methylation data, our discussions at GAW20 suggest some emerging best practices.
Affiliation(s)
- Angga M. Fuady
  - Medical Statistics, Department of Biomedical Data Sciences, Leiden University Medical Center, Einthovenweg 20, 2333 ZC Leiden, Netherlands
- Samantha Lent
  - Department of Biostatistics, Boston University School of Public Health, 801 Massachusetts Avenue, Boston, MA 02118 USA
- Chloé Sarnowski
  - Department of Biostatistics, Boston University School of Public Health, 801 Massachusetts Avenue, Boston, MA 02118 USA
- Nathan L. Tintle
  - Department of Mathematics and Statistics, Dordt College, Sioux Center, IA 51250 USA