1
Yan F, Telonis AG, Yang Q, Jiang L, Garrett-Bakelman FE, Sekeres MA, Santini V, Ceccarelli M, Goel N, Garcia-Martinez L, Morey L, Figueroa ME, Guo Y. Genome-wide methylome modeling via generative AI incorporating long- and short-range interactions. Science Advances 2025; 11:eadt4152. [PMID: 40215314 PMCID: PMC11988400 DOI: 10.1126/sciadv.adt4152] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2024] [Accepted: 03/05/2025] [Indexed: 04/14/2025]
Abstract
Using millions of methylation segments, we developed DiffuCpG, a generative artificial intelligence (AI) diffusion model designed to solve the critical challenge of missing data in high-throughput methylation technologies. DiffuCpG goes beyond conventional methods by leveraging both short-range interactions, including nearby CpGs from both the latitude and longitude of the dataset and local DNA sequences, and long-range interactions, including three-dimensional genome architecture and long-distance correlations, to comprehensively model the methylome. In extensive independent validations across different tissue types, cancers, and technologies (whole-genome bisulfite sequencing, enhanced reduced representation bisulfite sequencing, single-cell bisulfite sequencing, and methylation arrays), DiffuCpG demonstrated superior accuracy, scalability, and versatility compared to previous methods. On average, for a bisulfite sequencing dataset, DiffuCpG can extend the original dataset by millions of additional CpGs. As an alternative application of generative AI, DiffuCpG addresses a key bottleneck in epigenetic research and will substantially benefit studies relying on high-throughput methylation data.
Affiliation(s)
- Fengyao Yan
  - Sylvester Comprehensive Cancer Center, University of Miami Miller School of Medicine, Miami, FL 33136, USA
- Aristeidis G. Telonis
  - Department of Biochemistry and Molecular Biology, University of Miami Miller School of Medicine, Miami, FL 33136, USA
- Qin Yang
  - Sylvester Comprehensive Cancer Center, University of Miami Miller School of Medicine, Miami, FL 33136, USA
  - Department of Human Genetics, University of Miami Miller School of Medicine, Miami, FL 33136, USA
- Limin Jiang
  - Sylvester Comprehensive Cancer Center, University of Miami Miller School of Medicine, Miami, FL 33136, USA
- Francine E. Garrett-Bakelman
  - Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, VA 22908, USA
  - Department of Medicine, University of Virginia, Charlottesville, VA 22908, USA
  - Comprehensive Cancer Center, University of Virginia, Charlottesville, VA 22908, USA
- Mikkael A. Sekeres
  - Sylvester Comprehensive Cancer Center, University of Miami Miller School of Medicine, Miami, FL 33136, USA
  - Division of Hematology, Department of Medicine, University of Miami Miller School of Medicine, Miami, FL 33136, USA
- Valeria Santini
  - MDS Unit, DMSC, University of Florence, AOU Careggi, Florence 50134, Italy
- Michele Ceccarelli
  - Sylvester Comprehensive Cancer Center, University of Miami Miller School of Medicine, Miami, FL 33136, USA
  - Department of Surgery, University of Miami Miller School of Medicine, Miami, FL 33136, USA
- Neha Goel
  - Sylvester Comprehensive Cancer Center, University of Miami Miller School of Medicine, Miami, FL 33136, USA
  - Department of Surgery, University of Miami Miller School of Medicine, Miami, FL 33136, USA
- Liliana Garcia-Martinez
  - Sylvester Comprehensive Cancer Center, University of Miami Miller School of Medicine, Miami, FL 33136, USA
  - Department of Human Genetics, University of Miami Miller School of Medicine, Miami, FL 33136, USA
- Lluis Morey
  - Sylvester Comprehensive Cancer Center, University of Miami Miller School of Medicine, Miami, FL 33136, USA
  - Department of Human Genetics, University of Miami Miller School of Medicine, Miami, FL 33136, USA
- Maria E. Figueroa
  - Sylvester Comprehensive Cancer Center, University of Miami Miller School of Medicine, Miami, FL 33136, USA
  - Department of Biochemistry and Molecular Biology, University of Miami Miller School of Medicine, Miami, FL 33136, USA
  - Division of Hematology, Department of Medicine, University of Miami Miller School of Medicine, Miami, FL 33136, USA
- Yan Guo
  - Sylvester Comprehensive Cancer Center, University of Miami Miller School of Medicine, Miami, FL 33136, USA
  - Department of Public Health and Sciences, University of Miami, Miami, FL 33136, USA
2
Wu Z, Zhang T, Ma X, Guo S, Zhou Q, Zahoor A, Deng G. Recent advances in anti-inflammatory active components and action mechanisms of natural medicines. Inflammopharmacology 2023; 31:2901-2937. [PMID: 37947913 DOI: 10.1007/s10787-023-01369-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2023] [Accepted: 09/16/2023] [Indexed: 11/12/2023]
Abstract
Inflammation is a series of reactions caused by the body's resistance to external biological stimuli. Inflammation affects the occurrence and development of many diseases. Anti-inflammatory drugs have been used widely to treat inflammatory diseases, but long-term use can cause toxic side-effects and affect human functions. As immunomodulators with long-term conditioning effects and no drug residues, natural products are increasingly being investigated for the treatment of inflammatory diseases. In this review, we focus on the inflammatory process and cellular mechanisms in the development of diseases such as inflammatory bowel disease, atherosclerosis, and coronavirus disease-2019. We also focus on three signaling pathways (nuclear factor-kappa B, p38 mitogen-activated protein kinase, and Janus kinase/signal transducer and activator of transcription-3) to explain the anti-inflammatory effect of natural products. In addition, we classify common natural products based on secondary metabolites and explain the association between current progress in bidirectional prediction of natural product targets and inflammatory diseases.
Affiliation(s)
- Zhimin Wu
  - Department of Clinical Veterinary Medicine, College of Veterinary Medicine, Huazhong Agricultural University, Wuhan, China
- Tao Zhang
  - College of Animal Science and Technology, Anhui Agricultural University, Hefei, China
- Xiaofei Ma
  - College of Veterinary Medicine, Gansu Agriculture University, Lanzhou, China
- Shuai Guo
  - Department of Clinical Veterinary Medicine, College of Veterinary Medicine, Huazhong Agricultural University, Wuhan, China
- Qingqing Zhou
  - Department of Clinical Veterinary Medicine, College of Veterinary Medicine, Huazhong Agricultural University, Wuhan, China
- Arshad Zahoor
  - College of Veterinary Sciences, The University of Agriculture Peshawar, Peshawar, Pakistan
- Ganzhen Deng
  - Department of Clinical Veterinary Medicine, College of Veterinary Medicine, Huazhong Agricultural University, Wuhan, China
3
Li L, Zhang H, Holloway JW, Ewart S, Relton CL, Arshad SH, Karmaus W. Does DNA methylation mediate the association of age at puberty with FVC or FEV1? ERJ Open Res 2022; 8:00476-2021. [PMID: 35237685 PMCID: PMC8883177 DOI: 10.1183/23120541.00476-2021] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/25/2021] [Accepted: 12/30/2021] [Indexed: 11/05/2022]
Abstract
Background: Age of pubertal onset is associated with lung function in adulthood. However, the underlying role of epigenetics as a mediator of this association remains unknown. Methods: DNA methylation (DNAm) in peripheral blood was measured at age 18 years in the Isle of Wight birth cohort (IOWBC) along with data on age of pubertal events, forced vital capacity (FVC) and forced expiratory volume in 1 s (FEV1) at 26 years. Structural equation models were applied to examine mediation effects of DNAm on the association of age at pubertal events with FVC and FEV1. Findings were further tested in the Avon Longitudinal Study of Parents and Children (ALSPAC) cohort. Results: In the IOWBC, for females, 21 cytosine-phosphate-guanine sites (CpGs) were shown to mediate the association of age at puberty with FVC or FEV1 at 26 years (p<0.05). In males, DNAm at 20 CpGs was found to mediate the association of age at puberty with FVC (p<0.05). At almost all these CpGs, indirect effects (effects of age at pubertal events on FVC or FEV1 via DNAm) contributed a smaller portion of the total effects than direct effects (e.g. at cg08680129, ∼22% of the estimated total effect of age at menarche on FVC at age 26 was contributed by an indirect effect). Among the IOWBC-discovered CpGs available in ALSPAC, none was replicated in ALSPAC (p>0.05). Conclusions: Our findings suggest that post-adolescence DNAm in peripheral blood is unlikely to mediate the association of age at pubertal onset with young adulthood FVC or FEV1. The association between age at pubertal onset and lung function parameters FVC or FEV1 in young adulthood is not likely to be mediated by DNA methylation in peripheral blood. https://bit.ly/31G8hDi
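The split between indirect and direct effects reported in this abstract can be illustrated with a small calculation. The sketch below assumes the common product-of-coefficients decomposition (indirect = a × b, total = indirect + direct); the abstract does not state the exact estimator used within the structural equation models, and the function name and all coefficient values are hypothetical, chosen only so that the indirect share comes out near the ∼22% quoted for cg08680129.

```python
def mediation_summary(a, b, c_prime):
    """Summarize a single-mediator model (illustrative assumption, not the paper's code).

    a       -- effect of the exposure (e.g. age at menarche) on the mediator (DNAm)
    b       -- effect of the mediator on the outcome (e.g. FVC), adjusted for exposure
    c_prime -- direct effect of the exposure on the outcome
    """
    indirect = a * b              # effect transmitted through the mediator
    total = indirect + c_prime    # total effect = indirect + direct
    if total == 0:
        raise ValueError("total effect is zero; proportion mediated is undefined")
    return {
        "indirect": indirect,
        "direct": c_prime,
        "total": total,
        "proportion_mediated": indirect / total,
    }


# Hypothetical coefficients, not taken from the study.
summary = mediation_summary(a=0.5, b=0.044, c_prime=0.078)
print(f"proportion mediated: {summary['proportion_mediated']:.2f}")
```

The proportion mediated is only easy to interpret when the indirect and direct effects share a sign; with opposing signs the ratio can fall outside the 0 to 1 range.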
4
Wei S, Tao J, Xu J, Chen X, Wang Z, Zhang N, Zuo L, Jia Z, Chen H, Sun H, Yan Y, Zhang M, Lv H, Kong F, Duan L, Ma Y, Liao M, Xu L, Feng R, Liu G, The EWAS Project, Jiang Y. Ten Years of EWAS. Advanced Science (Weinheim, Baden-Wurttemberg, Germany) 2021; 8:e2100727. [PMID: 34382344 PMCID: PMC8529436 DOI: 10.1002/advs.202100727] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 02/22/2021] [Revised: 05/11/2021] [Indexed: 06/13/2023]
Abstract
Epigenome-wide association studies (EWAS) have been applied to analyze DNA methylation variation in complex diseases for a decade, and the epigenome as a research target has gradually become a hot topic of current studies. DNA methylation microarrays and next-generation and third-generation sequencing technologies have provided a high-quality platform for EWAS. Here, the progress of EWAS research and its contributions to clinical applications are reviewed, and the achievements in four typical diseases are described. Finally, the challenges encountered by EWAS are presented and bold predictions are made for its future development.
Affiliation(s)
- Siyu Wei
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
  - The EWAS Project, Harbin, China
- Junxian Tao
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
  - The EWAS Project, Harbin, China
- Jing Xu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
  - The EWAS Project, Harbin, China
- Xingyu Chen
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Zhaoyang Wang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Nan Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Lijiao Zuo
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Zhe Jia
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Haiyan Chen
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Hongmei Sun
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Yubo Yan
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Mingming Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Hongchao Lv
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Fanwu Kong
  - The EWAS Project, Harbin, China
  - Department of Nephrology, The Second Affiliated Hospital, Harbin Medical University, Harbin 150001, China
- Lian Duan
  - The EWAS Project, Harbin, China
  - The First Affiliated Hospital of Wenzhou Medical University, Wenzhou 325000, China
- Ye Ma
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
  - The EWAS Project, Harbin, China
- Mingzhi Liao
  - The EWAS Project, Harbin, China
  - College of Life Sciences, Northwest A&F University, Yangling, Shanxi 712100, China
- Liangde Xu
  - The EWAS Project, Harbin, China
  - School of Biomedical Engineering, Wenzhou Medical University, Wenzhou 325035, China
- Rennan Feng
  - The EWAS Project, Harbin, China
  - Department of Nutrition and Food Hygiene, Public Health College, Harbin Medical University, Harbin 150081, China
- Guiyou Liu
  - The EWAS Project, Harbin, China
  - Beijing Institute for Brain Disorders, Capital Medical University, Beijing 100069, China
- Yongshuai Jiang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
  - The EWAS Project, Harbin, China
5
Jiang H, Cao K, Fan C, Cui X, Ma Y, Liu J. Transcriptome-Wide High-Throughput m6A Sequencing of Differential m6A Methylation Patterns in the Human Rheumatoid Arthritis Fibroblast-Like Synoviocytes Cell Line MH7A. J Inflamm Res 2021; 14:575-586. [PMID: 33658830 PMCID: PMC7920605 DOI: 10.2147/jir.s296006] [Citation(s) in RCA: 40] [Impact Index Per Article: 10.0] [Received: 12/19/2020] [Accepted: 02/10/2021] [Indexed: 12/12/2022]
Abstract
Introduction: N6-methyladenosine (m6A) is the most frequent internal modification in eukaryotic mRNAs and is closely related to the occurrence and development of many diseases, especially tumors. However, the relationship between m6A methylation and rheumatoid arthritis (RA) is still a mystery. Methods: Two high-throughput sequencing methods, namely m6A-modified RNA immunoprecipitation sequencing (m6A-seq) and RNA sequencing (RNA-seq), were performed to identify differentially expressed m6A methylation in the human rheumatoid arthritis fibroblast-like synoviocytes cell line MH7A after stimulation with TNF-α. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were used to obtain enriched GO terms and significant KEGG pathways. Then, four candidate genes, Wilms tumor 1-associating protein (WTAP), receptor-interacting serine/threonine protein kinase 2 (RIPK2), Janus kinase 3 (JAK3) and tumor necrosis factor receptor SF10A (TNFRSF10A), were selected to further validate the m6A methylation, mRNA and protein expression levels in MH7A cells and synovial tissues of adjuvant arthritis (AA) rats by RT-qPCR and Western blot. Results: Using m6A-seq, we identified a total of 206 genes with differentially expressed m6A methylation, of which 118 were significantly upregulated and 88 were significantly downregulated. Likewise, 1207 differentially expressed mRNAs were obtained by RNA-seq, of which 793 were upregulated and 414 downregulated. Further joint analysis showed that the m6A methylation and mRNA expression levels of 88 genes changed significantly, of which 30 genes displayed increased m6A methylation and decreased mRNA expression, 57 genes displayed decreased m6A methylation and increased mRNA expression, and 1 gene displayed increased m6A methylation and increased mRNA expression. GO and KEGG analyses indicated that these unique genes were mainly enriched in inflammation-related pathways, cell proliferation and apoptosis. In addition, the validations of WTAP, RIPK2, JAK3 and TNFRSF10A were in accordance with the m6A and RNA sequencing results. Conclusion: This study established the transcriptional map of m6A in MH7A cells and revealed the potential relationship between RNA methylation modification and RA-related genes. The results suggested that m6A modification was associated with the occurrence and course of RA to some extent.
Affiliation(s)
- Hui Jiang
  - Experimental Center of Clinical Research, First Affiliated Hospital of Anhui University of Chinese Medicine, Hefei, Anhui, People's Republic of China
  - School of Pharmacy, Anhui University of Chinese Medicine, Hefei, Anhui, People's Republic of China
- Kefeng Cao
  - Departments of Laboratory Medicine, Traditional Chinese Medical Hospital of Taihe County, Fuyang, Anhui, People's Republic of China
- Chang Fan
  - Experimental Center of Clinical Research, First Affiliated Hospital of Anhui University of Chinese Medicine, Hefei, Anhui, People's Republic of China
  - School of Pharmacy, Anhui University of Chinese Medicine, Hefei, Anhui, People's Republic of China
- Xiaoya Cui
  - Experimental Center of Clinical Research, First Affiliated Hospital of Anhui University of Chinese Medicine, Hefei, Anhui, People's Republic of China
  - School of Pharmacy, Anhui University of Chinese Medicine, Hefei, Anhui, People's Republic of China
- Yanzhen Ma
  - Experimental Center of Clinical Research, First Affiliated Hospital of Anhui University of Chinese Medicine, Hefei, Anhui, People's Republic of China
  - School of Pharmacy, Anhui University of Chinese Medicine, Hefei, Anhui, People's Republic of China
- Jian Liu
  - Experimental Center of Clinical Research, First Affiliated Hospital of Anhui University of Chinese Medicine, Hefei, Anhui, People's Republic of China
6
The progress on the estimation of DNA methylation level and the detection of abnormal methylation. Quantitative Biology 2021. [DOI: 10.15302/j-qb-022-0289] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/16/2022]
7
Yu F, Xu C, Deng HW, Shen H. A novel computational strategy for DNA methylation imputation using mixture regression model (MRM). BMC Bioinformatics 2020; 21:552. [PMID: 33261550 PMCID: PMC7708217 DOI: 10.1186/s12859-020-03865-z] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 02/08/2020] [Accepted: 11/09/2020] [Indexed: 12/19/2022]
Abstract
BACKGROUND: DNA methylation is an important heritable epigenetic mark that plays a crucial role in transcriptional regulation and the pathogenesis of various human disorders. The commonly used DNA methylation measurement approaches, e.g., Illumina Infinium HumanMethylation-27 and -450 BeadChip arrays (27 K and 450 K arrays) and reduced representation bisulfite sequencing (RRBS), only cover a small proportion of the total CpG sites in the human genome, which considerably limits the scope of the DNA methylation analysis in those studies. RESULTS: We proposed a new computational strategy to impute the methylation value at unmeasured CpG sites using a mixture of regression model (MRM) of radial basis functions, integrating information from neighboring CpGs and the similarities in local methylation patterns across subjects and across multiple genomic regions. Our method achieved better imputation accuracy than a set of competing methods on both simulated and empirical data, particularly when the missing rate is high. By applying MRM to an RRBS dataset from subjects with low versus high bone mineral density (BMD), we recovered methylation values of ~300 K CpGs in the promoter regions of chromosome 17 and identified some novel differentially methylated CpGs that are significantly associated with BMD. CONCLUSIONS: Our method is applicable to a wide range of methylation studies. By expanding the coverage of the methylation dataset to unmeasured sites, it can significantly enhance the discovery of novel differential methylation signals and thus reveal the mechanisms underlying various human disorders/traits.
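The idea of borrowing information from neighboring CpGs through radial basis functions can be sketched in a few lines. The example below is not the MRM method itself: it swaps in a much simpler Gaussian-kernel weighted average of nearby measured sites, and the function name, the bandwidth value, and the positions and beta values are all hypothetical.

```python
import math


def rbf_impute(target_pos, neighbors, bandwidth=500.0):
    """Impute a methylation beta value at target_pos from measured neighbors.

    neighbors -- list of (genomic_position, beta_value) pairs for measured CpGs
    bandwidth -- kernel width in base pairs (an assumed value, not from the paper)
    """
    # Gaussian radial basis weight: closer CpGs contribute more.
    weights = [
        math.exp(-((pos - target_pos) ** 2) / (2.0 * bandwidth ** 2))
        for pos, _ in neighbors
    ]
    total = sum(weights)
    if total == 0:
        raise ValueError("no neighbor is close enough to carry any weight")
    weighted_sum = sum(w * beta for w, (_, beta) in zip(weights, neighbors))
    return weighted_sum / total


# Hypothetical measured CpGs around an unmeasured site at position 1000.
measured = [(900, 0.2), (1100, 0.8), (5000, 0.1)]
imputed = rbf_impute(1000, measured)
print(round(imputed, 3))
```

Because the result is a weighted average, it always stays between the smallest and largest neighboring beta values, and distant sites such as the one at position 5000 contribute almost nothing.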
Affiliation(s)
- Fangtang Yu
  - Center for Bioinformatics and Genomics, Department of Biostatistics and Data Science, School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, 70112, USA
- Chao Xu
  - Center for Bioinformatics and Genomics, Department of Biostatistics and Data Science, School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, 70112, USA
- Hong-Wen Deng
  - Center for Bioinformatics and Genomics, Department of Biostatistics and Data Science, School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, 70112, USA
- Hui Shen
  - Center for Bioinformatics and Genomics, Department of Biostatistics and Data Science, School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, 70112, USA
8
Guo S, Xu L, Chang C, Zhang R, Jin Y, He D. Epigenetic Regulation Mediated by Methylation in the Pathogenesis and Precision Medicine of Rheumatoid Arthritis. Front Genet 2020; 11:811. [PMID: 32849810 PMCID: PMC7417338 DOI: 10.3389/fgene.2020.00811] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 04/16/2020] [Accepted: 07/06/2020] [Indexed: 12/11/2022]
Abstract
Rheumatoid arthritis (RA) is a complex disease triggered by the interaction between genetics and the environment, especially through the shared epitope (SE) and cell surface calreticulin (CSC) theory. However, the available evidence shows that genetic diversity and environmental exposure cannot explain all the clinical characteristics and heterogeneity of RA. In contrast, recent studies demonstrate that epigenetics play important roles in the pathogenesis of RA, especially DNA methylation and histone modification. DNA methylation and histone methylation are involved in innate and adaptive immune cell differentiation and migration, proliferation, apoptosis, and mesenchymal characteristics of fibroblast-like synoviocytes (FLS). Epigenetic-mediated regulation of immune-related genes and inflammation pathways explains the dynamic expression network of RA. In this review, we summarize the comprehensive evidence to show that methylation of DNA and histones is significantly involved in the pathogenesis of RA and could be applied as a promising biomarker in the disease progression and drug-response prediction. We also explain the advantages and challenges of the current epigenetics research in RA. In summary, epigenetic modules provide a possible interface through which genetic and environmental risk factors connect to contribute to the susceptibility and pathogenesis of RA. Additionally, epigenetic regulators provide promising drug targets to develop novel therapeutic drugs for RA. Finally, DNA methylation and histone modifications could be important features for providing a better RA subtype identification to accelerate personalized treatment and precision medicine.
Affiliation(s)
- Shicheng Guo
  - Department of Medical Genetics, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI, United States
  - Center for Precision Medicine Research, Marshfield Clinic Research Institute, Marshfield, WI, United States
- Lingxia Xu
  - Department of Rheumatology, Guanghua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, China
  - Institute of Arthritis Research in Integrative Medicine, Shanghai Academy of Traditional Chinese Medicine, Shanghai, China
- Cen Chang
  - Department of Rheumatology, Guanghua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, China
  - Institute of Arthritis Research in Integrative Medicine, Shanghai Academy of Traditional Chinese Medicine, Shanghai, China
- Runrun Zhang
  - Department of Rheumatology, Guanghua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, China
  - Institute of Arthritis Research in Integrative Medicine, Shanghai Academy of Traditional Chinese Medicine, Shanghai, China
- Yehua Jin
  - Department of Rheumatology, Guanghua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, China
  - Institute of Arthritis Research in Integrative Medicine, Shanghai Academy of Traditional Chinese Medicine, Shanghai, China
- Dongyi He
  - Department of Rheumatology, Guanghua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, China
  - Institute of Arthritis Research in Integrative Medicine, Shanghai Academy of Traditional Chinese Medicine, Shanghai, China
9
Tang J, Zou J, Zhang X, Fan M, Tian Q, Fu S, Gao S, Fan S. PretiMeth: precise prediction models for DNA methylation based on single methylation mark. BMC Genomics 2020; 21:364. [PMID: 32414326 PMCID: PMC7227319 DOI: 10.1186/s12864-020-6768-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 11/29/2019] [Accepted: 05/04/2020] [Indexed: 11/29/2022]
Abstract
Background: The computational prediction of methylation levels at single-CpG resolution is a promising way to explore the methylation levels of CpGs not covered by existing array techniques, especially for the large reserves of 450 K beadchip array data. General prediction models concentrate on improving the overall prediction accuracy for the bulk of CpG loci while neglecting whether each locus is precisely predicted. This limits the application of the prediction results, especially in downstream analyses with high precision requirements. Results: Here we report PretiMeth, a method for constructing precise prediction models for each single CpG locus. PretiMeth used a logistic regression algorithm to build a prediction model for each locus of interest. Only one DNA methylation feature, the one sharing the most similar methylation pattern with the CpG locus to be predicted, was applied in the model. We found that PretiMeth outperformed other algorithms in prediction accuracy and remained robust across platforms and cell types. Furthermore, PretiMeth was applied to The Cancer Genome Atlas (TCGA) data; intensive analysis based on the precise prediction results showed that several CpG loci and genes (differentially methylated between tumor and normal samples) merit further biological validation. Conclusion: The precise prediction of single CpG loci is important for both methylation array data expansion and downstream analysis of prediction results. PretiMeth achieved precise modeling for each CpG locus by using only one significant feature, which also suggests that our precise prediction models could serve as a reference for probe set design when the DNA methylation beadchip is updated. PretiMeth is provided as an open source tool via https://github.com/JxTang-bioinformatics/PretiMeth.
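The single-feature strategy described in this abstract can be sketched as two steps: choose the measured CpG whose methylation pattern is most similar to the target locus, then fit a one-variable logistic model on it. The code below is an illustration under stated assumptions, not the PretiMeth implementation: similarity is taken to be absolute Pearson correlation, the model is fitted by plain gradient descent on cross-entropy with beta values as soft labels, and every function name, probe ID and data value is hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def pick_best_feature(candidates, target):
    """Return the name of the candidate CpG most correlated with the target."""
    return max(candidates, key=lambda name: abs(pearson(candidates[name], target)))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(x, y, lr=0.5, epochs=5000):
    """Fit y ~ sigmoid(w * x + b) by gradient descent on cross-entropy loss."""
    w, b = 0.0, 0.0
    n = len(x)
    for _ in range(epochs):
        grad_w = sum((sigmoid(w * xi + b) - yi) * xi for xi, yi in zip(x, y)) / n
        grad_b = sum((sigmoid(w * xi + b) - yi) for xi, yi in zip(x, y)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical beta values across five samples.
candidates = {
    "cgA": [0.10, 0.30, 0.50, 0.70, 0.90],
    "cgB": [0.50, 0.40, 0.60, 0.50, 0.50],
}
target = [0.15, 0.30, 0.50, 0.72, 0.88]

best = pick_best_feature(candidates, target)
w, b = fit_logistic(candidates[best], target)
prediction = sigmoid(w * 0.6 + b)  # predicted target beta when the feature reads 0.6
print(best, round(prediction, 2))
```

Using a single predictor keeps each per-locus model easy to inspect, at the cost of relying entirely on one probe being measured reliably.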
Affiliation(s)
- Jianxiong Tang
  - School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Jianxiao Zou
  - School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Xiaoran Zhang
  - School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
  - Department of Automation, Tsinghua University, Beijing, 100084, China
- Mei Fan
  - Chengdu Women's and Children's Central Hospital, School of Medicine, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Qi Tian
  - School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Shuyao Fu
  - School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Shihong Gao
  - School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Shicai Fan
  - School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
  - Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, 611731, China
10
Zhang W, Li Z, Wei N, Wu HJ, Zheng X. Detection of differentially methylated CpG sites between tumor samples with uneven tumor purities. Bioinformatics 2020; 36:2017-2024. [PMID: 31769783 DOI: 10.1093/bioinformatics/btz885] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 08/06/2019] [Revised: 11/14/2019] [Accepted: 11/23/2019] [Indexed: 11/12/2022]
Abstract
MOTIVATION: Inference of differentially methylated (DM) CpG sites between two groups of tumor samples with different geno- or pheno-types is a critical step to uncover the epigenetic mechanism of tumorigenesis and identify biomarkers for cancer subtyping. However, as a major confounding factor, uneven distributions of tumor purity between two groups of tumor samples will lead to biased discovery of DM sites if not properly accounted for. RESULTS: We here propose InfiniumDM, a generalized least square model to adjust for the tumor purity effect in differential methylation analysis. Our method is applicable to a variety of experimental designs, including those with or without normal controls and with different sources of normal tissue contamination. We compared our method with conventional methods including minfi, limma and limma corrected by tumor purity using simulated datasets. Our method shows significantly better performance across different differential methylation thresholds, sample sizes, mean purity deviations and so on. We also applied the proposed method to breast cancer samples from the TCGA database to further evaluate its performance. Overall, both simulation and real data analyses demonstrate favorable performance over existing methods serving a similar purpose. AVAILABILITY AND IMPLEMENTATION: InfiniumDM is a part of the R package InfiniumPurify, which is freely available from GitHub (https://github.com/Xiaoqizheng/InfiniumPurify). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Weiwei Zhang
  - Department of Mathematics, School of Science, East China University of Technology, Nanchang, Jiangxi 330013, China
- Ziyi Li
  - Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA 30322, USA
- Nana Wei
  - Department of Mathematics, Shanghai Normal University, Shanghai 200234, China
- Hua-Jun Wu
  - Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute and Harvard School of Public Health, Boston, MA 02215, USA
- Xiaoqi Zheng
  - Department of Mathematics, Shanghai Normal University, Shanghai 200234, China
11
Tian Q, Zou J, Tang J, Fang Y, Yu Z, Fan S. MRCNN: a deep learning model for regression of genome-wide DNA methylation. BMC Genomics 2019; 20:192. [PMID: 30967120 PMCID: PMC6457069 DOI: 10.1186/s12864-019-5488-5] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Indexed: 12/19/2022]
Abstract
Background: Determination of genome-wide DNA methylation is significant for both basic research and drug development. As a key epigenetic modification, this biochemical process can modulate gene expression to influence cell differentiation, which can possibly lead to cancer. Due to the intricate biochemical mechanism of DNA methylation, obtaining a precise prediction is a considerable challenge. Existing approaches have yielded good predictions, but the methods either need to combine plenty of features and prerequisites or deal with only hypermethylation and hypomethylation. Results: In this paper, we propose a deep learning method for prediction of genome-wide DNA methylation, in which the Methylation Regression is implemented by Convolutional Neural Networks (MRCNN). By minimizing a continuous loss function, experiments show that our model converges and is more precise than the state-of-the-art method (DeepCpG) according to the evaluation results. MRCNN also achieves the discovery of de novo motifs by analysis of features from the training process. Conclusions: Genome-wide DNA methylation could be evaluated based on the corresponding local DNA sequences of target CpG loci. With the autonomous learning pattern of deep learning, MRCNN enables accurate predictions of genome-wide DNA methylation status without predefined features and discovers some de novo methylation-related motifs that match known motifs by extracting sequence patterns. Electronic supplementary material: The online version of this article (10.1186/s12864-019-5488-5) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Qi Tian
- School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
| | - Jianxiao Zou
- School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
| | - Jianxiong Tang
- School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
| | - Yuan Fang
- School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
| | - Zhongli Yu
- School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
| | - Shicai Fan
- School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan, China; Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
| |
|
12
|
Ma B, Allard C, Bouchard L, Perron P, Mittleman MA, Hivert MF, Liang L. Locus-specific DNA methylation prediction in cord blood and placenta. Epigenetics 2019; 14:405-420. [PMID: 30885044 DOI: 10.1080/15592294.2019.1588685] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 12/27/2022]
Abstract
DNA methylation is known to be responsive to prenatal exposures, which may be part of the mechanism linking early developmental exposures to future chronic diseases. Many studies use blood to measure DNA methylation, yet DNA methylation is known to be tissue specific. Placenta is central to fetal growth and development, but it is rarely feasible to collect this tissue in large epidemiological studies; cord blood samples, on the other hand, are more accessible. In this study, based on paired placenta and cord blood samples from 169 individuals, we investigated the methylation concordance between placenta and cord blood. We then employed a machine-learning-based model to predict locus-specific DNA methylation levels in placenta from DNA methylation levels in cord blood. We found that the methylation correlation between placenta and cord blood is lower than that of other tissue pairs, consistent with existing observations that placenta methylation has a distinct pattern. Nonetheless, a number of CpG sites still show a robust association between the two tissues. We built prediction models for placenta methylation based on cord blood data and documented a subset of 1,012 CpG sites with high correlation between measured and predicted placenta methylation levels. The resulting list of CpG sites and prediction models could help to reveal the loci where internal or external influences may affect DNA methylation in both placenta and cord blood, and could provide reference data for predicting effects on placenta in future studies even when the tissue is not available in an epidemiological study.
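The selection of CpG sites with "high correlation between measured and predicted placenta methylation levels" can be sketched as a per-site Pearson correlation followed by a cutoff. This is a minimal illustration, assuming a Pearson statistic and a cutoff of 0.8; the abstract does not state which correlation measure or threshold was used, and the site names and values below are invented toy data.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists; NaN if either is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def well_predicted_sites(measured, predicted, threshold):
    """Return site IDs whose measured/predicted correlation meets the threshold."""
    keep = []
    for site, values in measured.items():
        r = pearson(values, predicted[site])
        if r >= threshold:  # NaN compares False, so constant sites are dropped
            keep.append(site)
    return keep


# Toy data: methylation levels per CpG site across four samples (hypothetical).
measured = {"cg_a": [0.1, 0.4, 0.8, 0.9], "cg_b": [0.2, 0.5, 0.3, 0.6]}
predicted = {"cg_a": [0.2, 0.5, 0.7, 0.95], "cg_b": [0.6, 0.2, 0.5, 0.3]}
selected = well_predicted_sites(measured, predicted, threshold=0.8)
```

In this toy example only `cg_a` passes the assumed cutoff, since the predictions for `cg_b` run opposite to the measurements.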
Affiliation(s)
- Baoshan Ma
- College of Information Science and Technology, Dalian Maritime University, Dalian, Liaoning Province, China
| | - Catherine Allard
- Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, Quebec, Canada
| | - Luigi Bouchard
- Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, Quebec, Canada; Department of Biochemistry, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, Quebec, Canada; ECOGENE-21 Biocluster, CSSS de Chicoutimi, Chicoutimi, Quebec, Canada
| | - Patrice Perron
- Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, Quebec, Canada; Department of Medicine, Faculty of Medicine and Life Sciences, Université de Sherbrooke, Sherbrooke, Quebec, Canada
| | - Murray A Mittleman
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA; Cardiovascular Epidemiology Research Unit, Beth Israel Deaconess Medical Center, Boston, MA, USA
| | - Marie-France Hivert
- Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, Quebec, Canada; Department of Medicine, Faculty of Medicine and Life Sciences, Université de Sherbrooke, Sherbrooke, Quebec, Canada; Department of Population Medicine, Harvard Pilgrim Health Care Institute, Harvard Medical School, Boston, MA, USA; Diabetes Unit, Massachusetts General Hospital, Boston, MA, USA
| | - Liming Liang
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| |
|
13
|
Fan S, Tang J, Li N, Zhao Y, Ai R, Zhang K, Wang M, Du W, Wang W. Integrative analysis with expanded DNA methylation data reveals common key regulators and pathways in cancers. NPJ Genom Med 2019; 4:2. [PMID: 30729033 PMCID: PMC6358616 DOI: 10.1038/s41525-019-0077-8] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 09/28/2018] [Accepted: 01/02/2019] [Indexed: 11/09/2022]
Abstract
The integration of genomic and DNA methylation data has been demonstrated to be a powerful strategy for understanding cancer mechanisms and identifying therapeutic targets. The TCGA consortium has mapped DNA methylation in thousands of cancer samples using the Illumina Infinium Human Methylation 450 K BeadChip (Illumina 450 K array), which covers only about 1.5% of the CpGs in the human genome. Increasing the coverage of the DNA methylome would therefore significantly increase the utility of the TCGA data. Here, we present a new model called EAGLING that can expand the Illumina 450 K array data 18 times to cover about 30% of the CpGs in the human genome. We applied it to analyze 13 cancers in TCGA. By integrating the expanded methylation, gene expression, and somatic mutation data, we identified the genes showing differential patterns in each of the 13 cancers. Many of the triple-evidenced genes identified in the majority of the cancers are biomarkers or potential biomarkers. Pan-cancer analysis also revealed the pathways in which the triple-evidenced genes are enriched, which include well-known pathways as well as new ones, such as the axonal guidance signaling pathway and pathways related to inflammatory processing or inflammation response. Triple-evidenced genes, particularly TNXB, RRM2, CELSR3, SLC16A3, FANCI, MMP9, MMP11, SIK1, and TRIM59, showed superior predictive power in both tumor diagnosis and prognosis. These results demonstrate that integrative analysis using the expanded methylation data is powerful in identifying critical genes and pathways that may serve as new therapeutic targets.
Affiliation(s)
- Shicai Fan
- School of Automation Engineering, University of Electronic Science and Technology of China, 611731 Chengdu, Sichuan, China; Center for Informational Biology, University of Electronic Science and Technology of China, 611731 Chengdu, Sichuan, China; Department of Chemistry and Biochemistry, University of California, San Diego, CA 92093-0359, USA; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, 130012 Changchun, China
| | - Jianxiong Tang
- School of Automation Engineering, University of Electronic Science and Technology of China, 611731 Chengdu, Sichuan, China
| | - Nan Li
- Department of Chemistry and Biochemistry, University of California, San Diego, CA 92093-0359, USA
| | - Ying Zhao
- Department of Chemistry and Biochemistry, University of California, San Diego, CA 92093-0359, USA
| | - Rizi Ai
- Department of Chemistry and Biochemistry, University of California, San Diego, CA 92093-0359, USA
| | - Kai Zhang
- Department of Chemistry and Biochemistry, University of California, San Diego, CA 92093-0359, USA
| | - Mengchi Wang
- Department of Chemistry and Biochemistry, University of California, San Diego, CA 92093-0359, USA
| | - Wei Du
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, 130012 Changchun, China
| | - Wei Wang
- Department of Chemistry and Biochemistry, University of California, San Diego, CA 92093-0359, USA; Department of Cellular and Molecular Medicine, University of California, San Diego, CA 92093-0359, USA
| |
|
14
|
Fan S, Tang J, Tian Q, Wu C. A robust fuzzy rule based integrative feature selection strategy for gene expression data in TCGA. BMC Med Genomics 2019; 12:14. [PMID: 30704464 PMCID: PMC6357346 DOI: 10.1186/s12920-018-0451-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 12/11/2022]
Abstract
BACKGROUND: Much research has been conducted on selecting gene signatures that can distinguish cancer patients from normal individuals. However, how to extract robust gene features remains an open question. METHODS: In this work, a gene signature selection strategy for TCGA data was proposed by integrating gene expression data, methylation data, and prior knowledge about cancer biomarkers. Unlike the traditional integration method, the expanded 450 K methylation data were used instead of the original 450 K array data, and the reported biomarkers were weighted in the feature selection. A fuzzy rule based classification method and a cross validation strategy were applied in model construction for performance evaluation. RESULTS: Our selected gene features showed prediction accuracy close to 100% in cross validation with the fuzzy rule based classification model on 6 cancers from TCGA. The cross validation performance of our proposed model is similar to that of other integrative models or the RNA-seq only model, while its prediction performance on independent data is clearly better than that of the other 5 models. The gene signatures extracted with our fuzzy rule based integrative feature selection strategy were more robust and had the potential to yield better prediction results. CONCLUSION: The results indicated that integrating expanded methylation data covers more genes and has greater capacity to retrieve signature genes than the original 450 K methylation data. Integrating the reported biomarkers was also a promising way to improve performance. The PTCHD3 gene was selected as a discriminating gene in 3 of the 6 cancers, which suggests that it might play an important role in cancer risk and warrants further investigation.
Affiliation(s)
- Shicai Fan
- School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan 611731, China
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, Sichuan 611731, China
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China
| | - Jianxiong Tang
- School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan 611731, China
| | - Qi Tian
- School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan 611731, China
| | - Chunguo Wu
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China
| |
|
15
|
Yu J, Peng J, Luan Z, Zheng F, Su W. MicroRNAs as a Novel Tool in the Diagnosis of Liver Lipid Dysregulation and Fatty Liver Disease. Molecules 2019; 24:molecules24020230. [PMID: 30634538 PMCID: PMC6358728 DOI: 10.3390/molecules24020230] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 11/11/2018] [Revised: 12/23/2018] [Accepted: 12/24/2018] [Indexed: 02/07/2023]
Abstract
In recent years, metabolic disorders, especially fatty liver disease, have been considered a major challenge to global health. Researchers have focused on expanding knowledge of the regulatory mechanisms behind these diseases and on developing new diagnostic tools and treatments. The pathophysiology of fatty liver disease is undoubtedly complex. Abnormal hepatic lipid accumulation is a major symptom of most metabolic diseases. Therefore, the identification of novel regulatory factors of lipid metabolism is important and meaningful. As a new diagnostic tool, microRNAs and their function in fatty liver disease have recently attracted attention in biological research. Accumulating evidence supports the influence of miRNAs on lipid metabolism. In this review, we discuss the potential role of miRNAs in liver lipid metabolism and the pathogenesis of fatty liver disease.
Affiliation(s)
- Jingwei Yu
- Shenzhen University Medical Center, Shenzhen University Health Science Center, Shenzhen 518060, China.
- Department of Biology, Guangdong Pharmaceutical University, Guangzhou 510006, China.
| | - Jun Peng
- Shenzhen University Medical Center, Shenzhen University Health Science Center, Shenzhen 518060, China.
| | - Zhilin Luan
- Advanced Institute for Medical Sciences, Dalian Medical University, Dalian, Liaoning 116044, China.
| | - Feng Zheng
- Advanced Institute for Medical Sciences, Dalian Medical University, Dalian, Liaoning 116044, China.
| | - Wen Su
- Shenzhen University Medical Center, Shenzhen University Health Science Center, Shenzhen 518060, China.
| |
|
16
|
Abstract
Aberrant DNA methylation is considered to be one of the most common hallmarks of cancer. Several recent advances in assessing the DNA methylome hold great promise for deciphering cancer-specific DNA methylation patterns. Herein, we present the current key technologies used for high-throughput, genome-wide detection of DNA methylation, and the available cancer-associated methylation databases. Additionally, we focus on the computational methods for preprocessing, analyzing, and interpreting cancer methylome data. We not only discuss the challenges of differentially methylated region calling and prediction model construction but also highlight biomarker investigation for cancer diagnosis, prognosis, and response to treatment. Finally, some emerging challenges in the computational analysis of cancer methylome data are summarized.
|