1
|
Jiang B, Quinn-Bohmann N, Diener C, Nathan VB, Han-Hallett Y, Reddivari L, Gibbons SM, Baloni P. Understanding disease-associated metabolic changes in human colon epithelial cells using iColonEpithelium metabolic reconstruction. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.10.22.619644. [PMID: 39484551 PMCID: PMC11526933 DOI: 10.1101/2024.10.22.619644] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2024]
Abstract
The colon epithelium plays a key role in host-microbiome interactions, allowing uptake of various nutrients and driving important metabolic processes. To unravel detailed metabolic activities in the human colon epithelium, our present study focuses on the generation of the first cell-type-specific genome-scale metabolic model (GEM) of human colonic epithelial cells, named iColonEpithelium. GEMs are powerful tools for exploring reactions and metabolites at the systems level and predicting flux distributions at steady state. Our cell-type-specific iColonEpithelium metabolic reconstruction captures genes specifically expressed in human colonic epithelial cells. iColonEpithelium is also capable of performing metabolic tasks specific to the cell type. A unique transport reaction compartment has been included to allow simulation of metabolic interactions with the gut microbiome. We used iColonEpithelium to identify metabolic signatures associated with inflammatory bowel disease. We integrated single-cell RNA sequencing data from Crohn's disease (CD) and ulcerative colitis (UC) samples with the iColonEpithelium metabolic network to predict metabolic signatures of colonocytes in CD and UC compared to healthy samples. We found that reactions in nucleotide interconversion, fatty acid synthesis and tryptophan metabolism were differentially regulated in CD and UC conditions, in accordance with experimental results. The iColonEpithelium metabolic network can be used to identify mechanisms at the cellular level, and it has the potential to be integrated with gut microbiome models to explore the metabolic interactions between host and gut microbiota under various conditions.
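The steady-state assumption mentioned in this abstract can be illustrated on a toy network. The sketch below is not taken from iColonEpithelium: the three-reaction chain, the stoichiometric matrix `S`, and the helpers `mass_balance_residual` and `is_steady_state` are hypothetical, and they only show what "steady state" means for a flux vector (every metabolite is produced as fast as it is consumed, i.e. S·v = 0).

```python
# Toy illustration of the steady-state constraint S @ v = 0 used in
# genome-scale metabolic models. The network is hypothetical:
#   R1: (uptake) -> A,   R2: A -> B,   R3: B -> (export)

# Rows are metabolites (A, B); columns are reactions (R1, R2, R3).
S = [
    [1, -1, 0],   # A: produced by R1, consumed by R2
    [0, 1, -1],   # B: produced by R2, consumed by R3
]


def mass_balance_residual(S, v):
    """Return the net production rate of each metabolite for fluxes v."""
    return [sum(coef * flux for coef, flux in zip(row, v)) for row in S]


def is_steady_state(S, v, tol=1e-9):
    """True when every metabolite is produced as fast as it is consumed."""
    return all(abs(r) <= tol for r in mass_balance_residual(S, v))


balanced = [5.0, 5.0, 5.0]     # same flux through the whole chain
unbalanced = [5.0, 3.0, 3.0]   # metabolite A accumulates

print(is_steady_state(S, balanced))
print(mass_balance_residual(S, unbalanced))
```

A full flux balance analysis would additionally optimize an objective over all flux vectors that satisfy this constraint and the reaction bounds, which requires a linear programming solver and is not shown here.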
|
2
|
Ma X, Thela SR, Zhao F, Yao B, Wen Z, Jin P, Zhao J, Chen L. Deep5hmC: predicting genome-wide 5-hydroxymethylcytosine landscape via a multimodal deep learning model. Bioinformatics 2024; 40:btae528. [PMID: 39196755 PMCID: PMC11379467 DOI: 10.1093/bioinformatics/btae528] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2024] [Revised: 08/09/2024] [Accepted: 08/27/2024] [Indexed: 08/30/2024] Open
Abstract
MOTIVATION: 5-Hydroxymethylcytosine (5hmC), a crucial epigenetic mark with a significant role in regulating tissue-specific gene expression, is essential for understanding the dynamic functions of the human genome. Despite its importance, predicting 5hmC modification across the genome remains a challenging task, especially when considering the complex interplay between DNA sequences and various epigenetic factors such as histone modifications and chromatin accessibility.
RESULTS: Using tissue-specific 5hmC sequencing data, we introduce Deep5hmC, a multimodal deep learning framework that integrates both the DNA sequence and epigenetic features such as histone modification and chromatin accessibility to predict genome-wide 5hmC modification. The multimodal design of Deep5hmC demonstrates remarkable improvement in predicting both qualitative and quantitative 5hmC modification compared to unimodal versions of Deep5hmC and state-of-the-art machine learning methods. This improvement is demonstrated through benchmarking on a comprehensive set of 5hmC sequencing data collected at four developmental stages during forebrain organoid development and across 17 human tissues. Compared to DeepSEA and random forest, Deep5hmC improves the area under the receiver operating characteristic curve (AUROC) by close to 4% and 17%, respectively, across four forebrain developmental stages, and by 6% and 27% across 17 human tissues, for predicting binary 5hmC modification sites; it improves the Spearman correlation coefficient by 8% and 22% across four forebrain developmental stages, and by 17% and 30% across 17 human tissues, for predicting continuous 5hmC modification. Notably, Deep5hmC showcases its practical utility by accurately predicting gene expression and identifying differentially hydroxymethylated regions (DhMRs) in a case-control study of Alzheimer's disease (AD). Deep5hmC significantly improves our understanding of tissue-specific gene regulation and facilitates the development of new biomarkers for complex diseases.
AVAILABILITY AND IMPLEMENTATION: Deep5hmC is available via https://github.com/lichen-lab/Deep5hmC.
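The two evaluation metrics named in this abstract, AUROC for binary 5hmC sites and the Spearman correlation coefficient for continuous 5hmC levels, can be computed as in the minimal sketch below. This is not the Deep5hmC evaluation code: the helper names (`average_ranks`, `auroc`, `pearson`, `spearman`) and the toy inputs are assumptions chosen for illustration, and only the Python standard library is used.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def auroc(labels, scores):
    """AUROC from the rank sum of positives (Mann-Whitney U); labels are 0/1."""
    ranks = average_ranks(scores)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: binary site labels with predicted scores,
# and observed versus predicted continuous levels.
site_labels = [0, 0, 1, 1]
site_scores = [0.1, 0.4, 0.35, 0.8]
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [10.0, 20.0, 40.0, 30.0]

print(auroc(site_labels, site_scores))
print(spearman(observed, predicted))
```

In practice a library such as scikit-learn (`roc_auc_score`) or SciPy (`spearmanr`) would normally be used; the hand-written versions here just make the definitions explicit.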
Affiliation(s)
- Xin Ma
  - Department of Biostatistics, University of Florida, Gainesville, FL 32603, United States
- Sai Ritesh Thela
  - Department of Biostatistics, University of Florida, Gainesville, FL 32603, United States
- Fengdi Zhao
  - Department of Biostatistics, University of Florida, Gainesville, FL 32603, United States
- Bing Yao
  - Department of Human Genetics, Emory University School of Medicine, Atlanta, GA 30322, United States
- Zhexing Wen
  - Department of Psychiatry and Behavioral Sciences, Emory University School of Medicine, Atlanta, GA 30322, United States
- Peng Jin
  - Department of Human Genetics, Emory University School of Medicine, Atlanta, GA 30322, United States
- Jinying Zhao
  - Department of Epidemiology, University of Florida, Gainesville, FL 32603, United States
- Li Chen
  - Department of Biostatistics, University of Florida, Gainesville, FL 32603, United States
|
3
|
Jin W, Xia Y, Thela SR, Liu Y, Chen L. In silico generation and augmentation of regulatory variants from massively parallel reporter assay using conditional variational autoencoder. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.06.25.600715. [PMID: 38979263 PMCID: PMC11230389 DOI: 10.1101/2024.06.25.600715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
Predicting the functional consequences of genetic variants in non-coding regions is a challenging problem. Massively parallel reporter assays (MPRAs), an in vitro high-throughput method, can simultaneously test thousands of variants for allele-specific regulatory activity. Nevertheless, the labelled variants identified by MPRAs, which show differential allelic regulatory effects on gene expression, are usually limited to the scale of hundreds, limiting their potential to be used as a training set for robust genome-wide prediction. To address this limitation, we propose a deep generative model, MpraVAE, to generate labelled variants in silico and augment the training sample size. By benchmarking on several MPRA datasets, we demonstrate that MpraVAE significantly improves the prediction performance for MPRA regulatory variants compared to the baseline method, conventional data augmentation approaches, and existing variant scoring methods. Taking autoimmune diseases as one example, we apply MpraVAE to perform a genome-wide prediction of regulatory variants and find that predicted regulatory variants are more enriched than background variants in enhancers, active histone marks, open chromatin regions in immune-related cell types, and chromatin states associated with promoter and enhancer activity and with binding sites of c-Myc and Pol II that regulate gene expression. Importantly, predicted regulatory variants are found to link to immune-related genes by leveraging chromatin loops and accessible chromatin, demonstrating the importance of MpraVAE in genetic and gene discovery for complex traits.
Affiliation(s)
- Weijia Jin
  - Department of Biostatistics, University of Florida, Gainesville, FL, 32603, USA
- Yi Xia
  - Department of Biostatistics, University of Florida, Gainesville, FL, 32603, USA
- Sai Ritesh Thela
  - Department of Biostatistics, University of Florida, Gainesville, FL, 32603, USA
- Yunlong Liu
  - Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN 46202, USA
- Li Chen
  - Department of Biostatistics, University of Florida, Gainesville, FL, 32603, USA
|
4
|
Jou V, Peña SM, Lehoczky JA. Regeneration-specific promoter switching facilitates Mest expression in the mouse digit tip to modulate neutrophil response. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.06.12.598713. [PMID: 38915675 PMCID: PMC11195169 DOI: 10.1101/2024.06.12.598713] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/26/2024]
Abstract
The mouse digit tip regenerates following amputation, a process mediated by a cellularly heterogeneous blastema. We previously found the gene Mest to be highly expressed in mesenchymal cells of the blastema and a strong candidate pro-regenerative gene. We now show Mest digit expression is regeneration-specific and not upregulated in post-amputation fibrosing proximal digits. Mest homozygous knockout mice exhibit delayed bone regeneration though no phenotype is found in paternal knockout mice, inconsistent with the defined maternal genomic imprinting of Mest. We demonstrate that promoter switching, not loss of imprinting, regulates biallelic Mest expression in the blastema and does not occur during embryogenesis, indicating a regeneration-specific mechanism. Requirement for Mest expression is tied to modulating neutrophil response, as revealed by scRNAseq and FACS comparing wildtype and knockout blastemas. Collectively, the imprinted gene Mest is required for proper digit tip regeneration and its blastema expression is facilitated by promoter switching for biallelic expression.
Affiliation(s)
- Vivian Jou
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
  - Department of Orthopedic Surgery, Brigham and Women’s Hospital, Boston, MA, USA
- Sophia M. Peña
  - Department of Orthopedic Surgery, Brigham and Women’s Hospital, Boston, MA, USA
- Jessica A. Lehoczky
  - Department of Orthopedic Surgery, Brigham and Women’s Hospital, Boston, MA, USA
|
5
|
Ma X, Thela SR, Zhao F, Yao B, Wen Z, Jin P, Zhao J, Chen L. Deep5hmC: Predicting genome-wide 5-Hydroxymethylcytosine landscape via a multimodal deep learning model. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.03.04.583444. [PMID: 38496575 PMCID: PMC10942288 DOI: 10.1101/2024.03.04.583444] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2024]
Abstract
5-hydroxymethylcytosine (5hmC), a critical epigenetic mark with a significant role in regulating tissue-specific gene expression, is essential for understanding the dynamic functions of the human genome. Using tissue-specific 5hmC sequencing data, we introduce Deep5hmC, a multimodal deep learning framework that integrates both the DNA sequence and the histone modification information to predict genome-wide 5hmC modification. The multimodal design of Deep5hmC demonstrates remarkable improvement in predicting both qualitative and quantitative 5hmC modification compared to unimodal versions of Deep5hmC and state-of-the-art machine learning methods. This improvement is demonstrated through benchmarking on a comprehensive set of 5hmC sequencing data collected at four time points during forebrain organoid development and across 17 human tissues. Notably, Deep5hmC showcases its practical utility by accurately predicting gene expression and identifying differentially hydroxymethylated regions in a case-control study of Alzheimer's disease.
Affiliation(s)
- Xin Ma
  - Department of Biostatistics, University of Florida, Gainesville, FL, 32603, USA
- Sai Ritesh Thela
  - Department of Biostatistics, University of Florida, Gainesville, FL, 32603, USA
- Fengdi Zhao
  - Department of Biostatistics, University of Florida, Gainesville, FL, 32603, USA
- Bing Yao
  - Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA
- Zhexing Wen
  - Department of Psychiatry and Behavioral Sciences, Emory University School of Medicine, Atlanta, GA, 30322, USA
- Peng Jin
  - Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA
- Jinying Zhao
  - Department of Epidemiology, University of Florida, Gainesville, FL, 32603, USA
- Li Chen
  - Department of Biostatistics, University of Florida, Gainesville, FL, 32603, USA
|
6
|
Fu ZH, He SZ, Wu Y, Zhao GR. Design and deep learning of synthetic B-cell-specific promoters. Nucleic Acids Res 2023; 51:11967-11979. [PMID: 37889080 PMCID: PMC10681721 DOI: 10.1093/nar/gkad930] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2023] [Revised: 09/20/2023] [Accepted: 10/10/2023] [Indexed: 10/28/2023] Open
Abstract
Synthetic biology and deep learning synergistically revolutionize our ability to decode and recode DNA regulatory grammar. B-cell-specific transcriptional regulation is intricate, and unlocking the potential of B-cell-specific promoters as synthetic elements is important for B-cell engineering. Here, we designed and synthesized as a pool 23,640 B-cell-specific promoters that exhibit a larger sequence space and B-cell-specific expression, and enable diverse transcriptional patterns in B-cells. Using massively parallel reporter assays (MPRA), we deciphered the sequence features that regulate promoter transcription, including motifs and motif syntax (their combination and distance). Finally, we built and trained a deep learning model capable of predicting the transcriptional strength of the immunoglobulin V gene promoter directly from sequence. Prediction of thousands of promoter variants identified in the global human population shows that polymorphisms in promoters influence the transcription of immunoglobulin V genes, which may contribute to individual differences in adaptive humoral immune responses. Our work helps to decipher the transcription mechanism in immunoglobulin genes and offers thousands of non-similar promoters for B-cell engineering.
Affiliation(s)
- Zong-Heng Fu
  - Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
  - Frontiers Research Institute for Synthetic Biology, Tianjin University, Tianjin 300072, China
- Si-Zhe He
  - Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
  - Frontiers Research Institute for Synthetic Biology, Tianjin University, Tianjin 300072, China
- Yi Wu
  - Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
  - Frontiers Research Institute for Synthetic Biology, Tianjin University, Tianjin 300072, China
- Guang-Rong Zhao
  - Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
  - Frontiers Research Institute for Synthetic Biology, Tianjin University, Tianjin 300072, China
|