1
Lin TC, Tsai CH, Shiau CK, Huang JH, Tsai HK. Predicting splicing patterns from the transcription factor binding sites in the promoter with deep learning. BMC Genomics 2024; 25:830. [PMID: 39227799 PMCID: PMC11373144 DOI: 10.1186/s12864-024-10667-7] [Received: 10/30/2022] [Accepted: 07/25/2024] [Indexed: 09/05/2024] Open
Abstract
BACKGROUND Alternative splicing is a pivotal mechanism of post-transcriptional modification that contributes to transcriptome plasticity and proteome diversity in metazoan cells. Although many splicing regulatory mechanisms around exon/intron regions are known, the relationship between promoter-bound transcription factors and downstream alternative splicing remains largely unexplored. RESULTS In this study, we present computational approaches to unravel the regulatory relationship between promoter-bound transcription factor binding sites (TFBSs) and splicing patterns. We curated a fine dataset that includes DNase I hypersensitive site sequencing and transcriptomes across fifteen human tissues from ENCODE. Specifically, we proposed different representations of TF binding context and splicing patterns to examine the associations between the promoter and downstream splicing events. While machine learning models demonstrated potential in predicting splicing patterns from TFBS occupancies, careful examination with different cross-validation methods revealed limited generalization when predicting the splicing forms of singleton genes across diverse tissues. We further investigated the association between alterations in individual TFBSs at promoters and shifts in exon splicing efficiency. Our results demonstrate that convolutional neural network (CNN) models trained on TF binding changes in promoters can predict changes in splicing patterns. Furthermore, a systematic in silico substitution analysis of the CNN models highlighted several potential splicing regulators. Notably, empirical validation with K562 CTCFL shRNA knock-down data showed a significant role for CTCFL in splicing regulation. CONCLUSION Our findings highlight the potential role of promoter-bound TFBSs in regulating downstream splicing patterns and provide insights for discovering alternative splicing regulation.
Affiliation(s)
- Tzu-Chieh Lin
  - Institute of Information Science, Academia Sinica, Taipei, 11529, Taiwan
- Cheng-Hung Tsai
  - Institute of Information Science, Academia Sinica, Taipei, 11529, Taiwan
- Cheng-Kai Shiau
  - Institute of Information Science, Academia Sinica, Taipei, 11529, Taiwan
- Jia-Hsin Huang
  - Institute of Information Science, Academia Sinica, Taipei, 11529, Taiwan
  - Taiwan AI Labs & Foundation, Taipei, 10351, Taiwan
- Huai-Kuang Tsai
  - Institute of Information Science, Academia Sinica, Taipei, 11529, Taiwan
  - Taiwan AI Labs & Foundation, Taipei, 10351, Taiwan
2
Pugacheva EM, Bhatt DN, Rivero-Hinojosa S, Tajmul M, Fedida L, Price E, Ji Y, Loukinov D, Strunnikov AV, Ren B, Lobanenkov VV. BORIS/CTCFL epigenetically reprograms clustered CTCF binding sites into alternative transcriptional start sites. Genome Biol 2024; 25:40. [PMID: 38297316 PMCID: PMC10832218 DOI: 10.1186/s13059-024-03175-0] [Received: 05/17/2023] [Accepted: 01/15/2024] [Indexed: 02/02/2024] Open
Abstract
BACKGROUND Pervasive usage of alternative promoters leads to the deregulation of gene expression in carcinogenesis and may drive the emergence of new genes in spermatogenesis. However, little is known regarding the mechanisms underpinning the activation of alternative promoters. RESULTS Here we describe how alternative cancer-testis-specific transcription is activated. We show that intergenic and intronic CTCF binding sites, which are transcriptionally inert in normal somatic cells, could be epigenetically reprogrammed into active de novo promoters in germ and cancer cells. BORIS/CTCFL, the testis-specific paralog of the ubiquitously expressed CTCF, triggers the epigenetic reprogramming of CTCF sites into units of active transcription. BORIS binding initiates the recruitment of the chromatin remodeling factor, SRCAP, followed by the replacement of H2A histone with H2A.Z, resulting in a more relaxed chromatin state in the nucleosomes flanking the CTCF binding sites. The relaxation of chromatin around CTCF binding sites facilitates the recruitment of multiple additional transcription factors, thereby activating transcription from a given binding site. We demonstrate that the epigenetically reprogrammed CTCF binding sites can drive the expression of cancer-testis genes, long noncoding RNAs, retro-pseudogenes, and dormant transposable elements. CONCLUSIONS Thus, BORIS functions as a transcription factor that epigenetically reprograms clustered CTCF binding sites into transcriptional start sites, promoting transcription from alternative promoters in both germ cells and cancer cells.
Affiliation(s)
- Elena M Pugacheva
  - Molecular Pathology Section, Laboratory of Immunogenetics, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, 20892, USA
- Dharmendra Nath Bhatt
  - Molecular Pathology Section, Laboratory of Immunogenetics, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, 20892, USA
- Samuel Rivero-Hinojosa
  - Center for Cancer and Immunology Research, Children's National Research Institute, Washington, DC, 20010, USA
- Md Tajmul
  - Molecular Pathology Section, Laboratory of Immunogenetics, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, 20892, USA
- Liron Fedida
  - Molecular Pathology Section, Laboratory of Immunogenetics, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, 20892, USA
- Emma Price
  - Molecular Pathology Section, Laboratory of Immunogenetics, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, 20892, USA
- Yon Ji
  - Molecular Pathology Section, Laboratory of Immunogenetics, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, 20892, USA
- Dmitri Loukinov
  - Molecular Pathology Section, Laboratory of Immunogenetics, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, 20892, USA
- Alexander V Strunnikov
  - Guangzhou Institutes of Biomedicine and Health, Molecular Epigenetics Laboratory, 190 Kai Yuan Avenue, Science Park, Guangzhou, 510530, China
- Bing Ren
  - Ludwig Institute for Cancer Research, 9500 Gilman Drive, La Jolla, CA, 92093, USA
  - Department of Cellular and Molecular Medicine, Center for Epigenomics, Moores Cancer Center and Institute of Genomic Medicine, University of California, San Diego School of Medicine, La Jolla, CA, 92093-0653, USA
- Victor V Lobanenkov
  - Molecular Pathology Section, Laboratory of Immunogenetics, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, 20892, USA
3
Sati L, Soygur B, Goksu E, Bassorgun CI, McGrath J. CTCFL expression is associated with cerebral vascular abnormalities. Tissue Cell 2021; 72:101528. [PMID: 33756271 DOI: 10.1016/j.tice.2021.101528] [Received: 09/29/2020] [Revised: 02/06/2021] [Accepted: 03/11/2021] [Indexed: 10/21/2022]
Abstract
CTCFL is expressed in testis, oocytes and embryonic stem cells, is aberrantly expressed in malignant cells, and is classified as a cancer-testis gene. We have previously shown, using a tetracycline-inducible Ctcfl transgene, that inappropriate expression of Ctcfl negatively impacts fetal development and causes early postnatal lethality in the mouse. The affected pups displayed severe vascular abnormalities and localized hemorrhages in the brain evocative of cerebral cavernous malformations (CCM) and arteriovenous malformations (AVM) in humans. Thus, we aimed to analyze (a) the presence of the CCM-related proteins CCM1/KRIT1, CCM2/malcavernin and CCM3/PDCD10 in Ctcfl transgenic animals and (b) whether CTCFL is expressed in human CCM and AVM tissues. Ctcfl transgenic animals exhibited increased CD31 expression in vascular areas of the dermis and periadnexal regions, but no difference was observed in vWF or α-SMA expression. The CCM-related proteins CCM1/KRIT1, CCM2/malcavernin and CCM3/PDCD10 were aberrantly expressed in coronal sections of the head in transgenic animals. We also observed CTCFL expression in human CCMs and AVMs. The finding that induced expression of CTCFL results in vascular brain malformations in mice, combined with the presence of CTCFL in human vascular malformations, provides new insights into the role of this gene in vascular development in humans.
Affiliation(s)
- Leyla Sati
  - Department of Histology and Embryology, Akdeniz University School of Medicine, Antalya, Turkey
- Bikem Soygur
  - Department of Histology and Embryology, Akdeniz University School of Medicine, Antalya, Turkey
  - Department of Obstetrics, Gynecology and Reproductive Sciences, Center for Reproductive Sciences, Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California San Francisco, San Francisco, CA, USA
- Ethem Goksu
  - Department of Neurosurgery, Akdeniz University School of Medicine, Antalya, Turkey
- James McGrath
  - Departments of Genetics and Comparative Medicine, Yale University School of Medicine, New Haven, CT, USA
4
Abstract
CCCTC-binding factor (CTCF) is a conserved, essential regulator of chromatin architecture containing a unique array of 11 zinc fingers (ZFs). Gene duplication and sequence divergence during early amniote evolution generated the CTCF paralog Brother Of the Regulator of Imprinted Sites (BORIS), which has a DNA binding specificity identical to that of CTCF but divergent N- and C-termini. While healthy somatic tissues express only CTCF, CTCF and BORIS are normally co-expressed in meiotic and post-meiotic germ cells, and aberrant activation of BORIS occurs in tumors and some cancer cell lines. This has led to a model in which CTCF and BORIS compete for binding to some but not all genomic target sites; however, regulation of CTCF and BORIS genomic co-occupancy is not well understood. We recently addressed this issue, finding evidence for two major classes of CTCF target sequences: those containing single CTCF target sites (1xCTSes) and those containing two adjacent CTCF motifs (2xCTSes). The functional and chromatin structural features of 2xCTSes are distinct from those of 1xCTS-containing regions bound by a CTCF monomer. We suggest that these previously overlooked classes of CTCF binding regions may have different roles in regulating diverse chromatin-based phenomena, and may impact our understanding of heritable epigenetic regulation in cancer cells and normal germ cells.
Affiliation(s)
- Victor V Lobanenkov
  - Molecular Pathology Section, Laboratory of Immunogenetics, National Institute of Allergy and Infectious Diseases, National Institutes of Health, 5601 Fishers Ln, Rockville, MD, USA
- Gabriel E Zentner
  - Department of Biology, Indiana University, 915 E 3rd St, Bloomington, IN 47405, USA