1
|
Krautwurst S, Lamkiewicz K. RNA-protein interaction prediction without high-throughput data: An overview and benchmark of in silico tools. Comput Struct Biotechnol J 2024; 23:4036-4046. [PMID: 39610906 PMCID: PMC11603007 DOI: 10.1016/j.csbj.2024.11.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2024] [Revised: 11/05/2024] [Accepted: 11/05/2024] [Indexed: 11/30/2024]
Abstract
RNA-protein interactions (RPIs) are crucial for accurately operating various processes in and between organisms across kingdoms of life. Mutual detection of RPI partner molecules depends on distinct sequential, structural, or thermodynamic features, which can be determined via experimental and bioinformatic methods. Still, the underlying molecular mechanisms of many RPIs are poorly understood. It is further hypothesized that many RPIs are not even described yet. Computational RPI prediction is continuously challenged by the lack of data and detailed research of very specific examples. With the discovery of novel RPI complexes in all kingdoms of life, adaptations of existing RPI prediction methods are necessary. Continuously improving computational RPI prediction is key in advancing the understanding of RPIs in detail and supplementing experimental RPI determination. The growing amount of data covering more species and detailed mechanisms supports the accuracy of prediction tools, which in turn support specific experimental research on RPIs. Here, we give an overview of RPI prediction tools that do not use high-throughput data as the user's input. We review the tools according to their input, usability, and output. We then apply the tools to known RPI examples across different kingdoms of life. Our comparison shows that the investigated prediction tools do not favor a certain species and equip the user with results varying in degree of information, from an overall RPI score to detailed interacting residues. Furthermore, we provide a guide tree to help users decide which RPI prediction tool is appropriate for their available input data and desired output.
Affiliation(s)
- Sarah Krautwurst
  - RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, Leutragraben 1, 07743 Jena, Germany
  - European Virus Bioinformatics Center, Leutragraben 1, 07743 Jena, Germany
- Kevin Lamkiewicz
  - RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, Leutragraben 1, 07743 Jena, Germany
  - European Virus Bioinformatics Center, Leutragraben 1, 07743 Jena, Germany
  - German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Puschstr. 4, 04103 Leipzig, Germany
|
2
|
Sasse A, Ray D, Laverty KU, Tam CL, Albu M, Zheng H, Lyudovyk O, Dalal T, Nie K, Magis C, Notredame C, Weirauch MT, Hughes TR, Morris Q. Reconstructing the sequence specificities of RNA-binding proteins across eukaryotes. bioRxiv 2024:2024.10.15.618476. [PMID: 39464061 PMCID: PMC11507768 DOI: 10.1101/2024.10.15.618476] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/29/2024]
Abstract
RNA-binding proteins (RBPs) are key regulators of gene expression. Here, we introduce EuPRI (Eukaryotic Protein-RNA Interactions) - a freely available resource of RNA motifs for 34,736 RBPs from 690 eukaryotes. EuPRI includes in vitro binding data for 504 RBPs, including newly collected RNAcompete data for 174 RBPs, along with thousands of reconstructed motifs. We reconstruct these motifs with a new computational platform - Joint Protein-Ligand Embedding (JPLE) - which can detect distant homology relationships and map specificity-determining peptides. EuPRI quadruples the number of known RBP motifs, expanding the motif repertoire across all major eukaryotic clades, and assigning motifs to the majority of human RBPs. EuPRI drastically improves knowledge of RBP motifs in flowering plants. For example, it increases the number of Arabidopsis thaliana RBP motifs 7-fold, from 14 to 105. EuPRI also has broad utility for inferring post-transcriptional function and evolutionary relationships. We demonstrate this by predicting a role for 12 Arabidopsis thaliana RBPs in RNA stability and identifying rapid and recent evolution of post-transcriptional regulatory networks in worms and plants. In contrast, the vertebrate RNA motif set has remained relatively stable after its drastic expansion between the metazoan and vertebrate ancestors. EuPRI represents a powerful resource for the study of gene regulation across eukaryotes.
Affiliation(s)
- Alexander Sasse
  - Department of Molecular Genetics, University of Toronto, Toronto, ON Canada
  - Donnelly Centre, University of Toronto, Toronto, ON Canada
  - Department of Computer Science, University of Washington, Seattle, WA, USA
  - Vector Institute, Toronto, ON Canada
- Debashish Ray
  - Donnelly Centre, University of Toronto, Toronto, ON Canada
- Kaitlin U Laverty
  - Department of Molecular Genetics, University of Toronto, Toronto, ON Canada
  - Donnelly Centre, University of Toronto, Toronto, ON Canada
  - Vector Institute, Toronto, ON Canada
  - Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Cyrus L Tam
  - Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, USA
  - Graduate Program in Computational Biology and Medicine, Weill-Cornell Graduate School, New York, NY, USA
- Mihai Albu
  - Donnelly Centre, University of Toronto, Toronto, ON Canada
- Hong Zheng
  - Donnelly Centre, University of Toronto, Toronto, ON Canada
- Olga Lyudovyk
  - Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, USA
  - Graduate Program in Computational Biology and Medicine, Weill-Cornell Graduate School, New York, NY, USA
- Taykhoom Dalal
  - Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, USA
  - Graduate Program in Computational Biology and Medicine, Weill-Cornell Graduate School, New York, NY, USA
- Kate Nie
  - Department of Molecular Genetics, University of Toronto, Toronto, ON Canada
  - Donnelly Centre, University of Toronto, Toronto, ON Canada
  - Vector Institute, Toronto, ON Canada
- Cedrik Magis
  - Centre for Genomic Regulation, The Barcelona Institute of Science and Technology, Barcelona, Spain
  - Universitat Pompeu Fabra, Barcelona, Spain
- Cedric Notredame
  - Centre for Genomic Regulation, The Barcelona Institute of Science and Technology, Barcelona, Spain
  - Universitat Pompeu Fabra, Barcelona, Spain
- Matthew T Weirauch
  - Center for Autoimmune Genomics and Etiology, Divisions of Allergy & Immunology, Human Genetics, Biomedical Informatics and Developmental Biology, Cincinnati Children's Hospital, Cincinnati, OH, USA
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Timothy R Hughes
  - Department of Molecular Genetics, University of Toronto, Toronto, ON Canada
  - Donnelly Centre, University of Toronto, Toronto, ON Canada
- Quaid Morris
  - Department of Molecular Genetics, University of Toronto, Toronto, ON Canada
  - Donnelly Centre, University of Toronto, Toronto, ON Canada
  - Vector Institute, Toronto, ON Canada
  - Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, USA
  - Graduate Program in Computational Biology and Medicine, Weill-Cornell Graduate School, New York, NY, USA
  - Ontario Institute for Cancer Research, Toronto, ON, Canada
|
3
|
Kuret K, Amalietti AG, Jones DM, Capitanchik C, Ule J. Positional motif analysis reveals the extent of specificity of protein-RNA interactions observed by CLIP. Genome Biol 2022; 23:191. [PMID: 36085079 PMCID: PMC9461102 DOI: 10.1186/s13059-022-02755-2] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 12/07/2021] [Accepted: 08/22/2022] [Indexed: 12/01/2022]
Abstract
BACKGROUND Crosslinking and immunoprecipitation (CLIP) is a method used to identify in vivo RNA-protein binding sites on a transcriptome-wide scale. With the increasing amounts of available data for RNA-binding proteins (RBPs), it is important to understand to what degree the enriched motifs specify the RNA-binding profiles of RBPs in cells. RESULTS We develop positionally enriched k-mer analysis (PEKA), a computational tool for efficient analysis of enriched motifs from individual CLIP datasets, which minimizes the impact of technical and regional genomic biases by internal data normalization. We cross-validate PEKA with mCross and show that the use of input control for background correction is not required to yield high specificity of enriched motifs. We identify motif classes with common enrichment patterns across eCLIP datasets and across RNA regions, while also observing variations in the specificity and the extent of motif enrichment across eCLIP datasets, between variant CLIP protocols, and between CLIP and in vitro binding data. Thereby, we gain insights into the contributions of technical and regional genomic biases to the enriched motifs, and find how motif enrichment features relate to the domain composition and low-complexity regions of the studied proteins. CONCLUSIONS Our study provides insights into the overall contributions of regional binding preferences, protein domains, and low-complexity regions to the specificity of protein-RNA interactions, and shows the value of cross-motif and cross-RBP comparison for data interpretation. Our results are presented for exploratory analysis via an online platform in an RBP-centric and motif-centric manner ( https://imaps.goodwright.com/apps/peka/ ).
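The idea of positional k-mer enrichment named in this abstract can be illustrated with a small sketch. It is not the PEKA implementation: the function names, the fixed window, the externally supplied background sites, and the pseudocount are all assumptions made for illustration, whereas the abstract describes normalization derived from within each CLIP dataset. The sketch counts k-mers at each offset around a set of sites and compares foreground with background counts.

```python
from collections import Counter
from typing import Dict, Iterable


def positional_kmer_counts(
    sequence: str, sites: Iterable[int], k: int, window: int
) -> Dict[int, Counter]:
    """Count k-mers starting at each offset in [-window, window] around every site."""
    counts: Dict[int, Counter] = {
        offset: Counter() for offset in range(-window, window + 1)
    }
    for site in sites:
        for offset in range(-window, window + 1):
            start = site + offset
            # Skip k-mers that would run off either end of the sequence.
            if start >= 0 and start + k <= len(sequence):
                counts[offset][sequence[start:start + k]] += 1
    return counts


def positional_enrichment(
    foreground: Dict[int, Counter],
    background: Dict[int, Counter],
    kmer: str,
    pseudocount: float = 1.0,
) -> Dict[int, float]:
    """Foreground/background count ratio of `kmer` at each offset.

    Both inputs must have been built with the same window. The pseudocount
    (an illustrative choice) avoids division by zero.
    """
    return {
        offset: (foreground[offset][kmer] + pseudocount)
        / (background[offset][kmer] + pseudocount)
        for offset in foreground
    }


if __name__ == "__main__":
    rna = "AAUGUAAAUGUA"          # toy transcript (hypothetical)
    crosslinks = [2, 8]           # toy foreground sites (hypothetical)
    controls = [5]                # toy background site (hypothetical)
    fg = positional_kmer_counts(rna, crosslinks, k=3, window=1)
    bg = positional_kmer_counts(rna, controls, k=3, window=1)
    print(positional_enrichment(fg, bg, "UGU"))
```

Real CLIP analyses work on many transcripts and choose the background far more carefully; the sketch only shows the per-offset bookkeeping.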
Affiliation(s)
- Klara Kuret
  - National Institute of Chemistry, Hajdrihova 19, SI-1001 Ljubljana, Slovenia
  - Jozef Stefan International Postgraduate School, Jamova cesta 39, 1000 Ljubljana, Slovenia
  - The Francis Crick Institute, 1 Midland Road, London, NW1 1AT UK
- Aram Gustav Amalietti
  - National Institute of Chemistry, Hajdrihova 19, SI-1001 Ljubljana, Slovenia
  - The Francis Crick Institute, 1 Midland Road, London, NW1 1AT UK
- D. Marc Jones
  - The Francis Crick Institute, 1 Midland Road, London, NW1 1AT UK
  - UK Dementia Research Institute, King’s College London, London, UK
- Charlotte Capitanchik
  - The Francis Crick Institute, 1 Midland Road, London, NW1 1AT UK
  - UK Dementia Research Institute, King’s College London, London, UK
- Jernej Ule
  - National Institute of Chemistry, Hajdrihova 19, SI-1001 Ljubljana, Slovenia
  - The Francis Crick Institute, 1 Midland Road, London, NW1 1AT UK
  - UK Dementia Research Institute, King’s College London, London, UK
|
4
|
Sohrabi-Jahromi S, Söding J. Thermodynamic modeling reveals widespread multivalent binding by RNA-binding proteins. Bioinformatics 2021; 37:i308-i316. [PMID: 34252974 PMCID: PMC8275352 DOI: 10.1093/bioinformatics/btab300] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]
Abstract
MOTIVATION Understanding how proteins recognize their RNA targets is essential to elucidate regulatory processes in the cell. Many RNA-binding proteins (RBPs) form complexes or have multiple domains that allow them to bind to RNA in a multivalent, cooperative manner. They can thereby achieve higher specificity and affinity than proteins with a single RNA-binding domain. However, current approaches to de novo discovery of RNA binding motifs do not take multivalent binding into account. RESULTS We present Bipartite Motif Finder (BMF), which is based on a thermodynamic model of RBPs with two cooperatively binding RNA-binding domains. We show that bivalent binding is a common strategy among RBPs, yielding higher affinity and sequence specificity. We furthermore illustrate that the spatial geometry between the binding sites can be learned from bound RNA sequences. These discovered bipartite motifs are consistent with previously known motifs and binding behaviors. Our results demonstrate the importance of multivalent binding for RNA-binding proteins and highlight the value of bipartite motif models in representing the multivalency of protein-RNA interactions. AVAILABILITY AND IMPLEMENTATION BMF source code is available at https://github.com/soedinglab/bipartite_motif_finder under a GPL license. The BMF web server is accessible at https://bmf.soedinglab.org. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Salma Sohrabi-Jahromi
  - Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, Göttingen 37077, Germany
- Johannes Söding
  - Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, Göttingen 37077, Germany
  - Campus-Institut Data Science (CIDAS), Göttingen 37077, Germany
|
5
|
Ji H, Wang J, Lu B, Li J, Zhou J, Wang L, Xu S, Peng P, Hu X, Wang K. SP1 induced long non-coding RNA AGAP2-AS1 promotes cholangiocarcinoma proliferation via silencing of CDKN1A. Mol Med 2021; 27:10. [PMID: 33522895 PMCID: PMC7852216 DOI: 10.1186/s10020-020-00222-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/08/2020] [Accepted: 09/29/2020] [Indexed: 12/11/2022]
Abstract
BACKGROUND Long non-coding RNAs (lncRNAs) can regulate genes at various levels, including epigenetics, alternative splicing, and mRNA degradation. However, the molecular mechanism of lncRNAs in cholangiocarcinoma (CCA) remains unclear and deserves further exploration. METHODS We investigated the expression of AGAP2-AS1 in 32 CCA tissues and two CCA cell lines. The SP1-induced lncRNA AGAP2-AS1 has not previously been reported in CCA; knockdown and overexpression were used to investigate its biological role in vitro. ChIP and RIP were performed to verify the putative targets of AGAP2-AS1. RESULTS AGAP2-AS1 was significantly upregulated in CCA tumor tissues. SP1-induced AGAP2-AS1 plays an important role in tumorigenesis. AGAP2-AS1 knockdown significantly inhibited proliferation and caused apoptosis in CCA cells. In addition, we demonstrated that AGAP2-AS1 promotes the proliferation of CCA. CONCLUSIONS We conclude that the long non-coding RNA AGAP2-AS1 plays a role in promoting the proliferation of cholangiocarcinoma.
Affiliation(s)
- Hao Ji
  - Department of Oncology, Second Affiliated Hospital, Nanjing Medical University, Nanjing, 210000 Jiangsu People’s Republic of China
  - The Second Clinical Medical College of Nanjing Medical University, Nanjing, China
- Juan Wang
  - Department of Oncology, Second Affiliated Hospital, Nanjing Medical University, Nanjing, 210000 Jiangsu People’s Republic of China
  - The Second Clinical Medical College of Nanjing Medical University, Nanjing, China
- Binbin Lu
  - Department of Oncology, Second Affiliated Hospital, Nanjing Medical University, Nanjing, 210000 Jiangsu People’s Republic of China
  - The Second Clinical Medical College of Nanjing Medical University, Nanjing, China
- Juan Li
  - Department of Oncology, Second Affiliated Hospital, Nanjing Medical University, Nanjing, 210000 Jiangsu People’s Republic of China
  - The Second Clinical Medical College of Nanjing Medical University, Nanjing, China
- Jing Zhou
  - Department of Oncology, Second Affiliated Hospital, Nanjing Medical University, Nanjing, 210000 Jiangsu People’s Republic of China
  - The Second Clinical Medical College of Nanjing Medical University, Nanjing, China
- Li Wang
  - Department of Oncology, Second Affiliated Hospital, Nanjing Medical University, Nanjing, 210000 Jiangsu People’s Republic of China
  - The Second Clinical Medical College of Nanjing Medical University, Nanjing, China
- Shufen Xu
  - Department of Oncology, Second Affiliated Hospital, Nanjing Medical University, Nanjing, 210000 Jiangsu People’s Republic of China
  - The Second Clinical Medical College of Nanjing Medical University, Nanjing, China
- Peng Peng
  - Department of Oncology, Second Affiliated Hospital, Nanjing Medical University, Nanjing, 210000 Jiangsu People’s Republic of China
  - The Second Clinical Medical College of Nanjing Medical University, Nanjing, China
- Xuezhen Hu
  - Jiangsu Provincial Hospital of Traditional Chinese Medicine, Nanjing, China
  - Department of Radiology, Affiliated Hospital of Nanjing University of Chinese Medicine, Nanjing, 210000 Jiangsu People’s Republic of China
- Keming Wang
  - Department of Oncology, Second Affiliated Hospital, Nanjing Medical University, Nanjing, 210000 Jiangsu People’s Republic of China
  - The Second Clinical Medical College of Nanjing Medical University, Nanjing, China
|
6
|
Grønning AGB, Doktor TK, Larsen SJ, Petersen USS, Holm LL, Bruun GH, Hansen MB, Hartung AM, Baumbach J, Andresen BS. DeepCLIP: predicting the effect of mutations on protein-RNA binding with deep learning. Nucleic Acids Res 2020; 48:7099-7118. [PMID: 32558887 PMCID: PMC7367176 DOI: 10.1093/nar/gkaa530] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 10/23/2019] [Revised: 05/11/2020] [Accepted: 06/10/2020] [Indexed: 02/07/2023]
Abstract
Nucleotide variants can cause functional changes by altering protein-RNA binding in various ways that are not easy to predict. This can affect processes such as splicing, nuclear shuttling, and stability of the transcript. Therefore, correct modeling of protein-RNA binding is critical when predicting the effects of sequence variations. Many RNA-binding proteins recognize a diverse set of motifs and binding is typically also dependent on the genomic context, making this task particularly challenging. Here, we present DeepCLIP, the first method for context-aware modeling and predicting protein binding to RNA nucleic acids using exclusively sequence data as input. We show that DeepCLIP outperforms existing methods for modeling RNA-protein binding. Importantly, we demonstrate that DeepCLIP predictions correlate with the functional outcomes of nucleotide variants in independent wet lab experiments. Furthermore, we show how DeepCLIP binding profiles can be used in the design of therapeutically relevant antisense oligonucleotides, and to uncover possible position-dependent regulation in a tissue-specific manner. DeepCLIP is freely available as a stand-alone application and as a webtool at http://deepclip.compbio.sdu.dk.
Affiliation(s)
- Alexander Gulliver Bjørnholt Grønning
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, 5230 Odense M, Denmark
  - Villum Center for Bioanalytical Sciences, University of Southern Denmark, 5230 Odense M, Denmark
  - Department of Mathematics and Computer Science, University of Southern Denmark, 5230 Odense M, Denmark
- Thomas Koed Doktor
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, 5230 Odense M, Denmark
  - Villum Center for Bioanalytical Sciences, University of Southern Denmark, 5230 Odense M, Denmark
- Simon Jonas Larsen
  - Department of Mathematics and Computer Science, University of Southern Denmark, 5230 Odense M, Denmark
- Ulrika Simone Spangsberg Petersen
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, 5230 Odense M, Denmark
  - Villum Center for Bioanalytical Sciences, University of Southern Denmark, 5230 Odense M, Denmark
- Lise Lolle Holm
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, 5230 Odense M, Denmark
  - Villum Center for Bioanalytical Sciences, University of Southern Denmark, 5230 Odense M, Denmark
- Gitte Hoffmann Bruun
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, 5230 Odense M, Denmark
  - Villum Center for Bioanalytical Sciences, University of Southern Denmark, 5230 Odense M, Denmark
- Michael Birkerod Hansen
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, 5230 Odense M, Denmark
  - Villum Center for Bioanalytical Sciences, University of Southern Denmark, 5230 Odense M, Denmark
- Anne-Mette Hartung
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, 5230 Odense M, Denmark
  - Villum Center for Bioanalytical Sciences, University of Southern Denmark, 5230 Odense M, Denmark
- Jan Baumbach
  - Department of Mathematics and Computer Science, University of Southern Denmark, 5230 Odense M, Denmark
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences Weihenstephan, Technical University of Munich, 85354 Freising, Germany
- Brage Storstein Andresen
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, 5230 Odense M, Denmark
  - Villum Center for Bioanalytical Sciences, University of Southern Denmark, 5230 Odense M, Denmark
|
7
|
Luo X, Tu X, Ding Y, Gao G, Deng M. Expectation pooling: an effective and interpretable pooling method for predicting DNA-protein binding. Bioinformatics 2020; 36:1405-1412. [PMID: 31598637 PMCID: PMC7703793 DOI: 10.1093/bioinformatics/btz768] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 06/11/2019] [Revised: 09/21/2019] [Accepted: 10/05/2019] [Indexed: 11/14/2022]
Abstract
MOTIVATION Convolutional neural networks (CNNs) have outperformed conventional methods in modeling the sequence specificity of DNA-protein binding. While previous studies have built a connection between CNNs and probabilistic models, simple models of CNNs cannot achieve sufficient accuracy on this problem. Recently, some methods of neural networks have increased performance using complex neural networks whose results cannot be directly interpreted. However, it is difficult to combine probabilistic models and CNNs effectively to improve DNA-protein binding predictions. RESULTS In this article, we present a novel global pooling method: expectation pooling for predicting DNA-protein binding. Our pooling method stems naturally from the expectation maximization algorithm, and its benefits can be interpreted both statistically and via deep learning theory. Through experiments, we demonstrate that our pooling method improves the prediction performance of DNA-protein binding. Our interpretable pooling method combines probabilistic ideas with global pooling by taking the expectations of inputs without increasing the number of parameters. We also analyze the hyperparameters in our method and propose optional structures to help fit different datasets. We explore how to effectively utilize these novel pooling methods and show that combining statistical methods with deep learning is highly beneficial, which is promising and meaningful for future studies in this field. AVAILABILITY AND IMPLEMENTATION All code is publicly available at https://github.com/gao-lab/ePooling. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
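To make "taking the expectations of inputs" concrete, the following sketch shows one plausible reading of a global expectation-style pooling: a softmax-weighted mean of the activations along the sequence axis. The function name `expectation_pool` and the sharpness parameter `m` are assumptions for illustration and are not taken from the authors' repository. With `m = 0` the result is the plain mean (global average pooling), and as `m` grows it approaches the maximum (global max pooling), without adding any trainable parameters.

```python
import math
from typing import Sequence


def expectation_pool(activations: Sequence[float], m: float = 1.0) -> float:
    """Softmax-weighted mean of activations along one sequence axis.

    `m` is a fixed sharpness hyperparameter (illustrative assumption):
    m = 0 gives the arithmetic mean, large m approaches the maximum.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    # Subtract the largest exponent before exponentiating for numerical stability.
    peak = max(m * a for a in activations)
    weights = [math.exp(m * a - peak) for a in activations]
    total = sum(weights)
    return sum(w * a for w, a in zip(weights, activations)) / total


if __name__ == "__main__":
    feature_map = [1.0, 2.0, 3.0]  # toy convolution outputs (hypothetical)
    for sharpness in (0.0, 1.0, 50.0):
        print(sharpness, expectation_pool(feature_map, sharpness))
```

In a CNN this operation would be applied per filter in place of global max or average pooling; the pure-Python version only illustrates the arithmetic.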
Affiliation(s)
- Xinming Tu
  - Biomedical Pioneering Innovation Center (BIOPIC), Beijing Advanced Innovation Center for Genomics (ICG), Center for Bioinformatics (CBI), and the State Key Laboratory of Protein and Plant Gene Research at School of Life Sciences
- Yang Ding
  - Biomedical Pioneering Innovation Center (BIOPIC), Beijing Advanced Innovation Center for Genomics (ICG), Center for Bioinformatics (CBI), and the State Key Laboratory of Protein and Plant Gene Research at School of Life Sciences
- Ge Gao
  - Biomedical Pioneering Innovation Center (BIOPIC), Beijing Advanced Innovation Center for Genomics (ICG), Center for Bioinformatics (CBI), and the State Key Laboratory of Protein and Plant Gene Research at School of Life Sciences
- Minghua Deng
  - School of Mathematical Sciences
  - Center for Quantitative Biology, Peking University, Beijing 100871, China
|
8
|
Ghanbari M, Ohler U. Deep neural networks for interpreting RNA-binding protein target preferences. Genome Res 2020; 30:214-226. [PMID: 31992613 PMCID: PMC7050519 DOI: 10.1101/gr.247494.118] [Citation(s) in RCA: 65] [Impact Index Per Article: 13.0] [Received: 12/18/2018] [Accepted: 01/07/2020] [Indexed: 11/29/2022]
Abstract
Deep learning has become a powerful paradigm to analyze the binding sites of regulatory factors including RNA-binding proteins (RBPs), owing to its strength to learn complex features from possibly multiple sources of raw data. However, the interpretability of these models, which is crucial to improve our understanding of RBP binding preferences and functions, has not yet been investigated in significant detail. We have designed a multitask and multimodal deep neural network for characterizing in vivo RBP targets. The model incorporates not only the sequence but also the region type of the binding sites as input, which helps the model to boost the prediction performance. To interpret the model, we quantified the contribution of the input features to the predictive score of each RBP. Learning across multiple RBPs at once, we are able to avoid experimental biases and to identify the RNA sequence motifs and transcript context patterns that are the most important for the predictions of each individual RBP. Our findings are consistent with known motifs and binding behaviors and can provide new insights about the regulatory functions of RBPs.
Affiliation(s)
- Mahsa Ghanbari
  - The Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, 10115 Berlin, Germany
- Uwe Ohler
  - The Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, 10115 Berlin, Germany
  - Department of Biology, Humboldt Universität zu Berlin, 10117 Berlin, Germany
  - Department of Computer Science, Humboldt Universität zu Berlin, 10117 Berlin, Germany
|
9
|
Adinolfi M, Pietrosanto M, Parca L, Ausiello G, Ferrè F, Helmer-Citterich M. Discovering sequence and structure landscapes in RNA interaction motifs. Nucleic Acids Res 2019; 47:4958-4969. [PMID: 31162604 PMCID: PMC6547422 DOI: 10.1093/nar/gkz250] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 11/15/2018] [Revised: 02/22/2019] [Accepted: 04/09/2019] [Indexed: 12/16/2022]
Abstract
RNA molecules are able to bind proteins, DNA and other small or long RNAs using information at primary, secondary or tertiary structure level. Recent techniques that use cross-linking and immunoprecipitation of RNAs can detect these interactions and, if followed by high-throughput sequencing, molecules can be analysed to find recurrent elements shared by interactors, such as sequence and/or structure motifs. Many tools are able to find sequence motifs from lists of target RNAs, while others focus on structure using different approaches to find specific interaction elements. In this work, we make a systematic analysis of RBP-RNA and RNA-RNA datasets to better characterize the interaction landscape with information about multi-motifs on the same RNAs. To achieve this goal, we updated our BEAM algorithm to combine both sequence and structure information to create pairs of patterns that model motifs of interaction. This algorithm was applied to several RNA binding proteins and ncRNAs interactors, confirming already known motifs and discovering new ones. This landscape analysis on interaction variability reflects the diversity of target recognition and underlines that often both primary and secondary structure are involved in molecular recognition.
Affiliation(s)
- Marta Adinolfi
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
- Marco Pietrosanto
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
- Luca Parca
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
- Gabriele Ausiello
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
- Fabrizio Ferrè
  - Department of Pharmacy and Biotechnology (FaBiT), University of Bologna Alma Mater, Via Selmi 3, 40126 Bologna, Italy
- Manuela Helmer-Citterich
  - Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica snc, 00133 Rome, Italy
|
10
|
Mukherjee N, Wessels HH, Lebedeva S, Sajek M, Ghanbari M, Garzia A, Munteanu A, Yusuf D, Farazi T, Hoell JI, Akat KM, Akalin A, Tuschl T, Ohler U. Deciphering human ribonucleoprotein regulatory networks. Nucleic Acids Res 2019; 47:570-581. [PMID: 30517751 PMCID: PMC6344852 DOI: 10.1093/nar/gky1185] [Citation(s) in RCA: 46] [Impact Index Per Article: 7.7] [Received: 05/04/2018] [Accepted: 11/26/2018] [Indexed: 01/04/2023]
Abstract
RNA-binding proteins (RBPs) control and coordinate each stage in the life cycle of RNAs. Although in vivo binding sites of RBPs can now be determined genome-wide, most studies typically focused on individual RBPs. Here, we examined a large compendium of 114 high-quality transcriptome-wide in vivo RBP-RNA cross-linking interaction datasets generated by the same protocol in the same cell line and representing 64 distinct RBPs. Comparative analysis of categories of target RNA binding preference, sequence preference, and transcript region specificity was performed, and identified potential posttranscriptional regulatory modules, i.e. specific combinations of RBPs that bind to specific sets of RNAs and targeted regions. These regulatory modules represented functionally related proteins and exhibited distinct differences in RNA metabolism, expression variance, as well as subcellular localization. This integrative investigation of experimental RBP-RNA interaction evidence and RBP regulatory function in a human cell line will be a valuable resource for understanding the complexity of post-transcriptional regulation.
Affiliation(s)
- Neelanjan Mukherjee
- Department of Biochemistry and Molecular Genetics, RNA Bioscience Initiative, University of Colorado School of Medicine, Aurora, CO 80045, USA; Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, Berlin, Germany
- Hans-Hermann Wessels
- Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, Berlin, Germany; Institute of Biology, Humboldt University, 10099 Berlin, Germany
- Svetlana Lebedeva
- Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, Berlin, Germany
- Marcin Sajek
- Institute of Human Genetics, Polish Academy of Sciences, Poznan, Poland; Howard Hughes Medical Institute and Laboratory for RNA Molecular Biology, The Rockefeller University, 1230 York Ave, Box 186, New York, NY 10065, USA
- Mahsa Ghanbari
- Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, Berlin, Germany
- Aitor Garzia
- Howard Hughes Medical Institute and Laboratory for RNA Molecular Biology, The Rockefeller University, 1230 York Ave, Box 186, New York, NY 10065, USA
- Alina Munteanu
- Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, Berlin, Germany; Institute of Computer Science, Humboldt University, 10099 Berlin, Germany
- Dilmurat Yusuf
- Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, Berlin, Germany
- Thalia Farazi
- Howard Hughes Medical Institute and Laboratory for RNA Molecular Biology, The Rockefeller University, 1230 York Ave, Box 186, New York, NY 10065, USA
- Jessica I Hoell
- Howard Hughes Medical Institute and Laboratory for RNA Molecular Biology, The Rockefeller University, 1230 York Ave, Box 186, New York, NY 10065, USA; Department of Pediatric Oncology, Hematology and Clinical Immunology, Center for Child and Adolescent Health, Medical Faculty, Heinrich Heine University of Dusseldorf, Dusseldorf, Germany
- Kemal M Akat
- Howard Hughes Medical Institute and Laboratory for RNA Molecular Biology, The Rockefeller University, 1230 York Ave, Box 186, New York, NY 10065, USA
- Altuna Akalin
- Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, Berlin, Germany
- Thomas Tuschl
- Howard Hughes Medical Institute and Laboratory for RNA Molecular Biology, The Rockefeller University, 1230 York Ave, Box 186, New York, NY 10065, USA
- Uwe Ohler
- Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, Berlin, Germany; Institute of Biology, Humboldt University, 10099 Berlin, Germany; Institute of Computer Science, Humboldt University, 10099 Berlin, Germany
|
11
|
Ji H, Hui B, Wang J, Zhu Y, Tang L, Peng P, Wang T, Wang L, Xu S, Li J, Wang K. Long noncoding RNA MAPKAPK5-AS1 promotes colorectal cancer proliferation by partly silencing p21 expression. Cancer Sci 2019; 110:72-85. [PMID: 30343528 PMCID: PMC6317943 DOI: 10.1111/cas.13838] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 08/07/2018] [Revised: 09/25/2018] [Accepted: 10/04/2018] [Indexed: 02/06/2023] Open
Abstract
Colorectal cancer (CRC) is the third most common malignancy in the world, and long noncoding RNA (lncRNA) plays a critical role in carcinogenesis. Here, we report a novel lncRNA, MAPKAPK5-AS1, that acts as a critical oncogene in CRC. In addition, we attempted to explore the functions of MAPKAPK5-AS1 on tumor progression in vitro and in vivo. Quantitative RT-PCR was used to examine the expression of MAPKAPK5-AS1 in CRC tissues and cells. Expression of MAPKAPK5-AS1 was significantly upregulated in 50 CRC tissues, and increased expression of MAPKAPK5-AS1 was found to be associated with greater tumor size and advanced pathological stage in CRC patients. Knockdown of MAPKAPK5-AS1 significantly inhibited proliferation and caused apoptosis in CRC cells. We also found that p21 is a target of MAPKAPK5-AS1. In addition, we are the first to report that MAPKAPK5-AS1 plays a carcinogenic role in CRC. MAPKAPK5-AS1 is a novel prognostic biomarker and a potential therapeutic candidate for CRC.
Affiliation(s)
- Hao Ji
- Department of Oncology, Second Affiliated Hospital, Nanjing Medical University, Nanjing, China
- The Second Clinical Medical College of Nanjing Medical University, Nanjing, China
- Bingqing Hui
- Department of Oncology, Second Affiliated Hospital, Nanjing Medical University, Nanjing, China
- The Second Clinical Medical College of Nanjing Medical University, Nanjing, China
- Jirong Wang
- Department of Oncology, Second Affiliated Hospital, Nanjing Medical University, Nanjing, China
- The Second Clinical Medical College of Nanjing Medical University, Nanjing, China
- Ya Zhu
- Department of Oncology, Second Affiliated Hospital, Nanjing Medical University, Nanjing, China
- The Second Clinical Medical College of Nanjing Medical University, Nanjing, China
- Lingyu Tang
- The Second Clinical Medical College of Nanjing Medical University, Nanjing, China
- Institute of Digestive Endoscopy and Medical Center for Digestive Diseases, Second Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Peng Peng
- Department of Oncology, Second Affiliated Hospital, Nanjing Medical University, Nanjing, China
- The Second Clinical Medical College of Nanjing Medical University, Nanjing, China
- Tianjun Wang
- Department of Obstetrics and Gynecology, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Lijuan Wang
- The Second Clinical Medical College of Nanjing Medical University, Nanjing, China
- Department of Geriatrics, Second Affiliated Hospital, Nanjing Medical University, Nanjing, China
- Shufeng Xu
- Department of Oncology, Second Affiliated Hospital, Nanjing Medical University, Nanjing, China
- The Second Clinical Medical College of Nanjing Medical University, Nanjing, China
- Juan Li
- Department of Oncology, Second Affiliated Hospital, Nanjing Medical University, Nanjing, China
- The Second Clinical Medical College of Nanjing Medical University, Nanjing, China
- Keming Wang
- Department of Oncology, Second Affiliated Hospital, Nanjing Medical University, Nanjing, China
- The Second Clinical Medical College of Nanjing Medical University, Nanjing, China
|
12
|
Anders M, Chelysheva I, Goebel I, Trenkner T, Zhou J, Mao Y, Verzini S, Qian SB, Ignatova Z. Dynamic m6A methylation facilitates mRNA triaging to stress granules. Life Sci Alliance 2018; 1:e201800113. [PMID: 30456371 PMCID: PMC6238392 DOI: 10.26508/lsa.201800113] [Citation(s) in RCA: 125] [Impact Index Per Article: 17.9] [Received: 06/21/2018] [Revised: 06/25/2018] [Accepted: 06/26/2018] [Indexed: 12/21/2022] Open
Abstract
Reversible post-transcriptional modifications on messenger RNA emerge as prevalent phenomena in RNA metabolism. The most abundant among them is N6-methyladenosine (m6A) which is pivotal for RNA metabolism and function; its role in stress response remains elusive. We have discovered that in response to oxidative stress, transcripts are additionally m6A modified in their 5' vicinity. Distinct from that of the translationally active mRNAs, this methylation pattern provides a selective mechanism for triaging mRNAs from the translatable pool to stress-induced stress granules. These stress-induced newly methylated sites are selectively recognized by the YTH domain family 3 (YTHDF3) "reader" protein, thereby revealing a new role for YTHDF3 in shaping the selectivity of stress response. Our findings describe a previously unappreciated function for RNA m6A modification in oxidative-stress response and expand the breadth of physiological roles of m6A.
Affiliation(s)
- Maximilian Anders
- Institute for Biochemistry and Molecular Biology, Department of Chemistry, University of Hamburg, Hamburg, Germany
- Irina Chelysheva
- Institute for Biochemistry and Molecular Biology, Department of Chemistry, University of Hamburg, Hamburg, Germany
- Ingrid Goebel
- Institute for Biochemistry and Molecular Biology, Department of Chemistry, University of Hamburg, Hamburg, Germany
- Timo Trenkner
- Institute for Biochemistry and Molecular Biology, Department of Chemistry, University of Hamburg, Hamburg, Germany
- Jun Zhou
- Division of Nutritional Science, Cornell University, Ithaca, NY, USA
- Yuanhui Mao
- Division of Nutritional Science, Cornell University, Ithaca, NY, USA
- Silvia Verzini
- Institute for Biochemistry and Molecular Biology, Department of Chemistry, University of Hamburg, Hamburg, Germany
- Shu-Bing Qian
- Division of Nutritional Science, Cornell University, Ithaca, NY, USA
- Zoya Ignatova
- Institute for Biochemistry and Molecular Biology, Department of Chemistry, University of Hamburg, Hamburg, Germany
|