1
Schmerer N, Janga H, Aillaud M, Hoffmann J, Aznaourova M, Wende S, Steding H, Halder LD, Uhl M, Boldt F, Stiewe T, Nist A, Jerrentrup L, Kirschbaum A, Ruppert C, Rossbach O, Ntini E, Marsico A, Valasarajan C, Backofen R, Linne U, Pullamsetti SS, Schmeck B, Schulte LN. A searchable atlas of pathogen-sensitive lncRNA networks in human macrophages. Nat Commun 2025; 16:4733. [PMID: 40399309 PMCID: PMC12095776 DOI: 10.1038/s41467-025-60084-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2024] [Accepted: 05/14/2025] [Indexed: 05/23/2025]
Abstract
Long noncoding RNAs (lncRNAs) are crucial yet underexplored regulators of human immunity. Here we develop GRADR, a method integrating gradient profiling with RNA-binding proteome analysis, to map the protein interactomes of all expressed RNAs in a single experiment and to study mechanisms of lncRNA-mediated regulation in human primary macrophages. Applying GRADR alongside CRISPR-multiomics, we reveal a network of NFκB-dependent lncRNAs, including LINC01215, AC022816.1 and ROCKI, which modulate distinct aspects of macrophage immunity, particularly through interactions with mRNA-processing factors, such as hnRNP proteins. We further uncover the function of ROCKI in repressing the mRNA encoding the anti-inflammatory transcription factor GATA2, thus promoting macrophage activation. Lastly, all data are consolidated in the SMyLR web interface, a searchable reference catalog for exploring lncRNA functions and pathway dependencies in immune cells. Our results thus not only highlight the important functions of lncRNAs in immune regulation, but also provide a rich resource for lncRNA studies.
Affiliation(s)
- Nils Schmerer
  - Institute for Lung Research, Philipps University Marburg, 35043, Marburg, Germany
- Harshavardhan Janga
  - Institute for Lung Research, Philipps University Marburg, 35043, Marburg, Germany
- Michelle Aillaud
  - Institute for Lung Research, Philipps University Marburg, 35043, Marburg, Germany
- Janina Hoffmann
  - Institute for Lung Research, Philipps University Marburg, 35043, Marburg, Germany
- Marina Aznaourova
  - Institute for Lung Research, Philipps University Marburg, 35043, Marburg, Germany
- Sarah Wende
  - Institute for Lung Research, Philipps University Marburg, 35043, Marburg, Germany
- Henrike Steding
  - Institute for Lung Research, Philipps University Marburg, 35043, Marburg, Germany
- Luke D Halder
  - Institute for Lung Research, Philipps University Marburg, 35043, Marburg, Germany
- Michael Uhl
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, 79110, Freiburg, Germany
  - Signalling Research Centre CIBSS, University of Freiburg, 79104, Freiburg, Germany
- Fabian Boldt
  - Institute for Lung Research, Philipps University Marburg, 35043, Marburg, Germany
- Thorsten Stiewe
  - German Center for Lung Research (DZL), 35392, Giessen, Germany
  - Genomics Core Facility, Institute of Molecular Oncology, University of Marburg, 35043, Marburg, Germany
  - Institute for Lung Health (ILH), Justus-Liebig University, Giessen, Germany
- Andrea Nist
  - Genomics Core Facility, Institute of Molecular Oncology, University of Marburg, 35043, Marburg, Germany
- Lukas Jerrentrup
  - Institute for Lung Research, Philipps University Marburg, 35043, Marburg, Germany
- Andreas Kirschbaum
  - Department of Visceral, Thoracic and Vascular Surgery, University Hospital Giessen and Marburg (UKGM), Marburg, Germany
- Clemens Ruppert
  - German Center for Lung Research (DZL), 35392, Giessen, Germany
  - Universities of Giessen and Marburg Lung Center (UGMLC), Giessen, 35392, Germany
  - UGMLC Giessen Biobank and European IPF registry (eurIPFreg), Giessen, 35392, Germany
- Oliver Rossbach
  - Institute for Biochemistry, FB08, Justus Liebig University Giessen, 35392, Giessen, Germany
- Evgenia Ntini
  - Max Planck Institute for Molecular Genetics, 14195, Berlin, Germany
- Annalisa Marsico
  - Max Planck Institute for Molecular Genetics, 14195, Berlin, Germany
  - Institute for Computational Biology, Helmholtz Center, 85764, München, Germany
- Chanil Valasarajan
  - Universities of Giessen and Marburg Lung Center (UGMLC), Giessen, 35392, Germany
  - Max Planck Institute for Heart and Lung Research, Bad Nauheim, Germany
  - Excellence Cluster Cardio-Pulmonary Institute (CPI), Justus-Liebig University, Giessen, Germany
- Rolf Backofen
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, 79110, Freiburg, Germany
  - Signalling Research Centre CIBSS, University of Freiburg, 79104, Freiburg, Germany
- Uwe Linne
  - Mass spectrometry facility of the Department of Chemistry, Philipps University, Marburg, Germany
- Soni S Pullamsetti
  - German Center for Lung Research (DZL), 35392, Giessen, Germany
  - Institute for Lung Health (ILH), Justus-Liebig University, Giessen, Germany
  - Universities of Giessen and Marburg Lung Center (UGMLC), Giessen, 35392, Germany
  - Max Planck Institute for Heart and Lung Research, Bad Nauheim, Germany
  - Excellence Cluster Cardio-Pulmonary Institute (CPI), Justus-Liebig University, Giessen, Germany
- Bernd Schmeck
  - Institute for Lung Research, Philipps University Marburg, 35043, Marburg, Germany
  - German Center for Lung Research (DZL), 35392, Giessen, Germany
  - Institute for Lung Health (ILH), Justus-Liebig University, Giessen, Germany
  - Department of Medicine, Pulmonary and Critical Care Medicine, University Hospital Giessen and Marburg, Philipps University Marburg, Marburg, Germany
  - German Centre for Infectious Disease Research (DZIF), SYNMIKRO Centre for Synthetic Microbiology, Philipps University Marburg, Marburg, Germany
- Leon N Schulte
  - Institute for Lung Research, Philipps University Marburg, 35043, Marburg, Germany
  - German Center for Lung Research (DZL), 35392, Giessen, Germany
2
Shen X, Hou Y, Wang X, Zhang C, Liu J, Shen H, Wang W, Yang Y, Yang M, Li Y, Zhang J, Sun Y, Chen K, Shi L, Li X. A deep learning model for characterizing protein-RNA interactions from sequences at single-base resolution. Patterns (New York, N.Y.) 2025; 6:101150. [PMID: 39896261 PMCID: PMC11783876 DOI: 10.1016/j.patter.2024.101150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2024] [Revised: 09/18/2024] [Accepted: 12/11/2024] [Indexed: 02/04/2025]
Abstract
Protein-RNA interactions play pivotal roles in regulating transcription, translation, and RNA metabolism. Characterizing these interactions offers key insights into RNA dysregulation mechanisms. Here, we introduce Reformer, a deep learning model that predicts protein-RNA binding affinity from sequence data. Trained on 225 enhanced cross-linking and immunoprecipitation sequencing (eCLIP-seq) datasets encompassing 155 RNA-binding proteins across three cell lines, Reformer achieves high accuracy in predicting binding affinity at single-base resolution. The model uncovers binding motifs that are often undetectable through traditional eCLIP-seq methods. Notably, the motifs learned by Reformer are shown to correlate with RNA processing functions. Validation via electrophoretic mobility shift assays confirms the model's precision in quantifying the impact of mutations on RNA regulation. In summary, Reformer improves the resolution of RNA-protein interaction predictions and aids in prioritizing mutations that influence RNA regulation.
Affiliation(s)
- Xilin Shen
  - Tianjin Cancer Institute, Tianjin's Clinical Research Center for Cancer, National Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
  - Department of Pathology, Key Laboratory of Cancer Prevention and Therapy, Tianjin’s Clinical Research Center for Cancer, National Clinical Research Center for Cancer, Tianjin Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
  - State Key Laboratory of Experimental Hematology, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Key Laboratory of Breast Cancer Prevention and Therapy (Ministry of Education), Key Laboratory of Immune Microenvironment and Disease (Ministry of Education), Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
- Yayan Hou
  - Department of Pharmacy, The First Affiliated Hospital of Xi’an Jiaotong University, Xi’an 710061, China
  - State Key Laboratory of Experimental Hematology, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Key Laboratory of Breast Cancer Prevention and Therapy (Ministry of Education), Key Laboratory of Immune Microenvironment and Disease (Ministry of Education), Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
- Xueer Wang
  - The Third Department of Breast Cancer, Tianjin’s Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin 300070, China
- Chunyong Zhang
  - State Key Laboratory of Experimental Hematology, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Key Laboratory of Breast Cancer Prevention and Therapy (Ministry of Education), Key Laboratory of Immune Microenvironment and Disease (Ministry of Education), Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
- Jilei Liu
  - Tianjin Cancer Institute, Tianjin's Clinical Research Center for Cancer, National Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
- Hongru Shen
  - Tianjin Cancer Institute, Tianjin's Clinical Research Center for Cancer, National Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
- Wei Wang
  - Department of Epidemiology and Biostatistics, Tianjin's Clinical Research Center for Cancer, National Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Key Laboratory of Molecular Cancer Epidemiology of Tianjin, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
- Yichen Yang
  - Tianjin Cancer Institute, Tianjin's Clinical Research Center for Cancer, National Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
- Meng Yang
  - Tianjin Cancer Institute, Tianjin's Clinical Research Center for Cancer, National Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
- Yang Li
  - Tianjin Cancer Institute, Tianjin's Clinical Research Center for Cancer, National Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
- Jin Zhang
  - The Third Department of Breast Cancer, Tianjin’s Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin 300070, China
- Yan Sun
  - Department of Pathology, Key Laboratory of Cancer Prevention and Therapy, Tianjin’s Clinical Research Center for Cancer, National Clinical Research Center for Cancer, Tianjin Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
- Kexin Chen
  - Department of Epidemiology and Biostatistics, Tianjin's Clinical Research Center for Cancer, National Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Key Laboratory of Molecular Cancer Epidemiology of Tianjin, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
- Lei Shi
  - State Key Laboratory of Experimental Hematology, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Key Laboratory of Breast Cancer Prevention and Therapy (Ministry of Education), Key Laboratory of Immune Microenvironment and Disease (Ministry of Education), Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
- Xiangchun Li
  - Tianjin Cancer Institute, Tianjin's Clinical Research Center for Cancer, National Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
3
Rennie S. Deep Learning for Elucidating Modifications to RNA-Status and Challenges Ahead. Genes (Basel) 2024; 15:629. [PMID: 38790258 PMCID: PMC11121098 DOI: 10.3390/genes15050629] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2024] [Revised: 05/11/2024] [Accepted: 05/11/2024] [Indexed: 05/26/2024]
Abstract
RNA-binding proteins and chemical modifications to RNA play vital roles in the co- and post-transcriptional regulation of genes. In order to fully decipher their biological roles, it is an essential task to catalogue their precise target locations along with their preferred contexts and sequence-based determinants. Recently, deep learning approaches have significantly advanced in this field. These methods can predict the presence or absence of modification at specific genomic regions based on diverse features, particularly sequence and secondary structure, allowing us to decipher the highly non-linear sequence patterns and structures that underlie site preferences. This article provides an overview of how deep learning is being applied to this area, with a particular focus on the problem of mRNA-RBP binding, while also considering other types of chemical modification to RNA. It discusses how different types of model can handle sequence-based and/or secondary-structure-based inputs, the process of model training, including choice of negative regions and separating sets for testing and training, and offers recommendations for developing biologically relevant models. Finally, it highlights four key areas that are crucial for advancing the field.
Affiliation(s)
- Sarah Rennie
  - Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, 2200 Copenhagen, Denmark
4
Verma SK, Kuyumcu-Martinez MN. RNA binding proteins in cardiovascular development and disease. Curr Top Dev Biol 2024; 156:51-119. [PMID: 38556427 PMCID: PMC11896630 DOI: 10.1016/bs.ctdb.2024.01.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/02/2024]
Abstract
Congenital heart disease (CHD) is the most common birth defect, affecting >1.35 million newborn babies worldwide. CHD can lead to prenatal, neonatal, or postnatal lethality or to life-long cardiac complications. RNA binding protein (RBP) mutations or variants are emerging as contributors to CHDs. RBPs are wizards of gene regulation and are major contributors to the mRNA and protein landscape. However, not much is known about RBPs in the developing heart and their contributions to CHD. In this chapter, we will discuss our current knowledge about specific RBPs implicated in CHDs. We are in an exciting era to study RBPs using the currently available and highly successful RNA-based therapies and methodologies. Understanding how RBPs shape the developing heart will unveil their contributions to CHD. Identifying their target RNAs in the embryonic heart will ultimately lead to RNA-based treatments for congenital heart disease.
Affiliation(s)
- Sunil K Verma
  - Department of Molecular Physiology and Biological Physics, University of Virginia School of Medicine, Charlottesville, VA, United States
- Muge N Kuyumcu-Martinez
  - Department of Molecular Physiology and Biological Physics, University of Virginia School of Medicine, Charlottesville, VA, United States
  - Robert M. Berne Cardiovascular Research Center, University of Virginia School of Medicine, Charlottesville, VA, United States
  - University of Virginia Cancer Center, Charlottesville, VA, United States
5
Wu H, Liu X, Fang Y, Yang Y, Huang Y, Pan X, Shen HB. Decoding protein binding landscape on circular RNAs with base-resolution transformer models. Comput Biol Med 2024; 171:108175. [PMID: 38402841 DOI: 10.1016/j.compbiomed.2024.108175] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2023] [Revised: 01/16/2024] [Accepted: 02/18/2024] [Indexed: 02/27/2024]
Abstract
Circular RNAs (circRNAs), a class of endogenous RNA with a covalent loop structure, can regulate gene expression by serving as sponges for microRNAs and RNA-binding proteins (RBPs). To date, most computational methods for predicting RBP binding sites on circRNAs focus on circRNA fragments rather than full circRNA transcripts. These methods detect whether a circRNA fragment contains binding sites, but cannot determine where the binding sites are or how many there are on the circRNA transcript. We report a hybrid deep learning-based tool, CircSite, to predict RBP binding sites at single-nucleotide resolution and detect key contributing nucleotides on circRNA transcripts. CircSite takes advantage of convolutional neural networks (CNNs) and Transformer for learning local and global representations of circRNAs binding to RBPs, respectively. We construct 37 datasets of circRNAs interacting with proteins for benchmarking, and the experimental results show that CircSite offers accurate predictions of RBP binding nucleotides and detects key subsequences aligning well with known binding motifs. CircSite is an easy-to-use online webserver for predicting RBP binding sites on circRNA transcripts and is freely available at http://www.csbio.sjtu.edu.cn/bioinf/CircSite/.
Affiliation(s)
- Hehe Wu
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
- Xiaojian Liu
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
- Yi Fang
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
- Yang Yang
  - Center for Brain-Like Computing and Machine Intelligence, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
- Yan Huang
  - State Key Laboratory of Infrared Physics, Shanghai Institute of Technical Physics, Chinese Academy of Sciences, 500 Yutian Road, Shanghai, 200083, China
- Xiaoyong Pan
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
- Hong-Bin Shen
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
6
Horlacher M, Cantini G, Hesse J, Schinke P, Goedert N, Londhe S, Moyon L, Marsico A. A systematic benchmark of machine learning methods for protein-RNA interaction prediction. Brief Bioinform 2023; 24:bbad307. [PMID: 37635383 PMCID: PMC10516373 DOI: 10.1093/bib/bbad307] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/31/2023] [Revised: 06/15/2023] [Accepted: 07/18/2023] [Indexed: 08/29/2023]
Abstract
RNA-binding proteins (RBPs) are central actors of RNA post-transcriptional regulation. Experiments to profile binding sites of RBPs in vivo are limited to transcripts expressed in the experimental cell type, creating the need for computational methods to infer missing binding information. While numerous machine-learning-based methods have been developed for this task, their use of heterogeneous training and evaluation datasets across different sets of RBPs and CLIP-seq protocols makes a direct comparison of their performance difficult. Here, we compile a set of 37 machine learning (primarily deep learning) methods for in vivo RBP-RNA interaction prediction and systematically benchmark a subset of 11 representative methods across hundreds of CLIP-seq datasets and RBPs. Using homogenized sample pre-processing and two negative-class sample generation strategies, we evaluate methods in terms of predictive performance and assess the impact of neural network architectures and input modalities on model performance. We believe that this study will not only enable researchers to choose the optimal prediction method for their tasks at hand, but also aid method developers in developing novel, high-performing methods by introducing a standardized framework for their evaluation.
Affiliation(s)
- Marc Horlacher
  - Computational Health Center, Helmholtz Center Munich, Germany
  - School of Computation, Information and Technology, Technical University Munich (TUM), Germany
- Giulia Cantini
  - Computational Health Center, Helmholtz Center Munich, Germany
- Julian Hesse
  - Computational Health Center, Helmholtz Center Munich, Germany
- Patrick Schinke
  - Computational Health Center, Helmholtz Center Munich, Germany
- Nicolas Goedert
  - Computational Health Center, Helmholtz Center Munich, Germany
- Lambert Moyon
  - Computational Health Center, Helmholtz Center Munich, Germany
7
Bhandari N, Walambe R, Kotecha K, Khare SP. A comprehensive survey on computational learning methods for analysis of gene expression data. Front Mol Biosci 2022; 9:907150. [PMID: 36458095 PMCID: PMC9706412 DOI: 10.3389/fmolb.2022.907150] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/29/2022] [Accepted: 09/28/2022] [Indexed: 09/19/2023]
Abstract
Computational analysis methods including machine learning have a significant impact in the fields of genomics and medicine. High-throughput gene expression analysis methods such as microarray technology and RNA sequencing produce enormous amounts of data. Traditionally, statistical methods are used for comparative analysis of gene expression data. However, more complex analysis for classification of sample observations, or discovery of feature genes requires sophisticated computational approaches. In this review, we compile various statistical and computational tools used in analysis of expression microarray data. Even though the methods are discussed in the context of expression microarrays, they can also be applied for the analysis of RNA sequencing and quantitative proteomics datasets. We discuss the types of missing values, and the methods and approaches usually employed in their imputation. We also discuss methods of data normalization, feature selection, and feature extraction. Lastly, methods of classification and class discovery along with their evaluation parameters are described in detail. We believe that this detailed review will help the users to select appropriate methods for preprocessing and analysis of their data based on the expected outcome.
Affiliation(s)
- Nikita Bhandari
  - Computer Science Department, Symbiosis Institute of Technology, Symbiosis International (Deemed University), Pune, India
- Rahee Walambe
  - Electronics and Telecommunication Department, Symbiosis Institute of Technology, Symbiosis International (Deemed University), Pune, India
  - Symbiosis Center for Applied AI (SCAAI), Symbiosis International (Deemed University), Pune, India
- Ketan Kotecha
  - Computer Science Department, Symbiosis Institute of Technology, Symbiosis International (Deemed University), Pune, India
  - Symbiosis Center for Applied AI (SCAAI), Symbiosis International (Deemed University), Pune, India
- Satyajeet P. Khare
  - Symbiosis School of Biological Sciences, Symbiosis International (Deemed University), Pune, India
8
Arora V, Sanguinetti G. De novo prediction of RNA-protein interactions with graph neural networks. RNA (New York, N.Y.) 2022; 28:1469-1480. [PMID: 36008134 PMCID: PMC9745830 DOI: 10.1261/rna.079365.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2022] [Accepted: 08/17/2022] [Indexed: 06/15/2023]
Abstract
RNA-binding proteins (RBPs) are key co- and post-transcriptional regulators of gene expression, playing a crucial role in many biological processes. Experimental methods like CLIP-seq have enabled the identification of transcriptome-wide RNA-protein interactions for select proteins; however, the time- and resource-intensive nature of these technologies calls for the development of computational methods to complement their predictions. Here, we leverage recent, large-scale CLIP-seq experiments to construct a de novo predictor of RNA-protein interactions based on graph neural networks (GNN). We show that the GNN method allows us not only to predict missing links in an RNA-protein network, but also to predict the entire complement of targets of previously unassayed proteins, and even to reconstruct the entire network of RNA-protein interactions in different conditions based on minimal information. Our results demonstrate the potential of modern machine learning methods to extract useful information on post-transcriptional regulation from large data sets.
Affiliation(s)
- Viplove Arora
  - Data Science, Department of Physics, SISSA, Trieste 34136, Italy
9
Uhl M, Tran VD, Heyl F, Backofen R. RNAProt: an efficient and feature-rich RNA binding protein binding site predictor. Gigascience 2021; 10:giab054. [PMID: 34406415 PMCID: PMC8372218 DOI: 10.1093/gigascience/giab054] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 04/02/2021] [Revised: 05/18/2021] [Accepted: 07/27/2021] [Indexed: 12/11/2022]
Abstract
BACKGROUND: Cross-linking and immunoprecipitation followed by next-generation sequencing (CLIP-seq) is the state-of-the-art technique used to experimentally determine transcriptome-wide binding sites of RNA-binding proteins (RBPs). However, it relies on gene expression, which can be highly variable between conditions, and thus cannot provide a complete picture of the RBP binding landscape. This creates a demand for computational methods to predict missing binding sites. Although there exist various methods using traditional machine learning and lately also deep learning, we encountered several problems: many of these are not well documented or maintained, making them difficult to install and use, or are not even available. In addition, there can be efficiency issues, as well as little flexibility regarding options or supported features.
RESULTS: Here, we present RNAProt, an efficient and feature-rich computational RBP binding site prediction framework based on recurrent neural networks. We compare RNAProt with 1 traditional machine learning approach and 2 deep-learning methods, demonstrating its state-of-the-art predictive performance and better run time efficiency. We further show that its implemented visualizations capture known binding preferences and thus can help to understand what is learned. Since RNAProt supports various additional features (including user-defined features, which no other tool offers), we also present their influence on benchmark set performance. Finally, we show the benefits of incorporating additional features, specifically structure information, when learning the binding sites of a hairpin-loop-binding RBP.
CONCLUSIONS: RNAProt provides a complete framework for RBP binding site predictions, from data set generation through model training to the evaluation of binding preferences and prediction. It offers state-of-the-art predictive performance, as well as superior run time efficiency, while at the same time supporting more features and input types than any other tool available so far. RNAProt is easy to install and use, comes with comprehensive documentation, and is accompanied by informative statistics and visualizations. All this makes RNAProt a valuable tool to apply in future RBP binding site research.
Affiliation(s)
- Michael Uhl
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
- Van Dinh Tran
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
- Florian Heyl
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
- Rolf Backofen
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Koehler-Allee 106, 79110 Freiburg, Germany
  - Signalling Research Centres BIOSS and CIBSS, University of Freiburg, Schaenzlestr. 18, 79104 Freiburg, Germany