1
Zhang K, Yuan B, Dai X, Chen W, Zhang C, Qiao Y, Cao W, Chen Y, Duan X, Zhang X, Yang W, Li X, Zhao J, Liu K, Dong Z, Lu J. Selection and identification of DNA aptamer binding VDAC1 for tumor tissue imaging and targeted drug delivery. Int J Biol Macromol 2025; 306:141249. [PMID: 39984095 DOI: 10.1016/j.ijbiomac.2025.141249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/25/2024] [Revised: 02/14/2025] [Accepted: 02/16/2025] [Indexed: 02/23/2025]
Abstract
Hepatocellular carcinoma (HCC) represents a significant health concern. Identifying novel molecular targets is crucial for clinical diagnosis and targeted treatment of HCC. Aptamers are capable of binding specifically to cancer cells via target protein molecules. Consequently, aptamers are frequently employed to identify novel cancer biomarkers. The invasiveness of tumor cells is closely associated with the recurrence and metastasis of tumors. In this study, the highly invasive Huh7-P3 cells were initially constructed, and subsequently, several aptamers that could specifically recognize Huh7-P3 were developed using cell-based Systematic Evolution of Ligands by Exponential Enrichment (SELEX). The selected aptamer, designated S2-2, demonstrated the capacity to bind to multiple cancer cells. Furthermore, tissue imaging demonstrated that S2-2 specifically recognized HCC tissue while showing no binding to normal tissue. Subsequently, voltage-dependent anion channel 1 (VDAC1) was identified as a potential target for S2-2. Furthermore, Doxorubicin (Dox)-loaded S2-2 was shown to specifically kill target Huh7-P3 cells. In vivo fluorescence imaging revealed that S2-2 was capable of specifically targeting tumors. Importantly, S2-2-Dox enhanced the anti-tumor efficacy of Dox in a cell-line-derived xenograft (CDX) model. This study may provide a promising biomarker and molecular target for the clinical diagnosis and targeted therapy of cancers with high VDAC1 expression.
Affiliation(s)
- Kai Zhang
- Department of Pathophysiology, School of Basic Medical Sciences, Zhengzhou University, Zhengzhou, Henan Province 450001, PR China
- Baoyin Yuan
- Department of Pathophysiology, School of Basic Medical Sciences, Zhengzhou University, Zhengzhou, Henan Province 450001, PR China
- Xiaoshuo Dai
- Department of Pathophysiology, School of Basic Medical Sciences, Zhengzhou University, Zhengzhou, Henan Province 450001, PR China
- Wei Chen
- Department of Pathophysiology, School of Basic Medical Sciences, Zhengzhou University, Zhengzhou, Henan Province 450001, PR China
- Chengjuan Zhang
- Department of Pathology, Henan Cancer Hospital, Zhengzhou University, Zhengzhou, Henan Province 450003, PR China
- Yan Qiao
- Department of Pathophysiology, School of Basic Medical Sciences, Zhengzhou University, Zhengzhou, Henan Province 450001, PR China; Collaborative Innovation Center of Henan Province for Cancer Chemoprevention, Zhengzhou University, Zhengzhou, Henan Province 450001, PR China; State Key Laboratory of Metabolic Disorders and Esophageal Cancer Prevention & Treatment, Zhengzhou University, Zhengzhou, Henan Province 450052, PR China
- Wenbo Cao
- Department of Pathophysiology, School of Basic Medical Sciences, Zhengzhou University, Zhengzhou, Henan Province 450001, PR China; Collaborative Innovation Center of Henan Province for Cancer Chemoprevention, Zhengzhou University, Zhengzhou, Henan Province 450001, PR China; State Key Laboratory of Metabolic Disorders and Esophageal Cancer Prevention & Treatment, Zhengzhou University, Zhengzhou, Henan Province 450052, PR China
- Yihuan Chen
- Department of Pathophysiology, School of Basic Medical Sciences, Zhengzhou University, Zhengzhou, Henan Province 450001, PR China
- Xiaoxuan Duan
- Department of Pathophysiology, School of Basic Medical Sciences, Zhengzhou University, Zhengzhou, Henan Province 450001, PR China
- Xiaoyan Zhang
- Department of Pathophysiology, School of Basic Medical Sciences, Zhengzhou University, Zhengzhou, Henan Province 450001, PR China; Collaborative Innovation Center of Henan Province for Cancer Chemoprevention, Zhengzhou University, Zhengzhou, Henan Province 450001, PR China; State Key Laboratory of Metabolic Disorders and Esophageal Cancer Prevention & Treatment, Zhengzhou University, Zhengzhou, Henan Province 450052, PR China
- Wanjing Yang
- Department of Pathophysiology, School of Basic Medical Sciences, Zhengzhou University, Zhengzhou, Henan Province 450001, PR China; Collaborative Innovation Center of Henan Province for Cancer Chemoprevention, Zhengzhou University, Zhengzhou, Henan Province 450001, PR China; State Key Laboratory of Metabolic Disorders and Esophageal Cancer Prevention & Treatment, Zhengzhou University, Zhengzhou, Henan Province 450052, PR China
- Xiang Li
- Department of Pathophysiology, School of Basic Medical Sciences, Zhengzhou University, Zhengzhou, Henan Province 450001, PR China; Collaborative Innovation Center of Henan Province for Cancer Chemoprevention, Zhengzhou University, Zhengzhou, Henan Province 450001, PR China; State Key Laboratory of Metabolic Disorders and Esophageal Cancer Prevention & Treatment, Zhengzhou University, Zhengzhou, Henan Province 450052, PR China
- Jimin Zhao
- Department of Pathophysiology, School of Basic Medical Sciences, Zhengzhou University, Zhengzhou, Henan Province 450001, PR China; Collaborative Innovation Center of Henan Province for Cancer Chemoprevention, Zhengzhou University, Zhengzhou, Henan Province 450001, PR China; State Key Laboratory of Metabolic Disorders and Esophageal Cancer Prevention & Treatment, Zhengzhou University, Zhengzhou, Henan Province 450052, PR China
- Kangdong Liu
- Department of Pathophysiology, School of Basic Medical Sciences, Zhengzhou University, Zhengzhou, Henan Province 450001, PR China; Collaborative Innovation Center of Henan Province for Cancer Chemoprevention, Zhengzhou University, Zhengzhou, Henan Province 450001, PR China; State Key Laboratory of Metabolic Disorders and Esophageal Cancer Prevention & Treatment, Zhengzhou University, Zhengzhou, Henan Province 450052, PR China
- Ziming Dong
- Department of Pathophysiology, School of Basic Medical Sciences, Zhengzhou University, Zhengzhou, Henan Province 450001, PR China; Collaborative Innovation Center of Henan Province for Cancer Chemoprevention, Zhengzhou University, Zhengzhou, Henan Province 450001, PR China; State Key Laboratory of Metabolic Disorders and Esophageal Cancer Prevention & Treatment, Zhengzhou University, Zhengzhou, Henan Province 450052, PR China
- Jing Lu
- Department of Pathophysiology, School of Basic Medical Sciences, Zhengzhou University, Zhengzhou, Henan Province 450001, PR China; Collaborative Innovation Center of Henan Province for Cancer Chemoprevention, Zhengzhou University, Zhengzhou, Henan Province 450001, PR China; State Key Laboratory of Metabolic Disorders and Esophageal Cancer Prevention & Treatment, Zhengzhou University, Zhengzhou, Henan Province 450052, PR China.
2
Zhang C, Wang Q, Li Y, Teng A, Hu G, Wuyun Q, Zheng W. The Historical Evolution and Significance of Multiple Sequence Alignment in Molecular Structure and Function Prediction. Biomolecules 2024; 14:1531. [PMID: 39766238 PMCID: PMC11673352 DOI: 10.3390/biom14121531] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2024] [Revised: 11/24/2024] [Accepted: 11/27/2024] [Indexed: 01/11/2025] Open
Abstract
Multiple sequence alignment (MSA) has evolved into a fundamental tool in the biological sciences, playing a pivotal role in predicting molecular structures and functions. With broad applications in protein and nucleic acid modeling, MSAs continue to underpin advancements across a range of disciplines. MSAs are not only foundational for traditional sequence comparison techniques but also increasingly important in the context of artificial intelligence (AI)-driven advancements. Recent breakthroughs in AI, particularly in protein and nucleic acid structure prediction, rely heavily on the accuracy and efficiency of MSAs to enhance remote homology detection and guide spatial restraints. This review traces the historical evolution of MSA, highlighting its significance in molecular structure and function prediction. We cover the methodologies used for protein monomers, protein complexes, and RNA, while also exploring emerging AI-based alternatives, such as protein language models, as complementary or replacement approaches to traditional MSAs in application tasks. By discussing the strengths, limitations, and applications of these methods, this review aims to provide researchers with valuable insights into MSA's evolving role, equipping them to make informed decisions in structural prediction research.
Affiliation(s)
- Chenyue Zhang
- NITFID, School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin 300071, China
- Qinxin Wang
- Suzhou New & High-Tech Innovation Service Center, Suzhou 215011, China
- Yiyang Li
- NITFID, School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin 300071, China
- Anqi Teng
- Bioscience and Biomedical Engineering Thrust, Systems Hub, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511453, China
- Gang Hu
- NITFID, School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin 300071, China
- Qiqige Wuyun
- Department of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
- Wei Zheng
- NITFID, School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin 300071, China
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
3
Zhu M, Zuber J, Tan Z, Sharma G, Mathews DH. DecoyFinder: Identification of Contaminants in Sets of Homologous RNA Sequences. bioRxiv 2024:2024.10.12.618037. [PMID: 39464058 PMCID: PMC11507696 DOI: 10.1101/2024.10.12.618037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/29/2024]
Abstract
Motivation: RNA structure is essential for the function of many non-coding RNAs. Using multiple homologous sequences, which share structure and function, secondary structure can be predicted with much higher accuracy than with a single sequence. It can be difficult, however, to establish a set of homologous sequences when their structure is not yet known. We developed a method to identify sequences in a set of putative homologs that are in fact non-homologs. Results: Previously, we developed TurboFold to estimate conserved structure using multiple, unaligned RNA homologs. Here, we report that the positive predictive value of TurboFold is significantly reduced by the presence of contamination by non-homologous sequences, although the reduction is less than 1%. We developed a method called DecoyFinder, which applies machine learning trained with features determined by TurboFold, to detect sequences that are not homologous with the other sequences in the set. This method can identify approximately 45% of non-homologous sequences, at a rate of 5% misidentification of true homologous sequences. Availability: DecoyFinder and TurboFold are incorporated in RNAstructure, which is provided for free and open source under the GPL V2 license. It can be downloaded at http://rna.urmc.rochester.edu/RNAstructure.html.
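The two rates quoted in the abstract above (the fraction of non-homologs detected and the fraction of true homologs misidentified) can be computed from labelled predictions as in the following minimal sketch. The function name `detection_rates` and the toy labels are hypothetical illustrations and are not part of DecoyFinder or RNAstructure.

```python
def detection_rates(is_decoy, flagged):
    """Return (sensitivity, false_positive_rate) for a contaminant detector.

    is_decoy[i] -- True if sequence i is truly non-homologous (a decoy)
    flagged[i]  -- True if the detector labelled sequence i as a decoy
    Both lists are hypothetical inputs used only for illustration.
    """
    if len(is_decoy) != len(flagged):
        raise ValueError("label lists must have the same length")
    # Decoys that were correctly flagged.
    true_pos = sum(1 for d, f in zip(is_decoy, flagged) if d and f)
    # True homologs that were wrongly flagged.
    false_pos = sum(1 for d, f in zip(is_decoy, flagged) if not d and f)
    n_decoys = sum(1 for d in is_decoy if d)
    n_homologs = len(is_decoy) - n_decoys
    sensitivity = true_pos / n_decoys if n_decoys else 0.0
    false_positive_rate = false_pos / n_homologs if n_homologs else 0.0
    return sensitivity, false_positive_rate


# Toy example: 2 decoys (1 caught), 4 homologs (1 wrongly flagged).
toy_truth = [True, True, False, False, False, False]
toy_flags = [True, False, False, True, False, False]
print(detection_rates(toy_truth, toy_flags))
```

In these terms, the abstract reports a sensitivity of roughly 0.45 at a false-positive rate of 0.05.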
Affiliation(s)
- Mingyi Zhu
- Center for RNA Biology, University of Rochester Medical Center, Rochester, NY, United States
- Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, NY, United States
- Jeffrey Zuber
- Center for RNA Biology, University of Rochester Medical Center, Rochester, NY, United States
- Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, NY, United States
- Zhen Tan
- Center for RNA Biology, University of Rochester Medical Center, Rochester, NY, United States
- Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, NY, United States
- Gaurav Sharma
- University of Rochester, Department of Electrical and Computer Engineering, Rochester, NY, United States
- University of Rochester, Department of Computer Science, Rochester, NY, United States
- David H Mathews
- Center for RNA Biology, University of Rochester Medical Center, Rochester, NY, United States
- Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, NY, United States
4
Zhou Y, Pedrielli G, Zhang F, Wu T. Predicting RNA sequence-structure likelihood via structure-aware deep learning. BMC Bioinformatics 2024; 25:316. [PMID: 39350066 PMCID: PMC11443715 DOI: 10.1186/s12859-024-05916-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2024] [Accepted: 08/27/2024] [Indexed: 10/04/2024] Open
Abstract
BACKGROUND: The active functionalities of RNA are recognized to be heavily dependent on the structure and sequence. Therefore, a model that can accurately evaluate a design given RNA sequence-structure pairs would be a valuable tool for many researchers. Machine learning methods have been explored to develop such tools, showing promising results. However, two key issues remain. Firstly, the performance of machine learning models is affected by the features used to characterize RNA. Currently, there is no consensus on which features are the most effective for characterizing RNA sequence-structure pairs. Secondly, most existing machine learning methods extract features describing the entire RNA molecule. We argue that it is essential to define additional features that characterize nucleotides and specific sections of RNA structure to enhance the overall efficacy of the RNA design process. RESULTS: We develop two deep learning models for evaluating RNA sequence-secondary structure pairs. The first model, NU-ResNet, uses a convolutional neural network architecture that solves the aforementioned problems by explicitly encoding RNA sequence-structure information into a 3D matrix. Building upon NU-ResNet, our second model, NUMO-ResNet, incorporates additional information derived from the characterizations of RNA, specifically the 2D folding motifs. In this work, we introduce an automated method to extract these motifs based on fundamental secondary structure descriptions. We evaluate the performance of both models on an independent testing dataset. Our proposed models outperform the models from the literature on this independent testing dataset. To assess the robustness of our models, we conduct 10-fold cross validation. To evaluate the generalization ability of NU-ResNet and NUMO-ResNet across different RNA families, we train and test our proposed models in different RNA families. Our proposed models show superior performance compared to the models from the literature when tested across different independent RNA families. CONCLUSIONS: In this study, we propose two deep learning models, NU-ResNet and NUMO-ResNet, to evaluate RNA sequence-secondary structure pairs. These two models expand the field of data-driven approaches for learning RNA. Furthermore, these two models provide a new method to encode RNA sequence-secondary structure pairs.
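To make the idea of "encoding RNA sequence-structure information into a 3D matrix" concrete, here is a minimal sketch of one possible encoding. The abstract does not specify the layout used by NU-ResNet, so the channel scheme below (16 one-hot nucleotide-pair channels plus one base-pairing channel) and the names `parse_dot_bracket` and `encode_pair_tensor` are assumptions made purely for illustration.

```python
NUCS = "ACGU"  # assumed alphabet; other symbols raise ValueError


def parse_dot_bracket(structure):
    """Map each paired position to its partner in a dot-bracket string."""
    stack, pairs = [], {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            j = stack.pop()
            pairs[i], pairs[j] = j, i
        elif ch != ".":
            raise ValueError("unsupported symbol: " + ch)
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


def encode_pair_tensor(seq, structure):
    """Return an L x L x 17 nested list (hypothetical encoding).

    Channels 0-15: one-hot of the nucleotide pair (seq[i], seq[j]).
    Channel 16:    1 if positions i and j are base-paired, else 0.
    """
    if len(seq) != len(structure):
        raise ValueError("sequence and structure lengths differ")
    pairs = parse_dot_bracket(structure)
    n = len(seq)
    channels = len(NUCS) ** 2 + 1
    tensor = [[[0] * channels for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            k = NUCS.index(seq[i]) * len(NUCS) + NUCS.index(seq[j])
            tensor[i][j][k] = 1
            if pairs.get(i) == j:
                tensor[i][j][-1] = 1
    return tensor


tensor = encode_pair_tensor("GAAAC", "(...)")
print(len(tensor), len(tensor[0]), len(tensor[0][0]))
```

A tensor of this shape can be passed to a 2D convolutional network with the last axis treated as image channels, which is the general reason such encodings pair naturally with ResNet-style models.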
Affiliation(s)
- You Zhou
- School of Computing and Augmented Intelligence, Arizona State University, 699 S Mill Ave, Tempe, AZ, 85281, USA
- ASU-Mayo Center for Innovative Imaging, Arizona State University, 699 S Mill Ave, Tempe, AZ, 85281, USA
- Giulia Pedrielli
- School of Computing and Augmented Intelligence, Arizona State University, 699 S Mill Ave, Tempe, AZ, 85281, USA.
- ASU-Mayo Center for Innovative Imaging, Arizona State University, 699 S Mill Ave, Tempe, AZ, 85281, USA.
- Fei Zhang
- Department of Chemistry, Rutgers University, 73 Warren St, Newark, NJ, 07102, USA
- Teresa Wu
- School of Computing and Augmented Intelligence, Arizona State University, 699 S Mill Ave, Tempe, AZ, 85281, USA
- ASU-Mayo Center for Innovative Imaging, Arizona State University, 699 S Mill Ave, Tempe, AZ, 85281, USA
5
Chen K, Litfin T, Singh J, Zhan J, Zhou Y. MARS and RNAcmap3: The Master Database of All Possible RNA Sequences Integrated with RNAcmap for RNA Homology Search. Genomics Proteomics Bioinformatics 2024; 22:qzae018. [PMID: 38872612 PMCID: PMC12053375 DOI: 10.1093/gpbjnl/qzae018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Revised: 09/24/2023] [Accepted: 10/31/2023] [Indexed: 06/15/2024]
Abstract
The recent success of AlphaFold2 in protein structure prediction relied heavily on co-evolutionary information derived from homologous protein sequences found in the huge, integrated database of protein sequences (Big Fantastic Database). In contrast, the existing nucleotide databases were not consolidated to facilitate wider and deeper homology search. Here, we built a comprehensive database by incorporating the non-coding RNA (ncRNA) sequences from RNAcentral, the transcriptome assembly and metagenome assembly from metagenomics RAST (MG-RAST), the genomic sequences from Genome Warehouse (GWH), and the genomic sequences from MGnify, in addition to the nucleotide (nt) database and its subsets in the National Center for Biotechnology Information (NCBI). The resulting Master database of All possible RNA sequences (MARS) is 20-fold larger than NCBI's nt database or 60-fold larger than RNAcentral. The new dataset along with a new split-search strategy allows a substantial improvement in homology search over existing state-of-the-art techniques. It also yields more accurate and more sensitive multiple sequence alignments (MSAs) than manually curated MSAs from Rfam for the majority of structured RNAs mapped to Rfam. The results indicate that MARS coupled with the fully automatic homology search tool RNAcmap will be useful for improved structural and functional inference of ncRNAs and RNA language models based on MSAs. MARS is accessible at https://ngdc.cncb.ac.cn/omix/release/OMIX003037, and RNAcmap3 is accessible at http://zhouyq-lab.szbl.ac.cn/download/.
Affiliation(s)
- Ke Chen
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China
- Peking University Shenzhen Graduate School, Shenzhen 518055, China
- University of Science and Technology of China, Hefei 230026, China
- Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou 215123, China
- Thomas Litfin
- Institute for Glycomics, Griffith University, Southport, QLD 4222, Australia
- Jaswinder Singh
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China
- Jian Zhan
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China
- Yaoqi Zhou
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China
- Peking University Shenzhen Graduate School, Shenzhen 518055, China
- Institute for Glycomics, Griffith University, Southport, QLD 4222, Australia
6
Du Z, Peng Z, Yang J. RNA threading with secondary structure and sequence profile. Bioinformatics 2024; 40:btae080. [PMID: 38341662 PMCID: PMC10893584 DOI: 10.1093/bioinformatics/btae080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2022] [Revised: 01/05/2024] [Accepted: 02/09/2024] [Indexed: 02/12/2024] Open
Abstract
MOTIVATION: RNA threading aims to identify remote homologies for template-based modeling of RNA 3D structure. Existing RNA alignment methods primarily rely on secondary structure alignment. They are often time- and memory-consuming, limiting large-scale applications. In addition, the accuracy is far from satisfactory. RESULTS: Using RNA secondary structure and sequence profile, we developed a novel RNA threading algorithm, named RNAthreader. To enhance the alignment process and minimize memory usage, a novel approach has been introduced to simplify RNA secondary structures into compact diagrams. RNAthreader employs a two-step methodology. Initially, integer programming and dynamic programming are combined to create an initial alignment for the simplified diagram. Subsequently, the final alignment is obtained using dynamic programming, taking into account the initial alignment derived from the previous step. The benchmark test on 80 RNAs illustrates that RNAthreader generates more accurate alignments than other methods, especially for RNAs with pseudoknots. Another benchmark, involving 30 RNAs from the RNA-Puzzles experiments, shows that the models constructed using RNAthreader templates have a lower average RMSD than those created by alternative methods. Remarkably, RNAthreader takes less than two hours to complete alignments with ∼5000 RNAs, which is 3-40 times faster than other methods. These compelling results suggest that RNAthreader is a promising algorithm for RNA template detection. AVAILABILITY AND IMPLEMENTATION: https://yanglab.qd.sdu.edu.cn/RNAthreader.
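As background for the dynamic-programming step mentioned in the abstract above, the sketch below shows a generic Needleman-Wunsch global alignment score on plain sequences. It is not RNAthreader's algorithm, which combines integer programming and dynamic programming over simplified secondary-structure diagrams. The function name and the scoring parameters (`match`, `mismatch`, `gap`) are assumed values chosen for illustration.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    The scoring values are illustrative defaults, not taken from any tool.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against the empty string costs one gap per symbol.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            score[i][j] = max(diag, up, left)
    return score[-1][-1]


print(global_alignment_score("GCAU", "GAU"))
```

The table has (len(a)+1) x (len(b)+1) cells, so time and memory grow with the product of the two lengths. This quadratic cost is one reason large-scale alignment benefits from more compact representations like the simplified diagrams the abstract describes.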
Affiliation(s)
- Zongyang Du
- Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
- Zhenling Peng
- MOE Frontiers Science Center for Nonlinear Expectations, Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
- Jianyi Yang
- MOE Frontiers Science Center for Nonlinear Expectations, Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
7
Zhang Y, Lang M, Jiang J, Gao Z, Xu F, Litfin T, Chen K, Singh J, Huang X, Song G, Tian Y, Zhan J, Chen J, Zhou Y. Multiple sequence alignment-based RNA language model and its application to structural inference. Nucleic Acids Res 2024; 52:e3. [PMID: 37941140 PMCID: PMC10783488 DOI: 10.1093/nar/gkad1031] [Citation(s) in RCA: 20] [Impact Index Per Article: 20.0] [Received: 04/10/2023] [Accepted: 10/21/2023] [Indexed: 11/10/2023] Open
Abstract
Compared with proteins, DNA and RNA are more difficult languages to interpret because four-letter coded DNA/RNA sequences have less information content than 20-letter coded protein sequences. While BERT (Bidirectional Encoder Representations from Transformers)-like language models have been developed for RNA, they are ineffective at capturing the evolutionary information from homologous sequences because unlike proteins, RNA sequences are less conserved. Here, we have developed an unsupervised multiple sequence alignment-based RNA language model (RNA-MSM) by utilizing homologous sequences from an automatic pipeline, RNAcmap, as it can provide significantly more homologous sequences than manually annotated Rfam. We demonstrate that the resulting unsupervised, two-dimensional attention maps and one-dimensional embeddings from RNA-MSM contain structural information. In fact, they can be directly mapped with high accuracy to 2D base pairing probabilities and 1D solvent accessibilities, respectively. Further fine-tuning led to significantly improved performance on these two downstream tasks compared with existing state-of-the-art techniques including SPOT-RNA2 and RNAsnap2. By comparison, RNA-FM, a BERT-based RNA language model, performs worse than one-hot encoding with its embedding in base pair and solvent-accessible surface area prediction. We anticipate that the pre-trained RNA-MSM model can be fine-tuned on many other tasks related to RNA structure and function.
Affiliation(s)
- Yikun Zhang
- School of Electronic and Computer Engineering, Peking University, Shenzhen 518055, China
- AI for Science (AI4S)-Preferred Program, Peking University Shenzhen Graduate School, Shenzhen 518055, China
- Mei Lang
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518107, China
- Jiuhong Jiang
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518107, China
- Zhiqiang Gao
- Shanghai Artificial Intelligence Laboratory, Shanghai 200232, China
- Peng Cheng Laboratory, Shenzhen 518066, China
- Fan Xu
- Peng Cheng Laboratory, Shenzhen 518066, China
- Thomas Litfin
- Institute for Glycomics, Griffith University, Parklands Dr, Southport, QLD 4215, Australia
- Ke Chen
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518107, China
- Jaswinder Singh
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518107, China
- Guoli Song
- Peng Cheng Laboratory, Shenzhen 518066, China
- Jian Zhan
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518107, China
- Jie Chen
- School of Electronic and Computer Engineering, Peking University, Shenzhen 518055, China
- Peng Cheng Laboratory, Shenzhen 518066, China
- Yaoqi Zhou
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518107, China
- Institute for Glycomics, Griffith University, Parklands Dr, Southport, QLD 4215, Australia
8
Zhang J, Lang M, Zhou Y, Zhang Y. Predicting RNA structures and functions by artificial intelligence. Trends Genet 2023; 40:S0168-9525(23)00229-9. [PMID: 39492264 DOI: 10.1016/j.tig.2023.10.001] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/25/2023] [Revised: 08/22/2023] [Accepted: 10/03/2023] [Indexed: 11/05/2024]
Abstract
RNA functions by interacting with its intended targets structurally. However, due to the dynamic nature of RNA molecules, RNA structures are difficult to determine experimentally or predict computationally. Artificial intelligence (AI) has revolutionized many biomedical fields and has been progressively utilized to deduce RNA structures, target binding, and associated functionality. Integrating structural and target binding information could also help improve the robustness of AI-based RNA function prediction and RNA design. Given the rapid development of deep learning (DL) algorithms, AI will provide an unprecedented opportunity to elucidate the sequence-structure-function relation of RNAs.
Affiliation(s)
- Jun Zhang
- National Engineering Laboratory for Big Data System Computing Technology, College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, Guangdong, 518060, China
- Mei Lang
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, Guangdong, 518106, China
- Yaoqi Zhou
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, Guangdong, 518106, China.
- Yang Zhang
- School of Science, Harbin Institute of Technology, Shenzhen, Guangdong, 518055, China.
9
Kagaya Y, Zhang Z, Ibtehaz N, Wang X, Nakamura T, Huang D, Kihara D. NuFold: A Novel Tertiary RNA Structure Prediction Method Using Deep Learning with Flexible Nucleobase Center Representation. bioRxiv 2023:2023.09.20.558715. [PMID: 37790488 PMCID: PMC10542152 DOI: 10.1101/2023.09.20.558715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/05/2023]
Abstract
RNA not only plays a core role in the central dogma as mRNA between DNA and protein; many non-coding RNAs have also been discovered to have unique and diverse biological functions. As genome sequences become increasingly available and our knowledge of RNA sequences grows, the study of RNA's structure and function has become more demanding. However, experimental determination of three-dimensional RNA structures is both costly and time-consuming, resulting in a substantial disparity between RNA sequence data and structural insights. In response to this challenge, we propose NuFold, a novel computational approach that harnesses a state-of-the-art deep learning architecture to accurately predict RNA tertiary structures. This approach aims to offer a cost-effective and efficient means of bridging the gap between RNA sequence information and structural comprehension. NuFold implements a nucleobase center representation, which allows it to reproduce all possible nucleotide conformations accurately.
Affiliation(s)
- Yuki Kagaya
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907, USA
- Zicong Zhang
- Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
- Nabil Ibtehaz
- Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
- Xiao Wang
- Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
- Tsukasa Nakamura
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907, USA
- David Huang
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907, USA
- Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907, USA
- Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
10
Yang E, Zhang H, Zang Z, Zhou Z, Wang S, Liu Z, Liu Y. GCNfold: A novel lightweight model with valid extractors for RNA secondary structure prediction. Comput Biol Med 2023; 164:107246. [PMID: 37487383 DOI: 10.1016/j.compbiomed.2023.107246] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/28/2023] [Revised: 06/23/2023] [Accepted: 07/07/2023] [Indexed: 07/26/2023]
Abstract
RNA secondary structure is essential for predicting the tertiary structure and understanding RNA function. Recent research tends to stack numerous modules to design large deep-learning models. This can raise accuracy above 70%, but it also incurs significant training costs and reduces prediction efficiency. We proposed a model with three feature extractors called GCNfold. Structure Extractor utilizes a three-layer Graph Convolutional Network (GCN) to mine the structural information of RNA, such as stems, hairpins, and internal loops. Structure and Sequence Fusion embeds structural information into sequences with Transformer Encoders. Long-distance Dependency Extractor captures long-range pairwise relationships by UNet. The experiments indicate that, among all models with over 80% accuracy, GCNfold has a small number of parameters, a fast inference speed, and a high accuracy. Additionally, GCNfold-Small takes only 90 ms to infer an RNA secondary structure and can achieve close to 90% accuracy on average. The GCNfold code is available on GitHub at https://github.com/EnbinYang/GCNfold.
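For readers unfamiliar with graph convolution, the sketch below implements one propagation step of the widely used GCN rule, H' = ReLU(D^(-1/2) (A + I) D^(-1/2) H W), in plain Python. Whether GCNfold's Structure Extractor uses exactly this normalization is not stated in the abstract, so treat this as an assumed, generic layer. The function name `gcn_layer` and the toy graph are hypothetical.

```python
import math


def gcn_layer(adj, feats, weights):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 H W).

    adj     -- n x n adjacency matrix (0/1 nested lists), e.g. base pairs
               and backbone links between nucleotides
    feats   -- n x f node feature matrix H
    weights -- f x o weight matrix W (fixed here; learned in practice)
    """
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]  # always >= 1 because of self-loops
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    n_feat, n_out = len(weights), len(weights[0])
    # Aggregate neighbour features, then apply the linear map and ReLU.
    agg = [[sum(norm[i][k] * feats[k][f] for k in range(n)) for f in range(n_feat)]
           for i in range(n)]
    return [[max(0.0, sum(agg[i][f] * weights[f][o] for f in range(n_feat)))
             for o in range(n_out)] for i in range(n)]


# Two connected nodes with scalar features 1.0 and 3.0 average to 2.0.
print(gcn_layer([[0, 1], [1, 0]], [[1.0], [3.0]], [[1.0]]))
```

Stacking three such layers, as the abstract describes, lets each nucleotide's representation draw on nodes up to three edges away in the structure graph.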
Affiliation(s)
- Enbin Yang
- College of Computer Science and Technology, Jilin University, Changchun, 130012, China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, China
- Hao Zhang
- College of Computer Science and Technology, Jilin University, Changchun, 130012, China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, China; College of Software, Jilin University, Changchun, 130012, China
- Zinan Zang
- College of Computer Science and Technology, Jilin University, Changchun, 130012, China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, China
- Zhiyong Zhou
- College of Computer Science and Technology, Jilin University, Changchun, 130012, China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, China
- Shuo Wang
- College of Computer Science and Technology, Jilin University, Changchun, 130012, China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, China
- Zhen Liu
- College of Computer Science and Technology, Jilin University, Changchun, 130012, China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, China; Graduate School of Engineering, Nagasaki Institute of Applied Science, 536 Aba-machi, Nagasaki 851-0193, Japan
- Yuanning Liu
- College of Computer Science and Technology, Jilin University, Changchun, 130012, China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, China; College of Software, Jilin University, Changchun, 130012, China.
11
Tang M, Hwang K, Kang SH. StemP: A Fast and Deterministic Stem-Graph Approach for RNA Secondary Structure Prediction. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:3278-3291. [PMID: 37028040 DOI: 10.1109/tcbb.2023.3253049] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/19/2023]
Abstract
We propose a new deterministic methodology to predict the secondary structure of RNA sequences. What stem information is important for structure prediction, and is it enough? The proposed simple deterministic algorithm uses minimum stem length, Stem-Loop score, and co-existence of stems to give good structure predictions for short RNA and tRNA sequences. The main idea is to consider all possible stems with a certain stem-loop energy and strength to predict RNA secondary structure. We use graph notation, where stems are represented as vertices and co-existence between stems as edges. This full Stem-graph represents all possible folding structures, and we pick the sub-graph(s) that give the best matching energy for structure prediction. The Stem-Loop score adds structure information and speeds up the computation. The proposed method can predict secondary structure even with pseudoknots. One of the strengths of this approach is the simplicity and flexibility of the algorithm, and it gives a deterministic answer. Numerical experiments are done on various sequences from the Protein Data Bank and the Gutell Lab using a laptop, and results take only a few seconds.
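The stem-graph idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The pairing rules, the coexistence test (two stems may coexist when they share no base, which permits pseudoknots), the score (total number of base pairs, standing in for the Stem-Loop score and matching energy), and all function names are assumptions made for illustration.

```python
from itertools import combinations

# Assumed pairing rules: Watson-Crick pairs plus G-U wobble.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def find_stems(seq, min_len=3, min_loop=3):
    """Enumerate maximal stems as (i, j, length).

    Base i + k pairs with base j - k for every k < length, and at least
    min_loop unpaired bases remain between the innermost pair.
    """
    n = len(seq)
    stems = []
    for i in range(n):
        for j in range(i + min_loop + 1, n):
            if (seq[i], seq[j]) not in PAIRS:
                continue
            # Skip pairs that merely continue a stem starting further out.
            if i > 0 and j < n - 1 and (seq[i - 1], seq[j + 1]) in PAIRS:
                continue
            length = 0
            while (i + length < j - length - min_loop
                   and (seq[i + length], seq[j - length]) in PAIRS):
                length += 1
            if length >= min_len:
                stems.append((i, j, length))
    return stems


def bases(stem):
    """Return the set of sequence positions used by a stem."""
    i, j, length = stem
    return set(range(i, i + length)) | set(range(j - length + 1, j + 1))


def compatible(a, b):
    """Assumed coexistence rule (graph edge): the stems share no base."""
    return bases(a).isdisjoint(bases(b))


def best_subgraph(stems):
    """Brute-force the set of mutually compatible stems with most base pairs.

    Exponential in the number of stems, so only suitable for short sequences.
    """
    best, best_score = (), 0
    for r in range(1, len(stems) + 1):
        for combo in combinations(stems, r):
            if all(compatible(a, b) for a, b in combinations(combo, 2)):
                score = sum(s[2] for s in combo)
                if score > best_score:
                    best, best_score = combo, score
    return list(best), best_score


if __name__ == "__main__":
    stems = find_stems("GGGAAAACCC")
    print(stems, best_subgraph(stems))
```

Stems are the vertices and `compatible` defines the edges, so `best_subgraph` searches for the highest-scoring clique. A practical implementation would replace the brute-force search and the base-pair count with the scoring described in the paper.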
12
Rivas E. RNA covariation at helix-level resolution for the identification of evolutionarily conserved RNA structure. PLoS Comput Biol 2023; 19:e1011262. [PMID: 37450549 PMCID: PMC10370758 DOI: 10.1371/journal.pcbi.1011262] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/08/2023] [Accepted: 06/12/2023] [Indexed: 07/18/2023] Open
Abstract
Many biologically important RNAs fold into specific 3D structures conserved through evolution. Knowing when an RNA sequence includes a conserved RNA structure that could lead to new biology is not trivial and depends on clues left behind by conservation in the form of covariation and variation. For that purpose, the R-scape statistical test was created to identify from alignments of RNA sequences, the base pairs that significantly covary above phylogenetic expectation. R-scape treats base pairs as independent units. However, RNA base pairs do not occur in isolation. The Watson-Crick (WC) base pairs stack together forming helices that constitute the scaffold that facilitates the formation of the non-WC base pairs, and ultimately the complete 3D structure. The helix-forming WC base pairs carry most of the covariation signal in an RNA structure. Here, I introduce a new measure of statistically significant covariation at helix-level by aggregation of the covariation significance and covariation power calculated at base-pair-level resolution. Performance benchmarks show that helix-level aggregated covariation increases sensitivity in the detection of evolutionarily conserved RNA structure without sacrificing specificity. This additional helix-level sensitivity reveals an artifact that results from using covariation to build an alignment for a hypothetical structure and then testing the alignment for whether its covariation significantly supports the structure. Helix-level reanalysis of the evolutionary evidence for a selection of long non-coding RNAs (lncRNAs) reinforces the evidence against these lncRNAs having a conserved secondary structure.
Affiliation(s)
- Elena Rivas
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts, United States of America
13
Rivas E. RNA covariation at helix-level resolution for the identification of evolutionarily conserved RNA structure. bioRxiv: the preprint server for biology 2023:2023.04.14.536965. [PMID: 37131783 PMCID: PMC10153129 DOI: 10.1101/2023.04.14.536965] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Many biologically important RNAs fold into specific 3D structures conserved through evolution. Knowing when an RNA sequence includes a conserved RNA structure that could lead to new biology is not trivial and depends on clues left behind by conservation in the form of covariation and variation. For that purpose, the R-scape statistical test was created to identify from alignments of RNA sequences, the base pairs that significantly covary above phylogenetic expectation. R-scape treats base pairs as independent units. However, RNA base pairs do not occur in isolation. The Watson-Crick (WC) base pairs stack together forming helices that constitute the scaffold that facilitates the formation of the non-WC base pairs, and ultimately the complete 3D structure. The helix-forming WC base pairs carry most of the covariation signal in an RNA structure. Here, I introduce a new measure of statistically significant covariation at helix-level by aggregation of the covariation significance and covariation power calculated at base-pair-level resolution. Performance benchmarks show that helix-level aggregated covariation increases sensitivity in the detection of evolutionarily conserved RNA structure without sacrificing specificity. This additional helix-level sensitivity reveals an artifact that results from using covariation to build an alignment for a hypothetical structure and then testing the alignment for whether its covariation significantly supports the structure. Helix-level reanalysis of the evolutionary evidence for a selection of long non-coding RNAs (lncRNAs) reinforces the evidence against these lncRNAs having a conserved secondary structure. Availability Helix aggregated E-values are integrated in the R-scape software package (version 2.0.0.p and higher). The R-scape web server eddylab.org/R-scape includes a link to download the source code. Contact elenarivas@fas.harvard.edu. 
Supplementary information Supplementary data and code are provided with this manuscript at rivaslab.org.
14
Qiu X. Sequence similarity governs generalizability of de novo deep learning models for RNA secondary structure prediction. PLoS Comput Biol 2023; 19:e1011047. [PMID: 37068100 PMCID: PMC10138783 DOI: 10.1371/journal.pcbi.1011047] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 01/02/2023] [Revised: 04/27/2023] [Accepted: 03/25/2023] [Indexed: 04/18/2023] Open
Abstract
Making no use of physical laws or co-evolutionary information, de novo deep learning (DL) models for RNA secondary structure prediction have achieved performance far superior to that of traditional algorithms. However, their statistical underpinning raises the crucial question of generalizability. We present a quantitative study of the performance and generalizability of a series of de novo DL models, with a minimal two-module architecture and no post-processing, under varied similarities between seen and unseen sequences. Our models demonstrate excellent expressive capacities and outperform existing methods on common benchmark datasets. However, model generalizability, i.e., the performance gap between the seen and unseen sets, degrades rapidly as the sequence similarity decreases. The same trends are observed for several recent DL and machine learning models. Moreover, an inverse correlation between performance and generalizability is revealed collectively across all learning-based models with wide-ranging architectures and sizes. We further quantify how generalizability depends on sequence and structure identity scores via pairwise alignment, providing unique quantitative insights into the limitations of statistical learning. Generalizability thus poses a major hurdle for deploying de novo DL models in practice, and various pathways for future advances are discussed.
Affiliation(s)
- Xiangyun Qiu
- Department of Physics, George Washington University, Washington DC, United States of America
15
Moafinejad SN, Pandaranadar Jeyeram IPN, Jaryani F, Shirvanizadeh N, Baulin EF, Bujnicki JM. 1D2DSimScore: A novel method for comparing contacts in biomacromolecules and their complexes. Protein Sci 2023; 32:e4503. [PMID: 36369832 PMCID: PMC9795538 DOI: 10.1002/pro.4503] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2022] [Revised: 10/28/2022] [Accepted: 11/01/2022] [Indexed: 11/13/2022]
Abstract
The biologically relevant structures of proteins and nucleic acids and their complexes are dynamic. They include a combination of regions ranging from rigid structural segments to structural switches to regions that are almost always disordered, which interact with each other in various ways. Comparing conformational changes and variation in contacts between different conformational states is essential to understand the biological functions of proteins, nucleic acids, and their complexes. Here, we describe a new computational tool, 1D2DSimScore, for comparing contacts and contact interfaces in all kinds of macromolecules and macromolecular complexes, including proteins, nucleic acids, and other molecules. 1D2DSimScore can be used to compare structural features of macromolecular models between alternative structures obtained in a particular experiment or to score various predictions against a defined "ideal" reference structure. Comparisons at the level of contacts are particularly useful for flexible molecules, for which comparisons in 3D that require rigid-body superpositions are difficult, and in biological systems where the formation of specific inter-residue contacts is more relevant for the biological function than the maintenance of a specific global 3D structure. Similarity/dissimilarity scores calculated by 1D2DSimScore can be used to complement scores describing 3D structural similarity measures calculated by the existing tools.
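Contact-level comparison, as opposed to rigid-body superposition, can be sketched in a few lines of Python. The abstract does not name the scores that 1D2DSimScore computes, so the Jaccard index used here, the 8 Å distance cutoff, the one-point-per-residue representation, and both function names are assumptions for illustration only.

```python
import math


def contact_set(coords, cutoff=8.0, min_sep=1):
    """Return residue index pairs (i, j), i < j, closer than the cutoff.

    coords holds one (x, y, z) point per residue; the cutoff is an
    assumed value in the same length unit as the coordinates.
    """
    contacts = set()
    for i in range(len(coords)):
        for j in range(i + min_sep, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                contacts.add((i, j))
    return contacts


def jaccard(a, b):
    """Jaccard similarity of two contact sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Two hypothetical three-residue conformations.
    model = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
    reference = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
    print(jaccard(contact_set(model), contact_set(reference)))
```

Because only the sets of contacts are compared, the score is unaffected by rotating or translating either structure, which is what makes this kind of comparison convenient for flexible molecules.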
Affiliation(s)
- S. Naeim Moafinejad
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Farhang Jaryani
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Niloofar Shirvanizadeh
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Eugene F. Baulin
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Janusz M. Bujnicki
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
16
Abstract
RNA molecules carry out various cellular functions, and understanding the mechanisms behind their functions requires the knowledge of their 3D structures. Different types of computational methods have been developed to model RNA 3D structures over the past decade. These methods were widely used by researchers although their performance needs to be further improved. Recently, along with these traditional methods, machine-learning techniques have been increasingly applied to RNA 3D structure prediction and show significant improvement in performance. Here we shall give a brief review of the traditional methods and recent related advances in machine-learning approaches for RNA 3D structure prediction.
Affiliation(s)
- Xiujuan Ou
- Institute of Biophysics, School of Physics, Huazhong University of Science and Technology, Wuhan 430074, Hubei, China
- Yi Zhang
- Institute of Biophysics, School of Physics, Huazhong University of Science and Technology, Wuhan 430074, Hubei, China
- Yiduo Xiong
- Institute of Biophysics, School of Physics, Huazhong University of Science and Technology, Wuhan 430074, Hubei, China
- Yi Xiao
- Institute of Biophysics, School of Physics, Huazhong University of Science and Technology, Wuhan 430074, Hubei, China
17
Rolband L, Beasock D, Wang Y, Shu YG, Dinman JD, Schlick T, Zhou Y, Kieft JS, Chen SJ, Bussi G, Oukhaled A, Gao X, Šulc P, Binzel D, Bhullar AS, Liang C, Guo P, Afonin KA. Biomotors, viral assembly, and RNA nanobiotechnology: Current achievements and future directions. Comput Struct Biotechnol J 2022; 20:6120-6137. [PMID: 36420155 PMCID: PMC9672130 DOI: 10.1016/j.csbj.2022.11.007] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 09/01/2022] [Revised: 11/04/2022] [Accepted: 11/04/2022] [Indexed: 11/13/2022] Open
Abstract
The International Society of RNA Nanotechnology and Nanomedicine (ISRNN) serves to further the development of a wide variety of functional nucleic acids and other related nanotechnology platforms. To aid in the dissemination of the most recent advancements, a biennial discussion focused on biomotors, viral assembly, and RNA nanobiotechnology has been established where international experts in interdisciplinary fields such as structural biology, biophysical chemistry, nanotechnology, cell and cancer biology, and pharmacology share their latest accomplishments and future perspectives. The results summarized here highlight advancements in our understanding of viral biology and the structure-function relationship of frame-shifting elements in genomic viral RNA, improvements in the predictions of SHAPE analysis of 3D RNA structures, and the understanding of dynamic RNA structures through a variety of experimental and computational means. Additionally, recent advances in drug delivery, vaccine design, nanopore technologies, biomotor and biomachine development, DNA packaging, and RNA nanotechnology are included in this critical review. We emphasize some of the novel accomplishments and major discussion topics, and present current challenges and perspectives of these emerging fields.
Affiliation(s)
- Lewis Rolband
- University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- Damian Beasock
- University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- Yang Wang
- Wenzhou Institute, University of China Academy of Sciences, 1st, Jinlian Road, Longwan District, Wenzhou, Zhejiang 325001, China
- Yao-Gen Shu
- Wenzhou Institute, University of China Academy of Sciences, 1st, Jinlian Road, Longwan District, Wenzhou, Zhejiang 325001, China
- Tamar Schlick
- New York University, Department of Chemistry and Courant Institute of Mathematical Sciences, Simons Center for Computational Physical Chemistry, New York, NY 10012, USA
- Yaoqi Zhou
- Institute for Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, Guangdong 518107, China
- Jeffrey S. Kieft
- University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Shi-Jie Chen
- University of Missouri at Columbia, Columbia, MO 65211, USA
- Giovanni Bussi
- Scuola Internazionale Superiore di Studi Avanzati, via Bonomea 265, 34136 Trieste, Italy
- Xingfa Gao
- National Center for Nanoscience and Technology of China, Beijing 100190, China
- Petr Šulc
- Arizona State University, Tempe, AZ, USA
- Chenxi Liang
- The Ohio State University, Columbus, OH 43210, USA
- Peixuan Guo
- The Ohio State University, Columbus, OH 43210, USA
- Kirill A. Afonin
- University of North Carolina at Charlotte, Charlotte, NC 28223, USA
18
Stephenson HN, Streeck R, Grüblinger F, Goosmann C, Herzig A. Hemocytes are essential for Drosophila melanogaster post-embryonic development, independent of control of the microbiota. Development 2022; 149:dev200286. [PMID: 36093870 PMCID: PMC9641648 DOI: 10.1242/dev.200286] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/21/2021] [Accepted: 08/19/2022] [Indexed: 09/22/2023]
Abstract
Proven roles for hemocytes (blood cells) have expanded beyond the control of infections in Drosophila. Despite this, the crucial role of hemocytes in post-embryonic development has long been thought to be limited to control of microorganisms during metamorphosis. This has previously been shown by rescue of adult development in hemocyte-ablation models under germ-free conditions. Here, we show that hemocytes have an essential role in post-embryonic development beyond their ability to control the microbiota. Using a newly generated strong hemocyte-specific driver line for the GAL4/UAS system, we show that specific ablation of hemocytes is early pupal lethal, even under axenic conditions. Genetic rescue experiments prove that this is a hemocyte-specific phenomenon. RNA-seq data suggest that dysregulation of the midgut is a prominent consequence of hemocyte ablation in larval stages, resulting in reduced gut lengths. Dissection suggests that multiple processes may be affected during metamorphosis. We believe this previously unreported role for hemocytes during metamorphosis is a major finding for the field.
Affiliation(s)
- Holly N. Stephenson
- Department of Cellular Microbiology, Max Planck Institute for Infection Biology, Charitéplatz 1, Berlin 10117, Germany
- Peninsula Medical School, Faculty of Health, University of Plymouth, Plymouth, Devon PL4 8AA, UK
- Robert Streeck
- Department of Cellular Microbiology, Max Planck Institute for Infection Biology, Charitéplatz 1, Berlin 10117, Germany
- Florian Grüblinger
- Department of Cellular Microbiology, Max Planck Institute for Infection Biology, Charitéplatz 1, Berlin 10117, Germany
- Christian Goosmann
- Department of Cellular Microbiology, Max Planck Institute for Infection Biology, Charitéplatz 1, Berlin 10117, Germany
- Alf Herzig
- Department of Cellular Microbiology, Max Planck Institute for Infection Biology, Charitéplatz 1, Berlin 10117, Germany
19
Singh J, Paliwal K, Litfin T, Singh J, Zhou Y. Predicting RNA distance-based contact maps by integrated deep learning on physics-inferred secondary structure and evolutionary-derived mutational coupling. Bioinformatics 2022; 38:3900-3910. [PMID: 35751593 PMCID: PMC9364379 DOI: 10.1093/bioinformatics/btac421] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 11/24/2021] [Revised: 04/30/2022] [Accepted: 06/28/2022] [Indexed: 12/24/2022] Open
Abstract
MOTIVATION: Recently, AlphaFold2 achieved high experimental accuracy for the majority of proteins in Critical Assessment of Structure Prediction (CASP 14). This raises the hope that one day we may achieve the same feat for RNA structure prediction for structured RNAs, which is as fundamentally and practically important as protein structure prediction. One major factor in the recent advancement of protein structure prediction is the highly accurate prediction of distance-based contact maps of proteins. RESULTS: Here, we showed that by integrating deep learning with physics-inferred secondary structures, co-evolutionary information and multiple sequence-alignment sampling, we can achieve RNA contact-map prediction at a level of accuracy similar to that in protein contact-map prediction. More importantly, highly accurate prediction for top L long-range contacts can be assured for those RNAs with a high effective number of homologous sequences (Neff > 50). The initial use of the predicted contact map as distance-based restraints confirmed its usefulness in 3D structure prediction. AVAILABILITY AND IMPLEMENTATION: SPOT-RNA-2D is available as a web server at https://sparks-lab.org/server/spot-rna-2d/ and as a standalone program at https://github.com/jaswindersingh2/SPOT-RNA-2D. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Thomas Litfin
- Institute for Glycomics, Griffith University, Parklands Dr. Southport, QLD 4222, Australia
- Jaspreet Singh
- Signal Processing Laboratory, School of Engineering and Built Environment, Griffith University, Brisbane, QLD 4111, Australia
- Yaoqi Zhou
- To whom correspondence should be addressed.
20
Solayman M, Litfin T, Singh J, Paliwal K, Zhou Y, Zhan J. Probing RNA structures and functions by solvent accessibility: an overview from experimental and computational perspectives. Brief Bioinform 2022; 23:bbac112. [PMID: 35348613 PMCID: PMC9116373 DOI: 10.1093/bib/bbac112] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/13/2021] [Revised: 03/03/2022] [Accepted: 03/04/2022] [Indexed: 12/30/2022] Open
Abstract
Characterization of RNA structures and functions has mostly focused on 2D (secondary) and 3D (tertiary) structures. Recent advances in experimental and computational techniques for probing or predicting RNA solvent accessibility make this 1D representation of tertiary structures an increasingly attractive feature to explore. Here, we provide a survey of these recent developments, which indicate the emergence of solvent accessibility as a simple 1D property, adding to secondary and tertiary structures for investigating complex structure-function relations of RNAs.
Affiliation(s)
- Md Solayman
- Institute for Glycomics, Griffith University, Parklands Dr. Southport, QLD 4222, Australia
- Thomas Litfin
- Institute for Glycomics, Griffith University, Parklands Dr. Southport, QLD 4222, Australia
- Jaswinder Singh
- Signal Processing Laboratory, School of Engineering and Built Environment, Griffith University, Brisbane, QLD 4111, Australia
- Kuldip Paliwal
- Signal Processing Laboratory, School of Engineering and Built Environment, Griffith University, Brisbane, QLD 4111, Australia
- Yaoqi Zhou
- Institute for Glycomics, Griffith University, Parklands Dr. Southport, QLD 4222, Australia
- Institute for Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China
- Peking University Shenzhen Graduate School, Shenzhen 518055, China
- Jian Zhan
- Institute for Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China
21
Singh J, Paliwal K, Singh J, Zhou Y. RNA Backbone Torsion and Pseudotorsion Angle Prediction Using Dilated Convolutional Neural Networks. J Chem Inf Model 2021; 61:2610-2622. [PMID: 34037398 DOI: 10.1021/acs.jcim.1c00153] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/16/2022]
Abstract
RNA three-dimensional structure prediction has relied on a predicted or experimentally determined secondary structure as a restraint to reduce the conformational sampling space. However, the secondary-structure restraints are limited to paired bases, and the conformational space of the ribose-phosphate backbone is still too large to be sampled efficiently. Here, we employed a dilated convolutional neural network to predict backbone torsion and pseudotorsion angles using a single RNA sequence as input. The method, called SPOT-RNA-1D, was trained on a high-resolution training data set and tested on three independent, nonredundant, and high-resolution test sets. The proposed method yields substantially smaller mean absolute errors than the baseline predictors based on random predictions and based on helix conformations according to actual angle distributions. The mean absolute errors for three test sets range from 14°-44° for different angles, compared to 17°-62° by random prediction and 14°-58° by helix prediction. The method also accurately recovers the overall patterns of single or pairwise angle distributions. In general, torsion angles further away from the bases and associated with unpaired bases and paired bases involved in tertiary interactions are more difficult to predict. Compared to the best models in RNA-puzzles experiments, SPOT-RNA-1D yielded more accurate dihedral angles, which are thus potentially useful as model quality indicators and restraints for RNA structure prediction, as in protein structure prediction.
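Because torsion angles are periodic, a mean absolute error over angles has to treat, say, 350° and 10° as 20° apart rather than 340°. The abstract does not give the formula, so the wrap-around difference below is an assumption about how such an error is commonly computed, and the function name is hypothetical.

```python
def angular_mae(predicted, observed):
    """Mean absolute error between two equal-length lists of angles in degrees.

    Each difference is wrapped onto [0, 180] so that angles on either
    side of the 0/360 boundary are compared along the shorter arc.
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for p, o in zip(predicted, observed):
        diff = abs(p - o) % 360.0
        total += min(diff, 360.0 - diff)
    return total / len(predicted)


if __name__ == "__main__":
    # Hypothetical predicted and observed torsion angles, in degrees.
    print(angular_mae([350.0, 10.0, 170.0], [10.0, 350.0, -170.0]))
```

The same function works whether angles are stored on a 0° to 360° scale or a -180° to 180° scale, since only the wrapped difference enters the average.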
Affiliation(s)
- Jaswinder Singh
- Signal Processing Laboratory, Griffith University, Brisbane, Queensland 4122, Australia
- Kuldip Paliwal
- Signal Processing Laboratory, Griffith University, Brisbane, Queensland 4122, Australia
- Jaspreet Singh
- Signal Processing Laboratory, Griffith University, Brisbane, Queensland 4122, Australia
- Yaoqi Zhou
- Institute for Glycomics and School of Information and Communication Technology, Griffith University, Southport, Queensland 4222, Australia; Institute for Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China; Peking University Shenzhen Graduate School, Shenzhen 518055, P.R. China