1
Yang Y, Zhong Y, Chen L. EIciRNAs in focus: current understanding and future perspectives. RNA Biol 2025; 22:1-12. [PMID: 39711231 DOI: 10.1080/15476286.2024.2443876] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Revised: 11/14/2024] [Accepted: 12/09/2024] [Indexed: 12/24/2024] Open
Abstract
Circular RNAs (circRNAs) are a unique class of covalently closed single-stranded RNA molecules that play diverse roles in normal physiology and pathology. Among the major types of circRNA, exon-intron circRNA (EIciRNA) distinguishes itself by its sequence composition and nuclear localization. Recent RNA-seq technologies and computational methods have facilitated the detection and characterization of EIciRNAs, with features like circRNA intron retention (CIR) and tissue-specificity being characterized. EIciRNAs have been identified to exert their functions via mechanisms such as regulating gene transcription, and the physiological relevance of EIciRNAs has been reported. Within this review, we present a summary of the current understanding of EIciRNAs, delving into their identification and molecular functions. Additionally, we emphasize factors regulating EIciRNA biogenesis and the physiological roles of EIciRNAs based on recent research. We also discuss the future challenges in EIciRNA exploration, underscoring the potential for novel functions and functional mechanisms of EIciRNAs for further investigation.
Affiliation(s)
- Yan Yang
- Department of Cardiology, The First Affiliated Hospital of USTC, School of Basic Medical Sciences, Division of Life Science and Medicine, University of Science and Technology of China, Hefei, China
- Hefei National Laboratory for Physical Sciences at Microscale, University of Science and Technology of China, Hefei, China
- Yinchun Zhong
- Hefei National Laboratory for Physical Sciences at Microscale, University of Science and Technology of China, Hefei, China
- Department of Clinical Laboratory, The First Affiliated Hospital of USTC, School of Basic Medical Sciences, Division of Life Science and Medicine, University of Science and Technology of China, Hefei, China
- Liang Chen
- Department of Cardiology, The First Affiliated Hospital of USTC, School of Basic Medical Sciences, Division of Life Science and Medicine, University of Science and Technology of China, Hefei, China
2
Oh Moon D. The role of MELK in cancer and its interactions with non-coding RNAs: Implications for therapeutic strategies. Bull Cancer 2025; 112:35-53. [PMID: 39562208 DOI: 10.1016/j.bulcan.2024.10.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2024] [Revised: 10/29/2024] [Accepted: 10/29/2024] [Indexed: 11/21/2024]
Abstract
In the evolving landscape of cancer research, the identification of key molecular players that contribute to the disease's progression and resistance against treatments has become paramount. Among these, Maternal Embryonic Leucine Zipper Kinase (MELK) has emerged as a critical regulator of cancer cell proliferation, survival, and therapy evasion. Concurrently, the significance of non-coding RNAs (ncRNAs), including microRNAs (miRNAs) and long non-coding RNAs (lncRNAs), in modulating gene expression and cancer phenotypes has been increasingly recognized. Given the pivotal roles both MELK and ncRNAs play within cancer biology, investigating their interactions presents a compelling new frontier for therapeutic innovation. This exploration not only promises to enhance our understanding of cancer's molecular underpinnings but also opens up avenues for developing novel targeted interventions. The rationale behind focusing on MELK-ncRNA crosstalk lies in the potential to disrupt these critical molecular interactions, thereby offering a novel strategy to counteract cancer progression and improve treatment outcomes.
Affiliation(s)
- Dong Oh Moon
- Department of Biology Education, Daegu University, 201, Daegudae-ro, Gyeongsan-si, 38453 Gyeongsangbuk-do, Republic of Korea.
3
Du C, Fan W, Zhou Y. Integrated Biochemical and Computational Methods for Deciphering RNA-Processing Codes. Wiley Interdisciplinary Reviews. RNA 2024; 15:e1875. [PMID: 39523464 DOI: 10.1002/wrna.1875] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2024] [Revised: 09/23/2024] [Accepted: 10/21/2024] [Indexed: 11/16/2024]
Abstract
RNA processing involves steps such as capping, splicing, polyadenylation, modification, and nuclear export. These steps are essential for transforming genetic information in DNA into proteins and contribute to RNA diversity and complexity. Many biochemical methods have been developed to profile and quantify RNAs, as well as to identify the interactions between RNAs and RNA-binding proteins (RBPs), especially when coupled with high-throughput sequencing technologies. With the rapid accumulation of diverse data, it is crucial to develop computational methods to convert the big data into biological knowledge. In particular, machine learning and deep learning models are commonly utilized to learn the rules or codes governing the transformation from DNA sequences to intriguing RNAs based on manually designed or automatically extracted features. When precise enough, the RNA codes can be incredibly useful for predicting RNA products, decoding the molecular mechanisms, forecasting the impact of disease variants on RNA processing events, and identifying driver mutations. In this review, we systematically summarize the biochemical and computational methods for deciphering five important RNA codes related to alternative splicing, alternative polyadenylation, RNA localization, RNA modifications, and RBP binding. For each code, we review the main types of experimental methods used to generate training data, as well as the key features, strategic model structures, and advantages of representative tools. We also discuss the challenges encountered in developing predictive models using large language models and extensive domain knowledge. Additionally, we highlight useful resources and propose ways to improve computational tools for studying RNA codes.
Affiliation(s)
- Chen Du
- College of Life Sciences, TaiKang Center for Life and Medical Sciences, RNA Institute, Wuhan University, Wuhan, China
- Weiliang Fan
- College of Life Sciences, TaiKang Center for Life and Medical Sciences, RNA Institute, Wuhan University, Wuhan, China
- Yu Zhou
- College of Life Sciences, TaiKang Center for Life and Medical Sciences, RNA Institute, Wuhan University, Wuhan, China
- Frontier Science Center for Immunology and Metabolism, Wuhan University, Wuhan, China
- State Key Laboratory of Virology, Wuhan University, Wuhan, China
4
He C, Duan L, Zheng H, Wang X, Guan L, Xu J. A Representation Learning Approach for Predicting circRNA Back-Splicing Event via Sequence-Interaction-Aware Dual Encoder. IEEE Trans Nanobioscience 2024; 23:603-611. [PMID: 39226209 DOI: 10.1109/tnb.2024.3454079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/05/2024]
Abstract
Circular RNAs (circRNAs) play a crucial role in gene regulation and are associated with diseases because of their unique closed continuous loop structure, which is more stable and conserved than that of ordinary linear RNAs. As fundamental work to clarify their functions, a large number of computational approaches for identifying circRNA formation have been proposed. However, these methods fail to fully utilize the important characteristics of back-splicing events, i.e., the positional information of the splice sites and the interaction features of their flanking sequences, for predicting circRNAs. To this end, we propose a novel approach called SIDE for predicting circRNA back-splicing events using only raw RNA sequences. Technically, SIDE employs a dual encoder to capture global and interactive features of the RNA sequence, followed by a decoder designed with contrastive learning to fuse discriminative features that improve the prediction of circRNA formation. Empirical results on three real-world datasets show the effectiveness of SIDE. Further analysis also reveals the effectiveness of SIDE.
5
Diao B, Luo J, Guo Y. A comprehensive survey on deep learning-based identification and predicting the interaction mechanism of long non-coding RNAs. Brief Funct Genomics 2024; 23:314-324. [PMID: 38576205 DOI: 10.1093/bfgp/elae010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2023] [Revised: 02/25/2024] [Accepted: 03/14/2024] [Indexed: 04/06/2024] Open
Abstract
Long noncoding RNAs (lncRNAs) have been discovered to be extensively involved in eukaryotic epigenetic, transcriptional, and post-transcriptional regulatory processes with the advancements in sequencing technology and genomics research. Therefore, they play crucial roles in the body's normal physiology and various disease outcomes. Presently, numerous unknown lncRNA sequencing data require exploration. Establishing deep learning-based prediction models for lncRNAs provides valuable insights for researchers, substantially reducing time and costs associated with trial and error and facilitating the disease-relevant lncRNA identification for prognosis analysis and targeted drug development as the era of artificial intelligence progresses. However, most lncRNA-related researchers lack awareness of the latest advancements in deep learning models and model selection and application in functional research on lncRNAs. Thus, we elucidate the concept of deep learning models, explore several prevalent deep learning algorithms and their data preferences, conduct a comprehensive review of recent literature studies with exemplary predictive performance over the past 5 years in conjunction with diverse prediction functions, critically analyze and discuss the merits and limitations of current deep learning models and solutions, while also proposing prospects based on cutting-edge advancements in lncRNA research.
Affiliation(s)
- Biyu Diao
- Department of Breast Surgery, The First Affiliated Hospital of Ningbo University, No. 59, Liuting Street, Haishu District, Ningbo 315000, China
- Jin Luo
- Department of Breast Surgery, The First Affiliated Hospital of Ningbo University, No. 59, Liuting Street, Haishu District, Ningbo 315000, China
- Yu Guo
- Department of Breast Surgery, The First Affiliated Hospital of Ningbo University, No. 59, Liuting Street, Haishu District, Ningbo 315000, China
6
Digby B, Finn S, Ó Broin P. Computational approaches and challenges in the analysis of circRNA data. BMC Genomics 2024; 25:527. [PMID: 38807085 PMCID: PMC11134749 DOI: 10.1186/s12864-024-10420-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2024] [Accepted: 05/15/2024] [Indexed: 05/30/2024] Open
Abstract
Circular RNAs (circRNA) are a class of non-coding RNA, forming a single-stranded covalently closed loop structure generated via back-splicing. Advancements in sequencing methods and technologies in conjunction with algorithmic developments of bioinformatics tools have enabled researchers to characterise the origin and function of circRNAs, with practical applications as a biomarker of diseases becoming increasingly relevant. Computational methods developed for circRNA analysis are predicated on detecting the chimeric back-splice junction of circRNAs whilst mitigating false-positive sequencing artefacts. In this review, we discuss in detail the computational strategies developed for circRNA identification, highlighting a selection of tool strengths, weaknesses and assumptions. In addition to circRNA identification tools, we describe methods for characterising the role of circRNAs within the competing endogenous RNA (ceRNA) network, their interactions with RNA-binding proteins, and publicly available databases for rich circRNA annotation.
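The idea of a chimeric back-splice junction can be illustrated with a toy sketch. It assumes the circRNA is formed from a single known exon sequence and uses exact string matching; the function names and the `flank` parameter are hypothetical, and real detection tools rely on split-read alignment and false-positive filtering rather than this simplification.

```python
def back_splice_junction(exon_seq: str, flank: int = 10) -> str:
    """Return the sequence spanning the junction of a circularised exon.

    In a back-splice, the 3' end of the exon is joined to its own 5' start,
    so the junction reads as: last `flank` bases followed by first `flank` bases.
    """
    if flank <= 0 or flank > len(exon_seq):
        raise ValueError("flank must be between 1 and the exon length")
    return exon_seq[-flank:] + exon_seq[:flank]


def supports_back_splice(read: str, exon_seq: str, flank: int = 10) -> bool:
    """True if the read contains the junction sequence (exact match only)."""
    return back_splice_junction(exon_seq, flank) in read


# Toy exon (assumption: illustrative sequence, not real data)
exon = "AAACCCGGGTTT"
junction_read = "GGTTTAAACC"   # crosses the end-to-start junction
linear_read = "AAACCCGG"       # consistent with the linear transcript only

print(supports_back_splice(junction_read, exon, flank=3))
print(supports_back_splice(linear_read, exon, flank=3))
```

A read that matches the junction sequence cannot be explained by the linear transcript, which is why junction-spanning reads are treated as evidence for circularisation.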
Affiliation(s)
- Barry Digby
- School of Mathematical and Statistical Sciences, University of Galway, Galway, Ireland.
- Stephen Finn
- Discipline of Histopathology, School of Medicine, Trinity College Dublin and Cancer Molecular Diagnostic Laboratory, Dublin, Ireland
- Pilib Ó Broin
- School of Mathematical and Statistical Sciences, University of Galway, Galway, Ireland
7
Wei PJ, Guo Z, Gao Z, Ding Z, Cao RF, Su Y, Zheng CH. Inference of gene regulatory networks based on directed graph convolutional networks. Brief Bioinform 2024; 25:bbae309. [PMID: 38935070 PMCID: PMC11209731 DOI: 10.1093/bib/bbae309] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2023] [Revised: 05/17/2024] [Indexed: 06/28/2024] Open
Abstract
Inferring gene regulatory networks (GRNs) is one of the important challenges in systems biology, and many outstanding computational methods have been proposed; however, some challenges remain, especially on real datasets. In this study, we propose a Directed Graph Convolutional neural network-based method for GRN inference (DGCGRN). To better understand and process the directed graph structure data of GRNs, a directed graph convolutional neural network is constructed, which retains the structural information of the directed graph while also making full use of neighbor node features. A local augmentation strategy is adopted in the graph neural network to solve the problem of poor prediction accuracy caused by the large number of low-degree nodes in GRNs. In addition, for real data such as E. coli, sequence features are obtained by extracting hidden features using Bi-GRU and calculating the statistical physicochemical characteristics of the gene sequences. At the training stage, a dynamic update strategy is used to convert the obtained edge prediction scores into edge weights to guide the subsequent training process of the model. The results on synthetic benchmark datasets and real datasets show that the prediction performance of DGCGRN is significantly better than that of existing models. Furthermore, the case studies on bladder uroepithelial carcinoma and lung cancer cells also illustrate the performance of the proposed model.
Affiliation(s)
- Pi-Jing Wei
- Key Laboratory of Intelligent Computing & Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, 111 Jiulong Road, 230601, Anhui, China
- Ziqiang Guo
- Key Laboratory of Intelligent Computing & Signal Processing of Ministry of Education, School of Computer Science and Technology, Anhui University, 111 Jiulong Road, 230601, Anhui, China
- Zhen Gao
- Key Laboratory of Intelligent Computing & Signal Processing of Ministry of Education, School of Computer Science and Technology, Anhui University, 111 Jiulong Road, 230601, Anhui, China
- Zheng Ding
- Key Laboratory of Intelligent Computing & Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, 111 Jiulong Road, 230601, Anhui, China
- Rui-Fen Cao
- Key Laboratory of Intelligent Computing & Signal Processing of Ministry of Education, School of Computer Science and Technology, Anhui University, 111 Jiulong Road, 230601, Anhui, China
- Yansen Su
- Key Laboratory of Intelligent Computing & Signal Processing of Ministry of Education, School of Artificial Intelligence, Anhui University, 111 Jiulong Road, 230601, Anhui, China
- Chun-Hou Zheng
- Key Laboratory of Intelligent Computing & Signal Processing of Ministry of Education, School of Artificial Intelligence, Anhui University, 111 Jiulong Road, 230601, Anhui, China
8
Yang B, Wang YW, Zhang K. Interactions between circRNA and protein in breast cancer. Gene 2024; 895:148019. [PMID: 37984538 DOI: 10.1016/j.gene.2023.148019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Revised: 11/10/2023] [Accepted: 11/17/2023] [Indexed: 11/22/2023]
Abstract
Circular RNA (circRNA) is a newly discovered endogenous non-coding RNA that plays important roles in the occurrence and development of various cancers. Current research indicates that circRNA can act as an miRNA sponge to inhibit miRNA function, interact with proteins, and be translated into proteins. Most current research focuses on the circRNA-miRNA interaction; however, few studies have investigated the interaction between circRNAs and RNA-binding proteins (RBPs) in breast cancer. In this review, we systematically summarize the potential molecular mechanisms of the circRNA-protein interaction in breast cancer. Specifically, we elaborate on the direct interaction between circRNAs and proteins in breast cancer, including the functions of circRNAs as protein sponges, decoys, and scaffolds that affect the progression of breast cancer. We also discuss the indirect interaction between circRNAs and proteins in breast cancer, in which RBPs, transcription factors, and m6A-modifying enzymes in turn regulate the expression and formation of circRNAs. Finally, we discuss the potential application of the circRNA-protein interaction for treating breast cancer, providing a reference for further research in this field.
Affiliation(s)
- Bin Yang
- Department of Breast Surgery, General Surgery, Qilu Hospital of Shandong University, Jinan 250012, Shandong, People's Republic of China
- Ya-Wen Wang
- Department of Breast Surgery, General Surgery, Qilu Hospital of Shandong University, Jinan 250012, Shandong, People's Republic of China
- Kai Zhang
- Department of Breast Surgery, General Surgery, Qilu Hospital of Shandong University, Jinan 250012, Shandong, People's Republic of China.
9
Choi SR, Lee M. Transformer Architecture and Attention Mechanisms in Genome Data Analysis: A Comprehensive Review. Biology 2023; 12:1033. [PMID: 37508462 PMCID: PMC10376273 DOI: 10.3390/biology12071033] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/20/2023] [Revised: 07/18/2023] [Accepted: 07/21/2023] [Indexed: 07/30/2023]
Abstract
The emergence and rapid development of deep learning, specifically transformer-based architectures and attention mechanisms, have had transformative implications across several domains, including bioinformatics and genome data analysis. The analogous nature of genome sequences to language texts has enabled the application of techniques that have exhibited success in fields ranging from natural language processing to genomic data. This review provides a comprehensive analysis of the most recent advancements in the application of transformer architectures and attention mechanisms to genome and transcriptome data. The focus of this review is on the critical evaluation of these techniques, discussing their advantages and limitations in the context of genome data analysis. With the swift pace of development in deep learning methodologies, it becomes vital to continually assess and reflect on the current standing and future direction of the research. Therefore, this review aims to serve as a timely resource for both seasoned researchers and newcomers, offering a panoramic view of the recent advancements and elucidating the state-of-the-art applications in the field. Furthermore, this review paper serves to highlight potential areas of future investigation by critically evaluating studies from 2019 to 2023, thereby acting as a stepping-stone for further research endeavors.
Affiliation(s)
- Minhyeok Lee
- School of Electrical and Electronics Engineering, Chung-Ang University, Seoul 06974, Republic of Korea
10
Wu P, Nie Z, Huang Z, Zhang X. CircPCBL: Identification of Plant CircRNAs with a CNN-BiGRU-GLT Model. Plants (Basel, Switzerland) 2023; 12:1652. [PMID: 37111874 PMCID: PMC10143888 DOI: 10.3390/plants12081652] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2023] [Revised: 04/10/2023] [Accepted: 04/13/2023] [Indexed: 06/19/2023]
Abstract
Circular RNAs (circRNAs), which are produced post-splicing of pre-mRNAs, are strongly linked to the emergence of several tumor types. The initial stage in conducting follow-up studies involves identifying circRNAs. Currently, animals are the primary target of most established circRNA recognition technologies. However, the sequence features of plant circRNAs differ from those of animal circRNAs, making it impossible to detect plant circRNAs with these methods. For example, there are non-GT/AG splicing signals at circRNA junction sites and few reverse complementary sequences and repetitive elements in the flanking intron sequences of plant circRNAs. In addition, there have been few studies on circRNAs in plants, and thus it is urgent to create a plant-specific method for identifying circRNAs. In this study, we propose CircPCBL, a deep-learning approach that uses only raw sequences to distinguish between circRNAs found in plants and other lncRNAs. CircPCBL comprises two separate detectors: a CNN-BiGRU detector and a GLT detector. The CNN-BiGRU detector takes the one-hot encoding of the RNA sequence as input, while the GLT detector uses k-mer (k = 1 to 4) features. The output matrices of the two submodels are then concatenated and passed through a fully connected layer to produce the final output. To verify the generalization performance of the model, we evaluated CircPCBL using several datasets, and the results revealed that it had an F1 score of 85.40% on the validation dataset composed of six different plant species and 85.88%, 75.87%, and 86.83% on the three cross-species independent test sets composed of Cucumis sativus, Populus trichocarpa, and Gossypium raimondii, respectively. With accuracies of 90.9% and 90%, respectively, CircPCBL successfully predicted ten of the eleven experimentally reported circRNAs of Poncirus trifoliata and nine of the ten lncRNAs of rice on the real set. CircPCBL could potentially contribute to the identification of circRNAs in plants.
In addition, it is remarkable that CircPCBL also achieved an average accuracy of 94.08% on the human datasets, which is also an excellent result, implying its potential application in animal datasets. Ultimately, CircPCBL is available as a web server, from which the data and source code can also be downloaded free of charge.
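The two input representations named in this abstract, one-hot encoding and k-mer features for k = 1 to 4, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the CircPCBL implementation: the function names, the A/C/G/U ordering, the T-to-U conversion, and the use of normalized frequencies are all choices made here for the example.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGU"  # assumed base ordering for this sketch


def one_hot(seq: str) -> list[list[int]]:
    """One row of four 0/1 values per base; unknown bases become all zeros."""
    index = {base: i for i, base in enumerate(ALPHABET)}
    rows = []
    for base in seq.upper().replace("T", "U"):
        row = [0] * len(ALPHABET)
        if base in index:
            row[index[base]] = 1
        rows.append(row)
    return rows


def kmer_frequencies(seq: str, k_values=(1, 2, 3, 4)) -> list[float]:
    """Concatenate normalized k-mer frequencies for each k in k_values."""
    seq = seq.upper().replace("T", "U")
    features = []
    for k in k_values:
        windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
        counts = Counter(windows)
        total = max(len(windows), 1)  # avoid division by zero on short input
        for kmer in product(ALPHABET, repeat=k):
            features.append(counts["".join(kmer)] / total)
    return features


print(one_hot("AC"))
print(len(kmer_frequencies("ACGUACGUAC")))
```

With k = 1 to 4 over a four-letter alphabet, the k-mer vector has 4 + 16 + 64 + 256 = 340 entries regardless of sequence length, whereas the one-hot matrix grows with the sequence.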
Affiliation(s)
- Pengpeng Wu
- Anhui Province Key Laboratory of Smart Agricultural Technology and Equipment, Anhui Agricultural University, Hefei 230036, China
- School of Life Science, Anhui Agricultural University, Hefei 230036, China
- Zhenjun Nie
- Anhui Province Key Laboratory of Smart Agricultural Technology and Equipment, Anhui Agricultural University, Hefei 230036, China
- School of Information and Computer Science, Anhui Agricultural University, Hefei 230036, China
- Zhiqiang Huang
- Anhui Province Key Laboratory of Smart Agricultural Technology and Equipment, Anhui Agricultural University, Hefei 230036, China
- School of Information and Computer Science, Anhui Agricultural University, Hefei 230036, China
- Xiaodan Zhang
- Anhui Province Key Laboratory of Smart Agricultural Technology and Equipment, Anhui Agricultural University, Hefei 230036, China
- School of Information and Computer Science, Anhui Agricultural University, Hefei 230036, China
11
Tong Y, Zhang S, Riddle S, Song R, Yue D. Circular RNAs in the Origin of Developmental Lung Disease: Promising Diagnostic and Therapeutic Biomarkers. Biomolecules 2023; 13:533. [PMID: 36979468 PMCID: PMC10046088 DOI: 10.3390/biom13030533] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/09/2023] [Revised: 03/11/2023] [Accepted: 03/12/2023] [Indexed: 03/17/2023] Open
Abstract
Circular RNA (circRNA) is a newly discovered noncoding RNA that regulates gene transcription, binds to RNA-related proteins, and encodes protein microRNAs (miRNAs). The development of molecular biomarkers such as circRNAs holds great promise in the diagnosis and prognosis of clinical disorders. Importantly, circRNA-mediated maternal-fetus risk factors including environmental (high altitude), maternal (preeclampsia, smoking, and chorioamnionitis), placental, and fetal (preterm birth and low birth weight) factors are the early origins and likely to contribute to the occurrence and progression of developmental and pediatric cardiopulmonary disorders. Although studies of circRNAs in normal cardiopulmonary development and developmental diseases have just begun, some studies have revealed their expression patterns. Here, we provide an overview of circRNAs’ biogenesis and biological functions. Furthermore, this review aims to emphasize the importance of circRNAs in maternal-fetus risk factors. Likewise, the potential biomarker and therapeutic target of circRNAs in developmental and pediatric lung diseases are explored.
Affiliation(s)
- Yajie Tong
- Department of Pediatrics, Shengjing Hospital of China Medical University, Shenyang 110004, China
- Shuqing Zhang
- School of Pharmacy, China Medical University, Shenyang 110122, China
- Suzette Riddle
- Cardiovascular Pulmonary Research Laboratories, Departments of Pediatrics and Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Rui Song
- Lawrence D. Longo MD Center for Perinatal Biology, Department of Basic Sciences, Loma Linda University School of Medicine, Loma Linda, CA 92350, USA
- Correspondence: (R.S.); (D.Y.); Tel.: +01-909-558-4325 (R.S.); +86-24-9661551125 (D.Y.)
- Dongmei Yue
- Department of Pediatrics, Shengjing Hospital of China Medical University, Shenyang 110004, China
- Correspondence: (R.S.); (D.Y.); Tel.: +01-909-558-4325 (R.S.); +86-24-9661551125 (D.Y.)
12
Chen JW, Shrestha L, Green G, Leier A, Marquez-Lago TT. The hitchhikers' guide to RNA sequencing and functional analysis. Brief Bioinform 2023; 24:bbac529. [PMID: 36617463 PMCID: PMC9851315 DOI: 10.1093/bib/bbac529] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 07/11/2022] [Revised: 10/18/2022] [Accepted: 11/07/2022] [Indexed: 01/10/2023] Open
Abstract
DNA and RNA sequencing technologies have revolutionized biology and biomedical sciences, sequencing full genomes and transcriptomes at very high speeds and reasonably low costs. RNA sequencing (RNA-Seq) enables transcript identification and quantification, but once sequencing has concluded researchers can be easily overwhelmed with questions such as how to go from raw data to differential expression (DE), pathway analysis and interpretation. Several pipelines and procedures have been developed to this effect. Even though there is no unique way to perform RNA-Seq analysis, it usually follows these steps: 1) raw reads quality check, 2) alignment of reads to a reference genome, 3) aligned reads' summarization according to an annotation file, 4) DE analysis and 5) gene set analysis and/or functional enrichment analysis. Each step requires researchers to make decisions, and the wide variety of options and resulting large volumes of data often lead to interpretation challenges. There also seems to be insufficient guidance on how best to obtain relevant information and derive actionable knowledge from transcription experiments. In this paper, we explain RNA-Seq steps in detail and outline differences and similarities of different popular options, as well as advantages and disadvantages. We also discuss non-coding RNA analysis, multi-omics, meta-transcriptomics and the use of artificial intelligence methods complementing the arsenal of tools available to researchers. Lastly, we perform a complete analysis from raw reads to DE and functional enrichment analysis, visually illustrating how results are not absolute truths and how algorithmic decisions can greatly impact results and interpretation.
Affiliation(s)
- Jiung-Wen Chen
- Department of Biology, University of Alabama at Birmingham, Birmingham, AL, USA
- Lisa Shrestha
- Department of Genetics, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
- George Green
- Department of Biology, University of Alabama at Birmingham, Birmingham, AL, USA
- André Leier
- Department of Genetics, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
- Department of Cell, Developmental and Integrative Biology, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
- Tatiana T Marquez-Lago
- Department of Genetics, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
- Department of Cell, Developmental and Integrative Biology, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
- Department of Microbiology, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
13
Shen Z, Shao YL, Liu W, Zhang Q, Yuan L. Prediction of Back-splicing sites for CircRNA formation based on convolutional neural networks. BMC Genomics 2022; 23:581. [PMID: 35962324 PMCID: PMC9373444 DOI: 10.1186/s12864-022-08820-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 05/05/2022] [Accepted: 08/03/2022] [Indexed: 11/25/2022] Open
Abstract
BACKGROUND Circular RNAs (CircRNAs) play critical roles in gene expression regulation and disease development. Understanding the regulation mechanism of CircRNA formation can help reveal the role of CircRNAs in these biological processes. Back-splicing is important for CircRNA formation, and back-splicing site prediction helps uncover how CircRNAs are formed. Several methods have been proposed for back-splicing site prediction or circRNA-related prediction tasks, but their performance was constrained by poor feature learning and utilization. RESULTS In this study, CircCNN was proposed to predict pre-mRNA back-splicing sites. A convolutional neural network and batch normalization are the main parts of CircCNN. Experimental results on three datasets show that CircCNN outperforms other baseline models. Moreover, PPM (Position Probability Matrix) features extracted by CircCNN were converted into motifs. Further analysis reveals that some of the motifs found by CircCNN match known motifs involved in gene expression regulation, and that the distribution of motifs and special short sequences is important for pre-mRNA back-splicing. CONCLUSIONS In general, the findings in this study provide a new direction for exploring CircRNA-related gene expression regulatory mechanisms and identifying potential targets for complex malignant diseases. The datasets and source code of this study are freely available at: https://github.com/szhh521/CircCNN
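For readers unfamiliar with the term, a position probability matrix can be illustrated with a short sketch. It computes a PPM from a set of equal-length aligned sequences, which is the generic definition; it does not reproduce how CircCNN derives PPM features from its convolutional filters. The function names, the A/C/G/U alphabet, and the optional `pseudocount` parameter are assumptions made for this example.

```python
def position_probability_matrix(sequences, alphabet="ACGU", pseudocount=0.0):
    """Return one {base: probability} dict per alignment position."""
    if not sequences:
        raise ValueError("at least one sequence is required")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("all sequences must have the same length")
    matrix = []
    for pos in range(length):
        column = [s[pos] for s in sequences]
        total = len(column) + pseudocount * len(alphabet)
        matrix.append(
            {base: (column.count(base) + pseudocount) / total for base in alphabet}
        )
    return matrix


def consensus(ppm):
    """Most probable base at each position, joined into a motif string."""
    return "".join(max(column, key=column.get) for column in ppm)


# Toy aligned sites (assumption: illustrative sequences, not real data)
sites = ["ACGU", "ACGA", "AGGU", "ACGU"]
ppm = position_probability_matrix(sites)
print(ppm[1])
print(consensus(ppm))
```

Each column of the matrix sums to 1 when every input base belongs to the alphabet, so the matrix can be read directly as a motif or rendered as a sequence logo.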
Affiliation(s)
- Zhen Shen
- School of Computer and Software, Nanyang Institute of Technology, Changjiang Road 80, Nanyang, 473004, Henan, China
- Yan Ling Shao
- School of Computer and Software, Nanyang Institute of Technology, Changjiang Road 80, Nanyang, 473004, Henan, China
- Wei Liu
- School of Computer and Software, Nanyang Institute of Technology, Changjiang Road 80, Nanyang, 473004, Henan, China
- Qinhu Zhang
- Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Siping Road 1239, Shanghai, 200092, China
- Institute of Machine Learning and Systems Biology, School of Electronics and Information Engineering, Tongji University, Caoan Road 4800, Shanghai, 201804, China
- Lin Yuan
- School of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Daxue Road 3501, Jinan, 250353, Shandong, China
14
Xue C, Li G, Zheng Q, Gu X, Bao Z, Lu J, Li L. The functional roles of the circRNA/Wnt axis in cancer. Mol Cancer 2022; 21:108. [PMID: 35513849 PMCID: PMC9074313 DOI: 10.1186/s12943-022-01582-0] [Citation(s) in RCA: 85] [Impact Index Per Article: 28.3] [Received: 01/04/2022] [Accepted: 04/22/2022] [Indexed: 01/09/2023]
Abstract
CircRNAs, covalently closed noncoding RNAs, are expressed in a wide range of species, from viruses to plants to mammals. CircRNAs were enriched in the Wnt pathway. Aberrant Wnt pathway activation is involved in the development of various types of cancers. Accumulating evidence indicates that the circRNA/Wnt axis modulates the expression of cancer-associated genes and thereby regulates cancer progression. Wnt pathway-related circRNA expression is clearly associated with many clinical characteristics. CircRNAs could regulate cell biological functions by interacting with the Wnt pathway. Moreover, Wnt pathway-related circRNAs are promising potential biomarkers for cancer diagnosis, prognosis evaluation, and treatment. In our review, we summarized the recent research progress on the role and clinical application of Wnt pathway-related circRNAs in tumorigenesis and progression.
Affiliation(s)
- Chen Xue
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, College of Medicine, National Clinical Research Center for Infectious Diseases, Zhejiang University, No. 79 Qingchun Road, Shangcheng District, 310003 Hangzhou, China
- Ganglei Li
- Department of Neurosurgery, The First Affiliated Hospital, College of Medicine, Zhejiang University, 310003 Hangzhou, China
- Qiuxian Zheng
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, College of Medicine, National Clinical Research Center for Infectious Diseases, Zhejiang University, No. 79 Qingchun Road, Shangcheng District, 310003 Hangzhou, China
- Xinyu Gu
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, College of Medicine, National Clinical Research Center for Infectious Diseases, Zhejiang University, No. 79 Qingchun Road, Shangcheng District, 310003 Hangzhou, China
- Zhengyi Bao
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, College of Medicine, National Clinical Research Center for Infectious Diseases, Zhejiang University, No. 79 Qingchun Road, Shangcheng District, 310003 Hangzhou, China
- Juan Lu
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, College of Medicine, National Clinical Research Center for Infectious Diseases, Zhejiang University, No. 79 Qingchun Road, Shangcheng District, 310003 Hangzhou, China
- Lanjuan Li
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, College of Medicine, National Clinical Research Center for Infectious Diseases, Zhejiang University, No. 79 Qingchun Road, Shangcheng District, 310003 Hangzhou, China
15
Micheel J, Safrastyan A, Wollny D. Advances in Non-Coding RNA Sequencing. Noncoding RNA 2021; 7:70. [PMID: 34842804 PMCID: PMC8628893 DOI: 10.3390/ncrna7040070] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 09/30/2021] [Revised: 10/22/2021] [Accepted: 10/26/2021] [Indexed: 12/11/2022]
Abstract
Non-coding RNAs (ncRNAs) comprise a set of abundant and functionally diverse RNA molecules. Since the discovery of the first ncRNA in the 1960s, ncRNAs have been shown to be involved in nearly all steps of the central dogma of molecular biology. In recent years, the pace of discovery of novel ncRNAs and their cellular roles has been greatly accelerated by high-throughput sequencing. Advances in sequencing technology, library preparation protocols as well as computational biology helped to greatly expand our knowledge of which ncRNAs exist throughout the kingdoms of life. Moreover, RNA sequencing revealed crucial roles of many ncRNAs in human health and disease. In this review, we discuss the most recent methodological advancements in the rapidly evolving field of high-throughput sequencing and how it has greatly expanded our understanding of ncRNA biology across a large number of different organisms.
Affiliation(s)
- Damian Wollny
- RNA Bioinformatics/High Throughput Analysis, Faculty of Mathematics and Computer Science, Friedrich Schiller University, 07743 Jena, Germany; (J.M.); (A.S.)
16
Asim MN, Ibrahim MA, Imran Malik M, Dengel A, Ahmed S. Advances in Computational Methodologies for Classification and Sub-Cellular Locality Prediction of Non-Coding RNAs. Int J Mol Sci 2021; 22:8719. [PMID: 34445436 PMCID: PMC8395733 DOI: 10.3390/ijms22168719] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 07/17/2021] [Revised: 08/02/2021] [Accepted: 08/03/2021] [Indexed: 02/06/2023]
Abstract
Apart from protein-coding ribonucleic acids (RNAs), there exists a variety of non-coding RNAs (ncRNAs) which regulate complex cellular and molecular processes. High-throughput sequencing technologies and bioinformatics approaches have largely promoted the exploration of ncRNAs, which revealed their crucial roles in gene regulation, miRNA binding, protein interactions, and splicing. Furthermore, ncRNAs are involved in the development of complicated diseases like cancer. Categorization of ncRNAs is essential to understand the mechanisms of diseases and to develop effective treatments. Sub-cellular localization information of ncRNAs demystifies diverse functionalities of ncRNAs. To date, several computational methodologies have been proposed to precisely identify the class as well as sub-cellular localization patterns of RNAs. This paper discusses different types of ncRNAs, reviews computational approaches proposed in the last 10 years to distinguish coding RNA from ncRNA, to identify sub-types of ncRNAs such as piwi-associated RNA, micro RNA, long ncRNA, and circular RNA, and to determine sub-cellular localization of distinct ncRNAs and RNAs. Furthermore, it summarizes diverse ncRNA classification and sub-cellular localization determination datasets along with benchmark performance to aid the development and evaluation of novel computational methodologies. It identifies research gaps, heterogeneity, and challenges in the development of computational approaches for RNA sequence analysis. We consider that our expert analysis will assist Artificial Intelligence researchers with knowing state-of-the-art performance, model selection for various tasks on one platform, dominantly used sequence descriptors, neural architectures, and interpreting inter-species and intra-species performance deviation.
Affiliation(s)
- Muhammad Nabeel Asim
- German Research Center for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany; (M.A.I.); (A.D.); (S.A.)
- Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany
- Muhammad Ali Ibrahim
- German Research Center for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany; (M.A.I.); (A.D.); (S.A.)
- Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany
- Muhammad Imran Malik
- National Center for Artificial Intelligence (NCAI), National University of Sciences and Technology, Islamabad 44000, Pakistan
- School of Electrical Engineering & Computer Science, National University of Sciences and Technology, Islamabad 44000, Pakistan
- Andreas Dengel
- German Research Center for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany; (M.A.I.); (A.D.); (S.A.)
- Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany
- Sheraz Ahmed
- German Research Center for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany; (M.A.I.); (A.D.); (S.A.)
- DeepReader GmbH, Trippstadter Str. 122, 67663 Kaiserslautern, Germany