1
Gondal MN, Farooqi HMU. Single-Cell Transcriptomic Approaches for Decoding Non-Coding RNA Mechanisms in Colorectal Cancer. Noncoding RNA 2025; 11:24. [PMID: 40126348 PMCID: PMC11932299 DOI: 10.3390/ncrna11020024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2024] [Revised: 01/27/2025] [Accepted: 03/03/2025] [Indexed: 03/25/2025]
Abstract
Non-coding RNAs (ncRNAs) play crucial roles in colorectal cancer (CRC) development and progression. Recent developments in single-cell transcriptome profiling methods have revealed surprising levels of expression variability among seemingly homogeneous cells, suggesting the existence of many more cell types than previously estimated. This review synthesizes recent advances in ncRNA research in CRC, emphasizing single-cell bioinformatics approaches for their analysis. We explore computational methods and tools used for ncRNA identification, characterization, and functional prediction in CRC, with a focus on single-cell RNA sequencing (scRNA-seq) data. The review highlights key bioinformatics strategies, including sequence-based and structure-based approaches, machine learning applications, and multi-omics data integration. We discuss how these computational techniques can be applied to analyze differential expression, perform functional enrichment, and construct regulatory networks involving ncRNAs in CRC. Additionally, we examine the role of bioinformatics in leveraging ncRNAs as diagnostic and prognostic biomarkers for CRC. We also discuss recent scRNA-seq studies revealing ncRNA heterogeneity in CRC. This review aims to provide a comprehensive overview of the current state of single-cell bioinformatics in ncRNA CRC research and outline future directions in this rapidly evolving field, emphasizing the integration of computational approaches with experimental validation to advance our understanding of ncRNA biology in CRC.
Affiliation(s)
- Mahnoor Naseer Gondal
  - Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
  - Michigan Center for Translational Pathology, University of Michigan, Ann Arbor, MI 48109, USA
- Hafiz Muhammad Umer Farooqi
  - Laboratory of Energy Metabolism, Division of Metabolic Disorders, Children’s Hospital of Orange County, Orange, CA 92868, USA
2
Diamantopoulos MA, Adamopoulos PG, Scorilas A. Small non-coding RNAs as diagnostic, prognostic and predictive biomarkers of gynecological cancers: an update. Expert Rev Mol Diagn 2024; 24:979-995. [PMID: 39390687 DOI: 10.1080/14737159.2024.2408740] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2024] [Accepted: 09/22/2024] [Indexed: 10/12/2024]
Abstract
Introduction: Non-coding RNAs (ncRNAs) comprise a heterogeneous group of RNA molecules. Emerging evidence suggests their involvement in various aspects of tumorigenesis, particularly in gynecological malignancies. Notably, ncRNAs have been implicated as mediators within tumor signaling pathways, exerting their influence through interactions with RNA or proteins. These findings support the hypothesis that ncRNAs constitute therapeutic targets and point to their clinical potential as stratification biomarkers. Areas covered: The review outlines the use of small ncRNAs, including miRNAs, tRNA-derived small RNAs, PIWI-interacting RNAs and circular RNAs, for diagnostic, prognostic, and predictive purposes in gynecological cancers. It aims to increase our knowledge of their functions in tumor biology and their translation into clinical practice. Expert opinion: By leveraging interdisciplinary collaborations, scientists can decipher the role of small ncRNAs as diagnostic, prognostic and predictive biomarkers of gynecological tumors. Integrating small ncRNA-based assays into clinical practice will allow clinicians to tailor treatment plans to each patient, reducing the likelihood of adverse responses. Nevertheless, addressing challenges such as standardizing experimental methodologies and refining diagnostic assays is imperative for advancing small ncRNA research in gynecological cancer.
Affiliation(s)
- Marios A Diamantopoulos
  - Department of Biochemistry and Molecular Biology, Faculty of Biology, National and Kapodistrian University of Athens, Athens, Greece
- Panagiotis G Adamopoulos
  - Department of Biochemistry and Molecular Biology, Faculty of Biology, National and Kapodistrian University of Athens, Athens, Greece
- Andreas Scorilas
  - Department of Biochemistry and Molecular Biology, Faculty of Biology, National and Kapodistrian University of Athens, Athens, Greece
3
Loganathan T, Doss C GP. Non-coding RNAs in human health and disease: potential function as biomarkers and therapeutic targets. Funct Integr Genomics 2023; 23:33. [PMID: 36625940 PMCID: PMC9838419 DOI: 10.1007/s10142-022-00947-4] [Citation(s) in RCA: 77] [Impact Index Per Article: 38.5] [Received: 08/25/2022] [Revised: 12/14/2022] [Accepted: 12/15/2022] [Indexed: 01/11/2023]
Abstract
Human diseases have been a critical threat from the beginning of human history. Knowing the origin, course of action and treatment of any disease state is essential. With the introduction and evolution of technology, a microscopic, molecular-level approach offers a more coherent and accurate way to explore disease mechanism, progression, and therapy than a macroscopic approach. Non-coding RNAs (ncRNAs) play increasingly important roles in detecting, developing, and treating abnormalities related to physiology, pathology, genetics, epigenetics, cancer, and developmental diseases. Non-coding RNAs are becoming increasingly crucial as powerful, multipurpose regulators of biological processes. In parallel, a rising amount of scientific information has revealed links between abnormal non-coding RNA expression and human disorders. Alongside advancements in RNA-sequencing methods, numerous non-coding transcripts with unknown functions have been found. Non-coding RNAs come in a variety of forms, including circular RNAs with a continuous closed loop (circRNAs), long non-coding RNAs (lncRNAs), and microRNAs (miRNAs). This review comprises specific information on their biogenesis, mode of action, physiological function, and significance concerning disease (such as cancer, cardiovascular diseases and others). It focuses on non-coding RNAs as specific biomarkers and novel therapeutic targets.
Affiliation(s)
- Tamizhini Loganathan
  - Laboratory of Integrative Genomics, Department of Integrative Biology, School of Biosciences and Technology, Vellore Institute of Technology (VIT), Vellore 632014, Tamil Nadu, India
- George Priya Doss C
  - Laboratory of Integrative Genomics, Department of Integrative Biology, School of Biosciences and Technology, Vellore Institute of Technology (VIT), Vellore 632014, Tamil Nadu, India
4
Majumder R, Ghosh S, Das A, Singh MK, Samanta S, Saha A, Saha RP. Prokaryotic ncRNAs: Master regulators of gene expression. Curr Res Pharmacol Drug Discov 2022; 3:100136. [PMID: 36568271 PMCID: PMC9780080 DOI: 10.1016/j.crphar.2022.100136] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2022] [Revised: 09/11/2022] [Accepted: 10/14/2022] [Indexed: 12/14/2022]
Abstract
ncRNAs play a pivotal role in various biological activities, ranging from gene regulation to the control of important developmental networks. These small molecules are not only present in all three domains of cellular life but are also important modulators of gene regulation in all of them. In this review, we discuss various aspects of ncRNA biology, especially their role in bacteria. The last two decades of scientific research have shown that ncRNAs play an important role in modulating various regulatory pathways in bacteria, including the adaptive immune system and gene regulation. Notably, these small molecules are also employed in various processes related to the pathogenicity of virulent microorganisms.
Affiliation(s)
- Rajib Majumder
  - Department of Biotechnology, School of Life Science & Biotechnology, Adamas University, Kolkata, 700126, India
- Sanmitra Ghosh
  - Department of Biological Sciences, School of Life Science & Biotechnology, Adamas University, Kolkata, 700126, India
- Arpita Das
  - Department of Biotechnology, School of Life Science & Biotechnology, Adamas University, Kolkata, 700126, India
- Manoj Kumar Singh
  - Department of Biotechnology, School of Life Science & Biotechnology, Adamas University, Kolkata, 700126, India
- Saikat Samanta
  - Department of Biotechnology, School of Life Science & Biotechnology, Adamas University, Kolkata, 700126, India
- Abinit Saha
  - Department of Biotechnology, School of Life Science & Biotechnology, Adamas University, Kolkata, 700126, India (corresponding author)
- Rudra P. Saha
  - Department of Biotechnology, School of Life Science & Biotechnology, Adamas University, Kolkata, 700126, India (corresponding author)
5
Mahendran G, Jayasinghe OT, Thavakumaran D, Arachchilage GM, Silva GN. Key players in regulatory RNA realm of bacteria. Biochem Biophys Rep 2022; 30:101276. [PMID: 35592614 PMCID: PMC9111926 DOI: 10.1016/j.bbrep.2022.101276] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/10/2022] [Revised: 04/30/2022] [Accepted: 05/04/2022] [Indexed: 11/30/2022]
Abstract
Precise regulation of gene expression is crucial for living cells to adapt and survive in diverse environmental conditions. Among the common cellular regulatory mechanisms, RNA-based regulators play a key role in all domains of life. The discovery of regulatory RNAs has brought about a paradigm shift in molecular biology, as many regulatory functions of RNA have been identified beyond its canonical roles as messenger, ribosomal and transfer RNA. In the complex regulatory RNA network, riboswitches, small RNAs, and RNA thermometers can be identified as some of the key players. Herein, we review the discovery, mechanism, and potential therapeutic use of these classes of regulatory RNAs, which are mainly found in bacteria. Being highly adaptive organisms that inhabit a broad range of ecological niches, bacteria have adopted tight and rapid-responding gene regulation mechanisms. This review aims to highlight how bacteria utilize versatile RNA structures and sequences to build a sophisticated gene regulation network. Highlights: The three major classes of prokaryotic ncRNAs and their characterized mechanisms of operation in gene regulation. sRNAs emerging as major players in global gene regulatory networks. Riboswitch-mediated gene control mechanisms through on/off switches in response to ligand binding. RNA thermosensors for temperature-dependent gene expression. Therapeutic importance of ncRNAs and computational approaches involved in the discovery of ncRNAs.
Affiliation(s)
- Gowthami Mahendran
  - Department of Chemistry, University of Colombo, Colombo, Sri Lanka
  - Department of Chemistry and Biochemistry, University of Notre Dame, IN, 46556, USA
- Oshadhi T. Jayasinghe
  - Department of Chemistry, University of Colombo, Colombo, Sri Lanka
  - Department of Biochemistry and Molecular Biology, Center for RNA Molecular Biology, Pennsylvania State University, University Park, PA, 16802, USA
- Dhanushika Thavakumaran
  - Department of Chemistry, University of Colombo, Colombo, Sri Lanka
  - Department of Chemistry and Biochemistry, University of Notre Dame, IN, 46556, USA
- Gayan Mirihana Arachchilage
  - Howard Hughes Medical Institute, Yale University, New Haven, CT, 06520-8103, USA
  - PTC Therapeutics Inc, South Plainfield, NJ, 07080, USA
- Gayathri N. Silva
  - Department of Chemistry, University of Colombo, Colombo, Sri Lanka (corresponding author)
6
Yuan S, Gong Y, Wang G, Zhang B, Liu Y, Zhang H. MSFF-CDCGAN: A novel method to predict RNA secondary structure based on Generative Adversarial Network. Methods 2022; 204:368-375. [DOI: 10.1016/j.ymeth.2022.04.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2022] [Revised: 04/07/2022] [Accepted: 04/11/2022] [Indexed: 11/25/2022]
7
Deng L, Li W, Zhang J. LDAH2V: Exploring Meta-Paths Across Multiple Networks for lncRNA-Disease Association Prediction. IEEE/ACM Trans Comput Biol Bioinform 2021; 18:1572-1581. [PMID: 31725386 DOI: 10.1109/tcbb.2019.2946257] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 06/10/2023]
Abstract
Accumulating evidence has demonstrated dysfunctions of long non-coding RNAs (lncRNAs) are involved in various complex human diseases. However, even today, the relationships between lncRNAs and diseases remain unknown in most cases. Developing effective computational approaches to identify potential lncRNA-disease associations has become a hot topic. Existing network-based approaches are usually focused on the intrinsic features of lncRNAs and diseases but ignore the heterogeneous information of biological networks. Considering the limitations in previous methods, we propose LDAH2V, an efficient computational framework for predicting potential lncRNA-disease associations. LDAH2V uses the HIN2Vec to calculate the meta-path and feature vector for each lncRNA-disease pair in the heterogeneous information network (HIN), which consists of lncRNA similarity network, disease similarity network, miRNA similarity network, and the associations between them. Then, a Gradient Boosting Tree (GBT) classifier to predict lncRNA-disease associations is built with the feature vectors. The results show that LDAH2V performs significantly better than the four existing state-of-the-art methods and gains an AUC of 0.97 in the 10-fold cross-validation test. Furthermore, case studies of colon cancer and ovarian cancer-related lncRNAs have been confirmed in related databases and medical literature.
8
Singh D, Madhawan A, Roy J. Identification of multiple RNAs using feature fusion. Brief Bioinform 2021; 22:6272794. [PMID: 33971667 DOI: 10.1093/bib/bbab178] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2021] [Revised: 04/08/2021] [Indexed: 11/13/2022]
Abstract
Detection of novel transcripts with deep sequencing has increased the demand for computational algorithms, as their identification and validation using in vivo techniques is time-consuming, costly and unreliable. Most of these discovered transcripts are non-coding RNAs, a large group known for diverse functional roles but lacking a common taxonomy. Thus, once the absence of coding potential has been established, it is crucial to recognize their prime functional category. To address this heterogeneity issue, we divide the ncRNAs into three classes and present RNA classifier (RNAC), which categorizes RNAs into coding, housekeeping, small non-coding and long non-coding classes. RNAC utilizes alignment-based genomic descriptors to extract statistical, local binary pattern and histogram features and fuses them to construct classification models with extreme gradient boosting. The experiments are performed on four species, and performance is assessed on multiclass and conventional binary (coding versus non-coding) classification problems. The proposed approach achieved >93% accuracy on both classification problems and also outperformed other well-known existing methods in coding potential prediction. This validates the usefulness of feature fusion for improved performance on both types of classification problems. Hence, RNAC is a valuable tool for the accurate identification of multiple RNAs.
Affiliation(s)
- Dalwinder Singh
  - National Agri-Food Biotechnology Institute, Sector 81, SAS Nagar, 140306, Punjab, India
- Akansha Madhawan
  - National Agri-Food Biotechnology Institute, Sector 81, SAS Nagar, 140306, Punjab, India
- Joy Roy
  - National Agri-Food Biotechnology Institute, Sector 81, SAS Nagar, 140306, Punjab, India
9
Wang X, Yang Y, Liu J, Wang G. The stacking strategy-based hybrid framework for identifying non-coding RNAs. Brief Bioinform 2021; 22:6165004. [PMID: 33693454 DOI: 10.1093/bib/bbab023] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 11/04/2020] [Revised: 01/16/2021] [Indexed: 12/12/2022]
Abstract
With the development of next-generation sequencing technology, a large number of transcripts need to be analyzed, and distinguishing non-coding RNAs (ncRNAs) from coding RNAs has been a challenge. For non-model organisms, many existing methods cannot identify ncRNAs because transcriptional data are lacking. Therefore, in addition to using deoxyribonucleic acid-based and RNA-based features, we proposed a hybrid framework based on the stacking strategy to identify ncRNAs, and we added eight new features based on predicted peptides. The proposed framework is a two-layer stacking classifier that combines random forest (RF), LightGBM, XGBoost and logistic regression (LR) models. We used this framework to build two types of models. We tested the cross-species ncRNA identification model on six different species: human, mouse, zebrafish, fruit fly, worm and Arabidopsis. Compared with other tools, our model performed best on the Arabidopsis, worm and zebrafish datasets, with accuracies of 98.36%, 99.65% and 94.12%, respectively. For the performance metrics analysis, the datasets of the six species were considered as a whole, and the sensitivity, accuracy, precision and F1 values of our model were the best. For the plant-specific ncRNA identification model, the average values of the six metrics in the two experiments were all greater than 95%, which demonstrates that it can be used to identify ncRNAs in plants. These results indicate that the hybrid framework is applicable to both animals and plants and has significant advantages in the identification of cross-species ncRNAs.
Affiliation(s)
- Xin Wang
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- Yang Yang
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- Jian Liu
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- Guohua Wang
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
10
Systematic and computational identification of Androctonus crassicauda long non-coding RNAs. Sci Rep 2021; 11:4720. [PMID: 33633149 PMCID: PMC7907363 DOI: 10.1038/s41598-021-83815-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 06/09/2020] [Accepted: 02/09/2021] [Indexed: 01/31/2023]
Abstract
The potential function of long non-coding RNAs in regulating neighboring protein-coding genes has attracted scientists' attention. Despite the important role of lncRNAs in biological processes, a limited number of studies focus on non-model animal lncRNAs. In this study, we used a stringent step-by-step filtering pipeline and machine learning-based tools to identify Androctonus crassicauda-specific lncRNAs and to analyze the features of the predicted scorpion lncRNAs. Using this pipeline, 13,401 lncRNAs were detected in the A. crassicauda transcriptome. BLAST results indicated that the majority of these lncRNA sequences (12,642) have no identifiable orthologs even in closely related species and were therefore considered novel lncRNAs. Comparison with lncRNA prediction tools indicated that our pipeline is a helpful approach for distinguishing protein-coding from non-coding transcripts in RNA sequencing data of species without reference genomes. Moreover, analysis of lncRNA characteristics in A. crassicauda revealed that lower protein-coding potential, lower GC content, shorter transcript length, and fewer isoforms per gene are distinguishing features of A. crassicauda lncRNA transcripts.
11
Deepthi K, Jereesh A. An ensemble approach for CircRNA-disease association prediction based on autoencoder and deep neural network. Gene 2020; 762:145040. [DOI: 10.1016/j.gene.2020.145040] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 05/26/2020] [Revised: 06/28/2020] [Accepted: 08/04/2020] [Indexed: 01/26/2023]
12
Platon L, Zehraoui F, Bendahmane A, Tahi F. IRSOM, a reliable identifier of ncRNAs based on supervised self-organizing maps with rejection. Bioinformatics 2019; 34:i620-i628. [PMID: 30423081 PMCID: PMC6129289 DOI: 10.1093/bioinformatics/bty572] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 01/10/2023]
Abstract
Motivation: Non-coding RNAs (ncRNAs) play important roles in many biological processes and are involved in many diseases. Their identification is an important task, and many tools exist in the literature for this purpose. However, almost all of them focus on discriminating coding RNAs from ncRNAs without giving more biological insight. In this paper, we propose a new reliable method called IRSOM, based on a supervised Self-Organizing Map (SOM) with a rejection option, that overcomes these limitations. The rejection option in IRSOM improves the accuracy of the method and also allows ambiguous transcripts to be identified. Furthermore, with the visualization of the SOM, we analyze the rejected predictions and highlight the ambiguity of the transcripts. Results: IRSOM was tested on datasets of several species from different kingdoms and showed better results than state-of-the-art tools. The accuracy of IRSOM is always greater than 0.95 for all the species, with an average specificity of 0.98 and an average sensitivity of 0.99. Besides, IRSOM is fast (it takes around 254 s to analyze a dataset of 147 000 transcripts) and is able to handle very large datasets. Availability and implementation: IRSOM is implemented in Python and C++. It is available on our software platform EvryRNA (http://EvryRNA.ibisc.univ-evry.fr).
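The idea of a rejection option can be illustrated independently of the SOM itself. The following is a minimal sketch, not the IRSOM implementation: it assumes that some classifier already yields class probabilities, and the function name `classify_with_rejection`, the label names and the threshold value are all hypothetical choices made for this illustration.

```python
def classify_with_rejection(probs, labels=("coding", "non-coding"), threshold=0.8):
    """Return the most probable label, or "rejected" when the classifier
    is not confident enough (illustrative threshold, not IRSOM's rule)."""
    if len(probs) != len(labels):
        raise ValueError("one probability per label is required")
    # Index of the class with the highest probability
    best = max(range(len(probs)), key=lambda i: probs[i])
    # Ambiguous transcripts fall below the confidence threshold
    if probs[best] < threshold:
        return "rejected"
    return labels[best]


# Hypothetical probabilities for three transcripts
confident_coding = classify_with_rejection([0.95, 0.05])
ambiguous = classify_with_rejection([0.55, 0.45])
confident_noncoding = classify_with_rejection([0.10, 0.90])
```

Raising the threshold trades coverage for accuracy: more transcripts are rejected, but those that are classified are classified with higher confidence.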
Affiliation(s)
- Ludovic Platon
  - IBISC, Université Evry, Université Paris-Saclay, Evry, France
  - Institute of Plant Sciences Paris-Saclay, INRA, CNRS, Université Paris-Sud, Université d'Evry, Université Paris-Diderot, Orsay, France
- Farida Zehraoui
  - IBISC, Université Evry, Université Paris-Saclay, Evry, France
- Abdelhafid Bendahmane
  - Institute of Plant Sciences Paris-Saclay, INRA, CNRS, Université Paris-Sud, Université d'Evry, Université Paris-Diderot, Orsay, France
- Fariza Tahi
  - IBISC, Université Evry, Université Paris-Saclay, Evry, France
13
Interpreting and integrating big data in non-coding RNA research. Emerg Top Life Sci 2019; 3:343-355. [PMID: 33523206 DOI: 10.1042/etls20190004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/29/2019] [Revised: 07/10/2019] [Accepted: 07/15/2019] [Indexed: 12/17/2022]
Abstract
In the last two decades, we have witnessed an impressive crescendo of non-coding RNA studies, due to both the development of high-throughput RNA-sequencing strategies and an ever-increasing awareness of the involvement of newly discovered ncRNA classes in complex regulatory networks. Together with excitement for the possibility to explore previously unknown layers of gene regulation, these advancements led to the realization of the need for shared criteria of data collection and analysis and for novel integrative perspectives and tools aimed at making biological sense of very large bodies of molecular information. In the last few years, efforts to respond to this need have been devoted mainly to the regulatory interactions involving ncRNAs as direct or indirect regulators of protein-coding mRNAs. Such efforts resulted in the development of new computational tools, allowing the exploitation of the information spread in numerous different ncRNA data sets to interpret transcriptome changes under physiological and pathological cell responses. While experimental validation remains essential to identify key RNA regulatory interactions, the integration of ncRNA big data, in combination with systematic literature mining, is proving to be invaluable in identifying potential new players, biomarkers and therapeutic targets in cancer and other diseases.
14
Wei H, Liu B. iCircDA-MF: identification of circRNA-disease associations based on matrix factorization. Brief Bioinform 2019; 21:1356-1367. [DOI: 10.1093/bib/bbz057] [Citation(s) in RCA: 68] [Impact Index Per Article: 11.3] [Received: 01/31/2019] [Revised: 03/13/2019] [Accepted: 04/17/2019] [Indexed: 12/19/2022]
Abstract
Circular RNAs (circRNAs) are a group of newly discovered non-coding RNAs with a closed-loop structure, which play critical roles in various biological processes. Identifying associations between circRNAs and diseases is critical for exploring complex disease mechanisms and facilitating disease-targeted therapy. Although several computational predictors have been proposed, their performance is still limited. In this study, a novel computational method called iCircDA-MF is proposed. Because circRNA-disease associations with experimental validation are very limited, potential circRNA-disease associations are calculated based on the circRNA similarity and disease similarity extracted from disease semantic information and the known circRNA-gene, gene-disease and circRNA-disease associations. The circRNA-disease interaction profiles are then updated using the neighbour interaction profiles so as to correct false negative associations. Finally, matrix factorization is performed on the updated circRNA-disease interaction profiles to predict circRNA-disease associations. The experimental results on a widely used benchmark dataset showed that iCircDA-MF outperforms other state-of-the-art predictors and can identify new circRNA-disease associations effectively.
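The final matrix factorization step can be illustrated with a minimal sketch. It is not the iCircDA-MF objective: it assumes a plain L2-regularized factorization trained by stochastic gradient descent, treats every zero as an observed negative (the abstract notes that profiles are first corrected for false negatives, which is omitted here), and uses a hypothetical toy matrix. The names `factorize`, `predict` and `sse` are illustrative.

```python
import random


def predict(P, Q, i, j):
    """Predicted association score: dot product of the latent vectors."""
    return sum(p * q for p, q in zip(P[i], Q[j]))


def sse(R, P, Q):
    """Sum of squared errors between R and its low-rank reconstruction."""
    return sum((R[i][j] - predict(P, Q, i, j)) ** 2
               for i in range(len(R)) for j in range(len(R[0])))


def factorize(R, rank=2, epochs=500, lr=0.05, reg=0.01, seed=0):
    """Factorize R (rows x columns) into P (rows x rank) and Q (columns x rank)."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.0, 0.1) for _ in range(rank)] for _ in R]
    Q = [[rng.uniform(0.0, 0.1) for _ in range(rank)] for _ in R[0]]
    initial_error = sse(R, P, Q)
    for _ in range(epochs):
        for i, row in enumerate(R):
            for j, value in enumerate(row):
                err = value - predict(P, Q, i, j)
                for k in range(rank):
                    p, q = P[i][k], Q[j][k]
                    # Gradient step on squared error with L2 regularization
                    P[i][k] += lr * (err * q - reg * p)
                    Q[j][k] += lr * (err * p - reg * q)
    return P, Q, initial_error


# Hypothetical toy matrix: rows = circRNAs, columns = diseases
R = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
]
P, Q, initial_error = factorize(R)
final_error = sse(R, P, Q)
```

In an association-prediction setting, the reconstructed scores `predict(P, Q, i, j)` for pairs with no known association would be ranked to propose candidates.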
Affiliation(s)
- Hang Wei
  - School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong, China
- Bin Liu
  - School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong, China
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
15
Amin N, McGrath A, Chen YPP. Evaluation of deep learning in non-coding RNA classification. Nat Mach Intell 2019. [DOI: 10.1038/s42256-019-0051-2] [Citation(s) in RCA: 67] [Impact Index Per Article: 11.2] [Indexed: 01/30/2023]
16
Yu DL, Ma YL, Yu ZG. Inferring microRNA-disease association by hybrid recommendation algorithm and unbalanced bi-random walk on heterogeneous network. Sci Rep 2019; 9:2474. [PMID: 30792474 PMCID: PMC6385311 DOI: 10.1038/s41598-019-39226-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 09/18/2018] [Accepted: 01/18/2019] [Indexed: 02/04/2023]
Abstract
A growing body of research indicates that microRNAs (miRNAs) play indispensable roles in the pathogenesis of diseases. Detecting miRNA-disease associations by experimental techniques is expensive and time-consuming. Hence, it is important to propose reliable and accurate computational methods to explore potential disease-related miRNAs. In our work, we develop a novel method (BRWHNHA) to uncover potential miRNAs associated with diseases based on a hybrid recommendation algorithm and an unbalanced bi-random walk. We first integrate the Gaussian interaction profile kernel similarity into the miRNA functional similarity network and the disease semantic similarity network. Then we calculate the transition probability matrix of the bipartite network by using the hybrid recommendation algorithm. Finally, we adopt an unbalanced bi-random walk on the heterogeneous network to infer undiscovered miRNA-disease relationships. We tested BRWHNHA on 22 diseases using five-fold cross-validation; it achieved reliable performance, with an average AUC of 0.857 and AUC values ranging from 0.807 to 0.924. As a result, BRWHNHA significantly improves the performance of inferring potential miRNA-disease associations compared with previous methods. Moreover, case studies on lung neoplasms and prostate neoplasms also illustrate that BRWHNHA is superior to previous prediction methods and is more advantageous in exploring potential disease-related miRNAs. All source code can be downloaded from https://github.com/myl446/BRWHNHA.
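The Gaussian interaction profile kernel similarity named in the abstract can be sketched as follows. This is a minimal illustration that assumes the commonly used definition, in which the similarity of two miRNAs is exp(-gamma * ||IP(i) - IP(j)||^2) and gamma is a bandwidth parameter divided by the mean squared norm of all interaction profiles. The function name `gip_kernel`, the parameter `gamma_prime` and the toy matrix are hypothetical and are not taken from the BRWHNHA code.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between rows of a binary
    association matrix (one row per miRNA, one column per disease)."""
    n = len(profiles)
    # Normalize the bandwidth by the mean squared norm of the profiles
    mean_sq_norm = sum(sum(x * x for x in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq_norm
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2
                          for a, b in zip(profiles[i], profiles[j]))
            sim[i][j] = math.exp(-gamma * sq_dist)
    return sim


# Hypothetical toy data: 3 miRNAs x 4 diseases
toy = [
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]
S = gip_kernel(toy)
```

The same function applied to the transposed matrix would give a disease-disease kernel; how the kernel is then combined with functional and semantic similarity is specific to the method and is not shown here.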
Affiliation(s)
- Dong-Ling Yu
  - Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education and Hunan Key Laboratory for Computation and Simulation in Science and Engineering, Xiangtan University, Xiangtan, Hunan 411105, P.R. China
- Yuan-Lin Ma
  - Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education and Hunan Key Laboratory for Computation and Simulation in Science and Engineering, Xiangtan University, Xiangtan, Hunan 411105, P.R. China
- Zu-Guo Yu
  - Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education and Hunan Key Laboratory for Computation and Simulation in Science and Engineering, Xiangtan University, Xiangtan, Hunan 411105, P.R. China
  - School of Electrical Engineering and Computer Science, Queensland University of Technology, Brisbane, Q4001, Australia
17
Wang X, Peng F, Cheng L, Yang G, Zhang D, Liu J, Chen X, Zhao S. Prognostic and clinicopathological role of long non-coding RNA UCA1 in various carcinomas. Oncotarget 2018; 8:28373-28384. [PMID: 28423704 PMCID: PMC5438656 DOI: 10.18632/oncotarget.16059] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 08/10/2016] [Accepted: 02/27/2017] [Indexed: 12/26/2022]
Abstract
Urothelial cancer associated 1 (UCA1), an oncogenic long non-coding RNA (lncRNA), is aberrantly upregulated in various solid tumors. Numerous studies have demonstrated that overexpression of UCA1 is an unfavorable prognostic indicator in cancer patients. This study aimed to further explore the prognostic role and clinical significance of UCA1 in cancer. Eligible studies were identified by a systematic search of the PubMed, Embase, Cochrane Library and Web of Science databases. A total of 19 studies with 1587 patients and 16 studies with 1291 patients were included to evaluate the association of UCA1 expression with overall survival (OS) and with clinicopathological factors, respectively, by computing hazard ratios (HRs), odds ratios (ORs) and confidence intervals (CIs). The meta-analysis indicated that overexpression of UCA1 was significantly correlated with unfavorable OS in patients with cancer (pooled HR = 1.85, 95% CI 1.62-2.10, p < 0.001). High UCA1 levels were also significantly associated with poor tumor grade (pooled OR = 2.74, 95% CI 2.04-3.70, p < 0.001) and positive lymphatic metastasis (pooled OR = 2.43, 95% CI 1.72-3.41, p < 0.001). In conclusion, our study suggests that UCA1 is correlated with more advanced clinicopathological features and poor prognosis, and may serve as a novel predictive biomarker in patients with various tumors.
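As an illustration of how a pooled HR and its confidence interval can be derived from per-study estimates, the sketch below uses fixed-effect inverse-variance pooling on the log scale. The abstract does not state whether a fixed-effect or random-effects model was used, so the model choice here is an assumption, and the input values are hypothetical rather than data from the meta-analysis.

```python
import math

def pool_hazard_ratios(studies, z=1.96):
    """Fixed-effect inverse-variance pooling of hazard ratios.

    studies: list of (hr, ci_low, ci_high) tuples with 95% CIs.
    Returns (pooled_hr, pooled_low, pooled_high).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for hr, low, high in studies:
        # Standard error of log(HR), recovered from the CI width.
        se = (math.log(high) - math.log(low)) / (2 * z)
        weight = 1.0 / (se * se)
        weighted_sum += weight * math.log(hr)
        weight_total += weight
    pooled_log = weighted_sum / weight_total
    pooled_se = math.sqrt(1.0 / weight_total)
    return (math.exp(pooled_log),
            math.exp(pooled_log - z * pooled_se),
            math.exp(pooled_log + z * pooled_se))

# Hypothetical inputs for illustration only; not taken from the included studies.
toy_studies = [(2.0, 1.0, 4.0), (2.0, 1.0, 4.0)]
pooled = pool_hazard_ratios(toy_studies)
```

Pooling two identical studies leaves the point estimate unchanged and narrows the interval, which is the expected behaviour of inverse-variance weighting.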
Affiliation(s)
- Xiaoxiong Wang
- Department of Neurosurgery, The First Affiliated Hospital of Harbin Medical University, Nangang District, Harbin, Heilongjiang Province, 150001, People's Republic of China; Institute of Brain Science, Harbin Medical University, Nangang District, Harbin, Heilongjiang Province, 150001, People's Republic of China
| | - Fei Peng
- Department of Neurosurgery, The First Affiliated Hospital of Harbin Medical University, Nangang District, Harbin, Heilongjiang Province, 150001, People's Republic of China; Institute of Brain Science, Harbin Medical University, Nangang District, Harbin, Heilongjiang Province, 150001, People's Republic of China
| | - Liang Cheng
- College of Bioinformatics Science and Technology, Harbin Medical University, Nangang District, Harbin, Heilongjiang Province, 150081, People's Republic of China
| | - Guang Yang
- Department of Neurosurgery, The First Affiliated Hospital of Harbin Medical University, Nangang District, Harbin, Heilongjiang Province, 150001, People's Republic of China; Institute of Brain Science, Harbin Medical University, Nangang District, Harbin, Heilongjiang Province, 150001, People's Republic of China
| | - Daming Zhang
- Department of Neurosurgery, The First Affiliated Hospital of Harbin Medical University, Nangang District, Harbin, Heilongjiang Province, 150001, People's Republic of China; Institute of Brain Science, Harbin Medical University, Nangang District, Harbin, Heilongjiang Province, 150001, People's Republic of China
| | - Jiaqi Liu
- Department of Neurosurgery, The First Affiliated Hospital of Harbin Medical University, Nangang District, Harbin, Heilongjiang Province, 150001, People's Republic of China; Institute of Brain Science, Harbin Medical University, Nangang District, Harbin, Heilongjiang Province, 150001, People's Republic of China
| | - Xin Chen
- Department of Neurosurgery, The First Affiliated Hospital of Harbin Medical University, Nangang District, Harbin, Heilongjiang Province, 150001, People's Republic of China; Institute of Brain Science, Harbin Medical University, Nangang District, Harbin, Heilongjiang Province, 150001, People's Republic of China
| | - Shiguang Zhao
- Department of Neurosurgery, The First Affiliated Hospital of Harbin Medical University, Nangang District, Harbin, Heilongjiang Province, 150001, People's Republic of China; Institute of Brain Science, Harbin Medical University, Nangang District, Harbin, Heilongjiang Province, 150001, People's Republic of China
| |
|
18
|
Liu Y, Zeng X, He Z, Zou Q. Inferring microRNA-disease associations by random walk on a heterogeneous network with multiple data sources. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2017; 14:905-915. [PMID: 27076459 DOI: 10.1109/tcbb.2016.2550432] [Citation(s) in RCA: 209] [Impact Index Per Article: 26.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/21/2023]
Abstract
Since the discovery of the regulatory function of microRNA (miRNA), increased attention has focused on identifying the relationship between miRNA and disease. It has been suggested that computational methods are an efficient way to identify potential disease-related miRNAs for further confirmation using biological experiments. In this paper, we first highlight three limitations commonly associated with previous computational methods. To resolve these limitations, we established a disease similarity subnetwork and a miRNA similarity subnetwork by integrating multiple data sources, where the disease similarity is composed of disease semantic similarity and disease functional similarity, and the miRNA similarity is calculated using miRNA-target gene and miRNA-lncRNA (long non-coding RNA) associations. Then, a heterogeneous network was constructed by connecting the disease similarity subnetwork and the miRNA similarity subnetwork using the known miRNA-disease associations. We extended random walk with restart to predict miRNA-disease associations in the heterogeneous network. Leave-one-out cross-validation achieved an average area under the curve (AUC) of 0.8049 across 341 diseases and 476 miRNAs. For five-fold cross-validation, our method achieved AUCs from 0.7970 to 0.9249 for 15 human diseases. Case studies further demonstrated the feasibility of our method for discovering potential miRNA-disease associations. An online service for prediction is freely available at http://ifmda.aliapp.com.
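The core iteration of random walk with restart can be sketched as below. This is a minimal version on a single weighted graph, assuming row-normalised transition probabilities and a restart probability of 0.3. The authors' extension to a heterogeneous miRNA-disease network, with separate subnetworks and jump probabilities between them, is not reproduced here. The function name, parameters, and toy graph are illustrative assumptions.

```python
def random_walk_with_restart(weights, seeds, restart=0.3,
                             tol=1e-10, max_iter=1000):
    """Random walk with restart on a weighted graph.

    weights: square list of lists; weights[i][j] is the edge weight i -> j.
    seeds: set of seed node indices (for example, miRNAs already linked
    to the disease of interest).
    Returns the steady-state visiting probability of every node.
    """
    n = len(weights)
    # Row-normalise so that each node's outgoing weights sum to 1.
    transition = []
    for row in weights:
        total = sum(row)
        transition.append([w / total if total else 0.0 for w in row])
    p0 = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [
            (1 - restart) * sum(p[i] * transition[i][j] for i in range(n))
            + restart * p0[j]
            for j in range(n)
        ]
        # Stop once the distribution no longer changes (L1 distance).
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p

# Hypothetical 4-node path graph 0-1-2-3 with node 0 as the only seed.
toy_path = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
toy_scores = random_walk_with_restart(toy_path, seeds={0})
```

Nodes closer to the seed receive higher steady-state scores, which is what allows candidate miRNAs to be ranked by their proximity to known associations.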
|
19
|
A Review on Recent Computational Methods for Predicting Noncoding RNAs. BIOMED RESEARCH INTERNATIONAL 2017; 2017:9139504. [PMID: 28553651 PMCID: PMC5434267 DOI: 10.1155/2017/9139504] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 11/29/2016] [Revised: 02/06/2017] [Accepted: 02/15/2017] [Indexed: 12/20/2022]
Abstract
Noncoding RNAs (ncRNAs) play important roles in various cellular activities and diseases. In this paper, we present a comprehensive review of computational methods for ncRNA prediction, which are generally grouped into four categories: (1) homology-based methods, that is, comparative methods involving evolutionarily conserved RNA sequences and structures, (2) de novo methods using RNA sequence and structure features, (3) transcriptional sequencing and assembly-based methods, that is, methods designed for single-end and paired-end reads generated by next-generation RNA sequencing, and (4) RNA family-specific methods, for example, methods specific to microRNAs and long noncoding RNAs. We then summarize the advantages and limitations of these methods and point out a few possible future directions for ncRNA prediction. In conclusion, many computational methods have been demonstrated to be effective in predicting ncRNAs for further experimental validation. They are critical in reducing the huge number of potential ncRNAs and pointing the community to high-confidence candidates. In the future, highly efficient mapping technology, more intrinsic sequence features (e.g., motif and k-mer frequencies), and structure features (e.g., minimum free energy, conserved stem-loops, or graph structures) should be combined with next- and third-generation sequencing platforms to improve ncRNA prediction.
|
20
|
Barman RK, Mukhopadhyay A, Das S. An improved method for identification of small non-coding RNAs in bacteria using support vector machine. Sci Rep 2017; 7:46070. [PMID: 28383059 PMCID: PMC5382675 DOI: 10.1038/srep46070] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/22/2016] [Accepted: 03/08/2017] [Indexed: 12/25/2022] Open
Abstract
Bacterial small non-coding RNAs (sRNAs) are not translated into proteins but act as functional RNAs. They are involved in diverse biological processes such as virulence, stress response and quorum sensing. Several high-throughput techniques have enabled identification of sRNAs in bacteria, but experimental detection remains challenging and grossly incomplete for most species. Thus, there is a need to develop computational tools to predict bacterial sRNAs. Here, we propose a computational method to identify sRNAs in bacteria using a support vector machine (SVM) classifier. The primary sequence and secondary structure features of experimentally validated sRNAs of Salmonella Typhimurium LT2 (SLT2) were used to build the optimal SVM model. We found that a tri-nucleotide composition feature of sRNAs achieved an accuracy of 88.35% for SLT2. We also validated the SVM model on the experimentally detected sRNAs of E. coli and Salmonella Typhi. The proposed model attained accuracies of 81.25% and 88.82% for E. coli K-12 and S. Typhi Ty2, respectively. We confirmed that this method significantly improved the identification of sRNAs in bacteria. Furthermore, we used a sliding window-based method and identified sRNAs from the complete genomes of SLT2, S. Typhi Ty2 and E. coli K-12 with sensitivities of 89.09%, 83.33% and 67.39%, respectively.
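The tri-nucleotide composition feature and the sliding-window scan described above can be sketched as follows. The window length, step size, and toy sequence are hypothetical, since the abstract does not report them, and the SVM itself is omitted. In practice each feature vector would be passed to a trained classifier.

```python
from itertools import product

# All 64 possible tri-nucleotides in a fixed order.
TRINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=3)]

def trinucleotide_composition(seq):
    """Return the 64 tri-nucleotide frequencies of a DNA sequence."""
    seq = seq.upper()
    counts = dict.fromkeys(TRINUCLEOTIDES, 0)
    total = 0
    for i in range(len(seq) - 2):
        kmer = seq[i:i + 3]
        if kmer in counts:  # skip 3-mers containing N or other symbols
            counts[kmer] += 1
            total += 1
    return [counts[k] / total if total else 0.0 for k in TRINUCLEOTIDES]

def sliding_windows(genome, window, step):
    """Yield (start, subsequence) for each full-length window."""
    for start in range(0, len(genome) - window + 1, step):
        yield start, genome[start:start + window]

# Hypothetical toy sequence, not a real genome.
toy_genome = "ACGTACGTACGTAAAA"
toy_features = [(start, trinucleotide_composition(sub))
                for start, sub in sliding_windows(toy_genome, window=8, step=4)]
```

Using frequencies rather than raw counts keeps the feature vectors comparable across windows of different lengths.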
Affiliation(s)
- Ranjan Kumar Barman
- Biomedical Informatics Centre, National Institute Of Cholera and Enteric Diseases, Kolkata, West Bengal, India
| | - Anirban Mukhopadhyay
- Department of Computer Science and Engineering, University of Kalyani, Kalyani, West Bengal, India
| | - Santasabuj Das
- Biomedical Informatics Centre, National Institute Of Cholera and Enteric Diseases, Kolkata, West Bengal, India; Division of Clinical Medicine, National Institute of Cholera and Enteric Diseases, Kolkata, West Bengal, India
| |
|
21
|
Freitas Castro F, Ruy PC, Nogueira Zeviani K, Freitas Santos R, Simões Toledo J, Kaysel Cruz A. Evidence of putative non-coding RNAs from Leishmania untranslated regions. Mol Biochem Parasitol 2017; 214:69-74. [PMID: 28385563 DOI: 10.1016/j.molbiopara.2017.04.002] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/07/2016] [Revised: 03/29/2017] [Accepted: 04/01/2017] [Indexed: 11/28/2022]
Abstract
Non-coding RNAs (ncRNAs) are regulatory elements present in a wide range of organisms, including trypanosomatids. ncRNAs transcribed from the untranslated regions (UTRs) of coding genes have been described in the transcriptomes of several eukaryotes, including Trypanosoma brucei. To uncover novel putative ncRNAs in two Leishmania species, we examined an L. major cDNA library and an L. donovani non-polysomal RNA library. Using a combination of computational analysis and experimental approaches, we classified 26 putative ncRNAs in L. major, of which 5 arise from intergenic regions and 21 from UTRs. In L. donovani, we classified 37 putative ncRNAs, of which 7 arise from intergenic regions and 30 from UTRs. Our results suggest, for the first time, that UTR-derived transcripts may be a common feature in Leishmania, as previously shown in T. brucei and other eukaryotes.
Affiliation(s)
- Felipe Freitas Castro
- Department of Cell and Molecular Biology, Ribeirão Preto Medical School, University of São Paulo, Brazil
| | - Patricia C Ruy
- Department of Cell and Molecular Biology, Ribeirão Preto Medical School, University of São Paulo, Brazil
| | - Karina Nogueira Zeviani
- Department of Cell and Molecular Biology, Ribeirão Preto Medical School, University of São Paulo, Brazil
| | - Ramon Freitas Santos
- Department of Cell and Molecular Biology, Ribeirão Preto Medical School, University of São Paulo, Brazil
| | - Juliano Simões Toledo
- Department of Cell and Molecular Biology, Ribeirão Preto Medical School, University of São Paulo, Brazil
| | - Angela Kaysel Cruz
- Department of Cell and Molecular Biology, Ribeirão Preto Medical School, University of São Paulo, Brazil.
| |
|
22
|
Long Noncoding RNA Identification: Comparing Machine Learning Based Tools for Long Noncoding Transcripts Discrimination. BIOMED RESEARCH INTERNATIONAL 2016; 2016:8496165. [PMID: 28042575 PMCID: PMC5153550 DOI: 10.1155/2016/8496165] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 08/12/2016] [Revised: 10/05/2016] [Accepted: 10/13/2016] [Indexed: 12/27/2022]
Abstract
Long noncoding RNA (lncRNA) is a class of noncoding RNA longer than 200 nucleotides that has attracted growing interest in recent years. Many studies have confirmed that the human genome contains many thousands of lncRNAs, which exert great influence over critical regulators of cellular processes. With the advent of high-throughput sequencing technologies, a large quantity of sequences awaits analysis. Many programs have therefore been developed to distinguish coding from long noncoding transcripts. Different programs are generally designed for different circumstances, so it is sensible and practical to select a method appropriate to the situation at hand. In this review, several popular methods and their advantages, disadvantages, and application scopes are summarised to help readers choose a suitable method and obtain more reliable results.
|
23
|
Liu H, Lyu J, Liu H, Gao Y, Guo J, He H, Han Z, Zhang Y, Wu Q. Computational identification of putative lincRNAs in mouse embryonic stem cell. Sci Rep 2016; 6:34892. [PMID: 27713513 PMCID: PMC5054606 DOI: 10.1038/srep34892] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2016] [Accepted: 09/21/2016] [Indexed: 01/19/2023] Open
Abstract
As regulatory factors, lncRNAs play critical roles in embryonic stem cells. Long intergenic non-coding RNAs (lincRNAs) are the most widely studied lncRNAs; however, a large number of lncRNAs may remain undiscovered. In this study, we constructed a de novo transcriptome assembly and detected 6,701 putative lincRNAs expressed in mouse embryonic stem cells (ESCs), which might be incomplete owing to a lack of coverage at their 5' ends, as assessed by CAGE peaks. Comparing the transcription start site (TSS) proximal regions of known lincRNAs and their closest protein-coding transcripts, we found that lincRNA TSS proximal regions are associated with characteristic genomic and epigenetic features. Subsequently, 1,293 lincRNAs were corrected at their 5' ends using putative lincRNA TSS regions predicted by a TSS proximal region prediction model based on genomic and epigenetic features. Finally, 43 putative lincRNAs were annotated with Gene Ontology terms. In conclusion, this work provides a novel catalog of mouse ESC-expressed lincRNAs with relatively complete transcript lengths, which might be useful for investigating the transcriptional and post-transcriptional regulation of lincRNAs in mouse ESCs and even in mammalian development.
Affiliation(s)
- Hui Liu
- School of Life Science and Technology, State Key Laboratory of Urban Water Resource and Environment, Harbin Institute of Technology, Harbin 150001, China
| | - Jie Lyu
- Dan L. Duncan Cancer Center, Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, Texas, 77030, USA
| | - Hongbo Liu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
| | - Yang Gao
- School of Life Science and Technology, State Key Laboratory of Urban Water Resource and Environment, Harbin Institute of Technology, Harbin 150001, China
| | - Jing Guo
- School of Life Science and Technology, State Key Laboratory of Urban Water Resource and Environment, Harbin Institute of Technology, Harbin 150001, China
| | - Hongjuan He
- School of Life Science and Technology, State Key Laboratory of Urban Water Resource and Environment, Harbin Institute of Technology, Harbin 150001, China
| | - Zhengbin Han
- School of Life Science and Technology, State Key Laboratory of Urban Water Resource and Environment, Harbin Institute of Technology, Harbin 150001, China
| | - Yan Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
| | - Qiong Wu
- School of Life Science and Technology, State Key Laboratory of Urban Water Resource and Environment, Harbin Institute of Technology, Harbin 150001, China
| |
|
24
|
Zhang Y, Huang H, Dong X, Fang Y, Wang K, Zhu L, Wang K, Huang T, Yang J. A Dynamic 3D Graphical Representation for RNA Structure Analysis and Its Application in Non-Coding RNA Classification. PLoS One 2016; 11:e0152238. [PMID: 27213271 PMCID: PMC4877074 DOI: 10.1371/journal.pone.0152238] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/22/2015] [Accepted: 03/10/2016] [Indexed: 12/21/2022] Open
Abstract
With the development of new technologies in transcriptomics and epigenetics, RNAs have been found to play increasingly important roles in life processes. Consequently, various methods have been proposed to assess the biological functions of RNAs and thus classify them functionally, among which the comparative study of RNA structures is perhaps the most important. To measure the structural similarity of RNAs and classify them, we propose a novel three-dimensional (3D) graphical representation of RNA secondary structure, in which an RNA secondary structure is first transformed into a characteristic sequence based on the chemical properties of nucleic acids; a dynamic 3D graph is then constructed for the characteristic sequence; and lastly a numerical characterization of the 3D graph is used to represent the RNA secondary structure. We tested our algorithm on three datasets: (1) Dataset I, consisting of nine RNA secondary structures of viruses, (2) Dataset II, consisting of complex RNA secondary structures including pseudo-knots, and (3) Dataset III, consisting of 18 non-coding RNA families. We also compared our method with nine other existing methods using Datasets II and III. The results demonstrate that our method outperforms the other methods in similarity measurement and classification of RNA secondary structures.
Affiliation(s)
- Yi Zhang
- Department of Mathematics, Hebei University of Science and Technology, Shijiazhuang, Hebei 050018, People's Republic of China
- Hebei Laboratory of Pharmaceutic Molecular Chemistry, Shijiazhuang, Hebei 050018, People's Republic of China
- * E-mail: (JY); (YZ); (TH)
| | - Haiyun Huang
- Department of Information Retrieval of Library, Hebei University of Science and Technology, Shijiazhuang, Hebei 050018, People's Republic of China
| | - Xiaoqing Dong
- Department of Mathematics, Hebei University of Science and Technology, Shijiazhuang, Hebei 050018, People's Republic of China
| | - Yiliang Fang
- International Travel Healthcare Center, Fuzhou, Fujian 350001, People's Republic of China
| | - Kejing Wang
- Department of Mathematics, Hebei University of Science and Technology, Shijiazhuang, Hebei 050018, People's Republic of China
| | - Lijuan Zhu
- Department of Mathematics, Hebei University of Science and Technology, Shijiazhuang, Hebei 050018, People's Republic of China
| | - Ke Wang
- Department of Mathematics, Hebei University of Science and Technology, Shijiazhuang, Hebei 050018, People's Republic of China
| | - Tao Huang
- Institute of Health Sciences, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, People's Republic of China
- * E-mail: (JY); (YZ); (TH)
| | - Jialiang Yang
- Department of Mathematics, Hebei University of Science and Technology, Shijiazhuang, Hebei 050018, People's Republic of China
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, United States of America
- * E-mail: (JY); (YZ); (TH)
| |
|
25
|
Liu B, Fang L. WITHDRAWN: Identification of microRNA precursor based on gapped n-tuple structure status composition kernel. Comput Biol Chem 2016:S1476-9271(16)30036-6. [PMID: 26935400 DOI: 10.1016/j.compbiolchem.2016.02.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/17/2016] [Accepted: 02/01/2016] [Indexed: 10/22/2022]
Abstract
This article has been withdrawn at the request of the author(s) and/or editor. The Publisher apologizes for any inconvenience this may cause. The full Elsevier Policy on Article Withdrawal can be found at http://www.elsevier.com/locate/withdrawalpolicy.
Affiliation(s)
- Bin Liu
- School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong 518055, China; Key Laboratory of Network Oriented Intelligent Computation, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong 518055, China.
| | - Longyun Fang
- School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong 518055, China.
| |
|
26
|
Moore AC, Winkjer JS, Tseng TT. Bioinformatics Resources for MicroRNA Discovery. Biomark Insights 2016; 10:53-8. [PMID: 26819547 PMCID: PMC4718083 DOI: 10.4137/bmi.s29513] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/21/2015] [Revised: 11/22/2015] [Accepted: 11/24/2015] [Indexed: 11/12/2022] Open
Abstract
Biomarker identification is often associated with the diagnosis and evaluation of various diseases. Recently, microRNAs (miRNAs) have been implicated in the development of diseases, particularly cancer. With the advent of next-generation sequencing, the amount of data on miRNAs has increased tremendously in the last decade, requiring new bioinformatics approaches for processing and storing new information. New strategies have been developed for mining these sequencing datasets to allow a better understanding of the actions of miRNAs. As a result, many databases have also been established to disseminate these findings. This review focuses on several curated databases of miRNAs and their targets from both predicted and validated sources.
Affiliation(s)
- Alyssa C Moore
- Department of Molecular and Cellular Biology, Kennesaw State University, Kennesaw, GA, USA
| | - Jonathan S Winkjer
- Department of Molecular and Cellular Biology, Kennesaw State University, Kennesaw, GA, USA
| | - Tsai-Tien Tseng
- Department of Molecular and Cellular Biology, Kennesaw State University, Kennesaw, GA, USA
| |
|
27
|
Zou Q, Zeng J, Cao L, Ji R. A novel features ranking metric with application to scalable visual and bioinformatics data classification. Neurocomputing 2016. [DOI: 10.1016/j.neucom.2014.12.123] [Citation(s) in RCA: 124] [Impact Index Per Article: 13.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]
|
28
|
Huang Y, Cheng JH, Luo FN, Pan H, Sun XJ, Diao LY, Qin XJ. Genome-wide identification and characterization of microRNA genes and their targets in large yellow croaker (Larimichthys crocea). Gene 2015; 576:261-7. [PMID: 26523500 DOI: 10.1016/j.gene.2015.10.044] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/16/2015] [Revised: 10/04/2015] [Accepted: 10/13/2015] [Indexed: 12/12/2022]
Abstract
MicroRNAs (miRNAs or miRs) are a class of non-coding RNAs 20-25 nucleotides (nt) in length that regulate gene expression in eukaryotic organisms. Studies have confirmed that miRNAs play an important role in various biological and metabolic processes in both animals and plants. Predicting new miRNAs by computational homology search is an effective way to discover novel miRNAs. Although a large number of miRNAs have been reported in many fish species, reports of miRNAs in large yellow croaker (L. crocea) are limited, especially those obtained by computational approaches. In this paper, we adopted a comparative genomic approach based on the conservation of miRNA sequences and of the stem-loop hairpin secondary structures of miRNAs. A total of 199 potential miRNAs representing 81 families were predicted. Twelve of them were chosen for validation by real-time RT-PCR; all were detected except miR-7132b-5p. These results indicate that the prediction method used to identify the miRNAs was effective. Furthermore, 948 potential target genes were predicted. Gene ontology (GO) analysis revealed that 175, 287, and 486 target genes were involved in cellular components, biological processes and molecular functions, respectively. Overall, our findings provide the first computational identification and characterization of L. crocea miRNAs and their potential targets, and lay the foundation for further characterization of their roles in regulating diverse physiological processes.
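The sequence-conservation step of such a homology search can be sketched as a scan for genomic windows that differ from a known mature miRNA by at most a few mismatches. The mismatch threshold, function name, and sequences below are hypothetical and are not taken from the study. The subsequent check that each hit lies within a stable stem-loop hairpin would require an RNA folding tool and is not shown.

```python
def find_homologs(genome, known_mirna, max_mismatches=2):
    """Return (position, mismatches) for each genomic window that differs
    from the known mature miRNA by at most max_mismatches substitutions."""
    # Compare on the DNA alphabet so RNA queries match genomic sequence.
    genome = genome.upper().replace("U", "T")
    query = known_mirna.upper().replace("U", "T")
    hits = []
    for start in range(len(genome) - len(query) + 1):
        window = genome[start:start + len(query)]
        mismatches = sum(1 for a, b in zip(window, query) if a != b)
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits

# Hypothetical sequences for illustration; not real miRNA or croaker data.
toy_query = "ACGUUGCAUGCCAUGGAUCCA"
toy_genome = "TTTT" + "ACGTTGCATGCCATGGATCGA" + "GGGG"  # one substitution
toy_hits = find_homologs(toy_genome, toy_query)
```

Only ungapped substitutions are counted here; a production pipeline would more likely rely on an alignment tool that also handles insertions and deletions.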
Affiliation(s)
- Yong Huang
- College of Animal Science and Technology, Henan University of Science and Technology, Luoyang, China.
| | - Jia-Heng Cheng
- College of Animal Science and Technology, Henan University of Science and Technology, Luoyang, China
| | - Fu-Nong Luo
- College of Animal Science and Technology, Henan University of Science and Technology, Luoyang, China
| | - Hao Pan
- College of Animal Science and Technology, Henan University of Science and Technology, Luoyang, China
| | - Xiao-Juan Sun
- College of Animal Science and Technology, Henan University of Science and Technology, Luoyang, China
| | - Lan-Yu Diao
- College of Animal Science and Technology, Henan University of Science and Technology, Luoyang, China
| | - Xiao-Juan Qin
- College of Animal Science and Technology, Henan University of Science and Technology, Luoyang, China
| |
|
29
|
Computational Identification of MicroRNAs and Their Targets from Finger Millet (Eleusine coracana). Interdiscip Sci 2015; 9:72-79. [PMID: 26496774 DOI: 10.1007/s12539-015-0130-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/23/2015] [Revised: 09/29/2015] [Accepted: 10/12/2015] [Indexed: 10/22/2022]
Abstract
MicroRNAs (miRNAs) are endogenous small RNAs that regulate normal plant growth and development. Discovering miRNAs and their targets and inferring their functions has become a routine process for understanding the biological roles of miRNAs in plant development. In this study, we used homology-based analysis of the available expressed sequence tags of finger millet (Eleusine coracana) to predict conserved miRNAs. Three potential miRNAs targeting 88 genes were identified. The newly identified miRNAs were found to be homologous with miR166 and miR1310. The recognized targets were transcription factors and enzymes, and GO analysis showed that these miRNAs play varied roles in gene regulation. The identification of miRNAs and their targets is anticipated to hasten the study of key epigenetic regulators in plant development.
|
30
|
Survey of Natural Language Processing Techniques in Bioinformatics. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2015; 2015:674296. [PMID: 26525745 PMCID: PMC4615216 DOI: 10.1155/2015/674296] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/10/2015] [Revised: 06/12/2015] [Accepted: 06/21/2015] [Indexed: 01/02/2023]
Abstract
Informatics methods, such as text mining and natural language processing, are widely used in bioinformatics research. In this study, we discuss text mining and natural language processing methods in bioinformatics from two perspectives. First, we aim to search for knowledge on biology, retrieve references using text mining methods, and reconstruct databases. For example, protein-protein interactions and gene-disease relationships can be mined from PubMed. Then, we analyze the applications of text mining and natural language processing techniques in bioinformatics, including predicting protein structure and function and detecting noncoding RNA. Finally, numerous methods and applications, as well as their contributions to bioinformatics, are discussed for future use by text mining and natural language processing researchers.
|
31
|
Zou Q, Guo J, Ju Y, Wu M, Zeng X, Hong Z. Improving tRNAscan-SE Annotation Results via Ensemble Classifiers. Mol Inform 2015; 34:761-70. [DOI: 10.1002/minf.201500031] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/13/2015] [Accepted: 07/01/2015] [Indexed: 01/18/2023]
|
32
|
Liu B, Fang L, Chen J, Liu F, Wang X. miRNA-dis: microRNA precursor identification based on distance structure status pairs. MOLECULAR BIOSYSTEMS 2015; 11:1194-204. [DOI: 10.1039/c5mb00050e] [Citation(s) in RCA: 62] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/27/2023]
Abstract
MicroRNA precursor identification is an important task in bioinformatics.
Affiliation(s)
- Bin Liu
- School of Computer Science and Technology
- Harbin Institute of Technology Shenzhen Graduate School
- HIT Campus Shenzhen University Town
- Shenzhen
- China
| | - Longyun Fang
- School of Computer Science and Technology
- Harbin Institute of Technology Shenzhen Graduate School
- HIT Campus Shenzhen University Town
- Shenzhen
- China
| | - Junjie Chen
- School of Computer Science and Technology
- Harbin Institute of Technology Shenzhen Graduate School
- HIT Campus Shenzhen University Town
- Shenzhen
- China
| | - Fule Liu
- School of Computer Science and Technology
- Harbin Institute of Technology Shenzhen Graduate School
- HIT Campus Shenzhen University Town
- Shenzhen
- China
| | - Xiaolong Wang
- School of Computer Science and Technology
- Harbin Institute of Technology Shenzhen Graduate School
- HIT Campus Shenzhen University Town
- Shenzhen
- China
| |
|