1. Chen S, Liu M, Yi W, Li H, Yu Q. Micropeptides derived from long non-coding RNAs: Computational analysis and functional roles in breast cancer and other diseases. Gene 2025; 935:149019. [PMID: 39461573 DOI: 10.1016/j.gene.2024.149019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2024] [Revised: 10/08/2024] [Accepted: 10/16/2024] [Indexed: 10/29/2024]
Abstract
Long non-coding RNAs (lncRNAs), once thought to be mere transcriptional noise, are now revealing a hidden code. Recent advancements like ribosome sequencing have unveiled that many lncRNAs harbor small open reading frames and can potentially encode functional micropeptides. Emerging research suggests these micropeptides, not the lncRNAs themselves, play crucial roles in regulating homeostasis, inflammation, metabolism, and especially in breast cancer progression. This review delves into the rapidly evolving computational tools used to predict and validate lncRNA-encoded micropeptides. We then explore the diverse functions and mechanisms of action of these micropeptides in breast cancer pathogenesis, with a focus on their roles in various species. Ultimately, this review aims to illuminate the functional landscape of lncRNA-encoded micropeptides and their potential as therapeutic targets in cancer.
Affiliation(s)
- Saisai Chen
- Department of Breast Surgery, The First Affiliated Hospital of Anhui University of Traditional Chinese Medicine, Hefei 230031, China
- Mengru Liu
- Department of Infection, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei 230000, China
- Weizhen Yi
- Department of Breast Surgery, The First Affiliated Hospital of Anhui University of Traditional Chinese Medicine, Hefei 230031, China
- Huagang Li
- Department of Breast Surgery, The First Affiliated Hospital of Anhui University of Traditional Chinese Medicine, Hefei 230031, China
- Qingsheng Yu
- Institute of Chinese Medicine Surgery, Anhui Academy of Chinese Medicine, Hefei 230031, China.

2. Huang J, Wang X, Xia R, Yang D, Liu J, Lv Q, Yu X, Meng J, Chen K, Song B, Wang Y. Domain-knowledge enabled ensemble learning of 5-formylcytosine (f5C) modification sites. Comput Struct Biotechnol J 2024; 23:3175-3185. [PMID: 39253057 PMCID: PMC11381828 DOI: 10.1016/j.csbj.2024.08.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2024] [Revised: 08/07/2024] [Accepted: 08/07/2024] [Indexed: 09/11/2024] Open
Abstract
5-formylcytidine (f5C) is a unique post-transcriptional RNA modification found in mRNA and tRNA at the wobble site, playing a crucial role in mitochondrial protein synthesis and potentially contributing to the regulation of translation. Recent studies have unveiled that the f5C modifications may drive mitochondrial mRNA translation to power cancer metastasis. Accurate identification of f5C sites is essential for further unraveling their molecular functions and regulatory mechanisms, but there are currently no computational methods available for predicting their locations. In this study, we introduce an innovative ensemble approach, successfully enabling the computational recognition of Saccharomyces cerevisiae f5C. We conducted a comprehensive model selection process that involved multiple basic machine learning and deep learning algorithms such as recurrent neural networks, convolutional neural networks and Transformer-based models. Initially trained only on sequence information, these individual models achieved an AUROC ranging from 0.7104 to 0.7492. Through the integration of 32 novel domain-derived genomic features, the performance of individual models has significantly improved to an AUROC between 0.7309 and 0.8076. To further enhance accuracy and robustness, we then constructed the ensembles of these individual models with different combinations. The best performance attained by our ensemble models reached an AUROC of 0.8391. Shapley additive explanations were conducted to explain the significant contributions of genomic features, providing insights into the putative distribution of f5C across various topological regions and potentially paving the way for revealing their functional relevance within distinct genomic contexts. A freely accessible web server that allows real-time analysis of user-uploaded sites can be accessed at: www.rnamd.org/Resf5C-Pred.
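The ensemble step described in this abstract (combining individual models in different combinations and comparing them by AUROC) can be illustrated with a minimal sketch. It is not the authors' implementation: the averaging rule, the function names, and the exhaustive search over combinations are assumptions made for illustration, and the abstract does not say how the ensembles were actually formed.

```python
from itertools import combinations


def auroc(labels, scores):
    """AUROC as the chance that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_ensemble(score_lists):
    """Soft voting (an assumed rule): average the per-site scores of several models."""
    return [sum(column) / len(column) for column in zip(*score_lists)]


def best_combination(labels, model_scores):
    """Try every non-empty subset of models and return (best AUROC, model names).

    model_scores maps a hypothetical model name to its list of predicted scores.
    """
    best = None
    for size in range(1, len(model_scores) + 1):
        for names in combinations(sorted(model_scores), size):
            scores = average_ensemble([model_scores[name] for name in names])
            candidate = (auroc(labels, scores), names)
            if best is None or candidate[0] > best[0]:
                best = candidate
    return best
```

In practice the combination would be chosen on validation data and reported on a separate test set; selecting and scoring on the same sites, as a toy call to `best_combination` does, overstates performance.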
Affiliation(s)
- Jiaming Huang
- Jiangsu Key Laboratory for Functional Substance of Chinese Medicine, School of Pharmacy, Nanjing University of Chinese Medicine, Nanjing 210023, China
- Department of Biological Sciences, School of Science, Xi’an Jiaotong-Liverpool University, Suzhou 215123, China
- Xuan Wang
- Department of Biological Sciences, School of Science, Xi’an Jiaotong-Liverpool University, Suzhou 215123, China
- Rong Xia
- Department of Biological Sciences, School of Science, Xi’an Jiaotong-Liverpool University, Suzhou 215123, China
- School of AI and Advanced Computing, Xi’an Jiaotong-Liverpool University, Suzhou 215123, China
- Dongqing Yang
- Department of Public Health, School of Medicine, Nanjing University of Chinese Medicine, Nanjing 210023, China
- Jian Liu
- Jiangsu Key Laboratory for Functional Substance of Chinese Medicine, School of Pharmacy, Nanjing University of Chinese Medicine, Nanjing 210023, China
- Qi Lv
- Jiangsu Key Laboratory for Functional Substance of Chinese Medicine, School of Pharmacy, Nanjing University of Chinese Medicine, Nanjing 210023, China
- Xiaoxuan Yu
- Department of Pharmacology, School of Medicine, Nanjing University of Chinese Medicine, Nanjing 210023, China
- Jia Meng
- Department of Biological Sciences, School of Science, Xi’an Jiaotong-Liverpool University, Suzhou 215123, China
- AI University Research Centre, Xi’an Jiaotong-Liverpool University, Suzhou 215123, China
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L7 8TX, United Kingdom
- Kunqi Chen
- Key Laboratory of Ministry of Education for Gastrointestinal Cancer, School of Basic Medical Sciences, Fujian Medical University, Fuzhou 350004, China
- Bowen Song
- Department of Public Health, School of Medicine, Nanjing University of Chinese Medicine, Nanjing 210023, China
- Yue Wang
- Jiangsu Key Laboratory for Functional Substance of Chinese Medicine, School of Pharmacy, Nanjing University of Chinese Medicine, Nanjing 210023, China

3. Xia Y, Zhang Y, Liu D, Zhu YH, Wang Z, Song J, Yu DJ. BLAM6A-Merge: Leveraging Attention Mechanisms and Feature Fusion Strategies to Improve the Identification of RNA N6-Methyladenosine Sites. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024; 21:1803-1815. [PMID: 38913512 DOI: 10.1109/tcbb.2024.3418490] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/26/2024]
Abstract
RNA N6-methyladenosine is a prevalent and abundant type of RNA modification that exerts significant influence on diverse biological processes. To date, numerous computational approaches have been developed for predicting methylation, with most of them ignoring the correlations of different encoding strategies and failing to explore the adaptability of various attention mechanisms for methylation identification. To solve the above issues, we proposed an innovative framework for predicting RNA m6A modification site, termed BLAM6A-Merge. Specifically, it utilized a multimodal feature fusion strategy to combine the classification results of four features and Blastn tool. Apart from this, different attention mechanisms were employed for extracting higher-level features on specific features after the screening process. Extensive experiments on 12 benchmarking datasets demonstrated that BLAM6A-Merge achieved superior performance (average AUC: 0.849 for the full transcript mode and 0.784 for the mature mRNA mode). Notably, the Blastn tool was employed for the first time in the identification of methylation sites.

4. Xiao Y, Ren Y, Hu W, Paliouras AR, Zhang W, Zhong L, Yang K, Su L, Wang P, Li Y, Ma M, Shi L. Long non-coding RNA-encoded micropeptides: functions, mechanisms and implications. Cell Death Discov 2024; 10:450. [PMID: 39443468 PMCID: PMC11499885 DOI: 10.1038/s41420-024-02175-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2024] [Revised: 08/29/2024] [Accepted: 09/05/2024] [Indexed: 10/25/2024] Open
Abstract
Long non-coding RNAs (lncRNAs) are typically described as RNA transcripts exceeding 200 nucleotides in length, which do not code for proteins. Recent advancements in technology, including ribosome RNA sequencing and ribosome nascent-chain complex sequencing, have demonstrated that many lncRNAs retain small open reading frames and can potentially encode micropeptides. Emerging studies have revealed that these micropeptides, rather than lncRNAs themselves, are responsible for vital functions, including but not limited to regulating homeostasis, managing inflammation and the immune system, moderating metabolism, and influencing tumor progression. In this review, we initially outline the rapidly advancing computational analytical methods and public tools to predict and validate the potential encoding of lncRNAs. We then focus on the diverse functions of micropeptides and their underlying mechanisms in the pathogenesis of disease. This review aims to elucidate the functions of lncRNA-encoded micropeptides and explore their potential applications as therapeutic targets in cancer.
Affiliation(s)
- Yinan Xiao
- RNA Oncology Group, School of Public Health, Lanzhou University, Lanzhou, 730000, PR China
- Yaru Ren
- RNA Oncology Group, School of Public Health, Lanzhou University, Lanzhou, 730000, PR China
- Wenteng Hu
- Thoracic surgery department, The First Hospital, Lanzhou University, Lanzhou, 730000, PR China
- Wenyang Zhang
- RNA Oncology Group, School of Public Health, Lanzhou University, Lanzhou, 730000, PR China
- Linghui Zhong
- RNA Oncology Group, School of Public Health, Lanzhou University, Lanzhou, 730000, PR China
- Kaixin Yang
- RNA Oncology Group, School of Public Health, Lanzhou University, Lanzhou, 730000, PR China
- Li Su
- RNA Oncology Group, School of Public Health, Lanzhou University, Lanzhou, 730000, PR China
- Peng Wang
- College of Animal Science and Technology, Hebei North University, Zhangjiakou, 075131, PR China
- Yonghong Li
- NHC Key Laboratory of Diagnosis and Therapy of Gastrointestinal Tumor, Gansu Provincial Hospital, Lanzhou, 730000, PR China
- Minjie Ma
- Thoracic surgery department, The First Hospital, Lanzhou University, Lanzhou, 730000, PR China
- Lei Shi
- RNA Oncology Group, School of Public Health, Lanzhou University, Lanzhou, 730000, PR China.

5. Luo Z, Yu L, Xu Z, Liu K, Gu L. Comprehensive Review and Assessment of Computational Methods for Prediction of N6-Methyladenosine Sites. Biology 2024; 13:777. [PMID: 39452086 PMCID: PMC11504118 DOI: 10.3390/biology13100777] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2024] [Revised: 09/19/2024] [Accepted: 09/23/2024] [Indexed: 10/26/2024]
Abstract
N6-methyladenosine (m6A) plays a crucial regulatory role in the control of cellular functions and gene expression. Recent advances in sequencing techniques for transcriptome-wide m6A mapping have accelerated the accumulation of m6A site information at a single-nucleotide level, providing more high-confidence training data to develop computational approaches for m6A site prediction. However, it is still a major challenge to precisely predict m6A sites using in silico approaches. To advance the computational support for m6A site identification, here, we curated 13 up-to-date benchmark datasets from nine different species (i.e., H. sapiens, M. musculus, Rat, S. cerevisiae, Zebrafish, A. thaliana, Pig, Rhesus, and Chimpanzee). This will assist the research community in conducting an unbiased evaluation of alternative approaches and support future research on m6A modification. We revisited 52 computational approaches published since 2015 for m6A site identification, including 30 traditional machine learning-based, 14 deep learning-based, and 8 ensemble learning-based methods. We comprehensively reviewed these computational approaches in terms of their training datasets, calculated features, computational methodologies, performance evaluation strategy, and webserver/software usability. Using these benchmark datasets, we benchmarked nine predictors with available online websites or stand-alone software and assessed their prediction performance. We found that deep learning and traditional machine learning approaches generally outperformed scoring function-based approaches. In summary, the curated benchmark dataset repository and the systematic assessment in this study serve to inform the design and implementation of state-of-the-art computational approaches for m6A identification and facilitate more rigorous comparisons of new methods in the future.
Affiliation(s)
- Zhengtao Luo
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei 230036, China
- Anhui Provincial Key Laboratory of Smart Agriculture Technology and Equipment, Anhui Agricultural University, Hefei 230036, China
- Liyi Yu
- Computer Department, Jingdezhen Ceramic University, Jingdezhen 333403, China
- Zhaochun Xu
- Computer Department, Jingdezhen Ceramic University, Jingdezhen 333403, China
- School for Interdisciplinary Medicine and Engineering, Harbin Medical University, Harbin 150076, China
- Kening Liu
- Computer Department, Jingdezhen Ceramic University, Jingdezhen 333403, China
- Lichuan Gu
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei 230036, China
- Anhui Provincial Key Laboratory of Smart Agriculture Technology and Equipment, Anhui Agricultural University, Hefei 230036, China

6. Bortoletto E, Rosani U. Bioinformatics for Inosine: Tools and Approaches to Trace This Elusive RNA Modification. Genes (Basel) 2024; 15:996. [PMID: 39202357 PMCID: PMC11353476 DOI: 10.3390/genes15080996] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2024] [Revised: 07/23/2024] [Accepted: 07/25/2024] [Indexed: 09/03/2024] Open
Abstract
Inosine is a nucleotide resulting from the deamination of adenosine in RNA. This chemical modification process, known as RNA editing, is typically mediated by a family of double-stranded RNA binding proteins named Adenosine Deaminase Acting on dsRNA (ADAR). While the presence of ADAR orthologs has been traced throughout the evolution of metazoans, the existence and extension of RNA editing have been characterized in a more limited number of animals so far. Undoubtedly, ADAR-mediated RNA editing plays a vital role in physiology, organismal development and disease, making the understanding of the evolutionary conservation of this phenomenon pivotal to a deep characterization of relevant biological processes. However, the lack of direct high-throughput methods to reveal RNA modifications at single nucleotide resolution limited an extended investigation of RNA editing. Nowadays, these methods have been developed, and appropriate bioinformatic pipelines are required to fully exploit this data, which can complement existing approaches to detect ADAR editing. Here, we review the current literature on the "bioinformatics for inosine" subject and we discuss future research avenues in the field.
Affiliation(s)
- Umberto Rosani
- Department of Biology, University of Padova, 35131 Padova, Italy

7. González-Iglesias A, Arcas A, Domingo-Muelas A, Mancini E, Galcerán J, Valcárcel J, Fariñas I, Nieto MA. Intron detention tightly regulates the stemness/differentiation switch in the adult neurogenic niche. Nat Commun 2024; 15:2837. [PMID: 38565566 PMCID: PMC10987655 DOI: 10.1038/s41467-024-47092-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2021] [Accepted: 03/13/2024] [Indexed: 04/04/2024] Open
Abstract
The adult mammalian brain retains some capacity to replenish neurons and glia, holding promise for brain regeneration. Thus, understanding the mechanisms controlling adult neural stem cell (NSC) differentiation is crucial. Paradoxically, adult NSCs in the subependymal zone transcribe genes associated with both multipotency maintenance and neural differentiation, but the mechanism that prevents conflicts in fate decisions due to these opposing transcriptional programmes is unknown. Here we describe intron detention as such control mechanism. In NSCs, while multiple mRNAs from stemness genes are spliced and exported to the cytoplasm, transcripts from differentiation genes remain unspliced and detained in the nucleus, and the opposite is true under neural differentiation conditions. We also show that m6A methylation is the mechanism that releases intron detention and triggers nuclear export, enabling rapid and synchronized responses. m6A RNA methylation operates as an on/off switch for transcripts with antagonistic functions, tightly controlling the timing of NSCs commitment to differentiation.
Affiliation(s)
- Aida Arcas
- Instituto de Neurociencias (CSIC-UMH), Sant Joan d'Alacant, 03550, Spain
- Department of Gene Therapy and Regulation of Gene Expression, Center for Applied Medical Research, University of Navarra, Pamplona, 31008, Spain
- Ana Domingo-Muelas
- Departamento de Biología Celular, Biología Funcional y Antropología Física and Instituto de Biotecnología y Biomedicina, Universidad de Valencia, Burjassot, 46100, Spain
- Centro de Investigación Biomédica en Red sobre Enfermedades Neurodegenerativas (CIBERNED), 28029, Madrid, Spain
- Carlos Simon Foundation, 46980, Paterna, Valencia, Spain
- Department of Cell and Developmental Biology, Institute for Regenerative Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Igenomix Foundation, 46980, Paterna, Valencia, Spain
- Estefania Mancini
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, 08003, Spain
- Joan Galcerán
- Instituto de Neurociencias (CSIC-UMH), Sant Joan d'Alacant, 03550, Spain
- Centro de Investigación Biomédica en Red sobre Enfermedades Raras (CIBERER), 28029, Madrid, Spain
- Juan Valcárcel
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, 08003, Spain
- Universitat Pompeu Fabra (UPF), 08003, Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), 08010, Barcelona, Spain
- Isabel Fariñas
- Departamento de Biología Celular, Biología Funcional y Antropología Física and Instituto de Biotecnología y Biomedicina, Universidad de Valencia, Burjassot, 46100, Spain
- Centro de Investigación Biomédica en Red sobre Enfermedades Neurodegenerativas (CIBERNED), 28029, Madrid, Spain
- M Angela Nieto
- Instituto de Neurociencias (CSIC-UMH), Sant Joan d'Alacant, 03550, Spain.
- Centro de Investigación Biomédica en Red sobre Enfermedades Raras (CIBERER), 28029, Madrid, Spain.

8. Wang M, Ali H, Xu Y, Xie J, Xu S. BiPSTP: Sequence feature encoding method for identifying different RNA modifications with bidirectional position-specific trinucleotides propensities. J Biol Chem 2024; 300:107140. [PMID: 38447795 PMCID: PMC10997841 DOI: 10.1016/j.jbc.2024.107140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Revised: 02/17/2024] [Accepted: 02/25/2024] [Indexed: 03/08/2024] Open
Abstract
RNA modification, a posttranscriptional regulatory mechanism, significantly influences RNA biogenesis and function. The accurate identification of modification sites is paramount for investigating their biological implications. Methods for encoding RNA sequence into numerical data play a crucial role in developing robust models for predicting modification sites. However, existing techniques suffer from limitations, including inadequate information representation, challenges in effectively integrating positional and sequential information, and the generation of irrelevant or redundant features when combining multiple approaches. These deficiencies hinder the effectiveness of machine learning models in addressing the performance challenges associated with predicting RNA modification sites. Here, we introduce a novel RNA sequence feature representation method, named BiPSTP, which utilizes bidirectional trinucleotide position-specific propensities. We employ the parameter ξ to denote the interval between the current nucleotide and its adjacent forward or backward dinucleotide, enabling the extraction of positional and sequential information from RNA sequences. Leveraging the BiPSTP method, we have developed the prediction model mRNAPred using support vector machine classifier to identify multiple types of RNA modification sites. We evaluate the performance of our BiPSTP method and mRNAPred model across 12 distinct RNA modification types. Our experimental results demonstrate the superiority of the mRNAPred model compared to state-of-art models in the domain of RNA modification sites identification. Importantly, our BiPSTP method enhances the robustness and generalization performance of prediction models. Notably, it can be applied to feature extraction from DNA sequences to predict other biological modification sites.
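A position-specific, gapped trinucleotide propensity encoding in the spirit of this description might look like the sketch below. This is a guess at the general idea, not the published BiPSTP definition. Three things are assumed here rather than taken from the abstract: that each trinucleotide is the current nucleotide joined to a dinucleotide ξ positions away, that the propensity is the positive-set frequency minus the negative-set frequency, and that all sequences share one length.

```python
from collections import Counter


def gapped_trinucleotide(seq, i, xi, direction):
    """Join seq[i] with the dinucleotide xi positions ahead of or behind it.

    Returns None when the dinucleotide would fall outside the sequence.
    """
    if direction == "forward":
        start = i + xi + 1
        if start + 2 > len(seq):
            return None
        return seq[i] + seq[start:start + 2]
    start = i - xi - 2
    if start < 0:
        return None
    return seq[start:start + 2] + seq[i]


def position_frequencies(seqs, xi, direction):
    """Per-position relative frequency of each gapped trinucleotide."""
    tables = []
    for i in range(len(seqs[0])):
        counts = Counter()
        for s in seqs:
            tri = gapped_trinucleotide(s, i, xi, direction)
            if tri is not None:
                counts[tri] += 1
        total = sum(counts.values())
        tables.append({t: c / total for t, c in counts.items()} if total else {})
    return tables


def fit_propensities(pos_seqs, neg_seqs, xi):
    """Propensity = frequency in positives minus frequency in negatives."""
    model = {}
    for direction in ("forward", "backward"):
        pos = position_frequencies(pos_seqs, xi, direction)
        neg = position_frequencies(neg_seqs, xi, direction)
        model[direction] = [
            {t: p.get(t, 0.0) - n.get(t, 0.0) for t in set(p) | set(n)}
            for p, n in zip(pos, neg)
        ]
    return model


def encode_sequence(seq, model, xi):
    """Forward features for every position, then backward; unseen or missing -> 0.0."""
    features = []
    for direction in ("forward", "backward"):
        for i, table in enumerate(model[direction]):
            tri = gapped_trinucleotide(seq, i, xi, direction)
            features.append(table.get(tri, 0.0))
    return features
```

The resulting fixed-length vector (two values per position) could then be passed to a classifier such as the support vector machine the abstract mentions. The propensity tables should be fitted on training sequences only.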
Affiliation(s)
- Mingzhao Wang
- School of Computer Science, Shaanxi Normal University, Xi'an, China
- Haider Ali
- School of Computer Science, Shaanxi Normal University, Xi'an, China
- Yandi Xu
- School of Computer Science, Shaanxi Normal University, Xi'an, China; College of Life Sciences, Shaanxi Normal University, Xi'an, China
- Juanying Xie
- School of Computer Science, Shaanxi Normal University, Xi'an, China.
- Shengquan Xu
- College of Life Sciences, Shaanxi Normal University, Xi'an, China.

9. Meng Q, Schatten H, Zhou Q, Chen J. Crosstalk between m6A and coding/non-coding RNA in cancer and detection methods of m6A modification residues. Aging (Albany NY) 2023; 15:6577-6619. [PMID: 37437245 PMCID: PMC10373953 DOI: 10.18632/aging.204836] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/01/2023] [Accepted: 06/15/2023] [Indexed: 07/14/2023]
Abstract
N6-methyladenosine (m6A) is one of the most common and well-known internal RNA modifications that occur on mRNAs or ncRNAs. It affects various aspects of RNA metabolism, including splicing, stability, translocation, and translation. An abundance of evidence demonstrates that m6A plays a crucial role in various pathological and biological processes, especially in tumorigenesis and tumor progression. In this article, we introduce the potential functions of m6A regulators, including "writers" that install m6A marks, "erasers" that demethylate m6A, and "readers" that determine the fate of m6A-modified targets. We have conducted a review on the molecular functions of m6A, focusing on both coding and noncoding RNAs. Additionally, we have compiled an overview of the effects noncoding RNAs have on m6A regulators and explored the dual roles of m6A in the development and advancement of cancer. Our review also includes a detailed summary of the most advanced databases for m6A, state-of-the-art experimental and sequencing detection methods, and machine learning-based computational predictors for identifying m6A sites.
Affiliation(s)
- Qingren Meng
- National Clinical Research Center for Infectious Diseases, Shenzhen Third People’s Hospital, The Second Hospital Affiliated with the Southern University of Science and Technology, Shenzhen, Guangdong Province, China
- Heide Schatten
- Department of Veterinary Pathobiology, University of Missouri, Columbia, MO 65211, USA
- Qian Zhou
- International Cancer Center, Shenzhen University Medical School, Shenzhen, Guangdong Province, China
- Jun Chen
- National Clinical Research Center for Infectious Diseases, Shenzhen Third People’s Hospital, The Second Hospital Affiliated with the Southern University of Science and Technology, Shenzhen, Guangdong Province, China

10. Cheng J, Li G, Wang W, Stovall DB, Sui G, Li D. Circular RNAs with protein-coding ability in oncogenesis. Biochim Biophys Acta Rev Cancer 2023; 1878:188909. [PMID: 37172651 DOI: 10.1016/j.bbcan.2023.188909] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/26/2023] [Revised: 05/08/2023] [Accepted: 05/08/2023] [Indexed: 05/15/2023]
Abstract
As ubiquitously expressed transcripts in eukaryotes, circular RNAs (circRNAs) are covalently closed and lack a 5'-cap and 3'-polyadenylation (poly (A)) tail. Initially, circRNAs were considered non-coding RNA (ncRNA), and their roles as sponging molecules to adsorb microRNAs have been extensively reported. However, in recent years, accumulating evidence has demonstrated that circRNAs could encode functional polypeptides through the initiation of translation mediated by internal ribosomal entry sites (IRESs) or N6-methyladenosine (m6A). In this review, we collectively discuss the biogenesis, cognate mRNA products, regulatory mechanisms, aberrant expression and biological phenotypes or clinical relevance of all currently reported, cancer-relevant protein-coding circRNAs. Overall, we provide a comprehensive overview of circRNA-encoded proteins and their physiological and pathological functions.
Affiliation(s)
- Jiahui Cheng
- College of Life Science, Northeast Forestry University, Harbin 150040, China
- Guangyue Li
- College of Life Science, Northeast Forestry University, Harbin 150040, China
- Wenmeng Wang
- College of Life Science, Northeast Forestry University, Harbin 150040, China
- Daniel B Stovall
- College of Arts and Sciences, Winthrop University, Rock Hill, SC 29733, United States
- Guangchao Sui
- College of Life Science, Northeast Forestry University, Harbin 150040, China.
- Dangdang Li
- College of Life Science, Northeast Forestry University, Harbin 150040, China.

11. Acera Mateos P, Zhou Y, Zarnack K, Eyras E. Concepts and methods for transcriptome-wide prediction of chemical messenger RNA modifications with machine learning. Brief Bioinform 2023; 24:7150742. [PMID: 37139545 DOI: 10.1093/bib/bbad163] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/06/2023] [Revised: 03/03/2023] [Indexed: 05/05/2023] Open
Abstract
The expanding field of epitranscriptomics might rival the epigenome in the diversity of biological processes impacted. In recent years, the development of new high-throughput experimental and computational techniques has been a key driving force in discovering the properties of RNA modifications. Machine learning applications, such as for classification, clustering or de novo identification, have been critical in these advances. Nonetheless, various challenges remain before the full potential of machine learning for epitranscriptomics can be leveraged. In this review, we provide a comprehensive survey of machine learning methods to detect RNA modifications using diverse input data sources. We describe strategies to train and test machine learning methods and to encode and interpret features that are relevant for epitranscriptomics. Finally, we identify some of the current challenges and open questions about RNA modification analysis, including the ambiguity in predicting RNA modifications in transcript isoforms or in single nucleotides, or the lack of complete ground truth sets to test RNA modifications. We believe this review will inspire and benefit the rapidly developing field of epitranscriptomics in addressing the current limitations through the effective use of machine learning.
Affiliation(s)
- Pablo Acera Mateos
- EMBL Australia Partner Laboratory Network at the Australian National University, Canberra, Australia
- The Shine-Dalgarno Centre for RNA Innovation, The John Curtin School of Medical Research, Australian National University, Canberra, Australia
- The Centre for Computational Biomedical Sciences, The John Curtin School of Medical Research, Australian National University, Canberra, Australia
- You Zhou
- Buchmann Institute for Molecular Life Sciences (BMLS), Goethe University Frankfurt, Max-von-Laue-Str. 15, 60438 Frankfurt a.M., Germany
- Institute of Molecular Biosciences, Goethe University Frankfurt, Max-von-Laue-Str. 15, 60438 Frankfurt a.M., Germany
- Kathi Zarnack
- Buchmann Institute for Molecular Life Sciences (BMLS), Goethe University Frankfurt, Max-von-Laue-Str. 15, 60438 Frankfurt a.M., Germany
- Institute of Molecular Biosciences, Goethe University Frankfurt, Max-von-Laue-Str. 15, 60438 Frankfurt a.M., Germany
- Eduardo Eyras
- EMBL Australia Partner Laboratory Network at the Australian National University, Canberra, Australia
- The Shine-Dalgarno Centre for RNA Innovation, The John Curtin School of Medical Research, Australian National University, Canberra, Australia
- The Centre for Computational Biomedical Sciences, The John Curtin School of Medical Research, Australian National University, Canberra, Australia

12. Wang R, Chung CR, Huang HD, Lee TY. Identification of species-specific RNA N6-methyladinosine modification sites from RNA sequences. Brief Bioinform 2023; 24:7008797. [PMID: 36715277 DOI: 10.1093/bib/bbac573] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/07/2022] [Revised: 11/11/2022] [Accepted: 11/24/2022] [Indexed: 01/31/2023] Open
Abstract
N6-methyladinosine (m6A) modification is the most abundant co-transcriptional modification in eukaryotic RNA and plays important roles in cellular regulation. Traditional high-throughput sequencing experiments used to explore functional mechanisms are time-consuming and labor-intensive, and most of the proposed methods focused on limited species types. To further understand the relevant biological mechanisms among different species with the same RNA modification, it is necessary to develop a computational scheme that can be applied to different species. To achieve this, we proposed an attention-based deep learning method, adaptive-m6A, which consists of convolutional neural network, bi-directional long short-term memory and an attention mechanism, to identify m6A sites in multiple species. In addition, three conventional machine learning (ML) methods, including support vector machine, random forest and logistic regression classifiers, were considered in this work. In addition to the performance of ML methods for multi-species prediction, the optimal performance of adaptive-m6A yielded an accuracy of 0.9832 and the area under the receiver operating characteristic curve of 0.98. Moreover, the motif analysis and cross-validation among different species were conducted to test the robustness of one model towards multiple species, which helped improve our understanding about the sequence characteristics and biological functions of RNA modifications in different species.
Affiliation(s)
- Rulan Wang
- School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Road, Longgang District, 51872, Shenzhen, P.R. China
- Chia-Ru Chung
- Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Road, Longgang District, 51872, Shenzhen, P.R. China
- School of Life Sciences, University of Science and Technology of China, 230026, Hefei, Anhui, P.R. China
- Hsien-Da Huang
- Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Road, Longgang District, 51872, Shenzhen, P.R. China
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Road, Longgang District, 51872, Shenzhen, P.R. China
- Tzong-Yi Lee
- Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Road, Longgang District, 51872, Shenzhen, P.R. China
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Road, Longgang District, 51872, Shenzhen, P.R. China
|
13
|
M6A-BERT-Stacking: A Tissue-Specific Predictor for Identifying RNA N6-Methyladenosine Sites Based on BERT and Stacking Strategy. Symmetry (Basel) 2023. [DOI: 10.3390/sym15030731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/17/2023] Open
Abstract
As the most abundant RNA methylation modification, N6-methyladenosine (m6A) could regulate asymmetric and symmetric division of hematopoietic stem cells and play an important role in various diseases. Therefore, the precise identification of m6A sites around the genomes of different species is a critical step to further revealing their biological functions and influence on these diseases. However, the traditional wet-lab experimental methods for identifying m6A sites are often laborious and expensive. In this study, we proposed an ensemble deep learning model called m6A-BERT-Stacking, a powerful predictor for the detection of m6A sites in various tissues of three species. First, we utilized two encoding methods, i.e., di ribonucleotide index of RNA (DiNUCindex_RNA) and k-mer word segmentation, to extract RNA sequence features. Second, two encoding matrices together with the original sequences were respectively input into three different deep learning models in parallel to train three sub-models, namely residual networks with convolutional block attention module (Resnet-CBAM), bidirectional long short-term memory with attention (BiLSTM-Attention), and pre-trained bidirectional encoder representations from transformers model for DNA-language (DNABERT). Finally, the outputs of all sub-models were ensembled based on the stacking strategy to obtain the final prediction of m6A sites through the fully connected layer. The experimental results demonstrated that m6A-BERT-Stacking outperformed most of the existing methods based on the same independent datasets.
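The k-mer word segmentation mentioned in this abstract can be illustrated with a minimal sketch. The function name `kmer_tokens`, the default `k=3` and the `stride` parameter are assumptions made for illustration; the abstract does not state the k value or any other preprocessing used by m6A-BERT-Stacking.

```python
def kmer_tokens(sequence: str, k: int = 3, stride: int = 1) -> list[str]:
    """Split a nucleotide sequence into overlapping k-mer 'words'.

    Hypothetical helper: k and stride are illustrative defaults,
    not values taken from the cited paper.
    """
    if k <= 0 or stride <= 0:
        raise ValueError("k and stride must be positive")
    seq = sequence.upper()
    # Slide a window of width k along the sequence; a sequence shorter
    # than k yields no tokens.
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


if __name__ == "__main__":
    # A 7-nucleotide fragment gives 7 - 3 + 1 = 5 overlapping 3-mers.
    print(kmer_tokens("GGACUAG", k=3))
```

Token lists of this kind are what a language-model style encoder would typically consume as "sentences"; how the tokens are mapped to embeddings is outside the scope of this sketch.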
|
14
|
Taguchi YH. Bioinformatic tools for epitranscriptomics. Am J Physiol Cell Physiol 2023; 324:C447-C457. [PMID: 36468841 DOI: 10.1152/ajpcell.00437.2022] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/23/2022] [Revised: 11/17/2022] [Accepted: 11/22/2022] [Indexed: 12/12/2022]
Abstract
The epitranscriptome, defined as RNA modifications that do not involve alterations in the nucleotide sequence, is a popular topic in the genomic sciences. Because we need massive computational techniques to identify epitranscriptomes within individual transcripts, many tools have been developed to infer epitranscriptomic sites as well as to process datasets using high-throughput sequencing. In this review, we summarize recent developments in epitranscriptome spatial detection and data analysis and discuss their progression.
Affiliation(s)
- Y-H Taguchi
- Department of Physics, Chuo University, Tokyo, Japan
|
15
|
Zhang S, Wang J, Li X, Liang Y. M6A-GSMS: Computational identification of N6-methyladenosine sites with GBDT and stacking learning in multiple species. J Biomol Struct Dyn 2022; 40:12380-12391. [PMID: 34459713 DOI: 10.1080/07391102.2021.1970628] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/24/2022]
Abstract
N6-methyladenosine (m6A) is one of the most abundant forms of RNA methylation modifications currently known. It is involved in a wide range of biological processes, including RNA degradation, stability and alternative splicing. Therefore, the development of convenient and efficient m6A prediction technologies is urgent. In this work, a novel predictor based on GBDT and stacking learning, called M6A-GSMS, is developed to identify m6A sites. To achieve accurate prediction, we explore RNA sequence information from four aspects: correlation, structure, physicochemical properties and pseudo ribonucleic acid composition. After using the GBDT algorithm for feature selection, a stacking model is constructed by combining seven basic classifiers. Compared with other state-of-the-art methods, the results show that M6A-GSMS can obtain excellent performance for identifying m6A sites. The prediction accuracy for A. thaliana, D. melanogaster, M. musculus, S. cerevisiae and human reaches 88.4%, 60.8%, 80.5%, 92.4% and 61.8%, respectively. This method provides an effective prediction for the investigation of m6A sites. In addition, all the datasets and codes are currently available at https://github.com/Wang-Jinyue/M6A-GSMS. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Shengli Zhang
- School of Mathematics and Statistics, Xidian University, Xi'an, P. R. China
- Jinyue Wang
- School of Mathematics and Statistics, Xidian University, Xi'an, P. R. China
- Xinjie Li
- School of Mathematics and Statistics, Xidian University, Xi'an, P. R. China
- Yunyun Liang
- School of Science, Xi'an Polytechnic University, Xi'an, P. R. China
|
16
|
Zou J, Liu H, Tan W, Chen YQ, Dong J, Bai SY, Wu ZX, Zeng Y. Dynamic regulation and key roles of ribonucleic acid methylation. Front Cell Neurosci 2022; 16:1058083. [PMID: 36601431 PMCID: PMC9806184 DOI: 10.3389/fncel.2022.1058083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2022] [Accepted: 11/28/2022] [Indexed: 12/23/2022] Open
Abstract
Ribonucleic acid (RNA) methylation is the most abundant modification in biological systems, accounting for 60% of all RNA modifications, and affects multiple aspects of RNA (including mRNAs, tRNAs, rRNAs, microRNAs, and long non-coding RNAs). Dysregulation of RNA methylation causes many developmental diseases through various mechanisms mediated by N6-methyladenosine (m6A), 5-methylcytosine (m5C), N1-methyladenosine (m1A), 5-hydroxymethylcytosine (hm5C), and pseudouridine (Ψ). The emerging tools of RNA methylation can be used as diagnostic, preventive, and therapeutic markers. Here, we review the accumulated discoveries to date regarding the biological function and dynamic regulation of RNA methylation/modification, as well as the most popularly used techniques applied for profiling RNA epitranscriptome, to provide new ideas for growth and development.
Affiliation(s)
- Jia Zou
- Community Health Service Center, Geriatric Hospital Affiliated to Wuhan University of Science and Technology, Wuhan, China
- Brain Science and Advanced Technology Institute, School of Medicine, Wuhan University of Science and Technology, Wuhan, China
- Hui Liu
- Community Health Service Center, Geriatric Hospital Affiliated to Wuhan University of Science and Technology, Wuhan, China
- Brain Science and Advanced Technology Institute, School of Medicine, Wuhan University of Science and Technology, Wuhan, China
- Wei Tan
- Community Health Service Center, Geriatric Hospital Affiliated to Wuhan University of Science and Technology, Wuhan, China
- Yi-qi Chen
- Community Health Service Center, Geriatric Hospital Affiliated to Wuhan University of Science and Technology, Wuhan, China
- Brain Science and Advanced Technology Institute, School of Medicine, Wuhan University of Science and Technology, Wuhan, China
- Jing Dong
- Community Health Service Center, Geriatric Hospital Affiliated to Wuhan University of Science and Technology, Wuhan, China
- Brain Science and Advanced Technology Institute, School of Medicine, Wuhan University of Science and Technology, Wuhan, China
- Shu-yuan Bai
- Community Health Service Center, Geriatric Hospital Affiliated to Wuhan University of Science and Technology, Wuhan, China
- Brain Science and Advanced Technology Institute, School of Medicine, Wuhan University of Science and Technology, Wuhan, China
- Zhao-xia Wu
- Community Health Service Center, Wuchang Hospital, Wuhan, China
- Yan Zeng
- Community Health Service Center, Geriatric Hospital Affiliated to Wuhan University of Science and Technology, Wuhan, China
- Brain Science and Advanced Technology Institute, School of Medicine, Wuhan University of Science and Technology, Wuhan, China
- School of Public Health, Wuhan University of Science and Technology, Wuhan, China
- Correspondence: Yan Zeng
|
17
|
Luo Z, Lou L, Qiu W, Xu Z, Xiao X. Predicting N6-Methyladenosine Sites in Multiple Tissues of Mammals through Ensemble Deep Learning. Int J Mol Sci 2022; 23:15490. [PMID: 36555143 PMCID: PMC9778682 DOI: 10.3390/ijms232415490] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 11/03/2022] [Revised: 12/03/2022] [Accepted: 12/05/2022] [Indexed: 12/13/2022] Open
Abstract
N6-methyladenosine (m6A) is the most abundant modification within eukaryotic messenger RNA and plays an essential regulatory role in the control of cellular functions and gene expression. However, it remains an outstanding challenge to detect mRNA m6A transcriptome-wide at base resolution via experimental approaches, which are generally time-consuming and expensive. Developing computational methods is a good strategy for accurate in silico detection of m6A modification sites from the large amount of RNA sequence data. Unfortunately, the existing computational models usually predict m6A sites in a single species only, without considering the tissue level, and most of them are constructed from low-confidence data generated by m6A antibody immunoprecipitation (IP)-based sequencing, which restricts the reliability and generalizability of the proposed models. Here, we review recent advances in computational prediction of m6A sites and construct a new computational approach named im6APred using ensemble deep learning to accurately identify m6A sites based on high-confidence level data in multiple tissues of mammals. Our model im6APred builds upon a comprehensive evaluation of multiple classification methods, including four traditional classification algorithms and three deep learning methods and their ensembles. The optimal base-classifier combinations are then chosen by a five-fold cross-validation test to achieve an effective stacked model. Our model im6APred can produce an area under the receiver operating characteristic curve (AUROC) in the range of 0.82-0.91 on independent tests, indicating that our model has the ability to learn general methylation rules on RNA bases and generalize to m6A transcriptome-wide identification. Moreover, AUROCs in the range of 0.77-0.96 were achieved using cross-species/tissues validation on the benchmark dataset, demonstrating differences in predictive performance at the tissue level and the need for constructing tissue-specific models for m6A site prediction.
Affiliation(s)
- Zhaochun Xu
- Computer Department, Jingdezhen Ceramic University, Jingdezhen 333403, China
- Xuan Xiao
- Computer Department, Jingdezhen Ceramic University, Jingdezhen 333403, China
|
18
|
Liao J, Wang Q, Wu F, Huang Z. In Silico Methods for Identification of Potential Active Sites of Therapeutic Targets. Molecules 2022; 27:7103. [PMID: 36296697 PMCID: PMC9609013 DOI: 10.3390/molecules27207103] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 06/30/2022] [Revised: 08/12/2022] [Accepted: 08/25/2022] [Indexed: 07/30/2023] Open
Abstract
Target identification is an important step in drug discovery, and computer-aided drug target identification methods are attracting more attention compared with traditional drug target identification methods, which are time-consuming and costly. Computer-aided drug target identification methods can greatly reduce the searching scope of experimental targets and associated costs by identifying the diseases-related targets and their binding sites and evaluating the druggability of the predicted active sites for clinical trials. In this review, we introduce the principles of computer-based active site identification methods, including the identification of binding sites and assessment of druggability. We provide some guidelines for selecting methods for the identification of binding sites and assessment of druggability. In addition, we list the databases and tools commonly used with these methods, present examples of individual and combined applications, and compare the methods and tools. Finally, we discuss the challenges and limitations of binding site identification and druggability assessment at the current stage and provide some recommendations and future perspectives.
Affiliation(s)
- Jianbo Liao
- Key Laboratory of Big Data Mining and Precision Drug Design of Guangdong Medical University, Key Laboratory of Computer-Aided Drug Design of Dongguan City, Key Laboratory for Research and Development of Natural Drugs of Guangdong Province, School of Pharmacy, Guangdong Medical University, Dongguan 523808, China
- The Second School of Clinical Medicine, Guangdong Medical University, Dongguan 523808, China
- Qinyu Wang
- Key Laboratory of Big Data Mining and Precision Drug Design of Guangdong Medical University, Key Laboratory of Computer-Aided Drug Design of Dongguan City, Key Laboratory for Research and Development of Natural Drugs of Guangdong Province, School of Pharmacy, Guangdong Medical University, Dongguan 523808, China
- Fengxu Wu
- Hubei Key Laboratory of Wudang Local Chinese Medicine Research, School of Pharmaceutical Sciences, Hubei University of Medicine, Shiyan 442000, China
- Zunnan Huang
- Key Laboratory of Big Data Mining and Precision Drug Design of Guangdong Medical University, Key Laboratory of Computer-Aided Drug Design of Dongguan City, Key Laboratory for Research and Development of Natural Drugs of Guangdong Province, School of Pharmacy, Guangdong Medical University, Dongguan 523808, China
- Marine Biomedical Research Institute of Guangdong Zhanjiang, Zhanjiang 524023, China
|
19
|
Wang H, Zhao S, Cheng Y, Bi S, Zhu X. MTDeepM6A-2S: A two-stage multi-task deep learning method for predicting RNA N6-methyladenosine sites of Saccharomyces cerevisiae. Front Microbiol 2022; 13:999506. [PMID: 36274691 PMCID: PMC9579691 DOI: 10.3389/fmicb.2022.999506] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/21/2022] [Accepted: 09/16/2022] [Indexed: 11/13/2022] Open
Abstract
N6-methyladenosine (m6A) is one of the most important RNA modifications and is involved in many biological activities. Computational methods have been developed to detect m6A sites due to their high efficiency and low costs. Because Saccharomyces cerevisiae is one of the most widely utilized model organisms, many methods have been developed for predicting its m6A sites. However, the generalization of these methods was hampered by the limited size of the benchmark datasets. On the other hand, over 60,000 low resolution m6A sites and more than 10,000 base resolution m6A sites of Saccharomyces cerevisiae are recorded in RMBase and m6A-Atlas, respectively. The base resolution m6A sites are often obtained from low resolution results by post calibration. In view of these, we proposed a two-stage deep learning method, named MTDeepM6A-2S, to predict RNA m6A sites of Saccharomyces cerevisiae based on RNA sequence information. In the first stage, a multi-task model with a convolutional neural network (CNN) and bidirectional long short-term memory (BiLSTM) deep framework was built to not only detect the low resolution m6A sites but also assign a reasonable probability to each predicted site. In the second stage, a transfer-learning strategy was used to build the model to predict the base resolution m6A sites from those low resolution m6A sites. The effectiveness of our model was validated on both training and independent test sets. The results show that our model outperforms other state-of-the-art models on the independent test set, which indicates that our model holds high potential to become a useful tool for epitranscriptomics analysis.
|
20
|
PSP-PJMI: An innovative feature representation algorithm for identifying DNA N4-methylcytosine sites. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.05.060] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/25/2022]
|
21
|
Ma L, He LN, Kang S, Gu B, Gao S, Zuo Z. Advances in detecting N6-methyladenosine modification in circRNAs. Methods 2022; 205:234-246. [PMID: 35878749 DOI: 10.1016/j.ymeth.2022.07.011] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/18/2022] [Revised: 07/15/2022] [Accepted: 07/18/2022] [Indexed: 12/14/2022] Open
Abstract
Circular RNAs (circRNAs) are a class of noncoding RNAs with covalently single-stranded closed loop structures derived from back-splicing event of linear precursor mRNAs (pre-mRNAs). N6-methyladenosine (m6A), the most abundant epigenetic modification in eukaryotic RNAs, has been shown to play a crucial role in regulating the fate and biological function of circRNAs, and thus affecting various physiological and pathological processes. Accurate identification of m6A modification in circRNAs is an essential step to fully elucidate the crosstalk between m6A and circRNAs. In recent years, the rapid development of high-throughput sequencing technology and bioinformatic methodology has propelled the establishment of a multitude of approaches to detect circRNAs and m6A modification, including in vitro-based and in silico methods. Based on this, the research community has started on a new journey to develop methods for identification of m6A modification in circRNAs. In this review, we provide a comprehensive review and evaluation of the existing methods responsible for detecting circRNAs, m6A modification, and especially, m6A modification in circRNAs, which mainly focused on those developed based on high-throughput technologies and methodology of bioinformatics. This handy reference can help researchers figure out towards which direction this field will go.
Affiliation(s)
- Lixia Ma
- State Key Laboratory of Esophageal Cancer Prevention & Treatment, Henan Key Laboratory of Microbiome and Esophageal Cancer Prevention and Treatment, Henan Key Laboratory of Cancer Epigenetics, Cancer Hospital, The First Affiliated Hospital (College of Clinical Medical) of Henan University of Science and Technology, Luoyang, China
- Li-Na He
- Sun Yat-sen University Cancer Center, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangzhou, China
- Shiyang Kang
- Sun Yat-sen University Cancer Center, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangzhou, China
- Bianli Gu
- State Key Laboratory of Esophageal Cancer Prevention & Treatment, Henan Key Laboratory of Microbiome and Esophageal Cancer Prevention and Treatment, Henan Key Laboratory of Cancer Epigenetics, Cancer Hospital, The First Affiliated Hospital (College of Clinical Medical) of Henan University of Science and Technology, Luoyang, China
- Shegan Gao
- State Key Laboratory of Esophageal Cancer Prevention & Treatment, Henan Key Laboratory of Microbiome and Esophageal Cancer Prevention and Treatment, Henan Key Laboratory of Cancer Epigenetics, Cancer Hospital, The First Affiliated Hospital (College of Clinical Medical) of Henan University of Science and Technology, Luoyang, China.
- Zhixiang Zuo
- Sun Yat-sen University Cancer Center, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangzhou, China.
|
22
|
CNNLSTMac4CPred: A Hybrid Model for N4-Acetylcytidine Prediction. Interdiscip Sci 2022; 14:439-451. [PMID: 35106702 DOI: 10.1007/s12539-021-00500-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 07/25/2021] [Revised: 12/04/2021] [Accepted: 12/13/2021] [Indexed: 12/23/2022]
Abstract
N4-Acetylcytidine (ac4C) is a highly conserved and widespread post-transcriptional RNA modification that plays versatile roles in cellular processes. Due to the limitation of techniques and knowledge, large-scale identification of ac4C is still a challenging task. RNA sequences are like sentences containing semantics in the natural language. Inspired by the semantics of language, we proposed a hybrid model for ac4C prediction. The model used long short-term memory and convolutional neural networks to extract the semantic features hidden in the sequences. The semantic features and two traditional features (k-nucleotide frequencies and pseudo tri-tuple nucleotide composition) were combined to represent ac4C or non-ac4C sequences. eXtreme Gradient Boosting was used as the learning algorithm. Five-fold cross-validation over the training set consisting of 1160 ac4C and 10,855 non-ac4C sequences obtained an area under the receiver operating characteristic curve (AUROC) of 0.9004, and the independent test over 469 ac4C and 4343 non-ac4C sequences reached an AUROC of 0.8825. The model obtained a sensitivity of 0.6474 in the five-fold cross-validation and 0.6290 in the independent test, outperforming two state-of-the-art methods. The performance of semantic features alone was better than that of k-nucleotide frequencies and pseudo tri-tuple nucleotide composition, implying that ac4C sequences carry semantic information. The proposed hybrid model was implemented in a user-friendly web server that is freely available to the scientific community: http://47.113.117.61/ac4c/. The presented model and tool are beneficial for identifying ac4C on a large scale.
|
23
|
Yu B, Zhang Y, Wang X, Gao H, Sun J, Gao X. Identification of DNA modification sites based on elastic net and bidirectional gated recurrent unit with convolutional neural network. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2022.103566] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 02/06/2023]
|
24
|
Identification of D Modification Sites Using a Random Forest Model Based on Nucleotide Chemical Properties. Int J Mol Sci 2022; 23:ijms23063044. [PMID: 35328461 PMCID: PMC8950657 DOI: 10.3390/ijms23063044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2022] [Revised: 02/25/2022] [Accepted: 03/09/2022] [Indexed: 12/03/2022] Open
Abstract
Dihydrouridine (D) is an abundant post-transcriptional modification present in transfer RNA from eukaryotes, bacteria, and archaea. D has contributed to treatments for cancerous diseases. Therefore, the precise detection of D modification sites can enable further understanding of its functional roles. Traditional experimental techniques to identify D are laborious and time-consuming. In addition, there are few computational tools for such analysis. In this study, we utilized eleven sequence-derived feature extraction methods and implemented five popular machine learning algorithms to identify an optimal model. During data preprocessing, data were partitioned for training and testing. Oversampling was also adopted to reduce the effect of the imbalance between positive and negative samples. The best-performing model was obtained through a combination of random forest and nucleotide chemical property modeling. The optimized model presented high sensitivity and specificity values of 0.9688 and 0.9706 in independent tests, respectively. Our proposed model surpassed published tools in independent tests. Furthermore, a series of validations across several aspects was conducted in order to demonstrate the robustness and reliability of our model.
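The oversampling step mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which oversampling scheme was used, so plain random oversampling with replacement is shown here, and the function name `random_oversample` and its `seed` parameter are hypothetical.

```python
import random


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until all classes
    have as many samples as the largest class.

    Assumption: simple random oversampling with replacement; the cited
    study may have used a different scheme.
    """
    rng = random.Random(seed)
    by_label = {}
    for sample, label in zip(samples, labels):
        by_label.setdefault(label, []).append(sample)
    target = max(len(group) for group in by_label.values())

    out_samples, out_labels = [], []
    for label, group in by_label.items():
        # Draw extra copies only for classes smaller than the target size.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


if __name__ == "__main__":
    seqs = ["AUGC", "GGCA", "UUAC", "CCGA", "ACGU", "GAUC", "CUAG"]
    labs = [0, 0, 0, 0, 0, 1, 1]
    new_seqs, new_labs = random_oversample(seqs, labs)
    print(new_labs.count(0), new_labs.count(1))
```

Resampling of this kind is normally applied to the training partition only, so that duplicated samples do not leak into the test set.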
|
25
|
Wang H, Wang S, Zhang Y, Bi S, Zhu X. A brief review of machine learning methods for RNA methylation sites prediction. Methods 2022; 203:399-421. [DOI: 10.1016/j.ymeth.2022.03.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/31/2021] [Revised: 02/15/2022] [Accepted: 03/01/2022] [Indexed: 02/07/2023] Open
|
26
|
Arif M, Ahmed S, Ge F, Kabir M, Khan YD, Yu DJ, Thafar M. StackACPred: Prediction of anticancer peptides by integrating optimized multiple feature descriptors with stacked ensemble approach. Chemometrics and Intelligent Laboratory Systems 2022; 220:104458. [DOI: 10.1016/j.chemolab.2021.104458] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Indexed: 11/17/2024]
|
27
|
Cui C, Wu X, Zhou Y. GlyinsRNA: a webserver for predicting glycosylation sites on small RNAs. RNA Biol 2021; 18:600-603. [PMID: 34559595 DOI: 10.1080/15476286.2021.1982574] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022] Open
Abstract
Versatile RNA modifications play important roles in post-transcriptional regulations of gene expression, among which glycosylation modifications on small RNAs emerge as a novel clade whose characteristics need further interrogations. Here, we demonstrated that the sequence pattern around RNA glycosylation sites was not random and could be exploited for glycosylation site prediction. A machine learning predictor, GlyinsRNA, which integrated multiple RNA sequence representation encodings, was established. GlyinsRNA achieved AUROC (area under the receiver operating characteristic curve) of 0.7933 and 0.7979 in five-fold cross-validation and independent tests, respectively. GlyinsRNA was implemented as an online webserver, where both the predicted glycosylation sites and the overrepresented RNA-binding protein (RBP)-related motifs were annotated to facilitate the users. GlyinsRNA webserver is freely available at http://www.rnanut.net/glyinsrna.
Affiliation(s)
- Chunmei Cui
- Department of Biomedical Informatics, Moe Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, Beijing, China
- Xiaobin Wu
- Department of Biomedical Informatics, Moe Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, Beijing, China
- Yuan Zhou
- Department of Biomedical Informatics, Moe Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, Beijing, China
|
28
|
Wang X, Lin X, Wang R, Han N, Fan K, Han L, Ding Z. A Feature Fusion Predictor for RNA Pseudouridine Sites with Particle Swarm Optimizer Based Feature Selection and Ensemble Learning Approach. Curr Issues Mol Biol 2021; 43:1844-1858. [PMID: 34889887 PMCID: PMC8929013 DOI: 10.3390/cimb43030129] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/30/2021] [Revised: 10/17/2021] [Accepted: 10/19/2021] [Indexed: 01/28/2023] Open
Abstract
RNA pseudouridine modification is particularly important in a variety of cellular biological and physiological processes. It plays a significant role in understanding RNA functions, RNA structure stabilization, translation processes, etc. To understand its functional mechanisms, it is necessary to accurately identify pseudouridine sites in RNA sequences. Although some computational methods have been proposed for the identification of pseudouridine sites, it is still a challenge to improve the identification accuracy and generalization ability. To address this challenge, a novel feature fusion predictor, named PsoEL-PseU, is proposed for the prediction of pseudouridine sites. Firstly, this study systematically and comprehensively explored different types of feature descriptors and determined six feature descriptors with various properties. To improve the feature representation ability, a binary particle swarm optimizer was used to capture the optimal feature subset for each of the six feature descriptors. Secondly, six individual predictors were trained by using the six optimal feature subsets. Finally, to fuse the effects of all six features, the six individual predictors were fused into an ensemble predictor by a parallel fusion strategy. Ten-fold cross-validation on three benchmark datasets indicated that the PsoEL-PseU predictor significantly outperformed the current state-of-the-art predictors. Additionally, the new predictor achieved better accuracy in the independent dataset evaluation, significantly higher than that of its existing counterparts, and the user-friendly webserver developed for the PsoEL-PseU predictor has been made freely accessible.
Affiliation(s)
- Xiao Wang
- School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou 450002, China
- Correspondence:
- Xi Lin
- School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou 450002, China
- Rong Wang
- School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou 450002, China
- Nijia Han
- School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou 450002, China
- Kaiqi Fan
- School of Material and Chemical Engineering, Zhengzhou University of Light Industry, Zhengzhou 450002, China
- Lijun Han
- School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou 450002, China
| | - Zhaoyuan Ding
- School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou 450002, China; (X.L.); (R.W.); (N.H.); (L.H.); (Z.D.)
|
29
|
Zhou Y, Yang J, Tian Z, Zeng J, Shen W. Research progress concerning m6A methylation and cancer. Oncol Lett 2021; 22:775. [PMID: 34589154 PMCID: PMC8442141 DOI: 10.3892/ol.2021.13036] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 05/28/2021] [Accepted: 08/20/2021] [Indexed: 12/12/2022]
Abstract
N6-methyladenosine (m6A) methylation is a type of methylation modification on RNA molecules that was first discovered in 1974 and has become a hot topic in life science in recent years. m6A modification is a form of epigenetic regulation similar to DNA and histone modification and is dynamically reversible in mammalian cells. This chemical mark on RNA is installed by m6A 'writers' (methylases) and can be removed by m6A 'erasers' (demethylases). Methylation-reading proteins, the 'readers', recognize mRNA containing m6A and regulate the expression of downstream genes accordingly. m6A methylation is involved in all stages of the RNA life cycle, including RNA processing, nuclear export, translation and regulation of RNA degradation, indicating that m6A plays a crucial role in RNA metabolism. Recent studies have shown that m6A modification forms a complicated regulatory network in different cell lines, tissues and spatio-temporal models, and that m6A methylation is associated with the occurrence and development of tumors. The present review describes the regulatory mechanism and physiological functions of m6A methylation, and its research progress in several types of human tumor, to provide novel approaches for early diagnosis and targeted treatment of cancer.
Affiliation(s)
- Yang Zhou
- Department of Cell Biology, School of Medicine of Yangzhou University, Yangzhou, Jiangsu 225000, P.R. China
| | - Jie Yang
- Department of Cell Biology, School of Medicine of Yangzhou University, Yangzhou, Jiangsu 225000, P.R. China
| | - Zheng Tian
- Department of Cell Biology, School of Medicine of Yangzhou University, Yangzhou, Jiangsu 225000, P.R. China
| | - Jing Zeng
- Department of Cell Biology, School of Medicine of Yangzhou University, Yangzhou, Jiangsu 225000, P.R. China
| | - Weigan Shen
- Department of Cell Biology, School of Medicine of Yangzhou University, Yangzhou, Jiangsu 225000, P.R. China
|
30
|
BERT-m7G: A Transformer Architecture Based on BERT and Stacking Ensemble to Identify RNA N7-Methylguanosine Sites from Sequence Information. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2021; 2021:7764764. [PMID: 34484416 PMCID: PMC8413034 DOI: 10.1155/2021/7764764] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 07/16/2021] [Accepted: 08/13/2021] [Indexed: 01/19/2023]
Abstract
As one of the most prevalent posttranscriptional modifications of RNA, N7-methylguanosine (m7G) plays an essential role in the regulation of gene expression. Accurate identification of m7G sites in the transcriptome is invaluable for revealing their potential functional mechanisms. Although high-throughput experimental methods can locate m7G sites precisely, they are expensive and time-consuming. Hence, it is imperative to design an efficient computational method that can accurately identify m7G sites. In this study, we propose a novel method that uses a BERT-based multilingual model to represent the information in RNA sequences. Firstly, we treat RNA sequences as natural sentences and then employ the bidirectional encoder representations from transformers (BERT) model to transform them into fixed-length numerical matrices. Secondly, a feature selection scheme based on the elastic net method is constructed to eliminate redundant features and retain important ones. Finally, the selected feature subset is input into a stacking ensemble classifier to predict m7G sites, and the hyperparameters of the classifier are tuned with the tree-structured Parzen estimator (TPE) approach. Under 10-fold cross-validation, BERT-m7G achieves an ACC of 95.48% and an MCC of 0.9100. The experimental results indicate that the proposed method significantly outperforms state-of-the-art prediction methods in the identification of m7G modifications.
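Treating an RNA sequence as a "sentence" requires some way of cutting it into "words". The abstract does not say how this is done; a common choice, assumed here purely for illustration, is to use overlapping k-mers as tokens. The function name `kmer_tokens` and the value k = 3 are hypothetical.

```python
def kmer_tokens(sequence, k=3):
    """Split an RNA sequence into overlapping k-mers ("words")."""
    seq = sequence.upper().replace("T", "U")  # accept DNA-style input too
    if k <= 0 or k > len(seq):
        raise ValueError("k must be between 1 and the sequence length")
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


# A space-joined string of k-mers is the kind of "sentence" that a
# text tokenizer for a language model could consume.
sentence = " ".join(kmer_tokens("AUGGCU", k=3))
print(sentence)
```

Overlapping k-mers preserve local context at every position, at the cost of producing nearly as many tokens as there are nucleotides.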
|
31
|
Wang M, Xie J, Xu S. M6A-BiNP: predicting N6-methyladenosine sites based on bidirectional position-specific propensities of polynucleotides and pointwise joint mutual information. RNA Biol 2021; 18:2498-2512. [PMID: 34161188 PMCID: PMC8632114 DOI: 10.1080/15476286.2021.1930729] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 12/13/2022]
Abstract
N6-methyladenosine (m6A) plays an important role in various biological processes. Identifying m6A sites is a key step in exploring its biological functions. One of the biggest challenges in identifying m6A sites is how to extract features that carry enough categorical information to distinguish m6A from non-m6A sites. To address this challenge, we propose bidirectional dinucleotide and trinucleotide position-specific propensities. Based on these, we propose two feature-encoding algorithms: Position-Specific Propensities and Pointwise Mutual Information (PSP-PMI) and Position-Specific Propensities and Pointwise Joint Mutual Information (PSP-PJMI). PSP-PMI is based on the bidirectional dinucleotide position-specific propensity and pointwise mutual information, while PSP-PJMI is based on the bidirectional trinucleotide position-specific propensity and the pointwise joint mutual information proposed in this paper. We introduce parameters α and β in PSP-PMI and PSP-PJMI, respectively, to represent the distance from a nucleotide to its forward or backward adjacent nucleotide or dinucleotide, so as to extract features containing local and global classification information. Finally, we propose the M6A-BiNP predictor based on PSP-PMI or PSP-PJMI and an SVM classifier. The 10-fold cross-validation results on benchmark datasets of non-single-base resolution and single-base resolution demonstrate that PSP-PMI and PSP-PJMI can extract features with strong capabilities to distinguish m6A from non-m6A sites. The M6A-BiNP predictor based on our proposed feature-encoding algorithm PSP-PJMI is better than the state-of-the-art predictors, and it is so far the best model for identifying m6A and non-m6A sites.
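The abstract builds on pointwise mutual information between a nucleotide and a neighbour at distance α. The sketch below computes ordinary PMI, log2 of p(x, y) / (p(x) p(y)), for nucleotide pairs separated by α positions, pooled over all positions and sequences. It is a simplified illustration, not the paper's PSP-PMI encoding, which is position-specific and bidirectional; the function name `gapped_pmi` is hypothetical.

```python
import math
from collections import Counter


def gapped_pmi(sequences, alpha=1):
    """PMI of nucleotide pairs (x, y) where y lies alpha positions after x."""
    pair_counts = Counter()
    left_counts = Counter()
    right_counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - alpha):
            x, y = seq[i], seq[i + alpha]
            pair_counts[(x, y)] += 1
            left_counts[x] += 1
            right_counts[y] += 1
    total = sum(pair_counts.values())
    if total == 0:
        raise ValueError("sequences are too short for this alpha")
    return {
        (x, y): math.log2(
            (n / total) / ((left_counts[x] / total) * (right_counts[y] / total))
        )
        for (x, y), n in pair_counts.items()
    }


pmi = gapped_pmi(["AUAU"], alpha=1)
for pair, value in sorted(pmi.items()):
    print(pair, round(value, 3))
```

A positive value means the pair co-occurs at that distance more often than independence would predict; increasing α shifts the feature from local toward longer-range context.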
Affiliation(s)
- Mingzhao Wang
- College of Life Sciences, Shaanxi Normal University, Xi'an, China; School of Computer Science, Shaanxi Normal University, Xi'an, China
| | - Juanying Xie
- School of Computer Science, Shaanxi Normal University, Xi'an, China
| | - Shengquan Xu
- College of Life Sciences, Shaanxi Normal University, Xi'an, China
|
32
|
Wang Y, Guo R, Huang L, Yang S, Hu X, He K. m6AGE: A Predictor for N6-Methyladenosine Sites Identification Utilizing Sequence Characteristics and Graph Embedding-Based Geometrical Information. Front Genet 2021; 12:670852. [PMID: 34122525 PMCID: PMC8191635 DOI: 10.3389/fgene.2021.670852] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 02/22/2021] [Accepted: 04/29/2021] [Indexed: 11/30/2022]
Abstract
N6-methyladenosine (m6A) is one of the most prevalent RNA post-transcriptional modifications and is involved in various vital biological processes such as mRNA splicing, export, and stability. Identifying m6A sites contributes to understanding the functional mechanism and biological significance of m6A. The existing biological experimental methods for identifying m6A sites are time-consuming and costly. Thus, developing a high-confidence computational method is important for exploring the intrinsic characteristics of m6A. In this study, we propose a predictor called m6AGE which utilizes sequence-derived and graph embedding features. To the best of our knowledge, our predictor is the first to combine sequence-derived features and graph embeddings for m6A site prediction. Comparison results show that our proposed predictor achieved the best performance compared with other predictors on four public datasets across three species. On the A101 dataset, our predictor outperformed the compared predictors by 1.34% (accuracy), 0.0227 (Matthews correlation coefficient), 5.63% (specificity), and 0.0081 (AUC), which indicates that m6AGE is a useful tool for m6A site prediction. The source code of m6AGE is available at https://github.com/bokunoBike/m6AGE.
Affiliation(s)
- Yan Wang
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, and College of Computer Science and Technology, Jilin University, Changchun, China
- School of Artificial Intelligence, Jilin University, Changchun, China
| | - Rui Guo
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, and College of Computer Science and Technology, Jilin University, Changchun, China
| | - Lan Huang
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, and College of Computer Science and Technology, Jilin University, Changchun, China
| | - Sen Yang
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, and College of Computer Science and Technology, Jilin University, Changchun, China
| | - Xuemei Hu
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, and College of Computer Science and Technology, Jilin University, Changchun, China
| | - Kai He
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, and College of Computer Science and Technology, Jilin University, Changchun, China
|
33
|
Epigenetics: Roles and therapeutic implications of non-coding RNA modifications in human cancers. MOLECULAR THERAPY. NUCLEIC ACIDS 2021; 25:67-82. [PMID: 34188972 PMCID: PMC8217334 DOI: 10.1016/j.omtn.2021.04.021] [Citation(s) in RCA: 62] [Impact Index Per Article: 15.5] [Indexed: 02/06/2023]
Abstract
As next-generation sequencing (NGS) leaps forward, more than 160 covalent RNA modification processes have been reported, and they are widely present in every organism and every RNA type. Many RNA modification processes introduce a new layer to gene regulation, resulting in novel RNA epigenetics. The most common RNA modifications include pseudouridine (Ψ), N7-methylguanosine (m7G), 5-hydroxymethylcytosine (hm5C), 5-methylcytosine (m5C), N1-methyladenosine (m1A), N6-methyladenosine (m6A), and others. In this study, we focus on non-coding RNAs (ncRNAs) to summarize the epigenetic consequences of RNA modifications, their roles in the pathogenesis of cancer, their use as diagnostic markers and therapeutic targets for cancer, and the mechanisms by which they affect the immune environment of cancer. In addition, we summarize the current status of epigenetic drugs for tumor therapy based on ncRNA modifications and the progress of bioinformatics methods in elucidating RNA modifications in recent years.
|
34
|
Zhang L, Qin X, Liu M, Xu Z, Liu G. DNN-m6A: A Cross-Species Method for Identifying RNA N6-Methyladenosine Sites Based on Deep Neural Network with Multi-Information Fusion. Genes (Basel) 2021; 12:354. [PMID: 33670877 PMCID: PMC7997228 DOI: 10.3390/genes12030354] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 01/22/2021] [Revised: 02/22/2021] [Accepted: 02/25/2021] [Indexed: 12/16/2022]
Abstract
As a prevalent post-transcriptional modification of RNA, N6-methyladenosine (m6A) plays a crucial role in various biological processes. To better reveal its regulatory mechanism and provide new insights for drug design, the accurate genome-wide identification of m6A sites is vital. As the traditional experimental methods are time-consuming and cost-prohibitive, it is necessary to design a more efficient computational method to detect m6A sites. In this study, we propose a novel cross-species computational method, DNN-m6A, based on the deep neural network (DNN) to identify m6A sites in multiple tissues of human, mouse and rat. Firstly, binary encoding (BE), tri-nucleotide composition (TNC), enhanced nucleic acid composition (ENAC), K-spaced nucleotide pair frequencies (KSNPFs), nucleotide chemical property (NCP), pseudo dinucleotide composition (PseDNC), position-specific nucleotide propensity (PSNP) and position-specific dinucleotide propensity (PSDP) are employed to extract RNA sequence features, which are subsequently fused to construct the initial feature vector set. Secondly, we use elastic net to eliminate redundant features while building the optimal feature subset. Finally, the hyper-parameters of the DNN are tuned with Bayesian hyper-parameter optimization based on the selected feature subset. The five-fold cross-validation test on training datasets shows that the proposed DNN-m6A method outperformed the state-of-the-art method for predicting m6A sites, with an accuracy (ACC) of 73.58%-83.38% and an area under the curve (AUC) of 81.39%-91.04%. Furthermore, on the independent datasets the method achieved an ACC of 72.95%-83.04% and an AUC of 80.79%-91.09%, which shows the excellent generalization ability of our proposed method.
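Of the encodings listed, K-spaced nucleotide pair frequencies are easy to illustrate. The sketch below assumes the usual definition: for a gap of k nucleotides, count each of the 16 ordered pairs formed by positions i and i + k + 1, then divide by the number of such position pairs. The function name `ksnpf` and the handling of non-ACGU characters (they are skipped) are assumptions for illustration, not details from the paper.

```python
from itertools import product

NUCLEOTIDES = "ACGU"


def ksnpf(sequence, k=0):
    """Frequencies of the 16 nucleotide pairs separated by k positions."""
    seq = sequence.upper()
    n_pairs = len(seq) - k - 1
    if n_pairs <= 0:
        raise ValueError("sequence is too short for this gap")
    counts = {a + b: 0 for a, b in product(NUCLEOTIDES, repeat=2)}
    for i in range(n_pairs):
        pair = seq[i] + seq[i + k + 1]
        if pair in counts:  # pairs containing other symbols are skipped
            counts[pair] += 1
    return {pair: c / n_pairs for pair, c in counts.items()}


features = ksnpf("ACGU", k=1)
print({pair: f for pair, f in features.items() if f > 0})
```

Concatenating the 16-dimensional vectors for several values of k gives a fixed-length feature that captures short-range pair correlations regardless of where they occur in the window.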
Affiliation(s)
- Lu Zhang
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China; (L.Z.); (X.Q.); (M.L.)
| | - Xinyi Qin
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China; (L.Z.); (X.Q.); (M.L.)
| | - Min Liu
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China; (L.Z.); (X.Q.); (M.L.)
| | - Ziwei Xu
- Polytech Nantes, Bâtiment Ireste, 44300 Nantes, France;
| | - Guangzhong Liu
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China; (L.Z.); (X.Q.); (M.L.)
|
35
|
Zhuang J, Liu D, Lin M, Qiu W, Liu J, Chen S. PseUdeep: RNA Pseudouridine Site Identification with Deep Learning Algorithm. Front Genet 2021; 12:773882. [PMID: 34868261 PMCID: PMC8637112 DOI: 10.3389/fgene.2021.773882] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 09/10/2021] [Accepted: 10/04/2021] [Indexed: 11/16/2022]
Abstract
Background: Pseudouridine (Ψ) is a common ribonucleotide modification that plays a significant role in many biological processes. The identification of Ψ modification sites is of great significance for research on disease mechanisms and biological processes, and machine learning algorithms are desirable because laboratory techniques are expensive and time-consuming. Results: In this work, we propose a deep learning framework, called PseUdeep, to identify Ψ sites in three species: H. sapiens, S. cerevisiae, and M. musculus. In this method, three encoding methods are used to extract the features of RNA sequences: one-hot encoding, K-tuple nucleotide frequency pattern, and position-specific nucleotide composition. The three feature matrices are convolved twice and fed into a capsule neural network and a bidirectional gated recurrent unit network with a self-attention mechanism for classification. Conclusion: Compared with other state-of-the-art methods, our model achieves the highest prediction accuracy on the independent testing data set S-200, where the accuracy improves by 12.38%, and on the independent testing data set H-200, where the accuracy improves by 0.68%. Moreover, the dimensions of the features we derive from the RNA sequences are only 109, 109, and 119 in H. sapiens, M. musculus, and S. cerevisiae, which is much smaller than those used in traditional algorithms. On evaluation via tenfold cross-validation and two independent testing data sets, PseUdeep outperforms the best traditional machine learning model available. PseUdeep source code and data sets are available at https://github.com/dan111262/PseUdeep.
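One-hot encoding, the first of the three encodings named above, maps each nucleotide to a four-element indicator vector. A minimal sketch follows; the column order A, C, G, U and the all-zero row for unknown symbols such as N are assumptions made for illustration.

```python
ALPHABET = "ACGU"  # assumed column order


def one_hot(sequence):
    """Encode an RNA sequence as a list of 4-element indicator rows."""
    rows = []
    for base in sequence.upper():
        # Symbols outside the alphabet (e.g. N) become an all-zero row.
        rows.append([1 if base == symbol else 0 for symbol in ALPHABET])
    return rows


matrix = one_hot("AUGN")
for base, row in zip("AUGN", matrix):
    print(base, row)
```

The resulting L x 4 matrix keeps positional information intact, which is what lets a convolutional layer learn motifs around the candidate site.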
Affiliation(s)
- Jujuan Zhuang
- College of Science, Dalian Maritime University, Dalian, China
| | - Danyang Liu
- College of Science, Dalian Maritime University, Dalian, China
| | - Meng Lin
- College of Science, Dalian Maritime University, Dalian, China
| | - Wenjing Qiu
- Electrical and Information Engineering, Anhui University of Technology, Anhui, China
- Geneis (Beijing) Co., Ltd., Beijing, China
| | | | - Size Chen
- Department of Oncology, The First Affiliated Hospital of Guangdong Pharmaceutical University, Guangzhou, China
- Guangdong Provincial Engineering Research Center for Esophageal Cancer Precise Therapy, The First Affiliated Hospital of Guangdong Pharmaceutical University, Guangzhou, China
- Central Laboratory, The First Affiliated Hospital of Guangdong Pharmaceutical University, Guangzhou, China
- *Correspondence: Size Chen,
|
36
|
Ao C, Yu L, Zou Q. Prediction of bio-sequence modifications and the associations with diseases. Brief Funct Genomics 2020; 20:1-18. [PMID: 33313647 DOI: 10.1093/bfgp/elaa023] [Citation(s) in RCA: 56] [Impact Index Per Article: 11.2] [Received: 10/25/2020] [Revised: 11/09/2020] [Accepted: 11/10/2020] [Indexed: 12/22/2022]
Abstract
Modifications of protein, RNA and DNA play an important role in many biological processes and are related to some diseases. Therefore, accurate identification and comprehensive understanding of protein, RNA and DNA modification sites can promote research on disease treatment and prevention. With the development of sequencing technology, the number of known sequences has continued to increase. In the past decade, many computational tools that can be used to predict protein, RNA and DNA modification sites have been developed. In this review, we comprehensively summarize the modification site predictors for these three types of biological sequence and their associations with diseases. The relevant web server is accessible at http://lab.malab.cn/~acy/PTM_data/, and some sample data on protein, RNA and DNA modification can be downloaded from that website.
|
37
|
Chen X, Xiong Y, Liu Y, Chen Y, Bi S, Zhu X. m5CPred-SVM: a novel method for predicting m5C sites of RNA. BMC Bioinformatics 2020; 21:489. [PMID: 33126851 PMCID: PMC7602301 DOI: 10.1186/s12859-020-03828-4] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 06/30/2020] [Accepted: 10/21/2020] [Indexed: 02/08/2023]
Abstract
BACKGROUND: As one of the most common post-transcriptional modifications (PTCM) in RNA, 5-cytosine-methylation plays important roles in many biological functions such as RNA metabolism and cell fate decision. Through accurate identification of 5-methylcytosine (m5C) sites on RNA, researchers can better understand the exact role of 5-cytosine-methylation in these biological functions. In recent years, computational methods for predicting m5C sites have attracted considerable interest because of their efficiency and low cost. However, neither the accuracy nor the efficiency of these methods is satisfactory yet, and both need further improvement. RESULTS: In this work, we have developed a new computational method, m5CPred-SVM, to identify m5C sites in three species, H. sapiens, M. musculus and A. thaliana. To build this model, we first collected benchmark datasets following three recently published methods. Then, six types of sequence-based features were generated based on RNA segments, and the sequential forward feature selection strategy was used to obtain the optimal feature subset. After that, the performance of models based on different learning algorithms was compared, and the model based on the support vector machine provided the highest prediction accuracy. Finally, our proposed method, m5CPred-SVM, was compared with several existing methods, and the result showed that m5CPred-SVM offered substantially higher prediction accuracy than previously published methods. It is expected that our method, m5CPred-SVM, can become a useful tool for accurate identification of m5C sites. CONCLUSION: In this study, by introducing position-specific propensity related features, we built a new model, m5CPred-SVM, to predict RNA m5C sites in three different species. The result shows that our model outperformed the existing state-of-the-art models. Our model is available for users through a web server at https://zhulab.ahu.edu.cn/m5CPred-SVM.
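Sequential forward feature selection, mentioned above, is a greedy procedure: start from an empty set, repeatedly add the single feature that most improves a score, and stop when no addition helps. The sketch below shows that loop in generic form. The scoring function, the feature names, and their contributions are hypothetical toy values; in practice the score would be something like cross-validated accuracy of a classifier trained on the candidate subset.

```python
def sequential_forward_selection(features, score, max_features=None):
    """Greedily add the feature that most improves score(subset)."""
    selected = []
    best_score = float("-inf")
    remaining = list(features)
    limit = len(remaining) if max_features is None else max_features
    while remaining and len(selected) < limit:
        # Score every one-feature extension of the current subset.
        candidates = [(score(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(candidates, key=lambda item: item[0])
        if top_score <= best_score:
            break  # no remaining feature improves the score
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
    return selected, best_score


# Toy additive score with hypothetical per-feature contributions.
contribution = {"psnp": 0.3, "kmer": 0.2, "noise": -0.1}
chosen, final_score = sequential_forward_selection(
    contribution, lambda subset: sum(contribution[f] for f in subset)
)
print(chosen, final_score)
```

Because the search is greedy, it evaluates on the order of n squared subsets rather than all 2^n, at the price of possibly missing features that are only useful in combination.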
Affiliation(s)
- Xiao Chen
- School of Sciences, Anhui Agricultural University, Hefei, 230036 Anhui China
| | - Yi Xiong
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240 China
| | - Yinbo Liu
- School of Sciences, Anhui Agricultural University, Hefei, 230036 Anhui China
| | - Yuqing Chen
- School of Sciences, Anhui Agricultural University, Hefei, 230036 Anhui China
| | - Shoudong Bi
- School of Sciences, Anhui Agricultural University, Hefei, 230036 Anhui China
| | - Xiaolei Zhu
- School of Sciences, Anhui Agricultural University, Hefei, 230036 Anhui China
|
38
|
Khan F, Khan M, Iqbal N, Khan S, Muhammad Khan D, Khan A, Wei DQ. Prediction of Recombination Spots Using Novel Hybrid Feature Extraction Method via Deep Learning Approach. Front Genet 2020; 11:539227. [PMID: 33093842 PMCID: PMC7527634 DOI: 10.3389/fgene.2020.539227] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 03/10/2020] [Accepted: 08/13/2020] [Indexed: 01/20/2023]
Abstract
Meiotic recombination is the driving force of evolutionary development and an important source of genetic variation. Meiotic recombination does not take place randomly along a chromosome but is concentrated in certain regions. Regions with a higher rate of meiotic recombination events are called hotspots, and regions where recombination events are less frequent are called coldspots. Prediction of meiotic recombination spots provides useful information about the basic functionality of inheritance and genome diversity. This study proposes an intelligent computational predictor called iRSpots-DNN for the identification of recombination spots. The proposed predictor is based on a novel feature extraction method and an optimized deep neural network (DNN). The DNN was employed as the classification engine, whereas the novel feature extraction method was developed to extract meaningful features for the identification of hotspots and coldspots across the yeast genome. Unlike previous algorithms, the proposed feature extraction avoids bias among the selected features and preserves the discriminative properties of the sequence along with the sequence-structure information. This study also considered other effective classifiers, namely support vector machine (SVM), K-nearest neighbor (KNN), and random forest (RF), to predict recombination spots. Experimental results on a benchmark dataset with 10-fold cross-validation showed that iRSpots-DNN achieved the highest accuracy, i.e., 95.81%. Additionally, the performance of the proposed iRSpots-DNN is significantly better than that of the existing predictors on a benchmark dataset. The relevant benchmark dataset and source code are freely available at: https://github.com/Fatima-Khan12/iRspot_DNN/tree/master/iRspot_DNN.
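The 10-fold cross-validation used for evaluation partitions the samples into ten folds and tests on each fold once while training on the other nine. The following is a generic sketch of that index bookkeeping, not the authors' evaluation code; the function name `k_fold_indices`, the fixed seed, and the sample count of 25 are illustrative assumptions.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k shuffled folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding by k gives folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(25, k=10))
for train, test in splits[:2]:
    print(len(train), "train /", len(test), "test")
```

Each sample is used for testing exactly once, so averaging the ten per-fold accuracies gives an estimate that uses all the data without ever testing on a training sample.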
Affiliation(s)
- Fatima Khan
- Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Pakistan
| | - Mukhtaj Khan
- Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Pakistan
| | - Nadeem Iqbal
- Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Pakistan
| | - Salman Khan
- Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Pakistan
| | - Dost Muhammad Khan
- Department of Statistics, Abdul Wali Khan University Mardan, Mardan, Pakistan
| | - Abbas Khan
- Department of Bioinformatics and Biological Statistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| | - Dong-Qing Wei
- Department of Bioinformatics and Biological Statistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China; State Key Laboratory of Microbial Metabolism, Shanghai-Islamabad-Belgrade Joint Innovation Center on Antibacterial Resistances, Joint Laboratory of International Cooperation in Metabolic and Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Ministry of Education, Shanghai, China; Peng Cheng Laboratory, Shenzhen, China
|
39
|
Ahmed S, Kabir M, Arif M, Khan ZU, Yu DJ. DeepPPSite: A deep learning-based model for analysis and prediction of phosphorylation sites using efficient sequence information. Anal Biochem 2020; 612:113955. [PMID: 32949607 DOI: 10.1016/j.ab.2020.113955] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 02/20/2020] [Revised: 08/30/2020] [Accepted: 09/11/2020] [Indexed: 12/29/2022]
Abstract
Phosphorylation is a ubiquitous type of post-translational modification (PTM) that occurs in both eukaryotic and prokaryotic cells, wherein a phosphate group binds to amino acid residues. These specific residues, i.e., serine (S), threonine (T), and tyrosine (Y), exhibit diverse functions at the molecular level. Recent studies have determined that some diseases such as cancer, diabetes, and neurodegenerative diseases are caused by abnormal phosphorylation. Based on its potential applications in biological research and drug development, the large-scale identification of phosphorylation sites has attracted interest. Existing wet-lab technologies for targeting phosphorylation sites are expensive and time-consuming. Thus, computational algorithms that can efficiently accelerate the annotation of phosphorylation sites from massive protein sequences are needed. Numerous machine learning-based methods have been implemented for phosphorylation site prediction. However, despite extensive efforts, existing computational approaches continue to have inadequate performance, particularly in terms of overall ACC, MCC, and AUC. In this paper, we report a novel deep learning-based predictor to overcome these performance hurdles, DeepPPSite, which was constructed using a stacked long short-term memory recurrent network for predicting phosphorylation sites. The proposed technique learns the protein representations from conjoint protein descriptors. The experimental results indicated that our model achieved superior performance on the training dataset for S, T and Y, with MCC values of 0.608, 0.602, and 0.558, respectively, using a 10-fold cross-validation test. We further determined the generalization efficacy of the proposed predictor DeepPPSite by conducting a rigorous independent test. The predictive MCC values were 0.358, 0.356, and 0.350 for the S, T, and Y phosphorylation sites, respectively. Rigorous cross-validation and independent validation tests for the three types of phosphorylation sites demonstrated that the designed DeepPPSite tool significantly outperforms state-of-the-art methods.
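The results above are reported as Matthews correlation coefficients. For reference, MCC is computed from the four confusion-matrix counts as (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The sketch below implements that standard formula; the convention of returning 0.0 when the denominator vanishes is a common choice assumed here, and the counts in the usage lines are made up for illustration.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # assumed convention when a row or column is empty
    return (tp * tn - fp * fn) / denominator


# Made-up counts: a perfect classifier, a coin flip, and an inverted one.
print(mcc(50, 50, 0, 0), mcc(25, 25, 25, 25), mcc(0, 0, 50, 50))
```

MCC ranges from -1 to 1 and stays informative when positive and negative sites are imbalanced, which is why it is often reported alongside accuracy for site-prediction tasks.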
Affiliation(s)
- Saeed Ahmed
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China.
| | - Muhammad Kabir
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China.
| | - Muhammad Arif
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China.
| | - Zaheer Ullah Khan
- School of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China.
| | - Dong-Jun Yu
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China.
|
40
|
Karthiya R, Khandelia P. m6A RNA Methylation: Ramifications for Gene Expression and Human Health. Mol Biotechnol 2020; 62:467-484. [PMID: 32840728 DOI: 10.1007/s12033-020-00269-5] [Citation(s) in RCA: 53] [Impact Index Per Article: 10.6] [Accepted: 08/14/2020] [Indexed: 12/12/2022]
Abstract
Cellular transcriptomes are frequently adorned with a variety of chemical modification marks, which in turn have a profound influence on their functioning. Of these modifications, the one that has attracted a lot of attention in recent years is m6A RNA methylation, leading to the development of RNA epigenetics or epitranscriptomics as a frontier research area. m6A RNA methylation is one of the most abundant reversible internal modifications seen in cellular RNAs. Studies in the last few years have shed light not only on the molecular machinery involved in m6A RNA methylation but also on the impact of this modification in regulating gene expression and hence biological processes. In this review, we emphasize the biological impact of this modification in normal organismal development and diseases.
Affiliation(s)
- R Karthiya
- Department of Biological Sciences, Birla Institute of Technology and Science, Pilani - Hyderabad Campus, Jawahar Nagar, Kapra Mandal, Medchal District, Hyderabad, Telangana, 500078, India
| | - Piyush Khandelia
- Department of Biological Sciences, Birla Institute of Technology and Science, Pilani - Hyderabad Campus, Jawahar Nagar, Kapra Mandal, Medchal District, Hyderabad, Telangana, 500078, India.
|
41
|
Liu L, Song B, Ma J, Song Y, Zhang SY, Tang Y, Wu X, Wei Z, Chen K, Su J, Rong R, Lu Z, de Magalhães JP, Rigden DJ, Zhang L, Zhang SW, Huang Y, Lei X, Liu H, Meng J. Bioinformatics approaches for deciphering the epitranscriptome: Recent progress and emerging topics. Comput Struct Biotechnol J 2020; 18:1587-1604. [PMID: 32670500 PMCID: PMC7334300 DOI: 10.1016/j.csbj.2020.06.010] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 02/08/2020] [Revised: 06/02/2020] [Accepted: 06/07/2020] [Indexed: 12/13/2022]
Abstract
Post-transcriptional RNA modification occurs on all types of RNA and plays a vital role in regulating every aspect of RNA function. Thanks to the development of high-throughput sequencing technologies, transcriptome-wide profiling of RNA modifications has been made possible. With the accumulation of a large number of high-throughput datasets, bioinformatics approaches have become increasingly critical for unraveling the epitranscriptome. We review here the recent progress in bioinformatics approaches for deciphering the epitranscriptome, including epitranscriptome data analysis techniques, RNA modification databases, disease-association inference, general functional annotation, and studies on RNA modification site prediction. We also discuss the limitations of existing approaches and offer some future perspectives.
Affiliation(s)
- Lian Liu
- School of Computer Sciences, Shaanxi Normal University, Xi’an, Shaanxi 710119, China
| | - Bowen Song
- Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu, 215123, China
- Institute of Integrative Biology, University of Liverpool, L69 7ZB Liverpool, United Kingdom
| | - Jiani Ma
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China
| | - Yi Song
- Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu, 215123, China
- Institute of Integrative Biology, University of Liverpool, L69 7ZB Liverpool, United Kingdom
| | - Song-Yao Zhang
- Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi’an, Shaanxi 710072, China
| | - Yujiao Tang
- Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu, 215123, China
- Institute of Integrative Biology, University of Liverpool, L69 7ZB Liverpool, United Kingdom
| | - Xiangyu Wu
- Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu, 215123, China
- Institute of Ageing & Chronic Disease, University of Liverpool, L7 8TX, Liverpool, United Kingdom
| | - Zhen Wei
- Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu, 215123, China
- Institute of Ageing & Chronic Disease, University of Liverpool, L7 8TX, Liverpool, United Kingdom
| | - Kunqi Chen
- Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu, 215123, China
- Institute of Ageing & Chronic Disease, University of Liverpool, L7 8TX, Liverpool, United Kingdom
| | - Jionglong Su
- Department of Mathematical Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu, 215123, China
| | - Rong Rong
- Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu, 215123, China
- Institute of Integrative Biology, University of Liverpool, L69 7ZB Liverpool, United Kingdom
| | - Zhiliang Lu
- Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu, 215123, China
- Institute of Integrative Biology, University of Liverpool, L69 7ZB Liverpool, United Kingdom
| | - João Pedro de Magalhães
- Institute of Ageing & Chronic Disease, University of Liverpool, L7 8TX, Liverpool, United Kingdom
| | - Daniel J. Rigden
- Institute of Integrative Biology, University of Liverpool, L69 7ZB Liverpool, United Kingdom
| | - Lin Zhang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China
| | - Shao-Wu Zhang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China
| | - Yufei Huang
- Department of Electrical and Computer Engineering, University of Texas at San Antonio, San Antonio, TX, 78249, USA
- Department of Epidemiology and Biostatistics, University of Texas Health Science Center at San Antonio, San Antonio, TX 78229, USA
| | - Xiujuan Lei
- School of Computer Sciences, Shaanxi Normal University, Xi’an, Shaanxi 710119, China
| | - Hui Liu
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China
| | - Jia Meng
- Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu, 215123, China
- AI University Research Centre, Xi’an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China
- Institute of Integrative Biology, University of Liverpool, L69 7ZB Liverpool, United Kingdom
|
42
|
Dou L, Li X, Ding H, Xu L, Xiang H. Prediction of m5C Modifications in RNA Sequences by Combining Multiple Sequence Features. Mol Ther Nucleic Acids 2020; 21:332-342. [PMID: 32645685 PMCID: PMC7340967 DOI: 10.1016/j.omtn.2020.06.004] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 02/18/2020] [Revised: 06/03/2020] [Accepted: 06/04/2020] [Indexed: 12/14/2022]
Abstract
5-Methylcytosine (m5C) is a well-known post-transcriptional modification that plays significant roles in biological processes, such as RNA metabolism, tRNA recognition, and stress responses. Traditional high-throughput techniques for identifying m5C sites are usually time consuming and expensive. In addition, the number of RNA sequences has grown explosively in the post-genomic era. Thus, machine-learning-based methods are urgently needed to quickly predict RNA m5C modifications with high accuracy. Here, we propose a novel support-vector-machine (SVM)-based tool, called iRNA-m5C_SVM, which combines multiple sequence features to identify m5C sites in Arabidopsis thaliana. Eight kinds of popular feature-extraction methods were first investigated systematically. Then, four well-performing features were incorporated to construct a comprehensive model, including position-specific propensity (PSP) (PSNP, PSDP, and PSTP, associated with frequencies of nucleotides, dinucleotides, and trinucleotides, respectively), nucleotide composition (nucleic acid, di-nucleotide, and tri-nucleotide compositions; NAC, DNC, and TNC, respectively), electron-ion interaction pseudopotentials of trinucleotide (PseEIIPs), and general parallel correlation pseudo-dinucleotide composition (PC-PseDNC-general). Evaluated accuracies over 10-fold cross-validation and independent tests reached 73.06% and 80.15%, respectively, the best predictive performance in A. thaliana among existing models. It is believed that the proposed model can be a promising alternative for further research on m5C modification sites in plants.
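To make the nucleotide-composition features named in this abstract (NAC, DNC, TNC) concrete, the following is a minimal sketch of k-mer frequency encoding for k = 1, 2, 3. It is an illustration under stated assumptions, not the iRNA-m5C_SVM implementation: the function names `kmer_composition` and `composition_features`, the A/C/G/U ordering, and the handling of ambiguous bases are all hypothetical choices.

```python
from itertools import product

ALPHABET = "ACGU"  # assumed RNA alphabet; T is mapped to U below


def kmer_composition(seq, k):
    """Return normalized frequencies of all 4**k k-mers in lexicographic order."""
    seq = seq.upper().replace("T", "U")
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # windows containing ambiguous bases (e.g. N) are skipped
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def composition_features(seq):
    """Concatenate NAC (k=1), DNC (k=2) and TNC (k=3) into one 84-dimensional vector."""
    return kmer_composition(seq, 1) + kmer_composition(seq, 2) + kmer_composition(seq, 3)
```

A vector produced this way could be passed to any classifier; which features are combined, and how they are weighted, is a modelling decision that this sketch does not address.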
Affiliation(s)
- Lijun Dou
- School of Automotive and Transportation Engineering, Shenzhen Polytechnic, Shenzhen, China; Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
| | - Xiaoling Li
- Department of Oncology, Heilongjiang Province Land Reclamation Headquarters General Hospital, Harbin, China
| | - Hui Ding
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
| | - Lei Xu
- School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen, China.
| | - Huaikun Xiang
- School of Automotive and Transportation Engineering, Shenzhen Polytechnic, Shenzhen, China.
|
43
|
Liu L, Lei X, Fang Z, Tang Y, Meng J, Wei Z. LITHOPHONE: Improving lncRNA Methylation Site Prediction Using an Ensemble Predictor. Front Genet 2020; 11:545. [PMID: 32582286 PMCID: PMC7297269 DOI: 10.3389/fgene.2020.00545] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 09/19/2019] [Accepted: 05/06/2020] [Indexed: 12/31/2022]
Abstract
N6-methyladenosine (m6A) is one of the most widely studied epigenetic modifications and plays an important role in many biological processes, such as splicing, RNA localization, and degradation. Studies have shown that m6A on lncRNA has important functions, including regulating the expression and functions of lncRNA, regulating the synthesis of pre-mRNA, promoting the proliferation of cancer cells, and affecting cell differentiation, among many others. Although a number of methods have been proposed to predict m6A RNA methylation sites, most of them were aimed at general m6A site prediction without considering the uniqueness of the lncRNA methylation prediction problem. Since many lncRNAs do not have a polyA tail and cannot be captured in the polyA selection step of the most widely adopted RNA-seq library preparation protocol, lncRNA methylation sites cannot be effectively captured and are thus likely to be significantly underrepresented in existing experimental data, affecting the accuracy of existing predictors. In this paper, we propose a new computational framework, LITHOPHONE, which stands for long noncoding RNA methylation sites prediction from sequence characteristics and genomic information with an ensemble predictor. We show that the methylation sites of lncRNA and mRNA exhibit different patterns in the extracted features and should be handled differently when making predictions. Owing to the experimental protocols used, the number of known lncRNA m6A sites is limited and insufficient to train a reliable predictor; thus, the performance can be improved by combining both lncRNA and mRNA data using an ensemble predictor. We show that the newly developed LITHOPHONE approach achieved a reasonably good performance when tested on independent datasets (AUC: 0.966 and 0.835 under full transcript and mature mRNA modes, respectively), marking a substantial improvement compared with existing methods.
Additionally, LITHOPHONE was applied to scan the entire human lncRNAome for all possible lncRNA m6A sites, and the results are freely accessible at: http://180.208.58.19/lith/.
Affiliation(s)
- Lian Liu
- School of Computer Sciences, Shaanxi Normal University, Xi'an, China
| | - Xiujuan Lei
- School of Computer Sciences, Shaanxi Normal University, Xi'an, China
| | - Zengqiang Fang
- School of Computer Sciences, Shaanxi Normal University, Xi'an, China
| | - Yujiao Tang
- Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, China
| | - Jia Meng
- Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, China
| | - Zhen Wei
- Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, China
|
44
|
Zhu X, He J, Zhao S, Tao W, Xiong Y, Bi S. A comprehensive comparison and analysis of computational predictors for RNA N6-methyladenosine sites of Saccharomyces cerevisiae. Brief Funct Genomics 2020; 18:367-376. [PMID: 31609411 DOI: 10.1093/bfgp/elz018] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 05/20/2019] [Revised: 07/07/2019] [Accepted: 07/15/2019] [Indexed: 12/16/2022]
Abstract
N6-methyladenosine (m6A) modification, one of the most common post-transcriptional modifications in RNAs, has been reported to be highly related to many biological processes. Over the past decade, several tools for m6A site prediction in Saccharomyces cerevisiae have been developed and are freely available online. However, the quality of predictions by these tools is difficult to quantify and compare. In this study, an independent dataset, M6Atest6540, was compiled to systematically evaluate nine publicly available m6A prediction tools for S. cerevisiae. The experimental results indicate that RAM-ESVM achieved the best performance on M6Atest6540; however, most models performed substantially worse than reported in the original papers. The benchmark dataset Met2614, which was used as the training dataset for the nine methods, was further analyzed by using a position bias index. The results demonstrated that the bias of dataset Met2614 differs significantly from that of the RNA segments around m6A sites recorded in RMBase. Moreover, newMet2614 was collected by randomly selecting RNA segments from non-redundant data recorded in RMBase, and three different kinds of features were extracted. The performances of the models built on Met2614 and newMet2614 with these features were compared, showing better generalization for models built on newMet2614. Our results also indicate that the position-specific propensity-based features outperform other features, although they are also easily over-fitted on a biased dataset.
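The position-specific propensity features mentioned in this abstract can be illustrated with a short sketch. It assumes one common formulation of position-specific nucleotide propensity, in which each position of a fixed-length window is scored by the difference between the nucleotide's frequency in positive and negative training sequences; the abstract does not give the exact definition used, and the names `position_frequencies`, `psnp_matrix` and `encode` are hypothetical.

```python
ALPHABET = "ACGU"  # assumed RNA alphabet; T is mapped to U


def position_frequencies(seqs):
    """Per-position nucleotide frequencies for equal-length sequences."""
    if not seqs:
        raise ValueError("need at least one sequence")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    counts = [dict.fromkeys(ALPHABET, 0) for _ in range(length)]
    for s in seqs:
        for j, base in enumerate(s.upper().replace("T", "U")):
            if base in counts[j]:  # ambiguous bases are ignored
                counts[j][base] += 1
    return [{b: c[b] / len(seqs) for b in ALPHABET} for c in counts]


def psnp_matrix(positives, negatives):
    """Frequency in positives minus frequency in negatives, per position and base."""
    fp = position_frequencies(positives)
    fn = position_frequencies(negatives)
    if len(fp) != len(fn):
        raise ValueError("positive and negative windows must have the same length")
    return [{b: fp[j][b] - fn[j][b] for b in ALPHABET} for j in range(len(fp))]


def encode(seq, matrix):
    """Encode one window as the propensity of its base at each position."""
    seq = seq.upper().replace("T", "U")
    if len(seq) != len(matrix):
        raise ValueError("sequence length must match the matrix")
    return [matrix[j].get(base, 0.0) for j, base in enumerate(seq)]
```

Because the matrix is estimated from labelled data, it would normally be built from the training split only; building it on a positionally biased dataset is one way such features can over-fit, which is the caution the abstract raises.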
Affiliation(s)
- Xiaolei Zhu
- School of Sciences, Anhui Agricultural University, Hefei, Anhui 230036, China; School of Life Sciences, Anhui University, Hefei, Anhui 230601, China
| | - Jingjing He
- School of Life Sciences, Anhui University, Hefei, Anhui 230601, China
| | - Shihao Zhao
- School of Sciences, Anhui Agricultural University, Hefei, Anhui 230036, China
| | - Wei Tao
- School of Sciences, Anhui Agricultural University, Hefei, Anhui 230036, China
| | - Yi Xiong
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Shoudong Bi
- School of Sciences, Anhui Agricultural University, Hefei, Anhui 230036, China
|
45
|
Zhu ZM, Huo FC, Pei DS. Function and evolution of RNA N6-methyladenosine modification. Int J Biol Sci 2020; 16:1929-1940. [PMID: 32398960 PMCID: PMC7211178 DOI: 10.7150/ijbs.45231] [Citation(s) in RCA: 79] [Impact Index Per Article: 15.8] [Received: 02/23/2020] [Accepted: 04/05/2020] [Indexed: 02/06/2023]
Abstract
N6-methyladenosine (m6A) has been identified as the most prevalent and abundant internal RNA modification, especially within eukaryotic mRNAs, and has attracted much attention in recent years because of its importance for regulating gene expression and deciding cell fate. m6A modification is installed by the RNA methyltransferases METTL3, METTL14 and WTAP (writers), removed by the demethylases FTO and ALKBH5 (erasers) and recognized by m6A binding proteins, such as YT521-B homology (YTH) domain-containing proteins (readers). Accumulating evidence shows that m6A RNA methylation participates in almost all aspects of RNA processing, implying an association with important bioprocesses. In this review, we mainly summarize and discuss the functional relevance and importance of m6A modification in cellular processes.
Affiliation(s)
- Zhi-Man Zhu
- Department of Pathology, Xuzhou Medical University, Xuzhou 221004, China
| | - Fu-Chun Huo
- Department of Pathology, Xuzhou Medical University, Xuzhou 221004, China
| | - Dong-Sheng Pei
- Department of Pathology, Xuzhou Medical University, Xuzhou 221004, China
|
46
|
Govindaraj RG, Subramaniyam S, Manavalan B. Extremely-randomized-tree-based Prediction of N6-Methyladenosine Sites in Saccharomyces cerevisiae. Curr Genomics 2020; 21:26-33. [PMID: 32655295 PMCID: PMC7324895 DOI: 10.2174/1389202921666200219125625] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 11/01/2019] [Revised: 12/28/2019] [Accepted: 01/24/2020] [Indexed: 02/07/2023]
Abstract
Introduction: N6-methyladenosine (m6A) is one of the most common post-transcriptional modifications in RNA, which has been related to several biological processes. The accurate prediction of m6A sites from RNA sequences is one of the challenging tasks in computational biology. Several computational methods utilizing machine-learning algorithms have been proposed that accelerate in silico screening of m6A sites, thereby drastically reducing the experimental time and labor costs involved. Methodology: In this study, we proposed a novel computational predictor termed ERT-m6Apred, for the accurate prediction of m6A sites. To identify the feature encodings with more discriminative capability, we applied a two-step feature selection technique on seven different feature encodings and identified the corresponding optimal feature set. Results: Subsequently, performance comparison of the corresponding optimal feature set-based extremely randomized tree model revealed that Pseudo k-tuple composition encoding, which includes 14 physicochemical properties, significantly outperformed other encodings. Moreover, ERT-m6Apred achieved an accuracy of 78.84% during cross-validation analysis, which is comparatively better than recently reported predictors. Conclusion: In summary, ERT-m6Apred predicts Saccharomyces cerevisiae m6A sites with higher accuracy, thus facilitating biological hypothesis generation and experimental validations.
Affiliation(s)
- Rajiv G Govindaraj
- 1 HotSpot Therapeutics, 50 Milk Street, 16 Floor, Boston, MA 02109, USA; 2 Research and Development Center, In-silicogen Inc., Yongin-si 16954, Gyeonggi-do, Republic of Korea; 3 Department of Biotechnology, Dr. N.G.P. Arts and Science College, Coimbatore, Tamil Nadu 641048, India; 4 Department of Physiology, Ajou University School of Medicine, Suwon, Republic of Korea
| | - Sathiyamoorthy Subramaniyam
- 1 HotSpot Therapeutics, 50 Milk Street, 16 Floor, Boston, MA 02109, USA; 2 Research and Development Center, In-silicogen Inc., Yongin-si 16954, Gyeonggi-do, Republic of Korea; 3 Department of Biotechnology, Dr. N.G.P. Arts and Science College, Coimbatore, Tamil Nadu 641048, India; 4 Department of Physiology, Ajou University School of Medicine, Suwon, Republic of Korea
| | - Balachandran Manavalan
- 1 HotSpot Therapeutics, 50 Milk Street, 16 Floor, Boston, MA 02109, USA; 2 Research and Development Center, In-silicogen Inc., Yongin-si 16954, Gyeonggi-do, Republic of Korea; 3 Department of Biotechnology, Dr. N.G.P. Arts and Science College, Coimbatore, Tamil Nadu 641048, India; 4 Department of Physiology, Ajou University School of Medicine, Suwon, Republic of Korea
|
47
|
Li Y, Wang J, Huang C, Shen M, Zhan H, Xu K. RNA N6-methyladenosine: a promising molecular target in metabolic diseases. Cell Biosci 2020; 10:19. [PMID: 32110378 PMCID: PMC7035649 DOI: 10.1186/s13578-020-00385-4] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 11/28/2019] [Accepted: 02/11/2020] [Indexed: 12/12/2022]
Abstract
N6-methyladenosine is a prevalent and abundant transcriptome modification that regulates various aspects of RNA biology, including transcription, translation, processing and metabolism. N6-methyladenosine methylation is closely associated with numerous cellular processes and plays important roles in physiological processes and in the development of diseases. The high prevalence of metabolic diseases poses a serious threat to human health, but their pathological mechanisms remain poorly understood. Recent studies have reported that the progression of metabolic diseases is closely related to the level of RNA N6-methyladenosine modification. In this review, we aim to summarize the biological and clinical significance of RNA N6-methyladenosine modification in metabolic diseases, including obesity, type 2 diabetes, non-alcoholic fatty liver disease, hypertension, cardiovascular diseases, osteoporosis and immune-related metabolic diseases.
Affiliation(s)
- Yan Li
- Hospital of Chengdu University of Traditional Chinese Medicine, Chengdu, 610072 Sichuan China
| | - Jiawen Wang
- Hospital of Chengdu University of Traditional Chinese Medicine, Chengdu, 610072 Sichuan China
| | - Chunyan Huang
- Houjie Hospital of Dongguan, Dongguan, 523945 Guangdong China
| | - Meng Shen
- Chengdu Tumor Hospital, Chengdu, 610041 Sichuan China
| | - Huakui Zhan
- Hospital of Chengdu University of Traditional Chinese Medicine, Chengdu, 610072 Sichuan China
| | - Keyang Xu
- Hangzhou Xixi Hospital Affiliated to Zhejiang Chinese Medical University, Hangzhou, 310023 Zhejiang China
|
48
|
Zhao J, Cao Y, Zhang L. Exploring the computational methods for protein-ligand binding site prediction. Comput Struct Biotechnol J 2020; 18:417-426. [PMID: 32140203 PMCID: PMC7049599 DOI: 10.1016/j.csbj.2020.02.008] [Citation(s) in RCA: 103] [Impact Index Per Article: 20.6] [Received: 10/14/2019] [Revised: 01/23/2020] [Accepted: 02/11/2020] [Indexed: 12/21/2022]
Abstract
Proteins participate in various essential processes in vivo via interactions with other molecules. Identifying the residues participating in these interactions not only provides biological insights for protein function studies but also has great significance for drug discoveries. Therefore, predicting protein-ligand binding sites has long been under intense research in the fields of bioinformatics and computer aided drug discovery. In this review, we first introduce the research background of predicting protein-ligand binding sites and then classify the methods into four categories, namely, 3D structure-based, template similarity-based, traditional machine learning-based and deep learning-based methods. We describe representative algorithms in each category and elaborate on machine learning and deep learning-based prediction methods in more detail. Finally, we discuss the trends and challenges of the current research such as molecular dynamics simulation based cryptic binding sites prediction, and highlight prospective directions for the near future.
Affiliation(s)
- Jingtian Zhao
- College of Computer Science, Sichuan University, Chengdu 610065, China
| | - Yang Cao
- Center of Growth, Metabolism and Aging, Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
| | - Le Zhang
- College of Computer Science, Sichuan University, Chengdu 610065, China
| |
Collapse
|
49
|
Wu P, Mo Y, Peng M, Tang T, Zhong Y, Deng X, Xiong F, Guo C, Wu X, Li Y, Li X, Li G, Zeng Z, Xiong W. Emerging role of tumor-related functional peptides encoded by lncRNA and circRNA. Mol Cancer 2020; 19:22. [PMID: 32019587 PMCID: PMC6998289 DOI: 10.1186/s12943-020-1147-3] [Citation(s) in RCA: 371] [Impact Index Per Article: 74.2] [Received: 11/19/2019] [Accepted: 01/28/2020] [Indexed: 02/08/2023]
Abstract
Non-coding RNAs do not encode proteins and regulate various oncological processes. They are also important potential cancer diagnostic and prognostic biomarkers. Bioinformatics and translation omics have begun to elucidate the roles and modes of action of the functional peptides encoded by ncRNA. Here, recent advances in long non-coding RNA (lncRNA) and circular RNA (circRNA)-encoded small peptides are compiled and synthesized. We introduce both the computational and analytical methods used to predict prospective ncRNAs encoding oncologically functional oligopeptides. We also present numerous specific lncRNA and circRNA-encoded proteins and their cancer-promoting or cancer-inhibiting molecular mechanisms. This information may expedite the discovery, development, and optimization of novel and efficacious cancer diagnostic, therapeutic, and prognostic protein-based tools derived from non-coding RNAs. Functional peptides encoded by ncRNAs offer promising applications, and pose potential challenges, in cancer research. The aim of this review is to provide a theoretical basis and relevant references that may promote the discovery of more functional peptides encoded by ncRNAs and support the development of novel anticancer therapeutic targets, as well as diagnostic and prognostic cancer markers.
Affiliation(s)
- Pan Wu
- NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Translational Radiation Oncology, Hunan Cancer Hospital and The Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, Hunan, China
- Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
- Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Disease Genome Research Center, the Third Xiangya Hospital, Central South University, Changsha, Hunan, China
| | - Yongzhen Mo
- Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
| | - Miao Peng
- Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
| | - Ting Tang
- Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
| | - Yu Zhong
- Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
| | - Xiangying Deng
- Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
| | - Fang Xiong
- Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
| | - Can Guo
- Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
| | - Xu Wu
- NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Translational Radiation Oncology, Hunan Cancer Hospital and The Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, Hunan, China
- Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
| | - Yong Li
- Department of Medicine, Dan L Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas, USA
| | - Xiaoling Li
- Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
| | - Guiyuan Li
- NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Translational Radiation Oncology, Hunan Cancer Hospital and The Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, Hunan, China
- Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
- Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Disease Genome Research Center, the Third Xiangya Hospital, Central South University, Changsha, Hunan, China
| | - Zhaoyang Zeng
- NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Translational Radiation Oncology, Hunan Cancer Hospital and The Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, Hunan, China
- Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China
- Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Disease Genome Research Center, the Third Xiangya Hospital, Central South University, Changsha, Hunan, China
| | - Wei Xiong
- NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Translational Radiation Oncology, Hunan Cancer Hospital and The Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, Hunan, China.
- Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, Hunan, China.
- Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Disease Genome Research Center, the Third Xiangya Hospital, Central South University, Changsha, Hunan, China.
|
50
|
Liu L, Lei X, Meng J, Wei Z. WITMSG: Large-scale Prediction of Human Intronic m6A RNA Methylation Sites from Sequence and Genomic Features. Curr Genomics 2020; 21:67-76. [PMID: 32655300 PMCID: PMC7324894 DOI: 10.2174/1389202921666200211104140] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 10/24/2019] [Revised: 01/14/2020] [Accepted: 01/27/2020] [Indexed: 02/07/2023]
Abstract
INTRODUCTION: N6-methyladenosine (m6A) is one of the most widely studied epigenetic modifications. It plays important roles in various biological processes, such as splicing, RNA localization and degradation, many of which are related to the functions of introns. Although a number of computational approaches have been proposed to predict m6A sites in different species, none of them was optimized for intronic m6A sites. As existing experimental data overwhelmingly relied on polyA selection in sample preparation and intronic RNAs are usually underrepresented in the captured RNA library, the accuracy of general m6A site prediction approaches is limited for the intronic m6A site prediction task. METHODOLOGY: A computational framework, WITMSG, dedicated to the large-scale prediction of intronic m6A RNA methylation sites in humans is proposed here for the first time. Based on the random forest algorithm and using only known intronic m6A sites as the training data, WITMSG takes advantage of both conventional sequence features and a variety of genomic characteristics for improved prediction performance on intron-specific m6A sites. RESULTS AND CONCLUSION: WITMSG outperformed competing approaches (trained with all the m6A sites or intronic m6A sites only) in 10-fold cross-validation (AUC: 0.940) and when tested on independent datasets (AUC: 0.946). WITMSG was also applied intronome-wide in humans to predict all possible intronic m6A sites, and the prediction results are freely accessible at http://rnamd.com/intron/.
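The AUC values quoted in this abstract are areas under the ROC curve. As a reminder of what that number measures, here is a minimal sketch that computes AUC as the probability that a randomly chosen positive site receives a higher score than a randomly chosen negative site, with ties counted as one half. The function name `roc_auc` and the toy inputs are assumptions for illustration; nothing here reproduces the WITMSG evaluation.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 1 (modified site) or 0 (unmodified site)
    scores: predicted scores, higher meaning more likely positive
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute half a win
    return wins / (len(pos) * len(neg))
```

The pairwise form is quadratic in the number of examples, so it suits small illustrations only; for large test sets a rank-based computation gives the same value more efficiently.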
Affiliation(s)
| | - Xiujuan Lei
- Address correspondence to these authors at the School of Computer Sciences, Shaanxi Normal University, Xi’an, Shaanxi, 710119, China; and Department of Biological Sciences, Xi’an Jiaotong-Liverpool University, Suzhou, Jiangsu, 215123, China
| | - Zhen Wei
- Address correspondence to these authors at the School of Computer Sciences, Shaanxi Normal University, Xi’an, Shaanxi, 710119, China; and Department of Biological Sciences, Xi’an Jiaotong-Liverpool University, Suzhou, Jiangsu, 215123, China
|