Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Sato K, Kato Y. Prediction of RNA secondary structure including pseudoknots for long sequences. Brief Bioinform 2021;23:6380459. [PMID: 34601552 PMCID: PMC8769711 DOI: 10.1093/bib/bbab395] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 06/16/2021] [Revised: 08/13/2021] [Accepted: 08/30/2021] [Indexed: 12/28/2022] Open

Cited by Other Article(s)

Omnes L, Angel E, Bartet P, Radvanyi F, Tahi F. A divide-and-conquer approach based on deep learning for long RNA secondary structure prediction: Focus on pseudoknots identification. PLoS One 2025;20:e0314837. [PMID: 40279361 PMCID: PMC12026937 DOI: 10.1371/journal.pone.0314837] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2024] [Accepted: 03/04/2025] [Indexed: 04/27/2025] Open

Yang J, Sato K, Loza M, Park SJ, Nakai K. RNA secondary structure prediction by conducting multi-class classifications. Comput Struct Biotechnol J 2025;27:1449-1459. [PMID: 40256169 PMCID: PMC12008525 DOI: 10.1016/j.csbj.2025.04.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2024] [Revised: 03/29/2025] [Accepted: 04/01/2025] [Indexed: 04/22/2025] Open

La Rosa M, Fiannaca A, Mendolia I, La Paglia L, Urso A. GL4SDA: Predicting snoRNA-disease associations using GNNs and LLM embeddings. Comput Struct Biotechnol J 2025;27:1023-1033. [PMID: 40160859 PMCID: PMC11952811 DOI: 10.1016/j.csbj.2025.03.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2024] [Revised: 03/04/2025] [Accepted: 03/08/2025] [Indexed: 04/02/2025] Open

Abstract

Small nucleolar RNAs (snoRNAs) play essential roles in various cellular processes, and their associations with diseases are increasingly recognized. Identifying these snoRNA-disease relationships is critical for advancing our understanding of their functional roles and potential therapeutic implications. This work presents a novel approach, called GL4SDA, to predict snoRNA-disease associations using Graph Neural Networks (GNN) and Large Language Models. Our methodology leverages the unique strengths of heterogeneous graph structures to model complex biological interactions. Differently from existing methods, we define a set of features able to capture deeper information content related to the inner attributes of both snoRNAs and diseases and design a GNN model based on highly performing layers, which can maximize results on this representation. We consider snoRNA secondary structures and disease embeddings derived from large language models to obtain snoRNAs and disease node features, respectively. By combining structural features of snoRNAs with rich semantic embeddings of diseases, we construct a feature-rich graph representation that improves the predictive performance of our model. We evaluate our approach using different architectures that exploit the capabilities of many graph convolutional layers and compare the results with three other state-of-the-art graph-based predictors. GL4SDA demonstrates improved scores in link prediction tasks and demonstrates its potential implication as a tool for exploring snoRNA-disease relationships. We also validate our findings through biological case studies about cancer diseases, highlighting the practical application of our method in real-world scenarios and obtaining the most important snoRNA features using explainable artificial intelligence methods.

Kagaya Y, Zhang Z, Ibtehaz N, Wang X, Nakamura T, Punuru PD, Kihara D. NuFold: end-to-end approach for RNA tertiary structure prediction with flexible nucleobase center representation. Nat Commun 2025;16:881. [PMID: 39837861 PMCID: PMC11751094 DOI: 10.1038/s41467-025-56261-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2024] [Accepted: 01/13/2025] [Indexed: 01/23/2025] Open

Maghraby A, Alzalaty M. Genome-wide identification, characterization, and functional analysis of the CHX, SOS, and RLK genes in Solanum lycopersicum under salt stress. Sci Rep 2025;15:1142. [PMID: 39774029 PMCID: PMC11707246 DOI: 10.1038/s41598-024-83221-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2024] [Accepted: 12/12/2024] [Indexed: 01/11/2025] Open

Abstract

The cation/proton exchanger (CHX), salt overly sensitive (SOS), and receptor-like kinase (RLK) genes play significant roles in the response to salt stress in plants. This study is the first to identify the SOS gene in Solanum lycopersicum (tomato) through genome-wide analysis under salt stress conditions. Quantitative reverse transcription PCR (qRT-PCR) results indicated that the expression levels of CHX, SOS, and RLK genes were upregulated, with fold changes of 1.83, 1.49, and 1.55, respectively, after 12 h of exposure to salt stress. Genome-wide analysis revealed 21 CHX, 5 SOS, and 86 RLK genes in S. lycopersicum. CHX genes were found on chromosomes 2, 3, 4, 5, 6, 7, 8, 9, 11, and 12 of S. lycopersicum. SOS genes were found on chromosomes 1, 4, 6, and 10. RLK genes were found on all chromosomes of S. lycopersicum. The Ka/Ks ratios indicate that the CHX, SOS, and RLK genes have been primarily influenced by purifying selection. This suggests that these genes have faced strong environmental pressures throughout their evolution. Purifying selection typically results in a decrease in genetic diversity. The estimated duplication time for CHX paralogous gene pairs ranged from approximately 26.965 to 245.413 million years ago (Mya), while the duplication time for SOS paralogous gene pairs ranged from around 116.682 to 275.631 Mya. For RLK paralogous gene pairs, the duplication time varied from approximately 27.689 to 239.376 Mya. Synteny analysis of the CHX, SOS, and RLK genes demonstrated collinear relationships with orthologous genes in Arabidopsis thaliana, but no collinearity orthologous relationships in Oryza sativa (rice). Furthermore, the analysis revealed that there were 6 orthologous SlCHX genes, 2 orthologous SlSOS genes, and 44 orthologous SlRLK genes paired with those in A. thaliana. The results of the present study may help to elucidate the role of the CHX, SOS, and RLK genes in salt stress in S. lycopersicum.

Oleynikov M, Jaffrey SR. RNA tertiary structure and conformational dynamics revealed by BASH MaP. eLife 2024;13:RP98540. [PMID: 39625751 PMCID: PMC11614387 DOI: 10.7554/elife.98540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/06/2024] Open

Boon WX, Sia BZ, Ng CH. Prediction of the effects of the top 10 synonymous mutations from 26645 SARS-CoV-2 genomes of early pandemic phase. F1000Res 2024;10:1053. [PMID: 39268187 PMCID: PMC11391198 DOI: 10.12688/f1000research.72896.3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/11/2024] [Indexed: 09/15/2024] Open

Abstract

Background

The emergence of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) had led to a global pandemic since December 2019. SARS-CoV-2 is a single-stranded RNA virus, which mutates at a higher rate. Multiple works had been done to study nonsynonymous mutations, which change protein sequences. However, there is little study on the effects of SARS-CoV-2 synonymous mutations, which may affect viral fitness. This study aims to predict the effect of synonymous mutations on the SARS-CoV-2 genome.

Methods

A total of 26645 SARS-CoV-2 genomic sequences retrieved from Global Initiative on Sharing all Influenza Data (GISAID) database were aligned using MAFFT. Then, the mutations and their respective frequency were identified. Multiple RNA secondary structures prediction tools, namely RNAfold, IPknot++ and MXfold2 were applied to predict the effect of the mutations on RNA secondary structure and their base pair probabilities was estimated using MutaRNA. Relative synonymous codon usage (RSCU) analysis was also performed to measure the codon usage bias (CUB) of SARS-CoV-2.

Results

A total of 150 synonymous mutations were identified. The synonymous mutation identified with the highest frequency is C3037U mutation in the nsp3 of ORF1a. Of these top 10 highest frequency synonymous mutations, C913U, C3037U, U16176C and C18877U mutants show pronounced changes between wild type and mutant in all 3 RNA secondary structure prediction tools, suggesting these mutations may have some biological impact on viral fitness. These four mutations show changes in base pair probabilities. All mutations except U16176C change the codon to a more preferred codon, which may result in higher translation efficiency.

Conclusion

Synonymous mutations in SARS-CoV-2 genome may affect RNA secondary structure, changing base pair probabilities and possibly resulting in a higher translation rate. However, lab experiments are required to validate the results obtained from prediction analysis.

Fallah A, Havaei SA, Sedighian H, Kachuei R, Fooladi AAI. Prediction of aptamer affinity using an artificial intelligence approach. J Mater Chem B 2024;12:8825-8842. [PMID: 39158322 DOI: 10.1039/d4tb00909f] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/20/2024]

Qi F, Chen J, Chen Y, Sun J, Lin Y, Chen Z, Kapranov P. Evaluating Performance of Different RNA Secondary Structure Prediction Programs Using Self-cleaving Ribozymes. Genomics Proteomics Bioinformatics 2024;22:qzae043. [PMID: 39317944 PMCID: PMC12016570 DOI: 10.1093/gpbjnl/qzae043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2022] [Revised: 03/02/2024] [Accepted: 06/05/2024] [Indexed: 09/26/2024]

Oleynikov M, Jaffrey SR. RNA tertiary structure and conformational dynamics revealed by BASH MaP. bioRxiv 2024:2024.04.11.589009. [PMID: 38645201 PMCID: PMC11030352 DOI: 10.1101/2024.04.11.589009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/23/2024]

Huang X, Du Z. Possible involvement of three-stemmed pseudoknots in regulating translational initiation in human mRNAs. PLoS One 2024;19:e0307541. [PMID: 39038036 PMCID: PMC11262651 DOI: 10.1371/journal.pone.0307541] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2024] [Accepted: 07/08/2024] [Indexed: 07/24/2024] Open

Abstract

RNA pseudoknots play a crucial role in various cellular functions. Established pseudoknots show significant variation in both size and structural complexity. Specifically, three-stemmed pseudoknots are characterized by an additional stem-loop embedded in their structure. Recent findings highlight these pseudoknots as bacterial riboswitches and potent stimulators for programmed ribosomal frameshifting in RNA viruses like SARS-CoV2. To investigate the possible presence of functional three-stemmed pseudoknots in human mRNAs, we employed in-house developed computational methods to detect such structures within a dataset comprising 21,780 full-length human mRNA sequences. Numerous three-stemmed pseudoknots were identified. A selected set of 14 potential instances are presented, in which the start codon of the mRNA is found in close proximity either upstream, downstream, or within the identified three-stemmed pseudoknot. These pseudoknots likely play a role in translational initiation regulation. The probability of their existence gains support from their ranking as the most stable pseudoknot identified in the entire mRNA sequence, structural conservation across homologous mRNAs, stereochemical feasibility as demonstrated by structural modeling, and classification as members of the CPK-1 pseudoknot family, which includes many well-established pseudoknots. Furthermore, in four of the mRNAs, two or three closely spaced or tandem three-stemmed pseudoknots were identified. These findings suggest the frequent occurrence of three-stemmed pseudoknots in human mRNAs. A stepwise co-transcriptional folding mechanism is proposed for the formation of a three-stemmed pseudoknot structure. Our results not only provide fresh insights into the structures and functions of pseudoknots but also unveil the potential to target pseudoknots for treating human diseases.

Kolaitis A, Makris E, Karagiannis AA, Tsanakas P, Pavlatos C. Knotify_V2.0: Deciphering RNA Secondary Structures with H-Type Pseudoknots and Hairpin Loops. Genes (Basel) 2024;15:670. [PMID: 38927606 PMCID: PMC11203014 DOI: 10.3390/genes15060670] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2024] [Revised: 05/19/2024] [Accepted: 05/22/2024] [Indexed: 06/28/2024] Open

Abstract

Accurately predicting the pairing order of bases in RNA molecules is essential for anticipating RNA secondary structures. Consequently, this task holds significant importance in unveiling previously unknown biological processes. The urgent need to comprehend RNA structures has been accentuated by the unprecedented impact of the widespread COVID-19 pandemic. This paper presents a framework, Knotify_V2.0, which makes use of syntactic pattern recognition techniques in order to predict RNA structures, with a specific emphasis on tackling the demanding task of predicting H-type pseudoknots that encompass bulges and hairpins. By leveraging the expressive capabilities of a Context-Free Grammar (CFG), the suggested framework integrates the inherent benefits of CFG and makes use of minimum free energy and maximum base pairing criteria. This integration enables the effective management of this inherently ambiguous task. The main contribution of Knotify_V2.0 compared to earlier versions lies in its capacity to identify additional motifs like bulges and hairpins within the internal loops of the pseudoknot. Notably, the proposed methodology, Knotify_V2.0, demonstrates superior accuracy in predicting core stems compared to state-of-the-art frameworks. Knotify_V2.0 exhibited exceptional performance by accurately identifying both core base pairing that form the ground truth pseudoknot in 70% of the examined sequences. Furthermore, Knotify_V2.0 narrowed the performance gap with Knotty, which had demonstrated better performance than Knotify and even surpassed it in Recall and F1-score metrics. Knotify_V2.0 achieved a higher count of true positives (tp) and a significantly lower count of false negatives (fn) compared to Knotify, highlighting improvements in Prediction and Recall metrics, respectively. Consequently, Knotify_V2.0 achieved a higher F1-score than any other platform. The source code and comprehensive implementation details of Knotify_V2.0 are publicly available on GitHub.

Maghraby A, Alzalaty M. Genome-wide identification and evolutionary analysis of the AP2/EREBP, COX and LTP genes in Zea mays L. under drought stress. Sci Rep 2024;14:7610. [PMID: 38556556 PMCID: PMC10982304 DOI: 10.1038/s41598-024-57376-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2024] [Accepted: 03/18/2024] [Indexed: 04/02/2024] Open

Abstract

AP2 (APETALA2)/EREBP (ethylene-responsive element-binding protein), cytochrome c oxidase (COX) and nonspecific lipid transfer proteins (LTP) play important roles in the response to drought stress. This is the first study to identify the COX gene in Zea mays L. via genome-wide analysis. The qRT‒PCR results indicated that AP2/EREBP, COX and LTP were downregulated, with fold changes of 0.84, 0.53 and 0.31, respectively, after 12 h of drought stress. Genome-wide analysis identified 78 AP2/EREBP, 6 COX and 10 LTP genes in Z. mays L. Domain analysis confirmed the presence of the AP2 domain, Cyt_c_Oxidase_Vb domain and nsLTP1 in the AP2/EREBP, COX and LTP proteins, respectively. The AP2/EREBP protein family (AP2) includes five different domain types: the AP2/ERF domain, the EREBP-like factor (EREBP), the ethylene responsive factor (ERF), the dehydration responsive element binding protein (DREB) and the SHN SHINE. Synteny analysis of the AP2/EREBP, COX and LTP genes revealed collinearity orthologous relationships in O. sativa, H. vulgare and A. thaliana. AP2/EREBP genes were found on the 10 chromosomes of Z. mays L. COX genes were found on chromosomes 1, 3, 4, 5, 7 and 8. LTP genes were found on chromosomes 1, 3, 6, 8, 9 and 10. In the present study, the Ka/Ks ratios of the AP2/EREBP paralogous pairs indicated that the AP2/EREBP genes were influenced primarily by purifying selection, which indicated that the AP2/EREBP genes received strong environmental pressure during evolution. The Ka/Ks ratios of the COX-3/COX-4 paralogous pairs indicate that the COX-3/COX-4 genes were influenced primarily by Darwinian selection (driving change). For the LTP genes, the Ka/Ks ratios of the LTP-1/LTP-10, LTP-5/LTP-3 and LTP-4/LTP-8 paralogous pairs indicate that these genes were influenced primarily by purifying selection, while the Ka/Ks ratios of the LTP-2/LTP-6 paralogous pairs indicate that these genes were influenced primarily by Darwinian selection. 
The duplication time of the AP2/EREBP paralogous gene pairs in Z. mays L. ranged from approximately 9.364 to 100.935 Mya. The duplication time of the COX-3/COX-4 paralogous gene pair was approximately 5.217 Mya. The duplication time of the LTP paralogous gene pairs ranged from approximately 19.064 to 96.477 Mya. The major focus of research is to identify the genes that are responsible for drought stress tolerance to improve maize for drought stress tolerance. The results of the present study will improve the understanding of the functions of the AP2/EREBP, COX and LTP genes in response to drought stress.

Gong T, Ju F, Bu D. Accurate prediction of RNA secondary structure including pseudoknots through solving minimum-cost flow with learned potentials. Commun Biol 2024;7:297. [PMID: 38461362 PMCID: PMC10924946 DOI: 10.1038/s42003-024-05952-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Accepted: 02/21/2024] [Indexed: 03/11/2024] Open

Loyer G, Reinharz V. Concurrent prediction of RNA secondary structures with pseudoknots and local 3D motifs in an integer programming framework. Bioinformatics 2024;40:btae022. [PMID: 38230755 PMCID: PMC10868335 DOI: 10.1093/bioinformatics/btae022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Revised: 11/30/2023] [Accepted: 01/12/2024] [Indexed: 01/18/2024] Open

Rocca R, Grillone K, Citriniti EL, Gualtieri G, Artese A, Tagliaferri P, Tassone P, Alcaro S. Targeting non-coding RNAs: Perspectives and challenges of in-silico approaches. Eur J Med Chem 2023;261:115850. [PMID: 37839343 DOI: 10.1016/j.ejmech.2023.115850] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/25/2023] [Revised: 09/08/2023] [Accepted: 09/29/2023] [Indexed: 10/17/2023]

Ballarino M, Pepe G, Helmer-Citterich M, Palma A. Exploring the landscape of tools and resources for the analysis of long non-coding RNAs. Comput Struct Biotechnol J 2023;21:4706-4716. [PMID: 37841333 PMCID: PMC10568309 DOI: 10.1016/j.csbj.2023.09.041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2023] [Revised: 09/28/2023] [Accepted: 09/28/2023] [Indexed: 10/17/2023] Open

Sato K, Hamada M. Recent trends in RNA informatics: a review of machine learning and deep learning for RNA secondary structure prediction and RNA drug discovery. Brief Bioinform 2023;24:bbad186. [PMID: 37232359 PMCID: PMC10359090 DOI: 10.1093/bib/bbad186] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Received: 01/16/2023] [Revised: 04/24/2023] [Accepted: 04/25/2023] [Indexed: 05/27/2023] Open

Lin BC, Katneni U, Jankowska KI, Meyer D, Kimchi-Sarfaty C. In silico methods for predicting functional synonymous variants. Genome Biol 2023;24:126. [PMID: 37217943 PMCID: PMC10204308 DOI: 10.1186/s13059-023-02966-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/03/2022] [Accepted: 05/10/2023] [Indexed: 05/24/2023] Open

Makris E, Kolaitis A, Andrikos C, Moulos V, Tsanakas P, Pavlatos C. Knotify+: Toward the Prediction of RNA H-Type Pseudoknots, Including Bulges and Internal Loops. Biomolecules 2023;13:biom13020308. [PMID: 36830677 PMCID: PMC9953189 DOI: 10.3390/biom13020308] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2022] [Revised: 01/25/2023] [Accepted: 02/01/2023] [Indexed: 02/09/2023] Open

Fukunaga T, Hamada M. LinAliFold and CentroidLinAliFold: fast RNA consensus secondary structure prediction for aligned sequences using beam search methods. Bioinform Adv 2022;2:vbac078. [PMID: 36699418 PMCID: PMC9710674 DOI: 10.1093/bioadv/vbac078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2022] [Revised: 10/13/2022] [Accepted: 10/21/2022] [Indexed: 11/05/2022]

Bugnon LA, Edera AA, Prochetto S, Gerard M, Raad J, Fenoy E, Rubiolo M, Chorostecki U, Gabaldón T, Ariel F, Di Persia LE, Milone DH, Stegmayer G. Secondary structure prediction of long noncoding RNA: review and experimental comparison of existing approaches. Brief Bioinform 2022;23:6606044. [PMID: 35692094 DOI: 10.1093/bib/bbac205] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 03/01/2022] [Revised: 05/02/2022] [Accepted: 05/04/2022] [Indexed: 11/12/2022] Open

Abstract

MOTIVATION

In contrast to messenger RNAs, the function of the wide range of existing long noncoding RNAs (lncRNAs) largely depends on their structure, which determines interactions with partner molecules. Thus, the determination or prediction of the secondary structure of lncRNAs is critical to uncover their function. Classical approaches for predicting RNA secondary structure have been based on dynamic programming and thermodynamic calculations. In the last 4 years, a growing number of machine learning (ML)-based models, including deep learning (DL), have achieved breakthrough performance in structure prediction of biomolecules such as proteins and have outperformed classical methods in short transcripts folding. Nevertheless, the accurate prediction for lncRNA still remains far from being effectively solved. Notably, the myriad of new proposals has not been systematically and experimentally evaluated.

RESULTS

In this work, we compare the performance of the classical methods as well as the most recently proposed approaches for secondary structure prediction of RNA sequences using a unified and consistent experimental setup. We use the publicly available structural profiles for 3023 yeast RNA sequences, and a novel benchmark of well-characterized lncRNA structures from different species. Moreover, we propose a novel metric to assess the predictive performance of methods, exclusively based on the chemical probing data commonly used for profiling RNA structures, avoiding any potential bias incorporated by computational predictions when using dot-bracket references. Our results provide a comprehensive comparative assessment of existing methodologies, and a novel and public benchmark resource to aid in the development and comparison of future approaches.

AVAILABILITY

Full source code and benchmark datasets are available at: https://github.com/sinc-lab/lncRNA-folding.

CONTACT

lbugnon@sinc.unl.edu.ar.

Affiliation(s)

L A Bugnon Research Institute for Signals, Systems and Computational Intelligence sinc(i) (CONICET-UNL), Ciudad Universitaria, Santa Fe, Argentina
A A Edera Research Institute for Signals, Systems and Computational Intelligence sinc(i) (CONICET-UNL), Ciudad Universitaria, Santa Fe, Argentina
S Prochetto Research Institute for Signals, Systems and Computational Intelligence sinc(i) (CONICET-UNL), Ciudad Universitaria, Santa Fe, Argentina; IAL, CONICET, Ciudad Universitaria UNL, (3000) Santa Fe, Argentina
M Gerard Research Institute for Signals, Systems and Computational Intelligence sinc(i) (CONICET-UNL), Ciudad Universitaria, Santa Fe, Argentina
J Raad Research Institute for Signals, Systems and Computational Intelligence sinc(i) (CONICET-UNL), Ciudad Universitaria, Santa Fe, Argentina
E Fenoy Research Institute for Signals, Systems and Computational Intelligence sinc(i) (CONICET-UNL), Ciudad Universitaria, Santa Fe, Argentina
M Rubiolo Research Institute for Signals, Systems and Computational Intelligence sinc(i) (CONICET-UNL), Ciudad Universitaria, Santa Fe, Argentina
U Chorostecki Barcelona Supercomputing Center (BSC-CNS), Institute of Research in Biomedicine (IRB), Spain
T Gabaldón Barcelona Supercomputing Center (BSC-CNS), Institute of Research in Biomedicine (IRB), Spain; Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, Spain; Centro de Investigación Biomédica En Red de Enfermedades Infecciosas (CIBERINFEC), Barcelona, Spain
F Ariel IAL, CONICET, Ciudad Universitaria UNL, (3000) Santa Fe, Argentina
L E Di Persia Research Institute for Signals, Systems and Computational Intelligence sinc(i) (CONICET-UNL), Ciudad Universitaria, Santa Fe, Argentina
D H Milone Research Institute for Signals, Systems and Computational Intelligence sinc(i) (CONICET-UNL), Ciudad Universitaria, Santa Fe, Argentina
G Stegmayer Research Institute for Signals, Systems and Computational Intelligence sinc(i) (CONICET-UNL), Ciudad Universitaria, Santa Fe, Argentina

Moudgal N, Arhin G, Frank AT. Using Unassigned NMR Chemical Shifts to Model RNA Secondary Structure. J Phys Chem A 2022;126:2739-2745. [PMID: 35470661 DOI: 10.1021/acs.jpca.2c00456] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]

Computer-aided comprehensive explorations of RNA structural polymorphism through complementary simulation methods. QRB Discov 2022. [PMID: 37529277 PMCID: PMC10392686 DOI: 10.1017/qrd.2022.19] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/13/2022] Open