1
Hejret V, Varadarajan NM, Klimentova E, Gresova K, Giassa IC, Vanacova S, Alexiou P. Analysis of chimeric reads characterises the diverse targetome of AGO2-mediated regulation. Sci Rep 2023; 13:22895. [PMID: 38129478 PMCID: PMC10739727 DOI: 10.1038/s41598-023-49757-z] [Received: 09/16/2023] [Accepted: 12/12/2023] [Indexed: 12/23/2023]
Abstract
Argonaute proteins are instrumental in regulating RNA stability and translation. AGO2, the major mammalian Argonaute protein, is known to associate primarily with microRNAs, a family of small RNA 'guide' sequences, and identifies its targets mainly via a 'seed'-mediated partial complementarity process. Despite numerous studies, a definitive experimental dataset of AGO2 'guide'-'target' interactions remains elusive. Our study employs two experimental methods, AGO2 CLASH and AGO2 eCLIP, to generate thousands of AGO2 target sites verified by chimeric reads. These chimeric reads contain both the AGO2-loaded small RNA 'guide' and the target sequence, providing a robust resource for modeling AGO2 binding preferences. Our novel analysis pipeline reveals thousands of AGO2 target sites driven by microRNAs and a significant number of AGO2 'guides' derived from fragments of other small RNAs such as tRNAs, YRNAs, snoRNAs, rRNAs, and more. We utilize convolutional neural networks to train machine learning models that accurately predict the binding potential for each 'guide' class and experimentally validate several interactions. In conclusion, our comprehensive analysis of the AGO2 targetome broadens our understanding of its 'guide' repertoire and potential function in development and disease. Moreover, we offer practical bioinformatic tools for future experiments and the prediction of AGO2 targets. All data and code from this study are freely available at https://github.com/ML-Bioinfo-CEITEC/HybriDetector/.
Affiliation(s)
- Vaclav Hejret
  - Central European Institute of Technology, Masaryk University, 62500, Brno, Czech Republic
  - Faculty of Science, National Centre for Biomolecular Research, Masaryk University, 62500, Brno, Czech Republic
- Nandan Mysore Varadarajan
  - Central European Institute of Technology, Masaryk University, 62500, Brno, Czech Republic
  - Faculty of Science, National Centre for Biomolecular Research, Masaryk University, 62500, Brno, Czech Republic
- Eva Klimentova
  - Central European Institute of Technology, Masaryk University, 62500, Brno, Czech Republic
  - Faculty of Science, National Centre for Biomolecular Research, Masaryk University, 62500, Brno, Czech Republic
- Katarina Gresova
  - Central European Institute of Technology, Masaryk University, 62500, Brno, Czech Republic
- Ilektra-Chara Giassa
  - Central European Institute of Technology, Masaryk University, 62500, Brno, Czech Republic
- Stepanka Vanacova
  - Central European Institute of Technology, Masaryk University, 62500, Brno, Czech Republic
- Panagiotis Alexiou
  - Central European Institute of Technology, Masaryk University, 62500, Brno, Czech Republic
  - Department of Applied Biomedical Science, Faculty of Health Sciences, University of Malta, Msida, MSD 2080, Malta
  - Centre for Molecular Medicine & Biobanking, University of Malta, Msida, MSD 2080, Malta
2
Forgac F, Munkova D, Munk M, Kelebercova L. Evaluating automatic sentence alignment approaches on English-Slovak sentences. Sci Rep 2023; 13:20123. [PMID: 37978270 PMCID: PMC10656450 DOI: 10.1038/s41598-023-47479-w] [Received: 04/07/2023] [Accepted: 11/14/2023] [Indexed: 11/19/2023]
Abstract
Parallel texts are a very valuable resource in many applications of natural language processing. The fundamental step in creating a parallel corpus is alignment. Sentence alignment is the task of finding the correspondence between source sentences and their equivalent translations in the target text. A number of automatic sentence alignment approaches have been proposed, including ones based on neural networks; they can be divided into length-based, lexicon-based, and translation-based approaches. In our study, we used five different aligners, namely Bilingual sentence aligner (BSA), Hunalign, Bleualign, Vecalign, and Bertalign. We evaluated the accuracy of Bertalign against the previously used aligners, and compared those aligners with one another, for the English-Slovak language pair. We created a custom corpus consisting of texts collected in 2021 and 2022. Vecalign and Bertalign performed statistically significantly better than the other aligners, and BSA performed worst. Hunalign and Bleualign achieved the same performance in terms of F1 score. However, Bleualign showed the greatest variation in performance.
Affiliation(s)
- Frantisek Forgac
  - Faculty of Natural Sciences and Informatics, Constantine the Philosopher University in Nitra, Nitra, Slovakia
- Dasa Munkova
  - Faculty of Natural Sciences and Informatics, Constantine the Philosopher University in Nitra, Nitra, Slovakia
- Michal Munk
  - Faculty of Natural Sciences and Informatics, Constantine the Philosopher University in Nitra, Nitra, Slovakia
  - Science and Research Centre, University of Pardubice, Pardubice, Czech Republic
- Livia Kelebercova
  - Faculty of Natural Sciences and Informatics, Constantine the Philosopher University in Nitra, Nitra, Slovakia