1
|
Pieroni M, Madeddu F, Di Martino J, Arcieri M, Parisi V, Bottoni P, Castrignanò T. MD-Ligand-Receptor: A High-Performance Computing Tool for Characterizing Ligand-Receptor Binding Interactions in Molecular Dynamics Trajectories. Int J Mol Sci 2023; 24:11671. [PMID: 37511429 PMCID: PMC10380688 DOI: 10.3390/ijms241411671] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/23/2023] [Revised: 07/15/2023] [Accepted: 07/18/2023] [Indexed: 07/30/2023] Open
Abstract
Molecular dynamics (MD) simulation is a widely used computational technique for studying the dynamic behavior of molecular systems over time. By simulating macromolecular biological systems consisting of a drug, a receptor, and a solvated environment with thousands of water molecules, MD allows realistic ligand-receptor binding interactions (LRBIs) to be studied. In this study, we present MD-Ligand-Receptor (MDLR), a state-of-the-art software tool designed to explore the interactions between ligands and receptors over time using MD trajectories. Unlike traditional static analysis tools, MDLR goes beyond a single snapshot of LRBIs, uncovering long-lasting molecular interactions and predicting the time-dependent inhibitory activity of specific drugs. With MDLR, researchers can gain insights into the dynamic behavior of complex ligand-receptor systems. Our pipeline is optimized for high-performance computing and can efficiently process very large MD trajectories on multicore Linux servers or on multinode HPC clusters. In the latter case, MDLR allows the user to analyze large trajectories in a very short time. To facilitate the exploration and visualization of LRBIs, we provide a Jupyter (Python) notebook, which allows users to examine and interpret the results through various graphical representations.
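The idea of "long-lasting" interactions can be illustrated with a contact-occupancy calculation: for each receptor residue, the fraction of trajectory frames in which it lies within a distance cutoff of the ligand. The sketch below is not MDLR's implementation, which the abstract does not describe; the function name `contact_occupancy`, the input layout, the 4.0 Å cutoff, and the toy coordinates are all assumptions made for illustration.

```python
from math import dist


def contact_occupancy(ligand_frames, residue_frames, cutoff=4.0):
    """Fraction of frames in which each residue is in contact with the ligand.

    ligand_frames:  one list of (x, y, z) ligand atom positions per frame.
    residue_frames: one dict per frame, mapping residue name -> atom positions.
    cutoff:         contact distance in the coordinate units (assumed 4.0 A).
    """
    n_frames = len(ligand_frames)
    if n_frames == 0 or n_frames != len(residue_frames):
        raise ValueError("ligand and receptor need the same non-zero frame count")
    counts = {}
    for ligand_atoms, residues in zip(ligand_frames, residue_frames):
        for name, atoms in residues.items():
            # A residue is in contact if any of its atoms is near any ligand atom.
            in_contact = any(
                dist(a, l) <= cutoff for a in atoms for l in ligand_atoms
            )
            counts[name] = counts.get(name, 0) + int(in_contact)
    return {name: hits / n_frames for name, hits in counts.items()}


# Toy trajectory (hypothetical): one ligand atom at the origin, four frames,
# two single-atom residues that move in and out of contact.
ligand_frames = [[(0.0, 0.0, 0.0)]] * 4
residue_frames = [
    {"ASP25": [(3.0, 0.0, 0.0)], "GLY48": [(10.0, 0.0, 0.0)]},
    {"ASP25": [(3.0, 0.0, 0.0)], "GLY48": [(3.0, 0.0, 0.0)]},
    {"ASP25": [(10.0, 0.0, 0.0)], "GLY48": [(10.0, 0.0, 0.0)]},
    {"ASP25": [(3.0, 0.0, 0.0)], "GLY48": [(10.0, 0.0, 0.0)]},
]
occupancy = contact_occupancy(ligand_frames, residue_frames)
print(occupancy)
```

A residue with high occupancy is in contact for most of the simulation, whereas a static snapshot would report only whichever contacts happen to exist in one frame.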
Affiliation(s)
- Michele Pieroni
- Department of Computer Science, "Sapienza" University of Rome, V. le Regina Elena 295, 00161 Rome, Italy
| | - Francesco Madeddu
- Department of Computer Science, "Sapienza" University of Rome, V. le Regina Elena 295, 00161 Rome, Italy
| | - Jessica Di Martino
- Department of Ecological and Biological Sciences, Tuscia University, Viale dell'Università s.n.c., 01100 Viterbo, Italy
| | - Manuel Arcieri
- Department of Health Technology, Technical University of Denmark, Anker Engelunds Vej 101, 2800 Kongens Lyngby, Denmark
| | - Valerio Parisi
- Department of Physics, "Sapienza" University of Rome, P. le Aldo Moro, 5, 00185 Rome, Italy
| | - Paolo Bottoni
- Department of Computer Science, "Sapienza" University of Rome, V. le Regina Elena 295, 00161 Rome, Italy
| | - Tiziana Castrignanò
- Department of Ecological and Biological Sciences, Tuscia University, Viale dell'Università s.n.c., 01100 Viterbo, Italy
| |
|
2
|
D'Antonio M, Libro P, Picardi E, Pesole G, Castrignanò T. RAP: A Web Tool for RNA-Seq Data Analysis. Methods Mol Biol 2021; 2284:393-415. [PMID: 33835454 DOI: 10.1007/978-1-0716-1307-8_21] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
Since 1950, the main studies of RNA concerned its role in protein synthesis. Later insights showed that only a small portion of RNA codes for proteins, whereas the rest may have different functional roles. With the advent of Next Generation Sequencing (NGS), and in particular of RNA-seq technology, the cost of sequencing dropped. Among NGS application areas, transcriptome analysis, that is, the analysis of the transcripts in a cell and their quantification for a specific developmental stage or treatment condition, has been increasingly adopted in laboratories. As a consequence, in the last decade new insights have been gained into both transcriptome complexity and the involvement of RNA molecules in cellular processes. On the computational side, bioinformatics research has developed new methods for analyzing RNA-seq data. Comparing transcriptome profiles across several samples is often a difficult task for nonexpert programmers. In this chapter, we introduce RAP (RNA-Seq Analysis Pipeline), a completely automated web tool for transcriptome analysis. It is a user-friendly web tool implementing a detailed transcriptome workflow to detect differentially expressed genes and transcripts, identify splice junctions and constitutive or alternative polyadenylation sites, and predict gene fusion events. Through the web interface, researchers can obtain all this information without any knowledge of the underlying High Performance Computing infrastructure.
Affiliation(s)
- Mattia D'Antonio
- SuperComputing Applications and Innovation Department, CINECA, Rome, Italy
| | - Pietro Libro
- Department of Ecological and Biological Sciences (DEB), University of Tuscia, Viterbo, Italy
| | - Ernesto Picardi
- Department of Biosciences, Biotechnology and Biopharmaceutics, University of Bari, Bari, Italy
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnology, National Research Council, Bari, Italy
- Consorzio Interuniversitario Biotecnologie, Trieste, Italy
| | - Graziano Pesole
- Department of Biosciences, Biotechnology and Biopharmaceutics, University of Bari, Bari, Italy
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnology, National Research Council, Bari, Italy
- Consorzio Interuniversitario Biotecnologie, Trieste, Italy
| | - Tiziana Castrignanò
- Department of Ecological and Biological Sciences (DEB), University of Tuscia, Viterbo, Italy.
| |
|
3
|
Somaschini A, Di Bella S, Cusi C, Raddrizzani L, Leone A, Carapezza G, Mazza T, Isacchi A, Bosotti R. Mining potentially actionable kinase gene fusions in cancer cell lines with the KuNG FU database. Sci Data 2020; 7:420. [PMID: 33257674 PMCID: PMC7705673 DOI: 10.1038/s41597-020-00761-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2020] [Accepted: 10/29/2020] [Indexed: 12/02/2022] Open
Abstract
Inhibition of kinase gene fusions (KGFs) has proven successful in cancer treatment and continues to represent an attractive research area, due to kinase druggability and clinical validation. Indeed, literature and public databases report a remarkable number of KGFs as potential drug targets, often identified by in vitro characterization of tumor cell line models and confirmed also in clinical samples. However, KGF molecular and experimental information can sometimes be sparse and partially overlapping, suggesting the need for a specific annotation database of KGFs, conveniently condensing all the molecular details that can support targeted drug development pipelines and diagnostic approaches. Here, we describe KuNG FU (KiNase Gene FUsion), a manually curated database collecting detailed annotations on KGFs that were identified and experimentally validated in human cancer cell lines from multiple sources, exclusively focusing on in-frame KGF events retaining an intact kinase domain, representing potentially active driver kinase targets. To our knowledge, KuNG FU represents to date the largest freely accessible homogeneous and curated database of kinase gene fusions in cell line models.
Affiliation(s)
- Alessio Somaschini
- NMS Oncology, Nerviano Medical Sciences, NMS Group, 20014, Nerviano, Milan, Italy
| | - Sebastiano Di Bella
- NMS Oncology, Nerviano Medical Sciences, NMS Group, 20014, Nerviano, Milan, Italy
| | - Carlo Cusi
- NMS Oncology, Nerviano Medical Sciences, NMS Group, 20014, Nerviano, Milan, Italy
| | - Laura Raddrizzani
- NMS Oncology, Nerviano Medical Sciences, NMS Group, 20014, Nerviano, Milan, Italy
| | - Antonella Leone
- NMS Oncology, Nerviano Medical Sciences, NMS Group, 20014, Nerviano, Milan, Italy
| | - Giovanni Carapezza
- NMS Oncology, Nerviano Medical Sciences, NMS Group, 20014, Nerviano, Milan, Italy
| | - Tommaso Mazza
- Bioinformatics Unit, IRCCS "Casa Sollievo della Sofferenza", Research Hospital, San Giovanni Rotondo, Italy
| | - Antonella Isacchi
- NMS Oncology, Nerviano Medical Sciences, NMS Group, 20014, Nerviano, Milan, Italy
| | - Roberta Bosotti
- NMS Oncology, Nerviano Medical Sciences, NMS Group, 20014, Nerviano, Milan, Italy.
| |
|
4
|
Castrignanò T, Gioiosa S, Flati T, Cestari M, Picardi E, Chiara M, Fratelli M, Amente S, Cirilli M, Tangaro MA, Chillemi G, Pesole G, Zambelli F. ELIXIR-IT HPC@CINECA: high performance computing resources for the bioinformatics community. BMC Bioinformatics 2020; 21:352. [PMID: 32838759 PMCID: PMC7446135 DOI: 10.1186/s12859-020-03565-8] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 01/11/2023] Open
Abstract
BACKGROUND: The advent of Next Generation Sequencing (NGS) technologies and the concomitant reduction in sequencing costs allow unprecedented high-throughput profiling of biological systems in a cost-efficient manner. Modern biological experiments are becoming increasingly data- and computationally intensive, and the wealth of publicly available biological data is bringing bioinformatics into the "Big Data" era. For these reasons, the effective application of High Performance Computing (HPC) architectures is increasingly recognized by bioinformaticians as well. Here we describe pilot programs for provisioning HPC resources to bioinformaticians, run by the Italian Node of ELIXIR (ELIXIR-IT) in collaboration with CINECA, the main Italian supercomputing center.
RESULTS: Starting in April 2016, CINECA and ELIXIR-IT launched the pilot call "ELIXIR-IT HPC@CINECA", offering streamlined access to HPC resources for bioinformatics. Resources are made available either through web front-ends to dedicated workflows developed at CINECA or through direct access to the High Performance Computing systems via a standard command-line interface tailored for bioinformatics data analysis. This offers the biomedical research community a production-scale environment, continuously updated with the latest versions of publicly available reference datasets and bioinformatics tools. Currently, 63 research projects have gained access to the HPC@CINECA program, for a total allocation of ~8 million CPU hours and, for data storage, ~100 TB of permanent and ~300 TB of temporary space.
CONCLUSIONS: Three years after the beginning of the ELIXIR-IT HPC@CINECA program, we can appreciate its impact on the Italian bioinformatics community and draw some conclusions. Several Italian researchers who applied to the program have gained access to one of the top-ranking public scientific supercomputing facilities in Europe. Those investigators had the opportunity to significantly reduce computational turnaround times in their research projects and to process massive amounts of data, pursuing research approaches that would otherwise have been difficult or impossible to undertake. Moreover, by taking advantage of the wealth of documentation and training material provided by CINECA, participants had the opportunity to improve their skills in using HPC systems and to be better positioned to apply to similar EU programs of greater scale, such as PRACE. To illustrate the effective usage and impact of the resources awarded by the program in different research applications, we report five successful use cases whose findings have already been published in peer-reviewed journals.
Affiliation(s)
- Tiziana Castrignanò
- Department of Ecological and Biological Sciences (DEB), University of Tuscia, Viterbo, Italy.
| | - Silvia Gioiosa
- CINECA, SuperComputing Applications and Innovation Department, Rome, Italy
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (IBIOM-CNR), Bari, Italy
| | - Tiziano Flati
- CINECA, SuperComputing Applications and Innovation Department, Rome, Italy
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (IBIOM-CNR), Bari, Italy
| | - Mirko Cestari
- CINECA, SuperComputing Applications and Innovation Department, Rome, Italy
| | - Ernesto Picardi
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (IBIOM-CNR), Bari, Italy
- Department of Biosciences, Biotechnology and Biopharmaceutics, University of Bari "A. Moro", Bari, Italy
| | - Matteo Chiara
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (IBIOM-CNR), Bari, Italy
- Department of Biosciences, University of Milan, Milan, Italy
| | - Maddalena Fratelli
- IRCCS-Istituto di Ricerche Farmacologiche "Mario Negri", Milano, Milan, Italy
| | - Stefano Amente
- Department of Molecular Medicine and Medical Biotechnologies, University of Naples 'Federico II', Naples, Italy
| | - Marco Cirilli
- Department of Agricultural and Environmental Sciences - Production, Landscape, Agroenergy (DISAA), University of Milan, Milan, Italy
| | - Marco Antonio Tangaro
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (IBIOM-CNR), Bari, Italy
| | - Giovanni Chillemi
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (IBIOM-CNR), Bari, Italy
- Department for Innovation in Biological, Agro-food and Forest systems (DIBAF), University of Tuscia, Viterbo, Italy
| | - Graziano Pesole
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (IBIOM-CNR), Bari, Italy
- Department of Biosciences, Biotechnology and Biopharmaceutics, University of Bari "A. Moro", Bari, Italy
| | - Federico Zambelli
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (IBIOM-CNR), Bari, Italy
- Department of Biosciences, University of Milan, Milan, Italy
| |
|
5
|
Gioiosa S, Bolis M, Flati T, Massini A, Garattini E, Chillemi G, Fratelli M, Castrignanò T. Massive NGS data analysis reveals hundreds of potential novel gene fusions in human cell lines. Gigascience 2018; 7:5026623. [PMID: 29860514 PMCID: PMC6207142 DOI: 10.1093/gigascience/giy062] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 09/22/2017] [Accepted: 05/29/2018] [Indexed: 01/19/2023] Open
Abstract
Background: Gene fusions derive from chromosomal rearrangements. The resulting chimeric transcripts are often endowed with oncogenic potential. Furthermore, they serve as diagnostic tools for the clinical classification of cancer subgroups with different prognoses and, in some cases, they can provide specific drug targets. To date, many efforts have been made to study gene fusion events occurring in tumor samples. In recent years, the availability of a comprehensive next-generation sequencing dataset for all existing human tumor cell lines has provided the opportunity to further investigate these data in order to identify novel and still uncharacterized gene fusion events.
Results: In our work, we extensively reanalyzed 935 paired-end RNA-sequencing experiments downloaded from the Cancer Cell Line Encyclopedia repository, aiming to identify novel putative cell-line-specific gene fusion events in human malignancies. The bioinformatics analysis was performed by running four gene fusion detection algorithms. The results were further prioritized by running a Bayesian classifier that performs an in silico validation. The collection of fusion events supported by all of the predictive software yields a robust set of ∼1,700 in silico predicted novel candidates suitable for downstream analyses. Given the huge amount of data and information produced, the computational results have been organized in a database named LiGeA. The database can be browsed through a dynamic and interactive web portal, further integrated with validated data from other well-known repositories. Using the intuitive query forms, users can easily access, navigate, filter, and select the putative gene fusions for further validation and study. They can also find suitable experimental models for a given fusion of interest.
Conclusions: We believe that the LiGeA resource not only represents the first compendium of both known and putative novel gene fusion events in the catalog of all human malignant cell lines, but can also become a handy starting point for wet-lab biologists who wish to investigate novel cancer biomarkers and specific drug targets.
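The consensus step described in the Results, keeping only fusion events supported by every detection tool, amounts to a set intersection over the per-tool call lists. The sketch below is a minimal illustration and not the authors' pipeline: the tool labels (`tool_a` to `tool_d`), the function name `consensus_fusions`, and the toy gene pairs are placeholders, and the Bayesian prioritization step is not shown.

```python
def consensus_fusions(calls_by_tool):
    """Return the fusion events reported by every tool.

    calls_by_tool maps a tool label to an iterable of (5' gene, 3' gene)
    pairs. Gene symbols are upper-cased so that differences in case between
    tools do not hide agreement; partner order is kept, since a fusion is
    directional.
    """
    call_sets = [
        {(five.upper(), three.upper()) for five, three in calls}
        for calls in calls_by_tool.values()
    ]
    if not call_sets:
        return set()
    return set.intersection(*call_sets)


# Toy input (hypothetical calls; the four tool labels are placeholders).
calls_by_tool = {
    "tool_a": [("BCR", "ABL1"), ("EML4", "ALK")],
    "tool_b": [("bcr", "abl1"), ("EML4", "ALK")],
    "tool_c": [("BCR", "ABL1")],
    "tool_d": [("BCR", "ABL1"), ("TMPRSS2", "ERG")],
}
consensus = consensus_fusions(calls_by_tool)
print(sorted(consensus))
```

Requiring agreement among all tools trades sensitivity for specificity: a fusion missed by a single caller is dropped, which is consistent with the abstract's description of the resulting set as robust.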
Affiliation(s)
- Silvia Gioiosa
- SCAI-Super Computing Applications and Innovation Department, CINECA, Rome, Italy
- National Council of Research, CNR, Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, Bari, Italy
| | - Marco Bolis
- Laboratory of Molecular Biology, IRCCS-Istituto di Ricerche Farmacologiche "Mario Negri," Milano, Italy
| | - Tiziano Flati
- SCAI-Super Computing Applications and Innovation Department, CINECA, Rome, Italy
- National Council of Research, CNR, Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, Bari, Italy
| | | | - Enrico Garattini
- Laboratory of Molecular Biology, IRCCS-Istituto di Ricerche Farmacologiche "Mario Negri," Milano, Italy
| | - Giovanni Chillemi
- SCAI-Super Computing Applications and Innovation Department, CINECA, Rome, Italy
| | - Maddalena Fratelli
- Laboratory of Molecular Biology, IRCCS-Istituto di Ricerche Farmacologiche "Mario Negri," Milano, Italy
| | - Tiziana Castrignanò
- SCAI-Super Computing Applications and Innovation Department, CINECA, Rome, Italy
| |
|