1
Alessandri S, Ratto ML, Rabellino S, Piacenti G, Contaldo SG, Pernice S, Beccuti M, Calogero RA, Alessandri L. CREDO: a friendly Customizable, REproducible, DOcker file generator for bioinformatics applications. BMC Bioinformatics 2024; 25:110. [PMID: 38475691 DOI: 10.1186/s12859-024-05695-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2023] [Accepted: 02/09/2024] [Indexed: 03/14/2024]
Abstract
BACKGROUND: The analysis of large and complex biological datasets in bioinformatics poses a significant challenge to achieving reproducible research outcomes due to inconsistencies and the lack of standardization in the analysis process. These issues can lead to discrepancies in results, undermining the credibility and impact of bioinformatics research and creating mistrust in the scientific process. To address these challenges, open science practices such as sharing data, code, and methods have been encouraged. RESULTS: CREDO, a Customizable, REproducible, DOcker file generator for bioinformatics applications, has been developed as a tool to mitigate reproducibility issues by building and distributing Docker containers with embedded bioinformatics tools. CREDO simplifies the process of generating Docker images, facilitating reproducibility and efficient research in bioinformatics. The crucial step in generating a Docker image is creating the Dockerfile, which requires incorporating heterogeneous packages and environments such as Bioconductor and Conda. CREDO stores all required package information and dependencies in a GitHub-compatible format to enhance Docker image reproducibility, allowing easy image creation from scratch. The user-friendly GUI and CREDO's ability to generate modular Docker images make it an ideal tool for life scientists to efficiently create Docker images. Overall, CREDO is a valuable tool for addressing reproducibility issues in bioinformatics research and promoting open science practices.
Affiliation(s)
- Maria L Ratto
- Department of Molecular Biotechnology and Health Sciences, University of Torino, Turin, Italy
- Sergio Rabellino
- Department of Computer Science, University of Torino, Turin, Italy
- Gabriele Piacenti
- Department of Molecular Biotechnology and Health Sciences, University of Torino, Turin, Italy
- Simone Pernice
- Department of Computer Science, University of Torino, Turin, Italy
- Marco Beccuti
- Department of Computer Science, University of Torino, Turin, Italy
- Raffaele A Calogero
- Department of Molecular Biotechnology and Health Sciences, University of Torino, Turin, Italy
- Luca Alessandri
- Department of Molecular Biotechnology and Health Sciences, University of Torino, Turin, Italy
- Department of Pathology, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
2
Cassidy MJ, Wallace DA, Purcell S, Sofer T. Reproducibility in computational sleep research: a call for action. Sleep 2024; 47:zsad143. [PMID: 37235755 PMCID: PMC10782485 DOI: 10.1093/sleep/zsad143] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 05/28/2023]
Affiliation(s)
- Michael J Cassidy
- Division of Sleep and Circadian Disorders, Departments of Medicine and Neurology, Brigham and Women’s Hospital, Boston, MA, USA
- Department of Medicine, Cardiovascular Institute, Beth Israel Deaconess Medical Center, Boston, MA, USA
- Danielle A Wallace
- Division of Sleep and Circadian Disorders, Departments of Medicine and Neurology, Brigham and Women’s Hospital, Boston, MA, USA
- Division of Sleep and Circadian Disorders, Harvard Medical School, Boston, MA, USA
- Shaun Purcell
- Department of Psychiatry, Brigham and Women’s Hospital, Boston, MA, USA
- Tamar Sofer
- Division of Sleep and Circadian Disorders, Departments of Medicine and Neurology, Brigham and Women’s Hospital, Boston, MA, USA
- Department of Medicine, Cardiovascular Institute, Beth Israel Deaconess Medical Center, Boston, MA, USA
- Division of Sleep and Circadian Disorders, Harvard Medical School, Boston, MA, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
3
Bocchini M, Tazzari M, Ravaioli S, Piccinini F, Foca F, Tebaldi M, Nicolini F, Grassi I, Severi S, Calogero RA, Arigoni M, Schrader J, Mazza M, Paganelli G. Circulating hsa-miR-5096 predicts 18F-FDG PET/CT positivity and modulates somatostatin receptor 2 expression: a novel miR-based assay for pancreatic neuroendocrine tumors. Front Oncol 2023; 13:1136331. [PMID: 37287922 PMCID: PMC10242108 DOI: 10.3389/fonc.2023.1136331] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2023] [Accepted: 04/25/2023] [Indexed: 06/09/2023]
Abstract
Gastro-entero-pancreatic neuroendocrine tumors (GEP-NETs) are rare diseases encompassing pancreatic (PanNETs) and ileal NETs (SINETs), characterized by heterogeneous somatostatin receptor (SSTR) expression. Treatments for inoperable GEP-NETs are limited, and SSTR-targeted Peptide Receptor Radionuclide Therapy (PRRT) achieves variable responses. Prognostic biomarkers for the management of GEP-NET patients are required. 18F-FDG uptake is a prognostic indicator of aggressiveness in GEP-NETs. This study aims to identify circulating and measurable prognostic miRNAs associated with 18F-FDG-PET/CT status, higher risk and lower response to PRRT. Methods: Whole miRNome NGS profiling was conducted on plasma samples obtained from well-differentiated advanced, metastatic, inoperable G1, G2 and G3 GEP-NET patients enrolled in the non-randomized LUX (NCT02736500) and LUNET (NCT02489604) clinical trials prior to PRRT (screening set, n=24). Differential expression analysis was performed between 18F-FDG positive (n=12) and negative (n=12) patients. Validation was conducted by real-time quantitative PCR in two distinct well-differentiated GEP-NET validation cohorts, considering the primary site of origin (PanNETs n=38 and SINETs n=30). Cox regression was applied to assess independent clinical parameters and imaging for progression-free survival (PFS) in PanNETs. In situ RNA hybridization combined with immunohistochemistry was performed to simultaneously detect miR and protein expression in the same tissue specimens. This novel semi-automated miR-protein protocol was applied in PanNET FFPE specimens (n=9). In vitro functional experiments were performed in PanNET models. Results: While no miRNAs were found to be deregulated in SINETs, hsa-miR-5096, hsa-let-7i-3p and hsa-miR-4311 were found to correlate with 18F-FDG-PET/CT in PanNETs (p-value <0.005). Statistical analysis showed that hsa-miR-5096 can predict 6-month PFS (p-value <0.001) and 12-month overall survival upon PRRT treatment (p-value <0.05), as well as identify 18F-FDG-PET/CT positive PanNETs with worse prognosis after PRRT (p-value <0.005). In addition, hsa-miR-5096 inversely correlated with both SSTR2 expression in PanNET tissue and the 68Gallium-DOTATOC uptake values (p-value <0.05), and accordingly it was able to decrease SSTR2 when ectopically expressed in PanNET cells (p-value <0.01). Conclusions: hsa-miR-5096 performs well as a biomarker for 18F-FDG-PET/CT and as an independent predictor of PFS. Moreover, exosome-mediated delivery of hsa-miR-5096 may promote SSTR2 heterogeneity and thus resistance to PRRT.
Affiliation(s)
- Martine Bocchini
- Immunotherapy, Cell Therapy and Biobank (ITCB), IRCCS Istituto Romagnolo per lo Studio dei Tumori (IRST) “Dino Amadori”, Meldola, Italy
- Marcella Tazzari
- Immunotherapy, Cell Therapy and Biobank (ITCB), IRCCS Istituto Romagnolo per lo Studio dei Tumori (IRST) “Dino Amadori”, Meldola, Italy
- Sara Ravaioli
- Biosciences Laboratory, IRCCS Istituto Romagnolo per lo Studio dei Tumori (IRST) “Dino Amadori”, Meldola, Italy
- Filippo Piccinini
- Scientific Directorate, IRCCS Istituto Romagnolo per lo Studio dei Tumori (IRST) “Dino Amadori”, Meldola, Italy
- Department of Medical and Surgical Sciences (DIMEC), University of Bologna, Bologna, Italy
- Flavia Foca
- Unit of Biostatistics and Clinical Trials, IRCCS Istituto Romagnolo per lo Studio dei Tumori (IRST) “Dino Amadori”, Meldola, Italy
- Michela Tebaldi
- Unit of Biostatistics and Clinical Trials, IRCCS Istituto Romagnolo per lo Studio dei Tumori (IRST) “Dino Amadori”, Meldola, Italy
- Fabio Nicolini
- Immunotherapy, Cell Therapy and Biobank (ITCB), IRCCS Istituto Romagnolo per lo Studio dei Tumori (IRST) “Dino Amadori”, Meldola, Italy
- Ilaria Grassi
- Nuclear Medicine and Radiometabolic Unit, IRCCS Istituto Romagnolo per lo Studio dei Tumori (IRST) “Dino Amadori”, Meldola, Italy
- Stefano Severi
- Nuclear Medicine and Radiometabolic Unit, IRCCS Istituto Romagnolo per lo Studio dei Tumori (IRST) “Dino Amadori”, Meldola, Italy
- Raffaele Adolfo Calogero
- Molecular Biotechnology Center, Department of Biotechnology and Health Sciences, University of Turin, Turin, Italy
- Maddalena Arigoni
- Molecular Biotechnology Center, Department of Biotechnology and Health Sciences, University of Turin, Turin, Italy
- Joerg Schrader
- Department of Medicine, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Massimiliano Mazza
- Immunotherapy, Cell Therapy and Biobank (ITCB), IRCCS Istituto Romagnolo per lo Studio dei Tumori (IRST) “Dino Amadori”, Meldola, Italy
- Giovanni Paganelli
- Nuclear Medicine and Radiometabolic Unit, IRCCS Istituto Romagnolo per lo Studio dei Tumori (IRST) “Dino Amadori”, Meldola, Italy
4
Salemme V, Vedelago M, Sarcinella A, Moietta F, Piccolantonio A, Moiso E, Centonze G, Manco M, Guala A, Lamolinara A, Angelini C, Morellato A, Natalini D, Calogero R, Incarnato D, Oliviero S, Conti L, Iezzi M, Tosoni D, Bertalot G, Freddi S, Tucci FA, De Sanctis F, Frusteri C, Ugel S, Bronte V, Cavallo F, Provero P, Gai M, Taverna D, Turco E, Pece S, Defilippi P. p140Cap inhibits β-Catenin in the breast cancer stem cell compartment instructing a protective anti-tumor immune response. Nat Commun 2023; 14:2350. [PMID: 37169737 PMCID: PMC10175288 DOI: 10.1038/s41467-023-37824-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 03/07/2022] [Accepted: 04/03/2023] [Indexed: 05/13/2023]
Abstract
The p140Cap adaptor protein is a tumor suppressor in breast cancer associated with a favorable prognosis. Here we highlight a function of p140Cap in orchestrating local and systemic tumor-extrinsic events that eventually result in inhibition of the polymorphonuclear myeloid-derived suppressor cell function in creating an immunosuppressive tumor-promoting environment in the primary tumor, and premetastatic niches at distant sites. Integrative transcriptomic and preclinical studies reveal that p140Cap controls an epistatic axis where, through the upstream inhibition of β-Catenin, it restricts tumorigenicity and self-renewal of tumor-initiating cells, limiting the release of the inflammatory cytokine G-CSF, which is required for polymorphonuclear myeloid-derived suppressor cells to exert their local and systemic tumor-conducive function. Mechanistically, p140Cap inhibition of β-Catenin depends on its ability to localize in and stabilize the β-Catenin destruction complex, promoting enhanced β-Catenin inactivation. Clinical studies in women show that low p140Cap expression correlates with reduced presence of tumor-infiltrating lymphocytes and more aggressive tumor types in a large cohort of real-life female breast cancer patients, highlighting the potential of p140Cap as a biomarker for therapeutic intervention targeting the β-Catenin/tumor-initiating cell/G-CSF/polymorphonuclear myeloid-derived suppressor cell axis to restore an efficient anti-tumor immune response.
Affiliation(s)
- Vincenzo Salemme
- Department of Molecular Biotechnology and Health Sciences, University of Torino, Via Nizza 52, 10126, Torino, Italy
- Molecular Biotechnology Center (MBC) "Guido Tarone", Via Nizza, 52, 10126, Turin, Italy
- Mauro Vedelago
- Department of Molecular Biotechnology and Health Sciences, University of Torino, Via Nizza 52, 10126, Torino, Italy
- Alessandro Sarcinella
- Department of Molecular Biotechnology and Health Sciences, University of Torino, Via Nizza 52, 10126, Torino, Italy
- Federico Moietta
- Department of Molecular Biotechnology and Health Sciences, University of Torino, Via Nizza 52, 10126, Torino, Italy
- Alessio Piccolantonio
- Department of Molecular Biotechnology and Health Sciences, University of Torino, Via Nizza 52, 10126, Torino, Italy
- Molecular Biotechnology Center (MBC) "Guido Tarone", Via Nizza, 52, 10126, Turin, Italy
- Enrico Moiso
- Department of Molecular Biotechnology and Health Sciences, University of Torino, Via Nizza 52, 10126, Torino, Italy
- Giorgia Centonze
- Department of Molecular Biotechnology and Health Sciences, University of Torino, Via Nizza 52, 10126, Torino, Italy
- Molecular Biotechnology Center (MBC) "Guido Tarone", Via Nizza, 52, 10126, Turin, Italy
- Marta Manco
- Department of Molecular Biotechnology and Health Sciences, University of Torino, Via Nizza 52, 10126, Torino, Italy
- Andrea Guala
- Department of Molecular Biotechnology and Health Sciences, University of Torino, Via Nizza 52, 10126, Torino, Italy
- Alessia Lamolinara
- Immuno-Oncology Laboratory, Center for Advanced Studies and Technology (CAST), Department of Neuroscience, Imaging and Clinical Sciences, G. d'Annunzio University of Chieti-Pescara, Chieti-Pescara, Italy
- Costanza Angelini
- Department of Molecular Biotechnology and Health Sciences, University of Torino, Via Nizza 52, 10126, Torino, Italy
- Alessandro Morellato
- Department of Molecular Biotechnology and Health Sciences, University of Torino, Via Nizza 52, 10126, Torino, Italy
- Molecular Biotechnology Center (MBC) "Guido Tarone", Via Nizza, 52, 10126, Turin, Italy
- Dora Natalini
- Department of Molecular Biotechnology and Health Sciences, University of Torino, Via Nizza 52, 10126, Torino, Italy
- Raffaele Calogero
- Department of Molecular Biotechnology and Health Sciences, University of Torino, Via Nizza 52, 10126, Torino, Italy
- Molecular Biotechnology Center (MBC) "Guido Tarone", Via Nizza, 52, 10126, Turin, Italy
- Danny Incarnato
- Department of Molecular Genetics, Groningen Biomolecular Sciences and Biotechnology Institute (GBB), University of Groningen, Groningen, the Netherlands
- Salvatore Oliviero
- Molecular Biotechnology Center (MBC) "Guido Tarone", Via Nizza, 52, 10126, Turin, Italy
- Department of Life Sciences and Systems Biology, University of Turin, Torino, Italy and IIGM, Candiolo, Italy
- Laura Conti
- Department of Molecular Biotechnology and Health Sciences, University of Torino, Via Nizza 52, 10126, Torino, Italy
- Molecular Biotechnology Center (MBC) "Guido Tarone", Via Nizza, 52, 10126, Turin, Italy
- Manuela Iezzi
- Immuno-Oncology Laboratory, Center for Advanced Studies and Technology (CAST), Department of Neuroscience, Imaging and Clinical Sciences, G. d'Annunzio University of Chieti-Pescara, Chieti-Pescara, Italy
- Daniela Tosoni
- European Institute of Oncology IRCCS, 20141, Milan, Italy
- Stefano Freddi
- European Institute of Oncology IRCCS, 20141, Milan, Italy
- Francesco A Tucci
- European Institute of Oncology IRCCS, 20141, Milan, Italy
- School of Pathology, University of Milan, Milan, Italy
- Francesco De Sanctis
- Immunology Section, Department of Medicine, University of Verona, 37134, Verona, Italy
- Cristina Frusteri
- Immunology Section, Department of Medicine, University of Verona, 37134, Verona, Italy
- Stefano Ugel
- Immunology Section, Department of Medicine, University of Verona, 37134, Verona, Italy
- Vincenzo Bronte
- Immunology Section, Department of Medicine, University of Verona, 37134, Verona, Italy
- Istituto Oncologico Veneto, IRCCS, 35128, Padova, Italy
- Federica Cavallo
- Department of Molecular Biotechnology and Health Sciences, University of Torino, Via Nizza 52, 10126, Torino, Italy
- Molecular Biotechnology Center (MBC) "Guido Tarone", Via Nizza, 52, 10126, Turin, Italy
- Paolo Provero
- Neuroscience Department "Rita Levi Montalcini", University of Torino, Via Cherasco 15, 10126, Torino, Italy
- Marta Gai
- Department of Molecular Biotechnology and Health Sciences, University of Torino, Via Nizza 52, 10126, Torino, Italy
- Daniela Taverna
- Department of Molecular Biotechnology and Health Sciences, University of Torino, Via Nizza 52, 10126, Torino, Italy
- Molecular Biotechnology Center (MBC) "Guido Tarone", Via Nizza, 52, 10126, Turin, Italy
- Emilia Turco
- Department of Molecular Biotechnology and Health Sciences, University of Torino, Via Nizza 52, 10126, Torino, Italy
- Salvatore Pece
- European Institute of Oncology IRCCS, 20141, Milan, Italy
- Department of Oncology and Hemato-Oncology, Università degli Studi di Milano, 20142, Milano, Italy
- Paola Defilippi
- Department of Molecular Biotechnology and Health Sciences, University of Torino, Via Nizza 52, 10126, Torino, Italy
- Molecular Biotechnology Center (MBC) "Guido Tarone", Via Nizza, 52, 10126, Turin, Italy
5
Kazakova P, Abasolo N, de Cripan SM, Marquès E, Cereto-Massagué A, Garcia L, Canela N, Tormo R, Torrell H. Gut Microbiome and Small RNA Integrative-Omic Perspective of Meconium and Milk-FED Infant Stool Samples. Int J Mol Sci 2023; 24:ijms24098069. [PMID: 37175775 PMCID: PMC10179101 DOI: 10.3390/ijms24098069] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2023] [Revised: 04/24/2023] [Accepted: 04/27/2023] [Indexed: 05/15/2023]
Abstract
The human gut microbiome plays an important role in health, and its initial development is conditioned by many factors, such as feeding. It has also been claimed that this colonization is guided by bacterial populations, the dynamic virome, and transkingdom interactions between host and microbial cells, partially mediated by epigenetic signaling. In this article, we characterized the bacteriome, virome, and smallRNome and their interaction in the meconium and stool samples from infants. Bacterial and viral DNA and RNA were extracted from the meconium and stool samples of 2- to 4-month-old milk-fed infants. The bacteriome, DNA and RNA virome, and smallRNome were assessed using 16S rRNA V4 sequencing, viral enrichment sequencing, and small RNA sequencing protocols, respectively. Data pathway analysis and integration were performed using the R package mixOmics. Our findings showed that the bacteriome differed among the three groups, while the virome and smallRNome presented significant differences, mainly between the meconium and stool of milk-fed infants. The gut environment is rapidly acquired after birth, and it is highly adaptable due to the interaction of environmental factors. Additionally, transkingdom interactions between viruses and bacteria can influence host and smallRNome profiles. However, virome characterization has several protocol limitations that must be considered.
Affiliation(s)
- Polina Kazakova
- Eurecat, Centre Tecnològic de Catalunya, Centre for Omic Sciences (COS), Joint Unit Universitat Rovira i Virgili-EURECAT, Unique Scientific and Technical Infrastructures (ICTS), 43204 Reus, Spain
- Nerea Abasolo
- Eurecat, Centre Tecnològic de Catalunya, Centre for Omic Sciences (COS), Joint Unit Universitat Rovira i Virgili-EURECAT, Unique Scientific and Technical Infrastructures (ICTS), 43204 Reus, Spain
- Sara Martinez de Cripan
- Eurecat, Centre Tecnològic de Catalunya, Centre for Omic Sciences (COS), Joint Unit Universitat Rovira i Virgili-EURECAT, Unique Scientific and Technical Infrastructures (ICTS), 43204 Reus, Spain
- Adrià Cereto-Massagué
- Eurecat, Centre Tecnològic de Catalunya, Centre for Omic Sciences (COS), Joint Unit Universitat Rovira i Virgili-EURECAT, Unique Scientific and Technical Infrastructures (ICTS), 43204 Reus, Spain
- Lorena Garcia
- Eurecat, Centre Tecnològic de Catalunya, Centre for Omic Sciences (COS), Joint Unit Universitat Rovira i Virgili-EURECAT, Unique Scientific and Technical Infrastructures (ICTS), 43204 Reus, Spain
- Núria Canela
- Eurecat, Centre Tecnològic de Catalunya, Centre for Omic Sciences (COS), Joint Unit Universitat Rovira i Virgili-EURECAT, Unique Scientific and Technical Infrastructures (ICTS), 43204 Reus, Spain
- Ramón Tormo
- ESPGHAN, European Society for Paediatric Gastroenterology, Hepatology and Nutrition, 1201 Geneva, Switzerland
- Gastroenterology and Nutrition Pediatric Center, 08006 Barcelona, Spain
- Helena Torrell
- Eurecat, Centre Tecnològic de Catalunya, Centre for Omic Sciences (COS), Joint Unit Universitat Rovira i Virgili-EURECAT, Unique Scientific and Technical Infrastructures (ICTS), 43204 Reus, Spain
6
Plasma microRNAs as potential biomarkers in early Alzheimer disease expression. Sci Rep 2022; 12:15589. [PMID: 36114255 PMCID: PMC9481579 DOI: 10.1038/s41598-022-19862-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2022] [Accepted: 09/06/2022] [Indexed: 11/09/2022]
Abstract
MicroRNAs (miRNAs) are potential biomarkers for complex pathologies due to their involvement in the regulation of several pathways. Alzheimer Disease (AD) requires new biomarkers in minimally invasive samples that allow an early diagnosis. The aim of this work is to study miRNAs as potential AD biomarkers and their role in the development of the pathology. In this study, participants (n = 46) were classified into mild cognitive impairment due to AD (MCI-AD, n = 19), preclinical AD (n = 8) and healthy elderly controls (n = 19), according to CSF biomarker levels (amyloid β42, total tau, phosphorylated tau) and neuropsychological assessment. Then, plasma miRNAomic expression profiles were analysed by Next Generation Sequencing. Finally, the selected miRNAs were validated by quantitative PCR (q-PCR). A panel of 11 miRNAs was selected from omics expression analysis, and 8 of them were validated by q-PCR. Individually, they did not show statistically significant differences among participant groups. However, a multivariate model including these 8 miRNAs revealed a potential association with AD for three of them. Specifically, relatively lower expression levels of miR-92a-3p and miR-486-5p and relatively higher levels of miR-29a-3p are observed in AD patients. These biomarkers could be involved in the regulation of pathways such as synaptic transmission, structural functions, cell signalling and metabolism or transcription regulation. Some plasma miRNAs (miRNA-92a-3p, miRNA-486-5p, miRNA-29a-3p) are slightly dysregulated in AD, making them potential biomarkers of the pathology. However, more studies with a larger sample size should be carried out to verify these results, as well as to further investigate the mechanisms of action of these miRNAs.
7
The human "contaminome": bacterial, viral, and computational contamination in whole genome sequences from 1000 families. Sci Rep 2022; 12:9863. [PMID: 35701436 PMCID: PMC9198055 DOI: 10.1038/s41598-022-13269-z] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 01/11/2022] [Accepted: 05/18/2022] [Indexed: 01/11/2023]
Abstract
The unmapped readspace of whole genome sequencing data tends to be large but is often ignored. We posit that it contains valuable signals of both human infection and contamination. Using unmapped and poorly aligned reads from whole genome sequences (WGS) of over 1000 families and nearly 5000 individuals, we present insights into common viral, bacterial, and computational contamination that plagues whole genome sequencing studies. We present several notable results: (1) In addition to known contaminants such as Epstein-Barr virus and phiX, sequences from whole blood and lymphocyte cell lines contain many other contaminants, likely originating from storage, prep, and sequencing pipelines. (2) The sequencing plate and the biological source of a sample strongly influence its contamination profile. (3) Y-chromosome fragments not on the human reference genome commonly mismap to bacterial reference genomes. Both experiment-derived and computational contamination are prominent in next-generation sequencing data. Such contamination can compromise results from WGS as well as metagenomics studies, and standard protocols for identifying and removing contamination should be developed to ensure the fidelity of sequencing-based studies.
8
Steenwyk JL, Buida TJ III, Gonçalves C, Goltz DC, Morales G, Mead ME, LaBella AL, Chavez CM, Schmitz JE, Hadjifrangiskou M, Li Y, Rokas A. BioKIT: a versatile toolkit for processing and analyzing diverse types of sequence data. Genetics 2022; 221:6583183. [PMID: 35536198 PMCID: PMC9252278 DOI: 10.1093/genetics/iyac079] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/21/2022] [Accepted: 05/03/2022] [Indexed: 11/14/2022]
Abstract
Bioinformatic analysis (such as genome assembly quality assessment, alignment summary statistics, relative synonymous codon usage, file format conversion, and processing and analysis) is integrated into diverse disciplines in the biological sciences. Several command-line pieces of software have been developed to conduct some of these individual analyses, but unified toolkits that conduct all these analyses are lacking. To address this gap, we introduce BioKIT, a versatile command-line toolkit that has, upon publication, 42 functions, several of which were community-sourced, that conduct routine and novel processing and analysis of genome assemblies, multiple sequence alignments, coding sequences, sequencing data, and more. To demonstrate the utility of BioKIT, we conducted a comprehensive examination of relative synonymous codon usage across 171 fungal genomes that use alternative genetic codes, showed that the novel metric of gene-wise relative synonymous codon usage can accurately estimate gene-wise codon optimization, evaluated the quality and characteristics of 901 eukaryotic genome assemblies, and calculated alignment summary statistics for 10 phylogenomic data matrices. BioKIT will be helpful in facilitating and streamlining sequence analysis workflows. BioKIT is freely available under the MIT license from GitHub (https://github.com/JLSteenwyk/BioKIT), PyPI (https://pypi.org/project/jlsteenwyk-biokit/), and the Anaconda Cloud (https://anaconda.org/jlsteenwyk/jlsteenwyk-biokit). Documentation, user tutorials, and instructions for requesting new features are available online (https://jlsteenwyk.com/BioKIT).
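Editorial illustration: the abstract above refers to relative synonymous codon usage (RSCU). As a rough sketch of the commonly used definition (observed count of a codon divided by the mean count of all codons encoding the same amino acid), the Python below may help readers unfamiliar with the metric. It is not BioKIT's code or interface; the names `CODON_TABLE` and `rscu` are hypothetical, and the sketch assumes the standard genetic code, whereas the study examined genomes with alternative codes.

```python
from collections import Counter, defaultdict

# Standard genetic code (assumption: NCBI translation table 1),
# built from the usual compact T/C/A/G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def rscu(cds):
    """Return {codon: RSCU} for one in-frame coding sequence (hypothetical helper)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))

    # Group synonymous codons by the amino acid (or stop) they encode.
    families = defaultdict(list)
    for codon, amino_acid in CODON_TABLE.items():
        families[amino_acid].append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from the sequence: RSCU undefined
        mean = total / len(codons)
        for c in codons:
            values[c] = counts[c] / mean
    return values


if __name__ == "__main__":
    example = rscu("GCTGCTGCTGCC")  # three GCT and one GCC, all alanine
    print(example["GCT"], example["GCC"], example["GCA"])
```

In this toy input alanine is encoded four times across a four-codon family, so the mean count per codon is 1 and each RSCU value equals the raw codon count; values above 1 indicate a codon used more often than expected under uniform synonymous usage.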
Affiliation(s)
- Jacob L Steenwyk
- Department of Biological Sciences, Vanderbilt University, VU Station B #35-1634, Nashville, TN 37235, USA
- Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
- Carla Gonçalves
- Department of Biological Sciences, Vanderbilt University, VU Station B #35-1634, Nashville, TN 37235, USA
- Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
- Associate Laboratory i4HB-Institute for Health and Bioeconomy, NOVA School of Science and Technology, NOVA University Lisbon, 2819-516 Caparica, Portugal
- UCIBIO-Applied Molecular Biosciences Unit, Department of Life Sciences, NOVA School of Science and Technology, NOVA University Lisbon, 2819-516 Caparica, Portugal
- Grace Morales
- Department of Pathology, Microbiology & Immunology, Center for Personalized Microbiology, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Matthew E Mead
- Department of Biological Sciences, Vanderbilt University, VU Station B #35-1634, Nashville, TN 37235, USA
- Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
- Abigail L LaBella
- Department of Biological Sciences, Vanderbilt University, VU Station B #35-1634, Nashville, TN 37235, USA
- Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
- Christina M Chavez
- Department of Biological Sciences, Vanderbilt University, VU Station B #35-1634, Nashville, TN 37235, USA
- Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
- Jonathan E Schmitz
- Department of Pathology, Microbiology & Immunology, Center for Personalized Microbiology, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Maria Hadjifrangiskou
- Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
- Department of Pathology, Microbiology & Immunology, Center for Personalized Microbiology, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Yuanning Li
- Department of Biological Sciences, Vanderbilt University, VU Station B #35-1634, Nashville, TN 37235, USA
- Antonis Rokas
- Department of Biological Sciences, Vanderbilt University, VU Station B #35-1634, Nashville, TN 37235, USA
- Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
9
Sproviero D, Gagliardi S, Zucca S, Arigoni M, Giannini M, Garofalo M, Fantini V, Pansarasa O, Avenali M, Ramusino MC, Diamanti L, Minafra B, Perini G, Zangaglia R, Costa A, Ceroni M, Calogero RA, Cereda C. Extracellular Vesicles Derived From Plasma of Patients With Neurodegenerative Disease Have Common Transcriptomic Profiling. Front Aging Neurosci 2022; 14:785741. [PMID: 35250537 PMCID: PMC8889100 DOI: 10.3389/fnagi.2022.785741] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 09/29/2021] [Accepted: 01/13/2022] [Indexed: 11/15/2022]
Abstract
Objectives: There is a lack of effective biomarkers for neurodegenerative diseases (NDs) such as Alzheimer's disease (AD), Parkinson's disease (PD), amyotrophic lateral sclerosis (ALS), and frontotemporal dementia (FTD). Extracellular vesicle (EV) RNA cargo has interesting potential as a non-invasive biomarker for NDs. However, the abundance of EV-mRNAs and their contribution to neurodegeneration are not clear. Methods: Large and small EVs (LEVs and SEVs) were isolated from plasma of patients and healthy volunteers (control, CTR) by differential centrifugation and filtration, and RNA was extracted. Whole-transcriptome analysis was carried out using next generation sequencing (NGS). Results: Coding RNA (i.e., mRNA) but not long non-coding RNAs (lncRNAs) in SEVs and LEVs of patients with ALS could be distinguished from healthy CTRs and from other NDs using principal component analysis (PCA). Some mRNAs were found to be commonly deregulated in SEVs of patients with ALS and FTD, and they were classified in mRNA processing and splicing pathways. In LEVs, instead, one mRNA and one antisense RNA (i.e., MAP3K7CL and AP003068.3) were found to be in common among ALS, FTD, and PD. No deregulated mRNAs were found in EVs of patients with AD. Conclusion: Different RNA regulation occurs in LEVs and SEVs of NDs. mRNAs and lncRNAs are present in plasma-derived EVs of NDs, and there are common and specific transcripts that characterize LEVs and SEVs from the NDs considered in this study.
Affiliation(s)
- Daisy Sproviero
- Genomic and Post-genomic Unit, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Mondino Foundation, Pavia, Italy
| | - Stella Gagliardi
- Genomic and Post-genomic Unit, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Mondino Foundation, Pavia, Italy
- *Correspondence: Stella Gagliardi
| | - Susanna Zucca
- Genomic and Post-genomic Unit, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Mondino Foundation, Pavia, Italy
- EnGenome SRL, Pavia, Italy
| | - Maddalena Arigoni
- Department of Molecular Biotechnology and Health Sciences, Bioinformatics and Genomics Unit, University of Turin, Turin, Italy
| | - Marta Giannini
- Genomic and Post-genomic Unit, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Mondino Foundation, Pavia, Italy
- Department of Brain and Behavioral Sciences, University of Pavia, Pavia, Italy
| | - Maria Garofalo
- Genomic and Post-genomic Unit, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Mondino Foundation, Pavia, Italy
- Department of Biology and Biotechnology (“L. Spallanzani”), University of Pavia, Pavia, Italy
| | - Valentina Fantini
- Department of Brain and Behavioral Sciences, University of Pavia, Pavia, Italy
- Laboratory of Neurobiology and Neurogenetic, Golgi-Cenci Foundation, Milan, Italy
| | - Orietta Pansarasa
- Genomic and Post-genomic Unit, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Mondino Foundation, Pavia, Italy
| | - Micol Avenali
- Department of Brain and Behavioral Sciences, University of Pavia, Pavia, Italy
- Neurorehabilitation Unit, IRCCS Mondino Foundation, Pavia, Italy
| | - Matteo Cotta Ramusino
- Department of Brain and Behavioral Sciences, University of Pavia, Pavia, Italy
- Unit of Behavioral Neurology, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Mondino Foundation, Pavia, Italy
| | - Luca Diamanti
- Neuro-Oncology Unit, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Mondino Foundation, Pavia, Italy
| | - Brigida Minafra
- Parkinson Disease and Movement Disorders Unit, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Mondino Foundation, Pavia, Italy
| | - Giulia Perini
- Department of Brain and Behavioral Sciences, University of Pavia, Pavia, Italy
- Unit of Behavioral Neurology, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Mondino Foundation, Pavia, Italy
| | - Roberta Zangaglia
- Parkinson Disease and Movement Disorders Unit, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Mondino Foundation, Pavia, Italy
| | - Alfredo Costa
- Department of Brain and Behavioral Sciences, University of Pavia, Pavia, Italy
- Unit of Behavioral Neurology, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Mondino Foundation, Pavia, Italy
| | - Mauro Ceroni
- Department of Brain and Behavioral Sciences, University of Pavia, Pavia, Italy
- Unit of Behavioral Neurology, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Mondino Foundation, Pavia, Italy
| | - Raffaele A. Calogero
- Department of Molecular Biotechnology and Health Sciences, Bioinformatics and Genomics Unit, University of Turin, Turin, Italy
| | - Cristina Cereda
- Genomic and Post-genomic Unit, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Mondino Foundation, Pavia, Italy
| |
|
10
|
Peña-Bautista C, Álvarez-Sánchez L, Cañada-Martínez AJ, Baquero M, Cháfer-Pericás C. Epigenomics and Lipidomics Integration in Alzheimer Disease: Pathways Involved in Early Stages. Biomedicines 2021; 9:1812. [PMID: 34944628 PMCID: PMC8698767 DOI: 10.3390/biomedicines9121812] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 10/29/2021] [Revised: 11/23/2021] [Accepted: 11/29/2021] [Indexed: 01/17/2023] Open
Abstract
Background: Alzheimer Disease (AD) is the most prevalent dementia. However, the physiopathological mechanisms involved in its development are unclear. A multi-omics approach could help clarify them. Methods: Epigenomic and lipidomic analyses were carried out in plasma samples from patients with mild cognitive impairment (MCI) due to AD (n = 22) and healthy controls (n = 5). Then, omics integration between microRNAs (miRNAs) and lipids was performed by Sparse Partial Least Squares (sPLS) regression, and target genes for the selected miRNAs were identified. Results: 25 miRNAs and 25 lipids with higher loadings in the sPLS regression were selected. Lipids from the phosphatidylethanolamine (PE), lysophosphatidylcholine (LPC), ceramide, phosphatidylcholine (PC), triglyceride (TG) and several long-chain fatty acid families were identified as differentially expressed in AD. Among them, several fatty acids showed strong positive correlations with the miRNAs studied. In fact, these miRNAs regulate genes involved in fatty acid metabolism, such as elongation of very long-chain fatty acids (ELOVL) and fatty acid desaturases (FADs). Conclusions: The lipidomic–epigenomic integration showed that several lipids and miRNAs were differentially expressed in AD, with fatty acid mechanisms potentially involved in disease development. However, further targeted analyses should be carried out in a larger cohort in order to validate these preliminary results and study the proposed pathways in detail.
Affiliation(s)
- Carmen Peña-Bautista
- Alzheimer’s Disease Research Group, Health Research Institute La Fe, 46026 Valencia, Spain; (C.P.-B.); (L.Á.-S.); (M.B.)
| | - Lourdes Álvarez-Sánchez
- Alzheimer’s Disease Research Group, Health Research Institute La Fe, 46026 Valencia, Spain; (C.P.-B.); (L.Á.-S.); (M.B.)
- Division of Neurology, University and Polytechnic Hospital La Fe, 46026 Valencia, Spain
| | | | - Miguel Baquero
- Alzheimer’s Disease Research Group, Health Research Institute La Fe, 46026 Valencia, Spain; (C.P.-B.); (L.Á.-S.); (M.B.)
- Division of Neurology, University and Polytechnic Hospital La Fe, 46026 Valencia, Spain
| | - Consuelo Cháfer-Pericás
- Alzheimer’s Disease Research Group, Health Research Institute La Fe, 46026 Valencia, Spain; (C.P.-B.); (L.Á.-S.); (M.B.)
- Correspondence: ; Tel.: +34-96-124-67-21; Fax: +34-96-124-57-46
| |
|
11
|
Arslan E, Schulz J, Rai K. Machine Learning in Epigenomics: Insights into Cancer Biology and Medicine. Biochim Biophys Acta Rev Cancer 2021; 1876:188588. [PMID: 34245839 PMCID: PMC8595561 DOI: 10.1016/j.bbcan.2021.188588] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 03/09/2021] [Revised: 05/29/2021] [Accepted: 07/02/2021] [Indexed: 02/01/2023]
Abstract
The recent deluge of genome-wide technologies for mapping the epigenome, and of the resulting data in cancer samples, has provided the opportunity to gain insights into the roles of epigenetic processes in cancer. However, the complexity, high dimensionality, sparsity, and noise associated with these data pose challenges for extensive integrative analyses. Machine Learning (ML) algorithms are particularly suited for epigenomic data analyses due to their flexibility and ability to learn underlying hidden structures. We discuss four overlapping but distinct major categories under ML: dimensionality reduction, unsupervised methods, supervised methods, and deep learning (DL). We review the preferred use cases of these algorithms in analyses of cancer epigenomics data, in the hope of providing an overview of how ML approaches can be used to explore fundamental questions on the roles of the epigenome in cancer biology and medicine.
Affiliation(s)
- Emre Arslan
- Department of Genomic Medicine, MD Anderson Cancer Center, Houston, TX 77030, United States of America
| | - Jonathan Schulz
- Department of Genomic Medicine, MD Anderson Cancer Center, Houston, TX 77030, United States of America
| | - Kunal Rai
- Department of Genomic Medicine, MD Anderson Cancer Center, Houston, TX 77030, United States of America.
| |
|
12
|
Tangaro MA, Mandreoli P, Chiara M, Donvito G, Antonacci M, Parisi A, Bianco A, Romano A, Bianchi DM, Cangelosi D, Uva P, Molineris I, Nosi V, Calogero RA, Alessandri L, Pedrini E, Mordenti M, Bonetti E, Sangiorgi L, Pesole G, Zambelli F. Laniakea@ReCaS: exploring the potential of customisable Galaxy on-demand instances as a cloud-based service. BMC Bioinformatics 2021; 22:544. [PMID: 34749633 PMCID: PMC8574934 DOI: 10.1186/s12859-021-04401-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2021] [Accepted: 09/24/2021] [Indexed: 11/16/2022] Open
Abstract
BACKGROUND Improving the availability and usability of data and analytical tools is a critical precondition for further advancing modern biological and biomedical research. For instance, one of the many ramifications of the COVID-19 global pandemic has been to make even more evident the importance of having bioinformatics tools and data readily actionable by researchers through convenient access points and supported by adequate IT infrastructures. One of the most successful efforts in improving the availability and usability of bioinformatics tools and data is represented by the Galaxy workflow manager and its thriving community. In 2020 we introduced Laniakea, a software platform conceived to streamline the configuration and deployment of "on-demand" Galaxy instances over the cloud. By facilitating the set-up and configuration of Galaxy web servers, Laniakea provides researchers with a powerful and highly customisable platform for executing complex bioinformatics analyses. The system can be accessed through a dedicated and user-friendly web interface that allows the Galaxy web server's initial configuration and deployment. RESULTS "Laniakea@ReCaS", the first instance of a Laniakea-based service, is managed by ELIXIR-IT and was officially launched in February 2020, after about one year of development and testing that involved several users. Researchers can request access to Laniakea@ReCaS through an open-ended call for use-cases. Ten project proposals have been accepted since then, totalling 18 Galaxy on-demand virtual servers that employ ~ 100 CPUs, ~ 250 GB of RAM and ~ 5 TB of storage and serve several different communities and purposes. Herein, we present eight use cases demonstrating the versatility of the platform. 
CONCLUSIONS During this first year of activity, the Laniakea-based service emerged as a flexible platform that facilitated the rapid development of bioinformatics tools, the efficient delivery of training activities, and the provision of public bioinformatics services in different settings, including food safety and clinical research. Laniakea@ReCaS provides a proof of concept of how enabling access to appropriate, reliable IT resources and ready-to-use bioinformatics tools can considerably streamline researchers' work.
Affiliation(s)
- Marco Antonio Tangaro
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR), Via Giovanni Amendola 122/O, 70126, Bari, Italy
- National Institute for Nuclear Physics (INFN), Section of Bari, Via Orabona 4, 70126, Bari, Italy
| | - Pietro Mandreoli
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR), Via Giovanni Amendola 122/O, 70126, Bari, Italy
- Department of Biosciences, University of Milan, Via Celoria 26, 20133, Milano, Italy
| | - Matteo Chiara
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR), Via Giovanni Amendola 122/O, 70126, Bari, Italy
- Department of Biosciences, University of Milan, Via Celoria 26, 20133, Milano, Italy
| | - Giacinto Donvito
- National Institute for Nuclear Physics (INFN), Section of Bari, Via Orabona 4, 70126, Bari, Italy
| | - Marica Antonacci
- National Institute for Nuclear Physics (INFN), Section of Bari, Via Orabona 4, 70126, Bari, Italy
| | - Antonio Parisi
- Istituto Zooprofilattico Sperimentale Della Puglia e Della Basilicata, Via Manfredonia 20, 71121, Foggia, Italy
| | - Angelica Bianco
- Istituto Zooprofilattico Sperimentale Della Puglia e Della Basilicata, Via Manfredonia 20, 71121, Foggia, Italy
| | - Angelo Romano
- National Reference Laboratory for Coagulase-Positive Staphylococci Including Staphylococcus Aureus, Istituto Zooprofilattico Sperimentale del Piemonte, Liguria e Valle d'Aosta, Via Bologna 148, 10154, Turin, Italy
| | - Daniela Manila Bianchi
- National Reference Laboratory for Coagulase-Positive Staphylococci Including Staphylococcus Aureus, Istituto Zooprofilattico Sperimentale del Piemonte, Liguria e Valle d'Aosta, Via Bologna 148, 10154, Turin, Italy
| | - Davide Cangelosi
- Clinical Bioinformatics Unit, Scientific Direction, IRCCS Istituto Giannina Gaslini, Via Gerolamo Gaslini 5, 16147, Genova, Italy
| | - Paolo Uva
- Clinical Bioinformatics Unit, Scientific Direction, IRCCS Istituto Giannina Gaslini, Via Gerolamo Gaslini 5, 16147, Genova, Italy
- Italian Institute of Technology, Via Morego 30, 16163, Genova, Italy
| | - Ivan Molineris
- Department of Life Science and System Biology, University of Turin, Via Accademia Albertina, 13-1023, Turin, Italy
| | - Vladimir Nosi
- Department of Computer Science, University of Turin, Via Pessinetto 12, 10049, Turin, Italy
| | - Raffaele A Calogero
- Department of Molecular Biotechnology and Health Sciences, Via Nizza 52, 10126, Turin, Italy
| | - Luca Alessandri
- Department of Molecular Biotechnology and Health Sciences, Via Nizza 52, 10126, Turin, Italy
| | - Elena Pedrini
- Department of Rare Skeletal Disorders, IRCCS Istituto Ortopedico Rizzoli, Via di Barbiano 1/10, 40136, Bologna, Italy
| | - Marina Mordenti
- Department of Rare Skeletal Disorders, IRCCS Istituto Ortopedico Rizzoli, Via di Barbiano 1/10, 40136, Bologna, Italy
| | - Emanuele Bonetti
- Department of Rare Skeletal Disorders, IRCCS Istituto Ortopedico Rizzoli, Via di Barbiano 1/10, 40136, Bologna, Italy
- Department of Experimental Oncology, European Institute of Oncology, Via Adamello 16, 20139, Milan, Italy
| | - Luca Sangiorgi
- Department of Rare Skeletal Disorders, IRCCS Istituto Ortopedico Rizzoli, Via di Barbiano 1/10, 40136, Bologna, Italy
| | - Graziano Pesole
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR), Via Giovanni Amendola 122/O, 70126, Bari, Italy.
- Department of Biosciences, Biotechnologies and Biopharmaceutics, University of Bari, Via Orabona 4, 70126, Bari, Italy.
| | - Federico Zambelli
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR), Via Giovanni Amendola 122/O, 70126, Bari, Italy.
- Department of Biosciences, University of Milan, Via Celoria 26, 20133, Milano, Italy.
| |
|
13
|
Francavilla A, Gagliardi A, Piaggeschi G, Tarallo S, Cordero F, Pensa RG, Impeduglia A, Caviglia GP, Ribaldone DG, Gallo G, Grioni S, Ferrero G, Pardini B, Naccarati A. Faecal miRNA profiles associated with age, sex, BMI, and lifestyle habits in healthy individuals. Sci Rep 2021; 11:20645. [PMID: 34667192 PMCID: PMC8526833 DOI: 10.1038/s41598-021-00014-1] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 07/22/2021] [Accepted: 10/05/2021] [Indexed: 12/14/2022] Open
Abstract
Owing to their stability and detectability, faecal microRNAs (miRNAs) are promising molecules with potential clinical interest as non-invasive diagnostic and prognostic biomarkers. However, there is no evidence on how stool miRNA profiles change according to an individual’s age, sex, and body mass index (BMI) or how lifestyle habits influence the expression levels of these molecules. We explored the relationship between stool miRNA levels and common traits (sex, age, BMI, and menopausal status) or lifestyle habits (physical activity, smoking status, coffee, and alcohol consumption) as derived from a self-reported questionnaire, using small RNA-sequencing data of samples from 335 healthy subjects. We detected 151 differentially expressed miRNAs associated with one variable and 52 associated with at least two. Differences in miR-638 levels were associated with age, sex, BMI, and smoking status. The highest number of differentially expressed miRNAs was associated with BMI (n = 92) and smoking status (n = 84), with several miRNAs shared between them. Functional enrichment analyses revealed the involvement of the miRNA target genes in pathways coherent with the analysed variables. Our findings suggest that miRNA profiles in stool may reflect common traits and lifestyle habits and should be considered in relation to disease and association studies based on faecal miRNA expression.
Affiliation(s)
- Antonio Francavilla
- Italian Institute for Genomic Medicine (IIGM), c/o IRCCS Candiolo, Candiolo, Turin, Italy
- Candiolo Cancer Institute, FPO-IRCCS, Candiolo, Turin, Italy
| | - Amedeo Gagliardi
- Italian Institute for Genomic Medicine (IIGM), c/o IRCCS Candiolo, Candiolo, Turin, Italy
- Candiolo Cancer Institute, FPO-IRCCS, Candiolo, Turin, Italy
| | - Giulia Piaggeschi
- Italian Institute for Genomic Medicine (IIGM), c/o IRCCS Candiolo, Candiolo, Turin, Italy
- Candiolo Cancer Institute, FPO-IRCCS, Candiolo, Turin, Italy
| | - Sonia Tarallo
- Italian Institute for Genomic Medicine (IIGM), c/o IRCCS Candiolo, Candiolo, Turin, Italy
- Candiolo Cancer Institute, FPO-IRCCS, Candiolo, Turin, Italy
| | | | - Ruggero G Pensa
- Department of Computer Science, University of Turin, Turin, Italy
| | | | - Gian Paolo Caviglia
- Division of Gastroenterology, Department of Medical Sciences, University of Turin, Turin, Italy
| | | | - Gaetano Gallo
- Department of Medical and Surgical Sciences, University of Catanzaro, Catanzaro, Italy
| | - Sara Grioni
- Epidemiology and Prevention Unit, Fondazione IRCCS Istituto Nazionale Dei Tumori Di Milano, Milan, Italy
| | - Giulio Ferrero
- Department of Computer Science, University of Turin, Turin, Italy
- Department of Clinical and Biological Sciences, University of Turin, Turin, Italy
| | - Barbara Pardini
- Italian Institute for Genomic Medicine (IIGM), c/o IRCCS Candiolo, Candiolo, Turin, Italy
- Candiolo Cancer Institute, FPO-IRCCS, Candiolo, Turin, Italy
| | - Alessio Naccarati
- Italian Institute for Genomic Medicine (IIGM), c/o IRCCS Candiolo, Candiolo, Turin, Italy
- Candiolo Cancer Institute, FPO-IRCCS, Candiolo, Turin, Italy
| |
|
14
|
Frequent mutations of FBXO11 highlight BCL6 as a therapeutic target in Burkitt lymphoma. Blood Adv 2021; 5:5239-5257. [PMID: 34625792 DOI: 10.1182/bloodadvances.2021005682] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/01/2021] [Accepted: 09/07/2021] [Indexed: 11/20/2022] Open
Abstract
The expression of BCL6 in B cell lymphoma can be deregulated by chromosomal translocations, somatic mutations in the promoter regulatory regions or reduced proteasome-mediated degradation. FBXO11 was recently identified as a ubiquitin ligase involved in the degradation of BCL6 and is frequently inactivated in lymphoma or other tumors. Here, we show that FBXO11 mutations are found in 23% of Burkitt lymphoma (BL) patients. FBXO11 mutations impaired BCL6 degradation and the deletion of FBXO11 protein completely stabilized BCL6 levels in human BL cell lines. Conditional deletion of either one or two copies of the FBXO11 gene in mice cooperated with oncogenic MYC and accelerated B cell lymphoma onset, providing experimental evidence that FBXO11 is a haplo-insufficient oncosuppressor in B cell lymphoma. In WT and FBXO11-deficient BL mouse and human cell lines, targeting BCL6 via specific degrader or inhibitors partially impaired lymphoma growth in vitro and in vivo. Inhibition of MYC by the Omomyc mini-protein blocked cell proliferation and increased apoptosis, effects further increased by combined BCL6 targeting. Thus, by validating the functional role of FBXO11 mutations in BL we further highlight the key role of BCL6 in BL biology and provide evidence that innovative therapeutic approaches such as BCL6 degraders and direct MYC inhibition could be exploited as a targeted therapy for BL.
|
15
|
Orchestrating and sharing large multimodal data for transparent and reproducible research. Nat Commun 2021; 12:5797. [PMID: 34608132 PMCID: PMC8490371 DOI: 10.1038/s41467-021-25974-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/18/2021] [Accepted: 09/08/2021] [Indexed: 11/08/2022] Open
Abstract
Reproducibility is essential to open science, as findings that cannot be reproduced by independent research groups have limited relevance, regardless of their validity. It is therefore crucial for scientists to describe their experiments in sufficient detail so they can be reproduced, scrutinized, challenged, and built upon. However, the intrinsic complexity and continuous growth of biomedical data make them increasingly difficult to process, analyze, and share with the community in a FAIR (findable, accessible, interoperable, and reusable) manner. To overcome these issues, we created a cloud-based platform called ORCESTRA (orcestra.ca), which provides a flexible framework for the reproducible processing of multimodal biomedical data. It enables processing of clinical, genomic and perturbation profiles of cancer samples through automated processing pipelines that are user-customizable. ORCESTRA creates integrated and fully documented data objects with persistent identifiers (DOI) and manages multiple dataset versions, which can be shared for future studies.
|
16
|
Chi LH, Wu ATH, Hsiao M, Li YC(J). A Transcriptomic Analysis of Head and Neck Squamous Cell Carcinomas for Prognostic Indications. J Pers Med 2021; 11:782. [PMID: 34442426 PMCID: PMC8399099 DOI: 10.3390/jpm11080782] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/06/2021] [Revised: 08/03/2021] [Accepted: 08/04/2021] [Indexed: 01/27/2023] Open
Abstract
Survival analysis of the Cancer Genome Atlas (TCGA) dataset is a well-known method for discovering gene expression-based prognostic biomarkers of head and neck squamous cell carcinoma (HNSCC). A cutoff point is usually used in survival analysis for patient dichotomization when using continuous gene expression values. Some optimization software exists for cutoff determination. However, the software's predetermined cutoffs are usually set at the medians or quantiles of gene expression values. There are also few clinicopathological features available in pre-processed datasets. We applied an in-house workflow, including data retrieval and pre-processing, feature selection, sliding-window cutoff selection, Kaplan-Meier survival analysis, and Cox proportional hazard modeling, for biomarker discovery. In our approach for the TCGA HNSCC cohort, we scanned human protein-coding genes to find optimal cutoff values. After adjustment for confounders, clinical tumor stage and surgical margin involvement were found to be independent risk factors for prognosis. According to the results tables that show hazard ratios with Bonferroni-adjusted p values under the optimal cutoff, three biomarker candidates, CAMK2N1, CALML5, and FCGBP, are significantly associated with overall survival. We validated this discovery using another independent HNSCC dataset (GSE65858). Thus, we suggest that transcriptomic analysis could help with biomarker discovery. Moreover, the robustness of the biomarkers we identified should be ensured through several additional tests with independent datasets.
Affiliation(s)
- Li-Hsing Chi
- The Ph.D. Program for Translational Medicine, College of Medical Science and Technology, Taipei Medical University and Academia Sinica, Taipei 11031, Taiwan; (L.-H.C.); (A.T.H.W.)
- Division of Oral and Maxillofacial Surgery, Department of Dentistry, Wan Fang Hospital, Taipei Medical University, Taipei 11600, Taiwan
- Division of Oral and Maxillofacial Surgery, Department of Dentistry, Taipei Medical University Hospital, Taipei Medical University, Taipei 11031, Taiwan
| | - Alexander T. H. Wu
- The Ph.D. Program for Translational Medicine, College of Medical Science and Technology, Taipei Medical University and Academia Sinica, Taipei 11031, Taiwan; (L.-H.C.); (A.T.H.W.)
| | - Michael Hsiao
- Genomics Research Center, Academia Sinica, Taipei 115024, Taiwan
- Department of Biochemistry, College of Medicine, Kaohsiung Medical University, Kaohsiung 807378, Taiwan
| | - Yu-Chuan (Jack) Li
- The Ph.D. Program for Translational Medicine, College of Medical Science and Technology, Taipei Medical University and Academia Sinica, Taipei 11031, Taiwan; (L.-H.C.); (A.T.H.W.)
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, No.172-1, Sec. 2, Keelung Rd., Taipei 106339, Taiwan
| |
|
17
|
John A, Muenzen K, Ausmees K. Evaluation of serverless computing for scalable execution of a joint variant calling workflow. PLoS One 2021; 16:e0254363. [PMID: 34242357 PMCID: PMC8270184 DOI: 10.1371/journal.pone.0254363] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2020] [Accepted: 06/24/2021] [Indexed: 11/18/2022] Open
Abstract
Advances in whole-genome sequencing have greatly reduced the cost and time of obtaining raw genetic information, but the computational requirements of analysis remain a challenge. Serverless computing has emerged as an alternative to using dedicated compute resources, but its utility has not been widely evaluated for standardized genomic workflows. In this study, we define and execute a best-practice joint variant calling workflow using the SWEEP workflow management system. We present an analysis of performance and scalability, and discuss the utility of the serverless paradigm for executing workflows in the field of genomics research. The GATK best-practice short germline joint variant calling pipeline was implemented as a SWEEP workflow comprising 18 tasks. The workflow was executed on Illumina paired-end read samples from the European and African super populations of the 1000 Genomes project phase III. Cost and runtime increased linearly with increasing sample size, although runtime was driven primarily by a single task for larger problem sizes. Execution took a minimum of around 3 hours for 2 samples, up to nearly 13 hours for 62 samples, with costs ranging from $2 to $70.
Affiliation(s)
- Aji John
- Department of Biology, University of Washington, Seattle, Washington, United States of America
- * E-mail:
| | - Kathleen Muenzen
- Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, Washington, United States of America
| | - Kristiina Ausmees
- Department of Information Technology, Uppsala University, Uppsala, Sweden
| |
|
18
|
Melendrez MC, Shaw S, Brown CT, Goodner BW, Kvaal C. Editorial: Curriculum Applications in Microbiology: Bioinformatics in the Classroom. Front Microbiol 2021; 12:705233. [PMID: 34276638 PMCID: PMC8281245 DOI: 10.3389/fmicb.2021.705233] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2021] [Accepted: 06/07/2021] [Indexed: 11/18/2022] Open
Affiliation(s)
| | - Sophie Shaw
- Centre for Genome Enabled Biology and Medicine, University of Aberdeen, Aberdeen, United Kingdom
| | - C Titus Brown
- Department of Population Health and Reproduction, University of California, Davis, Davis, CA, United States
| | | | - Christopher Kvaal
- Department of Biology, St. Cloud State University, St. Cloud, MN, United States
| |
|
19
|
Ferrero G, Licheri N, De Bortoli M, Calogero RA, Beccuti M, Cordero F. Computational Analysis of circRNA Expression Data. Methods Mol Biol 2021; 2284:181-192. [PMID: 33835443 DOI: 10.1007/978-1-0716-1307-8_10] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 02/17/2023]
Abstract
Analysis of circular RNA (circRNA) expression from RNA-Seq data can be performed with different algorithms and analysis pipelines, tools that allow the extraction of heterogeneous information on the expression of this novel class of RNAs. Computational pipelines were developed to facilitate the analysis of circRNA expression by combining different public tools into easy-to-use pipelines. This chapter describes the complete workflow for a computationally reproducible analysis of circRNA expression starting from a public RNA-Seq experiment. The main steps of circRNA prediction, annotation, classification, sequence reconstruction, quantification, and differential expression are illustrated.
Affiliation(s)
- Giulio Ferrero
- Department of Computer Science, University of Turin, Turin, Italy
- Department of Clinical and Biological Sciences, University of Turin, Orbassano, Italy
| | - Nicola Licheri
- Department of Computer Science, University of Turin, Turin, Italy
| | - Michele De Bortoli
- Department of Clinical and Biological Sciences, University of Turin, Orbassano, Italy
| | - Raffaele A Calogero
- Department of Molecular Biotechnology and Health Sciences, University of Turin, Turin, Italy
| | - Marco Beccuti
- Department of Computer Science, University of Turin, Turin, Italy
| | | |
|
20
|
Righelli D, Angelini C. Easyreporting simplifies the implementation of Reproducible Research layers in R software. PLoS One 2021; 16:e0244122. [PMID: 33970927 PMCID: PMC8109797 DOI: 10.1371/journal.pone.0244122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2020] [Accepted: 04/20/2021] [Indexed: 11/19/2022] Open
Abstract
In recent years, "irreproducibility" has become a general problem in omics data analysis due to the use of sophisticated and poorly described computational procedures. To avoid misleading results, it is necessary to inspect and reproduce the entire data analysis as a unified product. Reproducible Research (RR) provides general guidelines for public access to the analytic data and related analysis code combined with natural language documentation, allowing third parties to reproduce the findings. We developed easyreporting, a novel R/Bioconductor package, to facilitate the implementation of an RR layer inside reports/tools. We describe the main functionalities and illustrate the organization of an analysis report using a typical case study concerning the analysis of RNA-seq data. Then, we show how to use easyreporting in other projects to trace R functions automatically. This latter feature helps developers to implement procedures that automatically keep track of the analysis steps. Easyreporting can be useful in supporting the reproducibility of any data analysis project and shows great advantages for the implementation of R packages and GUIs. It is particularly helpful in bioinformatics, where the complexity of the analyses makes it extremely difficult to trace all the steps and parameters used in the study.
Affiliation(s)
- Dario Righelli
- Department of Statistical Sciences, University of Padova, Padua, Italy
- Istituto per le Applicazioni del Calcolo “Mauro Picone”, National Research Council, Naples, Italy
- * E-mail: (DR); (CA)
| | - Claudia Angelini
- Istituto per le Applicazioni del Calcolo “Mauro Picone”, National Research Council, Naples, Italy
- * E-mail: (DR); (CA)
| |
|
21
|
Sproviero D, Gagliardi S, Zucca S, Arigoni M, Giannini M, Garofalo M, Olivero M, Dell’Orco M, Pansarasa O, Bernuzzi S, Avenali M, Cotta Ramusino M, Diamanti L, Minafra B, Perini G, Zangaglia R, Costa A, Ceroni M, Perrone-Bizzozero NI, Calogero RA, Cereda C. Different miRNA Profiles in Plasma Derived Small and Large Extracellular Vesicles from Patients with Neurodegenerative Diseases. Int J Mol Sci 2021; 22:ijms22052737. [PMID: 33800495 PMCID: PMC7962970 DOI: 10.3390/ijms22052737] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Received: 02/05/2021] [Revised: 03/01/2021] [Accepted: 03/02/2021] [Indexed: 12/11/2022] Open
Abstract
Identifying biomarkers is essential for early diagnosis of neurodegenerative diseases (NDs). Large (LEVs) and small extracellular vesicles (SEVs) are extracellular vesicles (EVs) of different sizes and biological functions transported in blood, and they may be valid biomarkers for NDs. The aim of our study was to investigate common and different miRNA signatures in plasma-derived LEVs and SEVs of Alzheimer’s disease (AD), Parkinson’s disease (PD), Amyotrophic Lateral Sclerosis (ALS) and Fronto-Temporal Dementia (FTD) patients. LEVs and SEVs were isolated from plasma of patients and healthy volunteers (CTR) by filtration and differential centrifugation, and RNA was extracted. Small RNA libraries were sequenced by Next Generation Sequencing (NGS). MiRNAs discriminate all NDs from CTRs and can provide a signature for each ND. Common enriched pathways for SEVs were linked to ubiquitin-mediated proteolysis and Toll-like receptor signaling pathways, and for LEVs to neurotrophin signaling and the Glycosphingolipid biosynthesis pathway. LEVs and SEVs are involved in different pathways, and this might give specificity to their role in the spreading of the disease. The study of common and different miRNAs transported by LEVs and SEVs can be of great interest for biomarker discovery and for pathogenesis studies in neurodegeneration.
Affiliation(s)
- Daisy Sproviero
- Genomic and post-Genomic Unit, IRCCS Mondino Foundation, 27100 Pavia, Italy; (D.S.); (S.G.); (S.Z.); (M.G.); (M.G.); (O.P.)
| | - Stella Gagliardi
- Genomic and post-Genomic Unit, IRCCS Mondino Foundation, 27100 Pavia, Italy; (D.S.); (S.G.); (S.Z.); (M.G.); (M.G.); (O.P.)
| | - Susanna Zucca
- Genomic and post-Genomic Unit, IRCCS Mondino Foundation, 27100 Pavia, Italy; (D.S.); (S.G.); (S.Z.); (M.G.); (M.G.); (O.P.)
- EnGenome SRL, 27100 Pavia, Italy
| | - Maddalena Arigoni
- Department of Molecular Biotechnology and Health Sciences, Bioinformatics and Genomics Unit, University of Turin, 10126 Turin, Italy; (M.A.); (R.A.C.)
| | - Marta Giannini
- Genomic and post-Genomic Unit, IRCCS Mondino Foundation, 27100 Pavia, Italy; (D.S.); (S.G.); (S.Z.); (M.G.); (M.G.); (O.P.)
- Department of Brain and Behavioral Sciences, University of Pavia, 27100 Pavia, Italy;
| | - Maria Garofalo
- Genomic and post-Genomic Unit, IRCCS Mondino Foundation, 27100 Pavia, Italy; (D.S.); (S.G.); (S.Z.); (M.G.); (M.G.); (O.P.)
- Department of Biology and Biotechnology (“L. Spallanzani”), University of Pavia, 27100 Pavia, Italy
| | - Martina Olivero
- Department of Oncology, University of Turin, 10060 Turin, Italy;
| | - Michela Dell’Orco
- Departments of Neurosciences, University of New Mexico School of Medicine, Albuquerque, NM 87131, USA;
| | - Orietta Pansarasa
- Genomic and post-Genomic Unit, IRCCS Mondino Foundation, 27100 Pavia, Italy; (D.S.); (S.G.); (S.Z.); (M.G.); (M.G.); (O.P.)
| | - Stefano Bernuzzi
- Immunohematological and Transfusional Service and Centre of Transplantation Immunology, IRCCS “San Matteo Foundation”, 27100 Pavia, Italy;
| | - Micol Avenali
- Neurorehabilitation Unit, IRCCS Mondino Foundation, 27100 Pavia, Italy;
| | - Matteo Cotta Ramusino
- Unit of Behavioral Neurology, IRCCS Mondino Foundation, 27100 Pavia, Italy; (M.C.R.); (G.P.); (M.C.)
| | - Luca Diamanti
- Neuro-Oncology Unit, IRCCS Mondino Foundation, 27100 Pavia, Italy;
| | - Brigida Minafra
- Parkinson Unit and Movement Disorders Mondino Foundation IRCCS, 27100 Pavia, Italy; (B.M.); (R.Z.)
| | - Giulia Perini
- Unit of Behavioral Neurology, IRCCS Mondino Foundation, 27100 Pavia, Italy; (M.C.R.); (G.P.); (M.C.)
| | - Roberta Zangaglia
- Parkinson Unit and Movement Disorders Mondino Foundation IRCCS, 27100 Pavia, Italy; (B.M.); (R.Z.)
| | - Alfredo Costa
- Department of Brain and Behavioral Sciences, University of Pavia, 27100 Pavia, Italy;
- Unit of Behavioral Neurology, IRCCS Mondino Foundation, 27100 Pavia, Italy; (M.C.R.); (G.P.); (M.C.)
| | - Mauro Ceroni
- Department of Brain and Behavioral Sciences, University of Pavia, 27100 Pavia, Italy;
- Unit of Behavioral Neurology, IRCCS Mondino Foundation, 27100 Pavia, Italy; (M.C.R.); (G.P.); (M.C.)
| | - Nora I. Perrone-Bizzozero
- Departments of Neurosciences and Psychiatry and Behavioral Health, University of New Mexico School of Medicine, Albuquerque, NM 87131, USA;
| | - Raffaele A. Calogero
- Department of Molecular Biotechnology and Health Sciences, Bioinformatics and Genomics Unit, University of Turin, 10126 Turin, Italy; (M.A.); (R.A.C.)
| | - Cristina Cereda
- Genomic and post-Genomic Unit, IRCCS Mondino Foundation, 27100 Pavia, Italy; (D.S.); (S.G.); (S.Z.); (M.G.); (M.G.); (O.P.)
- Correspondence: ; Tel.: +39-0382380348
| |
|
22
|
Ferrero G, Carpi S, Polini B, Pardini B, Nieri P, Impeduglia A, Grioni S, Tarallo S, Naccarati A. Intake of Natural Compounds and Circulating microRNA Expression Levels: Their Relationship Investigated in Healthy Subjects With Different Dietary Habits. Front Pharmacol 2021; 11:619200. [PMID: 33519486 PMCID: PMC7840481 DOI: 10.3389/fphar.2020.619200] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 10/19/2020] [Accepted: 11/30/2020] [Indexed: 12/12/2022] Open
Abstract
Diet has a strong influence on many physiological processes, which in turn have important implications on a variety of pathological conditions. In this respect, microRNAs (miRNAs), a class of small non-coding RNAs playing a relevant epigenetic role in controlling gene expression, may represent mediators between the dietary intake and the health status. Despite great advances in the field of nutri-epigenomics, it remains unclear how miRNA expression is modulated by the diet and, specifically, the intake of specific nutrients. We investigated the whole circulating miRNome by small RNA-sequencing performed on plasma samples of 120 healthy volunteers with different dietary habits (vegans, vegetarians, and omnivores). Dietary intakes of specific nutrients were estimated for each subject from the information reported in the food-frequency questionnaire previously validated in the EPIC study. We focused here on the intake of 23 natural compounds (NCs) of the classes of lipids, micro-elements, and vitamins. We identified 78 significant correlations (rho > 0.300, p-value < 0.05) among the estimated daily intake of 13 NCs and the expression levels of 58 plasma miRNAs. Overall, vitamin D, sodium, and vitamin E correlated with the largest number of miRNAs. All the identified correlations were consistent among the three dietary groups and 22 of them were confirmed as significant (p-value < 0.05) by age-, gender-, and body-mass index-adjusted Generalized Linear regression Model analysis. miR-23a-3p expression levels were related to different NCs, including a significant positive correlation with sodium (rho = 0.377) and significant negative correlations with lipid-related NCs and vitamin E. Conversely, the estimated intake of vitamin D was negatively correlated with the expression of the highest number of circulating miRNAs, particularly miR-1277-5p (rho = −0.393) and miR-144-3p (rho = −0.393). Functional analysis of the targets of sodium intake-correlated miRNAs highlighted terms related to cardiac development. A similar approach on targets of those miRNAs correlated with vitamin D intake showed an enrichment in genes involved in hormone metabolism, while the response to chronic inflammation was among the top enriched processes involving targets of miRNAs negatively related to vitamin E intake. Our findings show that nutrients through the habitual diet influence circulating miRNA profiles and highlight that this aspect must be considered in nutri-epigenomic research.
Affiliation(s)
- Giulio Ferrero
- Department of Clinical and Biological Sciences, University of Turin, Torino, Italy; Department of Computer Science, University of Turin, Torino, Italy
| | - Sara Carpi
- Department of Pharmacy, University of Pisa, Pisa, Italy; NEST, Istituto Nanoscienze-CNR and Scuola Normale Superiore, Pisa, Italy
| | | | - Barbara Pardini
- Italian Institute for Genomic Medicine (IIGM), c/o IRCCS Candiolo, Torino, Italy; Candiolo Cancer Institute, FPO-IRCCS, Torino, Italy
| | - Paola Nieri
- Department of Pharmacy, University of Pisa, Pisa, Italy
| | | | - Sara Grioni
- Epidemiology and Prevention Unit, Fondazione IRCCS Istituto Nazionale dei Tumori di Milano, Milan, Italy
| | - Sonia Tarallo
- Italian Institute for Genomic Medicine (IIGM), c/o IRCCS Candiolo, Torino, Italy; Candiolo Cancer Institute, FPO-IRCCS, Torino, Italy
| | - Alessio Naccarati
- Italian Institute for Genomic Medicine (IIGM), c/o IRCCS Candiolo, Torino, Italy; Candiolo Cancer Institute, FPO-IRCCS, Torino, Italy
| |
|
23
|
Kanzi AM, San JE, Chimukangara B, Wilkinson E, Fish M, Ramsuran V, de Oliveira T. Next Generation Sequencing and Bioinformatics Analysis of Family Genetic Inheritance. Front Genet 2020; 11:544162. [PMID: 33193618 PMCID: PMC7649788 DOI: 10.3389/fgene.2020.544162] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 03/20/2020] [Accepted: 09/21/2020] [Indexed: 12/29/2022] Open
Abstract
Mendelian and complex genetic trait diseases continue to burden and affect society both socially and economically. The lack of effective tests has hampered diagnosis; thus, affected individuals lack a proper prognosis. Mendelian diseases are caused by genetic mutations in a single gene, while complex trait diseases are caused by the accumulation of mutations in either linked or unlinked genomic regions. Significant advances have been made in identifying novel disease-associated mutations, especially with the introduction of next generation and third generation sequencing. Regardless, some diseases are still without diagnosis, as most tests rely on SNP genotyping panels developed from population-based genetic analyses. Analysis of family genetic inheritance using whole genomes, whole exomes or a panel of genes has been shown to be effective in identifying disease-causing mutations. In this review, we discuss next generation and third generation sequencing platforms, bioinformatic tools and genetic resources commonly used to analyze family-based genomic data, with a focus on identifying inherited or novel disease-causing mutations. Additionally, we highlight the analytical, ethical and regulatory challenges associated with analyzing personal genomes, which constitute the data used for family genetic inheritance.
Affiliation(s)
- Aquillah M. Kanzi
- Kwazulu-Natal Research and Innovation Sequencing Platform (KRISP), School of Laboratory Medicine and Medical Sciences, College of Health Sciences, University of KwaZulu-Natal, Durban, South Africa
| | | | | | | | | | | | | |
|
24
|
Castagno P, Pernice S, Ghetti G, Povero M, Pradelli L, Paolotti D, Balbo G, Sereno M, Beccuti M. A computational framework for modeling and studying pertussis epidemiology and vaccination. BMC Bioinformatics 2020; 21:344. [PMID: 32938370 PMCID: PMC7492136 DOI: 10.1186/s12859-020-03648-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/02/2020] [Accepted: 07/09/2020] [Indexed: 01/05/2023] Open
Abstract
BACKGROUND Emerging and re-emerging infectious diseases such as Zika, SARS, COVID-19 and Pertussis pose a compelling challenge for epidemiologists due to their significant impact on global public health. In this context, computational models and computer simulations are among the available research tools that epidemiologists can exploit to better understand the spreading characteristics of these diseases and to decide on vaccination policies, human interaction controls, and other social measures to counter, mitigate or simply delay the spread of infectious diseases. Nevertheless, the construction of mathematical models for these diseases and their solution remains a challenging task, because little effort has been devoted to the definition of a general framework easily accessible even by researchers without advanced modelling and mathematical skills. RESULTS In this paper we describe a new general modeling framework to study epidemiological systems, whose novelties and strengths are: (1) the use of a graphical formalism to simplify the model creation phase; (2) the implementation of an R package providing a friendly interface to access the analysis techniques implemented in the framework; (3) a high level of portability and reproducibility granted by the containerization of all analysis techniques implemented in the framework; (4) a well-defined schema and related infrastructure to allow users to easily integrate their own analysis workflow in the framework. Then, the effectiveness of this framework is shown through a case study in which we investigate the pertussis epidemiology in Italy. CONCLUSIONS We propose a new general modeling framework for the analysis of epidemiological systems, which exploits the Petri Net graphical formalism, the R environment, and Docker containerization to derive a tool easily accessible by any researcher even without advanced mathematical and computational skills. Moreover, the framework was implemented following the guidelines defined by the Reproducible Bioinformatics Project, so it guarantees reproducible analysis and simplifies the development of new user-defined workflows.
Affiliation(s)
- Paolo Castagno
- Department of Computer Science, University of Turin, Turin, Italy
| | - Simone Pernice
- Department of Computer Science, University of Turin, Turin, Italy
| | | | | | | | - Daniela Paolotti
- Data Science for Social Impact and Sustainability, ISI Foundation, Turin, Italy
| | - Gianfranco Balbo
- Department of Computer Science, University of Turin, Turin, Italy
| | - Matteo Sereno
- Department of Computer Science, University of Turin, Turin, Italy
| | - Marco Beccuti
- Department of Computer Science, University of Turin, Turin, Italy.
| |
|
25
|
Kolberg L, Raudvere U, Kuzmin I, Vilo J, Peterson H. gprofiler2 -- an R package for gene list functional enrichment analysis and namespace conversion toolset g:Profiler. F1000Res 2020; 9:ELIXIR-709. [PMID: 33564394 PMCID: PMC7859841 DOI: 10.12688/f1000research.24956.1] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Accepted: 07/03/2020] [Indexed: 01/08/2023] Open
Abstract
g:Profiler (https://biit.cs.ut.ee/gprofiler) is a widely used gene list functional profiling and namespace conversion toolset that has been contributing to reproducible biological data analysis since 2007. Here we introduce the accompanying R package, gprofiler2, developed to facilitate programmatic access to g:Profiler computations and databases via REST API. The gprofiler2 package provides easy-to-use functionality that enables researchers to incorporate functional enrichment analysis into automated analysis pipelines written in R. The package also implements interactive visualisation methods to help interpret the enrichment results and to illustrate them for publications. In addition, gprofiler2 gives access to the versatile gene/protein identifier conversion functionality in g:Profiler, enabling mapping between hundreds of different identifier types or orthologous species. The gprofiler2 package is freely available at the CRAN repository.
Affiliation(s)
- Liis Kolberg
- Institute of Computer Science, University of Tartu, Tartu, Tartumaa, 51009, Estonia
| | - Uku Raudvere
- Institute of Computer Science, University of Tartu, Tartu, Tartumaa, 51009, Estonia
| | - Ivan Kuzmin
- Institute of Computer Science, University of Tartu, Tartu, Tartumaa, 51009, Estonia
| | - Jaak Vilo
- Institute of Computer Science, University of Tartu, Tartu, Tartumaa, 51009, Estonia
| | - Hedi Peterson
- Institute of Computer Science, University of Tartu, Tartu, Tartumaa, 51009, Estonia
| |
|
26
|
Kolberg L, Raudvere U, Kuzmin I, Vilo J, Peterson H. gprofiler2 -- an R package for gene list functional enrichment analysis and namespace conversion toolset g:Profiler. F1000Res 2020; 9:ELIXIR-709. [PMID: 33564394 PMCID: PMC7859841 DOI: 10.12688/f1000research.24956.2] [Citation(s) in RCA: 276] [Impact Index Per Article: 69.0] [Accepted: 09/15/2020] [Indexed: 12/15/2022] Open
Abstract
g:Profiler (https://biit.cs.ut.ee/gprofiler) is a widely used gene list functional profiling and namespace conversion toolset that has been contributing to reproducible biological data analysis since 2007. Here we introduce the accompanying R package, gprofiler2, developed to facilitate programmatic access to g:Profiler computations and databases via REST API. The gprofiler2 package provides easy-to-use functionality that enables researchers to incorporate functional enrichment analysis into automated analysis pipelines written in R. The package also implements interactive visualisation methods to help interpret the enrichment results and to illustrate them for publications. In addition, gprofiler2 gives access to the versatile gene/protein identifier conversion functionality in g:Profiler, enabling mapping between hundreds of different identifier types or orthologous species. The gprofiler2 package is freely available at the CRAN repository.
Affiliation(s)
- Liis Kolberg
- Institute of Computer Science, University of Tartu, Tartu, Tartumaa, 51009, Estonia
| | - Uku Raudvere
- Institute of Computer Science, University of Tartu, Tartu, Tartumaa, 51009, Estonia
| | - Ivan Kuzmin
- Institute of Computer Science, University of Tartu, Tartu, Tartumaa, 51009, Estonia
| | - Jaak Vilo
- Institute of Computer Science, University of Tartu, Tartu, Tartumaa, 51009, Estonia
| | - Hedi Peterson
- Institute of Computer Science, University of Tartu, Tartu, Tartumaa, 51009, Estonia
| |
|
27
|
Yukselen O, Turkyilmaz O, Ozturk AR, Garber M, Kucukural A. DolphinNext: a distributed data processing platform for high throughput genomics. BMC Genomics 2020; 21:310. [PMID: 32306927 PMCID: PMC7168977 DOI: 10.1186/s12864-020-6714-x] [Citation(s) in RCA: 57] [Impact Index Per Article: 14.3] [Received: 10/22/2019] [Accepted: 04/01/2020] [Indexed: 12/22/2022] Open
Abstract
BACKGROUND The emergence of high throughput technologies that produce vast amounts of genomic data, such as next-generation sequencing (NGS), is transforming biological research. The dramatic increase in the volume of data and the variety and continuous change of data processing tools, algorithms and databases make analysis the main bottleneck for scientific discovery. The processing of high throughput datasets typically involves many different computational programs, each of which performs a specific step in a pipeline. Given the wide range of applications and organizational infrastructures, there is a great need for highly parallel, flexible, portable, and reproducible data processing frameworks. Several platforms currently exist for the design and execution of complex pipelines. Unfortunately, current platforms lack the necessary combination of parallelism, portability, flexibility and/or reproducibility that is required by the current research environment. To address these shortcomings, workflow frameworks that provide a platform to develop and share portable pipelines have recently arisen. We complement these new platforms by providing a graphical user interface to create, maintain, and execute complex pipelines. Such a platform will simplify robust and reproducible workflow creation for non-technical users as well as provide a robust platform to maintain pipelines for large organizations. RESULTS To simplify development, maintenance, and execution of complex pipelines we created DolphinNext. DolphinNext facilitates building and deployment of complex pipelines using a modular approach implemented in a graphical interface that relies on the powerful Nextflow workflow framework by providing: (1) a drag-and-drop user interface that visualizes pipelines and allows users to create pipelines without familiarity with the underlying programming languages; (2) modules to execute and monitor pipelines in distributed computing environments such as high-performance clusters and/or cloud; (3) reproducible pipelines with version tracking and stand-alone versions that can be run independently; (4) modular process design with process revisioning support to increase reusability and pipeline development efficiency; (5) pipeline sharing with GitHub and automated testing; (6) extensive reports with R-markdown and shiny support for interactive data visualization and analysis. CONCLUSION DolphinNext is a flexible, intuitive, web-based data processing and analysis platform that enables creating, deploying, sharing, and executing complex Nextflow pipelines with extensive revisioning and interactive reporting to enhance reproducible results.
Affiliation(s)
- Onur Yukselen
- Bioinformatics Core, University of Massachusetts Medical School, Worcester, MA, 01605, USA
| | - Osman Turkyilmaz
- RNA Therapeutics Institute, University of Massachusetts Medical School, Worcester, MA, 01605, USA
| | - Ahmet Rasit Ozturk
- RNA Therapeutics Institute, University of Massachusetts Medical School, Worcester, MA, 01605, USA
| | - Manuel Garber
- Bioinformatics Core, University of Massachusetts Medical School, Worcester, MA, 01605, USA.
- Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, MA, 01605, USA.
- Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA, 01605, USA.
| | - Alper Kucukural
- Bioinformatics Core, University of Massachusetts Medical School, Worcester, MA, 01605, USA.
- Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, MA, 01605, USA.
- Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA, 01605, USA.
| |
|
28
|
Alessandrì L, Cordero F, Beccuti M, Arigoni M, Olivero M, Romano G, Rabellino S, Licheri N, De Libero G, Pace L, Calogero RA. rCASC: reproducible classification analysis of single-cell sequencing data. Gigascience 2020; 8:5565135. [PMID: 31494672 PMCID: PMC6732171 DOI: 10.1093/gigascience/giz105] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 12/23/2018] [Revised: 04/12/2019] [Accepted: 08/08/2019] [Indexed: 01/05/2023] Open
Abstract
Background Single-cell RNA sequencing is essential for investigating cellular heterogeneity and highlighting cell subpopulation-specific signatures. Single-cell sequencing applications have spread from conventional RNA sequencing to epigenomics, e.g., ATAC-seq. Many related algorithms and tools have been developed, but few computational workflows provide analysis flexibility while also achieving functional (i.e., information about the data and the tools used are saved as metadata) and computational reproducibility (i.e., a real image of the computational environment used to generate the data is stored) through a user-friendly environment. Findings rCASC is a modular workflow providing an integrated analysis environment (from count generation to cell subpopulation identification) exploiting Docker containerization to achieve both functional and computational reproducibility in data analysis. Hence, rCASC provides preprocessing tools to remove low-quality cells and/or specific bias, e.g., cell cycle. Subpopulation discovery can instead be achieved using different clustering techniques based on different distance metrics. Cluster quality is then estimated through the new metric "cell stability score" (CSS), which describes the stability of a cell in a cluster as a consequence of a perturbation induced by removing a random set of cells from the cell population. CSS provides better cluster robustness information than the silhouette metric. Moreover, rCASC's tools can identify cluster-specific gene signatures. Conclusions rCASC is a modular workflow with new features that could help researchers define cell subpopulations and detect subpopulation-specific markers. It uses Docker for ease of installation and to achieve a computation-reproducible analysis. A Java GUI is provided to welcome users without computational skills in R.
Affiliation(s)
- Luca Alessandrì
- Department of Molecular Biotechnology and Health Sciences, University of Torino, Via Nizza 52, 10125 Torino, Italy
| | - Francesca Cordero
- Department of Computer Science, University of Torino, Corso Svizzera 185, 10149 Torino, Italy
| | - Marco Beccuti
- Department of Computer Science, University of Torino, Corso Svizzera 185, 10149 Torino, Italy
| | - Maddalena Arigoni
- Department of Molecular Biotechnology and Health Sciences, University of Torino, Via Nizza 52, 10125 Torino, Italy
| | - Martina Olivero
- Department of Oncology, University of Torino, SP142, 95, 10060 Candiolo (TO), Italy
| | - Greta Romano
- Department of Computer Science, University of Torino, Corso Svizzera 185, 10149 Torino, Italy
| | - Sergio Rabellino
- Department of Computer Science, University of Torino, Corso Svizzera 185, 10149 Torino, Italy
| | - Nicola Licheri
- Department of Computer Science, University of Torino, Corso Svizzera 185, 10149 Torino, Italy
| | - Gennaro De Libero
- Department Biomedizin, University of Basel, Hebelstrasse 20, 4031 Basel, Switzerland
| | - Luigia Pace
- Italian Institute for Genomic Medicine, IIGM, c/o IRCCS 10060 Candiolo (TO), Italy
| | - Raffaele A Calogero
- Department of Molecular Biotechnology and Health Sciences, University of Torino, Via Nizza 52, 10125 Torino, Italy
| |
|
29
|
Schaduangrat N, Lampa S, Simeon S, Gleeson MP, Spjuth O, Nantasenamat C. Towards reproducible computational drug discovery. J Cheminform 2020; 12:9. [PMID: 33430992 PMCID: PMC6988305 DOI: 10.1186/s13321-020-0408-x] [Citation(s) in RCA: 78] [Impact Index Per Article: 19.5] [Received: 07/17/2019] [Accepted: 01/02/2020] [Indexed: 12/11/2022] Open
Abstract
The reproducibility of experiments has been a long-standing impediment to further scientific progress. Computational methods have been instrumental in drug discovery efforts owing to their multifaceted utilization for data collection, pre-processing, analysis and inference. This article provides in-depth coverage of the reproducibility of computational drug discovery. This review explores the following topics: (1) the current state-of-the-art on reproducible research, (2) research documentation (e.g. electronic laboratory notebook, Jupyter notebook, etc.), (3) science of reproducible research (i.e. comparison and contrast with related concepts such as replicability, reusability and reliability), (4) model development in computational drug discovery, (5) computational issues on model development and deployment, (6) use case scenarios for streamlining the computational drug discovery protocol. In computational disciplines, it has become common practice to share data and programming code used for numerical calculations, not only to facilitate reproducibility but also to foster collaborations (i.e. to drive the project further by introducing new ideas, growing the data, augmenting the code, etc.). It is therefore inevitable that the field of computational drug design would adopt an open approach towards the collection, curation and sharing of data/code.
Affiliation(s)
- Nalini Schaduangrat
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, 10700, Bangkok, Thailand
| | - Samuel Lampa
- Department of Pharmaceutical Biosciences, Uppsala University, 751 24, Uppsala, Sweden
| | - Saw Simeon
- Interdisciplinary Graduate Program in Bioscience, Faculty of Science, Kasetsart University, 10900, Bangkok, Thailand
| | - Matthew Paul Gleeson
- Department of Biomedical Engineering, Faculty of Engineering, King Mongkut's Institute of Technology Ladkrabang, 10520, Bangkok, Thailand.
| | - Ola Spjuth
- Department of Pharmaceutical Biosciences, Uppsala University, 751 24, Uppsala, Sweden.
| | - Chanin Nantasenamat
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, 10700, Bangkok, Thailand.
| |
|
30
|
Ferrero G, Licheri N, Coscujuela Tarrero L, De Intinis C, Miano V, Calogero RA, Cordero F, De Bortoli M, Beccuti M. Docker4Circ: A Framework for the Reproducible Characterization of circRNAs from RNA-Seq Data. Int J Mol Sci 2019; 21:ijms21010293. [PMID: 31906249 PMCID: PMC6982331 DOI: 10.3390/ijms21010293] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 12/11/2019] [Accepted: 12/28/2019] [Indexed: 01/09/2023] Open
Abstract
Recent improvements in the cost-effectiveness of high-throughput technologies have allowed RNA sequencing of total transcriptomes suitable for evaluating the expression and regulation of circRNAs, a relatively novel class of transcript isoforms with suggested roles in transcriptional and post-transcriptional gene expression regulation, as well as their possible use as biomarkers, due to their deregulation in various human diseases. A limited number of integrated workflows exist for prediction, characterization, and differential expression analysis of circRNAs, none of them complying with computational reproducibility requirements. We developed Docker4Circ for the complete analysis of circRNAs from RNA-Seq data. Docker4Circ runs a comprehensive analysis of circRNAs in human and model organisms, including: circRNA prediction; classification and annotation using six public databases; back-splice sequence reconstruction; internal alternative splicing of circularizing exons; alignment-free circRNA quantification from RNA-Seq reads; and differential expression analysis. Docker4Circ makes circRNA analysis easier and more accessible thanks to: (i) its R interface; (ii) encapsulation of computational tasks into docker images; (iii) availability of a user-friendly Java GUI; and (iv) no need for advanced bash scripting skills for correct use. Furthermore, Docker4Circ ensures a reproducible analysis since all its tasks are embedded into a docker image following the guidelines provided by the Reproducible Bioinformatics Project.
Affiliation(s)
- Giulio Ferrero
  - Department of Computer Science, University of Turin, 10149 Turin, Italy
- Nicola Licheri
  - Department of Computer Science, University of Turin, 10149 Turin, Italy
- Lucia Coscujuela Tarrero
  - Department of Clinical and Biological Sciences, University of Turin, Orbassano, 10043 Turin, Italy
  - Center for Genomic Science, Italian Institute of Technology, 20139 Milan, Italy
- Carlo De Intinis
  - Department of Computer Science, University of Turin, 10149 Turin, Italy
- Valentina Miano
  - Department of Clinical and Biological Sciences, University of Turin, Orbassano, 10043 Turin, Italy
  - Division of Cellular and Molecular Pathology, Department of Pathology, University of Cambridge, Addenbrooke’s Hospital, Cambridge CB2 0QQ, UK
- Raffaele Adolfo Calogero
  - Department of Molecular Biotechnology and Health Sciences, University of Turin, 10126 Turin, Italy
- Francesca Cordero
  - Department of Computer Science, University of Turin, 10149 Turin, Italy
- Michele De Bortoli
  - Department of Clinical and Biological Sciences, University of Turin, Orbassano, 10043 Turin, Italy
  - Correspondence:
- Marco Beccuti
  - Department of Computer Science, University of Turin, 10149 Turin, Italy
|
31
|
Ulfenborg B. Vertical and horizontal integration of multi-omics data with miodin. BMC Bioinformatics 2019; 20:649. [PMID: 31823712 PMCID: PMC6902525 DOI: 10.1186/s12859-019-3224-4] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 03/14/2019] [Accepted: 11/14/2019] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Studies on multiple modalities of omics data such as transcriptomics, genomics and proteomics are growing in popularity, since they allow us to investigate complex mechanisms across molecular layers. It is widely recognized that integrative omics analysis holds the promise to unlock novel and actionable biological insights into health and disease. Integration of multi-omics data remains challenging, however, and requires combination of several software tools and extensive technical expertise to account for the properties of heterogeneous data. RESULTS This paper presents the miodin R package, which provides a streamlined workflow-based syntax for multi-omics data analysis. The package allows users to perform analysis of omics data either across experiments on the same samples (vertical integration), or across studies on the same variables (horizontal integration). Workflows have been designed to promote transparent data analysis and reduce the technical expertise required to perform low-level data import and processing. CONCLUSIONS The miodin package is implemented in R and is freely available for use and extension under the GPL-3 license. Package source, reference documentation and user manual are available at https://gitlab.com/algoromics/miodin.
|
32
|
Altered Fecal Small RNA Profiles in Colorectal Cancer Reflect Gut Microbiome Composition in Stool Samples. mSystems 2019; 4:4/5/e00289-19. [PMID: 31530647 PMCID: PMC6749105 DOI: 10.1128/msystems.00289-19] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Indexed: 02/07/2023] Open
Abstract
Dysbiotic configurations of the human gut microbiota have been linked to colorectal cancer (CRC). Human small noncoding RNAs are also implicated in CRC, and recent findings suggest that their release in the gut lumen contributes to shape the gut microbiota. Bacterial small RNAs (bsRNAs) may also play a role in carcinogenesis, but their role has been less extensively explored. Here, we performed small RNA and shotgun sequencing on 80 stool specimens from patients with CRC or with adenomas and from healthy subjects collected in a cross-sectional study to evaluate their combined use as a predictive tool for disease detection. We observed considerable overlap and a correlation between metagenomic and bsRNA quantitative taxonomic profiles obtained from the two approaches. We identified a combined predictive signature composed of 32 features from human and microbial small RNAs and DNA-based microbiome able to accurately classify CRC samples separately from healthy and adenoma samples (area under the curve [AUC] = 0.87).
In the present study, we report evidence that host-microbiome dysbiosis in CRC can also be observed by examination of altered small RNA stool profiles. Integrated analyses of the microbiome and small RNAs in the human stool may provide insights for designing more-accurate tools for diagnostic purposes. IMPORTANCE The characteristics of microbial small RNA transcription are largely unknown, while it is of primary importance for a better identification of molecules with functional activities in the gut niche under both healthy and disease conditions. By performing combined analyses of metagenomic and small RNA sequencing (sRNA-Seq) data, we characterized both the human and microbial small RNA contents of stool samples from healthy individuals and from patients with colorectal carcinoma or adenoma. With the integrative analyses of metagenomic and sRNA-Seq data, we identified a human and microbial small RNA signature which can be used to improve diagnosis of the disease. Our analysis of human and gut microbiome small RNA expression is relevant to generation of the first hypotheses about the potential molecular interactions occurring in the gut of CRC patients, and it can be the basis for further mechanistic studies and clinical tests.
|
33
|
Takeuchi S, Kawada JI, Horiba K, Okuno Y, Okumura T, Suzuki T, Torii Y, Kawabe S, Wada S, Ikeyama T, Ito Y. Metagenomic analysis using next-generation sequencing of pathogens in bronchoalveolar lavage fluid from pediatric patients with respiratory failure. Sci Rep 2019; 9:12909. [PMID: 31501513 PMCID: PMC6733840 DOI: 10.1038/s41598-019-49372-x] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 03/21/2019] [Accepted: 08/23/2019] [Indexed: 01/29/2023] Open
Abstract
Next-generation sequencing (NGS) has been applied in the field of infectious diseases. Bronchoalveolar lavage fluid (BALF) is considered a sterile type of specimen that is suitable for detecting pathogens of respiratory infections. The aim of this study was to comprehensively identify causative pathogens using NGS in BALF samples from immunocompetent pediatric patients with respiratory failure. Ten patients hospitalized with respiratory failure were included. BALF samples obtained in the acute phase were used to prepare DNA- and RNA-sequencing libraries. The libraries were sequenced on MiSeq, and the sequence data were analyzed using metagenome analysis tools. A mean of 2,041,216 total reads were sequenced for each library. Significant bacterial or viral sequencing reads were detected in eight of the 10 patients. Furthermore, candidate pathogens were detected in three patients in whom etiologic agents were not identified by conventional methods. The complete genome of enterovirus D68 was identified in two patients, and phylogenetic analysis suggested that both strains belong to subclade B3, which is an epidemic strain that has spread worldwide in recent years. Our results suggest that NGS can be applied for comprehensive molecular diagnostics as well as surveillance of pathogens in BALF from patients with respiratory infection.
Affiliation(s)
- Suguru Takeuchi
  - Department of Pediatrics, Nagoya University Graduate School of Medicine, 65 Tsurumai-cho, Showa-ku, Nagoya, 466-8550, Japan
- Jun-Ichi Kawada
  - Department of Pediatrics, Nagoya University Graduate School of Medicine, 65 Tsurumai-cho, Showa-ku, Nagoya, 466-8550, Japan
- Kazuhiro Horiba
  - Department of Pediatrics, Nagoya University Graduate School of Medicine, 65 Tsurumai-cho, Showa-ku, Nagoya, 466-8550, Japan
- Yusuke Okuno
  - Center for Advanced Medicine and Clinical Research, Nagoya University Hospital, 65 Tsurumai-cho, Showa-ku, Nagoya, 466-8550, Japan
- Toshihiko Okumura
  - Department of Pediatrics, Nagoya University Graduate School of Medicine, 65 Tsurumai-cho, Showa-ku, Nagoya, 466-8550, Japan
- Takako Suzuki
  - Department of Pediatrics, Nagoya University Graduate School of Medicine, 65 Tsurumai-cho, Showa-ku, Nagoya, 466-8550, Japan
- Yuka Torii
  - Department of Pediatrics, Nagoya University Graduate School of Medicine, 65 Tsurumai-cho, Showa-ku, Nagoya, 466-8550, Japan
- Shinji Kawabe
  - Departments of Infection and Immunity, Aichi Children's Health and Medical Center, 7-426 Morioka-machi, Obu, 474-8710, Japan
- Sho Wada
  - Division of Pediatric Critical Care Medicine, Aichi Children's Health and Medical Center, 7-426 Morioka-machi, Obu, 474-8710, Japan
- Takanari Ikeyama
  - Division of Pediatric Critical Care Medicine, Aichi Children's Health and Medical Center, 7-426 Morioka-machi, Obu, 474-8710, Japan
- Yoshinori Ito
  - Department of Pediatrics, Nagoya University Graduate School of Medicine, 65 Tsurumai-cho, Showa-ku, Nagoya, 466-8550, Japan
|
34
|
Alessandrì L, Arigoni M, Calogero R. Differential Expression Analysis in Single-Cell Transcriptomics. Methods Mol Biol 2019; 1979:425-432. [PMID: 31028652 DOI: 10.1007/978-1-4939-9240-9_25] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 12/03/2022]
Abstract
Differential expression analysis is an important aspect of bulk RNA sequencing (RNAseq). Many tools are available, among which DESeq2 and edgeR are widely used. Because single-cell RNA sequencing (scRNAseq) expression data are zero-inflated, single-cell data are quite different from those generated by conventional bulk RNA sequencing. A comparative analysis of tools used to detect differentially expressed genes between two groups of single cells showed that edgeR with the quasi-likelihood F-test (QLF) outperforms other methods. In bulk RNAseq, differential expression is mainly used to compare a limited number of replicates of two or more biological conditions. However, scRNAseq differential expression analysis might also be instrumental in identifying the main players in the organization of cell subpopulations, which requires tools that support multiple comparisons. Currently, edgeR is one of the few tools able to handle both zero-inflated matrices and multiple comparisons. Here, we provide a guide to the use of edgeR as a tool to detect differential expression in single-cell data.
Affiliation(s)
- Luca Alessandrì
  - Department of Molecular Biotechnology and Health Sciences, University of Torino, Torino, Italy
- Maddalena Arigoni
  - Department of Molecular Biotechnology and Health Sciences, University of Torino, Torino, Italy
- Raffaele Calogero
  - Department of Molecular Biotechnology and Health Sciences, University of Torino, Torino, Italy
|
35
|
Armano G, Fotia G, Manconi A. BITS 2017: the annual meeting of the Italian Society of Bioinformatics. BMC Bioinformatics 2018; 19:352. [PMID: 30367567 PMCID: PMC6191941 DOI: 10.1186/s12859-018-2295-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022] Open
Abstract
This preface introduces the content of the BioMed Central journal Supplement related to the 14th annual meeting of the Bioinformatics Italian Society, held in Cagliari, Italy, from the 5th to the 7th of July, 2017.
Affiliation(s)
- Giuliano Armano
  - Dept. of Electrical and Electronic Engineering, Univ. of Cagliari, P.zza D'Armi, Cagliari, 09123, Italy
- Giorgio Fotia
  - Center for Advanced Studies, Research and Development in Sardinia, Loc. Pixina Manna, Cagliari, 09010 Pula, Italy
- Andrea Manconi
  - National Research Council, Institute for Biomedical Technologies, Via F.lli Cervi, 93, Segrate, 20090, MI, Italy
|