1
Morrissey A, Shi J, James DQ, Mahony S. Accurate allocation of multimapped reads enables regulatory element analysis at repeats. Genome Res 2024; 34:937-951. [PMID: 38986578 PMCID: PMC11293539 DOI: 10.1101/gr.278638.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Accepted: 06/14/2024] [Indexed: 07/12/2024]
Abstract
Transposable elements (TEs) and other repetitive regions have been shown to contain gene regulatory elements, including transcription factor binding sites. However, regulatory elements harbored by repeats have proven difficult to characterize using short-read sequencing assays such as ChIP-seq or ATAC-seq. Most regulatory genomics analysis pipelines discard "multimapped" reads that align equally well to multiple genomic locations. Because multimapped reads arise predominantly from repeats, current analysis pipelines fail to detect a substantial portion of regulatory events that occur in repetitive regions. To address this shortcoming, we developed Allo, a new approach to allocate multimapped reads in an efficient, accurate, and user-friendly manner. Allo combines probabilistic mapping of multimapped reads with a convolutional neural network that recognizes the read distribution features of potential peaks, offering enhanced accuracy in multimapping read assignment. Allo also provides read-level output in the form of a corrected alignment file, making it compatible with existing regulatory genomics analysis pipelines and downstream peak-finders. In a demonstration application on CTCF ChIP-seq data, we show that Allo results in the discovery of thousands of new CTCF peaks. Many of these peaks contain the expected cognate motif and/or serve as TAD boundaries. We additionally apply Allo to a diverse collection of ENCODE ChIP-seq data sets, resulting in multiple previously unidentified interactions between transcription factors and repetitive element families. Finally, we show that Allo may be particularly beneficial in identifying ChIP-seq peaks at centromeres, near segmentally duplicated genes, and in younger TEs, enabling new regulatory analyses in these regions.
Affiliation(s)
- Alexis Morrissey
- Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Jeffrey Shi
- Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Daniela Q James
- Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Shaun Mahony
- Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
2
Almeida da Paz M, Warger S, Taher L. Disregarding multimappers leads to biases in the functional assessment of NGS data. BMC Genomics 2024; 25:455. [PMID: 38720252 PMCID: PMC11078754 DOI: 10.1186/s12864-024-10344-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Accepted: 04/24/2024] [Indexed: 05/12/2024]
Abstract
BACKGROUND Standard ChIP-seq and RNA-seq processing pipelines typically disregard sequencing reads whose origin is ambiguous ("multimappers"). This usual practice has potentially important consequences for the functional interpretation of the data: genomic elements belonging to clusters composed of highly similar members are left unexplored. RESULTS In particular, disregarding multimappers leads to the underrepresentation in epigenetic studies of recently active transposable elements, such as AluYa5, L1HS and SVAs. Furthermore, this common strategy also has implications for transcriptomic analysis: members of repetitive gene families, such as those including major histocompatibility complex (MHC) class I and II genes, are under-quantified. CONCLUSION Revealing inherent biases that permeate routine tasks such as functional enrichment analysis, our results underscore the urgency of broadly adopting multimapper-aware bioinformatic pipelines (currently restricted to specific contexts or communities) to ensure the reliability of genomic and transcriptomic studies.
Affiliation(s)
- Sarah Warger
- Institute of Biomedical Informatics, Graz University of Technology, Graz, Austria
- Leila Taher
- Institute of Biomedical Informatics, Graz University of Technology, Graz, Austria.
3
Leonetti P, Consiglio A, Arendt D, Golbik RP, Rubino L, Gursinsky T, Behrens SE, Pantaleo V. Exogenous and endogenous dsRNAs perceived by plant Dicer-like 4 protein in the RNAi-depleted cellular context. Cell Mol Biol Lett 2023; 28:64. [PMID: 37550627 PMCID: PMC10405411 DOI: 10.1186/s11658-023-00469-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/02/2023] [Accepted: 06/24/2023] [Indexed: 08/09/2023]
Abstract
BACKGROUND In plants, RNase III Dicer-like proteins (DCLs) act as sensors of dsRNAs and process them into short 21- to 24-nucleotide (nt) (s)RNAs. Plant DCL4 is involved in the biogenesis of either functional endogenous or exogenous (i.e. viral) short interfering (si)RNAs, thus playing crucial antiviral roles. METHODS In this study we expressed plant DCL4 in Saccharomyces cerevisiae, an RNAi-depleted organism, in which we could highlight the role of dicing because neither Argonautes nor RNA-dependent RNA polymerase is present. We then tested the functionality of DCL4 in processing exogenous dsRNA-like substrates, such as a replicase-assisted viral replicon defective-interfering RNA and RNA hairpin substrates, or endogenous antisense transcripts. RESULTS DCL4 was shown to be functional in processing dsRNA-like molecules in vitro and in vivo into 21- and 22-nt sRNAs. Conversely, DCL4 did not efficiently process a replicase-assisted viral replicon in vivo, providing evidence that viral RNAs are not accessible to DCL4 in membranes associated with active replication. Notably, in yeast cells expressing DCL4, 21- and 22-nt sRNAs are associated with endogenous loci. CONCLUSIONS We provide new keys to interpret what has been studied so far on antiviral DCL4 in the host system. Taken together, the results confirm the role of sense/antisense RNA-based regulation of gene expression, expanding the sense/antisense atlas of S. cerevisiae. The results described herein show that S. cerevisiae can provide insights into the functionality of plant dicers and extend the S. cerevisiae tool to new biotechnological applications.
Affiliation(s)
- Paola Leonetti
- Department of Biology, Agricultural and Food Sciences, National Research Council, Institute for Sustainable Plant Protection, Bari Unit, Bari, Italy
- Arianna Consiglio
- Department of Biomedical Sciences, National Research Council, Institute for Biomedical Technologies, Bari Unit, Bari, Italy
- Dennis Arendt
- Institute of Biochemistry and Biotechnology, Section Microbial Biotechnology, Martin Luther University Halle-Wittenberg, Halle Saale, Germany
- Ralph Peter Golbik
- Institute of Biochemistry and Biotechnology, Section Microbial Biotechnology, Martin Luther University Halle-Wittenberg, Halle Saale, Germany
- Luisa Rubino
- Department of Biology, Agricultural and Food Sciences, National Research Council, Institute for Sustainable Plant Protection, Bari Unit, Bari, Italy
- Torsten Gursinsky
- Institute of Biochemistry and Biotechnology, Section Microbial Biotechnology, Martin Luther University Halle-Wittenberg, Halle Saale, Germany
- Sven-Erik Behrens
- Institute of Biochemistry and Biotechnology, Section Microbial Biotechnology, Martin Luther University Halle-Wittenberg, Halle Saale, Germany
- Vitantonio Pantaleo
- Department of Biology, Agricultural and Food Sciences, National Research Council, Institute for Sustainable Plant Protection, Bari Unit, Bari, Italy.
4
Marzano F, Chiara M, Consiglio A, D’Amato G, Gentile M, Mirabelli V, Piane M, Savio C, Fabiani M, D’Elia D, Sbisà E, Scarano G, Lonardo F, Tullo A, Pesole G, Faienza MF. Whole-Exome and Transcriptome Sequencing Expands the Genotype of Majewski Osteodysplastic Primordial Dwarfism Type II. Int J Mol Sci 2023; 24:12291. [PMID: 37569667 PMCID: PMC10418986 DOI: 10.3390/ijms241512291] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/27/2023] [Revised: 07/28/2023] [Accepted: 07/28/2023] [Indexed: 08/13/2023]
Abstract
Microcephalic Osteodysplastic Primordial Dwarfism type II (MOPDII) represents the most common form of primordial dwarfism. MOPD clinical features include severe prenatal and postnatal growth retardation, postnatal severe microcephaly, hypotonia, and an increased risk for cerebrovascular disease and insulin resistance. Autosomal recessive biallelic loss-of-function genomic variants in the centrosomal pericentrin (PCNT) gene on chromosome 21q22 cause MOPDII. Over the past decade, exome sequencing (ES) and massive RNA sequencing have been effectively employed both to discover novel disease genes and to expand the genotypes of well-known diseases. In this paper we report the results of both RNA sequencing and ES of three patients affected by MOPDII, with the aim of exploring whether differentially expressed genes and previously uncharacterized gene variants, in addition to PCNT pathogenic variants, could be associated with the complex phenotype of this disease. We discovered a downregulation of key factors involved in growth, such as IGF1R, IGF2R, and RAF1, in all three investigated patients. Moreover, ES identified a shortlist of genes associated with deleterious, rare variants in MOPDII patients. Our results suggest that Next Generation Sequencing (NGS) technologies can be successfully applied for the molecular characterization of the complex genotypic background of MOPDII.
Affiliation(s)
- Flaviana Marzano
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, IBIOM–CNR, 70126 Bari, Italy; (F.M.); (A.T.)
- Matteo Chiara
- Department of Biosciences, University of Milan, 20133 Milan, Italy;
- Arianna Consiglio
- Institute for Biomedical Technologies, ITB-CNR, 70126 Bari, Italy; (A.C.); (V.M.); (D.D.); (E.S.)
- Gabriele D’Amato
- Neonatal Intensive Care Unit, Di Venere Hospital, 70012 Bari, Italy
- Valentina Mirabelli
- Institute for Biomedical Technologies, ITB-CNR, 70126 Bari, Italy; (A.C.); (V.M.); (D.D.); (E.S.)
- Maria Piane
- Department of Clinical and Molecular Medicine, Sapienza University, 00185 Rome, Italy;
- Marco Fabiani
- Department of Experimental Medicine, Sapienza University of Rome, 00185 Rome, Italy;
- Domenica D’Elia
- Institute for Biomedical Technologies, ITB-CNR, 70126 Bari, Italy; (A.C.); (V.M.); (D.D.); (E.S.)
- Elisabetta Sbisà
- Institute for Biomedical Technologies, ITB-CNR, 70126 Bari, Italy; (A.C.); (V.M.); (D.D.); (E.S.)
- Gioacchino Scarano
- Medical Genetics Unit, AORN “San Pio”, Hosp. “G. Rummo”, 82100 Benevento, Italy; (G.S.); (F.L.)
- Fortunato Lonardo
- Medical Genetics Unit, AORN “San Pio”, Hosp. “G. Rummo”, 82100 Benevento, Italy; (G.S.); (F.L.)
- Apollonia Tullo
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, IBIOM–CNR, 70126 Bari, Italy; (F.M.); (A.T.)
- Graziano Pesole
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, IBIOM–CNR, 70126 Bari, Italy; (F.M.); (A.T.)
- Department of Biosciences, Biotechnology and Biofarmaceutics, University of Bari “Aldo Moro”, 70126 Bari, Italy
- Maria Felicia Faienza
- Pediatric Section, Department of Precision and Regenerative Medicine and Ionian Area, University “A. Moro” of Bari, 70124 Bari, Italy
5
Analysis of Faecal Microbiota and Small ncRNAs in Autism: Detection of miRNAs and piRNAs with Possible Implications in Host-Gut Microbiota Cross-Talk. Nutrients 2022; 14:1340. [PMID: 35405953 PMCID: PMC9000903 DOI: 10.3390/nu14071340] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 01/31/2022] [Revised: 03/07/2022] [Accepted: 03/21/2022] [Indexed: 02/01/2023]
Abstract
Intestinal microorganisms impact health by maintaining gut homeostasis and shaping the host immunity, while gut dysbiosis associates with many conditions, including autism, a complex neurodevelopmental disorder with multifactorial aetiology. In autism, gut dysbiosis correlates with symptom severity and is characterised by a reduced bacterial variability and a diminished beneficial commensal relationship. Microbiota can influence the expression of host microRNAs that, in turn, regulate the growth of intestinal bacteria by means of bidirectional host-gut microbiota cross-talk. We investigated possible interactions among intestinal microbes and between them and host transcriptional modulators in autism. To this purpose, we analysed, by "omics" technologies, faecal microbiome, mycobiome, and small non-coding-RNAs (particularly miRNAs and piRNAs) of children with autism and neurotypical development. Patients displayed gut dysbiosis related to a reduction of healthy gut micro- and mycobiota as well as up-regulated transcriptional modulators. The targets of dysregulated non-coding-RNAs are involved in intestinal permeability, inflammation, and autism. Furthermore, microbial families, underrepresented in patients, participate in the production of human essential metabolites negatively influencing the health condition. Here, we propose a novel approach to analyse faeces as a whole, and for the first time, we detected miRNAs and piRNAs in faecal samples of patients with autism.
6
Shah RN, Ruthenburg AJ. Sequence deeper without sequencing more: Bayesian resolution of ambiguously mapped reads. PLoS Comput Biol 2021; 17:e1008926. [PMID: 33872311 PMCID: PMC8084338 DOI: 10.1371/journal.pcbi.1008926] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 09/22/2020] [Revised: 04/29/2021] [Accepted: 03/30/2021] [Indexed: 11/18/2022]
Abstract
Next-generation sequencing (NGS) has transformed molecular biology and contributed to many seminal insights into genomic regulation and function. Apart from whole-genome sequencing, an NGS workflow involves alignment of the sequencing reads to the genome of study, after which the resulting alignments can be used for downstream analyses. However, alignment is complicated by the repetitive sequences; many reads align to more than one genomic locus, with 15-30% of the genome not being uniquely mappable by short-read NGS. This problem is typically addressed by discarding reads that do not uniquely map to the genome, but this practice can lead to systematic distortion of the data. Previous studies that developed methods for handling ambiguously mapped reads were often of limited applicability or were computationally intensive, hindering their broader usage. In this work, we present SmartMap: an algorithm that augments industry-standard aligners to enable usage of ambiguously mapped reads by assigning weights to each alignment with Bayesian analysis of the read distribution and alignment quality. SmartMap is computationally efficient, utilizing far fewer weighting iterations than previously thought necessary to process alignments and, as such, analyzing more than a billion alignments of NGS reads in approximately one hour on a desktop PC. By applying SmartMap to peak-type NGS data, including MNase-seq, ChIP-seq, and ATAC-seq in three organisms, we can increase read depth by up to 53% and increase the mapped proportion of the genome by up to 18% compared to analyses utilizing only uniquely mapped reads. We further show that SmartMap enables the analysis of more than 140,000 repetitive elements that could not be analyzed by traditional ChIP-seq workflows, and we utilize this method to gain insight into the epigenetic regulation of different classes of repetitive elements. 
These data emphasize both the dangers of discarding ambiguously mapped reads and their power for driving biological discovery.
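The weighting idea summarized in this abstract (each candidate alignment of an ambiguous read receives a weight informed by the surrounding read distribution) can be illustrated with a small sketch. This is not the SmartMap implementation: the function name `reweight_multireads`, the use of coarse locus labels instead of genomic coordinates, the default of three iterations, and the omission of alignment-quality terms are all assumptions made for illustration.

```python
from collections import defaultdict


def reweight_multireads(alignments, iterations=3):
    """Iteratively reweight candidate loci of ambiguously mapped reads.

    alignments: dict mapping read id -> list of distinct candidate loci
    (hypothetical labels such as "A" or "B"). Returns a dict mapping
    read id -> {locus: weight}, where each read's weights sum to 1.
    """
    # Start from a uniform split over each read's candidate loci.
    weights = {
        read: {locus: 1.0 / len(loci) for locus in loci}
        for read, loci in alignments.items()
    }
    for _ in range(iterations):
        # Weighted coverage of every locus under the current weights.
        coverage = defaultdict(float)
        for per_read in weights.values():
            for locus, w in per_read.items():
                coverage[locus] += w
        updated = {}
        for read, per_read in weights.items():
            # Score each candidate by the coverage contributed by OTHER reads.
            scores = {locus: coverage[locus] - w for locus, w in per_read.items()}
            total = sum(scores.values())
            if total > 0:
                updated[read] = {locus: s / total for locus, s in scores.items()}
            else:
                # No supporting reads anywhere: keep the previous weights.
                updated[read] = dict(per_read)
        weights = updated
    return weights


# Two unique reads support locus "A"; read "m1" maps to both "A" and "B".
example = reweight_multireads({"u1": ["A"], "u2": ["A"], "m1": ["A", "B"]})
print(example["m1"])
```

In this toy input the ambiguous read is pulled toward the locus that the uniquely mapped reads support. A real tool would work on genomic coordinates and would also fold alignment scores into each weight.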
Affiliation(s)
- Rohan N. Shah
- Pritzker School of Medicine, Division of the Biological Sciences, The University of Chicago, Chicago, Illinois, United States of America
- Department of Molecular Biology and Cell Genetics, Division of the Biological Sciences, The University of Chicago, Chicago, Illinois, United States of America
- Alexander J. Ruthenburg
- Department of Molecular Biology and Cell Genetics, Division of the Biological Sciences, The University of Chicago, Chicago, Illinois, United States of America
- Department of Biochemistry and Molecular Biology, Division of the Biological Sciences, The University of Chicago, Chicago, Illinois, United States of America
7
Explaining Ovarian Cancer Gene Expression Profiles with Fuzzy Rules and Genetic Algorithms. Electronics 2021. [DOI: 10.3390/electronics10040375] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/20/2022]
Abstract
The analysis of gene expression data is a complex task, and many tools and pipelines are available to handle big sequencing datasets for case-control (bivariate) studies. In some cases, such as pilot or exploratory studies, the researcher needs to compare more than two groups of samples consisting of a few replicates. Both standard statistical bioinformatic pipelines and innovative deep learning models are unsuitable for extracting interpretable patterns and information from such datasets. In this work, we apply a combination of fuzzy rule systems and genetic algorithms to analyze a dataset composed of 21 samples and 6 classes, useful for approaching the study of expression profiles in ovarian cancer, compared to other ovarian diseases. The proposed method is capable of performing a feature selection among genes that is guided by the genetic algorithm, and of building a set of if-then rules that explain how classes can be distinguished by observing changes in the expression of selected genes. After testing several parameters, the final model consists of 10 genes involved in the molecular pathways of cancer and 10 rules that correctly classify all samples.
8
Marzano F, Caratozzolo MF, Consiglio A, Licciulli F, Liuni S, Sbisà E, D'Elia D, Tullo A, Catalano D. Plant miRNAs Reduce Cancer Cell Proliferation by Targeting MALAT1 and NEAT1: A Beneficial Cross-Kingdom Interaction. Front Genet 2020; 11:552490. [PMID: 33193626 PMCID: PMC7531330 DOI: 10.3389/fgene.2020.552490] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 04/16/2020] [Accepted: 08/20/2020] [Indexed: 12/12/2022]
Abstract
MicroRNAs (miRNAs) are ubiquitous regulators of gene expression, evolutionarily conserved in plants and mammals. In recent years, although a growing number of papers have debated the role of plant miRNAs in human gene expression, the molecular mechanisms through which this effect is achieved are still not completely elucidated. Some evidence suggests that this interaction might be sequence specific, and in this work, we investigated this possibility by transcriptomic and bioinformatics approaches. Plant and human miRNA sequences from primary databases were collected and compared for their similarities (global or local alignments). Out of 2,588 human miRNAs, 1,606 showed a perfect match of their seed sequence with the 5′ end of 3,172 plant miRNAs. Further selections were applied based on the role of the human target genes or of the miRNA in cell cycle regulation (as an oncogene, tumor suppressor, or a biomarker for prognosis or diagnosis in cancer). Based on these criteria, 20 human miRNAs were selected as potential functional analogs of 7 plant miRNAs, which were in turn transfected in different cell lines to evaluate their effect on cell proliferation. A significant decrease was observed in the colorectal carcinoma HCT116 cell line. RNA-Seq demonstrated that 446 genes were differentially expressed 72 h after transfection. Notably, we demonstrated that the plant mtr-miR-5754 and gma-miR4995 directly target the tumor-associated long non-coding RNAs metastasis-associated lung adenocarcinoma transcript 1 (MALAT1) and nuclear paraspeckle assembly transcript 1 (NEAT1) in a sequence-specific manner. In conclusion, in agreement with other recent discoveries, our study strengthens and expands the hypothesis that plant miRNAs can have a regulatory effect in mammals by targeting both protein-coding and non-coding RNA, thus suggesting new biotechnological applications.
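The seed comparison mentioned in this abstract can be pictured with a short sketch. It assumes the common convention that the seed spans nucleotides 2 to 8 of the mature miRNA and that "5′ end" means the same positions in the plant sequence; the abstract states neither detail, so both are assumptions. The function names and the toy sequences are hypothetical, not records from any database.

```python
def seed(mirna, start=1, end=8):
    """Return the assumed seed region (nucleotides 2-8, 0-based slice 1:8)."""
    return mirna.upper().replace("T", "U")[start:end]


def shared_seed_pairs(human, plant):
    """Map each human miRNA to the plant miRNAs with an identical seed.

    human, plant: dicts of name -> mature sequence (RNA or DNA alphabet).
    Human miRNAs without any plant match are omitted from the result.
    """
    # Index plant miRNAs by seed so each human lookup is a dict access.
    plant_by_seed = {}
    for name, sequence in plant.items():
        plant_by_seed.setdefault(seed(sequence), []).append(name)
    matches = {}
    for name, sequence in human.items():
        hits = plant_by_seed.get(seed(sequence))
        if hits:
            matches[name] = sorted(hits)
    return matches


# Toy sequences for illustration only.
human_toy = {
    "hs-toy-1": "UGAGGUAGUAGGUUGUAUAGUU",
    "hs-toy-2": "UAAAGUGCUUAUAGUGCAGGUAG",
}
plant_toy = {
    "pl-toy-1": "AGAGGUAGCCCCCCCCCCCCC",
    "pl-toy-2": "UCCCCCCCAAAAAAAAAAAAA",
}
print(shared_seed_pairs(human_toy, plant_toy))
```

Indexing the plant sequences by seed keeps the comparison linear in the number of sequences, which matters little for a few thousand miRNAs but avoids an all-against-all loop.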
Affiliation(s)
- Flaviana Marzano
- Department of Biomedical Sciences, Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, Bari, Italy
- Mariano Francesco Caratozzolo
- Department of Biomedical Sciences, Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, Bari, Italy
- Arianna Consiglio
- Department of Biomedical Sciences, Institute for Biomedical Technologies, Bari, Italy
- Flavio Licciulli
- Department of Biomedical Sciences, Institute for Biomedical Technologies, Bari, Italy
- Sabino Liuni
- Department of Biomedical Sciences, Institute for Biomedical Technologies, Bari, Italy
- Elisabetta Sbisà
- Department of Biomedical Sciences, Institute for Biomedical Technologies, Bari, Italy
- Domenica D'Elia
- Department of Biomedical Sciences, Institute for Biomedical Technologies, Bari, Italy
- Apollonia Tullo
- Department of Biomedical Sciences, Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, Bari, Italy
- Domenico Catalano
- Department of Biomedical Sciences, Institute for Biomedical Technologies, Bari, Italy
9
Deschamps-Francoeur G, Simoneau J, Scott MS. Handling multi-mapped reads in RNA-seq. Comput Struct Biotechnol J 2020; 18:1569-1576. [PMID: 32637053 PMCID: PMC7330433 DOI: 10.1016/j.csbj.2020.06.014] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 02/29/2020] [Revised: 06/06/2020] [Accepted: 06/07/2020] [Indexed: 11/07/2022]
Abstract
Many eukaryotic genomes harbour large numbers of duplicated sequences, of diverse biotypes, resulting from several mechanisms including recombination, whole genome duplication and retro-transposition. Such repeated sequences complicate gene/transcript quantification during RNA-seq analysis due to reads mapping to more than one locus, sometimes involving genes embedded in other genes. Genes of different biotypes have dissimilar levels of sequence duplication, with long-noncoding RNAs and messenger RNAs sharing less sequence similarity to other genes than biotypes encoding shorter RNAs. Many strategies have been elaborated to handle these multi-mapped reads, resulting in increased accuracy in gene/transcript quantification, although separate tools are typically used to estimate the abundance of short and long genes due to their dissimilar characteristics. This review discusses the mechanisms leading to sequence duplication, the biotypes affected, the computational strategies employed to deal with multi-mapped reads and the challenges that still remain to be overcome.
Affiliation(s)
- Gabrielle Deschamps-Francoeur
- Département de Biochimie et Génomique Fonctionnelle, Faculté de médecine et des sciences de la santé, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
- Joël Simoneau
- Département de Biochimie et Génomique Fonctionnelle, Faculté de médecine et des sciences de la santé, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
- Michelle S. Scott
- Département de Biochimie et Génomique Fonctionnelle, Faculté de médecine et des sciences de la santé, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
10
Górczak K, Claesen J, Burzykowski T. A Conceptual Framework for Abundance Estimation of Genomic Targets in the Presence of Ambiguous Short Sequencing Reads. J Comput Biol 2020; 27:1232-1247. [PMID: 31895597 DOI: 10.1089/cmb.2019.0272] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/13/2022]
Abstract
RNA sequencing (RNA-seq) is widely used to study gene-, transcript-, or exon expression. To quantify the expression level, millions of short sequenced reads need to be mapped back to a reference genome or transcriptome. Read mapping makes it possible to find a location to which a read is identical or similar. Based upon this alignment, expression summaries, that is, read counts are generated. However, reads may be matched to multiple locations. Such ambiguously mapped reads are often ignored in the analysis, which is a potential loss of information and may cause bias in expression estimation. We present the general principles underlying multiread allocation and unbiased estimation of the expression level of genes, exons, or transcripts in the presence of multiple mapped reads. The underlying principles are derived from a theoretical concept that identifies important sources of information such as the number of uniquely mapped reads, the total target length, and the length of the shared target regions. We show with simulation studies that methods incorporating some or all of the aforementioned sources of information estimate the expression levels of genes, exons, and/or transcripts with a higher precision and accuracy than methods that do not use this information. We identify important sources of information that should be taken into account by methods that estimate the abundance of genes, exons, and/or transcripts to achieve good precision and accuracy.
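The three sources of information named in this abstract (uniquely mapped read counts, total target length, and shared region length) can be combined in a simple proportional allocation, sketched here. This is a generic heuristic written for illustration, not the estimators derived in the paper; the function name `allocate_shared_reads` and the rule of splitting shared reads by unique-read density are assumptions.

```python
def allocate_shared_reads(unique_reads, total_length, shared_length, shared_reads):
    """Split reads from a region shared by several targets.

    unique_reads:  dict target -> number of uniquely mapped reads
    total_length:  dict target -> total target length (bases)
    shared_length: length of the region common to all targets (bases)
    shared_reads:  number of reads mapping ambiguously to the shared region

    Returns dict target -> estimated read count (unique + allocated share).
    """
    # Unique-read density: unique reads per base of target-specific sequence.
    density = {}
    for target, count in unique_reads.items():
        unique_length = total_length[target] - shared_length
        if unique_length <= 0:
            raise ValueError(f"{target} has no target-specific sequence")
        density[target] = count / unique_length

    total_density = sum(density.values())
    if total_density == 0:
        # No unique evidence at all: fall back to an even split.
        share = {target: 1.0 / len(density) for target in density}
    else:
        share = {target: d / total_density for target, d in density.items()}

    return {
        target: unique_reads[target] + shared_reads * share[target]
        for target in density
    }


estimate = allocate_shared_reads(
    unique_reads={"g1": 90, "g2": 10},
    total_length={"g1": 1000, "g2": 1000},
    shared_length=100,
    shared_reads=20,
)
print(estimate)
```

With equal unique lengths, the 20 shared reads are divided 9 to 1, following the ratio of unique reads, so the estimates are about 108 and 12.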
Affiliation(s)
- Katarzyna Górczak
- Interuniversity Institute for Biostatistics and statistical Bioinformatics, Hasselt University, Diepenbeek, Belgium; Department of Mathematical and Statistical Methods, Poznań University of Life Sciences, Poznań, Poland
- Jürgen Claesen
- Interuniversity Institute for Biostatistics and statistical Bioinformatics, Hasselt University, Diepenbeek, Belgium; Microbiology Unit, Belgian Nuclear Research Centre (SCK•CEN), Mol, Belgium
- Tomasz Burzykowski
- Interuniversity Institute for Biostatistics and statistical Bioinformatics, Hasselt University, Diepenbeek, Belgium; Department of Statistics and Medical Informatics, Medical University of Bialystok, Bialystok, Poland
11
Integrated Analysis of microRNA and mRNA Expression Profiles: An Attempt to Disentangle the Complex Interaction Network in Attention Deficit Hyperactivity Disorder. Brain Sci 2019; 9:288. [PMID: 31652596 PMCID: PMC6826944 DOI: 10.3390/brainsci9100288] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 08/09/2019] [Revised: 10/16/2019] [Accepted: 10/20/2019] [Indexed: 12/11/2022]
Abstract
Attention Deficit Hyperactivity Disorder (ADHD) is a childhood-onset neurodevelopmental disorder, whose etiology and pathogenesis are still largely unknown. In order to uncover novel regulatory networks and molecular pathways possibly related to ADHD, we performed an integrated miRNA and mRNA expression profiling analysis in peripheral blood samples of children with ADHD and age-matched typically developing (TD) children. The expression levels of 13 miRNAs were evaluated with microfluidic qPCR, and differentially expressed (DE) mRNAs were detected on an Illumina HiSeq 2500 genome analyzer. The miRNA targetome was identified using an integrated approach of validated and predicted interaction data extracted from seven different bioinformatic tools. Gene Ontology (GO) and pathway enrichment analyses were carried out. Results showed that six miRNAs (miR-652-3p, miR-942-5p, let-7b-5p, miR-181a-5p, miR-320a, and miR-148b-3p) and 560 genes were significantly DE in children with ADHD compared to TD subjects. After correction for multiple testing, only three miRNAs (miR-652-3p, miR-148b-3p, and miR-942-5p) remained significant. Genes known to be associated with ADHD (e.g., B4GALT2, SLC6A9, TLE1, ANK3, TRIO, TAF1, and SYNE1) were confirmed to be significantly DE in our study. Integrated miRNA and mRNA expression data identified critical key hubs involved in ADHD. Finally, the GO and pathway enrichment analyses of all DE genes showed their deep involvement in immune functions, reinforcing the hypothesis that an immune imbalance might contribute to the ADHD etiology. Despite the relatively small sample size, in this study we were able to build a complex miRNA-target interaction network in children with ADHD that might help in deciphering the disease pathogenesis. Validation in larger samples should be performed in order to possibly suggest novel therapeutic strategies for treating this complex disease.
12
Liguori M, Nuzziello N, Introna A, Consiglio A, Licciulli F, D’Errico E, Scarafino A, Distaso E, Simone IL. Dysregulation of MicroRNAs and Target Genes Networks in Peripheral Blood of Patients With Sporadic Amyotrophic Lateral Sclerosis. Front Mol Neurosci 2018; 11:288. [PMID: 30210287 PMCID: PMC6121079 DOI: 10.3389/fnmol.2018.00288] [Citation(s) in RCA: 102] [Impact Index Per Article: 14.6] [Received: 05/19/2018] [Accepted: 07/31/2018] [Indexed: 01/01/2023]
Abstract
Amyotrophic lateral sclerosis (ALS) is a progressive and fatal neurodegenerative disease. While genetics and other factors contribute to ALS pathogenesis, critical knowledge is still missing and validated biomarkers for monitoring the disease activity have not yet been identified. To address those aspects we carried out this study with the primary aim of identifying possible miRNAs/mRNAs dysregulation associated with the sporadic form of the disease (sALS). Additionally, we explored miRNAs as modulating factors of the observed clinical features. The study included 56 sALS patients and 20 healthy controls (HCs). We analyzed the peripheral blood samples of sALS patients and HCs with high-throughput next-generation sequencing followed by an integrated bioinformatics/biostatistics analysis. Results showed that 38 miRNAs (let-7a-5p, let-7d-5p, let-7f-5p, let-7g-5p, let-7i-5p, miR-103a-3p, miR-106b-3p, miR-128-3p, miR-130a-3p, miR-130b-3p, miR-144-5p, miR-148a-3p, miR-148b-3p, miR-15a-5p, miR-15b-5p, miR-151a-5p, miR-151b, miR-16-5p, miR-182-5p, miR-183-5p, miR-186-5p, miR-22-3p, miR-221-3p, miR-223-3p, miR-23a-3p, miR-26a-5p, miR-26b-5p, miR-27b-3p, miR-28-3p, miR-30b-5p, miR-30c-5p, miR-342-3p, miR-425-5p, miR-451a, miR-532-5p, miR-550a-3p, miR-584-5p, miR-93-5p) were significantly downregulated in sALS. We also found that different miRNA profiles characterized the bulbar/spinal onset and the progression rate. This observation supports the hypothesis that miRNAs may impact the phenotypic expression of the disease. Genes known to be associated with ALS (e.g., PARK7, C9orf72, ALS2, MATR3, SPG11, ATXN2) were confirmed to be dysregulated in our study. We also identified other potential candidate genes like LGALS3 (implicated in neuroinflammation) and PRKCD (activated in mitochondrial-induced apoptosis). Some of the downregulated genes are involved in molecular binding to ions (i.e., metals, zinc, magnesium) and in ion-related functions.
The genes that we found upregulated were involved in the immune response, oxidation-reduction, and apoptosis. These findings may have important implications, e.g., for monitoring sALS progression, and therefore represent a significant advance in the elucidation of the disease's underlying molecular mechanisms. The extensive multidisciplinary approach we applied in this study was critically important for its success, especially in complex disorders such as sALS, wherein access to genetic background is a major limitation.
Affiliation(s)
- Maria Liguori
- National Research Council, Institute of Biomedical Technologies, Bari Unit, Bari, Italy
- Nicoletta Nuzziello
- National Research Council, Institute of Biomedical Technologies, Bari Unit, Bari, Italy
- Alessandro Introna
- Department of Basic Sciences, Neurosciences and Sense Organs, University of Bari, Bari, Italy
- Arianna Consiglio
- National Research Council, Institute of Biomedical Technologies, Bari Unit, Bari, Italy
- Flavio Licciulli
- National Research Council, Institute of Biomedical Technologies, Bari Unit, Bari, Italy
- Eustachio D’Errico
- Department of Basic Sciences, Neurosciences and Sense Organs, University of Bari, Bari, Italy
- Antonio Scarafino
- Department of Basic Sciences, Neurosciences and Sense Organs, University of Bari, Bari, Italy
- Eugenio Distaso
- Department of Basic Sciences, Neurosciences and Sense Organs, University of Bari, Bari, Italy
- Isabella L. Simone
- Department of Basic Sciences, Neurosciences and Sense Organs, University of Bari, Bari, Italy
13
Milanesi L, Guffanti A, Mauri G, Masseroli M. BITS 2015: the annual meeting of the Italian Society of Bioinformatics. BMC Bioinformatics 2016; 17:396. [PMID: 28185548 PMCID: PMC5123416 DOI: 10.1186/s12859-016-1187-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
Abstract
This preface introduces the content of the BioMed Central journal Supplements related to the BITS 2015 meeting, held in Milan, Italy, from the 3rd to the 5th of June, 2015.