1
|
The transcriptomics profiling of blood CD4 and CD8 T-cells in narcolepsy type I. Front Immunol 2023; 14:1249405. [PMID: 38077397 PMCID: PMC10702585 DOI: 10.3389/fimmu.2023.1249405] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Accepted: 10/24/2023] [Indexed: 12/18/2023]
Abstract
Background
Narcolepsy Type I (NT1) is a rare, life-long sleep disorder arising as a consequence of the extensive destruction of orexin-producing hypothalamic neurons. The mechanisms involved in the destruction of orexin neurons are not yet elucidated, but the association of narcolepsy with environmental triggers and genetic susceptibility (strong association with the HLA, TCRs and other immunologically-relevant loci) implicates an immuno-pathological process. Several studies in animal models and on human samples have suggested that T-cells are the main pathogenic culprits.
Methods
RNA sequencing was performed on four CD4 and CD8 T-cell subsets (naive, effector, effector memory and central memory) sorted by flow cytometry from peripheral blood mononuclear cells (PBMCs) of NT1 patients and HLA-matched healthy donors, as well as age- and sex-matched individuals suffering from other sleep disorders (OSD). The RNAseq analysis was conducted by comparing the transcriptome of NT1 patients to that of healthy donors and other sleep disorder patients (collectively referred to as the non-narcolepsy controls) in order to identify NT1-specific genes and pathways.
Results
We determined NT1-specific differentially expressed genes, several of which are involved in tubulin arrangement and were found in CD4 (TBCB, CCT5, EML4, TPGS1, TPGS2) and CD8 (TTLL7) T-cell subsets; these genes play a role in immune synapse formation and TCR signaling. Furthermore, we identified genes (GZMB, LTB in CD4 T-cells and NLRP3, TRADD, IL6, CXCR1, FOXO3, FOXP3 in CD8 T-cells) and pathways involved in various aspects of inflammation and the inflammatory response. More specifically, the inflammatory profile was identified in the "naive" subset of CD4 and CD8 T-cells.
Conclusion
We identified NT1-specific differentially expressed genes, providing a cell-type and subset-specific catalog describing their functions in T-cells as well as their potential involvement in NT1. Several genes and pathways identified are involved in the formation of the immune synapse and TCR activation as well as inflammation and the inflammatory response. An inflammatory transcriptomic profile was detected in both "naive" CD4 and CD8 T-cell subsets, suggesting their possible involvement in the development or progression of the narcoleptic process.
|
2
|
A Bos taurus sequencing methods benchmark for assembly, haplotyping, and variant calling. Sci Data 2023; 10:369. [PMID: 37291142 PMCID: PMC10250393 DOI: 10.1038/s41597-023-02249-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2022] [Accepted: 05/16/2023] [Indexed: 06/10/2023]
Abstract
Inspired by the production of reference data sets in the Genome in a Bottle project, we sequenced one Charolais heifer with different technologies: Illumina paired-end, Oxford Nanopore, Pacific Biosciences (HiFi and CLR), 10X Genomics linked-reads, and Hi-C. In order to generate haplotypic assemblies, we also sequenced both parents with short reads. From these data, we built two haplotyped trio high-quality reference genomes and a consensus assembly, using up-to-date software packages. The assemblies obtained using PacBio HiFi reach a size of 3.2 Gb, which is significantly larger than the 2.7 Gb ARS-UCD1.2 reference. The BUSCO score of the consensus assembly reaches a completeness of 95.8% among highly conserved mammalian genes. We also identified 35,866 structural variants larger than 50 base pairs. This assembly is a contribution to the bovine pangenome for the "Charolais" breed. These datasets will prove to be useful resources enabling the community to gain additional insight into sequencing technologies for applications such as SNP, indel or structural variant calling, and de novo assembly.
|
3
|
srnaMapper: an optimal mapping tool for sRNA-Seq reads. BMC Bioinformatics 2022; 23:495. [DOI: 10.1186/s12859-022-05048-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2022] [Accepted: 11/08/2022] [Indexed: 11/19/2022]
Abstract
Background
Sequencing is the key method to study the impact of short RNAs, which include micro RNAs, tRNA-derived RNAs, and piwi-interacting RNAs, among others. The first step to make use of these reads is to map them to a genome. Existing mapping tools have been developed with long RNAs in mind, and, so far, no tool has been conceived for short RNAs. However, short RNAs have several distinctive features which make them different from messenger RNAs: they are shorter, they are often redundant, they can be produced by duplicated loci, and they may be edited at their ends.
Results
In this work, we present a new tool, srnaMapper, that exhaustively maps these reads with all these features in mind, and is most efficient when applied to reads no longer than 50 base pairs. We show, on several datasets, that srnaMapper is very efficient in terms of computation time and edition error handling: it retrieves all the hits, with an arbitrary number of errors, in a time comparable to that of non-exhaustive tools.
|
4
|
Influenza vaccination induces autoimmunity against orexinergic neurons in a mouse model for narcolepsy. Brain 2022; 145:2018-2030. [PMID: 35552381 DOI: 10.1093/brain/awab455] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 05/26/2021] [Revised: 11/03/2021] [Accepted: 11/24/2021] [Indexed: 11/12/2022]
Abstract
Narcolepsy with cataplexy or narcolepsy type 1 is a disabling chronic sleep disorder resulting from the destruction of orexinergic neurons in the hypothalamus. The tight association of narcolepsy with HLA-DQB1*06:02 strongly suggests an autoimmune origin to this disease. Furthermore, converging epidemiological studies have identified an increased incidence of narcolepsy in Europe following Pandemrix® vaccination against the 2009-2010 pandemic 'influenza' virus strain. The potential immunological link between the Pandemrix® vaccination and narcolepsy remains, however, unknown. Deciphering these mechanisms may reveal pathways potentially at play in most cases of narcolepsy. Here, we developed a mouse model that allows tracking and studying the T-cell response against 'influenza' virus haemagglutinin, which was selectively expressed in the orexinergic neurons as a new self-antigen. Pandemrix® vaccination in this mouse model resulted in hypothalamic inflammation and selective destruction of orexin-producing neurons. Further investigations on the relative contribution of T-cell subsets in this process revealed that haemagglutinin-specific CD4 T cells were necessary for the development of hypothalamic inflammation, but insufficient for killing orexinergic neurons. Conversely, haemagglutinin-specific CD8 T cells could not initiate inflammation but were the effectors of the destruction of orexinergic neurons. Additional studies revealed pathways potentially involved in the disease process. Notably, the interferon-γ pathway was proven essential, as interferon-γ-deficient CD8 T cells were unable to elicit the loss of orexinergic neurons. Our work demonstrates that an immunopathological process mimicking narcolepsy can be elicited by immune cross-reactivity between a vaccine antigen and a neuronal self-antigen. This process relies on a synergy between autoreactive CD4 and CD8 T cells for disease development.
This work furthers our understanding of the mechanisms and pathways potentially involved in the development of a neurological side effect due to a vaccine and, likely, to narcolepsy in general.
|
5
|
Major Reorganization of Chromosome Conformation During Muscle Development in Pig. Front Genet 2021; 12:748239. [PMID: 34675966 PMCID: PMC8523936 DOI: 10.3389/fgene.2021.748239] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/27/2021] [Accepted: 09/14/2021] [Indexed: 12/12/2022]
Abstract
The spatial organization of the genome in the nucleus plays a crucial role in eukaryotic cell functions, yet little is known about chromatin structure variations during late fetal development in mammals. We performed in situ high-throughput chromosome conformation capture (Hi-C) sequencing of DNA from muscle samples of pig fetuses at two late stages of gestation. Comparative analysis of the resulting Hi-C interaction matrices between both groups showed widespread differences of different types. First, we discovered a complex landscape of stable and group-specific Topologically Associating Domains (TADs). Investigating the nuclear partition of the chromatin into transcriptionally active and inactive compartments, we observed a genome-wide fragmentation of these compartments between 90 and 110 days of gestation. Also, we identified and characterized the distribution of differential cis- and trans-pairwise interactions. In particular, trans-interactions at chromosome extremities revealed a mechanism of telomere clustering further confirmed by 3D Fluorescence in situ Hybridization (FISH). Altogether, we report major variations of the three-dimensional genome conformation during muscle development in pig, involving several levels of chromatin remodeling and structural regulation.
|
6
|
Finding differentially expressed sRNA-Seq regions with srnadiff. PLoS One 2021; 16:e0256196. [PMID: 34415926 PMCID: PMC8378736 DOI: 10.1371/journal.pone.0256196] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2020] [Accepted: 08/02/2021] [Indexed: 11/19/2022]
Abstract
Small RNAs (sRNAs) encompass a great variety of molecules of different kinds, such as microRNAs, small interfering RNAs, and Piwi-associated RNAs, among others. These sRNAs have a wide range of activities, which include gene regulation, protection against viruses, and transposable element silencing, and they have been identified as key actors in determining the development of the cell. Small RNA sequencing is thus routinely used to assess the expression of the diversity of sRNAs, usually in the context of differential expression, where two conditions are compared. Tools that detect differentially expressed microRNAs are numerous, because microRNAs are well documented, and the associated genes are well defined. However, tools are lacking to detect other types of sRNAs, which are less studied, and whose precursor RNA is not well characterized. We present here a new method, called srnadiff, which finds all kinds of differentially expressed sRNAs. To the best of our knowledge, srnadiff is the first tool that detects differentially expressed sRNAs without the use of external information, such as genomic annotation or additional sequences of sRNAs.
|
7
|
Abstract
High-throughput sequencing makes it possible to provide the genome-wide distribution of small non coding RNAs in a single experiment, and has contributed greatly to the identification and understanding of these RNAs in the last decade. Small non coding RNAs gather a wide collection of classes, such as microRNAs, tRNA-derived fragments, small nucleolar RNAs and small nuclear RNAs, to name a few. As usual in RNA-seq studies, the sequencing step is followed by a feature quantification step: when a genome is available, the reads are aligned to the genome, their genomic positions are compared to the already available annotations, and the corresponding features are quantified. However, a problem arises when many reads map at several positions, and while different strategies exist to circumvent this problem, all of them are biased. In this article, we present a new strategy that compares all the reads that map at several positions, and their annotations when available. In many cases, all the hits co-localize with the same feature annotation (a duplicated miRNA or a duplicated gene, for instance). When different annotations exist for a given read, we propose to merge existing features and provide the counts for the merged features. This new strategy has been implemented in a tool, mmannot, freely available at https://github.com/mzytnicki/mmannot.
|
8
|
Multi-species annotation of transcriptome and chromatin structure in domesticated animals. BMC Biol 2019; 17:108. [PMID: 31884969 PMCID: PMC6936065 DOI: 10.1186/s12915-019-0726-5] [Citation(s) in RCA: 58] [Impact Index Per Article: 11.6] [Received: 10/21/2019] [Accepted: 11/19/2019] [Indexed: 01/10/2023]
Abstract
BACKGROUND Comparative genomics studies are central in identifying the coding and non-coding elements associated with complex traits, and the functional annotation of genomes is a critical step to decipher the genotype-to-phenotype relationships in livestock animals. As part of the Functional Annotation of Animal Genomes (FAANG) action, the FR-AgENCODE project aimed to create reference functional maps of domesticated animals by profiling the landscape of transcription (RNA-seq), chromatin accessibility (ATAC-seq) and conformation (Hi-C) in species representing ruminants (cattle, goat), monogastrics (pig) and birds (chicken), using three target samples related to metabolism (liver) and immunity (CD4+ and CD8+ T cells). RESULTS RNA-seq assays considerably extended the available catalog of annotated transcripts and identified differentially expressed genes with unknown function, including new syntenic lncRNAs. ATAC-seq highlighted an enrichment for transcription factor binding sites in differentially accessible regions of the chromatin. Comparative analyses revealed a core set of conserved regulatory regions across species. Topologically associating domains (TADs) and epigenetic A/B compartments annotated from Hi-C data were consistent with RNA-seq and ATAC-seq data. Multi-species comparisons showed that conserved TAD boundaries had stronger insulation properties than species-specific ones and that the genomic distribution of orthologous genes in A/B compartments was significantly conserved across species. CONCLUSIONS We report the first multi-species and multi-assay genome annotation results obtained by a FAANG project. Beyond the generation of reference annotations and the confirmation of previous findings on model animals, the integrative analysis of data from multiple assays and species sheds a new light on the multi-scale selective pressure shaping genome organization from birds to mammals. 
Overall, these results emphasize the value of FAANG for research on domesticated animals and reinforce the importance of future meta-analyses of the reference datasets being generated by this community on different species.
|
9
|
Genome‐wide patterns of transposon proliferation in an evolutionary young hybrid fish. Mol Ecol 2019; 28:1491-1505. [DOI: 10.1111/mec.14969] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 06/30/2018] [Revised: 10/15/2018] [Accepted: 10/23/2018] [Indexed: 01/19/2023]
|
10
|
The siRNA suppressor RTL1 is redox-regulated through glutathionylation of a conserved cysteine in the double-stranded-RNA-binding domain. Nucleic Acids Res 2017; 45:11891-11907. [PMID: 28981840 PMCID: PMC5714217 DOI: 10.1093/nar/gkx820] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 11/16/2016] [Accepted: 09/13/2017] [Indexed: 01/20/2023]
Abstract
RNase III enzymes cleave double stranded (ds)RNA. This is an essential step for regulating the processing of mRNA, rRNA, snoRNA and other small RNAs, including siRNA and miRNA. Arabidopsis thaliana encodes nine RNase III: four DICER-LIKE (DCL) and five RNASE THREE LIKE (RTL). To better understand the molecular functions of RNase III in plants we developed a biochemical assay using RTL1 as a model. We show that RTL1 does not degrade dsRNA randomly, but recognizes specific duplex sequences to direct accurate cleavage. Furthermore, we demonstrate that RNase III and dsRNA binding domains (dsRBD) are both required for dsRNA cleavage. Interestingly, the four DCL and the three RTL that carry dsRBD share a conserved cysteine (C230 in Arabidopsis RTL1) in their dsRBD. C230 is essential for RTL1 and DCL1 activities and is subjected to post-translational modification. Indeed, under oxidizing conditions, glutathionylation of C230 inhibits RTL1 cleavage activity in a reversible manner involving glutaredoxins. We conclude that the redox state of the dsRBD ensures a fine-tuned regulation of dsRNA processing by plant RNase III.
|
11
|
Abstract
Background RNA-Seq is currently used routinely, and it provides accurate information on gene transcription. However, the method cannot accurately estimate the expression of duplicated genes. Several strategies have been previously used (drop duplicated genes, distribute the reads uniformly, or estimate expression), but all of them provide biased results. Results We provide here a tool, called mmquant, for computing gene expression, including that of duplicated genes. If a read maps at different positions, the tool detects that the corresponding genes are duplicated; it merges the genes and creates a merged gene. The count of ambiguous reads is then based on the input genes and the merged genes. Conclusion mmquant is a drop-in replacement for the widely used tools htseq-count and featureCounts that handles multi-mapping reads in an unbiased way. Electronic supplementary material The online version of this article (doi:10.1186/s12859-017-1816-4) contains supplementary material, which is available to authorized users.
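The merging strategy described in this abstract can be illustrated with a minimal sketch. It is not mmquant's implementation: the function name `count_with_merged_genes`, the input layout (a mapping from read identifier to the set of genes the read hits) and the `--` separator used to name merged genes are all assumptions made for illustration.

```python
from collections import Counter


def count_with_merged_genes(read_hits):
    """Count reads per gene, creating a merged gene for ambiguous reads.

    read_hits: dict mapping a read id to the set of gene names it maps to
    (hypothetical input layout, assumed for this sketch).
    """
    counts = Counter()
    for genes in read_hits.values():
        if not genes:
            continue  # read does not overlap any annotated gene
        # A read hitting several genes is assigned to one merged feature
        # named after all candidate genes (sorted for a stable name).
        feature = "--".join(sorted(genes))
        counts[feature] += 1
    return counts


reads = {
    "r1": {"geneA"},           # unique hit
    "r2": {"geneA", "geneB"},  # multi-mapping read
    "r3": {"geneB", "geneA"},  # same ambiguity, same merged gene
    "r4": {"geneC"},
}
counts = count_with_merged_genes(reads)
for feature, n in sorted(counts.items()):
    print(feature, n)
```

In this sketch no read is dropped or split between genes: ambiguous reads are reported under the merged feature, alongside the counts of the input genes.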
|
12
|
The Arabidopsis hnRNP-Q Protein LIF2 and the PRC1 Subunit LHP1 Function in Concert to Regulate the Transcription of Stress-Responsive Genes. THE PLANT CELL 2016; 28:2197-2211. [PMID: 27495811 PMCID: PMC5059796 DOI: 10.1105/tpc.16.00244] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 03/24/2016] [Revised: 07/28/2016] [Accepted: 08/05/2016] [Indexed: 05/03/2023]
Abstract
LHP1-INTERACTING FACTOR2 (LIF2), a heterogeneous nuclear ribonucleoprotein involved in Arabidopsis thaliana cell fate and stress responses, interacts with LIKE HETEROCHROMATIN PROTEIN1 (LHP1), a Polycomb Repressive Complex1 subunit. To investigate LIF2-LHP1 functional interplay, we mapped their genome-wide distributions in wild-type, lif2, and lhp1 backgrounds, under standard and stress conditions. Interestingly, LHP1-targeted regions form local clusters, suggesting an underlying functional organization of the plant genome. Regions targeted by both LIF2 and LHP1 were enriched in stress-responsive genes, the H2A.Z histone variant, and antagonistic histone marks. We identified specific motifs within the targeted regions, including a G-box-like motif, a GAGA motif, and a telo-box. LIF2 and LHP1 can operate both antagonistically and synergistically. In response to methyl jasmonate treatment, LIF2 was rapidly recruited to chromatin, where it mediated transcriptional gene activation. Thus, LIF2 and LHP1 participate in transcriptional switches in stress-response pathways.
|
13
|
Arabidopsis RNASE THREE LIKE2 Modulates the Expression of Protein-Coding Genes via 24-Nucleotide Small Interfering RNA-Directed DNA Methylation. THE PLANT CELL 2016; 28:406-25. [PMID: 26764378 PMCID: PMC4790866 DOI: 10.1105/tpc.15.00540] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 06/16/2015] [Accepted: 01/12/2016] [Indexed: 05/08/2023]
Abstract
RNaseIII enzymes catalyze the cleavage of double-stranded RNA (dsRNA) and have diverse functions in RNA maturation. Arabidopsis thaliana RNASE THREE LIKE2 (RTL2), which carries one RNaseIII and two dsRNA binding (DRB) domains, is a unique Arabidopsis RNaseIII enzyme resembling the budding yeast small interfering RNA (siRNA)-producing Dcr1 enzyme. Here, we show that RTL2 modulates the production of a subset of small RNAs and that this activity depends on both its RNaseIII and DRB domains. However, the mode of action of RTL2 differs from that of Dcr1. Whereas Dcr1 directly cleaves dsRNAs into 23-nucleotide siRNAs, RTL2 likely cleaves dsRNAs into longer molecules, which are subsequently processed into small RNAs by the DICER-LIKE enzymes. Depending on the dsRNA considered, RTL2-mediated maturation either improves (RTL2-dependent loci) or reduces (RTL2-sensitive loci) the production of small RNAs. Because the vast majority of RTL2-regulated loci correspond to transposons and intergenic regions producing 24-nucleotide siRNAs that guide DNA methylation, RTL2 depletion modifies DNA methylation in these regions. Nevertheless, 13% of RTL2-regulated loci correspond to protein-coding genes. We show that changes in 24-nucleotide siRNA levels also affect DNA methylation levels at such loci and inversely correlate with mRNA steady state levels, thus implicating RTL2 in the regulation of protein-coding gene expression.
|
14
|
Post-transcriptional gene silencing triggered by sense transgenes involves uncapped antisense RNA and differs from silencing intentionally triggered by antisense transgenes. Nucleic Acids Res 2015. [PMID: 26209135 PMCID: PMC4787800 DOI: 10.1093/nar/gkv753] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Indexed: 01/06/2023]
Abstract
Although post-transcriptional gene silencing (PTGS) has been studied for more than a decade, there is still a gap in our understanding of how de novo silencing is initiated against genetic elements that are not supposed to produce double-stranded (ds)RNA. Given the pervasive transcription occurring throughout eukaryote genomes, we tested the hypothesis that unintended transcription could produce antisense (as)RNA molecules that participate in the initiation of PTGS triggered by sense transgenes (S-PTGS). Our results reveal a higher level of asRNA in Arabidopsis thaliana lines that spontaneously trigger S-PTGS than in lines that do not. However, PTGS triggered by antisense transgenes (AS-PTGS) differs from S-PTGS. In particular, a hypomorphic ago1 mutation that suppresses S-PTGS prevents the degradation of asRNA but not sense RNA during AS-PTGS, suggesting a different treatment of coding and non-coding RNA by AGO1, likely because of AGO1 association with polysomes. Moreover, the intended asRNA produced during AS-PTGS is capped whereas the asRNA produced during S-PTGS derives from 3′ maturation of a read-through transcript and is uncapped. Thus, we propose that uncapped asRNA corresponds to the aberrant RNA molecule that is converted to dsRNA by RNA-DEPENDENT RNA POLYMERASE 6 in siRNA-bodies to initiate S-PTGS, whereas capped asRNA must anneal with sense RNA to produce dsRNA that initiates AS-PTGS.
|
15
|
Genome expansion of Arabis alpina linked with retrotransposition and reduced symmetric DNA methylation. NATURE PLANTS 2015; 1:14023. [PMID: 27246759 DOI: 10.1038/nplants.2014.23] [Citation(s) in RCA: 114] [Impact Index Per Article: 12.7] [Received: 09/16/2014] [Accepted: 12/10/2014] [Indexed: 05/10/2023]
Abstract
Despite evolutionarily conserved mechanisms to silence transposable element activity, there are drastic differences in the abundance of transposable elements even among closely related plant species. We conducted a de novo assembly for the 375 Mb genome of the perennial model plant, Arabis alpina. Analysing this genome revealed long-lasting and recent transposable element activity predominantly driven by Gypsy long terminal repeat retrotransposons, which extended the low-recombining pericentromeres and transformed large formerly euchromatic regions into repeat-rich pericentromeric regions. This reduced capacity for long terminal repeat retrotransposon silencing and removal in A. alpina co-occurs with unexpectedly low levels of DNA methylation. Most remarkably, the striking reduction of symmetrical CG and CHG methylation suggests weakened DNA methylation maintenance in A. alpina compared with Arabidopsis thaliana. Phylogenetic analyses indicate a highly dynamic evolution of some components of methylation maintenance machinery that might be related to the unique methylation in A. alpina.
|
16
|
Abstract
MOTIVATION Recent technological advances are allowing many laboratories to sequence their research organisms. Available de novo assemblers leave repetitive portions of the genome poorly assembled. Some genomes contain high proportions of transposable elements, and transposable elements appear to be a major force behind diversity and adaptation. Few de novo assemblers for transposable elements exist, and most have either been designed for small genomes or 454 reads. RESULTS In this article, we present a new transposable element de novo assembler, Tedna, which assembles a set of transposable elements directly from the reads. Tedna uses Illumina paired-end reads, the most widely used sequencing technology for de novo assembly, and forms full-length transposable elements. AVAILABILITY AND IMPLEMENTATION Tedna is available at http://urgi.versailles.inra.fr/Tools/Tedna, under the GPLv3 license. It is written in C++11 and only requires the Sparsehash Package, freely available under the New BSD License. Tedna can be used on standard computers with limited RAM resources, although it may also use large memory for better results. Most of the code is parallelized and thus ready for large infrastructures.
|
17
|
Abstract
The non-coding transcriptome of the hyperthermophilic archaeon Pyrococcus abyssi is investigated using the RNA-seq technology. A dedicated computational pipeline analyzes RNA-seq reads and prior genome annotation to identify small RNAs, untranslated regions of mRNAs, and cis-encoded antisense transcripts. Unlike other archaea, such as Sulfolobus and Halobacteriales, P. abyssi produces few leaderless mRNA transcripts. Antisense transcription is widespread (215 transcripts) and targets protein-coding genes that appear to evolve more rapidly than average genes. We identify at least three novel H/ACA-like guide RNAs among the newly characterized non-coding RNAs. Long 5′ UTRs in mRNAs of ribosomal proteins and amino-acid biosynthesis genes strongly suggest the presence of cis-regulatory leaders in these mRNAs. We selected a high-interest subset of non-coding RNAs based on their strong promoters, high GC-content, phylogenetic conservation, or abundance. Some of the novel small RNAs and long 5′ UTRs display high GC contents, suggesting unknown structural RNA functions. However, we were surprised to observe that most of the high-interest RNAs are AU-rich, which suggests an absence of stable secondary structure in the high-temperature environment of P. abyssi. Yet, these transcripts display other hallmarks of functionality, such as high expression or high conservation, which leads us to consider possible RNA functions that do not require extensive secondary structure.
|
18
|
Identification of a novel microRNA (miRNA) from rice that targets an alternatively spliced transcript of the Nramp6 (Natural resistance-associated macrophage protein 6) gene involved in pathogen resistance. THE NEW PHYTOLOGIST 2013; 199:212-227. [PMID: 23627500 DOI: 10.1111/nph.12292] [Citation(s) in RCA: 89] [Impact Index Per Article: 8.1] [Received: 12/26/2012] [Accepted: 02/26/2013] [Indexed: 05/18/2023]
Abstract
Plants have evolved efficient defence mechanisms to defend themselves from pathogen attack. Although many studies have focused on the transcriptional regulation of defence responses, less is known about the involvement of microRNAs (miRNAs) as post-transcriptional regulators of gene expression in plant immunity. This work investigates miRNAs that are regulated by elicitors from the blast fungus Magnaporthe oryzae in rice (Oryza sativa). Small RNA libraries were constructed from rice tissues and subjected to high-throughput sequencing for the identification of elicitor-responsive miRNAs. Target gene expression was examined by microarray analysis. Transgenic lines were used for the analysis of miRNA functioning in disease resistance. Elicitor treatment is accompanied by dynamic alterations in the expression of a significant number of miRNAs, including new members of annotated miRNAs. Novel miRNAs from rice are proposed. We report a new rice miRNA, osa-miR7695, which negatively regulates an alternatively spliced transcript of OsNramp6 (Natural resistance-associated macrophage protein 6). This novel miRNA experienced natural and domestication selection events during evolution, and its overexpression in rice confers pathogen resistance. This study highlights an miRNA-mediated regulation of OsNramp6 in disease resistance, whilst illustrating the existence of a novel regulatory network that integrates miRNA function and mRNA processing in plant immunity.
|
19
|
Detection of non-coding RNA in bacteria and archaea using the DETR'PROK Galaxy pipeline. Methods 2013; 63:60-5. [PMID: 23806640 DOI: 10.1016/j.ymeth.2013.06.003] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 03/24/2013] [Revised: 05/21/2013] [Accepted: 06/09/2013] [Indexed: 11/29/2022]
Abstract
RNA-seq experiments are now routinely used for the large scale sequencing of transcripts. In bacteria or archaea, such deep sequencing experiments typically produce 10-50 million fragments that cover most of the genome, including intergenic regions. In this context, the precise delineation of the non-coding elements is challenging. Non-coding elements include untranslated regions (UTRs) of mRNAs, independent small RNA genes (sRNAs) and transcripts produced from the antisense strand of genes (asRNA). Here we present a computational pipeline (DETR'PROK: detection of ncRNAs in prokaryotes) based on the Galaxy framework that takes as input a mapping of deep sequencing reads and performs successive steps of clustering, comparison with existing annotation and identification of transcribed non-coding fragments classified into putative 5' UTRs, sRNAs and asRNAs. We provide a step-by-step description of the protocol using real-life example data sets from Vibrio splendidus and Escherichia coli.
|
20
|
|
21
|
Mapping reads on a genomic sequence: an algorithmic overview and a practical comparative analysis. J Comput Biol 2012; 19:796-813. [PMID: 22506536 DOI: 10.1089/cmb.2012.0022] [Citation(s) in RCA: 60] [Impact Index Per Article: 5.0] [Indexed: 11/12/2022] Open
Abstract
Mapping short reads against a reference genome is classically the first step of many next-generation sequencing data analyses, and it should be as accurate as possible. Because of the large number of reads to handle, numerous sophisticated algorithms have been developed in the last 3 years to tackle this problem. In this article, we first review the underlying algorithms used in most of the existing mapping tools, and then we compare the performance of nine of these tools on a well-controlled benchmark built for this purpose. We built a set of reads that exist in single or multiple copies in a reference genome and for which there is no mismatch, and a set of reads with three mismatches. We considered as reference genome both the human genome and a concatenation of all complete bacterial genomes. On each dataset, we quantified the capacity of the different tools to retrieve all the occurrences of the reads in the reference genome. Special attention was paid to reads uniquely reported and to reads with multiple hits.
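The evaluation idea in this abstract, measuring how many true occurrences of a read a tool retrieves, can be sketched as follows. This is an illustrative brute-force version, not the benchmark code from the article: the function names `occurrences` and `recall` are hypothetical, only substitutions on the forward strand are considered, and the naive scan is far too slow for real genomes.

```python
from typing import Iterable, List


def occurrences(read: str, reference: str, max_mismatches: int = 0) -> List[int]:
    """Return every 0-based start position where `read` aligns to `reference`
    with at most `max_mismatches` substitutions (no indels, forward strand only)."""
    hits = []
    n = len(read)
    if n == 0:
        return hits
    for i in range(len(reference) - n + 1):
        mismatches = 0
        for a, b in zip(read, reference[i:i + n]):
            if a != b:
                mismatches += 1
                if mismatches > max_mismatches:
                    break
        else:
            # The inner loop finished without exceeding the mismatch budget.
            hits.append(i)
    return hits


def recall(reported: Iterable[int], truth: Iterable[int]) -> float:
    """Fraction of true occurrences that a mapping tool reported."""
    truth_set = set(truth)
    if not truth_set:
        return 1.0
    return len(truth_set & set(reported)) / len(truth_set)


reference = "ACGTACGTAC"
exact_hits = occurrences("ACGT", reference)
fuzzy_hits = occurrences("ACGA", reference, max_mismatches=1)
# A hypothetical tool that reports only the first hit of a two-copy read.
tool_recall = recall([0], exact_hits)
print(exact_hits, fuzzy_hits, tool_recall)
```

A tool that reports a single location for a multi-copy read scores below 1.0 here, which is the kind of behavior the abstract singles out when it distinguishes uniquely reported reads from reads with multiple hits.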
|
22
|
Abstract
We present and validate BlastR, a method for efficiently and accurately searching non-coding RNAs. Our approach relies on the comparison of di-nucleotides using BlosumR, a new log-odd substitution matrix. In order to use BlosumR for comparison, we recoded RNA sequences into protein-like sequences. We then showed that BlosumR can be used along with the BlastP algorithm in order to search non-coding RNA sequences. Using Rfam as a gold standard, we benchmarked this approach and show BlastR to be more sensitive than BlastN. We also show that BlastR is both faster and more sensitive than BlastP used with a single nucleotide log-odd substitution matrix. BlastR, when used in combination with WU-BlastP, is about 5% more accurate than WU-BlastN and about 50 times slower. The approach shown here is equally effective when combined with the NCBI-Blast package. The software is an open source freeware available from www.tcoffee.org/blastr.html.
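The recoding step mentioned in this abstract can be illustrated with a short sketch. The abstract does not give the actual dinucleotide-to-letter assignment, so the mapping below is arbitrary, and the use of overlapping dinucleotides, the `recode` function name, and the `X` placeholder for ambiguous bases are assumptions made for illustration only.

```python
from itertools import product

NUCLEOTIDES = "ACGU"
# Sixteen arbitrary amino-acid letters, one per dinucleotide (hypothetical assignment).
LETTERS = "ACDEFGHIKLMNPQRS"
DINUC_TO_LETTER = {
    a + b: LETTERS[i] for i, (a, b) in enumerate(product(NUCLEOTIDES, repeat=2))
}


def recode(rna: str) -> str:
    """Recode an RNA sequence as a protein-like string, one letter per
    overlapping dinucleotide. Unknown dinucleotides (e.g. containing N)
    become 'X'. DNA input is accepted by treating T as U."""
    rna = rna.upper().replace("T", "U")
    return "".join(
        DINUC_TO_LETTER.get(rna[i:i + 2], "X") for i in range(len(rna) - 1)
    )


recoded = recode("ACGU")
print(recoded)
```

A sequence of length n yields n - 1 letters under this overlapping scheme. The recoded strings could then be passed to a protein search tool together with a dinucleotide substitution matrix, which is the role BlosumR plays in the abstract.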
|
23
|
Long noncoding RNAs with enhancer-like function in human cells. Cell 2010; 143:46-58. [PMID: 20887892 DOI: 10.1016/j.cell.2010.09.001] [Citation(s) in RCA: 1414] [Impact Index Per Article: 101.0] [Received: 04/23/2010] [Revised: 07/01/2010] [Accepted: 08/13/2010] [Indexed: 12/21/2022]
Abstract
While the long noncoding RNAs (ncRNAs) constitute a large portion of the mammalian transcriptome, their biological functions have remained elusive. A few long ncRNAs that have been studied in any detail silence gene expression in processes such as X-inactivation and imprinting. We used a GENCODE annotation of the human genome to characterize over a thousand long ncRNAs that are expressed in multiple cell lines. Unexpectedly, we found an enhancer-like function for a set of these long ncRNAs in human cell lines. Depletion of a number of ncRNAs led to decreased expression of their neighboring protein-coding genes, including the master regulator of hematopoiesis, SCL (also called TAL1), Snai1 and Snai2. Using heterologous transcription assays we demonstrated a requirement for the ncRNAs in activation of gene expression. These results reveal an unanticipated role for a class of long ncRNAs in activation of critical regulators of development and differentiation.
|
24
|
|
25
|
Abstract
The Weighted Constraint Satisfaction Problem (WCSP) framework allows representing and solving problems involving both hard constraints and cost functions. It has been applied to various problems, including resource allocation, bioinformatics, scheduling, etc. To solve such problems, solvers usually rely on branch-and-bound algorithms equipped with local consistency filtering, mostly soft arc consistency. However, these techniques are not well suited to solve problems with very large domains. Motivated by the resolution of an RNA gene localization problem inside large genomic sequences, and in the spirit of bounds consistency for large domains in crisp CSPs, we introduce soft bounds arc consistency (BAC), a new weighted local consistency specifically designed for WCSP with very large domains. Compared to soft arc consistency, BAC provides significantly improved time and space asymptotic complexity. In this paper, we show how the semantics of cost functions can be exploited to further improve the time complexity of BAC. We also compare, both in theory and in practice, the efficiency of BAC on a WCSP with bounds consistency enforced on a crisp CSP using cost variables. On two different real problems modeled as WCSP, including our RNA gene localization problem, we observe that maintaining bounds arc consistency outperforms arc consistency and also improves over bounds consistency enforced on a constraint model with cost variables.
|