1
|
Novel applications of Convolutional Neural Networks in the age of Transformers. Sci Rep 2024; 14:10000. [PMID: 38693215 PMCID: PMC11063149 DOI: 10.1038/s41598-024-60709-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2024] [Accepted: 04/26/2024] [Indexed: 05/03/2024] Open
Abstract
Convolutional Neural Networks (CNNs) have been central to the Deep Learning revolution and played a key role in initiating the new age of Artificial Intelligence. However, in recent years newer architectures such as Transformers have dominated both research and practical applications. While CNNs still play critical roles in many of the newer developments such as Generative AI, they are far from being thoroughly understood and utilised to their full potential. Here we show that CNNs can recognise patterns in images with scattered pixels and can be used to analyse complex datasets by transforming them into pseudo images with minimal processing for any high dimensional dataset, representing a more general approach to the application of CNNs to datasets such as in molecular biology, text, and speech. We introduce a pipeline called DeepMapper, which allows analysis of very high dimensional datasets without intermediate filtering and dimension reduction, thus preserving the full texture of the data, enabling detection of small variations normally deemed 'noise'. We demonstrate that DeepMapper can identify very small perturbations in large datasets with mostly random variables, and that it is superior in speed and on par in accuracy to prior work in processing large datasets with large numbers of features.
2
|
Natural antisense transcripts as versatile regulators of gene expression. Nat Rev Genet 2024:10.1038/s41576-024-00723-z. [PMID: 38632496 DOI: 10.1038/s41576-024-00723-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/07/2024] [Indexed: 04/19/2024]
Abstract
Long non-coding RNAs (lncRNAs) are emerging as a major class of gene products that have central roles in cell and developmental biology. Natural antisense transcripts (NATs) are an important subset of lncRNAs that are expressed from the opposite strand of protein-coding and non-coding genes and are a genome-wide phenomenon in both eukaryotes and prokaryotes. In eukaryotes, a myriad of NATs participate in regulatory pathways that affect expression of their cognate sense genes. Recent developments in the study of NATs and lncRNAs and large-scale sequencing and bioinformatics projects suggest that whether NATs regulate expression, splicing, stability or translation of the sense transcript is influenced by the pattern and degrees of overlap between the sense-antisense pair. Moreover, epigenetic gene regulatory mechanisms prevail in somatic cells whereas mechanisms dependent on the formation of double-stranded RNA intermediates are prevalent in germ cells. The modulating effects of NATs on sense transcript expression make NATs rational targets for therapeutic interventions.
3
|
A Kuhnian revolution in molecular biology: Most genes in complex organisms express regulatory RNAs. Bioessays 2023; 45:e2300080. [PMID: 37318305 DOI: 10.1002/bies.202300080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2023] [Revised: 05/31/2023] [Accepted: 06/01/2023] [Indexed: 06/16/2023]
Abstract
Thomas Kuhn described the progress of science as comprising occasional paradigm shifts separated by interludes of 'normal science'. The paradigm that has held sway since the inception of molecular biology is that genes (mainly) encode proteins. In parallel, theoreticians posited that mutation is random, inferred that most of the genome in complex organisms is non-functional, and asserted that somatic information is not communicated to the germline. However, many anomalies appeared, particularly in plants and animals: the strange genetic phenomena of paramutation and transvection; introns; repetitive sequences; a complex epigenome; lack of scaling of (protein-coding) genes and increase in 'noncoding' sequences with developmental complexity; genetic loci termed 'enhancers' that control spatiotemporal gene expression patterns during development; and a plethora of 'intergenic', overlapping, antisense and intronic transcripts. These observations suggest that the original conception of genetic information was deficient and that most genes in complex organisms specify regulatory RNAs, some of which convey intergenerational information. Also see the video abstract here: https://youtu.be/qxeGwahBANw.
4
|
Long non-coding RNAs: definitions, functions, challenges and recommendations. Nat Rev Mol Cell Biol 2023; 24:430-447. [PMID: 36596869 PMCID: PMC10213152 DOI: 10.1038/s41580-022-00566-8] [Citation(s) in RCA: 306] [Impact Index Per Article: 306.0] [Accepted: 11/16/2022] [Indexed: 01/05/2023]
Abstract
Genes specifying long non-coding RNAs (lncRNAs) occupy a large fraction of the genomes of complex organisms. The term 'lncRNAs' encompasses RNA polymerase I (Pol I), Pol II and Pol III transcribed RNAs, and RNAs from processed introns. The various functions of lncRNAs and their many isoforms and interleaved relationships with other genes make lncRNA classification and annotation difficult. Most lncRNAs evolve more rapidly than protein-coding sequences, are cell type specific and regulate many aspects of cell differentiation and development and other physiological processes. Many lncRNAs associate with chromatin-modifying complexes, are transcribed from enhancers and nucleate phase separation of nuclear condensates and domains, indicating an intimate link between lncRNA expression and the spatial control of gene expression during development. lncRNAs also have important roles in the cytoplasm and beyond, including in the regulation of translation, metabolism and signalling. lncRNAs often have a modular structure and are rich in repeats, which are increasingly being shown to be relevant to their function. In this Consensus Statement, we address the definition and nomenclature of lncRNAs and their conservation, expression, phenotypic visibility, structure and functions. We also discuss research challenges and provide recommendations to advance the understanding of the roles of lncRNAs in development, cell biology and disease.
5
|
RNA Regulatory Networks 2.0. Int J Mol Sci 2023; 24:ijms24109001. [PMID: 37240347 DOI: 10.3390/ijms24109001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2023] [Accepted: 05/18/2023] [Indexed: 05/28/2023] Open
Abstract
The central role of RNA molecules in cell biology has been an expanding subject of study since the proposal of the "RNA world" hypothesis 60 years ago [...].
6
|
RNA out of the mist. Trends Genet 2023; 39:187-207. [PMID: 36528415 DOI: 10.1016/j.tig.2022.11.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/26/2022] [Revised: 11/08/2022] [Accepted: 11/27/2022] [Indexed: 12/23/2022]
Abstract
RNA has long been regarded primarily as the intermediate between genes and proteins. It was a surprise then to discover that eukaryotic genes are mosaics of mRNA sequences interrupted by large tracts of transcribed but untranslated sequences, and that multicellular organisms also express many long 'intergenic' and antisense noncoding RNAs (lncRNAs). The identification of small RNAs that regulate mRNA translation and half-life did not disturb the prevailing view that animal and plant genomes are full of evolutionary debris and that their development is mainly supervised by transcription factors. Gathering evidence to the contrary involved addressing the low conservation, expression, and genetic visibility of lncRNAs, demonstrating their cell-specific roles in cell and developmental biology, and their association with chromatin-modifying complexes and phase-separated domains. The emerging picture is that most lncRNAs are the products of genetic loci termed 'enhancers', which marshal generic effector proteins to their sites of action to control cell fate decisions during development.
7
|
Nano3P-seq: transcriptome-wide analysis of gene expression and tail dynamics using end-capture nanopore cDNA sequencing. Nat Methods 2023; 20:75-85. [PMID: 36536091 PMCID: PMC9834059 DOI: 10.1038/s41592-022-01714-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2021] [Accepted: 11/03/2022] [Indexed: 12/24/2022]
Abstract
RNA polyadenylation plays a central role in RNA maturation, fate, and stability. In response to developmental cues, polyA tail lengths can vary, affecting the translation efficiency and stability of mRNAs. Here we develop Nanopore 3' end-capture sequencing (Nano3P-seq), a method that relies on nanopore cDNA sequencing to simultaneously quantify RNA abundance, tail composition, and tail length dynamics at per-read resolution. By employing a template-switching-based sequencing protocol, Nano3P-seq can sequence RNA molecules from their 3' ends, regardless of their polyadenylation status, without the need for PCR amplification or ligation of RNA adapters. We demonstrate that Nano3P-seq provides quantitative estimates of RNA abundance and tail lengths, and captures a wide diversity of RNA biotypes. We find that, in addition to mRNA and long non-coding RNA, polyA tails can be identified in 16S mitochondrial ribosomal RNA in both mouse and zebrafish models. Moreover, we show that mRNA tail lengths are dynamically regulated during vertebrate embryogenesis at an isoform-specific level, correlating with mRNA decay. Finally, we demonstrate the ability of Nano3P-seq to capture non-A bases within polyA tails of various lengths, and reveal their distribution during vertebrate embryogenesis. Overall, Nano3P-seq is a simple and robust method for accurately estimating transcript levels, tail lengths, and tail composition heterogeneity in individual reads, with minimal library preparation biases, both in the coding and non-coding transcriptome.
8
|
Abstract
Chemical RNA modifications, collectively referred to as the "epitranscriptome," are essential players in fine-tuning gene expression. Our ability to analyze RNA modifications has improved rapidly in recent years, largely due to the advent of high-throughput sequencing methodologies, which typically consist of coupling modification-specific reagents, such as antibodies or enzymes, to next-generation sequencing. Recently, it also became possible to map RNA modifications directly by sequencing native RNAs using nanopore technologies, which has been applied for the detection of a number of RNA modifications, such as N6-methyladenosine (m6A), pseudouridine (Ψ), and inosine (I). However, the signal modulations caused by most RNA modifications are yet to be determined. A global effort is needed to determine the signatures of the full range of RNA modifications to avoid the technical biases that have so far limited our understanding of the epitranscriptome.
9
|
ADRAM is an experience-dependent long noncoding RNA that drives fear extinction through a direct interaction with the chaperone protein 14-3-3. Cell Rep 2022; 38:110546. [PMID: 35320727 PMCID: PMC9015815 DOI: 10.1016/j.celrep.2022.110546] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 09/09/2021] [Revised: 02/03/2022] [Accepted: 02/28/2022] [Indexed: 11/25/2022] Open
Abstract
Here, we used RNA capture-seq to identify a large population of lncRNAs that are expressed in the infralimbic prefrontal cortex of adult male mice in response to fear-related learning. Combining these data with cell-type-specific ATAC-seq on neurons that had been selectively activated by fear extinction learning, we find 434 inducible lncRNAs that are derived from enhancer regions in the vicinity of protein-coding genes. In particular, we discover an experience-induced lncRNA we call ADRAM (activity-dependent lncRNA associated with memory) that acts as both a scaffold and a combinatorial guide to recruit the brain-enriched chaperone protein 14-3-3 to the promoter of the memory-associated immediate-early gene Nr4a2 and is required for fear extinction memory. This study expands the lexicon of experience-dependent lncRNA activity in the brain and highlights enhancer-derived RNAs (eRNAs) as key players in the epigenomic regulation of gene expression associated with the formation of fear extinction memory. Wei et al. use targeted RNA capture sequencing to examine experience-dependent long noncoding RNA activity in the infralimbic prefrontal cortex of adult mice. They discover a gene, which they call ADRAM, that is directly involved in the epigenomic regulation of gene expression underlying memory formation.
10
|
High frequency of intron retention and clustered H3K4me3-marked nucleosomes in short first introns of human long non-coding RNAs. Epigenetics Chromatin 2021; 14:45. [PMID: 34579770 PMCID: PMC8477579 DOI: 10.1186/s13072-021-00419-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/19/2021] [Accepted: 08/27/2021] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND It is established that protein-coding exons are preferentially localized in nucleosomes. To examine whether the same is true for non-coding exons, we analysed nucleosome occupancy in and adjacent to internal exons in genes encoding long non-coding RNAs (lncRNAs) in human CD4+ T cells and K562 cells. RESULTS We confirmed that internal exons in lncRNAs are preferentially associated with nucleosomes, but also observed an elevated signal from H3K4me3-marked nucleosomes in the sequences upstream of these exons. Examination of 200 genomic lncRNA loci chosen at random across all chromosomes showed that high-density regions of H3K4me3-marked nucleosomes, which we term 'slabs', are associated with genomic regions exhibiting intron retention. These retained introns occur in over 50% of lncRNAs examined and are mostly first introns with an average length of just 354 bp, compared to the average length of all human introns of 6355 and 7987 bp in mRNAs and lncRNAs, respectively. Removal of short introns from the dataset abrogated the high upstream H3K4me3 signal, confirming that the association of slabs and short lncRNA introns with intron retention holds genome-wide. The high upstream H3K4me3 signal is also associated with alternatively spliced exons, known to be prominent in lncRNAs. This phenomenon was not observed with mRNAs. CONCLUSIONS There is widespread intron retention and clustered H3K4me3-marked nucleosomes in short first introns of human long non-coding RNAs, which raises intriguing questions about the relationship of IR to lncRNA function and chromatin organization.
11
|
Widespread formation of double-stranded RNAs in testis. Genome Res 2021; 31:1174-1186. [PMID: 34158368 PMCID: PMC8256860 DOI: 10.1101/gr.265603.120] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/05/2020] [Accepted: 06/02/2021] [Indexed: 12/27/2022]
Abstract
The testis transcriptome is highly complex and includes RNAs that potentially hybridize to form double-stranded RNA (dsRNA). We isolated dsRNA using the monoclonal J2 antibody and deep-sequenced the enriched samples from testes of juvenile Dicer1 knockout mice, age-matched controls, and adult animals. Comparison of our data set with recently published data from mouse liver revealed that the dsRNA transcriptome in testis is markedly different from liver: In testis, dsRNA-forming transcripts derive from mRNAs including promoters and immediate downstream regions, whereas in somatic cells they originate more often from introns and intergenic transcription. The genes that generate dsRNA are significantly expressed in isolated male germ cells with particular enrichment in pachytene spermatocytes. dsRNA formation is lower on the sex (X and Y) chromosomes. The dsRNA transcriptome is significantly less complex in juvenile mice as compared to adult controls and, possibly as a consequence, the knockout of Dicer1 has only a minor effect on the total number of transcript peaks associated with dsRNA. The comparison between dsRNA-associated genes in testis and liver with a reported set of genes that produce endogenous siRNAs reveals a significant overlap in testis but not in liver. Testis dsRNAs also significantly associate with natural antisense genes-again, this feature is not observed in liver. These findings point to a testis-specific mechanism involving natural antisense transcripts and the formation of dsRNAs that feed into the RNA interference pathway, possibly to mitigate the mutagenic impacts of recombination and transposon mobilization.
12
|
Subcellular relocalization and nuclear redistribution of the RNA methyltransferases TRMT1 and TRMT1L upon neuronal activation. RNA Biol 2021; 18:1905-1919. [PMID: 33499731 DOI: 10.1080/15476286.2021.1881291] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 10/22/2022] Open
Abstract
RNA modifications are dynamic chemical entities that expand the RNA lexicon and regulate RNA fate. The most abundant modification present in mRNAs, N6-methyladenosine (m6A), has been implicated in neurogenesis and memory formation. However, whether additional RNA modifications may be playing a role in neuronal functions and in response to environmental cues is largely unknown. Here we characterize the biochemical function and cellular dynamics of two human RNA methyltransferases previously associated with neurological dysfunction, TRMT1 and its homolog, TRMT1-like (TRMT1L). Using a combination of next-generation sequencing, LC-MS/MS, patient-derived cell lines and knockout mouse models, we confirm the previously reported dimethylguanosine (m2,2G) activity of TRMT1 in tRNAs, as well as reveal that TRMT1L, whose activity was unknown, is responsible for methylating a subset of cytosolic tRNAAla(AGC) isodecoders at position 26. Using a cellular in vitro model that mimics neuronal activation and long term potentiation, we find that both TRMT1 and TRMT1L change their subcellular localization upon neuronal activation. Specifically, we observe a major subcellular relocalization from mitochondria and other cytoplasmic domains (TRMT1) and nucleoli (TRMT1L) to different small punctate compartments in the nucleus, which are as yet uncharacterized. This phenomenon does not occur upon heat shock, suggesting that the relocalization of TRMT1 and TRMT1L is not a general reaction to stress, but rather a specific response to neuronal activation. Our results suggest that subcellular relocalization of RNA modification enzymes may play a role in neuronal plasticity and transmission of information, presumably by addressing new targets.
13
|
Genetic Variations of Ultraconserved Elements in the Human Genome. OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY 2020; 23:549-559. [PMID: 31689173 DOI: 10.1089/omi.2019.0156] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 12/24/2022]
Abstract
Ultraconserved elements (UCEs) are among the most popular DNA markers for phylogenomic analysis. In at least three of five placental mammalian genomes (human, dog, cow, mouse, and rat), 2189 UCEs of at least 200 bp in length that are identical have been identified. Most of these regions have not yet been functionally annotated, and their associations with diseases remain largely unknown. This is an important knowledge gap in human genomics with regard to UCE roles in physiologically critical functions, and by extension, their relevance for shared susceptibilities to common complex diseases across several mammalian organisms in the event of their polymorphic variations. In the present study, we remapped the genomic locations of these UCEs to the latest human genome assembly, and examined them for documented polymorphisms in sequenced human genomes. We identified 29,983 polymorphisms within analyzed UCEs, but revealed that the vast majority exhibit very low minor allele frequencies. Notably, only 112 of the identified polymorphisms are associated with a phenotype in the Ensembl genome browser. Through literature analyses, we confirmed associations of 37 (i.e., out of the 112) polymorphisms within 23 UCEs with 25 diseases and phenotypic traits, including muscular dystrophies, eye diseases, and cancers (e.g., familial adenomatous polyposis). Most reports of UCE polymorphism-disease associations appeared not to be cognizant that their candidate polymorphisms were actually within UCEs. The present study offers strategic directions and knowledge gaps for future computational and experimental work so as to better understand the thus far intriguing and puzzling role(s) of UCEs in mammalian genomes.
14
|
Integrative analyses of the RNA modification machinery reveal tissue- and cancer-specific signatures. Genome Biol 2020; 21:97. [PMID: 32375858 PMCID: PMC7204298 DOI: 10.1186/s13059-020-02009-z] [Citation(s) in RCA: 50] [Impact Index Per Article: 12.5] [Received: 10/29/2019] [Accepted: 04/03/2020] [Indexed: 01/09/2023] Open
Abstract
BACKGROUND RNA modifications play central roles in cellular fate and differentiation. However, the machinery responsible for placing, removing, and recognizing more than 170 RNA modifications remains largely uncharacterized and poorly annotated, and we currently lack integrative studies that identify which RNA modification-related proteins (RMPs) may be dysregulated in each cancer type. RESULTS Here, we perform a comprehensive annotation and evolutionary analysis of human RMPs, as well as an integrative analysis of their expression patterns across 32 tissues, 10 species, and 13,358 paired tumor-normal human samples. Our analysis reveals an unanticipated heterogeneity of RMP expression patterns across mammalian tissues, with a vast proportion of duplicated enzymes displaying testis-specific expression, suggesting a key role for RNA modifications in sperm formation and possibly intergenerational inheritance. We uncover many RMPs that are dysregulated in various types of cancer, and whose expression levels are predictive of cancer progression. Surprisingly, we find that several commonly studied RNA modification enzymes such as METTL3 or FTO are not significantly upregulated in most cancer types, whereas several less-characterized RMPs, such as LAGE3 and HENMT1, are dysregulated in many cancers. CONCLUSIONS Our analyses reveal an unanticipated heterogeneity in the expression patterns of RMPs across mammalian tissues and uncover a large proportion of dysregulated RMPs in multiple cancer types. We provide novel targets for future cancer research studies targeting the human epitranscriptome, as well as foundations to understand cell type-specific behaviors that are orchestrated by RNA modifications.
15
|
Impacts of genomics on the health and social costs of intellectual disability. J Med Genet 2020; 57:479-486. [PMID: 31980565 DOI: 10.1136/jmedgenet-2019-106445] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 07/16/2019] [Revised: 12/17/2019] [Accepted: 01/03/2020] [Indexed: 11/03/2022]
Abstract
BACKGROUND This study provides an integrated assessment of the economic and social impacts of genomic sequencing for the detection of monogenic disorders resulting in intellectual disability (ID). METHODS Multiple knowledge bases were cross-referenced and analysed to compile a reference list of monogenic disorders associated with ID. Multiple literature searches were used to quantify the health and social costs for the care of people with ID. Health and social expenditures and the current cost of whole-exome sequencing and whole-genome sequencing were quantified in relation to the more common causes of ID and their impact on lifespan. RESULTS On average, individuals with ID incur annual costs in terms of health costs, disability support, lost income and other social costs of US$172 000, accumulating to many millions of dollars over a lifetime. CONCLUSION The diagnosis of monogenic disorders through genomic testing provides the opportunity to improve the diagnosis and management, and to reduce the costs of ID through informed reproductive decisions, reductions in unproductive diagnostic tests and increasingly targeted therapies.
16
|
Abstract
The epitranscriptomics field has undergone an enormous expansion in the last few years; however, a major limitation is the lack of generic methods to map RNA modifications transcriptome-wide. Here, we show that using direct RNA sequencing, N6-methyladenosine (m6A) RNA modifications can be detected with high accuracy, in the form of systematic errors and decreased base-calling qualities. Specifically, we find that our algorithm, trained with m6A-modified and unmodified synthetic sequences, can predict m6A RNA modifications with ~90% accuracy. We then extend our findings to yeast data sets, finding that our method can identify m6A RNA modifications in vivo with an accuracy of 87%. Moreover, we further validate our method by showing that these 'errors' are typically not observed in yeast ime4-knockout strains, which lack m6A modifications. Our results open avenues to investigate the biological roles of RNA modifications in their native RNA context.
17
|
CNS cell type-specific gene profiling of P301S tau transgenic mice identifies genes dysregulated by progressive tau accumulation. J Biol Chem 2019; 294:14149-14162. [PMID: 31366728 DOI: 10.1074/jbc.ra118.005263] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 08/14/2018] [Revised: 07/24/2019] [Indexed: 12/20/2022] Open
Abstract
The microtubule-associated protein tau undergoes aberrant modification resulting in insoluble brain deposits in various neurodegenerative diseases, including frontotemporal dementia (FTD), progressive supranuclear palsy, and corticobasal degeneration. Tau aggregates can form in different cell types of the central nervous system (CNS) but are most prevalent in neurons. We have previously recapitulated aspects of human FTD in mouse models by overexpressing mutant human tau in CNS neurons, including a P301S tau variant in TAU58/2 mice, characterized by early-onset and progressive behavioral deficits and FTD-like neuropathology. The molecular mechanisms underlying the functional deficits of TAU58/2 mice remain mostly elusive. Here, we employed functional genomics (i.e. RNAseq) to determine differentially expressed genes in young and aged TAU58/2 mice to identify alterations in cellular processes that may contribute to neuropathy. We identified genes in cortical brain samples differentially regulated between young and old TAU58/2 mice relative to nontransgenic littermates and by comparative analysis with a dataset of CNS cell type-specific genes expressed in nontransgenic mice. Most differentially-regulated genes had known or putative roles in neurons and included presynaptic and excitatory genes. Specifically, we observed changes in presynaptic factors, glutamatergic signaling, and protein scaffolding. Moreover, in the aged mice, expression levels of several genes whose expression was annotated to occur in other brain cell types were altered. Immunoblotting and immunostaining of brain samples from the TAU58/2 mice confirmed altered expression and localization of identified and network-linked proteins. Our results have revealed genes dysregulated by progressive tau accumulation in an FTD mouse model.
18
|
Targeted, High-Resolution RNA Sequencing of Non-coding Genomic Regions Associated With Neuropsychiatric Functions. Front Genet 2019; 10:309. [PMID: 31031799 PMCID: PMC6473190 DOI: 10.3389/fgene.2019.00309] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 02/03/2019] [Accepted: 03/21/2019] [Indexed: 12/18/2022] Open
Abstract
The human brain is one of the last frontiers of biomedical research. Genome-wide association studies (GWAS) have succeeded in identifying thousands of haplotype blocks associated with a range of neuropsychiatric traits, including disorders such as schizophrenia, Alzheimer's and Parkinson's disease. However, the majority of single nucleotide polymorphisms (SNPs) that mark these haplotype blocks fall within non-coding regions of the genome, hindering their functional validation. While some of these GWAS loci may contain cis-acting regulatory DNA elements such as enhancers, we hypothesized that many are also transcribed into non-coding RNAs that are missing from publicly available transcriptome annotations. Here, we use targeted RNA capture ('RNA CaptureSeq') in combination with nanopore long-read cDNA sequencing to transcriptionally profile 1,023 haplotype blocks across the genome containing non-coding GWAS SNPs associated with neuropsychiatric traits, using post-mortem human brain tissue from three neurologically healthy donors. We find that the majority (62%) of targeted haplotype blocks, including 13% of intergenic blocks, are transcribed into novel, multi-exonic RNAs, most of which are not yet recorded in GENCODE annotations. We validated our findings with short-read RNA-seq, providing orthogonal confirmation of novel splice junctions and enabling a quantitative assessment of the long-read assemblies. Many novel transcripts are supported by independent evidence of transcription including cap analysis of gene expression (CAGE) data and epigenetic marks, and some show signs of potential functional roles. We present these transcriptomes as a preliminary atlas of non-coding transcription in human brain that can be used to connect neurological phenotypes with gene expression.
19
|
Abstract 2453: Eradication of neuroblastoma by suppressing the expression of a single noncoding RNA. Cancer Res 2018. [DOI: 10.1158/1538-7445.am2018-2453] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
N-Myc gene amplification occurs in one quarter of human neuroblastoma tissues and is a marker of poor patient prognosis. We performed RNA sequencing experiments and identified 5 transcripts, including RP1NB1, that were the most differentially expressed between N-Myc gene amplified and nonamplified human neuroblastoma cell lines. Affymetrix microarray studies revealed that DEPD was one of the few genes considerably downregulated in neuroblastoma cells after RP1NB1 depletion. Chromatin immunoprecipitation assays showed that knocking down RP1NB1 expression reduced histone H3 lysine 4 trimethylation, a marker for active gene transcription, at the DEPD gene promoter. Luciferase assays demonstrated that knocking down RP1NB1 decreased DEPD gene promoter activity. Depletion of RP1NB1 or DEPD with two independent siRNAs or shRNAs significantly reduced ERK protein phosphorylation, N-Myc protein phosphorylation at Serine 62, N-Myc protein stabilization, and neuroblastoma cell proliferation and survival. Clonogenic assays showed that doxycycline-induced knockdown of RP1NB1 completely abolished the colony formation capacity of neuroblastoma cells stably transfected with doxycycline-inducible RP1NB1 shRNAs. Importantly, doxycycline treatment of mice xenografted with neuroblastoma cells stably transfected with doxycycline-inducible RP1NB1 shRNA led to tumor eradication. In human neuroblastoma tissues from 600 neuroblastoma patients, high levels of RP1NB1 gene expression correlated with DEPD gene expression and poor patient prognosis. In conclusion, this study identifies the novel long noncoding RNA RP1NB1 as an important regulator of N-Myc protein stability and neuroblastoma tumorigenesis.
Citation Format: Andrew E. Tee, Pei Y. Liu, Giorgio Milazzo, Kate M. Hannan, Jesper Maag, Nenad Bartonicek, Renhua Song, Chen C. Jiang, Xu D. Zhang, Murray D. Norris, Michelle Haber, Glenn M. Marshall, Jinyan Li, Jo Vandesompele, John S. Mattick, Pieter Mestdagh, Giovanni Perini, Ross D. Hannan, Marcel E. Dinger, Tao Liu. Eradication of neuroblastoma by suppressing the expression of a single noncoding RNA [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2018; 2018 Apr 14-18; Chicago, IL. Philadelphia (PA): AACR; Cancer Res 2018;78(13 Suppl):Abstract nr 2453.
|
20
|
Abstract
The amount of regulatory RNA encoded in the genome and the extent of RNA editing by the post-transcriptional deamination of adenosine to inosine (A-I) have increased with developmental complexity and may be an important factor in the cognitive evolution of animals. The newest member of the A-I editing family of ADAR proteins, the vertebrate-specific ADAR3, is highly expressed in the brain, but its functional significance is unknown. In vitro studies have suggested that ADAR3 acts as a negative regulator of A-I RNA editing but the scope and underlying mechanisms are also unknown. Meta-analysis of published data indicates that mouse Adar3 expression is highest in the hippocampus, thalamus, amygdala, and olfactory region. Consistent with this, we show that mice lacking exon 3 of Adar3 (which encodes two double stranded RNA binding domains) have increased levels of anxiety and deficits in hippocampus-dependent short- and long-term memory formation. RNA sequencing revealed a dysregulation of genes involved in synaptic function in the hippocampi of Adar3-deficient mice. We also show that ADAR3 transiently translocates from the cytoplasm to the nucleus upon KCl-mediated activation in SH-SY5Y cells. These results indicate that ADAR3 contributes to cognitive processes in mammals.
|
21
|
Whole genome sequencing provides better diagnostic yield and future value than whole exome sequencing. Med J Aust 2018; 209:197-199. [DOI: 10.5694/mja17.01176] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 11/30/2017] [Accepted: 03/08/2018] [Indexed: 12/21/2022]
|
22
|
Universal Alternative Splicing of Noncoding Exons. Cell Syst 2018; 6:245-255.e5. [DOI: 10.1016/j.cels.2017.12.005] [Citation(s) in RCA: 64] [Impact Index Per Article: 10.7] [Received: 05/13/2017] [Revised: 10/18/2017] [Accepted: 12/08/2017] [Indexed: 01/31/2023]
|
23
|
DotAligner: identification and clustering of RNA structure motifs. Genome Biol 2017; 18:244. [PMID: 29284541 PMCID: PMC5747123 DOI: 10.1186/s13059-017-1371-3] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 06/01/2017] [Accepted: 12/05/2017] [Indexed: 01/01/2023] Open
Abstract
The diversity of processed transcripts in eukaryotic genomes poses a challenge for the classification of their biological functions. Sparse sequence conservation in non-coding sequences and the unreliable nature of RNA structure predictions further exacerbate this conundrum. Here, we describe a computational method, DotAligner, for the unsupervised discovery and classification of homologous RNA structure motifs from a set of sequences of interest. Our approach outperforms comparable algorithms at clustering known RNA structure families, both in speed and accuracy. It identifies clusters of known and novel structure motifs from ENCODE immunoprecipitation data for 44 RNA-binding proteins.
|
24
|
Intergenic disease-associated regions are abundant in novel transcripts. Genome Biol 2017; 18:241. [PMID: 29284497 PMCID: PMC5747244 DOI: 10.1186/s13059-017-1363-3] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 05/31/2017] [Accepted: 11/21/2017] [Indexed: 12/21/2022] Open
Abstract
BACKGROUND: Genotyping of large populations through genome-wide association studies (GWAS) has successfully identified many genomic variants associated with traits or disease risk. Unexpectedly, a large proportion of GWAS single nucleotide polymorphisms (SNPs) and associated haplotype blocks are in intronic and intergenic regions, hindering their functional evaluation. While some of these risk-susceptibility regions encompass cis-regulatory sites, their transcriptional potential has never been systematically explored. RESULTS: To detect rare tissue-specific expression, we employed the transcript-enrichment method CaptureSeq on 21 human tissues to identify 1775 multi-exonic transcripts from 561 intronic and intergenic haploblocks associated with 392 traits and diseases, covering 73.9 Mb (2.2%) of the human genome. We show that a large proportion (85%) of disease-associated haploblocks express novel multi-exonic non-coding transcripts that are tissue-specific and enriched for GWAS SNPs as well as epigenetic markers of active transcription and enhancer activity. Similarly, we captured transcriptomes from 13 melanomas, targeting nine melanoma-associated haploblocks, and characterized 31 novel melanoma-specific transcripts that include fusion proteins, novel exons and non-coding RNAs, one-third of which showed allelically imbalanced expression. CONCLUSIONS: This resource of previously unreported transcripts in disease-associated regions (http://gwas-captureseq.dingerlab.org) should provide an important starting point for the translational community in search of novel biomarkers, disease mechanisms, and drug targets.
|
25
|
Abstract
RNA modifications have been historically considered as fine-tuning chemo-structural features of infrastructural RNAs, such as rRNAs, tRNAs, and snoRNAs. This view has changed dramatically in recent years, to a large extent as a result of systematic efforts to map and quantify various RNA modifications in a transcriptome-wide manner, revealing that RNA modifications are reversible, dynamically regulated, far more widespread than originally thought, and involved in major biological processes, including cell differentiation, sex determination, and stress responses. Here we summarize the state of knowledge and provide a catalog of RNA modifications and their links to neurological disorders, cancers, and other diseases. With the advent of direct RNA-sequencing technologies, we expect that this catalog will help prioritize those RNA modifications for transcriptome-wide maps.
|
26
|
Prioritising the application of genomic medicine. NPJ Genom Med 2017; 2:35. [PMID: 29263844 PMCID: PMC5698310 DOI: 10.1038/s41525-017-0037-0] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 06/28/2017] [Revised: 10/20/2017] [Accepted: 10/25/2017] [Indexed: 12/25/2022] Open
Abstract
The clinical translation of genomic sequencing is hampered by the limited information available to guide investment into those areas where genomics is well placed to deliver improved health and economic outcomes. To date, genomic medicine has achieved its greatest successes through applications to diseases that have a high genotype–phenotype correlation and high penetrance, with a near certainty that the individual will develop the condition in the presence of the genotype. It has been anticipated that genomics will play an important role in promoting population health by targeting at-risk individuals and reducing the incidence of highly prevalent, costly, complex diseases, with potential applications across screening, prevention, and treatment decisions. However, where primary or secondary prevention requires behavioural changes, there is currently very little evidence to support reduction in disease incidence. A better understanding of the relationship between genomic variation and complex diseases will be necessary before effective genomic risk identification and management of the risk of complex diseases in healthy individuals can be carried out in clinical practice. Our recommended approach is that priority for genomic testing should focus on diseases where there is strong genotype–phenotype correlation, high or certain penetrance, the effects of the disease are serious and near-term, there is the potential for prevention and/or treatment, and the net costs incurred are acceptable for the health gains achieved.
|
27
|
Abstract
The human genome sequence is freely available, nearly complete and is providing a foundation of research opportunities that are overturning our current understanding of human biology. The advent of next generation sequencing has revolutionized the way we can interrogate the genome and its transcriptional products and how we analyze, diagnose, monitor and even treat human disease. Personal genetic profiles are increasing dramatically in medical value as researchers accumulate more and more knowledge about the interaction between genetic and environmental factors that contribute to the onset of common disorders. As the cost of sequencing plummets, whole genome sequencing of individuals is becoming a reality and the field of personalized genomic medicine is rapidly developing. Now there is great need for accurate annotation of all functionally important sequences in the human genome and the variations within them that contribute to health and disease. The vast majority of our genome gives rise to RNA transcripts. This extraordinarily versatile molecule not only encodes protein information but also has great structural dynamics and plasticity, capacity for DNA/RNA/protein interactions and catalytic activity. It is a key regulator of biological networks with clear links to human disease and a more comprehensive understanding of its function is needed to maximise its use in medical practice. This review focuses on the complexity of our genome and the impact of sequencing technologies in understanding its many products and functions in health and disease.
|
28
|
Differential intron retention in Jumonji chromatin modifier genes is implicated in reptile temperature-dependent sex determination. Sci Adv 2017; 3:e1700731. [PMID: 28630932 PMCID: PMC5470834 DOI: 10.1126/sciadv.1700731] [Citation(s) in RCA: 87] [Impact Index Per Article: 12.4] [Indexed: 05/16/2023]
Abstract
In many vertebrates, sex of offspring is determined by external environmental cues rather than by sex chromosomes. In reptiles, for instance, temperature-dependent sex determination (TSD) is common. Despite decades of work, the mechanism by which temperature is converted into a sex-determining signal remains mysterious. This is partly because it is difficult to distinguish the primary molecular events of TSD from the confounding downstream signatures of sexual differentiation. We use the Australian central bearded dragon, in which chromosomal sex determination is overridden at high temperatures to produce sex-reversed female offspring, as a unique model to identify TSD-specific features of the transcriptome. We show that an intron is retained in mature transcripts from each of two Jumonji family genes, JARID2 and JMJD3, in female dragons that have been sex-reversed by temperature but not in normal chromosomal females or males. JARID2 is a component of the master chromatin modifier Polycomb Repressive Complex 2, and the mammalian sex-determining factor SRY is directly regulated by an independent but closely related Jumonji family member. We propose that the perturbation of JARID2/JMJD3 function by intron retention alters the epigenetic landscape to override chromosomal sex-determining cues, triggering sex reversal at extreme temperatures. Sex reversal may then facilitate a transition from genetic sex determination to TSD, with JARID2/JMJD3 intron retention preserved as the decisive regulatory signal. Significantly, we also observe sex-associated differential retention of the equivalent introns in JARID2/JMJD3 transcripts expressed in embryonic gonads from TSD alligators and turtles, indicative of a reptile-wide mechanism controlling TSD.
|
29
|
The Dimensions, Dynamics, and Relevance of the Mammalian Noncoding Transcriptome. Trends Genet 2017; 33:464-478. [PMID: 28535931 DOI: 10.1016/j.tig.2017.04.004] [Citation(s) in RCA: 143] [Impact Index Per Article: 20.4] [Received: 03/15/2017] [Accepted: 04/24/2017] [Indexed: 01/02/2023]
Abstract
The combination of pervasive transcription and prolific alternative splicing produces a mammalian transcriptome of great breadth and diversity. The majority of transcribed genomic bases are intronic, antisense, or intergenic to protein-coding genes, yielding a plethora of short and long non-protein-coding regulatory RNAs. Long noncoding RNAs (lncRNAs) share most aspects of their biogenesis, processing, and regulation with mRNAs. However, lncRNAs are typically expressed in more restricted patterns, frequently from enhancers, and exhibit almost universal alternative splicing. These features are consistent with their role as modular epigenetic regulators. We describe here the key studies and technological advances that have shaped our understanding of the dimensions, dynamics, and biological relevance of the mammalian noncoding transcriptome.
|
30
|
Initiating an undiagnosed diseases program in the Western Australian public health system. Orphanet J Rare Dis 2017; 12:83. [PMID: 28468665 PMCID: PMC5415708 DOI: 10.1186/s13023-017-0619-z] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 12/18/2016] [Accepted: 03/26/2017] [Indexed: 02/02/2023] Open
Abstract
Background: New approaches are required to address the needs of patients with complex undiagnosed diseases. These approaches include clinical genomic diagnostic pipelines, utilizing intra- and multi-disciplinary platforms, as well as specialty-specific genomic clinics. Both are advancing diagnostic rates. However, complementary cross-disciplinary approaches are also critical to address those patients with multisystem disorders who traverse the bounds of multiple specialties and remain undiagnosed despite existing intra-specialty and genomic-focused approaches. The diagnostic possibilities of undiagnosed diseases include genetic and non-genetic conditions. The focus on genetic diseases addresses some of these disorders; however, a cross-disciplinary approach is needed that also simultaneously addresses other disorder types. Herein, we describe the initiation and summary outcomes of a public health system approach for complex undiagnosed patients: the Undiagnosed Diseases Program-Western Australia (UDP-WA). Results: Briefly, the UDP-WA is: i) one of a complementary suite of approaches that is being delivered within the health service, and with community engagement, to address the needs of those with severe undiagnosed diseases; ii) delivered within a public health system to support equitable access to health care, including for those from remote and regional areas; iii) providing diagnoses and improved patient care; iv) delivering a platform for in-service and real time genomic and phenomic education for clinicians that traverses a diverse range of specialties; v) retaining and recapturing clinical expertise; vi) supporting the education of junior and more senior medical staff; vii) designed to integrate with clinical translational research; and viii) supporting greater connectedness for patients, families and medical staff.
Conclusion: The UDP-WA has been initiated in the public health system to complement existing clinical genomic approaches; it has been targeted to those with a specific diagnostic need, and initiated by redirecting existing clinical and financial resources. The UDP-WA supports the provision of equitable and sustainable diagnostics and simultaneously supports capacity building in clinical care and translational research, for those with undiagnosed, typically rare, conditions. Electronic supplementary material: The online version of this article (doi:10.1186/s13023-017-0619-z) contains supplementary material, which is available to authorized users.
|
31
|
Improved definition of the mouse transcriptome via targeted RNA sequencing. Genome Res 2016; 26:705-16. [PMID: 27197243 PMCID: PMC4864457 DOI: 10.1101/gr.199760.115] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 09/17/2015] [Accepted: 02/23/2016] [Indexed: 11/24/2022]
Abstract
Targeted RNA sequencing (CaptureSeq) uses oligonucleotide probes to capture RNAs for sequencing, providing enriched read coverage, accurate measurement of gene expression, and quantitative expression data. We applied CaptureSeq to refine transcript annotations in the current murine GRCm38 assembly. More than 23,000 regions corresponding to putative or annotated long noncoding RNAs (lncRNAs) and 154,281 known splicing junction sites were selected for targeted sequencing across five mouse tissues and three brain subregions. The results illustrate that the mouse transcriptome is considerably more complex than previously thought. We assemble more complete transcript isoforms than GENCODE, expand transcript boundaries, and connect interspersed islands of mapped reads. We describe a novel filtering pipeline that identifies previously unannotated but high-quality transcript isoforms. In this set, 911 GENCODE neighboring genes are condensed into 400 expanded gene models. Additionally, 594 GENCODE lncRNAs acquire an open reading frame (ORF) when their structure is extended with CaptureSeq. Finally, we validate our observations using current FANTOM and Mouse ENCODE resources.
|
32
|
Abstract
Protein-coding RNAs represent only a small fraction of the transcriptional output in higher eukaryotes. The remaining RNA species encompass a broad range of molecular functions and regulatory roles, a consequence of the structural polyvalence of RNA polymers. Although several classes of small noncoding RNAs are relatively well characterized, the accessibility of affordable high-throughput sequencing is generating a wealth of novel, unannotated transcripts, especially long noncoding RNAs (lncRNAs) that are derived from genomic regions that are antisense, intronic, intergenic, and overlapping protein-coding loci. Parsing and characterizing the functions of noncoding RNAs, lncRNAs in particular, is one of the great challenges of modern genome biology. Here we discuss concepts and computational methods for the identification of structural domains in lncRNAs from genomic and transcriptomic data. In the first part, we briefly review how to identify RNA structural motifs in individual lncRNAs. In the second part, we describe how to leverage the evolutionary dynamics of structured RNAs in a computationally efficient screen to detect putative functional lncRNA motifs using comparative genomics.
|
33
|
Spliced synthetic genes as internal controls in RNA sequencing experiments. Nat Methods 2016; 13:792-8. [DOI: 10.1038/nmeth.3958] [Citation(s) in RCA: 95] [Impact Index Per Article: 11.9] [Received: 03/23/2016] [Accepted: 06/29/2016] [Indexed: 11/09/2022]
|
34
|
Abstract
Breast cancer is a heterogeneous disease that can be classified into several distinct molecular subtypes based on gene expression. Like mRNAs and miRNAs, long noncoding RNAs (lncRNAs) differ dramatically in expression across subtypes and can be used for classification. While there has been considerable emphasis on miRNAs, we still know little about the role of lncRNAs, which comprise the majority of the mammalian transcriptome. Recently, the importance of lncRNAs in cancer has been highlighted by several studies. We have examined the expression profiles of >17,000 lncRNAs in a large set of breast tumors and have identified a lncRNA, AK001796, that is overexpressed in aggressive breast cancers. In particular, AK001796 is enriched in the aggressive claudin-low, HER2-enriched, and luminal B subtypes. Furthermore, in four different models, we find that AK001796 is significantly upregulated in cell lines induced to undergo EMT and in putative mesenchymal-like cancer stem cells within cell lines, suggesting that this lncRNA is an inducer or facilitator of EMT. Similar results were obtained when a lung cancer cell line was induced to undergo EMT through TGF-β treatment. Using cell fractionation, we have discovered that AK001796 is maintained predominantly in the nucleus. By RACE we have identified two isoforms of AK001796 in breast cancer cells that differ by the presence or absence of a 94-nucleotide intron. The short form (with the intron spliced out) appears to be the variant enriched following EMT and may serve as a marker of aggressiveness. Interestingly, knockdown of AK001796 using antisense oligonucleotides led to significantly increased apoptosis in EMT-positive cell lines, whereas in EMT-negative cells knockdown had little effect. Preliminary pull-down studies using biotinylated oligos identified some mesenchymal phenotype-associated proteins as interacting partners.
To further investigate the molecular mechanisms regulated by AK001796 and its utility as a therapeutic target, we will determine the pathways induced by AK001796 by mapping its protein interaction network and downstream signaling pathways. These results nominate AK001796 as a promising therapeutic target in aggressive breast cancers.
Citation Format: Maneesh Kumar, Rebecca Sinnott DeVaux, Julia J. Shen, Steven P. Davis, Marcel E. Dinger, John S. Mattick, Charles M. Perou, Jeffrey M. Rosen, Sendurai A. Mani, Jason I. Herschkowitz. LncRNA AK001796 as a therapeutic target in aggressive breast cancers. [abstract]. In: Proceedings of the 107th Annual Meeting of the American Association for Cancer Research; 2016 Apr 16-20; New Orleans, LA. Philadelphia (PA): AACR; Cancer Res 2016;76(14 Suppl):Abstract nr 1598.
|
35
|
The Evx1/Evx1as gene locus regulates anterior-posterior patterning during gastrulation. Sci Rep 2016; 6:26657. [PMID: 27226347 PMCID: PMC4880930 DOI: 10.1038/srep26657] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 01/11/2016] [Accepted: 04/29/2016] [Indexed: 01/09/2023] Open
Abstract
Thousands of sense-antisense mRNA-lncRNA gene pairs occur in the mammalian genome. While there is usually little doubt about the function of the coding transcript, the function of the lncRNA partner is mostly untested. Here we examine the function of the homeotic Evx1-Evx1as gene locus. Expression is tightly co-regulated in posterior mesoderm of mouse embryos and in embryoid bodies. Expression of both genes is enhanced by BMP4 and WNT3A, and reduced by Activin. We generated a suite of deletions in the locus by CRISPR-Cas9 editing. We show that EVX1 is a critical downstream effector of BMP4 and WNT3A with respect to patterning of posterior mesoderm. The lncRNA Evx1as arises from alternative promoters and is difficult to fully abrogate by gene editing or siRNA approaches. Nevertheless, we were able to generate a large 2.6 kb deletion encompassing the shared promoter with Evx1 and multiple additional exons of Evx1as. This led to a dorsal-ventral patterning defect identical to that generated by micro-deletion in the DNA-binding domain of EVX1. Thus, Evx1as has no function independent of EVX1, and is therefore unlikely to act in trans. We predict many antisense lncRNAs have no specific trans function, possibly only regulating the linked coding genes in cis.
|
36
|
Abstract A09: The long noncoding RNA SPRIGHTLY regulates cell proliferation in primary human melanocytes. Cancer Res 2016. [DOI: 10.1158/1538-7445.nonrna15-a09] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
The long non-coding RNA (lncRNA) SPRIGHTLY (formerly SPRY4-IT1), which lies within the intronic region of the SPRY4 gene, is upregulated in human melanoma cells compared to melanocytes. SPRIGHTLY regulates a number of cancer hallmarks including proliferation, motility, and apoptosis. To better understand its oncogenic role, SPRIGHTLY was stably transfected into human melanocytes, which resulted in increased cellular proliferation, invasion, and development of a multinucleated dendritic-like phenotype. RNA sequencing and mass spectrometric analysis of SPRIGHTLY-expressing cells revealed changes in the expression of genes involved in cell proliferation, apoptosis, chromosome organization, regulation of DNA damage responses, and cell cycle. The proliferation marker Ki67, minichromosome maintenance genes (MCM2-5), and the anti-apoptotic genes X-linked inhibitor of apoptosis (XIAP) and baculoviral IAP repeat-containing 7 (livin) were upregulated in SPRIGHTLY-expressing melanocytes, while the pro-apoptotic tumor suppressor gene DPPIV/CD26 was downregulated. Since downregulation of DPPIV is known to be associated with malignant transformation in melanocytes, SPRIGHTLY-mediated DPPIV downregulation may play an important role in melanoma pathobiology. These findings provide novel insights into how SPRIGHTLY regulates proliferation and apoptosis in primary human melanocytes.
Citation Format: Wei Zhao, Joseph Mazar, Bongyong Lee, Junko Sawada, Jian-Liang Li, John Shelley, Subramaniam Govindarajan, Dwight Towler, John S. Mattick, Masanobu Komatsu, Marcel E. Dinger, Ranjan J. Perera. The long noncoding RNA SPRIGHTLY regulates cell proliferation in primary human melanocytes. [abstract]. In: Proceedings of the AACR Special Conference on Noncoding RNAs and Cancer: Mechanisms to Medicines; 2015 Dec 4-7; Boston, MA. Philadelphia (PA): AACR; Cancer Res 2016;76(6 Suppl):Abstract nr A09.
|
37
|
|
38
|
The Long Noncoding RNA SPRIGHTLY Regulates Cell Proliferation in Primary Human Melanocytes. J Invest Dermatol 2016; 136:819-828. [PMID: 26829028 DOI: 10.1016/j.jid.2016.01.018] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Received: 04/09/2015] [Revised: 12/14/2015] [Accepted: 12/14/2015] [Indexed: 01/29/2023]
Abstract
The long noncoding RNA SPRIGHTLY (formerly SPRY4-IT1), which lies within the intronic region of the SPRY4 gene, is up-regulated in human melanoma cells compared to melanocytes. SPRIGHTLY regulates a number of cancer hallmarks, including proliferation, motility, and apoptosis. To better understand its oncogenic role, SPRIGHTLY was stably transfected into human melanocytes, which resulted in increased cellular proliferation, colony formation, invasion, and development of a multinucleated dendritic-like phenotype. RNA sequencing and mass spectrometric analysis of SPRIGHTLY-expressing cells revealed changes in the expression of genes involved in cell proliferation, apoptosis, chromosome organization, regulation of DNA damage responses, and cell cycle. The proliferation marker Ki67, minichromosome maintenance genes 2-5, antiapoptotic gene X-linked inhibitor of apoptosis, and baculoviral IAP repeat-containing 7 were all up-regulated in SPRIGHTLY-expressing melanocytes, whereas the proapoptotic tumor suppressor gene DPPIV/CD26 was down-regulated, followed by an increase in extracellular signal-regulated kinase 1/2 phosphorylation, suggesting an increase in mitogen-activated protein kinase activity. Because down-regulation of DPPIV is known to be associated with malignant transformation in melanocytes, SPRIGHTLY-mediated DPPIV down-regulation may play an important role in melanoma pathobiology. Together, these findings provide important insights into how SPRIGHTLY regulates cell proliferation and anchorage-independent colony formation in primary human melanocytes.
|
39
|
|
40
|
Abstract
During the splicing reaction, the 5′ intron end is joined to the branchpoint nucleotide, selecting the next exon to incorporate into the mature RNA and forming an intron lariat, which is excised. Despite a critical role in gene splicing, the locations and features of human splicing branchpoints are largely unknown. We use exoribonuclease digestion and targeted RNA-sequencing to enrich for sequences that traverse the lariat junction and, by split and inverted alignment, reveal the branchpoint. We identify 59,359 high-confidence human branchpoints in >10,000 genes, providing a first map of splicing branchpoints in the human genome. Branchpoints are predominantly adenosine, highly conserved, and closely distributed to the 3′ splice site. Analysis of human branchpoints reveals numerous novel features, including distinct features of branchpoints for alternatively spliced exons and a family of conserved sequence motifs overlapping branchpoints we term B-boxes, which exhibit maximal nucleotide diversity while maintaining interactions with the keto-rich U2 snRNA. Different B-box motifs exhibit divergent usage in vertebrate lineages and associate with other splicing elements and distinct intron–exon architectures, suggesting integration within a broader regulatory splicing code. Lastly, although branchpoints are refractory to common mutational processes and genetic variation, mutations occurring at branchpoint nucleotides are enriched for disease associations.
|
41
|
Transpositional shuffling and quality control in male germ cells to enhance evolution of complex organisms. Ann N Y Acad Sci 2014; 1341:156-63. [PMID: 25557795 PMCID: PMC4390386 DOI: 10.1111/nyas.12608] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 12/20/2022]
Abstract
Complex organisms, particularly mammals, have long generation times and produce small numbers of progeny that undergo increasingly entangled developmental programs. This reduces the ability of such organisms to explore evolutionary space, and, consequently, strategies that mitigate this problem likely have a strategic advantage. Here, we suggest that animals exploit the controlled shuffling of transposons to enhance genomic variability in conjunction with a molecular screening mechanism to exclude deleterious events. Accordingly, the removal of repressive DNA-methylation marks during male germ cell development is an evolved function that exploits the mutagenic potential of transposable elements. A wave of transcription during the meiotic phase of spermatogenesis produces the most complex transcriptome of all mammalian cells, including genic and noncoding sense-antisense RNA pairs that enable a genome-wide quality-control mechanism. Cells that fail the genomic quality test are excluded from further development, eventually resulting in a positively selected mature sperm population. We suggest that these processes, enhanced variability and stringent molecular quality control, compensate for the apparent reduced potential of complex animals to adapt and evolve.
|
42
|
Extracellular vesicles from neural stem cells transfer IFN-γ via Ifngr1 to activate Stat1 signaling in target cells. Mol Cell 2014; 56:193-204. [PMID: 25242146 PMCID: PMC4578249 DOI: 10.1016/j.molcel.2014.08.020] [Citation(s) in RCA: 171] [Impact Index Per Article: 17.1] [Received: 02/25/2014] [Revised: 07/22/2014] [Accepted: 08/15/2014] [Indexed: 12/20/2022]
Abstract
The idea that stem cell therapies work only via cell replacement is challenged by the observation of consistent intercellular molecule exchange between the graft and the host. Here we defined a mechanism of cellular signaling by which neural stem/precursor cells (NPCs) communicate with the microenvironment via extracellular vesicles (EVs), and we elucidated its molecular signature and function. We observed cytokine-regulated pathways that sort proteins and mRNAs into EVs. We described induction of interferon gamma (IFN-γ) pathway in NPCs exposed to proinflammatory cytokines that is mirrored in EVs. We showed that IFN-γ bound to EVs through Ifngr1 activates Stat1 in target cells. Finally, we demonstrated that endogenous Stat1 and Ifngr1 in target cells are indispensable to sustain the activation of Stat1 signaling by EV-associated IFN-γ/Ifngr1 complexes. Our study identifies a mechanism of cellular signaling regulated by EV-associated IFN-γ/Ifngr1 complexes, which grafted stem cells may use to communicate with the host immune system.
|
43
|
The impact of genomics on the future of medicine and health. Med J Aust 2014; 201:17-20. [DOI: 10.5694/mja13.10920] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 07/16/2013] [Accepted: 02/03/2014] [Indexed: 11/17/2022]
|
44
|
Effects of a novel long noncoding RNA, lncUSMycN, on N-Myc expression and neuroblastoma progression. J Natl Cancer Inst 2014; 106:dju113. [PMID: 24906397 DOI: 10.1093/jnci/dju113] [Citation(s) in RCA: 86] [Impact Index Per Article: 8.6] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND: Patients with neuroblastoma due to the amplification of a 130-kb genomic DNA region containing the MYCN oncogene have poor prognoses.
METHODS: Bioinformatics data were used to discover a novel long noncoding RNA, lncUSMycN, at the 130-kb amplicon. RNA-protein pull-down assays were used to identify proteins bound to lncUSMycN RNA. Kaplan-Meier survival analysis, multivariable Cox regression, and two-sided log-rank test were used to examine the prognostic value of lncUSMycN and NonO expression in three cohorts of neuroblastoma patients (n = 47, 88, and 476, respectively). Neuroblastoma-bearing mice were treated with antisense oligonucleotides targeting lncUSMycN (n = 12) or mismatch sequence (n = 13), and results were analyzed by multiple comparison two-way analysis of variance. All statistical tests were two-sided.
RESULTS: Bioinformatics data predicted lncUSMycN gene and RNA, and reverse-transcription polymerase chain reaction confirmed its three exons and two introns. The lncUSMycN gene was coamplified with MYCN in 88 of 341 human neuroblastoma tissues. lncUSMycN RNA bound to the RNA-binding protein NonO, leading to N-Myc RNA upregulation and neuroblastoma cell proliferation. High levels of lncUSMycN and NonO expression in human neuroblastoma tissues independently predicted poor patient prognoses (lncUSMycN: hazard ratio [HR] = 1.87, 95% confidence interval [CI] = 1.06 to 3.28, P = .03; NonO: HR = 2.48, 95% CI = 1.34 to 4.57, P = .004). Treatment with antisense oligonucleotides targeting lncUSMycN in neuroblastoma-bearing mice statistically significantly hindered tumor progression (P < .001).
CONCLUSIONS: Our data demonstrate the important roles of lncUSMycN and NonO in regulating N-Myc expression and neuroblastoma oncogenesis and provide the first evidence that amplification of long noncoding RNA genes can contribute to tumorigenesis.
|
45
|
Abstract
Discoveries over the past decade portend a paradigm shift in molecular biology. Evidence suggests that RNA is not only functional as a messenger between DNA and protein but also involved in the regulation of genome organization and gene expression, which is increasingly elaborate in complex organisms. Regulatory RNA seems to operate at many levels; in particular, it plays an important part in the epigenetic processes that control differentiation and development. These discoveries suggest a central role for RNA in human evolution and ontogeny. Here, we review the emergence of the previously unsuspected world of regulatory RNA from a historical perspective.
|
46
|
Abstract
Genetic knockout experiments on mice confirm that some long noncoding RNA molecules have developmental functions.
|
47
|
Non-coding RNAs in homeostasis, disease and stress responses: an evolutionary perspective. Brief Funct Genomics 2013; 12:254-78. [PMID: 23709461 DOI: 10.1093/bfgp/elt016] [Citation(s) in RCA: 69] [Impact Index Per Article: 6.3] [Indexed: 02/06/2023] Open
Abstract
Cells and organisms are subject to challenges and perturbations in their environment and physiology in all stages of life. The molecular response to such changes, including insulting conditions such as pathogen infections, involves coordinated modulation of gene expression programmes and has not only homeostatic but also ecological and evolutionary importance. Although attention has been primarily focused on signalling pathways and protein networks, non-coding RNAs (ncRNAs), which comprise a significant output of the genomes of prokaryotes and especially eukaryotes, are increasingly implicated in the molecular mechanisms of these responses. Long and short ncRNAs not only regulate development and cell physiology, they are also involved in disease states, including cancers, in host-pathogen interactions, and in a variety of stress responses. Indeed, regulatory RNAs are part of genetically encoded response networks and also underpin epigenetic processes, which are emerging as key mechanisms of adaptation and transgenerational inheritance. Here we present the growing evidence that ncRNAs are intrinsically involved in cellular and organismal adaptation processes, in both robustness and protection to stresses, as well as in mechanisms generating evolutionary change.
|
48
|
Mapping of mitochondrial RNA-protein interactions by digital RNase footprinting. Cell Rep 2013; 5:839-48. [PMID: 24183674 DOI: 10.1016/j.celrep.2013.09.036] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 06/30/2013] [Revised: 09/04/2013] [Accepted: 09/25/2013] [Indexed: 01/01/2023] Open
Abstract
Human mitochondrial DNA is transcribed as long polycistronic transcripts that encompass each strand of the genome and are processed subsequently into mature mRNAs, tRNAs, and rRNAs, necessitating widespread posttranscriptional regulation. Here, we establish methods for massively parallel sequencing and analyses of RNase-accessible regions of human mitochondrial RNA and thereby identify specific regions within mitochondrial transcripts that are bound by proteins. This approach provides a range of insights into the contribution of RNA-binding proteins to the regulation of mitochondrial gene expression.
|
49
|
Abstract
An expansive functionality and complexity has been ascribed to the majority of the human genome that was unanticipated at the outset of the draft sequence and assembly a decade ago. We are now faced with the challenge of integrating and interpreting this complexity in order to achieve a coherent view of genome biology. We argue that the linear representation of the genome exacerbates this complexity and an understanding of its three-dimensional structure is central to interpreting the regulatory and transcriptional architecture of the genome. Chromatin conformation capture techniques and high-resolution microscopy have afforded an emergent global view of genome structure within the nucleus. Chromosomes fold into complex, territorialized three-dimensional domains in concert with specialized subnuclear bodies that harbor concentrations of transcription and splicing machinery. The signature of these folds is retained within the layered regulatory landscapes annotated by chromatin immunoprecipitation, and we propose that genome contacts are reflected in the organization and expression of interweaved networks of overlapping coding and noncoding transcripts. This pervasive impact of genome structure favors a preeminent role for the nucleoskeleton and RNA in regulating gene expression by organizing these folds and contacts. Accordingly, we propose that the local and global three-dimensional structure of the genome provides a consistent, integrated, and intuitive framework for interpreting and understanding the regulatory and transcriptional complexity of the human genome.
|
50
|
Abstract A039: The role of long noncoding RNAs in epithelial to mesenchymal transition and cancer stem cells. Mol Cancer Res 2013. [DOI: 10.1158/1557-3125.advbc-a039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
The claudin-low subtype is generally triple (ER, PR, HER2) negative, and there are currently no targeted agents directed at it. These tumors express low levels of tight and adherens junction genes, including claudin 3 and E-cadherin, and high levels of markers associated with epithelial-mesenchymal transition (EMT), including Snail, Twist, and Zeb1/2. Claudin-low tumors are also enriched in signatures derived from human tumor-initiating cells and a sorted population enriched for human mammary stem cells. miRNAs are differentially expressed in claudin-low tumors, including low expression of the miR-200 family, which regulates EMT and stemness. MiR-200 overexpression in claudin-low cell lines causes them to lose this classification and to adopt an expression profile of another subtype. While there has been considerable emphasis on miRNAs, our knowledge is still lacking about the role of long noncoding RNAs (lncRNAs) that comprise the majority of the mammalian transcriptome. Here, we have examined the expression profiles of >17,000 lncRNAs in a large set of breast tumors. Like mRNAs and miRNAs, lncRNAs differ dramatically in expression across subtypes and can be used for classification. LncRNAs that are differentially regulated in cell lines induced to undergo EMT are associated with claudin-low tumors, and we have identified some of these lncRNAs as potential regulators of the EMT/CSC phenotype. We have begun to study the subcellular localization and potential function of a couple of these candidate lncRNAs using RNA FISH and siRNA knockdown, respectively. These results suggest major roles for noncoding RNAs in claudin-low breast tumors and in the regulation of breast cancer stem cells.
Citation Format: Jason I. Herschkowitz, Cristian Coarfa, Aleix Prat, Michael J. Toneff, Katherine A. Hoadley, Marcel E. Dinger, John S. Mattick, Sendurai A. Mani, Charles M. Perou, Jeffrey M. Rosen. The role of long noncoding RNAs in epithelial to mesenchymal transition and cancer stem cells. [abstract]. In: Proceedings of the AACR Special Conference on Advances in Breast Cancer Research: Genetics, Biology, and Clinical Applications; Oct 3-6, 2013; San Diego, CA. Philadelphia (PA): AACR; Mol Cancer Res 2013;11(10 Suppl):Abstract nr A039.
|