1. Ghafouri‐Fard S, Harsij A, Farahzadi H, Hussen BM, Taheri M, Mokhtari M. A concise review on the role of MIR100HG in human disorders. J Cell Mol Med 2023; 27:2278-2289. [PMID: 37487022; PMCID: PMC10424294; DOI: 10.1111/jcmm.17875] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2023] [Revised: 07/12/2023] [Accepted: 07/18/2023] [Indexed: 07/26/2023] Open
Abstract
MIR100HG is a long non-coding RNA (lncRNA) encoded by a locus on chr11:122,028,203-122,556,721. This gene can regulate cell proliferation, apoptosis, cell cycle transition and cell differentiation. MIR100HG was first identified through a transcriptome analysis and found to regulate differentiation of human neural stem cells. It is functionally related to a number of signalling pathways, such as the TGF-β, Wnt, Hippo and ERK/MAPK pathways. Dysregulation of MIR100HG has been detected in a variety of cancers in association with clinical outcomes. Moreover, it has a role in the pathophysiology of dilated cardiomyopathy, intervertebral disk degeneration and pulmonary fibrosis. The current study summarizes the role of this lncRNA in human disorders.
Affiliation(s)
- Soudeh Ghafouri‐Fard
  - Department of Medical Genetics, School of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Atefeh Harsij
  - Phytochemistry Research Centre, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Hossein Farahzadi
  - Phytochemistry Research Centre, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Bashdar Mahmud Hussen
  - Department of Clinical Analysis, College of Pharmacy, Hawler Medical University, Erbil, Iraq
- Mohammad Taheri
  - Urology and Nephrology Research Centre, Shahid Beheshti University of Medical Sciences, Tehran, Iran
  - Institute of Human Genetics, Jena University Hospital, Jena, Germany
- Majid Mokhtari
  - Skull Base Research Centre, Loghman Hakim Hospital, Shahid Beheshti University of Medical Sciences, Tehran, Iran
2. Wu Y, Wang Z, Yu S, Liu D, Sun L. LncmiRHG-MIR100HG: A new budding star in cancer. Front Oncol 2022; 12:997532. [PMID: 36212400; PMCID: PMC9544809; DOI: 10.3389/fonc.2022.997532] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 07/25/2022] [Accepted: 09/12/2022] [Indexed: 11/24/2022] Open
Abstract
MIR100HG, also known as the lncRNA mir-100-let-7a-2-mir-125b-1 cluster host gene, has emerged in recent years as a new and critical regulator in cancers. MIR100HG is dysregulated in various cancers, where it plays an oncogenic or tumor-suppressive role and participates in many tumor cell biology processes and cancer-related pathways. The aberrant expression of MIR100HG has prompted investigation of its function and of its diagnostic and therapeutic potential in cancers. Many studies have indicated that dysregulated expression of MIR100HG is markedly correlated with poor prognosis and clinicopathological features. In this review, we highlight the characteristics of MIR100HG, introduce its role in different cancers, and summarize its molecular mechanisms, pathways, involvement in chemoresistance, and current research progress. Furthermore, some open questions in this rapidly advancing field are proposed. These updates clarify our understanding of MIR100HG in cancers, which may pave the way for the application of MIR100HG-targeting approaches in future cancer diagnosis, prognosis, and therapy.
Affiliation(s)
- Yingnan Wu
  - Cancer Center, Department of Ultrasound Medicine, Zhejiang Provincial People’s Hospital, Affiliated People’s Hospital of Hangzhou Medical College, Hangzhou, China
- Zhenzhen Wang
  - Cancer Center, Department of Ultrasound Medicine, Zhejiang Provincial People’s Hospital, Affiliated People’s Hospital of Hangzhou Medical College, Hangzhou, China
- Shan Yu
  - Department of Pathology, The 2nd Affiliated Hospital of Harbin Medical University, Harbin, China
- Dongzhe Liu
  - Department of Hematology and Oncology, International Cancer Center, Shenzhen Key Laboratory, Shenzhen University General Hospital, Shenzhen University Clinical Medical Academy, Shenzhen University Health Science Center, Shenzhen, China
  - *Correspondence: Litao Sun; Dongzhe Liu
- Litao Sun
  - Cancer Center, Department of Ultrasound Medicine, Zhejiang Provincial People’s Hospital, Affiliated People’s Hospital of Hangzhou Medical College, Hangzhou, China
  - *Correspondence: Litao Sun; Dongzhe Liu
3. Pseudomonas syringae AlgU Downregulates Flagellin Gene Expression, Helping Evade Plant Immunity. J Bacteriol 2020; 202:JB.00418-19. [PMID: 31740494; DOI: 10.1128/jb.00418-19] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 06/19/2019] [Accepted: 11/08/2019] [Indexed: 12/15/2022] Open
Abstract
Flagella power bacterial movement through liquids and over surfaces to access or avoid certain environmental conditions, ultimately increasing a cell's probability of survival and reproduction. In some cases, flagella and chemotaxis are key virulence factors enabling pathogens to gain entry and attach to suitable host tissues. However, flagella are not always beneficial; both plant and animal immune systems have evolved receptors to sense the proteins that make up flagellar filaments as signatures of bacterial infection. Microbes poorly adapted to avoid or counteract these immune functions are unlikely to be successful in host environments, and this selective pressure has driven the evolution of diverse and often redundant pathogen compensatory mechanisms. We tested the role of AlgU, the Pseudomonas extracytoplasmic function sigma factor σE/σ22 ortholog, in regulating flagellar expression in the context of Pseudomonas syringae-plant interactions. We found that AlgU is necessary for downregulating bacterial flagellin expression in planta and that this results in a corresponding reduction in plant immune elicitation. This AlgU-dependent regulation of flagellin gene expression is beneficial to bacterial growth in the course of plant infection, and eliminating the plant's ability to detect flagellin makes this AlgU-dependent function irrelevant for bacteria growing in the apoplast. Together, these results add support to an emerging model in which P. syringae AlgU functions at a key control point that serves to optimize the expression of bacterial functions during host interactions, including minimizing the expression of immune elicitors and concomitantly upregulating beneficial virulence functions.
IMPORTANCE: Foliar plant pathogens, like Pseudomonas syringae, adjust their physiology and behavior to facilitate host colonization and disease, but the full extent of these adaptations is not known. Plant immune systems are triggered by bacterial molecules, such as the proteins that make up flagellar filaments. In this study, we found that during plant infection, AlgU, a gene expression regulator that is responsive to external stimuli, downregulates expression of fliC, which encodes the flagellin protein, a strong elicitor of plant immune systems. This change in gene expression and resultant change in behavior correlate with reduced plant immune activation and improved P. syringae plant colonization. The results of this study demonstrate the proximate and ultimate causes of flagellar regulation in a plant-pathogen interaction.
4. MIR100HG: a credible prognostic biomarker and an oncogenic lncRNA in gastric cancer. Biosci Rep 2019; 39:BSR20190171. [PMID: 30886062; PMCID: PMC6449568; DOI: 10.1042/bsr20190171] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Received: 01/20/2019] [Revised: 03/07/2019] [Accepted: 03/11/2019] [Indexed: 12/24/2022] Open
Abstract
MIR100HG expression has been observed to be up-regulated or down-regulated in human cancer tissues depending on tumor type. However, the role of MIR100HG in gastric cancer had not been reported. In our study, we first found that levels of MIR100HG expression were increased in gastric cancer cell lines and tissue samples compared with a normal gastric epithelial cell line and adjacent normal gastric mucosa tissue samples, respectively. Moreover, high MIR100HG expression was positively associated with clinical stage, tumor invasion, lymph node metastasis, and distant metastasis in gastric cancer patients. Survival analysis showed that MIR100HG expression was negatively correlated with clinical outcome in gastric cancer patients from both The Cancer Genome Atlas (TCGA) database and our study, and that high MIR100HG expression served as an independent poor prognostic factor for overall survival in gastric cancer patients. In vitro experiments suggested that down-regulation of MIR100HG expression inhibits cell proliferation, migration, and invasion in gastric cancer. In conclusion, MIR100HG is a credible prognostic biomarker and functions as an oncogenic lncRNA in gastric cancer.
5. Giuffra E, Tuggle CK. Functional Annotation of Animal Genomes (FAANG): Current Achievements and Roadmap. Annu Rev Anim Biosci 2018; 7:65-88. [PMID: 30427726; DOI: 10.1146/annurev-animal-020518-114913] [Citation(s) in RCA: 94] [Impact Index Per Article: 15.7] [Indexed: 11/09/2022]
Abstract
Functional annotation of genomes is a prerequisite for contemporary basic and applied genomic research, yet farmed animal genomics is deficient in such annotation. To address this, the FAANG (Functional Annotation of Animal Genomes) Consortium is producing genome-wide data sets on RNA expression, DNA methylation, and chromatin modification, as well as chromatin accessibility and interactions. In addition to informing our understanding of genome function, including comparative approaches to elucidate constrained sequence or epigenetic elements, these annotation maps will improve the precision and sensitivity of genomic selection for animal improvement. A scientific community-driven effort has already created a coordinated data collection and analysis enterprise crucial for the success of this global effort. Although it is early in this continuing process, functional data have already been produced and application to genetic improvement reported. The functional annotation delivered by the FAANG initiative will add value and utility to the greatly improved genome sequences being established for domesticated animal species.
Affiliation(s)
- Elisabetta Giuffra
  - Génétique Animale et Biologie Intégrative (GABI), Institut National de la Recherche Agronomique (INRA), AgroParisTech, Université Paris Saclay, 78350 Jouy-en-Josas, France
6. Michelini F, Jalihal AP, Francia S, Meers C, Neeb ZT, Rossiello F, Gioia U, Aguado J, Jones-Weinert C, Luke B, Biamonti G, Nowacki M, Storici F, Carninci P, Walter NG, d'Adda di Fagagna F. From "Cellular" RNA to "Smart" RNA: Multiple Roles of RNA in Genome Stability and Beyond. Chem Rev 2018; 118:4365-4403. [PMID: 29600857; DOI: 10.1021/acs.chemrev.7b00487] [Citation(s) in RCA: 57] [Impact Index Per Article: 9.5] [Indexed: 12/18/2022]
Abstract
Coding for proteins has been considered the main function of RNA since the "central dogma" of biology was proposed. The discovery of noncoding transcripts shed light on additional roles of RNA, ranging from the support of polypeptide synthesis, to the assembly of subnuclear structures, to gene expression modulation. Cellular RNA has therefore been recognized as a central player in often unanticipated biological processes, including genomic stability. This ever-expanding list of functions inspired us to think of RNA as a "smart" phone, which has replaced the older obsolete "cellular" phone. In this review, we summarize the last two decades of advances in research on the interface between RNA biology and genome stability. We start with an account of the emergence of noncoding RNA, and then we discuss the involvement of RNA in DNA damage signaling and repair, telomere maintenance, and genomic rearrangements. We continue with the depiction of single-molecule RNA detection techniques, and we conclude by illustrating the possibilities of RNA modulation in hopes of creating or improving new therapies. The widespread biological functions of RNA have made this molecule a reoccurring theme in basic and translational research, warranting it the transcendence from classically studied "cellular" RNA to "smart" RNA.
Affiliation(s)
- Flavia Michelini
  - IFOM - The FIRC Institute of Molecular Oncology, Milan, 20139, Italy
- Ameya P Jalihal
  - Single Molecule Analysis Group and Center for RNA Biomedicine, Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109-1055, United States
- Sofia Francia
  - IFOM - The FIRC Institute of Molecular Oncology, Milan, 20139, Italy
  - Istituto di Genetica Molecolare, CNR - Consiglio Nazionale delle Ricerche, Pavia, 27100, Italy
- Chance Meers
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Zachary T Neeb
  - Institute of Cell Biology, University of Bern, Baltzerstrasse 4, 3012 Bern, Switzerland
- Ubaldo Gioia
  - IFOM - The FIRC Institute of Molecular Oncology, Milan, 20139, Italy
- Julio Aguado
  - IFOM - The FIRC Institute of Molecular Oncology, Milan, 20139, Italy
- Brian Luke
  - Institute of Developmental Biology and Neurobiology, Johannes Gutenberg University, 55099 Mainz, Germany
  - Institute of Molecular Biology (IMB), 55128 Mainz, Germany
- Giuseppe Biamonti
  - Istituto di Genetica Molecolare, CNR - Consiglio Nazionale delle Ricerche, Pavia, 27100, Italy
- Mariusz Nowacki
  - Institute of Cell Biology, University of Bern, Baltzerstrasse 4, 3012 Bern, Switzerland
- Francesca Storici
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Piero Carninci
  - RIKEN Center for Life Science Technologies, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama City, Kanagawa 230-0045, Japan
- Nils G Walter
  - Single Molecule Analysis Group and Center for RNA Biomedicine, Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109-1055, United States
- Fabrizio d'Adda di Fagagna
  - IFOM - The FIRC Institute of Molecular Oncology, Milan, 20139, Italy
  - Istituto di Genetica Molecolare, CNR - Consiglio Nazionale delle Ricerche, Pavia, 27100, Italy
7. Chen YM, Liu Y, Wei HY, Lv KZ, Fu PF. Large intergenic non-coding RNA-ROR reverses gemcitabine-induced autophagy and apoptosis in breast cancer cells. Oncotarget 2018; 7:59604-59617. [PMID: 27449099; PMCID: PMC5312334; DOI: 10.18632/oncotarget.10730] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Received: 04/08/2016] [Accepted: 06/30/2016] [Indexed: 12/19/2022] Open
Abstract
The purpose of this study was to elucidate the potential role of long intergenic non-protein coding RNA, regulator of reprogramming (linc-ROR) in gemcitabine (Gem)-induced autophagy and apoptosis in breast cancer cells. MDA-MB-231 cells were treated with short hairpin RNA (shRNA) to knock down linc-ROR expression in the presence of Gem. Gem treatment alone decreased cell survival and increased both apoptosis and autophagy. Gem treatment also increased the expression of LC3-II, Beclin 1, NOTCH1 and Bcl-2, but decreased expression of p62 and p53. Untreated MDA-MB-231 cell lines strongly expressed linc-ROR, but linc-ROR knockdown decreased cell viability and expression of p62 and p53 while increasing apoptosis. Linc-ROR knockdown also increased LC3-II/β-actin, Beclin 1, NOTCH1, and Bcl-2 expression, as well as the number of autophagic vesicles in MDA-MB-231 cells. Linc-ROR negatively regulated miR-34a expression by inhibiting histone H3 acetylation in the miR-34a promoter. We conclude that linc-ROR suppresses Gem-induced autophagy and apoptosis in breast cancer cells by silencing miR-34a expression.
Affiliation(s)
- Yao-Min Chen
  - Department of Breast Surgery, The First Affiliated Hospital of Zhejiang University, Hangzhou 310000, P.R. China
- Yu Liu
  - Department of Breast Surgery, The First Affiliated Hospital of Zhejiang University, Hangzhou 310000, P.R. China
- Hai-Yan Wei
  - Department of Breast Surgery, The First Affiliated Hospital of Zhejiang University, Hangzhou 310000, P.R. China
- Ke-Zhen Lv
  - Department of Breast Surgery, The First Affiliated Hospital of Zhejiang University, Hangzhou 310000, P.R. China
- Pei-Fen Fu
  - Department of Breast Surgery, The First Affiliated Hospital of Zhejiang University, Hangzhou 310000, P.R. China
8. Algama M, Tasker E, Williams C, Parslow AC, Bryson-Richardson RJ, Keith JM. Genome-wide identification of conserved intronic non-coding sequences using a Bayesian segmentation approach. BMC Genomics 2017; 18:259. [PMID: 28347272; PMCID: PMC5369223; DOI: 10.1186/s12864-017-3645-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 11/08/2016] [Accepted: 03/18/2017] [Indexed: 11/17/2022] Open
Abstract
Background: Computational identification of non-coding RNAs (ncRNAs) is a challenging problem. We describe a genome-wide analysis using Bayesian segmentation to identify intronic elements highly conserved between three evolutionarily distant vertebrate species: human, mouse and zebrafish. We investigate the extent to which these elements include ncRNAs (or conserved domains of ncRNAs) and regulatory sequences.
Results: We identified 655 deeply conserved intronic sequences in a genome-wide analysis. We also performed a pathway-focussed analysis on genes involved in muscle development, detecting 27 intronic elements, of which 22 were not detected in the genome-wide analysis. At least 87% of the genome-wide and 70% of the pathway-focussed elements have existing annotations indicative of conserved RNA secondary structure. The expression of 26 of the pathway-focused elements was examined using RT-PCR, providing confirmation that they include expressed ncRNAs. Consistent with previous studies, these elements are significantly over-represented in the introns of transcription factors.
Conclusions: This study demonstrates a novel, highly effective, Bayesian approach to identifying conserved non-coding sequences. Our results complement previous findings that these sequences are enriched in transcription factors. However, in contrast to previous studies which suggest the majority of conserved sequences are regulatory factor binding sites, the majority of conserved sequences identified using our approach contain evidence of conserved RNA secondary structures, and our laboratory results suggest most are expressed. Functional roles at DNA and RNA levels are not mutually exclusive, and many of our elements possess evidence of both. Moreover, ncRNAs play roles in transcriptional and post-transcriptional regulation, and this may contribute to the over-representation of these elements in introns of transcription factors. We attribute the higher sensitivity of the pathway-focussed analysis compared to the genome-wide analysis to improved alignment quality, suggesting that enhanced genomic alignments may reveal many more conserved intronic sequences.
Affiliation(s)
- Manjula Algama
  - School of Mathematical Sciences, Monash University, Melbourne, VIC, 3800, Australia
- Edward Tasker
  - School of Mathematical Sciences, Monash University, Melbourne, VIC, 3800, Australia
- Caitlin Williams
  - School of Biological Sciences, Monash University, Melbourne, VIC, 3800, Australia
- Adam C Parslow
  - School of Biological Sciences, Monash University, Melbourne, VIC, 3800, Australia
- Jonathan M Keith
  - School of Mathematical Sciences, Monash University, Melbourne, VIC, 3800, Australia
9. Kurotani A, Yamada Y, Sakurai T. Alga-PrAS (Algal Protein Annotation Suite): A Database of Comprehensive Annotation in Algal Proteomes. Plant & Cell Physiology 2017; 58:e6. [PMID: 28069893; PMCID: PMC5444574; DOI: 10.1093/pcp/pcw212] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 09/05/2016] [Accepted: 11/24/2016] [Indexed: 06/06/2023]
Abstract
Algae are smaller organisms than land plants and offer clear advantages in research over terrestrial species in terms of rapid production, short generation time and varied commercial applications. Thus, studies investigating the practical development of effective algal production are important and will improve our understanding of both aquatic and terrestrial plants. In this study we estimated multiple physicochemical and secondary structural properties of protein sequences, the predicted presence of post-translational modification (PTM) sites, and subcellular localization using a total of 510,123 protein sequences from the proteomes of 31 algal and three plant species. Algal species were broadly selected from green and red algae, glaucophytes, oomycetes, diatoms and other microalgal groups. The results were deposited in the Algal Protein Annotation Suite database (Alga-PrAS; http://alga-pras.riken.jp/), which can be freely accessed online.
Affiliation(s)
- Atsushi Kurotani
  - RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro, Tsurumi, Yokohama, Kanagawa, 230-0045, Japan
- Yutaka Yamada
  - RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro, Tsurumi, Yokohama, Kanagawa, 230-0045, Japan
- Tetsuya Sakurai
  - RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro, Tsurumi, Yokohama, Kanagawa, 230-0045, Japan
  - Interdisciplinary Science Unit, Multidisciplinary Science Cluster, Research and Education Faculty, Kochi University, 200 Otsu, Monobe, Nankoku, Kochi, 783-8502, Japan
10. Bevilacqua V, Gioia U, Di Carlo V, Tortorelli AF, Colombo T, Bozzoni I, Laneve P, Caffarelli E. Identification of linc-NeD125, a novel long non coding RNA that hosts miR-125b-1 and negatively controls proliferation of human neuroblastoma cells. RNA Biol 2016; 12:1323-37. [PMID: 26480000; DOI: 10.1080/15476286.2015.1096488] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Indexed: 12/13/2022] Open
Abstract
The human genome contains thousands of long non coding RNAs (lncRNAs). Many of these transcripts are presently considered crucial regulators of gene expression and functionally implicated in developmental processes in Eukaryotes. Notably, although a large number of lncRNAs are expressed in the Central Nervous System (CNS), only a few of them have been characterized in terms of molecular structure, gene expression regulation and function. In the present study, we identify linc-NeD125 as a novel cytoplasmic, neuronal-induced long intergenic non coding RNA (lincRNA). Linc-NeD125 represents the host gene for miR-125b-1, a microRNA with an established role as a negative regulator of human neuroblastoma cell proliferation. Here, we demonstrate that these two overlapping non coding RNAs are coordinately induced during in vitro neuronal differentiation, and that their expression is regulated by different mechanisms. While the production of miR-125b-1 relies on transcriptional regulation, linc-NeD125 is controlled at the post-transcriptional level, through modulation of its stability. We also demonstrate that linc-NeD125 functions independently of the hosted microRNA, by reducing cell proliferation and activating the antiapoptotic factor BCL-2.
Affiliation(s)
- Valeria Bevilacqua
  - Department of Biology and Biotechnology C. Darwin, Sapienza University of Rome, Rome, Italy
  - Present address: Virology Program, INGM - Istituto Nazionale di Genetica Molecolare "Romeo ed Enrica Invernizzi", Milan, Italy
  - Contributed equally to this work
- Ubaldo Gioia
  - Department of Biology and Biotechnology C. Darwin, Sapienza University of Rome, Rome, Italy
  - Present address: IFOM, the FIRC Institute of Molecular Oncology, Milan, Italy
  - Contributed equally to this work
- Valerio Di Carlo
  - Department of Biology and Biotechnology C. Darwin, Sapienza University of Rome, Rome, Italy
  - Present address: Center for Genomic Regulation and UPF, Barcelona, Spain
- Anna F Tortorelli
  - Department of Biology and Biotechnology C. Darwin, Sapienza University of Rome, Rome, Italy
- Teresa Colombo
  - Institute for Computing Applications "Mauro Picone", National Research Council, Rome, Italy
- Irene Bozzoni
  - Department of Biology and Biotechnology C. Darwin, Sapienza University of Rome, Rome, Italy
  - Institute of Molecular Biology and Pathology, National Research Council, Sapienza University of Rome, Rome, Italy
  - Institute Pasteur Fondazione Cenci-Bolognetti, Sapienza University of Rome, Rome, Italy
  - Center for Life Nano Science@Sapienza, Istituto Italiano di Tecnologia, Rome, Italy
- Pietro Laneve
  - Center for Life Nano Science@Sapienza, Istituto Italiano di Tecnologia, Rome, Italy
- Elisa Caffarelli
  - Institute of Molecular Biology and Pathology, National Research Council, Sapienza University of Rome, Rome, Italy
  - Center for Life Nano Science@Sapienza, Istituto Italiano di Tecnologia, Rome, Italy
11. Shimada MK, Sanbonmatsu R, Yamaguchi-Kabata Y, Yamasaki C, Suzuki Y, Chakraborty R, Gojobori T, Imanishi T. Selection pressure on human STR loci and its relevance in repeat expansion disease. Mol Genet Genomics 2016; 291:1851-69. [PMID: 27290643; DOI: 10.1007/s00438-016-1219-7] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 11/07/2015] [Accepted: 05/21/2016] [Indexed: 12/30/2022]
Abstract
Short Tandem Repeats (STRs) comprise repeats of one to several base pairs. Because of the high mutability due to strand slippage during DNA synthesis, rapid evolutionary change in the number of repeating units directly shapes the range of repeat-number variation according to selection pressure. However, the remaining questions include: Why are STRs causing repeat expansion diseases maintained in the human population; and why are these limited to neurodegenerative diseases? By evaluating the genome-wide selection pressure on STRs using the database we constructed, we identified two different patterns of relationship in repeat-number polymorphisms between DNA and amino-acid sequences, although both patterns are evolutionary consequences of avoiding the formation of harmful long STRs. First, a mixture of degenerate codons is represented in poly-proline (poly-P) repeats. Second, long poly-glutamine (poly-Q) repeats are favored at the protein level; however, at the DNA level, STRs encoding long poly-Qs are frequently divided by synonymous SNPs. Furthermore, apoptosis and neurodevelopment were the biological processes found to be significantly enriched specifically in genes encoding poly-Qs with repeat polymorphism. This suggests the existence of a specific molecular function for polymorphic and/or long poly-Q stretches. Given that the poly-Qs causing expansion diseases were longer than other poly-Qs, even in healthy subjects, our results indicate that the evolutionary benefits of long and/or polymorphic poly-Q stretches outweigh the risks of long CAG repeats predisposing to pathological hyper-expansions. Molecular pathways in neurodevelopment requiring long and polymorphic poly-Q stretches may provide a clue to understanding why poly-Q expansion diseases are limited to neurodegenerative diseases.
Affiliation(s)
- Makoto K Shimada
  - Institute for Comprehensive Medical Science, Fujita Health University, 1-98 Dengakugakubo, Kutsukake-cho, Toyoake, Aichi, 470-1192, Japan
  - National Institute of Advanced Industrial Science and Technology, 2-3-26 Aomi Koto-ku, Tokyo, 135-0064, Japan
  - Japan Biological Informatics Consortium, 10F TIME24 Building, 2-4-32 Aomi, Koto-ku, Tokyo, 135-8073, Japan
- Ryoko Sanbonmatsu
  - Japan Biological Informatics Consortium, 10F TIME24 Building, 2-4-32 Aomi, Koto-ku, Tokyo, 135-8073, Japan
- Yumi Yamaguchi-Kabata
  - National Institute of Advanced Industrial Science and Technology, 2-3-26 Aomi Koto-ku, Tokyo, 135-0064, Japan
  - Tohoku Medical Megabank Organization, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, 980-8573, Japan
- Chisato Yamasaki
  - National Institute of Advanced Industrial Science and Technology, 2-3-26 Aomi Koto-ku, Tokyo, 135-0064, Japan
  - Japan Biological Informatics Consortium, 10F TIME24 Building, 2-4-32 Aomi, Koto-ku, Tokyo, 135-8073, Japan
- Yoshiyuki Suzuki
  - Graduate School of Natural Sciences, Nagoya City University, 1 Yamanohata, Mizuho-cho, Mizuho-ku, Nagoya, Aichi, 467-8501, Japan
- Ranajit Chakraborty
  - Health Science Center, University of North Texas, 3500 Camp Bowie Blvd., Fort Worth, TX, 76107, USA
- Takashi Gojobori
  - National Institute of Advanced Industrial Science and Technology, 2-3-26 Aomi Koto-ku, Tokyo, 135-0064, Japan
  - Computational Bioscience Research Center, King Abdullah University of Science and Technology, Ibn Al-Haytham Building (West), Thuwal, 23955-6900, Kingdom of Saudi Arabia
- Tadashi Imanishi
  - National Institute of Advanced Industrial Science and Technology, 2-3-26 Aomi Koto-ku, Tokyo, 135-0064, Japan
  - Department of Molecular Life Science, Tokai University School of Medicine, 143 Shimokasuya, Isehara, Kanagawa, 259-1193, Japan
12. Moolhuijzen P, Kulski JK, Dunn DS, Schibeci D, Barrero R, Gojobori T, Bellgard M. The transcript repeat element: the human Alu sequence as a component of gene networks influencing cancer. Funct Integr Genomics 2016; 10:307-19. [PMID: 20393868; DOI: 10.1007/s10142-010-0168-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Indexed: 01/10/2023]
Abstract
A small percentage (3%) of the 1.3 million copies of Alu sequences in the human genome is expressed individually or as part of various gene transcripts with potential regulatory and pathophysiological importance. In order to better understand the role of repetitive elements within transcripts, this review focuses on Alu-containing transcripts of normal and cancerous tissue in a transcriptome-wide survey of the H-Invitational human transcript database, covering 106,825 tissue-derived transcripts expressed at 29,979 loci. The Alu elements in transcripts of cancerous tissues are significantly underrepresented in comparison to those in normal tissues. In this review, we propose a model for Alu-mediated siRNA down-regulation of Alu-containing transcripts in cancer tissues. In cancer or other rapidly dividing tissues, hypomethylation of repeat element regions triggers the expression of transposon elements including Alu, which can potentially form double-stranded RNA molecules for use as templates to generate Alu-derived siRNAs (Alu-siRNAs). The generated Alu-siRNAs target endogenous messenger RNAs harbouring sequence similarity to Alu elements. This model is consistent with the observation that there is substantial under-representation of Alu-containing mRNAs in cancer cells. This new perspective on gene regulation in disease conditions can provide a basis for starting to account for changes in complex gene networks in cancer.
Affiliation(s)
- Paula Moolhuijzen
- Centre for Comparative Genomics, School for Information Technology, Murdoch University, Murdoch, WA, Australia
|
13
|
Mouilleron H, Delcourt V, Roucou X. Death of a dogma: eukaryotic mRNAs can code for more than one protein. Nucleic Acids Res 2016; 44:14-23. [PMID: 26578573 PMCID: PMC4705651 DOI: 10.1093/nar/gkv1218] [Citation(s) in RCA: 63] [Impact Index Per Article: 7.9] [Received: 07/08/2015] [Revised: 10/26/2015] [Accepted: 10/28/2015] [Indexed: 12/13/2022]
Abstract
mRNAs carry the genetic information that is translated by ribosomes. The traditional view of a mature eukaryotic mRNA is a molecule with three main regions, the 5' UTR, the protein coding open reading frame (ORF) or coding sequence (CDS), and the 3' UTR. This concept assumes that ribosomes translate one ORF only, generally the longest one, and produce one protein. As a result, in the early days of genomics and bioinformatics, one CDS was associated with each protein-coding gene. This fundamental concept of a single CDS is being challenged by increasing experimental evidence indicating that annotated proteins are not the only proteins translated from mRNAs. In particular, mass spectrometry (MS)-based proteomics and ribosome profiling have detected productive translation of alternative open reading frames. In several cases, the alternative and annotated proteins interact. Thus, the expression of two or more proteins translated from the same mRNA may offer a mechanism to ensure the co-expression of proteins which have functional interactions. Translational mechanisms already described in eukaryotic cells indicate that the cellular machinery is able to translate different CDSs from a single viral or cellular mRNA. In addition to summarizing data showing that the protein coding potential of eukaryotic mRNAs has been underestimated, this review aims to challenge the single translated CDS dogma.
Affiliation(s)
- Hélène Mouilleron
- Department of biochemistry, Université de Sherbrooke, Sherbrooke, Quebec J1E 4K8, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec, Canada
- Vivian Delcourt
- Department of biochemistry, Université de Sherbrooke, Sherbrooke, Quebec J1E 4K8, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec, Canada; Inserm U-1192, Laboratoire de Protéomique, Réponse Inflammatoire, Spectrométrie de Masse (PRISM), Université de Lille 1, Cité Scientifique, 59655 Villeneuve D'Ascq, France
- Xavier Roucou
- Department of biochemistry, Université de Sherbrooke, Sherbrooke, Quebec J1E 4K8, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec, Canada
|
14
|
de Rooy DP, Tsonaka R, Andersson ML, Forslind K, Zhernakova A, Frank-Bertoncelj M, de Kovel CG, Koeleman BP, van der Heijde DM, Huizinga TW, Toes RE, Houwing-Duistermaat JJ, Ospelt C, Svensson B, van der Helm-van Mil AH. Genetic Factors for the Severity of ACPA-negative Rheumatoid Arthritis in 2 Cohorts of Early Disease: A Genome-wide Study. J Rheumatol 2015; 42:1383-91. [DOI: 10.3899/jrheum.140741] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Accepted: 02/24/2015] [Indexed: 11/22/2022]
Abstract
Objective. Rheumatoid arthritis (RA) that is negative for anticitrullinated protein antibodies (ACPA) is a subentity of RA, characterized by less severe disease. At the individual level, however, considerable differences in the severity of joint destruction occur. We performed a study on genetic factors underlying the differences in joint destruction in ACPA-negative patients. Methods. A genome-wide association study was done with 262 ACPA-negative patients with early RA included in the Leiden Early Arthritis Clinic and related to radiographic joint destruction over 7 years. Significant single-nucleotide polymorphisms (SNP) were evaluated for association with progression of radiographic joint destruction in 253 ACPA-negative patients with early RA included in the Better Anti-Rheumatic Farmaco Therapy (BARFOT) study. According to the Bonferroni correction of the number of tested SNP, the threshold for significance was p < 2 × 10⁻⁷ in phase 1 and 0.0045 in phase 2. In both cohorts, joint destruction was measured by Sharp/van der Heijde method with good reproducibility. Results. Thirty-three SNP associated with severity of joint destruction (p < 2 × 10⁻⁷) in phase 1. In phase 2, rs2833522 (p = 0.0049) showed borderline significance. A combined analysis of both the Leiden and BARFOT datasets of rs2833522 confirmed this association with joint destruction (p = 3.57 × 10⁻⁹); the minor allele (A) associated with more severe damage (for instance, after 7 yrs followup, patients carrying AA had 1.22 times more joint damage compared to patients carrying AG and 1.50 times more joint damage than patients carrying GG). In silico analysis using the ENCODE and Ensembl databases showed presence of H3K4me3 histone mark, transcription factors, and long noncoding RNA in the region of rs2833522, an intergenic SNP located between HUNK and SCAF4. Conclusion. Rs2833522 might be associated with the severity of joint destruction in ACPA-negative RA.
|
15
|
Jalali S, Kapoor S, Sivadas A, Bhartiya D, Scaria V. Computational approaches towards understanding human long non-coding RNA biology. Bioinformatics 2015; 31:2241-51. [DOI: 10.1093/bioinformatics/btv148] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.1] [Received: 10/16/2014] [Accepted: 03/10/2015] [Indexed: 12/18/2022]
|
16
|
Sun H, Yang S, Tun L, Li Y. IAOseq: inferring abundance of overlapping genes using RNA-seq data. BMC Bioinformatics 2015; 16 Suppl 1:S3. [PMID: 25707673 PMCID: PMC4331702 DOI: 10.1186/1471-2105-16-s1-s3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/10/2022]
Abstract
BACKGROUND Overlapping transcription constitutes a common mechanism for regulating gene expression. A major limitation of the overlapping transcription assays is the lack of high throughput expression data. RESULTS We developed a new tool (IAOseq) that is based on reads distributions along the transcribed regions to identify the expression levels of overlapping genes from standard RNA-seq data. Compared with five commonly used quantification methods, IAOseq showed better performance in the estimation accuracy of overlapping transcription levels. For the same strand overlapping transcription, currently existing high-throughput methods are rarely available to distinguish which strand was present in the original mRNA template. The IAOseq results showed that the commonly used methods gave an average of 1.6 fold overestimation of the expression levels of same strand overlapping genes. CONCLUSIONS This work provides a useful tool for mining overlapping transcription levels from standard RNA-seq libraries. IAOseq could be used to help us understand the complex regulatory mechanism mediated by overlapping transcripts. IAOseq is freely available at http://lifecenter.sgst.cn/main/en/IAO_seq.jsp.
|
17
|
Nagai Y, Takahashi Y, Imanishi T. VaDE: a manually curated database of reproducible associations between various traits and human genomic polymorphisms. Nucleic Acids Res 2014; 43:D868-72. [PMID: 25361969 PMCID: PMC4383886 DOI: 10.1093/nar/gku1037] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 12/22/2022]
Abstract
Genome-wide association studies (GWASs) have identified numerous single nucleotide polymorphisms (SNPs) associated with the development of common diseases. However, it is clear that genetic risk factors of common diseases are heterogeneous among human populations. Therefore, we developed a database of genomic polymorphisms that are reproducibly associated with disease susceptibilities, drug responses and other traits for each human population: 'VarySysDB Disease Edition' (VaDE; http://bmi-tokai.jp/VaDE/). SNP-trait association data were obtained from the National Human Genome Research Institute GWAS (NHGRI GWAS) catalog and RAvariome, and we added detailed information of sample populations by curating original papers. In addition, we collected and curated original papers, and registered the detailed information of SNP-trait associations in VaDE. Then, we evaluated reproducibility of associations in each population by counting the number of significantly associated studies. VaDE provides literature-based SNP-trait association data and functional genomic region annotation for SNP functional research. SNP functional annotation data included experimental data of the ENCODE project, H-InvDB transcripts and the 1000 Genome Project. A user-friendly web interface was developed to assist quick search, easy download and fast swapping among viewers. We believe that our database will contribute to the future establishment of personalized medicine and increase our understanding of genetic factors underlying diseases.
Affiliation(s)
- Yoko Nagai
- Department of Molecular Life Science, Tokai University School of Medicine, Isehara, Kanagawa 259-1193, Japan
- Yasuko Takahashi
- Department of Molecular Life Science, Tokai University School of Medicine, Isehara, Kanagawa 259-1193, Japan
- Tadashi Imanishi
- Department of Molecular Life Science, Tokai University School of Medicine, Isehara, Kanagawa 259-1193, Japan; Data Management and Integration Team, Molecular Profiling Research Center for Drug Discovery, National Institute of Advanced Industrial Science and Technology, Koto-ku, Tokyo 135-0064, Japan
|
18
|
Mohanty V, Gökmen-Polar Y, Badve S, Janga SC. Role of lncRNAs in health and disease - size and shape matter. Brief Funct Genomics 2014; 14:115-29. [PMID: 25212482 DOI: 10.1093/bfgp/elu034] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Indexed: 02/01/2023]
Abstract
Most of the mammalian genome including a large fraction of the non-protein coding transcripts has been shown to be transcribed. Studies related to these non-coding RNA molecules have predominantly focused on smaller molecules like microRNAs. In contrast, long non-coding RNAs (lncRNAs) have long been considered to be transcriptional noise. Accumulating evidence suggests that lncRNAs are involved in key cellular and developmental processes. Several critical questions regarding functions and properties of lncRNAs and their circular forms remain to be answered. Increasing evidence from high-throughput sequencing screens also suggests the involvement of lncRNAs in diseases such as cancer, although the underlying mechanisms still need to be elucidated. Here, we discuss the current state of research in the field of lncRNAs, questions that need to be addressed in light of recent genome-wide studies documenting the landscape of lncRNAs, their functional roles and involvement in diseases. We posit that with the availability of high-throughput data sets it is not only possible to improve methods for predicting lncRNAs but will also facilitate our ability to elucidate their functions and phenotypes by using integrative approaches.
|
19
|
Multiplicity of 5' cap structures present on short RNAs. PLoS One 2014; 9:e102895. [PMID: 25079783 PMCID: PMC4117478 DOI: 10.1371/journal.pone.0102895] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Received: 01/29/2014] [Accepted: 06/24/2014] [Indexed: 12/18/2022]
Abstract
Most RNA molecules are co- or post-transcriptionally modified to alter their chemical and functional properties to assist in their ultimate biological function. Among these modifications, the addition of 5′ cap structure has been found to regulate turnover and localization. Here we report a study of the cap structure of human short (<200 nt) RNAs (sRNAs), using sequencing of cDNA libraries prepared by enzymatic pretreatment of the sRNAs with cap sensitive-specificity, thin layer chromatographic (TLC) analyses of isolated cap structures and mass spectrometric analyses for validation of TLC analyses. Processed versions of snoRNAs and tRNAs sequences of less than 50 nt were observed in capped sRNA libraries, indicating additional processing and recapping of these annotated sRNAs biotypes. We report for the first time 2,7 dimethylguanosine in human sRNAs cap structures and surprisingly we find multiple type 0 cap structures (mGpppC, 7mGpppG, GpppG, GpppA, and 7mGpppA) in RNA length fractions shorter than 50 nt. Finally, we find the presence of additional uncharacterized cap structures that await determination by the creation of needed reference compounds to be used in TLC analyses. These studies suggest the existence of novel biochemical pathways leading to the processing of primary and sRNAs and the modifications of their RNA 5′ ends with a spectrum of chemical modifications.
|
20
|
Harbers M. Wheat germ systems for cell-free protein expression. FEBS Lett 2014; 588:2762-73. [PMID: 24931374 DOI: 10.1016/j.febslet.2014.05.061] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.6] [Received: 04/29/2014] [Revised: 05/25/2014] [Accepted: 05/26/2014] [Indexed: 10/25/2022]
Abstract
Cell-free protein expression plays an important role in biochemical research. However, only recent developments led to new methods to rapidly synthesize preparative amounts of protein that make cell-free protein expression an attractive alternative to cell-based methods. In particular the wheat germ system provides the highest translation efficiency among eukaryotic cell-free protein expression approaches and has a very high success rate for the expression of soluble proteins of good quality. As an open in vitro method, the wheat germ system is a preferable choice for many applications in protein research including options for protein labeling and the expression of difficult-to-express proteins like membrane proteins and multiple protein complexes. Here I describe wheat germ cell-free protein expression systems and give examples how they have been used in genome-wide expression studies, preparation of labeled proteins for structural genomics and protein mass spectroscopy, automated protein synthesis, and screening of enzymatic activities. Future directions for the use of cell-free expression methods are discussed.
Affiliation(s)
- Matthias Harbers
- RIKEN Center for Life Science Technologies, Suehiro-cho, Tsurumi-ku, Yokohama City, Kanagawa 230-0045, Japan; CellFree Sciences Co., Ltd., 75-1, Ono-cho, Leading Venture Plaza 201, Tsurumi-ku, Yokohama, Kanagawa 230-0046, Japan.
|
21
|
Abstract
As more and more systems biology approaches are used to investigate the different types of biological macromolecules, increasing numbers of whole genomic studies are now available for a large array of organisms. Whether it is genomics, transcriptomics, proteomics, interactomics or metabolomics, the full complement of genomic information on all different levels can be juxtaposed between different organisms to reveal similarities or differences, and even to provide consensus models. At the intersection of comparative genomics and systems biology lies great possibility for discovery, analysis and prediction. This paper explores this nexus and the relationship from four general levels: DNA, RNA, protein and extragenomic. For each level, we provide an overview of the methods, discuss the potential challenges and survey the current research. Finally, we suggest some organizing principles and make proposals for new areas that will be important for future research.
Affiliation(s)
- Jimmy Lin
- Wilmer Institute, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
|
22
|
Mochida K, Shinozaki K. Unlocking Triticeae genomics to sustainably feed the future. Plant Cell Physiol 2013; 54:1931-50. [PMID: 24204022 PMCID: PMC3856857 DOI: 10.1093/pcp/pct163] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Received: 10/08/2013] [Accepted: 11/04/2013] [Indexed: 05/23/2023]
Abstract
The tribe Triticeae includes the major crops wheat and barley. Within the last few years, the whole genomes of four Triticeae species, namely barley, wheat, Tausch's goatgrass (Aegilops tauschii) and wild einkorn wheat (Triticum urartu), have been sequenced. The availability of these genomic resources for Triticeae plants and innovative analytical applications using next-generation sequencing technologies are helping to revitalize our approaches in genetic work and to accelerate improvement of the Triticeae crops. Comparative genomics and integration of genomic resources from Triticeae plants and the model grass Brachypodium distachyon are aiding the discovery of new genes and functional analyses of genes in Triticeae crops. Innovative approaches and tools such as analysis of next-generation populations, evolutionary genomics and systems approaches with mathematical modeling are new strategies that will help us discover alleles for adaptive traits to future agronomic environments. In this review, we provide an update on genomic tools for use with Triticeae plants and Brachypodium and describe emerging approaches toward crop improvements in Triticeae.
Affiliation(s)
- Keiichi Mochida
- Biomass Research Platform Team, Biomass Engineering Program Cooperation Division, RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa, 230-0045 Japan
- Kihara Institute for Biological Research, Yokohama City University, 641-12 Maioka-cho, Totsuka-ku, Yokohama, Kanagawa, 230-0045 Japan
- Kazuo Shinozaki
- Biomass Research Platform Team, Biomass Engineering Program Cooperation Division, RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa, 230-0045 Japan
|
23
|
Mochida K, Uehara-Yamaguchi Y, Takahashi F, Yoshida T, Sakurai T, Shinozaki K. Large-scale collection and analysis of full-length cDNAs from Brachypodium distachyon and integration with Pooideae sequence resources. PLoS One 2013; 8:e75265. [PMID: 24130698 PMCID: PMC3793998 DOI: 10.1371/journal.pone.0075265] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Received: 05/31/2013] [Accepted: 08/14/2013] [Indexed: 01/09/2023]
Abstract
A comprehensive collection of full-length cDNAs is essential for correct structural gene annotation and functional analyses of genes. We constructed a mixed full-length cDNA library from 21 different tissues of Brachypodium distachyon Bd21, and obtained 78,163 high quality expressed sequence tags (ESTs) from both ends of ca. 40,000 clones (including 16,079 contigs). We updated gene structure annotations of Brachypodium genes based on full-length cDNA sequences in comparison with the latest publicly available annotations. About 10,000 non-redundant gene models were supported by full-length cDNAs; ca. 6,000 showed some transcription unit modifications. We also found ca. 580 novel gene models, including 362 newly identified in Bd21. Using the updated transcription start sites, we searched a total of 580 plant cis-motifs in the −3 kb promoter regions and determined a genome-wide Brachypodium promoter architecture. Furthermore, we integrated the Brachypodium full-length cDNAs and updated gene structures with available sequence resources in wheat and barley in a web-accessible database, the RIKEN Brachypodium FL cDNA database. The database represents a “one-stop” information resource for all genomic information in the Pooideae, facilitating functional analysis of genes in this model grass plant and seamless knowledge transfer to the Triticeae crops.
Affiliation(s)
- Keiichi Mochida
- Biomass Research Platform Team, Biomass Engineering Program Cooperation Division, RIKEN Center for Sustainable Resource Science, Tsurumi-ku, Yokohama, Kanagawa, Japan
- Kihara Institute for Biological Research, Yokohama City University, Totsuka-ku, Yokohama, Kanagawa, Japan
- Yukiko Uehara-Yamaguchi
- Biomass Research Platform Team, Biomass Engineering Program Cooperation Division, RIKEN Center for Sustainable Resource Science, Tsurumi-ku, Yokohama, Kanagawa, Japan
- Fuminori Takahashi
- Biomass Research Platform Team, Biomass Engineering Program Cooperation Division, RIKEN Center for Sustainable Resource Science, Tsurumi-ku, Yokohama, Kanagawa, Japan
- Takuhiro Yoshida
- Integrated Genome Informatics Research Unit, RIKEN Center for Sustainable Resource Science, Tsurumi-ku, Yokohama, Kanagawa, Japan
- Tetsuya Sakurai
- Integrated Genome Informatics Research Unit, RIKEN Center for Sustainable Resource Science, Tsurumi-ku, Yokohama, Kanagawa, Japan
- Kazuo Shinozaki
- Biomass Research Platform Team, Biomass Engineering Program Cooperation Division, RIKEN Center for Sustainable Resource Science, Tsurumi-ku, Yokohama, Kanagawa, Japan
|
24
|
Regulatory Roles for Long ncRNA and mRNA. Cancers (Basel) 2013; 5:462-90. [PMID: 24216986 PMCID: PMC3730338 DOI: 10.3390/cancers5020462] [Citation(s) in RCA: 70] [Impact Index Per Article: 6.4] [Received: 03/01/2013] [Revised: 04/05/2013] [Accepted: 04/19/2013] [Indexed: 01/31/2023]
Abstract
Recent advances in high-throughput sequencing technology have identified the transcription of a much larger portion of the genome than previously anticipated. Especially in the context of cancer it has become clear that aberrant transcription of both protein-coding and long non-coding RNAs (lncRNAs) are frequent events. The current dogma of RNA function describes mRNA to be responsible for the synthesis of proteins, whereas non-coding RNA can have regulatory or epigenetic functions. However, this distinction between protein coding and regulatory ability of transcripts may not be that strict. Here, we review the increasing body of evidence for the existence of multifunctional RNAs that have both protein-coding and trans-regulatory roles. Moreover, we demonstrate that coding transcripts bind to components of the Polycomb Repressor Complex 2 (PRC2) with similar affinities as non-coding transcripts, revealing potential epigenetic regulation by mRNAs. We hypothesize that studies on the regulatory ability of disease-associated mRNAs will form an important new field of research.
|
25
|
Hara Y, Imanishi T, Satta Y. Reconstructing the demographic history of the human lineage using whole-genome sequences from human and three great apes. Genome Biol Evol 2013; 4:1133-45. [PMID: 22975719 PMCID: PMC3752010 DOI: 10.1093/gbe/evs075] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Indexed: 01/03/2023]
Abstract
The demographic history of humans would provide helpful information for identifying the evolutionary events that shaped humanity but remains controversial even in the genomic era. To settle the controversies, we inferred the speciation times (T) and ancestral population sizes (N) in the lineage leading to human and great apes based on whole-genome alignment. A coalescence simulation determined the sizes of alignment blocks and intervals between them required to obtain recombination-free blocks with a high frequency. This simulation revealed that the size of the block strongly affects the parameter inference, indicating that recombination is an important factor for achieving optimum parameter inference. From the whole genome alignments (1.9 giga-bases) of human (H), chimpanzee (C), gorilla (G), and orangutan, 100-bp alignment blocks separated by ≥5-kb intervals were sampled and used to estimate τ = μT and θ = 4μgN using the Markov chain Monte Carlo method, where μ is the mutation rate and g is the generation time. Although the estimated τHC differed across chromosomes, τHC and τHCG were strongly correlated across chromosomes, indicating that variation in τ is subject to variation in μ, rather than T, and thus, all chromosomes share a single speciation time. Subsequently, we estimated Ts of the human lineage from chimpanzee, gorilla, and orangutan to be 6.0–7.6, 7.6–9.7, and 15–19 Ma, respectively, assuming variable μ across lineages and chromosomes. These speciation times were consistent with the fossil records. We conclude that the speciation times in our recombination-free analysis would be conclusive and the speciation between human and chimpanzee was a single event.
Affiliation(s)
- Yuichiro Hara
- Biomedicinal Information Research Center, National Institute of Advanced Industrial Science and Technology, Koto-ku, Tokyo, Japan
|
26
|
Zhu S, Zhang XO, Yang L. Panning for Long Noncoding RNAs. Biomolecules 2013; 3:226-41. [PMID: 24970166 PMCID: PMC4030883 DOI: 10.3390/biom3010226] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 12/18/2012] [Revised: 02/21/2013] [Accepted: 02/21/2013] [Indexed: 11/16/2022]
Abstract
The recent advent of high-throughput approaches has revealed widespread transcription of the human genome, leading to a new appreciation of transcription regulation, especially from noncoding regions. Distinct from most coding and small noncoding RNAs, long noncoding RNAs (lncRNAs) are generally expressed at low levels, are less conserved and lack protein-coding capacity. These intrinsic features of lncRNAs have not only hampered their full annotation in the past several years, but have also generated controversy concerning whether many or most of these lncRNAs are simply the result of transcriptional noise. Here, we assess these intrinsic features that have challenged lncRNA discovery and further summarize recent progress in lncRNA discovery with integrated methodologies, from which new lessons and insights can be derived to achieve better characterization of lncRNA expression regulation. Full annotation of lncRNA repertoires and the implications of such annotation will provide a fundamental basis for comprehensive understanding of pervasive functions of lncRNAs in biological regulation.
Affiliation(s)
- Shanshan Zhu
- Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Chinese Academy of Sciences, Shanghai 200031, China.
- Xiao-Ou Zhang
- Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Chinese Academy of Sciences, Shanghai 200031, China.
- Li Yang
- Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Chinese Academy of Sciences, Shanghai 200031, China.
|
27
|
Chan WL, Yang WK, Huang HD, Chang JG. pseudoMap: an innovative and comprehensive resource for identification of siRNA-mediated mechanisms in human transcribed pseudogenes. Database: The Journal of Biological Databases and Curation 2013; 2013:bat001. [PMID: 23396300 PMCID: PMC3567485 DOI: 10.1093/database/bat001] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 12/21/2022]
Abstract
RNA interference (RNAi) is a gene silencing process within living cells, which is controlled by the RNA-induced silencing complex in a sequence-specific manner. In flies and mice, the pseudogene transcripts can be processed into short interfering RNAs (siRNAs) that regulate protein-coding genes through the RNAi pathway. Following these findings, we constructed an innovative and comprehensive database to elucidate siRNA-mediated mechanisms in human transcribed pseudogenes (TPGs). To investigate TPGs producing siRNAs that regulate protein-coding genes, we mapped the TPGs to small RNAs (sRNAs) that were supported by publicly available deep sequencing data from various sRNA libraries and constructed the TPG-derived siRNA-target interactions. In addition, we showed that TPGs can act as a target for miRNAs that actually regulate the parental gene. To enable the systematic compilation and updating of these results and additional information, we have developed a database, pseudoMap, capturing various types of information, including sequence data, TPG and cognate annotation, deep sequencing data, RNA-folding structure, gene expression profiles, miRNA annotation and target prediction. To our knowledge, pseudoMap is the first database to demonstrate two mechanisms of human TPGs: encoding siRNAs and decoying miRNAs that target the parental gene. pseudoMap is freely accessible at http://pseudomap.mbc.nctu.edu.tw/.
Affiliation(s)
- Wen-Ling Chan
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsin-Chu, Taiwan
|
28
|
Chan WL, Yuo CY, Yang WK, Hung SY, Chang YS, Chiu CC, Yeh KT, Huang HD, Chang JG. Transcribed pseudogene ψPPM1K generates endogenous siRNA to suppress oncogenic cell growth in hepatocellular carcinoma. Nucleic Acids Res 2013; 41:3734-47. [PMID: 23376929 PMCID: PMC3616710 DOI: 10.1093/nar/gkt047] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.5] [Indexed: 12/11/2022]
Abstract
Pseudogenes, especially those that are transcribed, may not be mere genomic fossils, but their biological significance remains unclear. Postulating that in the human genome, as in animal models, pseudogenes may function as gene regulators through generation of endo-siRNAs (esiRNAs), antisense RNAs or RNA decoys, we performed bioinformatic and subsequent experimental tests to explore esiRNA-mediated mechanisms of pseudogene involvement in oncogenesis. A genome-wide survey revealed a partial retrotranscript pseudogene ψPPM1K containing inverted repeats capable of folding into hairpin structures that can be processed into two esiRNAs; these esiRNAs potentially target many cellular genes, including NEK8. In 41 paired surgical specimens, we found significantly reduced expression of two predicted ψPPM1K-specific esiRNAs, and the cognate gene PPM1K, in hepatocellular carcinoma compared with matched non-tumour tissues, whereas the expression of target gene NEK8 was increased in tumours. Additionally, NEK8 and PPM1K were downregulated in stably transfected ψPPM1K-overexpressing cells, but not in cells transfected with an esiRNA1-deletion mutant of ψPPM1K. Furthermore, expression of NEK8 in ψPPM1K-transfected cells demonstrated that NEK8 can counteract the growth inhibitory effects of ψPPM1K. These findings indicate that a transcribed pseudogene can exert tumour-suppressor activity independent of its parental gene by generation of esiRNAs that regulate human cell growth.
Affiliation(s)
- Wen-Ling Chan
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsin-Chu 300, Taiwan
|
29
|
Imanishi T, Nagai Y, Habara T, Yamasaki C, Takeda JI, Mikami S, Bando Y, Tojo H, Nishimura T. Full-length Transcriptome-based H-InvDB Throws a New Light on Chromosome-centric Proteomics. J Proteome Res 2012; 12:62-6. [DOI: 10.1021/pr300861a] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 01/08/2023]
Affiliation(s)
- Tadashi Imanishi
- Biomedicinal Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- Department of Molecular Life Science, Division of Basic Medical Science and Molecular Medicine, Tokai University, School of Medicine, Kanagawa, Japan
- Yoko Nagai
- Biomedicinal Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- Takuya Habara
- Biomedicinal Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- Chisato Yamasaki
- Biomedicinal Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- Jun-ichi Takeda
- Biomedicinal Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- Hiromasa Tojo
- Department of Biophysics and Biochemistry, Osaka University, Graduate School of Medicine, Osaka, Japan
- Toshihide Nishimura
- Biosys Technologies, Inc., Tokyo, Japan
- Department of Surgery I, Tokyo Medical University, Tokyo, Japan
|
30
|
Kikugawa S, Nishikata K, Murakami K, Sato Y, Suzuki M, Altaf-Ul-Amin M, Kanaya S, Imanishi T. PCDq: human protein complex database with quality index which summarizes different levels of evidences of protein complexes predicted from h-invitational protein-protein interactions integrative dataset. BMC Syst Biol 2012; 6 Suppl 2:S7. [PMID: 23282181 PMCID: PMC3521179 DOI: 10.1186/1752-0509-6-s2-s7] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.9] [Indexed: 12/28/2022]
Abstract
Background: Proteins interact with other proteins or biomolecules in complexes to perform cellular functions. Existing protein-protein interaction (PPI) databases and protein complex databases for human proteins are not organized to provide protein complex information or facilitate the discovery of novel subunits. Data integration of PPIs focused specifically on protein complexes, subunits, and their functions. Predicted candidate complexes or subunits are also important for experimental biologists.
Description: Based on integrated PPI data and literature, we have developed a human protein complex database with a complex quality index (PCDq), which includes both known and predicted complexes and subunits. We integrated six PPI data (BIND, DIP, MINT, HPRD, IntAct, and GNP_Y2H), and predicted human protein complexes by finding densely connected regions in the PPI networks. They were curated with the literature so that missing proteins were complemented and some complexes were merged, resulting in 1,264 complexes comprising 9,268 proteins with 32,198 PPIs. The evidence level of each subunit was assigned as a categorical variable. This indicated whether it was a known subunit, and a specific function was inferable from sequence or network analysis. To summarize the categories of all the subunits in a complex, we devised a complex quality index (CQI) and assigned it to each complex. We examined the proportion of consistency of Gene Ontology (GO) terms among protein subunits of a complex. Next, we compared the expression profiles of the corresponding genes and found that many proteins in larger complexes tend to be expressed cooperatively at the transcript level. The proportion of duplicated genes in a complex was evaluated. Finally, we identified 78 hypothetical proteins that were annotated as subunits of 82 complexes, which included known complexes. Of these hypothetical proteins, after our prediction had been made, four were reported to be actual subunits of the assigned protein complexes.
Conclusions: We constructed a new protein complex database PCDq including both predicted and curated human protein complexes. CQI is a useful source of experimentally confirmed information about protein complexes and subunits. The predicted protein complexes can provide functional clues about hypothetical proteins. PCDq is freely available at http://h-invitational.jp/hinv/pcdq/.
Affiliation(s)
- Shingo Kikugawa
- Integrated Databases and Systems Biology Team, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, Japan
|
31
|
Takeda JI, Yamasaki C, Murakami K, Nagai Y, Sera M, Hara Y, Obi N, Habara T, Gojobori T, Imanishi T. H-InvDB in 2013: an omics study platform for human functional gene and transcript discovery. Nucleic Acids Res 2012. [PMID: 23197657 PMCID: PMC3531145 DOI: 10.1093/nar/gks1245] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Indexed: 01/13/2023]
Abstract
H-InvDB (http://www.h-invitational.jp/) is a comprehensive human gene database started in 2004. In the latest version, H-InvDB 8.0, a total of 244 709 human complementary DNA was mapped onto the hg19 reference genome and 43 829 gene loci, including nonprotein-coding ones, were identified. Of these loci, 35 631 were identified as potential protein-coding genes, and 22 898 of these were identical to known genes. In our analysis, 19 309 annotated genes were specific to H-InvDB and not found in RefSeq and Ensembl. In fact, 233 genes of the 19 309 turned out to have protein functions in this version of H-InvDB; they were annotated as unknown protein functions in the previous version. Furthermore, 11 genes were identified as known Mendelian disorder genes. It is advantageous that many biologically functional genes are hidden in the H-InvDB unique genes. As large-scale proteomic projects have been conducted to elucidate the functions of all human proteins, we have enhanced the proteomic information with an advanced protein view and new subdatabase of protein complexes (Protein Complex Database with quality index). We propose that H-InvDB is an important resource for finding novel candidate targets for medical care and drug development.
Affiliation(s)
- Jun-Ichi Takeda
- Integrated Database and Systems Biology Team, Biomedicinal Information Research Center, National Institute of Advanced Industrial Science and Technology, Aomi 2-4-7, Koto-ku, Tokyo 135-0064, Japan
|
32
|
Abstract
The relationship between sequence polymorphisms and human disease has been studied mostly in terms of effects of single nucleotide polymorphisms (SNPs) leading to single amino acid substitutions that change protein structure and function. However, less attention has been paid to more drastic sequence polymorphisms which cause premature termination of a protein’s sequence or large changes, insertions, or deletions in the sequence. We have analyzed a large set (n = 512) of insertions and deletions (indels) and single nucleotide polymorphisms causing premature termination of translation in disease-related genes. Prediction of protein-destabilization effects was performed by graphical presentation of the locations of polymorphisms in the protein structure, using the Genomes TO Protein (GTOP) database, and manual annotation with a set of specific criteria. Protein-destabilization was predicted for 44.4% of the nonsense SNPs, 32.4% of the frameshifting indels, and 9.1% of the non-frameshifting indels. A prediction of nonsense-mediated decay allowed us to infer which truncated proteins would actually be translated as defective proteins. These cases included the proteins linked to diseases inherited dominantly, suggesting a relation between these diseases and toxic aggregation. Our approach would be useful in identifying potentially aggregation-inducing polymorphisms that may have pathological effects.
|
33
|
Gascoigne DK, Cheetham SW, Cattenoz PB, Clark MB, Amaral PP, Taft RJ, Wilhelm D, Dinger ME, Mattick JS. Pinstripe: a suite of programs for integrating transcriptomic and proteomic datasets identifies novel proteins and improves differentiation of protein-coding and non-coding genes. Bioinformatics 2012; 28:3042-50. [PMID: 23044541 DOI: 10.1093/bioinformatics/bts582] [Citation(s) in RCA: 60] [Impact Index Per Article: 5.0] [Indexed: 11/13/2022]
Abstract
MOTIVATION Comparing transcriptomic data with proteomic data to identify protein-coding sequences is a long-standing challenge in molecular biology, one that is exacerbated by the increasing size of high-throughput datasets. To address this challenge, and thereby to improve the quality of genome annotation and understanding of genome biology, we have developed an integrated suite of programs, called Pinstripe. We demonstrate its application, utility and discovery power using transcriptomic and proteomic data from publicly available datasets. RESULTS To demonstrate the efficacy of Pinstripe for large-scale analysis, we applied Pinstripe's reverse peptide mapping pipeline to a transcript library including de novo assembled transcriptomes from the human Illumina Body Atlas (IBA2) and GENCODE v10 gene annotations, and the EBI Proteomics Identifications Database (PRIDE) peptide database. This analysis identified 736 canonical open reading frames (ORFs) supported by three or more PRIDE peptide fragments that are positioned outside any known coding DNA sequence (CDS). Because of the unfiltered nature of the PRIDE database and high probability of false discovery, we further refined this list using independent evidence for translation, including the presence of a Kozak sequence or functional domains, synonymous/non-synonymous substitution ratios and ORF length. Using this integrative approach, we observed evidence of translation from a previously unknown let7e primary transcript, the archetypical lncRNA H19, and a homolog of RD3. Reciprocally, by exclusion of transcripts with mapped peptides or significant ORFs (>80 codon), we identify 32 187 loci with RNAs longer than 2000 nt that are unlikely to encode proteins. AVAILABILITY AND IMPLEMENTATION Pinstripe (pinstripe.matticklab.com) is freely available as source code or a Mono binary. Pinstripe is written in C# and runs under the Mono framework on Linux or Mac OS X, and both under Mono and .Net under Windows. 
CONTACT m.dinger@garvan.org.au or j.mattick@garvan.org.au SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Dennis K Gascoigne
- Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia
|
34
|
Wu PY, Phan JH, Wang MD. The Effect of Human Genome Annotation Complexity on RNA-Seq Gene Expression Quantification. IEEE International Conference on Bioinformatics and Biomedicine Workshops. IEEE International Conference on Bioinformatics and Biomedicine 2012; 2012:712-717. [PMID: 27532059 DOI: 10.1109/bibmw.2012.6470224] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 11/06/2022]
Abstract
Next-generation sequencing (NGS) has brought human genomic research to an unprecedented era. RNA-Seq is a branch of NGS that can be used to quantify gene expression and depends on accurate annotation of the human genome (i.e., the definition of genes and all of their variants or isoforms). Multiple annotations of the human genome exist with varying complexity. However, it is not clear how the choice of genome annotation influences RNA-Seq gene expression quantification. We assess the effect of different genome annotations in terms of (1) mapping quality, (2) quantification variation, (3) quantification accuracy (i.e., by comparing to qRT-PCR data), and (4) the concordance of detecting differentially expressed genes. External validation with qRT-PCR suggests that more complex genome annotations result in higher quantification variation.
Affiliation(s)
- Po-Yen Wu
- Department of Electrical and Computer Engineering, Georgia Tech, Atlanta, GA, U.S.A.
- John H Phan
- The Wallace H. Coulter Department of Biomedical Engineering, Georgia Tech and Emory University, Atlanta, GA, U.S.A.
- May D Wang
- The Wallace H. Coulter Department of Biomedical Engineering, Georgia Tech and Emory University, Atlanta, GA, U.S.A.
|
35
|
Comparative genome analysis of three eukaryotic parasites with differing abilities to transform leukocytes reveals key mediators of Theileria-induced leukocyte transformation. mBio 2012; 3:e00204-12. [PMID: 22951932 PMCID: PMC3445966 DOI: 10.1128/mbio.00204-12] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.7] [Indexed: 11/20/2022]
Abstract
We sequenced the genome of Theileria orientalis, a tick-borne apicomplexan protozoan parasite of cattle. The focus of this study was a comparative genome analysis of T. orientalis relative to other highly pathogenic Theileria species, T. parva and T. annulata. T. parva and T. annulata induce transformation of infected cells of lymphocyte or macrophage/monocyte lineages; in contrast, T. orientalis does not induce uncontrolled proliferation of infected leukocytes and multiplies predominantly within infected erythrocytes. While synteny across homologous chromosomes of the three Theileria species was found to be well conserved overall, subtelomeric structures were found to differ substantially, as T. orientalis lacks the large tandemly arrayed subtelomere-encoded variable secreted protein-encoding gene family. Moreover, expansion of particular gene families by gene duplication was found in the genomes of the two transforming Theileria species, most notably, the TashAT/TpHN and Tar/Tpr gene families. Gene families that are present only in T. parva and T. annulata and not in T. orientalis, Babesia bovis, or Plasmodium were also identified. Identification of differences between the genome sequences of Theileria species with different abilities to transform and immortalize bovine leukocytes will provide insight into proteins and mechanisms that have evolved to induce and regulate this process. The T. orientalis genome database is available at http://totdb.czc.hokudai.ac.jp/.
|
36
|
Sirota FL, Batagov A, Schneider G, Eisenhaber B, Eisenhaber F, Maurer-Stroh S. Beware of moving targets: reference proteome content fluctuates substantially over the years. J Bioinform Comput Biol 2012; 10:1250020. [PMID: 22867629 DOI: 10.1142/s0219720012500205] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 11/18/2022]
Abstract
Reference proteomes are generated by increasingly sophisticated annotation pipelines as part of regular genome build releases; yet, the corresponding changes in reference proteomes' content are dramatic. In the history of the NCBI-curated human proteome, the total number of entries has remained roughly constant but approximately half of the proteins from the 2003 build 33 are no longer represented by entries in current releases, while about the same number of new proteins have been added (for sequence identity thresholds 50-90%). Although mostly hypothetical proteins are affected, there are also spectacular cases of entry removal/addition of well studied proteins. The changes between the 2003 and recent human proteomes are in a similar order of magnitude as the differences between recent human and chimpanzee proteome releases. As an application example, we show that the proteome fluctuations affect the interpretation (about 74% of hits) of organelle-specific mass-spectrometry data. Although proteome quality tends to improve with more recent releases as, for example, the fraction of proteins with functional annotation has increased over time, existing evidence implies that, apparently, the proteome content still remains incomplete, not just pertaining to isoforms/sequence variants but also to proteins and their families that are clearly distinct.
Affiliation(s)
- Fernanda L Sirota
- Bioinformatics Institute (BII), Agency for Science and Technology (A*STAR), 30 Biopolis Street, #07-01, Matrix, 138671, Singapore.
|
37
|
Davis MJ, Shin CJ, Jing N, Ragan MA. Rewiring the dynamic interactome. Mol Biosyst 2012; 8:2054-66, 2013. [DOI: 10.1039/c2mb25050k] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Indexed: 11/21/2022]
|
38
|
Ng SY, Johnson R, Stanton LW. Human long non-coding RNAs promote pluripotency and neuronal differentiation by association with chromatin modifiers and transcription factors. EMBO J 2011; 31:522-33. [PMID: 22193719 DOI: 10.1038/emboj.2011.459] [Citation(s) in RCA: 410] [Impact Index Per Article: 31.5] [Received: 07/16/2011] [Accepted: 11/17/2011] [Indexed: 01/04/2023]
Abstract
Long non-coding RNAs (lncRNAs) are a numerous class of newly discovered genes in the human genome, which have been proposed to be key regulators of biological processes, including stem cell pluripotency and neurogenesis. However, at present very little functional characterization of lncRNAs in human differentiation has been carried out. In the present study, we address this using human embryonic stem cells (hESCs) as a paradigm for pluripotency and neuronal differentiation. With a newly developed method, hESCs were robustly and efficiently differentiated into neurons, and we profiled the expression of thousands of lncRNAs using a custom-designed microarray. Some hESC-specific lncRNAs involved in pluripotency maintenance were identified, and shown to physically interact with SOX2, and PRC2 complex component, SUZ12. Using a similar approach, we identified lncRNAs required for neurogenesis. Knockdown studies indicated that loss of any of these lncRNAs blocked neurogenesis, and immunoprecipitation studies revealed physical association with REST and SUZ12. This study indicates that lncRNAs are important regulators of pluripotency and neurogenesis, and represents important evidence for an indispensable role of lncRNAs in human brain development.
Affiliation(s)
- Shi-Yan Ng
- Stem Cell and Developmental Biology Group, Genome Institute of Singapore, Singapore
|
39
|
Bioinformatics tools and novel challenges in long non-coding RNAs (lncRNAs) functional analysis. Int J Mol Sci 2011; 13:97-114. [PMID: 22312241 PMCID: PMC3269675 DOI: 10.3390/ijms13010097] [Citation(s) in RCA: 79] [Impact Index Per Article: 6.1] [Received: 11/21/2011] [Revised: 12/02/2011] [Accepted: 12/05/2011] [Indexed: 01/22/2023]
Abstract
The advent of next generation sequencing revealed that a fraction of transcribed RNAs (short and long RNAs) is non-coding. Long non-coding RNAs (lncRNAs) have a crucial role in regulating gene expression and in epigenetics (chromatin and histones remodeling). LncRNAs may have different roles: gene activators (signaling), repressors (decoy), cis and trans gene expression regulators (guides) and chromatin modificators (scaffolds) without the need to be mutually exclusive. LncRNAs are also implicated in a number of diseases. The huge amount of inhomogeneous data produced so far poses several bioinformatics challenges spanning from the simple annotation to the more complex functional annotation. In this review, we report and discuss several bioinformatics resources freely available and dealing with the study of lncRNAs. To our knowledge, this is the first review summarizing all the available bioinformatics resources on lncRNAs appeared in the literature after the completion of the human genome project. Therefore, the aim of this review is to provide a little guide for biologists and bioinformaticians looking for dedicated resources, public repositories and other tools for lncRNAs functional analysis.
|
40
|
Maruyama Y, Kawamura Y, Nishikawa T, Isogai T, Nomura N, Goshima N. HGPD: Human Gene and Protein Database, 2012 update. Nucleic Acids Res 2011; 40:D924-9. [PMID: 22140100 PMCID: PMC3245012 DOI: 10.1093/nar/gkr1188] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Indexed: 11/14/2022]
Abstract
The Human Gene and Protein Database (HGPD; http://www.HGPD.jp/) is a unique database that stores information on a set of human Gateway entry clones in addition to protein expression and protein synthesis data. The HGPD was launched in November 2008, and 33,275 human Gateway entry clones have been constructed from the open reading frames (ORFs) of full-length cDNA, thus representing the largest collection in the world. Recently, research objectives have focused on the development of new medicines and the establishment of novel diagnostic methods and medical treatments. And, studies using proteins and protein information, which are closely related to gene function, have been undertaken. For this update, we constructed an additional 9974 human Gateway entry clones, giving a total of 43,249. This set of human Gateway entry clones was named the Human Proteome Expression Resource, known as the 'HuPEX'. In addition, we also classified the clones into 10 groups according to protein function. Moreover, in vivo cellular localization data of proteins for 32,651 human Gateway entry clones were included for retrieval from the HGPD. In 'Information Overview', which presents the search results, the ORF region of each cDNA is now displayed allowing the Gateway entry clones to be searched more easily.
Affiliation(s)
- Yukio Maruyama
- National Institute of Advanced Industrial Science and Technology, Japan Biological Informatics Consortium, Aomi, Koto-ku, Tokyo 135-0064, Japan
|
41
|
Hara Y, Imanishi T. Abundance of ultramicro inversions within local alignments between human and chimpanzee genomes. BMC Evol Biol 2011; 11:308. [PMID: 22011259 PMCID: PMC3227671 DOI: 10.1186/1471-2148-11-308] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 06/28/2011] [Accepted: 10/19/2011] [Indexed: 11/18/2022]
Abstract
Background: Chromosomal inversion is one of the most important mechanisms of evolution. Recent studies of comparative genomics have revealed that chromosomal inversions are abundant in the human genome. While such previously characterized inversions are large enough to be identified as a single alignment or a string of local alignments, the impact of ultramicro inversions, which are so short that the local alignments completely cover them, on evolution is still uncertain.
Results: In this study, we developed a method for identifying ultramicro inversions by scanning of local alignments. This technique achieved a high sensitivity and a very low rate of false positives. We identified 2,377 ultramicro inversions ranging from five to 125 bp within the orthologous alignments between the human and chimpanzee genomes. The false positive rate was estimated to be around 4%. Based on phylogenetic profiles using the primate outgroups, 479 ultramicro inversions were inferred to have specifically inverted in the human lineage. Ultramicro inversions exclusively involving adenine and thymine were the most frequent: 461 inversions (19.4% of the total). Furthermore, the density of ultramicro inversions in chromosome Y and the neighborhoods of transposable elements was higher than average. Sixty-five ultramicro inversions were identified within the exons of human protein-coding genes.
Conclusions: We defined ultramicro inversions as the inverted regions equal to or smaller than 125 bp buried within local alignments. Our observations suggest that ultramicro inversions are abundant among the human and chimpanzee genomes, and that location of the inversions correlated with the genome structural instability. Some of the ultramicro inversions may contribute to gene evolution. Our inversion-identification method is also applicable in the fine-tuning of genome alignments by distinguishing ultramicro inversions from nucleotide substitutions and indels.
Affiliation(s)
- Yuichiro Hara
- Biomedicinal Information Research Center, National Institute of Advanced Industrial Science and Technology, Aomi 2-4-7, Koto-ku, Tokyo, Japan
|
42
|
Taniya T, Tanaka S, Yamaguchi-Kabata Y, Hanaoka H, Yamasaki C, Maekawa H, Barrero RA, Lenhard B, Datta MW, Shimoyama M, Bumgarner R, Chakraborty R, Hopkinson I, Jia L, Hide W, Auffray C, Minoshima S, Imanishi T, Gojobori T. A prioritization analysis of disease association by data-mining of functional annotation of human genes. Genomics 2011; 99:1-9. [PMID: 22019378 DOI: 10.1016/j.ygeno.2011.10.002] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 03/01/2011] [Revised: 09/16/2011] [Accepted: 10/06/2011] [Indexed: 11/15/2022]
Abstract
Complex diseases result from contributions of multiple genes that act in concert through pathways. Here we present a method to prioritize novel candidates of disease-susceptibility genes depending on the biological similarities to the known disease-related genes. The extent of disease-susceptibility of a gene is prioritized by analyzing seven features of human genes captured in H-InvDB. Taking rheumatoid arthritis (RA) and prostate cancer (PC) as two examples, we evaluated the efficiency of our method. Highly scored genes obtained included TNFSF12 and OSM as candidate disease genes for RA and PC, respectively. Subsequent characterization of these genes based upon an extensive literature survey reinforced the validity of these highly scored genes as possible disease-susceptibility genes. Our approach, Prioritization ANalysis of Disease Association (PANDA), is an efficient and cost-effective method to narrow down a large set of genes into smaller subsets that are most likely to be involved in the disease pathogenesis.
Affiliation(s)
- Takayuki Taniya
- Japan Biological Information Research Center, Japan Biological Informatics Consortium, AIST Bio-IT Research Building 7F, 2-4-7 Aomi, Tokyo 135-0064, Japan
|
43
|
Weak seed-pairing stability and high target-site abundance decrease the proficiency of lsy-6 and other microRNAs. Nat Struct Mol Biol 2011; 18:1139-46. [PMID: 21909094 PMCID: PMC3190056 DOI: 10.1038/nsmb.2115] [Citation(s) in RCA: 712] [Impact Index Per Article: 54.8] [Received: 12/12/2010] [Accepted: 07/01/2011] [Indexed: 12/11/2022]
Abstract
Most metazoan microRNAs (miRNAs) target many genes for repression, but the nematode lsy-6 miRNA is much less proficient. Here we show that the low proficiency of lsy-6 can be recapitulated in HeLa cells and that miR-23, a mammalian miRNA, also has low proficiency in these cells. Reporter results and array data indicate two properties of these miRNAs that impart low proficiency: their weak predicted seed-pairing stability (SPS) and their high target-site abundance (TA). These two properties also explain differential propensities of small interfering RNAs (siRNAs) to repress unintended targets. Using these insights, we expand the TargetScan tool for quantitatively predicting miRNA regulation (and siRNA off-targeting) to model differential miRNA (and siRNA) proficiencies, thereby improving prediction performance. We propose that siRNAs designed to have both weaker SPS and higher TA will have fewer off-targets without compromised on-target activity.
|
44
|
Abe H, Narusaka Y, Sasaki I, Hatakeyama K, Shin-I S, Narusaka M, Fukami-Kobayashi K, Matsumoto S, Kobayashi M. Development of full-length cDNAs from Chinese cabbage (Brassica rapa Subsp. pekinensis) and identification of marker genes for defence response. DNA Res 2011; 18:277-89. [PMID: 21745830 PMCID: PMC3158467 DOI: 10.1093/dnares/dsr018] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Received: 01/30/2011] [Accepted: 05/25/2011] [Indexed: 11/13/2022]
Abstract
Arabidopsis belongs to the Brassicaceae family and plays an important role as a model plant for which researchers have developed fine-tuned genome resources. Genome sequencing projects have been initiated for other members of the Brassicaceae family. Among these projects, research on Chinese cabbage (Brassica rapa subsp. pekinensis) started early because of strong interest in this species. Here, we report the development of a library of Chinese cabbage full-length cDNA clones, the RIKEN BRC B. rapa full-length cDNA (BBRAF) resource, to accelerate research on Brassica species. We sequenced 10 000 BBRAF clones and confirmed 5476 independent clones. Most of these cDNAs showed high homology to Arabidopsis genes, but we also obtained more than 200 cDNA clones that lacked any sequence homology to Arabidopsis genes. We also successfully identified several possible candidate marker genes for plant defence responses from our analysis of the expression of the Brassica counterparts of Arabidopsis marker genes in response to salicylic acid and jasmonic acid. We compared gene expression of these markers in several Chinese cabbage cultivars. Our BBRAF cDNA resource will be publicly available from the RIKEN Bioresource Center and will help researchers to transfer Arabidopsis-related knowledge to Brassica crops.
Affiliation(s)
- Hiroshi Abe
- Experimental Plant Division, Department of Biological Systems, RIKEN BioResource Center, Koyadai, Tsukuba, Ibaraki, Japan.
|
45
|
Samuels ME. Saturation of the human phenome. Curr Genomics 2011; 11:482-99. [PMID: 21532833 PMCID: PMC3048311 DOI: 10.2174/138920210793175886] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Received: 05/08/2010] [Revised: 06/22/2010] [Accepted: 06/22/2010] [Indexed: 12/26/2022]
Abstract
The phenome is the complete set of phenotypes resulting from genetic variation in populations of an organism. Saturation of a phenome implies the identification and phenotypic description of mutations in all genes in an organism, potentially constrained to those encoding proteins. The human genome is believed to contain 20,000-25,000 protein coding genes, but only a small fraction of these have documented mutant phenotypes, thus the human phenome is far from complete. In model organisms, genetic saturation entails the identification of multiple mutant alleles of a gene or locus, allowing a consistent description of mutational phenotypes for that gene. Saturation of several model organisms has been attempted, usually by targeting annotated coding genes with insertional transposons (Drosophila melanogaster, Mus musculus) or by sequence directed deletion (Saccharomyces cerevisiae) or using libraries of antisense oligonucleotide probes injected directly into animals (Caenorhabditis elegans, Danio rerio). This paper reviews the general state of the human phenome, and discusses theoretical and practical considerations toward a saturation analysis in humans. Throughout, emphasis is placed on high penetrance genetic variation, of the kind typically associated with monogenic versus complex traits.
Affiliation(s)
- Mark E Samuels
- Centre de Recherche de Ste-Justine, 3175, Côte Ste-Catherine, Montréal QC H3T 1C5, Canada
|
46
|
Simola DF, Kim J. Sniper: improved SNP discovery by multiply mapping deep sequenced reads. Genome Biol 2011; 12:R55. [PMID: 21689413 PMCID: PMC3218843 DOI: 10.1186/gb-2011-12-6-r55] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Received: 01/26/2011] [Revised: 04/22/2011] [Accepted: 06/20/2011] [Indexed: 11/10/2022] Open
Abstract
SNP (single nucleotide polymorphism) discovery using next-generation sequencing data remains difficult primarily because of redundant genomic regions, such as interspersed repetitive elements and paralogous genes, present in all eukaryotic genomes. To address this problem, we developed Sniper, a novel multi-locus Bayesian probabilistic model and a computationally efficient algorithm that explicitly incorporates sequence reads that map to multiple genomic loci. Our model fully accounts for sequencing error, template bias, and multi-locus SNP combinations, maintaining high sensitivity and specificity under a broad range of conditions. An implementation of Sniper is freely available at http://kim.bio.upenn.edu/software/sniper.shtml.
Affiliation(s)
- Daniel F Simola
- Department of Biology, University of Pennsylvania, 433 S. University Ave., Philadelphia, PA 19104, USA
|
47
|
A computational approach to candidate gene prioritization for X-linked mental retardation using annotation-based binary filtering and motif-based linear discriminatory analysis. Biol Direct 2011; 6:30. [PMID: 21668950 PMCID: PMC3142252 DOI: 10.1186/1745-6150-6-30] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 01/20/2011] [Accepted: 06/13/2011] [Indexed: 01/07/2023] Open
Abstract
Background: Several computational candidate gene selection and prioritization methods have recently been developed. These in silico selection and prioritization techniques are usually based on two central approaches: the examination of similarities to known disease genes and/or the evaluation of functional annotation of genes. Each of these approaches has its own caveats. Here we employ a previously described method of candidate gene prioritization based mainly on gene annotation, together with a technique based on the evaluation of pertinent sequence motifs or signatures, in an attempt to refine the gene prioritization approach. We apply this approach to X-linked mental retardation (XLMR), a group of heterogeneous disorders for which some of the underlying genetics is known. Results: The gene annotation-based binary filtering method yielded a ranked list of putative XLMR candidate genes with good plausibility of being associated with the development of mental retardation. In parallel, a motif-finding approach based on linear discriminatory analysis (LDA) was employed to identify short sequence patterns that may discriminate XLMR from non-XLMR genes. High rates (>80%) of correct classification were achieved, suggesting that the identification of these motifs effectively captures genomic signals associated with XLMR vs. non-XLMR genes. The computational tools developed for the motif-based LDA are integrated into the freely available genomic analysis portal Galaxy (http://main.g2.bx.psu.edu/). Nine genes (APLN, ZC4H2, MAGED4, MAGED4B, RAP2C, FAM156A, FAM156B, TBL1X, and UXT) were highlighted as highly ranked XLMR candidates. Conclusions: The combination of gene annotation information and sequence motif-orientated computational candidate gene prediction methods highlights an added benefit in generating a list of plausible candidate genes, as has been demonstrated for XLMR.
Reviewers: This article was reviewed by Dr Barbara Bardoni (nominated by Prof Juergen Brosius); Prof Neil Smalheiser and Dr Dustin Holloway (nominated by Prof Charles DeLisi).
|
48
|
Mattick JS. The central role of RNA in human development and cognition. FEBS Lett 2011; 585:1600-16. [DOI: 10.1016/j.febslet.2011.05.001] [Citation(s) in RCA: 149] [Impact Index Per Article: 11.5] [Received: 04/29/2011] [Accepted: 05/03/2011] [Indexed: 12/22/2022]
|
49
|
Matsumoto T, Tanaka T, Sakai H, Amano N, Kanamori H, Kurita K, Kikuta A, Kamiya K, Yamamoto M, Ikawa H, Fujii N, Hori K, Itoh T, Sato K. Comprehensive sequence analysis of 24,783 barley full-length cDNAs derived from 12 clone libraries. Plant Physiol 2011; 156:20-8. [PMID: 21415278 PMCID: PMC3091036 DOI: 10.1104/pp.110.171579] [Citation(s) in RCA: 98] [Impact Index Per Article: 7.5] [Received: 12/21/2010] [Accepted: 03/16/2011] [Indexed: 05/18/2023]
Abstract
Full-length cDNA (FLcDNA) libraries consisting of 172,000 clones were constructed from a two-row malting barley cultivar (Hordeum vulgare 'Haruna Nijo') under normal and stressed conditions. After sequencing the clones from both ends and clustering the sequences, a total of 24,783 complete sequences were produced. By removing duplicates between these and publicly available sequences, 22,651 representative sequences were obtained: 17,773 were novel barley FLcDNAs, and 1,699 were barley specific. Highly conserved genes were found in the barley FLcDNA sequences for 721 of 881 rice (Oryza sativa) trait genes with 50% or greater identity. These FLcDNA resources from our Haruna Nijo cDNA libraries and the full-length sequences of representative clones will improve our understanding of the biological functions of genes in barley, which is the cereal crop with the fourth highest production in the world, and will provide a powerful tool for annotating the barley genome sequences that will become available in the near future.
Affiliation(s)
- Takashi Matsumoto
- National Institute of Agrobiological Sciences, Tsukuba, Ibaraki 305-8602, Japan.
|
50
|
Arron ST, Ruby JG, Dybbro E, Ganem D, Derisi JL. Transcriptome sequencing demonstrates that human papillomavirus is not active in cutaneous squamous cell carcinoma. J Invest Dermatol 2011; 131:1745-53. [PMID: 21490616 PMCID: PMC3136639 DOI: 10.1038/jid.2011.91] [Citation(s) in RCA: 110] [Impact Index Per Article: 8.5] [Indexed: 12/17/2022]
Abstract
Beta-papillomavirus (β-HPV) DNA is present in some cutaneous squamous cell carcinomas (cuSCC), but no mechanism of carcinogenesis has been determined. We used ultra-high throughput sequencing of the cancer transcriptome to assess whether papillomavirus transcripts are present in these cancers. Sixty-seven cuSCC samples were assayed for β-HPV DNA by PCR, and viral loads were measured with type-specific qPCR. Thirty-one SCCs were selected for whole transcriptome sequencing. Transcriptome libraries were prepared in parallel from the HPV18 positive HeLa cervical cancer cell line and HPV16 positive primary cervical and periungual SCC. Thirty percent (20/67) of the tumors were positive for β-HPV DNA, but there was no difference in β-HPV viral load between tumor and normal tissue (p=0.310). Immunosuppression and age were significantly associated with higher viral load (p=0.016 for immunosuppression; p=0.0004 for age). Transcriptome sequencing failed to identify papillomavirus expression in any of the skin tumors. In contrast, HPV 16 and 18 mRNA transcripts were readily identified in primary cervical and periungual cancers and HeLa cells. These data demonstrate that papillomavirus mRNA expression is not a factor in the maintenance of cuSCC.
Affiliation(s)
- Sarah T Arron
- Department of Dermatology, University of California, San Francisco, San Francisco, CA 94143, USA
|