1
|
Ten simple rules for delivering live distance training in bioinformatics across the globe using webinars. PLoS Comput Biol 2018; 14:e1006419. [PMID: 30439935 PMCID: PMC6237289 DOI: 10.1371/journal.pcbi.1006419] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 11/21/2022] Open
|
2
|
Genetic variability of the activity of bidirectional promoters: a pilot study in bovine muscle. DNA Res 2017; 24:221-233. [PMID: 28338730 PMCID: PMC5499805 DOI: 10.1093/dnares/dsx004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/02/2016] [Accepted: 01/24/2017] [Indexed: 11/25/2022] Open
Abstract
Bidirectional promoters are regulatory regions co-regulating the expression of two neighbouring genes organized in a head-to-head orientation. In recent years, these regulatory regions have been studied in many organisms; however, no investigation to date has been done to analyse the genetic variation of the activity of this type of promoter regions. In our study, we conducted an investigation to first identify bidirectional promoters sharing genes expressed in bovine Longissimus thoracis and then to find genetic variants affecting the activity of some of these bidirectional promoters. Combining bovine gene information and expression data obtained using RNA-Seq, we identified 120 putative bidirectional promoters active in bovine muscle. We experimentally validated in vitro 16 of these bidirectional promoters. Finally, using gene expression and whole-genome genotyping data, we explored the variability of the activity in muscle of the identified bidirectional promoters and discovered genetic variants affecting their activity. We found that the expression level of 77 genes is correlated with the activity of 12 bidirectional promoters. We also identified 57 single nucleotide polymorphisms associated with the activity of 5 bidirectional promoters. To our knowledge, our study is the first analysis in any species of the genetic variability of the activity of bidirectional promoters.
|
3
|
|
4
|
A Systems Biology Approach to Reveal Putative Host-Derived Biomarkers of Periodontitis by Network Topology Characterization of MMP-REDOX/NO and Apoptosis Integrated Pathways. Front Cell Infect Microbiol 2016; 5:102. [PMID: 26793622 PMCID: PMC4707239 DOI: 10.3389/fcimb.2015.00102] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 08/18/2015] [Accepted: 12/15/2015] [Indexed: 01/27/2023] Open
Abstract
Periodontitis, a formidable global health burden, is a common chronic disease that destroys tooth-supporting tissues. Biomarkers of the early phase of this progressive disease are of utmost importance for global health. In this context, saliva represents a non-invasive biosample. By using systems biology tools, we aimed to (1) identify an integrated interactome between matrix metalloproteinase (MMP)-REDOX/nitric oxide (NO) and apoptosis upstream pathways of periodontal inflammation, and (2) characterize the attendant topological network properties to uncover putative biomarkers to be tested in saliva from patients with periodontitis. Hence, we first generated a protein-protein network model of interactions ("BIOMARK" interactome) by using the STRING 10 database, a search tool for the retrieval of interacting genes/proteins, with "Experiments" and "Databases" as input options and a confidence score of 0.400. Second, we determined the centrality values (closeness, stress, degree or connectivity, and betweenness) for the "BIOMARK" members by using the Cytoscape software. We found Ubiquitin C (UBC), Jun proto-oncogene (JUN), and matrix metalloproteinase-14 (MMP14) as the most central hub- and non-hub-bottlenecks among the 211 genes/proteins of the whole interactome. We conclude that UBC, JUN, and MMP14 are likely an optimal candidate group of host-derived biomarkers, in combination with oral pathogenic bacteria-derived proteins, for detecting periodontitis at its early phase by using salivary samples from patients. These findings therefore have broader relevance for systems medicine in global health as well.
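The centrality measures named in this abstract can be illustrated on a small example. The sketch below is not the authors' Cytoscape workflow: it computes only degree and closeness centrality, in plain Python, on a hypothetical five-node undirected graph whose node labels and edges are invented for illustration.

```python
from collections import deque

def degree(graph):
    """Degree (connectivity): number of direct neighbours of each node."""
    return {node: len(neighbours) for node, neighbours in graph.items()}

def closeness(graph, source):
    """Closeness: (reachable nodes) / (sum of shortest-path lengths from source)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:  # breadth-first search gives shortest paths in an unweighted graph
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

# Hypothetical toy interactome (assumption: labels and edges are illustrative only).
toy = {
    "A": {"B", "C", "D"},
    "B": {"A"},
    "C": {"A", "D"},
    "D": {"A", "C", "E"},
    "E": {"D"},
}

if __name__ == "__main__":
    print(degree(toy))
    print({node: round(closeness(toy, node), 3) for node in toy})
```

In this toy graph, nodes "A" and "D" have the highest degree and closeness, which is the sense in which a node is called a hub. Stress and betweenness centrality, also used in the abstract, require counting shortest paths through each node and are omitted here.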
|
5
|
Analyses of Gingival Adhesion Molecules in Periodontitis: Theoretical In Silico, Comparative In Vivo, and Explanatory In Vitro Models. J Periodontol 2015; 87:193-202. [PMID: 26430925 DOI: 10.1902/jop.2015.150361] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Indexed: 12/22/2022]
Abstract
BACKGROUND A deeper understanding of periodontitis pathophysiology is central to future development of novel biomarkers and therapeutics. The following is reported here: 1) an in silico network model of interactions among cell adhesion molecules and a network-focused microarray analysis of the corresponding genes in periodontitis; 2) analysis of secretions of adhesion molecules in gingival tissue samples from patients with periodontitis and healthy controls; and 3) effect of the human neutrophilic peptide-1 (HNP-1) on epithelial adhesion molecules. METHODS The network model identified 85 nodes in relation to the interactions of adhesion molecules. Subsequently, the relative gene expression was overlaid on the network model. Differential gene expression was analyzed, and false discovery rate control was performed for statistical assessment of the microarray data. Both tissue and cell culture samples were immunostained for desmocollin (DSC)2, occludin (OCLN), desmoglein (DSG)1, tight junction protein 2, and gap junction protein α. RESULTS The differential gene expression analysis revealed that the epithelial adhesion molecules were significantly lower in abundance in individuals with periodontitis than controls. In contrast, the genes for leukocyte adhesion molecules showed a significant upregulation. Immunostainings revealed elevated secretions of both DSG1 and OCLN in periodontitis. An in vitro model suggested reduced DSC2 and OCLN secretions in the presence of HNP-1. CONCLUSIONS Gene expression of gingival adhesion molecules in periodontitis is regulated by leukocyte transmigration, whereas the neutrophilic antimicrobial peptide HNP-1 is noted as a putative regulator of epithelial adhesion molecules. These observations contribute to the key mechanisms by which future biomarkers might be developed for periodontitis.
|
6
|
Bayesian phylogeny analysis of vertebrate serpins illustrates evolutionary conservation of the intron and indels based six groups classification system from lampreys for ∼500 MY. PeerJ 2015; 3:e1026. [PMID: 26157611 PMCID: PMC4476131 DOI: 10.7717/peerj.1026] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 03/22/2015] [Accepted: 05/26/2015] [Indexed: 11/20/2022] Open
Abstract
The serpin superfamily is characterized by proteins that fold into a conserved tertiary structure and exploits a sophisticated and irreversible suicide-mechanism of inhibition. Vertebrate serpins are classified into six groups (V1-V6), based on three independent biological features-genomic organization, diagnostic amino acid sites and rare indels. However, this classification system was based on the limited number of mammalian genomes available. In this study, several non-mammalian genomes are used to validate this classification system using the powerful Bayesian phylogenetic method. This method supports the intron and indel based vertebrate classification and proves that serpins have been maintained from lampreys to humans for about 500 MY. Lampreys have fewer than 10 serpins, which expand into 36 serpins in humans. The two expanding groups V1 and V2 have SERPINB1/SERPINB6 and SERPINA8/SERPIND1 as the ancestral serpins, respectively. Large clusters of serpins are formed by local duplications of these serpins in tetrapod genomes. Interestingly, the ancestral HCII/SERPIND1 locus (nested within PIK4CA) possesses group V4 serpin (A2APL1, homolog of α 2-AP/SERPINF2) of lampreys; hence, pointing to the fact that group V4 might have originated from group V2. Additionally in this study, details of the phylogenetic history and genomic characteristics of vertebrate serpins are revisited.
|
7
|
Roles of small RNAs in the effects of nutrition on apoptosis and spermatogenesis in the adult testis. Sci Rep 2015; 5:10372. [PMID: 25996545 PMCID: PMC4440528 DOI: 10.1038/srep10372] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Received: 10/29/2014] [Accepted: 04/10/2015] [Indexed: 12/16/2022] Open
Abstract
We tested whether reductions in spermatozoal quality induced by under-nutrition are associated with increased germ cell apoptosis and disrupted spermatogenesis, and whether these effects are mediated by small RNAs. Groups of 8 male sheep were fed for a 10% increase or 10% decrease in body mass over 65 days. Underfeeding increased the number of apoptotic germ cells (P < 0.05) and increased the expression of apoptosis-related genes (P < 0.05) in testicular tissue. We identified 44 miRNAs and 35 putative piRNAs that were differentially expressed in well-fed and underfed males (FDR < 0.05). Some were related to reproductive system development, apoptosis (miRNAs), and sperm production and quality (piRNAs). Novel-miR-144 (miR-98), was found to target three apoptotic genes (TP53, CASP3, FASL). The proportion of miRNAs as a total of small RNAs was greater in well-fed males than in underfed males (P < 0.05) and was correlated (r = 0.8, P < 0.05) with the proportion of piRNAs in well-fed and underfed males. In conclusion, the reductions in spermatozoal quality induced by under-nutrition are caused, at least partly, by disruptions to Sertoli cell function and increased germ cell apoptosis, mediated by changes in the expression of miRNAs and piRNAs.
|
8
|
Association of C5aR1 genetic polymorphisms with coronary artery disease in a Han population in Xinjiang, China. Diagn Pathol 2015; 10:33. [PMID: 25924896 PMCID: PMC4414445 DOI: 10.1186/s13000-015-0261-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/12/2014] [Accepted: 04/06/2015] [Indexed: 11/17/2022] Open
Abstract
Background Complement 5a receptor (C5aR) was demonstrated to be a receptor for complement 5a (C5a), which is involved in many inflammatory diseases. The functional responses attributed to C5a result from its interaction with its receptor C5aR, which stimulates food intake and plays a role in increasing the inflammatory response in adipose tissue as well as in the cardiovascular and neural systems. However, associations between SNPs of the C5aR1 gene and coronary artery disease (CAD) are unknown. Methods We examined the role of the tagging single nucleotide polymorphisms (SNPs) of the C5aR1 gene in CAD using a case–control design, and determined the prevalence of C5aR1 genotypes in 505 CAD patients and 469 age- and sex-matched healthy control subjects from a Han population. Results The SNP rs10853784 was found to be associated with CAD in a dominant model (CC vs TT + CT, P = 0.004). The difference remained statistically significant after multivariate adjustment (OR = 1.430, 95% CI: 1.087 ~ 1.882, P = 0.011). There was no significant difference in genotype distributions of rs4577202 and rs7250152 between CAD patients and control subjects. The frequency of the haplotype (A-T-C) was significantly higher in the CAD patients than in the controls (P = 0.035), and the frequency of the haplotype (A-C-T) was significantly lower in the CAD patients than in the control subjects in the Chinese Han population (P = 0.002). Conclusion The results of this study indicate that rs10853784 of the C5aR1 gene is associated with CAD in the Han population of China, and that the A-C-T haplotype may be a protective genetic marker and the A-T-C haplotype may be a risk genetic marker for CAD in the Chinese Han population. Virtual slides The virtual slide(s) for this article can be found here: http://www.diagnosticpathology.diagnomx.eu/vs/2054871241495194.
|
9
|
Gene Expression Profile of NF-κB, Nrf2, Glycolytic, and p53 Pathways During the SH-SY5Y Neuronal Differentiation Mediated by Retinoic Acid. Mol Neurobiol 2014; 53:423-435. [PMID: 25465239 DOI: 10.1007/s12035-014-8998-9] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 08/21/2014] [Accepted: 11/12/2014] [Indexed: 01/17/2023]
Abstract
SH-SY5Y cells, a neuroblastoma cell line that is a well-established model system to study the initial phases of neuronal differentiation, have been used in studies to elucidate the mechanisms of neuronal differentiation. In the present study, we investigated alterations of gene expression in SH-SY5Y cells during neuronal differentiation mediated by retinoic acid (RA) treatment. We evaluated important pathways involving nuclear factor kappa B (NF-κB), nuclear E2-related factor 2 (Nrf2), glycolytic, and p53 during neuronal differentiation. We also investigated the involvement of reactive oxygen species (ROS) in modulating the gene expression profile of those pathways by antioxidant co-treatment with Trolox®, a hydrophilic analogue of α-tocopherol. We found that RA treatment increases levels of gene expression of NF-κB, glycolytic, and antioxidant pathway genes during neuronal differentiation of SH-SY5Y cells. We also found that ROS production induced by RA treatment in SH-SY5Y cells is involved in gene expression profile alterations, chiefly in the NF-κB and glycolytic pathways. Antioxidant co-treatment with Trolox® reversed the effects mediated by RA on NF-κB and glycolytic pathway gene expression. Interestingly, co-treatment with Trolox® did not reverse the effects on antioxidant gene expression mediated by RA in SH-SY5Y cells. To confirm neuronal differentiation, we quantified endogenous levels of tyrosine hydroxylase, a recognized marker of neuronal differentiation. Our data suggest that during neuronal differentiation mediated by RA, changes in the gene expression profile of important pathways occur. These alterations are in part mediated by ROS production. Therefore, our results reinforce the importance of understanding the mechanism by which RA induces neuronal differentiation in SH-SY5Y cells, principally because this model is commonly used as a neuronal cell model in studies of neuronal pathologies.
|
10
|
Influence of component 5a receptor 1 (C5AR1) −1330T/G polymorphism on nonsedating H1-antihistamines therapy in Chinese patients with chronic spontaneous urticaria. J Dermatol Sci 2014; 76:240-5. [DOI: 10.1016/j.jdermsci.2014.09.012] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 05/27/2014] [Revised: 09/22/2014] [Accepted: 09/27/2014] [Indexed: 11/22/2022]
|
11
|
Comprehensive analysis of microRNA genomic loci identifies pervasive repetitive-element origins. Mob Genet Elements 2014; 1:8-17. [PMID: 22016841 DOI: 10.4161/mge.1.1.15766] [Citation(s) in RCA: 63] [Impact Index Per Article: 6.3] [Received: 03/15/2011] [Revised: 04/06/2011] [Accepted: 04/06/2011] [Indexed: 11/19/2022] Open
Abstract
MicroRNAs (miRs) are small non-coding RNAs that generally function as negative regulators of target messenger RNAs (mRNAs) at the posttranscriptional level. MiRs bind to the 3'UTR of target mRNAs through complementary base pairing, resulting in target mRNA cleavage or translation repression. To date, over 15,000 distinct miRs have been identified in organisms ranging from viruses to man and interest in miR research continues to intensify. Of note, the most enlightening aspect of miR function-the mRNAs they target-continues to be elusive. Descriptions of the molecular origins of independent miR molecules currently support the hypothesis that miR hairpin generation is based on the adjacent insertion of two related transposable elements (TEs) at one genomic locus. Thus transcription across such TE interfaces establishes many, if not the majority of functional miRs. The implications of these findings are substantial for understanding how TEs confer increased genomic fitness, describing miR transcriptional regulations and making accurate miR target predictions. In this work, we have performed a comprehensive analysis of the genomic events responsible for the formation of all currently annotated miR loci. We find that the connection between miRs and transposable elements is more significant than previously appreciated, and more broadly, supports an important role for repetitive elements in miR origin, expression and regulatory network formation. Further, we demonstrate the utility of these findings in miR target prediction. Our results greatly expand the existing repertoire of defined miR origins, detailing the formation of 2,392 of 15,176 currently recognized miR genomic loci and supporting a mobile genetic element model for the genomic establishment of functional miRs.
|
12
|
Abstract
MicroRNAs coordinate networks of mRNAs, but predicting specific sites of interactions is complicated by the very few bases of complementarity needed for regulation. Although efforts to characterize the specific requirements for microRNA (miR) regulation have made some advances, no general model of target recognition has been widely accepted. In this work, we describe an entirely novel approach to miR target identification. The genomic events responsible for the creation of individual miR loci have now been described with many miRs now known to have been initially formed from transposable element (TE) sequences. In light of this, we propose that limiting miR target searches to transcripts containing a miR's progenitor TE can facilitate accurate target identification. In this report we outline the methodology behind OrbId (Origin-based identification of microRNA targets). In stark contrast to the principal miR target algorithms (which rely heavily on target site conservation across species and are therefore most effective at predicting targets for older miRs), we find OrbId is particularly efficacious at predicting the mRNA targets of miRs formed more recently in evolutionary time. After defining the TE origins of > 200 human miRs, OrbId successfully generated likely target sets for 191 predominately primate-specific human miR loci. While only a handful of the loci examined were well enough conserved to have been previously evaluated by existing algorithms, we find ~80% of the targets for the oldest miR (miR-28) in our analysis contained within the principal Diana and TargetScan prediction sets. More importantly, four of the 15 OrbId miR-28 putative targets have been previously verified experimentally. In light of OrbId proving best-suited for predicting targets for more recently formed miRs, we suggest OrbId makes a logical complement to existing, conservation based, miR target algorithms.
|
13
|
Pregnancy-induced gingivitis and OMICS in dentistry: in silico modeling and in vivo prospective validation of estradiol-modulated inflammatory biomarkers. OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY 2014; 18:582-90. [PMID: 24983467 DOI: 10.1089/omi.2014.0020] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Indexed: 01/27/2023]
Abstract
Pregnancy-associated gingivitis is a bacterial-induced inflammatory disease with a remarkably high prevalence ranging from 35% to 100% across studies. Yet little is known about the attendant mechanisms or diagnostic biomarkers that can help predict individual susceptibility for rational personalized medicine. We aimed to define inflammatory proteins in saliva, induced or inhibited by estradiol, as early diagnostic biomarkers or target proteins in relation to pregnancy-associated gingivitis. An in silico gene/protein interaction network model was developed by using the STITCH 3.1 with "experiments" and "databases" as input options and a confidence score of 0.700 (high confidence). Salivary estradiol, interleukin (IL)-1β and -8, myeloperoxidase (MPO), matrix metalloproteinase (MMP)-2, -8, and -9, and tissue inhibitor of matrix metalloproteinase (TIMP)-1 levels from 30 women were measured prospectively three times during pregnancy and twice during postpartum. In silico analysis revealed that estradiol interacts with IL-1β and -8 by an activation link when the "actions view" was consulted. In saliva, estradiol concentrations associated positively with TIMP-1 and negatively with MPO and MMP-8 concentrations. When the gingival bleeding on probing percentage (BOP%) was included in the model as an effect modifier, the only association, a negative one, was found between estradiol and MMP-8. Throughout gestation, estradiol modulates the inflammatory response by inhibiting neutrophilic enzymes, such as MMP-8. The interactions between salivary degradative enzymes and proinflammatory cytokines during pregnancy suggest promising ways to identify candidate biomarkers for pregnancy-associated gingivitis, and for personalized medicine in the field of dentistry. Finally, we call for greater investments in, and action for biomarker research in periodontology and dentistry that have surprisingly lagged behind in personalized medicine compared to other fields, such as cancer research.
|
14
|
Comparative evolutionary genomics of the STAT family of transcription factors. JAKSTAT 2014; 1:23-33. [PMID: 24058748 PMCID: PMC3670131 DOI: 10.4161/jkst.19418] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Received: 12/28/2011] [Revised: 01/16/2012] [Accepted: 01/19/2012] [Indexed: 01/23/2023] Open
Abstract
The STAT signaling pathway is one of the seven common pathways that govern cell fate decisions during animal development. Comparative genomics revealed multiple incidences of stat gene duplications throughout metazoan evolutionary history. While pseudogenization is a frequent fate of duplicated genes, many of these STAT duplications evolved into novel genes through rapid sequence diversification and neofunctionalization. Additionally, the core of STAT gene regulatory networks, comprising stat1 through 4, stat5 and stat6, arose early in vertebrate evolution, probably through the two whole genome duplication events that occurred after the split of Cephalochordates but before the rise of Chondrichthyes. While another complete genome duplication event took place during the evolution of bony fish after their separation from the tetrapods about 450 million years ago (Mya), modern fish have only one set of these core stats, suggesting the rapid loss of most duplicated stat genes. The two stat5 genes in mammals likely arose from a duplication event in early Eutherian evolution, a period from about 310 Mya at the avian-mammal divergence to the separation of marsupials from other mammals about 130 Mya. These analyses indicate that whole genome duplications and gene duplications by unequal chromosomal crossing over were likely the major mechanisms underlying the evolution of STATs.
|
15
|
Abstract
The computational prediction of alternative splicing from high-throughput sequencing data is inherently difficult and necessitates robust statistical measures because the differential splicing signal is overlaid by influencing factors such as gene expression differences and simultaneous expression of multiple isoforms amongst others. In this work we describe ARH-seq, a discovery tool for differential splicing in case–control studies that is based on the information-theoretic concept of entropy. ARH-seq works on high-throughput sequencing data and is an extension of the ARH method that was originally developed for exon microarrays. We show that the method has inherent features, such as independence of transcript exon number and independence of differential expression, which make it particularly suited for detecting alternative splicing events from sequencing data. In order to test and validate our workflow we challenged it with publicly available sequencing data derived from human tissues and conducted a comparison with eight alternative computational methods. In order to judge the performance of the different methods we constructed a benchmark data set of true positive splicing events across different tissues agglomerated from public databases and show that ARH-seq is an accurate, computationally fast and high-performing method for detecting differential splicing events.
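To make the entropy idea concrete, the sketch below computes Shannon entropy over a set of non-negative per-exon weights. It is not the ARH-seq statistic, whose exact definition is not given in this abstract; the function name and the example weights are assumptions chosen only to show that an even spread across exons yields high entropy while a signal concentrated in one exon yields low entropy.

```python
import math

def shannon_entropy(weights):
    """Shannon entropy (in bits) of non-negative weights after normalisation."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must contain at least one positive value")
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log2(p) for p in probs)

# Hypothetical per-exon deviation magnitudes for a four-exon gene (illustrative values).
evenly_spread = [1.0, 1.0, 1.0, 1.0]   # no single exon stands out
one_exon_peak = [0.1, 0.1, 3.6, 0.2]   # one exon dominates the signal

if __name__ == "__main__":
    print(shannon_entropy(evenly_spread))
    print(shannon_entropy(one_exon_peak))
```

An entropy-based score of this kind depends on how the signal is distributed across exons, not on its overall magnitude, which is one plausible reading of the abstract's claim of independence from differential expression.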
|
16
|
Ensemble-based classification approach for micro-RNA mining applied on diverse metagenomic sequences. BMC Res Notes 2014; 7:286. [PMID: 24884968 PMCID: PMC4051165 DOI: 10.1186/1756-0500-7-286] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 11/09/2013] [Accepted: 04/22/2014] [Indexed: 01/23/2023] Open
Abstract
Background MicroRNAs (miRNAs) are endogenous ∼22 nt RNAs that are identified in many species as powerful regulators of gene expression. Experimental identification of miRNAs is still slow since miRNAs are difficult to isolate by cloning due to their low expression, low stability, tissue specificity and the high cost of the cloning procedure. Thus, computational identification of miRNAs from genomic sequences provides a valuable complement to cloning. Different approaches for identification of miRNAs have been proposed based on homology, thermodynamic parameters, and cross-species comparisons. Results The present paper focuses on the integration of miRNA classifiers in a meta-classifier and the identification of miRNAs from metagenomic sequences collected from different environments. An ensemble of classifiers is proposed for miRNA hairpin prediction based on four well-known classifiers (Triplet SVM, Mipred, Virgo and EumiR), with non-identical features, and which have been trained on different data. Their decisions are combined using a single hidden layer neural network to increase the accuracy of the predictions. Our ensemble classifier achieved 89.3% accuracy, 82.2% f–measure, 74% sensitivity, 97% specificity, 92.5% precision and 88.2% negative predictive value when tested on real miRNA and pseudo sequence data. The area under the receiver operating characteristic curve of our classifier is 0.9 which represents a high performance index. The proposed classifier yields a significant performance improvement relative to Triplet-SVM, Virgo and EumiR and a minor refinement over MiPred. The developed ensemble classifier is used for miRNA prediction in mine drainage, groundwater and marine metagenomic sequences downloaded from the NCBI Sequence Read Archive. By consulting the miRBase repository, 179 miRNAs have been identified as highly probable miRNAs. Our new approach could thus be used for mining metagenomic sequences and finding new and homologous miRNAs.
Conclusions The paper investigates a computational tool for miRNA prediction in genomic or metagenomic data. It has been applied on three metagenomic samples from different environments (mine drainage, groundwater and marine metagenomic sequences). The prediction results provide a set of highly promising miRNA hairpins for cloning prediction methods. Among the results obtained from the ensemble prediction there are pre-miRNA candidates that have been validated using miRBase while they have not been recognized by some of the base classifiers.
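The reported F-measure can be cross-checked against the reported precision and sensitivity. The sketch below assumes the F-measure here is the standard F1 score, the harmonic mean of precision and recall (sensitivity); the abstract does not state which variant was used, so this is a consistency check and not a reproduction of the authors' evaluation.

```python
def f_measure(precision, recall):
    """F1 score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Values taken from the abstract above.
reported_precision = 0.925
reported_sensitivity = 0.74   # sensitivity is the same quantity as recall

f1 = f_measure(reported_precision, reported_sensitivity)

if __name__ == "__main__":
    print(f"F-measure = {f1:.3f}")  # prints "F-measure = 0.822"
```

Under that assumption the result agrees with the 82.2% quoted in the abstract.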
|
17
|
Host susceptibility to malaria in human and mice: compatible approaches to identify potential resistant genes. Physiol Genomics 2014; 46:1-16. [DOI: 10.1152/physiolgenomics.00044.2013] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Indexed: 11/22/2022] Open
Abstract
There is growing evidence for human genetic factors controlling the outcome of malaria infection, although the molecular basis of this genetic control is still poorly understood. Case-control and family-based studies have been carried out to identify genes underlying host susceptibility to malarial infection. Parasitemia and mild malaria have been genetically linked to human chromosomes 5q31-q33 and 6p21.3, and several immune genes located within those regions have been associated with malaria-related phenotypes. Association and linkage studies of resistance to malaria are not easy to carry out in human populations, because of the difficulty in surveying a significant number of families. Murine models have proven to be an excellent genetic tool for studying host response to malaria; their use allowed mapping 14 resistance loci, eight of them controlling parasitic levels and six controlling cerebral malaria. Once quantitative trait loci or genes have been identified, the human ortholog may then be identified. Comparative mapping studies showed that a couple of human and mouse might share similar genetically controlled mechanisms of resistance. In this way, char8, which controls parasitemia, was mapped on chromosome 11; char8 corresponds to human chromosome 5q31-q33 and contains immune genes, such as Il3, Il4, Il5, Il12b, Il13, Irf1, and Csf2. Nevertheless, part of the genetic factors controlling malaria traits might differ between the two hosts because of specific host-pathogen interactions. Finally, novel genetic tools including animal models were recently developed and will offer new opportunities for identifying genetic factors underlying host phenotypic response to malaria, which will help in developing better therapeutic strategies, including vaccines and drugs.
|
18
|
International Union of Basic and Clinical Pharmacology. [corrected]. LXXXVII. Complement peptide C5a, C4a, and C3a receptors. Pharmacol Rev 2013; 65:500-43. [PMID: 23383423 DOI: 10.1124/pr.111.005223] [Citation(s) in RCA: 178] [Impact Index Per Article: 16.2] [Indexed: 01/31/2023] Open
Abstract
The activation of the complement cascade, a cornerstone of the innate immune response, produces a number of small (74-77 amino acid) fragments, originally termed anaphylatoxins, that are potent chemoattractants and secretagogues that act on a wide variety of cell types. These fragments, C5a, C4a, and C3a, participate at all levels of the immune response and are also involved in other processes such as neural development and organ regeneration. Their primary function, however, is in inflammation, so they are important targets for the development of antiinflammatory therapies. Only three receptors for complement peptides have been found, but there are no satisfactory antagonists as yet, despite intensive investigation. In humans, there is a single receptor for C3a (C3a receptor), no known receptor for C4a, and two receptors for C5a (C5a₁ receptor and C5a₂ receptor). The most recently characterized receptor, the C5a₂ receptor (previously known as C5L2 or GPR77), has been regarded as a passive binding protein, but signaling activities are now ascribed to it, so we propose that it be formally identified as a receptor and be given a name to reflect this. Here, we describe the complex biology of the complement peptides, introduce a new suggested nomenclature, and review our current knowledge of receptor pharmacology.
|
19
|
Major components of energy drinks (caffeine, taurine, and guarana) exert cytotoxic effects on human neuronal SH-SY5Y cells by decreasing reactive oxygen species production. OXIDATIVE MEDICINE AND CELLULAR LONGEVITY 2013; 2013:791795. [PMID: 23766861 PMCID: PMC3674721 DOI: 10.1155/2013/791795] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 01/26/2013] [Accepted: 03/16/2013] [Indexed: 01/06/2023]
Abstract
Scope. To elucidate the morphological and biochemical in vitro effects exerted by caffeine, taurine, and guarana, alone or in combination, since they are major components in energy drinks (EDs). Methods and Results. On human neuronal SH-SY5Y cells, caffeine (0.125–2 mg/mL), taurine (1–16 mg/mL), and guarana (3.125–50 mg/mL) showed concentration-dependent nonenzymatic antioxidant potential, decreased the basal levels of free radical generation, and reduced both superoxide dismutase (SOD) and catalase (CAT) activities, especially when combined together. However, guarana-treated cells developed signs of neurite degeneration in the form of swellings at various segments in a beaded or pearl chain-like appearance and fragmentation of such neurites at concentrations ranging from 12.5 to 50 mg/mL. Swellings, but not neuritic fragmentation, were detected when cells were treated with 0.5 mg/mL (or higher doses) of caffeine, concentrations that are present in EDs. Cells treated with guarana also showed qualitative signs of apoptosis, including membrane blebbing, cell shrinkage, and cleaved caspase-3 positivity. Flow cytometric analysis confirmed that cells treated with 12.5–50 mg/mL of guarana and its combinations with caffeine and/or taurine underwent apoptosis. Conclusion. Excessive removal of intracellular reactive oxygen species, to nonphysiological levels (or “antioxidative stress”), could be a cause of in vitro toxicity induced by these drugs.
|
20
|
MMP-REDOX/NO Interplay in Periodontitis and Its Inhibition with Satureja hortensis L. Essential Oil. Chem Biodivers 2013; 10:507-23. [DOI: 10.1002/cbdv.201200375] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/06/2012] [Indexed: 11/07/2022]
|
21
|
The western painted turtle genome, a model for the evolution of extreme physiological adaptations in a slowly evolving lineage. Genome Biol 2013; 14:R28. [PMID: 23537068 PMCID: PMC4054807 DOI: 10.1186/gb-2013-14-3-r28] [Citation(s) in RCA: 228] [Impact Index Per Article: 20.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/18/2012] [Revised: 03/15/2013] [Accepted: 03/28/2013] [Indexed: 12/29/2022] Open
Abstract
BACKGROUND We describe the genome of the western painted turtle, Chrysemys picta bellii, one of the most widespread, abundant, and well-studied turtles. We place the genome into a comparative evolutionary context, and focus on genomic features associated with tooth loss, immune function, longevity, sex differentiation and determination, and the species' physiological capacities to withstand extreme anoxia and tissue freezing. RESULTS Our phylogenetic analyses confirm that turtles are the sister group to living archosaurs, and demonstrate an extraordinarily slow rate of sequence evolution in the painted turtle. The ability of the painted turtle to withstand complete anoxia and partial freezing appears to be associated with common vertebrate gene networks, and we identify candidate genes for future functional analyses. Tooth loss shares a common pattern of pseudogenization and degradation of tooth-specific genes with birds, although the rate of accumulation of mutations is much slower in the painted turtle. Genes associated with sex differentiation generally reflect phylogeny rather than convergence in sex determination functionality. Among gene families that demonstrate exceptional expansions or show signatures of strong natural selection, immune function and musculoskeletal patterning genes are consistently over-represented. CONCLUSIONS Our comparative genomic analyses indicate that common vertebrate regulatory networks, some of which have analogs in human diseases, are often involved in the western painted turtle's extraordinary physiological capacities. As these regulatory pathways are analyzed at the functional level, the painted turtle may offer important insights into the management of a number of human health disorders.
|
22
|
Database tools in genetic diseases research. Genomics 2013; 101:75-85. [DOI: 10.1016/j.ygeno.2012.11.001] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/19/2012] [Revised: 10/26/2012] [Accepted: 11/01/2012] [Indexed: 01/22/2023]
|
23
|
Genomic perspectives in inter-individual adverse responses following nanomedicine administration: The way forward. Adv Drug Deliv Rev 2012; 64:1385-93. [PMID: 22634158 DOI: 10.1016/j.addr.2012.05.010] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/29/2012] [Accepted: 05/17/2012] [Indexed: 01/18/2023]
Abstract
The underlying mechanism of intravenous infusion-related adverse reactions inherent to regulatory-approved nanomedicines still remains elusive. There are substantial inter-individual differences in observed adverse reactions, which may include cardiovascular, broncho-pulmonary, muco-cutaneous, neuro-psychosomatic and autonomic manifestations. Although nanomedicine-mediated triggering of complement activation has been suggested to be a significant contributing factor to these adverse events, complement activation may still proceed in non-responders. Whether these reactions share similar immunological mechanisms and underpinning genetic factors with drug hypersensitivity syndrome remains to be investigated. Genetic association studies could be a powerful tool to dissect causative factors and reveal the multiple molecular pathways that induce infusion related adverse reactions. It is envisaged that such research may lead to the design of reliable in vitro profiling tests for risk assessment and treatment decisions, thereby revolutionizing the practice of medicine with nanopharmaceuticals. Such procedures may further improve regulatory approval processes for nanomedicines currently in the pipeline and decrease the overall cost of health care. Here we discuss some key innate immunity genes and their polymorphisms in relation to nanomedicine infusion-mediated symptomatic responses.
|
24
|
Gene duplicability-connectivity-complexity across organisms and a neutral evolutionary explanation. PLoS One 2012; 7:e44491. [PMID: 22984517 PMCID: PMC3439388 DOI: 10.1371/journal.pone.0044491] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/14/2012] [Accepted: 08/02/2012] [Indexed: 02/02/2023] Open
Abstract
Gene duplication has long been acknowledged by biologists as a major evolutionary force shaping genomic architectures and characteristics across the Tree of Life. Much research has been conducted on elucidating the fate of duplicated genes in a variety of organisms, as well as factors that affect a gene’s duplicability, that is, the tendency of certain genes to retain more duplicates than others. In particular, two studies have looked at the correlation between gene duplicability and its degree in a protein-protein interaction network in yeast, mouse, and human, and another has looked at the correlation between gene duplicability and its complexity (length, number of domains, etc.) in yeast. In this paper, we extend these studies to six species, and two trends emerge. There is an increase in the duplicability-connectivity correlation that agrees with the increase in the genome size as well as the phylogenetic relationship of the species. Further, the duplicability-complexity correlation seems to be constant across the species. We argue that the observed correlations can be explained by neutral evolutionary forces acting on the genomic regions containing the genes. For the duplicability-connectivity correlation, we show through simulations that an increasing trend can be obtained by adjusting parameters to approximate genomic characteristics of the respective species. Our results call for more research into factors, adaptive and non-adaptive alike, that determine a gene’s duplicability.
|
25
|
Transcriptomic signature of Leishmania infected mice macrophages: a metabolic point of view. PLoS Negl Trop Dis 2012; 6:e1763. [PMID: 22928052 PMCID: PMC3424254 DOI: 10.1371/journal.pntd.0001763] [Citation(s) in RCA: 95] [Impact Index Per Article: 7.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/21/2012] [Accepted: 06/21/2012] [Indexed: 11/23/2022] Open
Abstract
We analyzed the transcriptional signatures of mouse bone marrow-derived macrophages at different times after infection with promastigotes of the protozoan parasite Leishmania major. Ingenuity Pathway Analysis revealed that the macrophage metabolic pathways including carbohydrate and lipid metabolisms were among the most altered pathways at later time points of infection. Indeed, L. major promastigotes induced increased mRNA levels of the glucose transporter and almost all of the genes associated with glycolysis and lactate dehydrogenase, suggesting a shift to anaerobic glycolysis. On the other hand, L. major promastigotes enhanced the expression of scavenger receptors involved in the uptake of Low-Density Lipoprotein (LDL), inhibited the expression of genes coding for proteins regulating cholesterol efflux, and induced the synthesis of triacylglycerides. These data suggested that Leishmania infection disturbs cholesterol and triglyceride homeostasis and may lead to cholesterol accumulation and foam cell formation. Using Filipin and Bodipy staining, we showed cholesterol and triglyceride accumulation in infected macrophages. Moreover, Bodipy-positive lipid droplets accumulated in close proximity to parasitophorous vacuoles, suggesting that intracellular L. major may take advantage of these organelles as high-energy substrate sources. While the effect of infection on cholesterol accumulation and lipid droplet formation was independent of parasite development, our data indicate that anaerobic glycolysis is actively induced by L. major during the establishment of infection. Leishmania are obligate intracellular pathogens that develop almost exclusively in macrophages. Experimental leishmaniasis in mice is one of the most extensively studied models of intracellular infections both at the level of the parasite and host immune responses. We took advantage of the BALB/c mouse model to investigate gene expression profiles through Affymetrix oligonucleotide arrays.
In order to have a general and dynamic picture of the complex biological events that are acting in the context of Leishmania intracellular parasitism, we investigated the mouse macrophage response to initial invasion of L. major over a time course that extended from one to 24 hours post-infection. Our results reveal the alteration of several biological processes and metabolic changes. Indeed, similarly to several other pathogens, Leishmania induces cholesterol accumulation and foam cell formation, which have been confirmed by confocal microscopy experiments. Whether Leishmania parasites take advantage of this high-energy source is now under investigation. Our findings provide further understanding of host responses to Leishmania infection.
|
26
|
Nested Hierarchal Organization of Conservation for MicroRNAs and Their Putative Targets to Drosophila melanogaster. Chem Biodivers 2012; 9:945-64. [DOI: 10.1002/cbdv.201100358] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
|
27
|
Characterization of rainbow trout gonad, brain and gill deep cDNA repertoires using a Roche 454-Titanium sequencing approach. Gene 2012; 500:32-9. [PMID: 22465513 DOI: 10.1016/j.gene.2012.03.053] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/22/2011] [Revised: 03/09/2012] [Accepted: 03/12/2012] [Indexed: 11/23/2022]
Abstract
Rainbow trout, Oncorhynchus mykiss, is an important aquaculture species worldwide and, in addition to being of commercial interest, it is also a research model organism of considerable scientific importance. Because of the lack of a whole genome sequence in that species, transcriptomic analyses of this species have often been hindered. Using next-generation sequencing (NGS) technologies, we sought to fill these informational gaps. Here, using Roche 454-Titanium technology, we provide new tissue-specific cDNA repertoires from several rainbow trout tissues. Non-normalized cDNA libraries were constructed from testis, ovary, brain and gill rainbow trout tissue samples, and these different libraries were sequenced in 10 separate half-runs of 454-Titanium. Overall, we produced a total of 3 million quality sequences with an average size of 328 bp, representing more than 1 Gb of expressed sequence information. These sequences have been combined with all publicly available rainbow trout sequences, resulting in a total of 242,187 clusters of putative transcript groups and 22,373 singletons. To identify the predominantly expressed genes in different tissues of interest, we developed a Digital Differential Display (DDD) approach. This approach allowed us to characterize the genes that are predominantly expressed within each tissue of interest. Of these genes, some were already known to be tissue-specific, thereby validating our approach. Many others, however, were novel candidates, demonstrating the usefulness of our strategy and of such tissue-specific resources. This new sequence information, acquired using NGS 454-Titanium technology, deeply enriched our current knowledge of the expressed genes in rainbow trout through the identification of an increased number of tissue-specific sequences. This identification allowed a precise cDNA tissue repertoire to be characterized in several important rainbow trout tissues.
The rainbow trout contig browser can be accessed at the following publicly available web site (http://www.sigenae.org/).
|
28
|
Using gene expression information obtained by quantitative real-time PCR to evaluate Angus bulls divergently selected for feed efficiency. ANIMAL PRODUCTION SCIENCE 2012. [DOI: 10.1071/an12098] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/23/2022]
Abstract
Residual feed intake (RFI) is a measure of feed efficiency in beef cattle. Young Angus bulls from lines of cattle divergently selected for RFI were used in a gene expression profiling study of the liver. A quantitative real-time PCR (qPCR) assay was used to quantify the differentially expressed genes, and the information was used to examine the relationships between the genes and RFI and to classify the bulls into their respective RFI group. Expression of 21 genes in liver biopsies from 22 low RFI and 22 high RFI bulls was measured by qPCR. Expression of 14 of the 21 genes was significantly correlated with RFI. The expression of the genes was used in a principal component analysis from which five components were extracted. The five principal components explained 70% of the variation in the dependency structure. The first component was highly correlated (correlation coefficient of 0.69) with RFI. The genes of the glutathione S-transferase Mu family (GSTM1, GSTM2, GSTM4), protocadherin 19 (PCDH19), ATP-binding cassette transporter C4 (ABCC4) and superoxide dismutase 3 (SOD3) are in the xenobiotic pathway and were the key factors in the first principal component. This highlights the important relationship between this pathway and variation in RFI. The second and third principal components were also correlated with RFI, with correlation coefficients of –0.28 and –0.20, respectively. Two of the four important genes of the second principal component work coordinately in the signalling pathways that inhibit the insulin-stimulated insulin receptor and regulate energy metabolism. This is consistent with the observation that a positive genetic correlation exists between RFI and fatness. The important genes in the third principal component are related to the extracellular matrix activity, with low RFI bulls showing high extracellular matrix activity.
|
29
|
Subfunctionalization reduces the fitness cost of gene duplication in humans by buffering dosage imbalances. BMC Genomics 2011; 12:604. [PMID: 22168623 PMCID: PMC3280233 DOI: 10.1186/1471-2164-12-604] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/31/2011] [Accepted: 12/14/2011] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Driven essentially by random genetic drift, subfunctionalization has been identified as a possible non-adaptive mechanism for the retention of duplicate genes in small-population species, where widespread deleterious mutations are likely to cause complementary loss of subfunctions across gene copies. Through subfunctionalization, duplicates become indispensable to maintain the functional requirements of the ancestral locus. Yet, gene duplication produces a dosage imbalance in the encoded proteins and thus, as investigated in this paper, subfunctionalization must be subject to the selective forces arising from the fitness bottleneck introduced by the duplication event. RESULTS We show that, while arising from random drift, subfunctionalization must be inescapably subject to selective forces, since the diversification of expression patterns across paralogs mitigates duplication-related dosage imbalances in the concentrations of encoded proteins. Dosage imbalance effects become paramount when proteins rely on obligatory associations to maintain their structural integrity, and are expected to be weaker when protein complexation is ephemeral or adventitious. To establish the buffering effect of subfunctionalization on selection pressure, we determine the packing quality of encoded proteins, an established indicator of dosage sensitivity, and correlate this parameter with the extent of paralog segregation in humans, using species with larger populations (and more efficient selection) as controls. CONCLUSIONS Recognizing the role of subfunctionalization as a dosage-imbalance buffer in gene duplication events enabled us to reconcile its mechanistic nonadaptive origin with its adaptive role as an enabler of the evolution of genetic redundancy.
This constructive role was established in this paper by proving the following assertion: If subfunctionalization is indeed adaptive, its effect on paralog segregation should scale with the dosage sensitivity of the duplicated genes. Thus, subfunctionalization becomes adaptive in response to the selection forces arising from the fitness bottleneck imposed by gene duplication.
|
30
|
The GH18 family of chitinases: Their domain architectures, functions and evolutions. Glycobiology 2011; 22:23-34. [DOI: 10.1093/glycob/cwr092] [Citation(s) in RCA: 54] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/23/2022] Open
|
31
|
Abstract
Mobile genetic elements (MGEs) account for a significant fraction of eukaryotic genomes and are implicated in altered gene expression and disease. We present an efficient computational protocol for MGE insertion site analysis. ELAN, the suite of tools described here, uses standard techniques to identify different MGEs and their distribution on the genome. One component, DNASCANNER, analyses known insertion sites of MGEs for the presence of signals that are based on a combination of local physical and chemical properties. ISF (insertion site finder) is a machine-learning tool that incorporates information derived from DNASCANNER. ISF permits classification of a given DNA sequence as a potential insertion site or not, using a support vector machine. We have studied the genomes of Homo sapiens, Mus musculus, Drosophila melanogaster and Entamoeba histolytica via a protocol whereby DNASCANNER is used to identify a common set of statistically important signals flanking the insertion sites in the various genomes. These are used in ISF for insertion site prediction, and the current accuracy of the tool is over 65%. We find similar signals at gene boundaries and splice sites. Together, these data are suggestive of a common insertion mechanism that operates in a variety of eukaryotes.
|
32
|
The large-scale evolution by generating new genes from gene duplication; similarity and difference between monoploid and diploid organisms. J Theor Biol 2011; 278:120-6. [PMID: 21402082 DOI: 10.1016/j.jtbi.2011.03.006] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/31/2010] [Revised: 01/17/2011] [Accepted: 03/05/2011] [Indexed: 11/23/2022]
Abstract
On the basis of the concept of biological activity, large-scale evolution by generating new genes from gene duplication is theoretically compared between the monoploid organism and the diploid organism. The comparison is carried out not only for the process of generating one new gene but also for the process of generating two or more kinds of new genes from successive gene duplication. This comparison reveals the following difference in evolutionary pattern between monoploids and diploids. The monoploid organism is better suited to generating one or two new genes step by step, but its successive gene duplication is obliged to generate smaller genes because of the more severe lowering of biological activity or self-reproducing rate. This is consistent with the evolutionary pattern of prokaryotes having steadily developed chemical syntheses, O₂-releasing photosynthesis and O₂-respiration in the respective lineages. On the other hand, the diploid organism with a plural number of homologous chromosome pairs has a chance to bring together many kinds of new genes by the hybridization of variants having experienced different origins of gene duplication. Although this strategy of hybridization avoids the severe lowering of biological activity, it takes longer to establish homozygotes for a larger number of new genes. During this long period, further different types of variants accumulate in the population, and their successive hybridization sometimes yields various styles of new organisms. This evolutionary pattern explains the explosive divergence of body plans that has occasionally occurred in diploid organisms, because cell differentiation is a representative character exhibited by many kinds of genes and its evolution to a higher hierarchy constructs body plans.
|
33
|
Identification of Parkinson’s disease candidate genes using CAESAR and screening of MAPT and SNCAIP in South African Parkinson’s disease patients. J Neural Transm (Vienna) 2011; 118:889-97. [DOI: 10.1007/s00702-011-0591-z] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/21/2010] [Accepted: 01/24/2011] [Indexed: 01/08/2023]
|
34
|
NFκB inhibitors induce cell death in glioblastomas. Biochem Pharmacol 2010; 81:412-24. [PMID: 21040711 DOI: 10.1016/j.bcp.2010.10.014] [Citation(s) in RCA: 95] [Impact Index Per Article: 6.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/14/2010] [Revised: 10/21/2010] [Accepted: 10/21/2010] [Indexed: 12/16/2022]
Abstract
Identification of novel target pathways in glioblastoma (GBM) remains critical due to poor prognosis, inefficient therapies and recurrence associated with these tumors. In this work, we evaluated the role of nuclear-factor-kappa-B (NFκB) in the growth of GBM cells, and the potential of NFκB inhibitors as antiglioma agents. The NFκB pathway was found to be overstimulated in GBM cell lines and in tumor specimens compared to normal astrocytes and healthy brain tissues, respectively. Treatment of a panel of established GBM cell lines (U138MG, U87, U373 and C6) with pharmacological NFκB inhibitors (BAY117082, parthenolide, MG132, curcumin and arsenic trioxide) and NFκB-p65 siRNA markedly decreased the viability of GBMs as compared to inhibitors of other signaling pathways such as MAPKs (ERK, JNK and p38), PKC, EGFR and PI3K/Akt. In addition, NFκB inhibitors presented a low toxicity to normal astrocytes, indicating selectivity to cancerous cells. In GBMs, mitochondrial dysfunction (membrane depolarization, bcl-xL downregulation and cytochrome c release) and arrest in the G2/M phase were observed at the early steps of NFκB inhibitor treatment. These events preceded sub-G1 detection, apoptotic body formation and caspase-3 activation. Also, NFκB was found to be overstimulated in cisplatin-resistant C6 cells, and treatment of GBMs with NFκB inhibitors overcame cisplatin resistance besides potentiating the effects of the chemotherapeutics cisplatin and doxorubicin. These findings support NFκB as a potential target for cell death induction in GBMs, and suggest that NFκB inhibitors may be considered for in vivo testing on animal models and possibly for GBM therapy.
|
35
|
PathEx: a novel multi factors based datasets selector web tool. BMC Bioinformatics 2010; 11:528. [PMID: 20969778 PMCID: PMC2978222 DOI: 10.1186/1471-2105-11-528] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/14/2010] [Accepted: 10/22/2010] [Indexed: 11/27/2022] Open
Abstract
Background Microarray experiments have become very popular in life science research. However, if such experiments are only considered independently, the possibilities for analysis and interpretation of many life science phenomena are reduced. The accumulation of publicly available data provides biomedical researchers with a valuable opportunity to either discover new phenomena or improve the interpretation and validation of other phenomena that are partially understood or well known. This can only be achieved by intelligently exploiting this rich mine of information. Description Considering that technologies like microarrays remain prohibitively expensive for researchers with limited means to order their own experimental chips, it would be beneficial to re-use previously published microarray data. For certain researchers interested in finding gene groups (requiring many replicates), there is a great need for tools to help them to select appropriate datasets for analysis. These tools may be effective if, and only if, they are able to re-use previously deposited experiments or to create new experiments not initially envisioned by the depositors. However, the generation of new experiments requires that all published microarray data be completely annotated, which is not currently the case. Thus, we propose the PathEx approach. Conclusion This paper presents PathEx, a human-focused web solution built around a two-component system: one database component, enriched with relevant biological information (expression array, omics data, literature) from different sources, and another component comprising sophisticated web interfaces that allow users to perform complex dataset building queries on the contents integrated into the PathEx database.
|
36
|
The ANISEED database: digital representation, formalization, and elucidation of a chordate developmental program. Genome Res 2010; 20:1459-68. [PMID: 20647237 DOI: 10.1101/gr.108175.110] [Citation(s) in RCA: 90] [Impact Index Per Article: 6.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/01/2023]
Abstract
Developmental biology aims to understand how the dynamics of embryonic shapes and organ functions are encoded in linear DNA molecules. Thanks to recent progress in genomics and imaging technologies, systemic approaches are now used in parallel with small-scale studies to establish links between genomic information and phenotypes, often described at the subcellular level. Current model organism databases, however, do not integrate heterogeneous data sets at different scales into a global view of the developmental program. Here, we present a novel, generic digital system, NISEED, and its implementation, ANISEED, to ascidians, which are invertebrate chordates suitable for developmental systems biology approaches. ANISEED hosts an unprecedented combination of anatomical and molecular data on ascidian development. This includes the first detailed anatomical ontologies for these embryos, and quantitative geometrical descriptions of developing cells obtained from reconstructed three-dimensional (3D) embryos up to the gastrula stages. Fully annotated gene model sets are linked to 30,000 high-resolution spatial gene expression patterns in wild-type and experimentally manipulated conditions and to 528 experimentally validated cis-regulatory regions imported from specialized databases or extracted from 160 literature articles. This highly structured data set can be explored via a Developmental Browser, a Genome Browser, and a 3D Virtual Embryo module. We show how integration of heterogeneous data in ANISEED can provide a system-level understanding of the developmental program through the automatic inference of gene regulatory interactions, the identification of inducing signals, and the discovery and explanation of novel asymmetric divisions.
|
37
|
Relaxed purifying selection and possibly high rate of adaptation in primate lineage-specific genes. Genome Biol Evol 2010; 2:393-409. [PMID: 20624743 PMCID: PMC2997544 DOI: 10.1093/gbe/evq019] [Citation(s) in RCA: 85] [Impact Index Per Article: 6.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/24/2022] Open
Abstract
Genes in the same organism vary in the time since their evolutionary origin. Without horizontal gene transfer, young genes are necessarily restricted to a few closely related species, whereas old genes can be broadly distributed across the phylogeny. It has been shown that young genes evolve faster than old genes; however, the evolutionary forces responsible for this pattern remain obscure. Here, we classify human–chimp protein-coding genes into different age classes, according to the breadth of their phylogenetic distribution. We estimate the strength of purifying selection and the rate of adaptive selection for genes in different age classes. We find that older genes carry fewer and less frequent nonsynonymous single-nucleotide polymorphisms than younger genes, suggesting that older genes experience stronger purifying selection at the protein-coding level. We infer the distribution of fitness effects of new deleterious mutations and find that older genes have proportionally more slightly deleterious mutations and fewer nearly neutral mutations than younger genes. To investigate the role of adaptive selection of genes in different age classes, we determine the selection coefficient (γ = 2Nₑs) of genes using the MKPRF approach and estimate the ratio of the rate of adaptive nonsynonymous substitution to synonymous substitution (ωA) using the DoFE method. Although the proportion of positively selected genes (γ > 0) is significantly higher in younger genes, we find no correlation between ωA and gene age. Collectively, these results provide strong evidence that younger genes are subject to weaker purifying selection and more tenuous evidence that they also undergo adaptive evolution more frequently.
|
38
|
Composition and regulation of maternal and zygotic transcriptomes reflects species-specific reproductive mode. Genome Biol 2010; 11:R58. [PMID: 20515465 PMCID: PMC2911106 DOI: 10.1186/gb-2010-11-6-r58] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Received: 12/24/2009] [Revised: 04/23/2010] [Accepted: 06/01/2010] [Indexed: 11/10/2022]
Abstract
BACKGROUND Early embryos contain mRNA transcripts expressed from two distinct origins; those expressed from the mother's genome and deposited in the oocyte (maternal) and those expressed from the embryo's genome after fertilization (zygotic). The transition from maternal to zygotic control occurs at different times in different animals according to the extent and form of maternal contributions, which likely reflect evolutionary and ecological forces. Maternally deposited transcripts rely on post-transcriptional regulatory mechanisms for precise spatial and temporal expression in the embryo, whereas zygotic transcripts can use both transcriptional and post-transcriptional regulatory mechanisms. The differences in maternal contributions between animals may be associated with gene regulatory changes detectable by the size and complexity of the associated regulatory regions. RESULTS We have used genomic data to identify and compare maternal and/or zygotic expressed genes from six different animals and find evidence for selection acting to shape gene regulatory architecture in thousands of genes. We find that mammalian maternal genes are enriched for complex regulatory regions, suggesting an increase in expression specificity, while egg-laying animals are enriched for maternal genes that lack transcriptional specificity. CONCLUSIONS We propose that this lack of specificity for maternal expression in egg-laying animals indicates that a large fraction of maternal genes are expressed non-functionally, providing only supplemental nutritional content to the developing embryo. These results provide clear predictive criteria for analysis of additional genomes.
|
39
|
Unconstrained mining of transcript data reveals increased alternative splicing complexity in the human transcriptome. Nucleic Acids Res 2010; 38:4740-54. [PMID: 20385588 PMCID: PMC2919708 DOI: 10.1093/nar/gkq197] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.6] [Indexed: 12/16/2022]
Abstract
Mining massive amounts of transcript data for alternative splicing information is paramount to help understand how the maturation of RNA regulates gene expression. We developed an algorithm to cluster transcript data to annotated genes to detect unannotated splice variants. A higher number of alternatively spliced genes and isoforms were found compared to other alternative splicing databases. Comparison of human and mouse data revealed a marked increase, in human, of splice variants incorporating novel exons and retained introns. Previously unannotated exons were validated by tiling array expression data and shown to correspond preferentially to novel first exons. Retained introns were validated by tiling array and deep sequencing data. The majority of retained introns were shorter than 500 nt and had weak polypyrimidine tracts. A subset of retained introns matching small RNAs and displaying a high GC content suggests a possible coordination between splicing regulation and production of noncoding RNAs. Conservation of unannotated exons and retained introns was higher in horse, dog and cow than in rodents, and 64% of exon sequences were only found in primates. This analysis highlights previously bypassed alternative splice variants, which may be crucial to deciphering more complex pathways of gene regulation in human.
|
40
|
The chaperone-like protein HYPK acts together with NatA in cotranslational N-terminal acetylation and prevention of Huntingtin aggregation. Mol Cell Biol 2010; 30:1898-909. [PMID: 20154145 DOI: 10.1128/mcb.01199-09] [Citation(s) in RCA: 100] [Impact Index Per Article: 7.1] [Indexed: 01/22/2023]
Abstract
The human NatA protein N(alpha)-terminal-acetyltransferase complex is responsible for cotranslational N-terminal acetylation of proteins with Ser, Ala, Thr, Gly, and Val N termini. The NatA complex is composed of the catalytic subunit hNaa10p (hArd1) and the auxiliary subunit hNaa15p (hNat1/NATH). Using immunoprecipitation coupled with mass spectrometry, we identified endogenous HYPK, a Huntingtin (Htt)-interacting protein, as a novel stable interactor of NatA. HYPK has chaperone-like properties preventing Htt aggregation. HYPK, hNaa10p, and hNaa15p were associated with polysome fractions, indicating a function of HYPK associated with the NatA complex during protein translation. Knockdown of both hNAA10 and hNAA15 decreased HYPK protein levels, possibly indicating that NatA is required for the stability of HYPK. The biological importance of HYPK was evident from HYPK-knockdown HeLa cells displaying apoptosis and cell cycle arrest in the G(0)/G(1) phase. Knockdown of HYPK or hNAA10 resulted in increased aggregation of an Htt-enhanced green fluorescent protein (Htt-EGFP) fusion with expanded polyglutamine stretches, suggesting that both HYPK and NatA prevent Htt aggregation. Furthermore, we demonstrated that HYPK is required for N-terminal acetylation of the known in vivo NatA substrate protein PCNP. Taken together, the data indicate that the physical interaction between HYPK and NatA seems to be of functional importance both for Htt aggregation and for N-terminal acetylation.
|
41
|
An integrated mass-spectrometry pipeline identifies novel protein coding-regions in the human genome. PLoS One 2010; 5:e8949. [PMID: 20126623 PMCID: PMC2812506 DOI: 10.1371/journal.pone.0008949] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.9] [Received: 11/12/2009] [Accepted: 01/06/2010] [Indexed: 01/28/2023]
Abstract
BACKGROUND Most protein mass spectrometry (MS) experiments rely on searches against a database of known or predicted proteins, limiting their ability as a gene discovery tool. RESULTS Using a search against an in silico translation of the entire human genome, combined with a series of annotation filters, we identified 346 putative novel peptides [False Discovery Rate (FDR)<5%] in a MS dataset derived from two human breast epithelial cell lines. A subset of these were then successfully validated by a different MS technique. Two of these correspond to novel isoforms of Heterogeneous Ribonuclear Proteins, while the rest correspond to novel loci. CONCLUSIONS MS technology can be used for ab initio gene discovery in human data, which, since it is based on different underlying assumptions, identifies protein-coding genes not found by other techniques. As MS technology continues to evolve, such approaches will become increasingly powerful.
|
42
|
Duplication mechanism and disruptions in flanking regions determine the fate of Mammalian gene duplicates. J Comput Biol 2010; 16:1253-66. [PMID: 19772436 DOI: 10.1089/cmb.2009.0074] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Indexed: 11/12/2022]
Abstract
Here we identify duplicated genes in five mammalian genomes and classify these duplicates based on the mechanisms by which they were generated. Retrotransposition accounts for at least half of all predicted duplicate genes in these genomes, with tandem and interspersed DNA-mediated duplicates comprising the other half. Estimation of the evolutionary rates in each class revealed greater rate asymmetry between retrotransposed and interspersed DNA duplicate pairs than between tandem duplicates, suggesting that retrotransposed and interspersed DNA duplicates are diverging more quickly. In an attempt to understand the basis of this asymmetry, we identified disruption of flanking DNA as an indicator of new duplicate fate: loss of local synteny accelerates the asymmetry of divergence of interspersed DNA duplicates. We also show that intact retrogenes are enriched in intergenic regions and indel purified regions of the human genome. Moreover, intact retrogenes closest to annotated genes show the greatest levels of purifying selective pressure. Together, these findings suggest that the differential evolution of duplicate genes may be significantly influenced by changes in local genome architecture.
|
43
|
A Computation to Integrate the Analysis of Genetic Variations Occurring within Regulatory Elements and Their Possible Effects. J Comput Biol 2009; 16:1731-47. [DOI: 10.1089/cmb.2008.0247] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 01/22/2023]
|
44
|
Abstract
Ensembl (http://www.ensembl.org) integrates genomic information for a comprehensive set of chordate genomes with a particular focus on resources for human, mouse, rat, zebrafish and other high-value sequenced genomes. We provide complete gene annotations for all supported species in addition to specific resources that target genome variation, function and evolution. Ensembl data is accessible in a variety of formats including via our genome browser, API and BioMart. This year marks the tenth anniversary of Ensembl and in that time the project has grown with advances in genome technology. As of release 56 (September 2009), Ensembl supports 51 species including marmoset, pig, zebra finch, lizard, gorilla and wallaby, which were added in the past year. Major additions and improvements to Ensembl since our previous report include the incorporation of the human GRCh37 assembly, enhanced visualisation and data-mining options for the Ensembl regulatory features and continued development of our software infrastructure.
|
45
|
Abstract
MOTIVATION Exon arrays allow the quantitative study of alternative splicing (AS) on a genome-wide scale. A variety of splicing prediction methods has been proposed for Affymetrix exon arrays mainly focusing on geometric correlation measures or analysis of variance. In this article, we introduce an information theoretic concept that is based on modification of the well-known entropy function. RESULTS We have developed an AS robust prediction method based on entropy (ARH). We can show that this measure copes with bias inherent in the analysis of AS such as the dependency of prediction performance on the number of exons or variable exon expression. In order to judge the performance of ARH, we have compared it with eight existing splicing prediction methods using experimental benchmark data and demonstrate that ARH is a well-performing new method for the prediction of splice variants. AVAILABILITY AND IMPLEMENTATION ARH is implemented in R and provided in the Supplementary Material.
|
46
|
An atlas of the speed of copy number changes in animal gene families and its implications. PLoS One 2009; 4:e7342. [PMID: 19851465 PMCID: PMC2761543 DOI: 10.1371/journal.pone.0007342] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/23/2009] [Accepted: 08/28/2009] [Indexed: 01/23/2023]
Abstract
The notion that gene duplications generate new genes and functions is commonly accepted in evolutionary biology. However, this assumption rests more on theoretical speculation than on evidence from genome-wide studies. Here, we generated an atlas of the rate of copy number changes (CNCs) in all the gene families of ten animal genomes. We grouped the gene families with similar CNC dynamics into rate pattern groups (RPGs) and annotated their function using a novel bottom-up approach. By comparing CNC rate patterns, we showed that most of the species-specific CNC rate groups are formed by gene duplication rather than gene loss, and most of the changes in rates of CNCs may be the result of adaptive evolution. We also found that the functions of many RPGs match their biological significance well. Our work confirmed the role of gene duplication in generating novel phenotypes, and the results can serve as a guide for researchers to connect the phenotypic features to certain gene duplications.
|
47
|
Dynamic Proteomics: a database for dynamics and localizations of endogenous fluorescently-tagged proteins in living human cells. Nucleic Acids Res 2009; 38:D508-12. [PMID: 19820112 PMCID: PMC2808965 DOI: 10.1093/nar/gkp808] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.5] [Indexed: 01/15/2023]
Abstract
Recent advances allow tracking the levels and locations of a thousand proteins in individual living human cells over time using a library of annotated reporter cell clones (LARC). This library was created by Cohen et al. to study the proteome dynamics of a human lung carcinoma cell-line treated with an anti-cancer drug. Here, we report the Dynamic Proteomics database for the proteins studied by Cohen et al. Each cell-line clone in LARC has a protein tagged with yellow fluorescent protein, expressed from its endogenous chromosomal location, under its natural regulation. The Dynamic Proteomics interface facilitates searches for genes of interest, downloads of protein fluorescent movies and alignments of dynamics following drug addition. Each protein in the database is displayed with its annotation, cDNA sequence, fluorescent images and movies obtained by the time-lapse microscopy. The protein dynamics in the database represents a quantitative trace of the protein fluorescence levels in nucleus and cytoplasm produced by image analysis of movies over time. Furthermore, a sequence analysis provides a search and comparison of up to 50 input DNA sequences with all cDNAs in the library. The raw movies may be useful as a benchmark for developing image analysis tools for individual-cell dynamic-proteomics. The database is available at http://www.dynamicproteomics.net/.
|
48
|
Abstract
Proteins rely on associations to improve packing quality and thus maintain structural integrity. This makes packing deficiency a likely determinant of dosage sensitivity, that is, of the fitness impact of concentration imbalances relative to the stoichiometry of the protein complexes. This hypothesis was validated by examining evolution-related dosage imbalances: Duplicates of genes encoding for deficiently packed proteins are less likely to be retained than genes coding for well-packed proteins. This selection pressure is apparent in unicellular organisms, but is mitigated in higher eukaryotes. In human, this effect reveals a capacitance toward dosage imbalance. This capacitance is not expected in organisms with larger population size, where evolutionary forces are more efficient at promoting adaptive functional innovation and purifying selection, thus curbing the concentration imbalance arising from gene duplication. By examining miRNA target dissimilarities within human gene families, we show that the capacitance is operative at a post-transcriptional regulatory level: The higher the packing deficiency of a protein, the more likely that its paralogs will be dissimilarly targeted by miRNA to mitigate dosage imbalance. For families with low capacitance, paralog sequence divergence and family size correlate tightly with packing deficiency, just like in unicellular eukaryotes. Thus, a major component of human tolerance toward dosage imbalances is rooted in the paralog-discriminating capacity of miRNA regulation. The results may clarify the evolutionary etiology of aggregation-related diseases, since aggregation is often promoted by overexpression (a dosage imbalance) and aggregation propensity is associated with extreme packing deficiency.
|
49
|
PPI spider: a tool for the interpretation of proteomics data in the context of protein-protein interaction networks. Proteomics 2009; 9:2740-9. [PMID: 19405022 DOI: 10.1002/pmic.200800612] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.8] [Indexed: 11/12/2022]
Abstract
Recent advances in experimental technologies allow for the detection of a complete cell proteome. Proteins that are expressed at a particular cell state or in a particular compartment, as well as proteins with differential expression between various cell states, are commonly reported by proteomics studies. Once a list of proteins is derived, a major challenge is to interpret the identified set of proteins in the biological context. Protein-protein interaction (PPI) data represent abundant information that can be employed for this purpose. However, these data have not yet been fully exploited due to the absence of a methodological framework that can integrate this type of information. Here, we propose to infer a network model from an experimentally identified protein list based on the available information about the topology of the global PPI network. We propose to use a Monte Carlo simulation procedure to compute the statistical significance of the inferred models. The method has been implemented as a freely available web-based tool, PPI spider (http://mips.helmholtz-muenchen.de/proj/ppispider). To support the practical significance of PPI spider, we collected several hundred recently published experimental proteomics studies that reported lists of proteins in various biological contexts. We reanalyzed them using PPI spider and demonstrated that in most cases PPI spider could provide statistically significant hypotheses that are helpful for understanding the protein list.
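The Monte Carlo significance idea mentioned in this abstract can be sketched as follows. This is a generic illustration under stated assumptions, not the PPI spider algorithm: the statistic used here (number of interactions among the listed proteins) and the null model (uniform random protein sets of the same size) are assumptions chosen for simplicity, and all function names are hypothetical.

```python
import random


def count_internal_edges(proteins, edges):
    """Number of network edges whose two endpoints are both in `proteins`."""
    members = set(proteins)
    return sum(1 for a, b in edges if a in members and b in members)


def monte_carlo_p_value(protein_list, all_proteins, edges, n_iter=1000, seed=0):
    """Empirical p-value for the connectivity of `protein_list`.

    Draws `n_iter` random protein sets of the same size from the global
    network and counts how often they are at least as connected as the
    observed list. The +1 terms avoid reporting a p-value of exactly 0.
    """
    rng = random.Random(seed)
    population = sorted(set(all_proteins))  # random.sample needs a sequence
    observed = count_internal_edges(protein_list, edges)
    size = len(set(protein_list))
    hits = 0
    for _ in range(n_iter):
        sample = rng.sample(population, size)
        if count_internal_edges(sample, edges) >= observed:
            hits += 1
    return (hits + 1) / (n_iter + 1)


if __name__ == "__main__":
    # Toy network: a 4-protein clique plus a few unrelated interactions.
    proteins = [f"P{i}" for i in range(20)]
    clique = ["P0", "P1", "P2", "P3"]
    edges = [(a, b) for i, a in enumerate(clique) for b in clique[i + 1:]]
    edges += [("P5", "P9"), ("P11", "P17")]
    print(monte_carlo_p_value(clique, proteins, edges))
```

A degree-preserving null model would be a stricter choice than uniform sampling, since highly connected proteins are over-represented in many experimental lists.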
|
50
|
Genomic and transcriptional co-localization of protein-coding and long non-coding RNA pairs in the developing brain. PLoS Genet 2009; 5:e1000617. [PMID: 19696892 PMCID: PMC2722021 DOI: 10.1371/journal.pgen.1000617] [Citation(s) in RCA: 300] [Impact Index Per Article: 20.0] [Received: 12/18/2008] [Accepted: 07/24/2009] [Indexed: 11/18/2022]
Abstract
Besides protein-coding mRNAs, eukaryotic transcriptomes include many long non-protein-coding RNAs (ncRNAs) of unknown function that are transcribed away from protein-coding loci. Here, we have identified 659 intergenic long ncRNAs whose genomic sequences individually exhibit evolutionary constraint, a hallmark of functionality. Of this set, those expressed in the brain are more frequently conserved and are significantly enriched with predicted RNA secondary structures. Furthermore, brain-expressed long ncRNAs are preferentially located adjacent to protein-coding genes that are (1) also expressed in the brain and (2) involved in transcriptional regulation or in nervous system development. This led us to the hypothesis that spatiotemporal co-expression of ncRNAs and nearby protein-coding genes represents a general phenomenon, a prediction that was confirmed subsequently by in situ hybridisation in developing and adult mouse brain. We provide the full set of constrained long ncRNAs as an important experimental resource and present, for the first time, substantive and predictive criteria for prioritising long ncRNA and mRNA transcript pairs when investigating their biological functions and contributions to development and disease.
Virtually all of the eukaryotic genome is transcribed, yet far from all transcripts encode protein. Very little is known about the functions of most non-coding transcripts or, indeed, whether they convey functions at all. Among all such transcripts, we have chosen to consider long non-coding RNAs (ncRNAs) that are transcribed outside of known protein-coding gene loci. Our approach has focused on mouse long ncRNAs whose genomic sequences are conserved in humans, and also on ncRNAs that are expressed in the brain. This conservation might reflect the functionality of the underlying DNA, rather than the ncRNA, sequence. However, this cannot fully explain the concentration of predicted RNA structures in these ncRNAs. These long ncRNAs also tend to be transcribed in the genomic neighbourhood of protein-coding genes whose functions relate to transcription or to nervous system development. These observations are consistent with the positive transcriptional regulation in cis of these genes with nearby transcription of ncRNAs. This model implies co-expression of protein-coding and noncoding transcripts, a hypothesis that we validated experimentally. These findings are particularly important because they provide a rationale for prioritising specific ncRNAs when experimentally investigating regulation of protein-coding gene expression.
|