1
|
Lüleci HB, Yılmaz A. Robust and rigorous identification of tissue-specific genes by statistically extending tau score. BioData Min 2022; 15:31. [PMID: 36494766 PMCID: PMC9733102 DOI: 10.1186/s13040-022-00315-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2022] [Accepted: 11/11/2022] [Indexed: 12/13/2022] Open
Abstract
OBJECTIVES: In this study, we aimed to identify tissue-specific genes for various human tissues/organs more robustly and rigorously by extending the tau score algorithm. INTRODUCTION: Tissue-specific genes are a class of genes whose functions and expression are restricted to one or several tissues. Identification of tissue-specific genes is essential for understanding multicellular biological processes such as tissue-specific molecular regulation, tissue development, physiology, and the pathogenesis of tissue-associated diseases. MATERIALS AND METHODS: Gene expression data derived from five large RNA sequencing (RNA-seq) projects, spanning 96 different human tissues, were retrieved from ArrayExpress and ExpressionAtlas. Genes were first categorized using significant filters, with the tau score as a specificity index. After tau was calculated for each gene in each dataset separately, the statistical distance from the maximum expression level was estimated using a new procedure. The specific expression of a gene in one or several tissues was then determined by integrating tau with the statistical distance estimate, which is called the extended tau approach. The tissue-specific genes obtained for the 96 human tissues were functionally annotated, and comparisons were carried out to show the effectiveness of the extended tau method. RESULTS AND DISCUSSION: Genes were categorized by expression level, and tissue-specific genes were identified for a large number of tissues/organs. The extended tau approach successfully assigned genes to multiple tissues, whereas the original tau score can assign tissue specificity to a single tissue only.
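For readers unfamiliar with the tau score mentioned in this abstract, the following is a minimal sketch of the standard tau specificity index, tau = sum(1 - x_i / x_max) / (n - 1), computed for one gene across n tissues. It covers only the original tau score; the statistical distance step of the extended approach is not specified in the abstract and is not reproduced here. The function name `tau_index` and the handling of unexpressed genes are assumptions made for illustration.

```python
def tau_index(expression):
    """Tau specificity index for one gene across tissues.

    Returns a value between 0 (equally expressed everywhere)
    and 1 (expressed in a single tissue only).
    """
    n = len(expression)
    if n < 2:
        raise ValueError("tau needs expression values for at least two tissues")
    x_max = max(expression)
    if x_max <= 0:
        # Assumption for this sketch: a gene with no expression
        # anywhere is reported as non-specific.
        return 0.0
    # Sum of (1 - normalized expression), divided by (n - 1).
    return sum(1 - x / x_max for x in expression) / (n - 1)


# Toy expression values (hypothetical, not taken from the study).
housekeeping = tau_index([10.0, 10.0, 10.0, 10.0])
specific = tau_index([0.0, 0.0, 0.0, 8.0])
intermediate = tau_index([4.0, 8.0])
```

A gene with identical expression in every tissue scores 0, and a gene detected in exactly one tissue scores 1. Tau alone yields a single number per gene, which is why the abstract describes an additional step for assigning a gene to several tissues.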
Affiliation(s)
- Hatice Büşra Lüleci
- Department of Bioengineering, Gebze Technical University, Kocaeli, Turkey (GRID: grid.448834.7; ISNI: 0000 0004 0595 7127)
- Alper Yılmaz
- Department of Bioengineering, Yildiz Technical University, Istanbul, Turkey (GRID: grid.38575.3c; ISNI: 0000 0001 2337 3561)
|
2
|
Huminiecki Ł. Virtual Gene Concept and a Corresponding Pragmatic Research Program in Genetical Data Science. Entropy (Basel) 2021; 24:17. [PMID: 35052043 PMCID: PMC8774939 DOI: 10.3390/e24010017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2021] [Revised: 12/02/2021] [Accepted: 12/14/2021] [Indexed: 06/14/2023]
Abstract
Mendel proposed an experimentally verifiable paradigm of particle-based heredity that has been influential for over 150 years. The historical arguments have been reflected in the near past as Mendel's concept has been diversified by new types of omics data. As an effect of the accumulation of omics data, a virtual gene concept forms, giving rise to genetical data science. The concept integrates genetical, functional, and molecular features of the Mendelian paradigm. I argue that the virtual gene concept should be deployed pragmatically. Indeed, the concept has already inspired a practical research program related to systems genetics. The program includes questions about functionality of structural and categorical gene variants, about regulation of gene expression, and about roles of epigenetic modifications. The methodology of the program includes bioinformatics, machine learning, and deep learning. Education, funding, careers, standards, benchmarks, and tools to monitor research progress should be provided to support the research program.
Affiliation(s)
- Łukasz Huminiecki
- Evolutionary, Computational, and Statistical Genetics, Department of Molecular Biology, Institute of Genetics and Animal Biotechnology, Polish Academy of Sciences, Postępu 36A, Jastrzębiec, 05-552 Warsaw, Poland
|
3
|
Wang Z, Zheng H, Zhou H, Huang N, Wei X, Liu X, Teng X, Hu Z, Zhang J, Zhou X, Li W, Li J. Systematic screening and identification of novel psoriasis‑specific genes from the transcriptome of psoriasis‑like keratinocytes. Mol Med Rep 2018; 19:1529-1542. [PMID: 30592269 PMCID: PMC6390042 DOI: 10.3892/mmr.2018.9782] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 03/28/2018] [Accepted: 11/05/2018] [Indexed: 02/05/2023] Open
Abstract
Psoriasis is a chronic inflammatory skin disease. Keratinocytes (KCs), as skin‑specific cells, serve an important role in the immunopathogenesis of psoriasis. In the present study, transcriptome data derived from psoriasis‑like KCs were used together with the reported transcriptome data from the skin/epidermis of patients with psoriasis, excluding known psoriasis‑associated genes that have been well described in previous studies according to the GeneCards database, to screen for novel psoriasis‑associated genes. According to the human expressed sequence tag data of the UniGene dataset, six genes that are located near psoriasis‑associated loci were highly expressed in skin. Among these six genes, four genes (epiregulin, NIPA like domain containing 4, serpin family B member 7 and WAP four‑disulfide core domain 12) were highly expressed in normal mouse epidermis (mainly KCs) and mouse psoriatic epidermis cells, but not in psoriatic dermis cells, which further emphasized the specificity of these genes. Furthermore, in systemic inflammatory response syndrome (SIRS), SERPINB7 showed no difference in expression in immune‑activated tissues from SIRS and control mice. It was also found that the mRNA expression levels of SERPINB7 in lesional skin of patients with psoriasis were significantly higher than in non‑lesional psoriatic skin from the same patients. SERPINB7 may be a valuable candidate for further studies. In the present study, a method for identifying novel key pathogenic skin‑specific molecules is presented, which may be used for investigating and treating psoriasis.
Affiliation(s)
- Zhen Wang
- Department of Biotherapy and Cancer Center, West China Hospital, Sichuan University, and Collaborative Innovation Center for Biotherapy, Chengdu, Sichuan 610041, P.R. China
- Huaping Zheng
- Department of Biotherapy and Cancer Center, West China Hospital, Sichuan University, and Collaborative Innovation Center for Biotherapy, Chengdu, Sichuan 610041, P.R. China
- Hong Zhou
- Department of Biotherapy and Cancer Center, West China Hospital, Sichuan University, and Collaborative Innovation Center for Biotherapy, Chengdu, Sichuan 610041, P.R. China
- Nongyu Huang
- Department of Biotherapy and Cancer Center, West China Hospital, Sichuan University, and Collaborative Innovation Center for Biotherapy, Chengdu, Sichuan 610041, P.R. China
- Xiaoqiong Wei
- Department of Biotherapy and Cancer Center, West China Hospital, Sichuan University, and Collaborative Innovation Center for Biotherapy, Chengdu, Sichuan 610041, P.R. China
- Xiao Liu
- Department of Biotherapy and Cancer Center, West China Hospital, Sichuan University, and Collaborative Innovation Center for Biotherapy, Chengdu, Sichuan 610041, P.R. China
- Xiu Teng
- Department of Biotherapy and Cancer Center, West China Hospital, Sichuan University, and Collaborative Innovation Center for Biotherapy, Chengdu, Sichuan 610041, P.R. China
- Zhonglan Hu
- Department of Biotherapy and Cancer Center, West China Hospital, Sichuan University, and Collaborative Innovation Center for Biotherapy, Chengdu, Sichuan 610041, P.R. China
- Jun Zhang
- Department of Biotherapy and Cancer Center, West China Hospital, Sichuan University, and Collaborative Innovation Center for Biotherapy, Chengdu, Sichuan 610041, P.R. China
- Xikun Zhou
- Department of Biotherapy and Cancer Center, West China Hospital, Sichuan University, and Collaborative Innovation Center for Biotherapy, Chengdu, Sichuan 610041, P.R. China
- Wei Li
- Department of Dermatovenereology, West China Hospital, Sichuan University, and Collaborative Innovation Center for Biotherapy, Chengdu, Sichuan 610041, P.R. China
- Jiong Li
- Department of Biotherapy and Cancer Center, West China Hospital, Sichuan University, and Collaborative Innovation Center for Biotherapy, Chengdu, Sichuan 610041, P.R. China
|
4
|
Vingtdeux V, Chang EH, Frattini SA, Zhao H, Chandakkar P, Adrien L, Strohl JJ, Gibson EL, Ohmoto M, Matsumoto I, Huerta PT, Marambaud P. CALHM1 deficiency impairs cerebral neuron activity and memory flexibility in mice. Sci Rep 2016; 6:24250. [PMID: 27066908 PMCID: PMC4828655 DOI: 10.1038/srep24250] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 07/28/2015] [Accepted: 03/18/2016] [Indexed: 12/04/2022] Open
Abstract
CALHM1 is a cell surface calcium channel expressed in cerebral neurons. CALHM1 function in the brain remains unknown, but recent results showed that neuronal CALHM1 controls intracellular calcium signaling and cell excitability, two mechanisms required for synaptic function. Here, we describe the generation of Calhm1 knockout (Calhm1−/−) mice and investigate CALHM1 role in neuronal and cognitive functions. Structural analysis revealed that Calhm1−/− brains had normal regional and cellular architecture, and showed no evidence of neuronal or synaptic loss, indicating that CALHM1 deficiency does not affect brain development or brain integrity in adulthood. However, Calhm1−/− mice showed a severe impairment in memory flexibility, assessed in the Morris water maze, and a significant disruption of long-term potentiation without alteration of long-term depression, measured in ex vivo hippocampal slices. Importantly, in primary neurons and hippocampal slices, CALHM1 activation facilitated the phosphorylation of NMDA and AMPA receptors by protein kinase A. Furthermore, neuronal CALHM1 activation potentiated the effect of glutamate on the expression of c-Fos and C/EBPβ, two immediate-early gene markers of neuronal activity. Thus, CALHM1 controls synaptic activity in cerebral neurons and is required for the flexible processing of memory in mice. These results shed light on CALHM1 physiology in the mammalian brain.
Affiliation(s)
- Valérie Vingtdeux
- Litwin-Zucker Research Center for the Study of Alzheimer's Disease, The Feinstein Institute for Medical Research, Manhasset, NY 11030, USA
- Eric H Chang
- Laboratory of Immune & Neural Networks, The Feinstein Institute for Medical Research, Manhasset, NY 11030, USA
- Stephen A Frattini
- Laboratory of Immune & Neural Networks, The Feinstein Institute for Medical Research, Manhasset, NY 11030, USA
- Haitian Zhao
- Litwin-Zucker Research Center for the Study of Alzheimer's Disease, The Feinstein Institute for Medical Research, Manhasset, NY 11030, USA
- Pallavi Chandakkar
- Litwin-Zucker Research Center for the Study of Alzheimer's Disease, The Feinstein Institute for Medical Research, Manhasset, NY 11030, USA
- Leslie Adrien
- Litwin-Zucker Research Center for the Study of Alzheimer's Disease, The Feinstein Institute for Medical Research, Manhasset, NY 11030, USA
- Joshua J Strohl
- Laboratory of Immune & Neural Networks, The Feinstein Institute for Medical Research, Manhasset, NY 11030, USA
- Elizabeth L Gibson
- Laboratory of Immune & Neural Networks, The Feinstein Institute for Medical Research, Manhasset, NY 11030, USA
- Makoto Ohmoto
- Monell Chemical Senses Center, Philadelphia, PA 19104, USA
- Patricio T Huerta
- Laboratory of Immune & Neural Networks, The Feinstein Institute for Medical Research, Manhasset, NY 11030, USA; Department of Molecular Medicine, Hofstra Northwell School of Medicine, Manhasset, NY 11030, USA
- Philippe Marambaud
- Litwin-Zucker Research Center for the Study of Alzheimer's Disease, The Feinstein Institute for Medical Research, Manhasset, NY 11030, USA
|
5
|
Systematic identification of molecular links between core and candidate genes in breast cancer. J Mol Biol 2015; 427:1436-1450. [PMID: 25640309 DOI: 10.1016/j.jmb.2015.01.014] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 10/23/2014] [Revised: 01/22/2015] [Accepted: 01/24/2015] [Indexed: 01/07/2023]
Abstract
Despite the remarkable progress achieved in the identification of specific genes involved in breast cancer (BC), our understanding of their complex functioning is still limited. In this manuscript, we systematically explore the existence of direct physical interactions between the products of BC core and associated genes. Our aim is to generate a protein interaction network of BC-associated gene products and suggest potential molecular mechanisms to unveil their role in the disease. In total, we report 599 novel high-confidence interactions among 44 BC core, 54 BC candidate/associated and 96 newly identified proteins. Our findings indicate that this network-based approach is indeed a robust inference tool to pinpoint new potential players and gain insight into the underlying mechanisms of those proteins with previously unknown roles in BC. To illustrate the power of our approach, we provide initial validation of two BC-associated proteins on the alteration of DNA damage response as a result of specific re-wiring interactions. Overall, our BC-related network may serve as a framework to integrate clinical and molecular data and foster novel global therapeutic strategies.
|
6
|
Vingtdeux V, Tanis JE, Chandakkar P, Zhao H, Dreses-Werringloer U, Campagne F, Foskett JK, Marambaud P. Effect of the CALHM1 G330D and R154H human variants on the control of cytosolic Ca2+ and Aβ levels. PLoS One 2014; 9:e112484. [PMID: 25386646 PMCID: PMC4227689 DOI: 10.1371/journal.pone.0112484] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 07/10/2014] [Accepted: 10/06/2014] [Indexed: 11/18/2022] Open
Abstract
CALHM1 is a plasma membrane voltage-gated Ca2+-permeable ion channel that controls amyloid-β (Aβ) metabolism and is potentially involved in the onset of Alzheimer's disease (AD). Recently, Rubio-Moscardo et al. (PLoS One (2013) 8: e74203) reported the identification of two CALHM1 variants, G330D and R154H, in early-onset AD (EOAD) patients. The authors provided evidence that these two human variants were rare and resulted in a complete loss of CALHM1 function. Recent publicly available large-scale exome sequencing data confirmed that R154H is a rare CALHM1 variant (minor allele frequency (MAF) = 0.015%), but that G330D is not (MAF = 3.5% in an African American cohort). Here, we show that both CALHM1 variants exhibited gating and permeation properties indistinguishable from wild-type CALHM1 when expressed in Xenopus oocytes. While there was also no effect of the G330D mutation on Ca2+ uptake by CALHM1 in transfected mammalian cells, the R154H mutation was associated with defects in the control by CALHM1 of both Ca2+ uptake and Aβ levels in this cell system. Together, our data show that the frequent CALHM1 G330D variant has no obvious functional consequences and is therefore unlikely to contribute to EOAD. Our data also demonstrate that the rare R154H variant interferes with CALHM1 control of cytosolic Ca2+ and Aβ accumulation. While these results strengthen the notion that CALHM1 influences Aβ metabolism, further investigation will be required to determine whether CALHM1 R154H, or other natural variants in CALHM1, is/are associated with EOAD.
Affiliation(s)
- Valérie Vingtdeux
- Litwin-Zucker Research Center for the Study of Alzheimer's Disease, The Feinstein Institute for Medical Research, Manhasset, NY, United States of America
- Jessica E. Tanis
- Department of Physiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States of America
- Pallavi Chandakkar
- Litwin-Zucker Research Center for the Study of Alzheimer's Disease, The Feinstein Institute for Medical Research, Manhasset, NY, United States of America
- Haitian Zhao
- Litwin-Zucker Research Center for the Study of Alzheimer's Disease, The Feinstein Institute for Medical Research, Manhasset, NY, United States of America
- Ute Dreses-Werringloer
- Litwin-Zucker Research Center for the Study of Alzheimer's Disease, The Feinstein Institute for Medical Research, Manhasset, NY, United States of America
- Fabien Campagne
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, The Weill Cornell Medical College, New York, NY, United States of America
- J. Kevin Foskett
- Department of Physiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States of America
- Department of Cell and Developmental Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States of America
- Philippe Marambaud
- Litwin-Zucker Research Center for the Study of Alzheimer's Disease, The Feinstein Institute for Medical Research, Manhasset, NY, United States of America
|
7
|
Feichtinger J, McFarlane RJ, Larcombe LD. CancerEST: a web-based tool for automatic meta-analysis of public EST data. Database (Oxford) 2014; 2014:bau024. [PMID: 24715218 PMCID: PMC3978373 DOI: 10.1093/database/bau024] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 05/01/2013] [Revised: 02/26/2014] [Accepted: 02/27/2014] [Indexed: 11/23/2022]
Abstract
The identification of cancer-restricted biomarkers is fundamental to the development of novel cancer therapies and diagnostic tools. The construction of comprehensive profiles to define tissue- and cancer-specific gene expression has been central to this. To this end, the exploitation of the current wealth of 'omic'-scale databases can be facilitated by automated approaches, allowing researchers to directly address specific biological questions. Here we present CancerEST, a user-friendly and intuitive web-based tool for the automated identification of candidate cancer markers/targets, for examining tissue specificity as well as for integrated expression profiling. CancerEST operates by means of constructing and meta-analyzing expressed sequence tag (EST) profiles of user-supplied gene sets across an EST database supporting 36 tissue types. Using a validation data set from the literature, we show the functionality and utility of CancerEST. DATABASE URL: http://www.cancerest.org.uk.
Affiliation(s)
- Julia Feichtinger
- North West Cancer Research Institute, Bangor University, Bangor, Gwynedd LL57 2UW, UK, Institute for Genomics and Bioinformatics, Graz University of Technology, Petersgasse 14, 8010 Graz, Austria, Core Facility Bioinformatics, Austrian Centre of Industrial Biotechnology, Petersgasse 14, 8010 Graz, Austria, NISCHR Cancer Genetics Biomedical Research Unit, Bangor University, Bangor, Gwynedd LL57 2UW, UK, Liverpool Cancer Research UK Centre, University of Liverpool, Liverpool, Merseyside L3 9TA, UK and Applied Mathematics and Computing Group, Cranfield University, Cranfield, Bedfordshire MK43 0AL, UK
- Ramsay J. McFarlane
- North West Cancer Research Institute, Bangor University, Bangor, Gwynedd LL57 2UW, UK, Institute for Genomics and Bioinformatics, Graz University of Technology, Petersgasse 14, 8010 Graz, Austria, Core Facility Bioinformatics, Austrian Centre of Industrial Biotechnology, Petersgasse 14, 8010 Graz, Austria, NISCHR Cancer Genetics Biomedical Research Unit, Bangor University, Bangor, Gwynedd LL57 2UW, UK, Liverpool Cancer Research UK Centre, University of Liverpool, Liverpool, Merseyside L3 9TA, UK and Applied Mathematics and Computing Group, Cranfield University, Cranfield, Bedfordshire MK43 0AL, UK
- Lee D. Larcombe
- North West Cancer Research Institute, Bangor University, Bangor, Gwynedd LL57 2UW, UK, Institute for Genomics and Bioinformatics, Graz University of Technology, Petersgasse 14, 8010 Graz, Austria, Core Facility Bioinformatics, Austrian Centre of Industrial Biotechnology, Petersgasse 14, 8010 Graz, Austria, NISCHR Cancer Genetics Biomedical Research Unit, Bangor University, Bangor, Gwynedd LL57 2UW, UK, Liverpool Cancer Research UK Centre, University of Liverpool, Liverpool, Merseyside L3 9TA, UK and Applied Mathematics and Computing Group, Cranfield University, Cranfield, Bedfordshire MK43 0AL, UK
|
8
|
Charting the molecular links between driver and susceptibility genes in colorectal cancer. Biochem Biophys Res Commun 2014; 445:734-8. [DOI: 10.1016/j.bbrc.2013.12.012] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 10/29/2013] [Accepted: 12/02/2013] [Indexed: 12/16/2022]
|
9
|
|
10
|
Dreses-Werringloer U, Vingtdeux V, Zhao H, Chandakkar P, Davies P, Marambaud P. CALHM1 controls the Ca²⁺-dependent MEK, ERK, RSK and MSK signaling cascade in neurons. J Cell Sci 2013; 126:1199-206. [PMID: 23345406 DOI: 10.1242/jcs.117135] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Indexed: 01/17/2023] Open
Abstract
Calcium homeostasis modulator 1 (CALHM1) is a Ca²⁺ channel controlling neuronal excitability and potentially involved in the pathogenesis of Alzheimer's disease (AD). Although strong evidence indicates that CALHM1 is required for neuronal electrical activity, its role in intracellular Ca²⁺ signaling remains unknown. In the present study, we show that in hippocampal HT-22 cells, CALHM1 expression led to a robust and relatively selective activation of the Ca²⁺-sensing kinases ERK1/2. CALHM1 also triggered activation of MEK1/2, the upstream ERK1/2-activating kinases, and of RSK1/2/3 and MSK1, two downstream effectors of ERK1/2 signaling. CALHM1-mediated activation of ERK1/2 signaling was controlled by the small GTPase Ras. Pharmacological inhibition of CALHM1 permeability using Ruthenium Red, Zn²⁺, and Gd³⁺, or expression of the CALHM1 N140A and W114A mutants, which are deficient in mediating Ca²⁺ influx, prevented the effect of CALHM1 on the MEK, ERK, RSK and MSK signaling cascade, demonstrating that CALHM1 controlled this pathway via its channel properties. Importantly, expression of CALHM1 bearing the natural P86L polymorphism, which leads to a partial loss of CALHM1 function and is associated with an earlier age at onset in AD patients, showed reduced activation of ERK1/2, RSK1/2/3, and MSK1. In line with these results obtained in transfected cells, primary cerebral neurons isolated from Calhm1 knockout mice showed significant impairments in the activation of MEK, ERK, RSK and MSK signaling. The present study identifies a previously uncharacterized mechanism of control of Ca²⁺-dependent ERK1/2 signaling in neurons, and further establishes CALHM1 as a critical ion channel for neuronal signaling and function.
Affiliation(s)
- Ute Dreses-Werringloer
- Litwin-Zucker Research Center for the Study of Alzheimer's Disease, The Feinstein Institute for Medical Research, Manhasset, New York, USA
|
11
|
Chapuis J, Vingtdeux V, Capiralla H, Davies P, Marambaud P. Gas1 interferes with AβPP trafficking by facilitating the accumulation of immature AβPP in endoplasmic reticulum-associated raft subdomains. J Alzheimers Dis 2012; 28:127-35. [PMID: 21971401 DOI: 10.3233/jad-2011-110434] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 12/26/2022]
Abstract
The amyloid-β protein precursor (AβPP) is a type I transmembrane protein that undergoes maturation during trafficking in the secretory pathway. Proper maturation and trafficking of AβPP are necessary prerequisites for AβPP processing to generate amyloid-β (Aβ), the core component of Alzheimer's disease senile plaques. Recently, we reported that the glycosylphosphatidylinositol (GPI)-anchored protein growth arrest-specific 1 (Gas1) binds to and interferes with the maturation and processing of AβPP. Gas1 expression led to a trafficking blockade of AβPP between the endoplasmic reticulum (ER) and the Golgi. GPI-anchored proteins can exit the ER by transiting through raft subdomains acting as specialized sorting platforms. Here, we show that Gas1 co-partitioned and formed a complex with AβPP in raft fractions, wherein Gas1 overexpression triggered immature AβPP accumulation. Pharmacological interference of ER to Golgi transport increased immature AβPP accumulation upon Gas1 expression in these raft fractions, which were found to be positive for the COPII protein complex component Sec31A, a specific marker for ER exit sites. Furthermore, a Gas1 mutant lacking the GPI anchor that could not transit through rafts was still able to form a complex with AβPP but did not lead to immature AβPP accumulation in rafts. Together these data show that Gas1 interfered with AβPP trafficking by interacting with AβPP to facilitate its translocation into specialized ER-associated rafts where immature AβPP accumulated.
Affiliation(s)
- Julien Chapuis
- Litwin-Zucker Research Center for the Study of Alzheimer's Disease, The Feinstein Institute for Medical Research, Manhasset, NY 11030, USA
|
12
|
Milnthorpe AT, Soloviev M. The use of EST expression matrixes for the quality control of gene expression data. PLoS One 2012; 7:e32966. [PMID: 22412959 PMCID: PMC3297614 DOI: 10.1371/journal.pone.0032966] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 08/01/2011] [Accepted: 02/06/2012] [Indexed: 01/10/2023] Open
Abstract
EST expression profiling provides an attractive tool for studying differential gene expression, but cDNA libraries' origins and EST data quality are not always known or reported. Libraries may originate from pooled or mixed tissues; EST clustering, EST counts, library annotations and analysis algorithms may contain errors. Traditional data analysis methods, including research into tissue-specific gene expression, assume EST counts to be correct and libraries to be correctly annotated, which is not always the case. Therefore, a method capable of assessing the quality of expression data based on that data alone would be invaluable for assessing the quality of EST data and determining their suitability for mRNA expression analysis. Here we report an approach to the selection of a small generic subset of 244 UniGene clusters suitable for identification of the tissue of origin for EST libraries and quality control of the expression data using EST expression information alone. We created a small expression matrix of UniGene IDs using two rounds of selection followed by two rounds of optimisation. Our selection procedures differ from traditional approaches to finding "tissue-specific" genes, and our matrix yields consistently high positive correlation values for libraries with confirmed tissues of origin and can be applied for tissue typing and quality control of libraries as small as just a few hundred total ESTs. Furthermore, we can pick up tissue correlations between related tissues, e.g. brain and peripheral nervous tissue, or heart and muscle tissues, and identify tissue origins for a few libraries of uncharacterised tissue identity. It was possible to confirm tissue identity for some libraries which have been derived from cancer tissues or have been normalised. Tissue matching is affected strongly by cancer progression or library normalisation, and our approach may potentially be applied for elucidating the stage of normalisation in normalised libraries or for cancer staging.
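The correlation-based tissue typing described in this abstract can be illustrated with a small sketch. This is not the authors' matrix or procedure: it assumes, for illustration only, that each tissue has a reference profile of EST counts over a shared set of marker clusters, and that a library is assigned to the tissue whose profile gives the highest Pearson correlation. The names `pearson` and `best_matching_tissue` and all numbers are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Assumption for this sketch: a constant profile carries no
        # tissue signal, so it is scored as uncorrelated.
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def best_matching_tissue(library_counts, reference_profiles):
    """Return the best-correlated tissue and all per-tissue scores."""
    scores = {
        tissue: pearson(library_counts, profile)
        for tissue, profile in reference_profiles.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical counts over four marker clusters (not real UniGene data).
reference_profiles = {
    "brain": [9, 1, 0, 2],
    "liver": [0, 8, 7, 1],
}
library_counts = [18, 3, 1, 5]
tissue, scores = best_matching_tissue(library_counts, reference_profiles)
```

In this toy case the library correlates positively with the brain profile and negatively with the liver profile, so it is typed as brain. A library whose best score is still low could be flagged for closer inspection, in line with the quality-control use the abstract describes.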
Affiliation(s)
- Andrew T. Milnthorpe
- School of Biological Sciences, CBMS, Royal Holloway University of London, Egham, Surrey, United Kingdom
- Mikhail Soloviev
- School of Biological Sciences, CBMS, Royal Holloway University of London, Egham, Surrey, United Kingdom
|
13
|
Fierro AC, Vandenbussche F, Engelen K, Van de Peer Y, Marchal K. Meta Analysis of Gene Expression Data within and Across Species. Curr Genomics 2011; 9:525-34. [PMID: 19516959 PMCID: PMC2694560 DOI: 10.2174/138920208786847935] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Received: 06/28/2008] [Revised: 07/07/2008] [Accepted: 07/18/2008] [Indexed: 01/15/2023] Open
Abstract
Since the second half of the 1990s, a large number of genome-wide analyses have been described that study gene expression at the transcript level. To this end, two major strategies have been adopted, a first one relying on hybridization techniques such as microarrays, and a second one based on sequencing techniques such as serial analysis of gene expression (SAGE), cDNA-AFLP, and analysis based on expressed sequence tags (ESTs). Despite both types of profiling experiments becoming routine techniques in many research groups, their application remains costly and laborious. As a result, the number of conditions profiled in individual studies is still relatively small and usually varies from only two to few hundreds of samples for the largest experiments. More and more, scientific journals require the deposit of these high throughput experiments in public databases upon publication. Mining the information present in these databases offers molecular biologists the possibility to view their own small-scale analysis in the light of what is already available. However, so far, the richness of the public information remains largely unexploited. Several obstacles such as the correct association between ESTs and microarray probes with the corresponding gene transcript, the incompleteness and inconsistency in the annotation of experimental conditions, and the lack of standardized experimental protocols to generate gene expression data, all impede the successful mining of these data. Here, we review the potential and difficulties of combining publicly available expression data from respectively EST analyses and microarray experiments. With examples from literature, we show how meta-analysis of expression profiling experiments can be used to study expression behavior in a single organism or between organisms, across a wide range of experimental conditions. We also provide an overview of the methods and tools that can aid molecular biologists in exploiting these public data.
Affiliation(s)
- Ana C Fierro
- Department of Microbial and Molecular Systems, Katholieke Universiteit Leuven, Kasteelpark Arenberg 20, 3001 Leuven, Belgium
|
14
|
Chapuis J, Vingtdeux V, Campagne F, Davies P, Marambaud P. Growth arrest-specific 1 binds to and controls the maturation and processing of the amyloid-beta precursor protein. Hum Mol Genet 2011; 20:2026-36. [PMID: 21357679 DOI: 10.1093/hmg/ddr085] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 01/08/2023] Open
Abstract
Alzheimer's disease (AD), the most common neurodegenerative disorder, is characterized by cerebral deposition of amyloid-β (Aβ), a series of peptides derived from the processing of the amyloid-β precursor protein (APP). To identify new candidate genes for AD, we recently performed a transcriptome analysis to screen for genes preferentially expressed in the hippocampus and located in AD linkage regions. This strategy identified CALHM1 (calcium homeostasis modulator 1), a gene modulating AD age at onset and Aβ metabolism. Here, we focused our attention on another candidate identified using this screen, growth arrest-specific 1 (Gas1), a gene involved in central nervous system development. We found that Gas1 formed a complex with APP and controlled APP maturation and processing. Gas1 expression inhibited full glycosylation of APP and its routing to the cell surface by blocking APP trafficking between the endoplasmic reticulum and the Golgi. Gas1 expression also resulted in a robust inhibition of APP transport into multivesicular bodies, further demonstrating that Gas1 negatively regulated APP intracellular trafficking. Consequently, Gas1 overexpression led to a reduction in Aβ production, and conversely, Gas1 silencing in cells endogenously expressing Gas1 increased Aβ levels. These results suggest that Gas1 is a novel APP-interacting protein involved in the control of APP maturation and processing.
Affiliation(s)
- Julien Chapuis
- Litwin-Zucker Research Center for the Study of Alzheimer's Disease, The Feinstein Institute for Medical Research, North Shore-LIJ, Manhasset, NY 11030, USA
|
15
|
Funari VA, Voevodski K, Leyfer D, Yerkes L, Cramer D, Tolan DR. Quantitative gene expression profiles in real time from expressed sequence tag databases. Gene Expr 2010; 14:321-36. [PMID: 20635574 PMCID: PMC2954622 DOI: 10.3727/105221610x12717040569820] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/24/2022]
Abstract
An accumulation of expressed sequence tag (EST) data in the public domain and the availability of bioinformatic programs have made EST gene expression profiling a common practice. However, the utility and validity of using EST databases (e.g., dbEST) have been criticized, particularly for quantitative assessment of gene expression. Problems with EST sequencing errors, library construction, EST annotation, and multiple paralogs make generation of specific and sensitive qualitative and quantitative expression profiles a concern. In addition, most EST-derived expression data exist in previously assembled databases. The Virtual Northern Blot (VNB) (http://tlab.bu.edu/vnb.html) allows generation, evaluation, and optimization of expression profiles in real time, which is especially important for alternatively spliced, novel, or poorly characterized genes. Representative gene families with variable nucleotide sequence identity, tissue specificity, and levels of expression (bcl-xl, aldoA, and cyp2d9) are used to assess the quality of VNB's output. The profiles generated by VNB are more sensitive and specific than those constructed with ESTs listed in preindexed databases at UCSC and NCBI. Moreover, quantitative expression profiles produced by VNB are comparable to quantization obtained from Northern blots and qPCR. The VNB pipeline generates real-time gene expression profiles for single-gene queries that are both qualitatively and quantitatively reliable.
Affiliation(s)
- Dimitry Leyfer
- Bioinformatics Program, Boston University, Boston, MA, USA
- Laura Yerkes
- Biology Department, Boston University, Boston, MA, USA
- Donald Cramer
- Biology Department, Boston University, Boston, MA, USA
- Dean R. Tolan
- Biology Department, Boston University, Boston, MA, USA
- Bioinformatics Program, Boston University, Boston, MA, USA
|
16
|
Kumar A, Muzik O, Chugani D, Chakraborty P, Chugani HT. PET-derived biodistribution and dosimetry of the benzodiazepine receptor-binding radioligand (11)C-(R)-PK11195 in children and adults. J Nucl Med 2009; 51:139-44. [PMID: 20008990 DOI: 10.2967/jnumed.109.066472] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 11/16/2022] Open
Abstract
The PET tracer (11)C-(R)-PK11195 (PK) is an antagonist of the peripheral-type benzodiazepine binding site and allows the noninvasive imaging of microglial activation seen in several neurologic disorders affecting the mature and developing brain. The objective of this study was to derive the biodistribution and in vivo radiation dose estimates of PK in children studied for brain inflammatory conditions and in healthy adults. METHODS Twenty-two children (mean age +/- SD, 9.5 +/- 4 y; range, 4-17 y; 10 girls) who underwent dynamic PK PET for conditions involving brain inflammation were studied. Seven healthy adults (age, 27.4 +/- 7.5 y; range, 22-41 y; 3 women) were evaluated using the same protocol. Normal-organ time-activity curves and residence times were derived, and absorbed doses were then calculated using the OLINDA software. Two other healthy young adults (1 man, 1 woman) also underwent sequential whole-body PET using a PET/CT scanner to obtain corresponding CT images and PK pharmacokinetics. RESULTS PK uptake was highest in the gallbladder and urinary bladder, followed by the liver, kidney, bone marrow, salivary gland, and heart wall, with minimal localization in all other organs including normal brain and lungs. PK was excreted through the hepatobiliary and renal systems. The average effective dose equivalent was 11.6 +/- 0.6 microSv/MBq (mean +/- SD) for young children (age, 4-7 y), 7.7 +/- 1.0 microSv/MBq for older children (age, 8-12 y), 5.3 +/- 0.5 microSv/MBq for adolescents (age, 13-17 y), and 4.6 +/- 2.7 microSv/MBq for adults. The gallbladder wall received the highest radiation dose in children younger than 12 y, whereas the urinary bladder wall received the highest dose in older children and adults. For an administered activity of 17 MBq/kg (0.45 mCi/kg), the effective dose equivalent was about 5 mSv or below for all age groups. CONCLUSION At clinically practical administered activities, the radiation dose from (11)C-PK11195 in both children and adults is comparable to that from other clinical PET tracers and diagnostic radiopharmaceuticals in routine clinical use.
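The dose figures in this abstract follow from simple arithmetic: effective dose equals the dose coefficient (microSv/MBq) multiplied by the administered activity (MBq). The sketch below illustrates that calculation in Python. The coefficients are the values quoted in the abstract; the function names and the 20 kg body mass are hypothetical and chosen only for illustration, since the abstract does not report body masses.

```python
MBQ_PER_MCI = 37.0  # 1 mCi = 37 MBq (definition of the curie)

# Mean effective dose equivalent coefficients quoted in the abstract (microSv/MBq).
DOSE_COEFF_USV_PER_MBQ = {
    "young children (4-7 y)": 11.6,
    "older children (8-12 y)": 7.7,
    "adolescents (13-17 y)": 5.3,
    "adults": 4.6,
}


def effective_dose_msv(coeff_usv_per_mbq, activity_mbq_per_kg, body_mass_kg):
    """Effective dose in mSv: coefficient x (activity per kg x body mass)."""
    administered_mbq = activity_mbq_per_kg * body_mass_kg
    return coeff_usv_per_mbq * administered_mbq / 1000.0  # microSv -> mSv


def mbq_to_mci(mbq):
    """Convert an activity from MBq to mCi."""
    return mbq / MBQ_PER_MCI


# Hypothetical example: a 20 kg child in the youngest age group (mass is an assumption).
coeff = DOSE_COEFF_USV_PER_MBQ["young children (4-7 y)"]
print(round(effective_dose_msv(coeff, 17.0, 20.0), 2))  # 3.94 (mSv)

# 17 MBq/kg expressed in mCi/kg; the abstract quotes this as 0.45 mCi/kg.
print(round(mbq_to_mci(17.0), 2))  # 0.46
```

With the assumed body mass, the result stays below the "about 5 mSv" figure given in the abstract; other masses would need to be checked against the relevant age-group coefficient.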
Affiliation(s)
- Ajay Kumar
- Department of Pediatrics, School of Medicine, Wayne State University PET Center, Children's Hospital of Michigan, Detroit Medical Center, Detroit, Michigan 48201, USA.
|
17
|
Luoto P, Laitinen I, Suilamo S, Någren K, Roivainen A. Human dosimetry of carbon-11 labeled N-butan-2-yl-1-(2-chlorophenyl)-N-methylisoquinoline-3-carboxamide extrapolated from whole-body distribution kinetics and radiometabolism in rats. Mol Imaging Biol 2009; 12:435-42. [PMID: 19941083 DOI: 10.1007/s11307-009-0293-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 06/24/2009] [Revised: 07/23/2009] [Accepted: 07/29/2009] [Indexed: 11/25/2022]
Abstract
PURPOSE Carbon-11 labeled N-butan-2-yl-1-(2-chlorophenyl)-N-methylisoquinoline-3-carboxamide ([11C]PK11195) is a peripheral benzodiazepine receptor (PBR) antagonist that is used as a positron emission tomography (PET) radiopharmaceutical for neuroinflammatory imaging. This study was designed to investigate the radiation dosimetry of [11C]PK11195. PROCEDURES Whole-body distribution kinetics of intravenously administered [11C]PK11195 in rats was assessed by means of dynamic PET imaging, and estimates for human radiation dosimetry were calculated. Rat plasma and various tissue homogenates obtained at different time points after intravenous injection of [11C]PK11195 were analyzed by a reversed-phase gradient radio-HPLC method using online radiodetection. In addition, in vitro stability of [11C]PK11195 was determined in rat brain homogenate by incubation at +37 °C. RESULTS PET imaging of rats showed the highest radioactivity levels in heart, kidneys, thyroid gland, liver, and lungs. The radioactivity cleared rapidly from lungs and slowly from heart and liver. However, much of the radioactivity was retained in kidneys, which was in concordance with the observed low urinary excretion of [11C]PK11195. Extrapolating from the rat data, the effective dose of [11C]PK11195 for a 70-kg man was estimated to be 4.2 +/- 0.3 microSv/MBq. Five different radiometabolites were detected in rat plasma, and the level of intact [11C]PK11195 decreased from 80% +/- 11% (mean +/- SD) at 10 min to 44% +/- 5% at 40 min after injection. In rat heart, brain, kidney, and lung homogenates, more than 90% of total radioactivity originated from intact [11C]PK11195. In liver, however, the amount of [11C]PK11195 was approximately 70% and decreased over time, indicating metabolism by liver enzymes. CONCLUSIONS [11C]PK11195 showed a fast uptake in many rat tissues and it was metabolized relatively fast in vivo, but not in brain in vitro. The estimated effective dose for humans supports the use of [11C]PK11195 in human PET imaging.
Affiliation(s)
- Pauliina Luoto
- Turku PET Center, University of Turku, FI-20521, Turku, Finland.
|
18
|
Kogenaru S, del Val C, Hotz-Wagenblatt A, Glatting KH. TissueDistributionDBs: a repository of organism-specific tissue-distribution profiles. Theor Chem Acc 2009. [DOI: 10.1007/s00214-009-0670-5] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 01/06/2023]
|
19
|
Abstract
INTRODUCTION Studies suggest that there is a considerable genetic contribution to individual episodic memory performance. Identifying genes which impact recollection may further elucidate an emerging biology and pave the way towards novel cognitive interventions. To date, several candidate genes have been explored and a few seem to have modest but measurable effects. METHODS Here we review the biology of memory with particular focus on episodic memory, critically appraise the published evidence supporting the role of several candidate genes, and make suggestions for future pathways of research. RESULTS We found moderate evidence for several candidate genes implicated in episodic memory formation, with converging lines of neurobiologic evidence especially strong for only a select few. Perhaps unexpectedly, little work has been done on other aspects of memory, including the semantic and autobiographical systems. CONCLUSIONS Larger studies utilizing more elaborate methodologies to measure the spectrum of episodic memory are required to move the field forward.
Affiliation(s)
- Jeremy Koppel
- The Litwin-Zucker Research Center for the Study of Alzheimer's Disease and Memory Disorders, The Feinstein Institute for Medical Research, Manhasset, NY 11030, USA.
|
20
|
Van Deun K, Hoijtink H, Thorrez L, Van Lommel L, Schuit F, Van Mechelen I. Testing the hypothesis of tissue selectivity: the intersection-union test and a Bayesian approach. Bioinformatics 2009; 25:2588-94. [PMID: 19671693 PMCID: PMC2752611 DOI: 10.1093/bioinformatics/btp439] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 11/23/2022] Open
Abstract
Motivation: Finding genes that are preferentially expressed in a particular tissue or condition is a problem that cannot be solved by standard statistical testing procedures. A relatively unknown procedure that can be used is the intersection–union test (IUT). However, two disadvantages of the IUT are that it is conservative and that it conveys only the information of the least differing target tissue–other tissue pair. Results: We propose a Bayesian procedure that quantifies how much evidence there is in the overall expression profile for selective over-expression. In a small simulation study, it is shown that the proposed method outperforms the IUT when it comes to finding selectively expressed genes. An application to publicly available data consisting of 22 tissues shows that the Bayesian method indeed selects genes with functions that reflect the specific tissue functions. The proposed method can also be used to find genes that are underexpressed in a particular tissue. Availability: Both MATLAB and R code that efficiently implement the IUT and the Bayesian procedure can be downloaded at http://ppw.kuleuven.be/okp/software/BayesianIUT/. Contact: katrijn.vandeun@psy.kuleuven.be
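The decision rule of the intersection–union test can be illustrated with a short Python sketch. It assumes that one-sided p-values for "target tissue expressed higher than tissue X" have already been computed for every other tissue; the IUT then declares selectivity only if every pairwise test rejects, which is equivalent to comparing the largest pairwise p-value to the significance level. This also shows why the IUT reflects only the least differing pair. The function name, tissue names, and p-values are hypothetical, and this is not the authors' MATLAB or R implementation.

```python
def intersection_union_test(pairwise_p_values, alpha=0.05):
    """Combine one-sided 'target > other tissue' tests into an IUT decision.

    pairwise_p_values maps each non-target tissue to the p-value of the test
    that expression in the target tissue exceeds expression in that tissue.
    Returns (IUT p-value, limiting tissue, whether selectivity is declared).
    """
    if not pairwise_p_values:
        raise ValueError("need at least one target-vs-other comparison")
    # The IUT p-value is the largest pairwise p-value: the least differing pair.
    limiting_tissue = max(pairwise_p_values, key=pairwise_p_values.get)
    iut_p = pairwise_p_values[limiting_tissue]
    return iut_p, limiting_tissue, iut_p <= alpha


# Hypothetical p-values for one gene with a single target tissue.
p_values = {"liver": 0.001, "kidney": 0.0004, "lung": 0.03}
iut_p, limiting, selective = intersection_union_test(p_values)
print(iut_p, limiting, selective)  # 0.03 lung True
```

Because only the maximum enters the decision, strong evidence against the other tissues cannot compensate for one weak comparison, which is the conservativeness the abstract refers to.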
Affiliation(s)
- K Van Deun
- Center for Computational Systems Biology SymBioSys, Katholieke Universiteit Leuven, 3000 Leuven, Belgium.
|
21
|
Uchida S, Schneider A, Wiesnet M, Jungblut B, Zarjitskaya P, Jenniches K, Kreymborg KG, Seeger W, Braun T. An integrated approach for the systematic identification and characterization of heart-enriched genes with unknown functions. BMC Genomics 2009; 10:100. [PMID: 19267916 PMCID: PMC2657154 DOI: 10.1186/1471-2164-10-100] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 11/03/2008] [Accepted: 03/06/2009] [Indexed: 12/11/2022] Open
Abstract
Background High-throughput techniques have generated a huge set of biological data, which are deposited in various databases. Efficient exploitation of these databases is often hampered by a lack of appropriate tools that allow easy and reliable identification of genes that lack functional characterization but are correlated with specific biological conditions (e.g. organotypic expression). Results We have developed a simple algorithm (DGSA = Database-dependent Gene Selection and Analysis) to identify genes with unknown functions involved in organ development, concentrating on the heart. Using our approach, we identified a large number of as yet uncharacterized genes that are expressed during heart development. An initial functional characterization of genes by loss-of-function analysis employing morpholino injections into zebrafish embryos disclosed severe developmental defects, indicating a decisive function of selected genes in developmental processes. Conclusion We conclude that DGSA is a versatile tool for database mining, allowing efficient selection of uncharacterized genes for functional analysis.
Affiliation(s)
- Shizuka Uchida
- Max-Planck-Institute for Heart and Lung Research, Parkstrasse 1, Bad Nauheim, Germany.
|
22
|
Lambert JC, Campagne F, Marambaud P. [CALHM1, a novel gene to blame in Alzheimer disease]. Med Sci (Paris) 2009; 24:923-4. [PMID: 19038093 DOI: 10.1051/medsci/20082411923] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/14/2022] Open
|
23
|
Whole-body distribution and metabolism of [N-methyl-11C](R)-1-(2-chlorophenyl)-N-(1-methylpropyl)-3-isoquinolinecarboxamide in humans; an imaging agent for in vivo assessment of peripheral benzodiazepine receptor activity with positron emission tomography. Eur J Nucl Med Mol Imaging 2008; 36:671-82. [PMID: 19050880 DOI: 10.1007/s00259-008-1000-1] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.9] [Received: 05/09/2008] [Accepted: 10/29/2008] [Indexed: 10/21/2022]
Abstract
PURPOSE (11)C-PK11195 is a radiopharmaceutical for in vivo assessment of peripheral benzodiazepine receptor (PBR) activity using PET. We sought to clarify the metabolic fate of (11)C-PK11195 in a test-retest setting using radio-HPLC in comparison with radio-TLC, and the whole-body distribution in humans. MATERIALS AND METHODS In order to evaluate the reproducibility of radio-HPLC metabolite analyses, ten patients with Alzheimer's disease (AD) underwent two successive (11)C-PK11195 examinations on separate days. For comparison of different analytical methods, plasma samples from seven patients were also analysed by radio-TLC. In addition, we evaluated the whole-body distribution of (11)C-PK11195 and its uptake in the brain. RESULTS The level of unmetabolized (11)C-PK11195 decreased slowly from 96.3 +/- 1.6% (mean+/-SD) at 5 min to 62.7 +/- 8.3% at 40 min after injection. Large individual variation was observed in the amount of plasma (11)C-PK11195 radiometabolites. The whole-body distribution of (11)C-PK11195 showed the highest radioactivity levels in urinary bladder, adrenal gland, liver, salivary glands, heart, kidneys, and vertebral column. In addition, the hip bone and breast bone were clearly visualized by PET. In patients with AD, (11)C-PK11195 uptake in the brain was the highest in the basal ganglia and thalamus, followed by the cortical grey matter regions and the cerebellum. Low (11)C-PK11195 uptake was observed in the white matter. CONCLUSION Our results indicate that (11)C-PK11195 is eliminated both through the renal and hepatobiliary systems. Careful analysis of plasma metabolites is required to determine the accurate arterial input function for quantitative PET measurement.
|
24
|
Hishiki T, Tamada I, Okubo K. Gene-L'EXPO: a tool to extract knowledge from transcriptomes and find 'Literature-Sparse' relationships between genes and tissues. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2008; 2008:313-317. [PMID: 18999036 PMCID: PMC2656066] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2008] [Revised: 07/15/2008] [Indexed: 05/27/2023]
Abstract
The increasing volume and diversity of transcriptome data in the public domain offer an opportunity to advance new questions and hypotheses. We anticipate that tools that can visualize the gap in the distribution of information between the scientific literature and actual data would prompt such questions. We focused on the roles played by various genes in tissues, and have developed a database that contrasts information on gene expression in tissues with PubMed text and transcriptome data. Data pairs of tissues and the genes that might be expressed there were automatically extracted from text with vocabularies for the genes and tissues. The anatomical categories of various expressed sequence tag (EST) libraries were also automatically determined. These types of information were linked using the hierarchical structure of the Metathesaurus in UMLS.
|
25
|
Dreses-Werringloer U, Lambert JC, Vingtdeux V, Zhao H, Vais H, Siebert A, Jain A, Koppel J, Rovelet-Lecrux A, Hannequin D, Pasquier F, Galimberti D, Scarpini E, Mann D, Lendon C, Campion D, Amouyel P, Davies P, Foskett JK, Campagne F, Marambaud P. A polymorphism in CALHM1 influences Ca2+ homeostasis, Abeta levels, and Alzheimer's disease risk. Cell 2008; 133:1149-61. [PMID: 18585350 DOI: 10.1016/j.cell.2008.05.048] [Citation(s) in RCA: 252] [Impact Index Per Article: 15.8] [Received: 01/31/2008] [Revised: 04/30/2008] [Accepted: 05/22/2008] [Indexed: 12/31/2022]
Abstract
Alzheimer's disease (AD) is a genetically heterogeneous disorder characterized by early hippocampal atrophy and cerebral amyloid-beta (Abeta) peptide deposition. Using TissueInfo to screen for genes preferentially expressed in the hippocampus and located in AD linkage regions, we identified a gene on 10q24.33 that we call CALHM1. We show that CALHM1 encodes a multipass transmembrane glycoprotein that controls cytosolic Ca²⁺ concentrations and Abeta levels. CALHM1 homomultimerizes, shares strong sequence similarities with the selectivity filter of the NMDA receptor, and generates a large Ca²⁺ conductance across the plasma membrane. Importantly, we determined that the CALHM1 P86L polymorphism (rs2986017) is significantly associated with AD in independent case-control studies of 3404 participants (allele-specific OR = 1.44, p = 2 × 10⁻¹⁰). We further found that the P86L polymorphism increases Abeta levels by interfering with CALHM1-mediated Ca²⁺ permeability. We propose that CALHM1 encodes an essential component of a previously uncharacterized cerebral Ca²⁺ channel that controls Abeta levels and susceptibility to late-onset AD.
Affiliation(s)
- Ute Dreses-Werringloer
- Litwin-Zucker Research Center for the Study of Alzheimer's Disease, The Feinstein Institute for Medical Research, North Shore-LIJ, Manhasset, NY 11030, USA
|
26
|
Aguilar D, Skrabanek L, Gross SS, Oliva B, Campagne F. Beyond tissueInfo: functional prediction using tissue expression profile similarity searches. Nucleic Acids Res 2008; 36:3728-37. [PMID: 18483083 PMCID: PMC2441795 DOI: 10.1093/nar/gkn233] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/13/2022] Open
Abstract
We present and validate tissue expression profile similarity searches (TEPSS), a computational approach to identify transcripts that share similar tissue expression profiles to one or more transcripts in a group of interest. We evaluated TEPSS for its ability to discriminate between pairs of transcripts coding for interacting proteins and non-interacting pairs. We found that ordering protein-protein pairs by TEPSS score produces sets significantly enriched in reported pairs of interacting proteins [interacting versus non-interacting pairs, Odds-ratio (OR) = 157.57, 95% confidence interval (CI) (36.81-375.51) at 1% coverage, employing a large dataset of about 50 000 human protein interactions]. When used with multiple transcripts as input, we find that TEPSS can predict non-obvious members of the cytosolic ribosome. We used TEPSS to predict S-nitrosylation (SNO) protein targets from a set of brain proteins that undergo SNO upon exposure to physiological levels of S-nitrosoglutathione in vitro. While some of the top TEPSS predictions have been validated independently, several of the strongest SNO TEPSS predictions await experimental validation. Our data indicate that TEPSS is an effective and flexible approach to functional prediction. Since the approach does not use sequence similarity, we expect that TEPSS will be useful for various gene discovery applications. TEPSS programs and data are distributed at http://icb.med.cornell.edu/crt/tepss/index.xml.
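The general idea of a profile similarity search can be sketched in Python as follows: score every candidate transcript by how closely its tissue expression profile matches a query profile, then rank by score. The abstract does not state how the TEPSS score is computed, so this sketch substitutes Pearson correlation as an assumed stand-in; the function names, transcript names, and expression values are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    if len(x) != len(y) or not x:
        raise ValueError("profiles must be non-empty and of equal length")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)


def rank_by_similarity(query_profile, candidate_profiles):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, pearson(query_profile, profile))
              for name, profile in candidate_profiles.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical expression levels across four tissues (e.g. brain, liver, heart, lung).
query = [10, 0, 1, 0]
candidates = {
    "tx_A": [20, 1, 2, 0],  # similar shape to the query
    "tx_B": [0, 5, 0, 9],   # expressed where the query is silent
    "tx_C": [3, 3, 3, 3],   # flat profile
}
for name, score in rank_by_similarity(query, candidates):
    print(name, round(score, 3))
```

In this toy data, the transcript with the query-like profile ranks first and the one with the opposite pattern ranks last; a real search would use the score defined by the TEPSS authors.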
Affiliation(s)
- Daniel Aguilar
- HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Medical College of Cornell University, 1305 York Ave, New York, NY 10021, USA
|
27
|
Helftenbein G, Koslowski M, Dhaene K, Seitz G, Sahin U, Türeci O. In silico strategy for detection of target candidates for antibody therapy of solid tumors. Gene 2008; 414:76-84. [PMID: 18358640 DOI: 10.1016/j.gene.2008.02.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2007] [Revised: 02/05/2008] [Accepted: 02/13/2008] [Indexed: 10/22/2022]
Abstract
In contrast to earlier attempts for the identification of target candidates suitable for monoclonal antibody (mAb) based cancer therapies we concentrated on highly selective lineage-specific genes additionally preserved or even overexpressed in orthotopic cancers. In a script aided workflow we reduced all human entries of the RefSeq mRNA database to those encoding transmembrane domain bearing gene products and subjected them to BLAST analysis against the human EST database. All BLAST results were validated in a gene centric way allowing two types of data curation prior to expression profiling of matching ESTs in selected healthy tissues: (i) exclusion of questionable ESTs arising e.g. from genomic contamination and (ii) elimination of erroneously predicted mRNAs as well as transcripts with only weak EST coverage. The impact of such stringent input control on accuracy of prediction is underlined by RT-PCR confirmation of predicted tissue distribution patterns for a number of selected candidates.
|
28
|
Klee EW. Data Mining for Biomarker Development: A Review of Tissue Specificity Analysis. Clin Lab Med 2008; 28:127-43, viii. [DOI: 10.1016/j.cll.2007.10.009] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Indexed: 11/24/2022]
|
29
|
Ganley IG, Espinosa E, Pfeffer SR. A syntaxin 10-SNARE complex distinguishes two distinct transport routes from endosomes to the trans-Golgi in human cells. J Cell Biol 2008; 180:159-72. [PMID: 18195106 PMCID: PMC2213607 DOI: 10.1083/jcb.200707136] [Citation(s) in RCA: 134] [Impact Index Per Article: 8.4] [Indexed: 12/28/2022]
Abstract
Mannose 6-phosphate receptors (MPRs) are transported from endosomes to the Golgi after delivering lysosomal enzymes to the endocytic pathway. This process requires Rab9 guanosine triphosphatase (GTPase) and the putative tether GCC185. We show in human cells that a soluble NSF attachment protein receptor (SNARE) complex comprised of syntaxin 10 (STX10), STX16, Vti1a, and VAMP3 is required for this MPR transport but not for the STX6-dependent transport of TGN46 or cholera toxin from early endosomes to the Golgi. Depletion of STX10 leads to MPR missorting and hypersecretion of hexosaminidase. Mouse and rat cells lack STX10 and, thus, must use a different target membrane SNARE for this process. GCC185 binds directly to STX16 and is competed by Rab6. These data support a model in which the GCC185 tether helps Rab9-bearing transport vesicles deliver their cargo to the trans-Golgi and suggest that Rab GTPases can regulate SNARE–tether interactions. Importantly, our data provide a clear molecular distinction between the transport of MPRs and TGN46 to the trans-Golgi.
Affiliation(s)
- Ian G Ganley
- Department of Biochemistry, Stanford University School of Medicine, Stanford, CA 94305, USA
|
30
|
Banerjee D, Nandagopal K. Potential Interaction Between the GARS-AIRS-GART Gene and CP2/LBP-1c/LSF Transcription Factor in Down Syndrome-related Alzheimer Disease. Cell Mol Neurobiol 2007; 27:1117-26. [PMID: 17902044 DOI: 10.1007/s10571-007-9217-2] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Received: 05/10/2007] [Accepted: 08/31/2007] [Indexed: 10/22/2022]
Abstract
(1) GARS-AIRS-GART is an important candidate gene in studies of Down syndrome (DS)-related Alzheimer's disease (AD), due to its chromosomal localization (21q22.1) in the Down syndrome critical region, involvement in de novo purine biosynthesis, and over-expression in DS brain. The aim of this study was to identify factor(s) likely to enhance transcription of GARS-AIRS-GART in DS-related AD. (2) Based on a bio-informatics approach, the PromoterInspector, Promoter Scan II, and EBI toolbox CpG plot software programs were used to identify GARS-AIRS-GART sequences important for gene transcription. Transcription factor binding motifs within these regions were mapped with the help of the MatInspector and TFSEARCH programs. Factors implicated in neurodevelopment or neurodegeneration were the focus of attention, and mining of human (T1Dbase) and murine (GNF) expression databases revealed information on the regional distribution of these factors and their relative abundance vis-a-vis GARS-AIRS-GART. (3) The Leader-binding protein 1-c (LBP-1c/CP2/LSF) emerged as a promising candidate from these studies, as MatInspector and TFSEARCH analyses revealed a total of four CP2 binding sites with potential for functional interaction(s) within the promoter and CpG islands of GARS-AIRS-GART. Furthermore, two of these sites harbor sequences for methylation-sensitive restriction enzymes, which suggest that methylation status may, in part, regulate CP2-mediated transcription of GARS-AIRS-GART. A search of T1Dbase and GNF expression databases reveals co-expression of CP2 and GARS-AIRS-GART in brain regions relevant to DS-related AD. (4) The virtual screen identified CP2/LBP-1c/LSF as a factor that likely mediates enhanced transcription of GARS-AIRS-GART in DS-related AD.
Affiliation(s)
- Disha Banerjee
- Manovikas Kendra Rehabilitation and Research Institute for the Handicapped, Kolkata, 700107, India
|
31
|
Hulbert EM, Smink LJ, Adlem EC, Allen JE, Burdick DB, Burren OS, Cassen VM, Cavnor CC, Dolman GE, Flamez D, Friery KF, Healy BC, Killcoyne SA, Kutlu B, Schuilenburg H, Walker NM, Mychaleckyj J, Eizirik DL, Wicker LS, Todd JA, Goodman N. T1DBase: integration and presentation of complex data for type 1 diabetes research. Nucleic Acids Res 2006; 35:D742-6. [PMID: 17169983 PMCID: PMC1781218 DOI: 10.1093/nar/gkl933] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.0] [Indexed: 11/14/2022] Open
Abstract
T1DBase () [Smink et al. (2005) Nucleic Acids Res., 33, D544–D549; Burren et al. (2004) Hum. Genomics, 1, 98–109] is a public website and database that supports the type 1 diabetes (T1D) research community. T1DBase provides a consolidated T1D-oriented view of the complex data world that now confronts medical researchers and enables scientists to navigate from information they know to information that is new to them. Overview pages for genes and markers summarize information for these elements. The Gene Dossier summarizes information for a list of genes. GBrowse [Stein et al. (2002) Genome Res., 10, 1599–1610] displays genes and other features in their genomic context, and Cytoscape [Shannon et al. (2003) Genome Res., 13, 2498–2504] shows genes in the context of interacting proteins and genes. The Beta Cell Gene Atlas shows gene expression in β cells, islets, and related cell types and lines, and the Tissue Expression Viewer shows expression across other tissues. The Microarray Viewer shows expression from more than 20 array experiments. The Beta Cell Gene Expression Bank contains manually curated gene and pathway annotations for genes expressed in β cells. T1DMart is a query tool for markers and genotypes. PosterPages are ‘home pages’ about specific topics or datasets. The key challenge, now and in the future, is to provide powerful informatics capabilities to T1D scientists in a form they can use to enhance their research.
|
32
|
Campagne F, Skrabanek L. Mining expressed sequence tags identifies cancer markers of clinical interest. BMC Bioinformatics 2006; 7:481. [PMID: 17078886 PMCID: PMC1635568 DOI: 10.1186/1471-2105-7-481] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Received: 09/05/2006] [Accepted: 11/01/2006] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Gene expression data are a rich source of information about the transcriptional dis-regulation of genes in cancer. Genes that display differential regulation in cancer are a subtype of cancer biomarkers. RESULTS We present an approach to mine expressed sequence tags to discover cancer biomarkers. A false discovery rate analysis suggests that the approach generates less than 22% false discoveries when applied to combined human and mouse whole genome screens. With this approach, we identify the 200 genes most consistently differentially expressed in cancer (called HM200) and proceed to characterize these genes. When used for prediction in a variety of cancer classification tasks (in 24 independent cancer microarray datasets, 59 classifications total), we show that HM200 and the shorter gene list HM100 are very competitive cancer biomarker sets. Indeed, when compared to 13 published cancer marker gene lists, HM200 achieves the best or second best classification performance in 79% of the classifications considered. CONCLUSION These results indicate the existence of at least one general cancer marker set whose predictive value spans several tumor types and classification types. Our comparison with other marker gene lists shows that HM200 markers are mostly novel cancer markers. We also identify the previously published Pomeroy-400 list as another general cancer marker set. Strikingly, Pomeroy-400 has 27 genes in common with HM200. Our data suggest that a core set of genes are responsive to the deregulation of pathways involved in tumorigenesis in a variety of tumor types and that these genes could serve as transcriptional cancer markers in applications of clinical interest. Finally, our study suggests new strategies to select and evaluate cancer biomarkers in microarray studies.
Affiliation(s)
- Fabien Campagne
  - Institute for Computational Biomedicine and Dept. of Physiology and Biophysics, Weill Medical College of Cornell University; 1300 York Ave; New York, NY 10021, USA
- Lucy Skrabanek
  - Institute for Computational Biomedicine and Dept. of Physiology and Biophysics, Weill Medical College of Cornell University; 1300 York Ave; New York, NY 10021, USA
|
33
|
Savas S, Schmidt S, Jarjanazi H, Ozcelik H. Functional nsSNPs from carcinogenesis-related genes expressed in breast tissue: potential breast cancer risk alleles and their distribution across human populations. Hum Genomics 2006; 2:287-96. [PMID: 16595073 PMCID: PMC3500178 DOI: 10.1186/1479-7364-2-5-287] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Indexed: 02/06/2023]
Abstract
Although highly penetrant alleles of BRCA1 and BRCA2 have been shown to predispose to breast cancer, the majority of breast cancer cases are assumed to result from the presence of low-moderate penetrant alleles and environmental carcinogens. Non-synonymous single nucleotide polymorphisms (nsSNPs) are hypothesised to contribute to disease susceptibility and approximately 30 per cent of them are predicted to have a biological significance. In this study, we have applied a bioinformatics-based strategy to identify breast cancer-related nsSNPs from 981 carcinogenesis-related genes expressed in breast tissue. Our results revealed a total of 367 validated nsSNPs, 109 (29.7 per cent) of which are predicted to affect the protein function (functional nsSNPs), suggesting that these nsSNPs are likely to influence the development and homeostasis of breast tissue and hence contribute to breast cancer susceptibility. Sixty-seven of the functional nsSNPs presented as commonly occurring nsSNPs (minor allele frequencies ≥ 5 per cent), representing excellent candidates for breast cancer susceptibility. Additionally, a non-uniform distribution of the common functional nsSNPs among different human populations was observed: 15 nsSNPs were reported to be present in all populations analysed, whereas another set of 15 nsSNPs was specific to particular population(s). We propose that the nsSNPs analysed in this study constitute a unique resource of potential genetic factors for breast cancer susceptibility. Furthermore, the variations in functional nsSNP allele frequencies across major population backgrounds may point to the potential variability of the molecular basis of breast cancer predisposition and treatment response among different human populations.
Affiliation(s)
- Sevtap Savas
  - Fred A. Litwin Centre for Cancer Genetics, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, 600 University Avenue, Toronto, ON, M5G 1X5, Canada
  - Department of Pathology and Laboratory Medicine, Mount Sinai Hospital, 600 University Avenue, Toronto, ON, M5G 1X5, Canada
  - Department of Laboratory Medicine and Pathobiology, University of Toronto, 100 College Street, Toronto, ON, M5G 1L5, Canada
- Steffen Schmidt
  - Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
- Hamdi Jarjanazi
  - Fred A. Litwin Centre for Cancer Genetics, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, 600 University Avenue, Toronto, ON, M5G 1X5, Canada
  - Department of Pathology and Laboratory Medicine, Mount Sinai Hospital, 600 University Avenue, Toronto, ON, M5G 1X5, Canada
  - Department of Laboratory Medicine and Pathobiology, University of Toronto, 100 College Street, Toronto, ON, M5G 1L5, Canada
- Hilmi Ozcelik
  - Fred A. Litwin Centre for Cancer Genetics, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, 600 University Avenue, Toronto, ON, M5G 1X5, Canada
  - Department of Pathology and Laboratory Medicine, Mount Sinai Hospital, 600 University Avenue, Toronto, ON, M5G 1X5, Canada
  - Department of Laboratory Medicine and Pathobiology, University of Toronto, 100 College Street, Toronto, ON, M5G 1L5, Canada
|
34
|
Pao SY, Lin WL, Hwang MJ. In silico identification and comparative analysis of differentially expressed genes in human and mouse tissues. BMC Genomics 2006; 7:86. [PMID: 16626500 PMCID: PMC1462998 DOI: 10.1186/1471-2164-7-86] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.0] [Received: 11/22/2005] [Accepted: 04/21/2006] [Indexed: 11/21/2022]
Abstract
Background Screening for differentially expressed genes on the genomic scale and comparative analysis of the expression profiles of orthologous genes between species to study gene function and regulation are becoming increasingly feasible. Expressed sequence tags (ESTs) are an excellent source of data for such studies using bioinformatic approaches because of the rich libraries and tremendous amount of data now available in the public domain. However, any large-scale EST-based bioinformatics analysis must deal with the heterogeneous, and often ambiguous, tissue and organ terms used to describe EST libraries. Results To deal with the issue of tissue source, in this work, we carefully screened and organized more than 8 million human and mouse ESTs into 157 human and 108 mouse tissue/organ categories, to which we applied an established statistic test using different thresholds of the p value to identify genes differentially expressed in different tissues. Further analysis of the tissue distribution and level of expression of human and mouse orthologous genes showed that tissue-specific orthologs tended to have more similar expression patterns than those lacking significant tissue specificity. On the other hand, a number of orthologs were found to have significant disparity in their expression profiles, hinting at novel functions, divergent regulation, or new ortholog relationships. Conclusion Comprehensive statistics on the tissue-specific expression of human and mouse genes were obtained in this very large-scale, EST-based analysis. These statistical results have been organized into a database, freely accessible at our website, for easy searching of human and mouse tissue-specific genes and for investigating gene expression profiles in the context of comparative genomics.
Comparative analysis showed that, although highly tissue-specific genes tend to exhibit similar expression profiles in human and mouse, there are significant exceptions, indicating that orthologous genes, while sharing basic genomic properties, could result in distinct phenotypes.
Affiliation(s)
- Sheng-Ying Pao
  - Institute of Biomedical Engineering, National Taiwan University, Taipei, Taiwan
  - Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
- Win-Li Lin
  - Institute of Biomedical Engineering, National Taiwan University, Taipei, Taiwan
- Ming-Jing Hwang
  - Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
|
35
|
Liu D, Graber JH. Quantitative comparison of EST libraries requires compensation for systematic biases in cDNA generation. BMC Bioinformatics 2006; 7:77. [PMID: 16503995 PMCID: PMC1431573 DOI: 10.1186/1471-2105-7-77] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.7] [Received: 09/23/2005] [Accepted: 02/17/2006] [Indexed: 12/28/2022]
Abstract
Background Publicly accessible EST libraries contain valuable information that can be utilized for studies of tissue-specific gene expression and processing of individual genes. This information is, however, confounded by multiple systematic effects arising from the procedures used to generate these libraries. Results We used alignment of ESTs against a reference set of transcripts to estimate the size distributions of the cDNA inserts and sampled mRNA transcripts in individual EST libraries and show how these measurements can be used to inform quantitative comparisons of libraries. While significant attention has been paid to the effects of normalization and subtraction, we also find significant biases in transcript sampling introduced by the combined procedures of reverse transcription and selection of cDNA clones for sequencing. Using examples drawn from studies of mRNA 3'-processing (cleavage and polyadenylation), we demonstrate effects of the transcript sampling bias, and provide a method for identifying libraries that can be safely compared without bias. All data sets, supplemental data, and software are available at our supplemental web site [1]. Conclusion The biases we characterize in the transcript sampling of EST libraries represent a significant and heretofore under-appreciated source of false positive candidates for tissue-, cell type-, or developmental stage-specific activity or processing of genes. Uncorrected, quantitative comparison of dissimilar EST libraries will likely result in the identification of statistically significant, but biologically meaningless changes.
Affiliation(s)
- Donglin Liu
  - The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
- Joel H Graber
  - The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
|
36
|
Pritsker M, Doniger TT, Kramer LC, Westcot SE, Lemischka IR. Diversification of stem cell molecular repertoire by alternative splicing. Proc Natl Acad Sci U S A 2005; 102:14290-5. [PMID: 16183747 PMCID: PMC1242282 DOI: 10.1073/pnas.0502132102] [Citation(s) in RCA: 64] [Impact Index Per Article: 3.4] [Received: 03/15/2005] [Accepted: 08/16/2005] [Indexed: 12/29/2022]
Abstract
Complete information regarding transcriptional and posttranscriptional gene regulation in stem cells is necessary to understand the regulation of self-renewal and differentiation. Alternative splicing is a prevalent mode of posttranscriptional regulation, and occurs in approximately one half of all mammalian genes. The frequency and functional impact of alternative splicing in stem cells are yet to be determined. In this study we combine computational and experimental methods to identify splice variants in embryonic and hematopoietic stem cells on a genome-wide scale. Using EST collections derived from stem cells, we detect alternative splicing in >1,000 genes. Systematic RT-PCR and sequencing studies show confirmation of computational predictions at a level of 80%. We find that alternative splicing can modify multiple components of signaling pathways important for stem cell function. We also analyze the distribution of splice variants across different classes of genes. We find that tissue-specific genes have a higher tendency to undergo alternative splicing than ubiquitously expressed genes. Furthermore, the patterns of alternative splicing are only weakly conserved between orthologous genes in human and mouse. Our studies reveal extensive modification of the stem cell molecular repertoire by alternative splicing and provide insights into its overall role as a mechanism of generating genomic diversity.
Affiliation(s)
- Moshe Pritsker
  - Department of Molecular Biology, Princeton University, Princeton, NJ 08544, USA
|
37
|
Abstract
The shotgun proteomic strategy based on digesting proteins into peptides and sequencing them using tandem mass spectrometry and automated database searching has become the method of choice for identifying proteins in most large scale studies. However, the peptide-centric nature of shotgun proteomics complicates the analysis and biological interpretation of the data especially in the case of higher eukaryote organisms. The same peptide sequence can be present in multiple different proteins or protein isoforms. Such shared peptides therefore can lead to ambiguities in determining the identities of sample proteins. In this article we illustrate the difficulties of interpreting shotgun proteomic data and discuss the need for common nomenclature and transparent informatic approaches. We also discuss related issues such as the state of protein sequence databases and their role in shotgun proteomic analysis, interpretation of relative peptide quantification data in the presence of multiple protein isoforms, the integration of proteomic and transcriptional data, and the development of a computational infrastructure for the integration of multiple diverse datasets.
|
38
|
O'Dushlaine CT, Edwards RJ, Park SD, Shields DC. Tandem repeat copy-number variation in protein-coding regions of human genes. Genome Biol 2005; 6:R69. [PMID: 16086851 PMCID: PMC1273636 DOI: 10.1186/gb-2005-6-8-r69] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.4] [Received: 02/11/2005] [Revised: 05/31/2005] [Accepted: 07/13/2005] [Indexed: 12/01/2022]
Abstract
Tandem repeat polymorphisms in human proteins were characterized using the UniGene dataset. This analysis suggests that 1 in 20 proteins are likely to contain tandem repeat copy-number polymorphisms within coding regions; these were prevalent among protein-binding proteins. Background Tandem repeat variation in protein-coding regions will alter protein length and may introduce frameshifts. Tandem repeat variants are associated with variation in pathogenicity in bacteria and with human disease. We characterized tandem repeat polymorphism in human proteins, using the UniGene database, and tested whether these were associated with host defense roles. Results Protein-coding tandem repeat copy-number polymorphisms were detected in 249 tandem repeats found in 218 UniGene clusters; observed length differences ranged from 2 to 144 nucleotides, with unit copy lengths ranging from 2 to 57. This corresponded to 1.59% (218/13,749) of proteins investigated carrying detectable polymorphisms in the copy-number of protein-coding tandem repeats. We found no evidence that tandem repeat copy-number polymorphism was significantly elevated in defense-response proteins (p = 0.882). An association with the Gene Ontology term 'protein-binding' remained significant after covariate adjustment and correction for multiple testing. Combining this analysis with previous experimental evaluations of tandem repeat polymorphism, we estimate the approximate mean frequency of tandem repeat polymorphisms in human proteins to be 6%. Because 13.9% of the polymorphisms were not a multiple of three nucleotides, up to 1% of proteins may contain frameshifting tandem repeat polymorphisms. Conclusion Around 1 in 20 human proteins are likely to contain tandem repeat copy-number polymorphisms within coding regions. Such polymorphisms are not more frequent among defense-response proteins; their prevalence among protein-binding proteins may reflect lower selective constraints on their structural modification. 
The impact of frameshifting and longer copy-number variants on protein function and disease merits further investigation.
Affiliation(s)
- Colm T O'Dushlaine
  - Bioinformatics Core, Department of Clinical Pharmacology and Institute of Biopharmaceutical Sciences, Royal College of Surgeons in Ireland, 123 St Stephen's Green, Dublin 2, Ireland
- Richard J Edwards
  - Bioinformatics Core, Department of Clinical Pharmacology and Institute of Biopharmaceutical Sciences, Royal College of Surgeons in Ireland, 123 St Stephen's Green, Dublin 2, Ireland
- Stephen D Park
  - Bioinformatics Core, Department of Clinical Pharmacology and Institute of Biopharmaceutical Sciences, Royal College of Surgeons in Ireland, 123 St Stephen's Green, Dublin 2, Ireland
- Denis C Shields
  - Bioinformatics Core, Department of Clinical Pharmacology and Institute of Biopharmaceutical Sciences, Royal College of Surgeons in Ireland, 123 St Stephen's Green, Dublin 2, Ireland
|
39
|
Cusack BP, Wolfe KH. Changes in alternative splicing of human and mouse genes are accompanied by faster evolution of constitutive exons. Mol Biol Evol 2005; 22:2198-208. [PMID: 16049198 DOI: 10.1093/molbev/msi218] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.9] [Indexed: 11/13/2022]
Abstract
Alternative splicing is known to be an important source of protein sequence variation, but its evolutionary impact has not been explored in detail. Studying alternative splicing requires extensive sampling of the transcriptome, but new data sets based on expressed sequence tags aligned to chromosomes make it possible to study alternative splicing on a genome-wide scale. Although genes showing alternative splicing by exon skipping are conserved as compared to the genome as a whole, we find that genes where structural differences between human and mouse result in genome-specific alternatively spliced exons in one species show almost 60% greater nonsynonymous divergence in constitutive exons than genes where exon skipping is conserved. This effect is also seen for genes showing species-specific patterns of alternative splicing where gene structure is conserved. Our observations are not attributable to an inherent difference in rate of evolution between these two sets of proteins or to differences with respect to predictors of evolutionary rate such as expression level, tissue specificity, or genetic redundancy. Where genome-specific alternatively spliced exons are seen in mammals, the vast majority of skipped exons appear to be recent additions to gene structures. Furthermore, among genes with genome-specific alternatively spliced exons, the degree of nonsynonymous divergence in constitutive sequence is a function of the frequency of incorporation of these alternative exons into transcripts. These results suggest that alterations in alternative splicing pattern can have knock-on effects in terms of accelerated sequence evolution in constant regions of the protein.
Affiliation(s)
- Brian P Cusack
  - Department of Genetics, Smurfit Institute, University of Dublin, Trinity College, Dublin, Ireland
|
40
|
Critical evaluation of the JDO API for the persistence and portability requirements of complex biological databases. BMC Bioinformatics 2005; 6:5. [PMID: 15642112 PMCID: PMC545948 DOI: 10.1186/1471-2105-6-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 06/11/2004] [Accepted: 01/10/2005] [Indexed: 11/10/2022]
Abstract
BACKGROUND Complex biological database systems have become key computational tools used daily by scientists and researchers. Many of these systems must be capable of executing on multiple different hardware and software configurations and are also often made available to users via the Internet. We have used the Java Data Object (JDO) persistence technology to develop the database layer of such a system known as the SigPath information management system. SigPath is an example of a complex biological database that needs to store various types of information connected by many relationships. RESULTS Using this system as an example, we perform a critical evaluation of current JDO technology and discuss the suitability of the JDO standard to achieve portability, scalability and performance. We show that JDO supports portability of the SigPath system from a relational database backend to an object database backend and achieves acceptable scalability. To answer the performance question, we have created the SigPath JDO application benchmark that we distribute under the GNU General Public License. This benchmark can be used as an example of using JDO technology to create a complex biological database and makes it possible for vendors and users of the technology to evaluate the performance of other JDO implementations for similar applications. CONCLUSIONS The SigPath JDO benchmark and our discussion of JDO technology in the context of biological databases will be useful to bioinformaticians who design new complex biological databases and aim to create systems that can be ported easily to a variety of database backends.
|
41
|
Munoz ET, Bogarad LD, Deem MW. Microarray and EST database estimates of mRNA expression levels differ: the protein length versus expression curve for C. elegans. BMC Genomics 2004; 5:30. [PMID: 15134588 PMCID: PMC434498 DOI: 10.1186/1471-2164-5-30] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.4] [Received: 02/13/2004] [Accepted: 05/10/2004] [Indexed: 11/26/2022]
Abstract
Background Various methods for estimating protein expression levels are known. The level of correlation between these methods is only fair, and systematic biases in each of the methods cannot be ruled out. We here investigate systematic biases in the estimation of gene expression rates from microarray data and from abundance within the Expressed Sequence Tag (EST) database. We suggest that length is a significant factor in biases to measured gene expression rates. As a specific example of the importance of the bias of expression rate with length, we address the following evolutionary question: Does the average C. elegans protein length increase or decrease with expression level? Two different answers to this question have been reported in the literature, one method using expression levels estimated by abundance within the EST database and another using microarrays. We have investigated this issue by constructing the full protein length versus expression curve for C. elegans, using both methods for estimating expression levels. Results The microarray data show a monotonic decrease of length with expression level, whereas the abundance within the EST database data show a non-monotonic behavior. Furthermore, the ratio of the expression level estimated by the EST database to that measured by microarrays is not constant, but rather systematically biased with gene length. Conclusions It is suggested that the length bias may lie primarily in the abundance within the EST database method, being not ameliorated by internal standards as it is in the microarray data, and that this bias should be removed before data interpretation. When this is done, both the microarray and the abundance within the EST database give a monotonic decrease of spliced length with expression level, and the correlation between the EST and microarray data becomes larger. We suggest that standard RNA controls be used to normalize for length bias in any method that measures expression.
Affiliation(s)
- Enrique T Munoz
  - Department of Bioengineering, Rice University, Houston, TX 77005-1892 USA
- Leonard D Bogarad
  - Department of Bioengineering, Rice University, Houston, TX 77005-1892 USA
- Michael W Deem
  - Department of Bioengineering, Rice University, Houston, TX 77005-1892 USA
  - Department of Physics & Astronomy, Rice University, Houston, TX 77005-1892 USA
|
42
|
McRedmond JP, Park SD, Reilly DF, Coppinger JA, Maguire PB, Shields DC, Fitzgerald DJ. Integration of proteomics and genomics in platelets: a profile of platelet proteins and platelet-specific genes. Mol Cell Proteomics 2003; 3:133-44. [PMID: 14645502 DOI: 10.1074/mcp.m300063-mcp200] [Citation(s) in RCA: 236] [Impact Index Per Article: 11.2] [Indexed: 12/13/2022]
Abstract
Platelets, while anucleate, contain RNA, some of which is translated into protein upon activation. Hypothesising that the platelet proteome is reflected in the transcriptome, we identified 82 proteins secreted from activated platelets and compared these, as well as published proteomic data, to the transcriptional profile. We also compared the transcriptome of platelets to other tissues to identify platelet-specific genes and used ontology to determine gene categories over-represented in platelets. RNA was isolated from highly pure platelet preparations for hybridization to Affymetrix oligonucleotide arrays. We identified 2,928 distinct messages as being present in platelets. The platelet transcriptome was compared with the proteome by relating both to UniGene clusters. Platelet proteomic data correlated well with the transcriptome, with 69% of secreted proteins detectable at the mRNA level, and similar concordance was obtained using two published datasets. While many of the most abundant mRNAs are for known platelet proteins, messages were detected for proteins not previously reported in platelets. Some of these may represent residual megakaryocyte messages; however, proteomic analysis confirmed the expression of many previously unreported genes in platelets. Transcripts for well-described platelet proteins are among the most platelet-specific messages. Ontological categories related to signal transduction, receptors, ion channels, and membranes are over-represented in platelets, while categories involved in protein synthesis are depleted. Despite the absence of gene transcription, the platelet proteome is mirrored in the transcriptome. Conversely, transcriptional analysis predicts the presence of novel proteins in the platelet. Transcriptional analysis is relevant to platelet biology, providing insights into platelet function and the mechanisms of platelet disorders.
Affiliation(s)
- J P McRedmond
  - Proteomics and Bioinformatics Cores, Department of Clinical Pharmacology, Royal College of Surgeons in Ireland, Dublin 2, Ireland
|
43
|
Huminiecki L, Lloyd AT, Wolfe KH. Congruence of tissue expression profiles from Gene Expression Atlas, SAGEmap and TissueInfo databases. BMC Genomics 2003; 4:31. [PMID: 12885301 PMCID: PMC183867 DOI: 10.1186/1471-2164-4-31] [Citation(s) in RCA: 68] [Impact Index Per Article: 3.2] [Received: 01/20/2003] [Accepted: 07/29/2003] [Indexed: 12/25/2022]
Abstract
BACKGROUND Extracting biological knowledge from large amounts of gene expression information deposited in public databases is a major challenge of the postgenomic era. Additional insights may be derived by data integration and cross-platform comparisons of expression profiles. However, database meta-analysis is complicated by differences in experimental technologies, data post-processing, database formats, and inconsistent gene and sample annotation. RESULTS We have analysed expression profiles from three public databases: Gene Expression Atlas, SAGEmap and TissueInfo. These are repositories of oligonucleotide microarray, Serial Analysis of Gene Expression and Expressed Sequence Tag human gene expression data respectively. We devised a method, Preferential Expression Measure, to identify genes that are significantly over- or under-expressed in any given tissue. We examined intra- and inter-database consistency of Preferential Expression Measures. There was good correlation between replicate experiments of oligonucleotide microarray data, but there was less coherence in expression profiles as measured by Serial Analysis of Gene Expression and Expressed Sequence Tag counts. We investigated inter-database correlations for six tissue categories, for which data were present in the three databases. Significant positive correlations were found for brain, prostate and vascular endothelium but not for ovary, kidney, and pancreas. CONCLUSION We show that data from Gene Expression Atlas, SAGEmap and TissueInfo can be integrated using the UniGene gene index, and that expression profiles correlate relatively well when large numbers of tags are available or when tissue cellular composition is simple. Finally, in the case of brain, we demonstrate that when PEM values show good correlation, predictions of tissue-specific expression based on integrated data are very accurate.
Affiliation(s)
- Lukasz Huminiecki
  - Department of Genetics, Smurfit Institute, University of Dublin Trinity College, Dublin 2, Ireland
- Andrew T Lloyd
  - Department of Genetics, Smurfit Institute, University of Dublin Trinity College, Dublin 2, Ireland
- Kenneth H Wolfe
  - Department of Genetics, Smurfit Institute, University of Dublin Trinity College, Dublin 2, Ireland
|