1
Stefanucci L, Collins J, Sims MC, Barrio-Hernandez I, Sun L, Burren OS, Perfetto L, Bender I, Callahan TJ, Fleming K, Guerrero JA, Hermjakob H, Martin MJ, Stephenson J, Paneerselvam K, Petrovski S, Porras P, Robinson PN, Wang Q, Watkins X, Frontini M, Laskowski RA, Beltrao P, Di Angelantonio E, Gomez K, Laffan M, Ouwehand WH, Mumford AD, Freson K, Carss K, Downes K, Gleadall N, Megy K, Bruford E, Vuckovic D. The effects of pathogenic and likely pathogenic variants for inherited hemostasis disorders in 140 214 UK Biobank participants. Blood 2023; 142:2055-2068. [PMID: 37647632 PMCID: PMC10733830 DOI: 10.1182/blood.2023020118] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/01/2023] [Revised: 08/04/2023] [Accepted: 08/04/2023] [Indexed: 09/01/2023] Open
Abstract
Rare genetic diseases affect millions, and identifying causal DNA variants is essential for patient care. Therefore, it is imperative to estimate the effect of each independent variant and improve their pathogenicity classification. Our study of 140 214 unrelated UK Biobank (UKB) participants found that each of them carries a median of 7 variants previously reported as pathogenic or likely pathogenic. We focused on 967 diagnostic-grade gene (DGG) variants for rare bleeding, thrombotic, and platelet disorders (BTPDs) observed in 12 367 UKB participants. By association analysis, for a subset of these variants, we estimated effect sizes for platelet count and volume, and odds ratios for bleeding and thrombosis. Variants causal of some autosomal recessive platelet disorders revealed phenotypic consequences in carriers. Loss-of-function variants in MPL, which cause chronic amegakaryocytic thrombocytopenia if biallelic, were unexpectedly associated with increased platelet counts in carriers. We also demonstrated that common variants identified by genome-wide association studies (GWAS) for platelet count or thrombosis risk may influence the penetrance of rare variants in BTPD DGGs on their associated hemostasis disorders. Network-propagation analysis applied to an interactome of 18 410 nodes and 571 917 edges showed that GWAS variants with large effect sizes are enriched in DGGs and their first-order interactors. Finally, we illustrate the modifying effect of polygenic scores for platelet count and thrombosis risk on disease severity in participants carrying rare variants in TUBB1 or PROC and PROS1, respectively. Our findings demonstrate the power of association analyses using large population datasets in improving pathogenicity classifications of rare variants.
Affiliation(s)
Luca Stefanucci
- Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, United Kingdom
- National Health Service Blood and Transplant, Cambridge Biomedical Campus, Cambridge, United Kingdom
- British Heart Foundation, BHF Centre of Research Excellence, University of Cambridge, Cambridge Biomedical Campus, Cambridge, United Kingdom
Janine Collins
- Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, United Kingdom
- National Health Service Blood and Transplant, Cambridge Biomedical Campus, Cambridge, United Kingdom
- Department of Haematology, Barts Health NHS Trust, London, United Kingdom
Matthew C. Sims
- Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, United Kingdom
- National Health Service Blood and Transplant, Cambridge Biomedical Campus, Cambridge, United Kingdom
- Department of Haematology, Sheffield Teaching Hospitals NHS Foundation Trust, Royal Hallamshire Hospital, Sheffield, United Kingdom
- Department of Oncology and Metabolism, University of Sheffield, Sheffield, United Kingdom
Inigo Barrio-Hernandez
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, United Kingdom
Luanluan Sun
- Department of Public Health and Primary Care, BHF Cardiovascular Epidemiology Unit, University of Cambridge, Cambridge, United Kingdom
Oliver S. Burren
- Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, United Kingdom
Livia Perfetto
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, United Kingdom
- Department of Biology and Biotechnology “C.Darwin,” Sapienza University of Rome, Rome, Italy
Isobel Bender
- Department of Biochemistry, University of Oxford, Oxford, United Kingdom
Tiffany J. Callahan
- Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY
Kathryn Fleming
- School of Cellular and Molecular Medicine, University of Bristol, Bristol, United Kingdom
Jose A. Guerrero
- National Health Service Blood and Transplant, Cambridge Biomedical Campus, Cambridge, United Kingdom
- Department of Haematology, Barts Health NHS Trust, London, United Kingdom
Henning Hermjakob
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, United Kingdom
Maria J. Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, United Kingdom
James Stephenson
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, United Kingdom
NIHR BioResource
- Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, United Kingdom
- National Health Service Blood and Transplant, Cambridge Biomedical Campus, Cambridge, United Kingdom
- British Heart Foundation, BHF Centre of Research Excellence, University of Cambridge, Cambridge Biomedical Campus, Cambridge, United Kingdom
- Department of Haematology, Barts Health NHS Trust, London, United Kingdom
- Department of Haematology, Sheffield Teaching Hospitals NHS Foundation Trust, Royal Hallamshire Hospital, Sheffield, United Kingdom
- Department of Oncology and Metabolism, University of Sheffield, Sheffield, United Kingdom
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, United Kingdom
- Department of Public Health and Primary Care, BHF Cardiovascular Epidemiology Unit, University of Cambridge, Cambridge, United Kingdom
- Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, United Kingdom
- Department of Biology and Biotechnology “C.Darwin,” Sapienza University of Rome, Rome, Italy
- Department of Biochemistry, University of Oxford, Oxford, United Kingdom
- Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY
- School of Cellular and Molecular Medicine, University of Bristol, Bristol, United Kingdom
- Centre for Genomics Research, Discovery Sciences, AstraZeneca, Cambridge, United Kingdom
- Department of Medicine, Austin Health, The University of Melbourne, Melbourne, Australia
- Genomic Medicine, The Jackson Laboratory, Farmington, CT
- Institute for Systems Genomics, University of Connecticut, Farmington, CT
- Department of Clinical and Biomedical Sciences, Faculty of Health and Life Sciences RILD Building, University of Exeter Medical School, Exeter, United Kingdom
- Institute of Molecular Systems Biology, ETH Zürich, Zürich, Switzerland
- Heart and Lung Research Institute, University of Cambridge, Cambridge, United Kingdom
- NIHR Blood and Transplant Research Unit in Donor Health and Behaviour, Cambridge, United Kingdom
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, United Kingdom
- Health Data Science Centre, Human Technopole, Milan, Italy
- Haemophilia Centre and Thrombosis Unit, Royal Free London NHS Foundation Trust, London, United Kingdom
- Department of Haematology, Imperial College Healthcare NHS Trust, London, United Kingdom
- Department of Immunology and Inflammation, Centre for Haematology, Imperial College London, London, United Kingdom
- Department of Haematology, University College London Hospitals NHS Trust, London, United Kingdom
- Department of Cardiovascular Sciences, Center for Molecular and Vascular Biology, KULeuven, Leuven, Belgium
- Cambridge Genomics Laboratory, Cambridge University Hospitals National Health Service Foundation Trust, Cambridge Biomedical Campus, Cambridge, United Kingdom
- Department of Epidemiology and Biostatistics, Imperial College London, London, United Kingdom
Kalpana Paneerselvam
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, United Kingdom
Slavé Petrovski
- Centre for Genomics Research, Discovery Sciences, AstraZeneca, Cambridge, United Kingdom
- Department of Medicine, Austin Health, The University of Melbourne, Melbourne, Australia
Pablo Porras
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, United Kingdom
Peter N. Robinson
- Genomic Medicine, The Jackson Laboratory, Farmington, CT
- Institute for Systems Genomics, University of Connecticut, Farmington, CT
Quanli Wang
- Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, United Kingdom
Xavier Watkins
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, United Kingdom
Mattia Frontini
- Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, United Kingdom
- National Health Service Blood and Transplant, Cambridge Biomedical Campus, Cambridge, United Kingdom
- British Heart Foundation, BHF Centre of Research Excellence, University of Cambridge, Cambridge Biomedical Campus, Cambridge, United Kingdom
- Department of Clinical and Biomedical Sciences, Faculty of Health and Life Sciences RILD Building, University of Exeter Medical School, Exeter, United Kingdom
Roman A. Laskowski
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, United Kingdom
Pedro Beltrao
- Institute of Molecular Systems Biology, ETH Zürich, Zürich, Switzerland
Emanuele Di Angelantonio
- British Heart Foundation, BHF Centre of Research Excellence, University of Cambridge, Cambridge Biomedical Campus, Cambridge, United Kingdom
- Department of Public Health and Primary Care, BHF Cardiovascular Epidemiology Unit, University of Cambridge, Cambridge, United Kingdom
- Heart and Lung Research Institute, University of Cambridge, Cambridge, United Kingdom
- NIHR Blood and Transplant Research Unit in Donor Health and Behaviour, Cambridge, United Kingdom
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, United Kingdom
- Health Data Science Centre, Human Technopole, Milan, Italy
Keith Gomez
- Haemophilia Centre and Thrombosis Unit, Royal Free London NHS Foundation Trust, London, United Kingdom
Mike Laffan
- Department of Haematology, Imperial College Healthcare NHS Trust, London, United Kingdom
- Department of Immunology and Inflammation, Centre for Haematology, Imperial College London, London, United Kingdom
Willem H. Ouwehand
- Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, United Kingdom
- National Health Service Blood and Transplant, Cambridge Biomedical Campus, Cambridge, United Kingdom
- Department of Haematology, University College London Hospitals NHS Trust, London, United Kingdom
Andrew D. Mumford
- School of Cellular and Molecular Medicine, University of Bristol, Bristol, United Kingdom
Kathleen Freson
- Department of Cardiovascular Sciences, Center for Molecular and Vascular Biology, KULeuven, Leuven, Belgium
Keren Carss
- Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, United Kingdom
Kate Downes
- Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, United Kingdom
- National Health Service Blood and Transplant, Cambridge Biomedical Campus, Cambridge, United Kingdom
- Cambridge Genomics Laboratory, Cambridge University Hospitals National Health Service Foundation Trust, Cambridge Biomedical Campus, Cambridge, United Kingdom
Nick Gleadall
- Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, United Kingdom
- National Health Service Blood and Transplant, Cambridge Biomedical Campus, Cambridge, United Kingdom
Karyn Megy
- Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, United Kingdom
Elspeth Bruford
- Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, United Kingdom
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, United Kingdom
Dragana Vuckovic
- Department of Epidemiology and Biostatistics, Imperial College London, London, United Kingdom
2
Estravís M, Pérez-Pazos J, Martin MJ, Ramos-González J, Gil-Melcón M, Martín-García C, García-Sánchez A, Sanz C, Dávila I. Correspondence regarding the paper "Laorden D, Zamarrón E, Romero D, Domínguez-Ortega J, Villamañán E, Losantos I, Gayá F, Quirce S, Álvarez-Sala R. Evaluation of FEOS score and super-responder criteria in a real-life cohort treated with anti-IL5/IL5R. Respir. Med. 2023; 211: 107216". Respir Med 2023; 214:107280. [PMID: 37172788 DOI: 10.1016/j.rmed.2023.107280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2023] [Accepted: 05/09/2023] [Indexed: 05/15/2023]
Affiliation(s)
Miguel Estravís
- Grupo de Investigación en Alergia, Instituto de Investigación Biomédica de Salamanca (IBSAL), Salamanca, Spain; Red de Investigación Cooperativa Orientada a Resultados en Salud-RICORS, ISCIII, Salamanca, Spain
Jacqueline Pérez-Pazos
- Grupo de Investigación en Alergia, Instituto de Investigación Biomédica de Salamanca (IBSAL), Salamanca, Spain; Unidad de Farmacogenética y Medicina de Precisión, Servicio de Bioquímica Clínica, Servicio de Alergología, Hospital Universitario de Salamanca, IBSAL, Salamanca, Spain
Maria J Martin
- Grupo de Investigación en Alergia, Instituto de Investigación Biomédica de Salamanca (IBSAL), Salamanca, Spain; Red de Investigación Cooperativa Orientada a Resultados en Salud-RICORS, ISCIII, Salamanca, Spain; Departamento de Bioquímica y Biología Molecular, Universidad de Salamanca, Salamanca, Spain
Jacinto Ramos-González
- Servicio de Neumología, Complejo Asistencial Universitario de Salamanca, Salamanca, Spain
María Gil-Melcón
- Servicio de Otorrinolaringología, Complejo Asistencial Universitario de Salamanca, Salamanca, Spain
Cristina Martín-García
- Grupo de Investigación en Alergia, Instituto de Investigación Biomédica de Salamanca (IBSAL), Salamanca, Spain; Servicio de Inmunoalergia, Complejo Asistencial Universitario de Salamanca, Salamanca, Spain
Asunción García-Sánchez
- Grupo de Investigación en Alergia, Instituto de Investigación Biomédica de Salamanca (IBSAL), Salamanca, Spain; Red de Investigación Cooperativa Orientada a Resultados en Salud-RICORS, ISCIII, Salamanca, Spain; Departamento de Ciencias Biomédicas y del Diagnóstico, Universidad de Salamanca, Salamanca, Spain.
Catalina Sanz
- Grupo de Investigación en Alergia, Instituto de Investigación Biomédica de Salamanca (IBSAL), Salamanca, Spain; Red de Investigación Cooperativa Orientada a Resultados en Salud-RICORS, ISCIII, Salamanca, Spain; Departamento de Microbiología y Genética, Universidad de Salamanca, Salamanca, Spain
Ignacio Dávila
- Grupo de Investigación en Alergia, Instituto de Investigación Biomédica de Salamanca (IBSAL), Salamanca, Spain; Red de Investigación Cooperativa Orientada a Resultados en Salud-RICORS, ISCIII, Salamanca, Spain; Servicio de Inmunoalergia, Complejo Asistencial Universitario de Salamanca, Salamanca, Spain; Departamento de Ciencias Biomédicas y del Diagnóstico, Universidad de Salamanca, Salamanca, Spain
3
Bowler-Barnett EH, Fan J, Luo J, Magrane M, Martin MJ, Orchard S. UniProt and Mass Spectrometry-Based Proteomics-A 2-Way Working Relationship. Mol Cell Proteomics 2023; 22:100591. [PMID: 37301379 PMCID: PMC10404557 DOI: 10.1016/j.mcpro.2023.100591] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/30/2023] [Revised: 05/20/2023] [Accepted: 06/07/2023] [Indexed: 06/12/2023] Open
Abstract
The human proteome comprises all of the proteins produced by the sequences translated from the human genome, with additional modifications in both sequence and function caused by nonsynonymous variants and posttranslational modifications, including cleavage of the initial transcript into smaller peptides and polypeptides. The UniProtKB database (www.uniprot.org) is the world's leading high-quality, comprehensive and freely accessible resource of protein sequence and functional information, and presents a summary of experimentally verified, or computationally predicted, functional information added by our expert biocuration team for each protein in the proteome. Researchers in the field of mass spectrometry-based proteomics both consume and add to the body of data available in UniProtKB, and this review highlights the information we provide to this community and the knowledge we in turn obtain from groups via deposition of large-scale datasets in public domain databases.
Affiliation(s)
E H Bowler-Barnett
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, United Kingdom
J Fan
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, United Kingdom
J Luo
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, United Kingdom
M Magrane
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, United Kingdom
M J Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, United Kingdom
S Orchard
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, United Kingdom.
4
Dalkıran A, Atakan A, Rifaioğlu AS, Martin MJ, Atalay RÇ, Acar AC, Doğan T, Atalay V. Transfer learning for drug-target interaction prediction. Bioinformatics 2023; 39:i103-i110. [PMID: 37387156 DOI: 10.1093/bioinformatics/btad234] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/19/2023] [Indexed: 07/01/2023] Open
Abstract
MOTIVATION: Utilizing AI-driven approaches for drug-target interaction (DTI) prediction requires large volumes of training data, which are not available for the majority of target proteins. In this study, we investigate the use of deep transfer learning for the prediction of interactions between drug candidate compounds and understudied target proteins with scarce training data. The idea here is to first train a deep neural network classifier with a generalized source training dataset of large size and then to reuse this pre-trained neural network as an initial configuration for re-training/fine-tuning purposes with a small-sized specialized target training dataset. To explore this idea, we selected six protein families that have critical importance in biomedicine: kinases, G-protein-coupled receptors (GPCRs), ion channels, nuclear receptors, proteases, and transporters. In two independent experiments, the protein families of transporters and nuclear receptors were individually set as the target datasets, while the remaining five families were used as the source datasets. Several size-based target family training datasets were formed in a controlled manner to assess the benefit provided by the transfer learning approach.
RESULTS: Here, we present a systematic evaluation of our approach by pre-training a feed-forward neural network with source training datasets and applying different modes of transfer learning from the pre-trained source network to a target dataset. The performance of deep transfer learning is evaluated and compared with that of training the same deep neural network from scratch. We found that when the training dataset contains fewer than 100 compounds, transfer learning outperforms the conventional strategy of training the system from scratch, suggesting that transfer learning is advantageous for predicting binders to under-studied targets.
AVAILABILITY AND IMPLEMENTATION: The source code and datasets are available at https://github.com/cansyl/TransferLearning4DTI. Our web-based service containing the ready-to-use pre-trained models is accessible at https://tl4dti.kansil.org.
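The pre-train-then-fine-tune idea described in this abstract can be illustrated with a toy sketch. This is not the authors' implementation (their code is at the GitHub link above). The `TinyNet` class, the synthetic 2-D data, and the choice of freezing the hidden layer as one possible transfer mode are all assumptions made purely for illustration.

```python
import copy
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyNet:
    """One-hidden-layer feed-forward binary classifier (hypothetical toy model)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def hidden(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.w1, self.b1)]

    def predict(self, x):
        h = self.hidden(x)
        return sigmoid(sum(w * hj for w, hj in zip(self.w2, h)) + self.b2)

    def train(self, data, epochs, lr=0.1, freeze_hidden=False):
        """Plain SGD on cross-entropy; optionally keep the hidden layer fixed."""
        for _ in range(epochs):
            for x, y in data:
                h = self.hidden(x)
                err = self.predict(x) - y  # gradient w.r.t. the pre-sigmoid output
                if not freeze_hidden:
                    for j, row in enumerate(self.w1):
                        g = err * self.w2[j] * (1.0 - h[j] ** 2)
                        for i, xi in enumerate(x):
                            row[i] -= lr * g * xi
                        self.b1[j] -= lr * g
                # The output layer is always updated, using h from before the step.
                for j, hj in enumerate(h):
                    self.w2[j] -= lr * err * hj
                self.b2 -= lr * err


def make_data(n, offset, seed):
    """Synthetic 2-D points labelled by a linear rule (made-up data)."""
    rng = random.Random(seed)
    pts = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(n)]
    return [(p, 1.0 if p[0] + p[1] > offset else 0.0) for p in pts]


source = make_data(500, 0.0, seed=1)  # large "source" dataset
target = make_data(20, 0.2, seed=2)   # small "target" dataset

pretrained = TinyNet(2, 4)
pretrained.train(source, epochs=20)

finetuned = copy.deepcopy(pretrained)  # start from the pre-trained weights
finetuned.train(target, epochs=20, freeze_hidden=True)

scratch = TinyNet(2, 4, seed=3)        # baseline trained on the target only
scratch.train(target, epochs=20)
```

Calling `train` with `freeze_hidden=False` on the copied network would correspond to fine-tuning every layer instead of only the output layer. Which modes the study actually compared is not stated in the abstract.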
Affiliation(s)
Alperen Dalkıran
- Department of Computer Engineering, Middle East Technical University, Ankara 06800, Turkey
- Department of Computer Engineering, Adana Alparslan Türkeş Science and Technology University, Adana 01250, Turkey
Ahmet Atakan
- Department of Computer Engineering, Middle East Technical University, Ankara 06800, Turkey
- Department of Computer Engineering, Erzincan Binali Yıldırım University, Erzincan 24002, Turkey
Ahmet S Rifaioğlu
- Department of Computer Engineering, Iskenderun Technical University, Hatay 31200, Turkey
- Faculty of Medicine, Institute for Computational Biomedicine, Heidelberg University and Heidelberg University Hospital, Heidelberg 69120, Germany
Maria J Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, Hinxton CB10 1SD, United Kingdom
Rengül Çetin Atalay
- Faculty of Pulmonary and Critical Care Medicine, the University of Chicago, Chicago, IL, 60637, United States
Aybar C Acar
- Cancer Systems Biology Laboratory (Kansil), Middle East Technical University, Ankara 06800, Turkey
Tunca Doğan
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, Hinxton CB10 1SD, United Kingdom
- Department of Computer Engineering, Hacettepe University, Ankara 06800, Turkey
Volkan Atalay
- Department of Computer Engineering, Middle East Technical University, Ankara 06800, Turkey
5
Aleksander SA, Balhoff J, Carbon S, Cherry JM, Drabkin HJ, Ebert D, Feuermann M, Gaudet P, Harris NL, Hill DP, Lee R, Mi H, Moxon S, Mungall CJ, Muruganugan A, Mushayahama T, Sternberg PW, Thomas PD, Van Auken K, Ramsey J, Siegele DA, Chisholm RL, Fey P, Aspromonte MC, Nugnes MV, Quaglia F, Tosatto S, Giglio M, Nadendla S, Antonazzo G, Attrill H, Dos Santos G, Marygold S, Strelets V, Tabone CJ, Thurmond J, Zhou P, Ahmed SH, Asanitthong P, Luna Buitrago D, Erdol MN, Gage MC, Ali Kadhum M, Li KYC, Long M, Michalak A, Pesala A, Pritazahra A, Saverimuttu SCC, Su R, Thurlow KE, Lovering RC, Logie C, Oliferenko S, Blake J, Christie K, Corbani L, Dolan ME, Drabkin HJ, Hill DP, Ni L, Sitnikov D, Smith C, Cuzick A, Seager J, Cooper L, Elser J, Jaiswal P, Gupta P, Jaiswal P, Naithani S, Lera-Ramirez M, Rutherford K, Wood V, De Pons JL, Dwinell MR, Hayman GT, Kaldunski ML, Kwitek AE, Laulederkind SJF, Tutaj MA, Vedi M, Wang SJ, D'Eustachio P, Aimo L, Axelsen K, Bridge A, Hyka-Nouspikel N, Morgat A, Aleksander SA, Cherry JM, Engel SR, Karra K, Miyasato SR, Nash RS, Skrzypek MS, Weng S, Wong ED, Bakker E, Berardini TZ, Reiser L, Auchincloss A, Axelsen K, Argoud-Puy G, Blatter MC, Boutet E, Breuza L, Bridge A, Casals-Casas C, Coudert E, Estreicher A, Livia Famiglietti M, Feuermann M, Gos A, Gruaz-Gumowski N, Hulo C, Hyka-Nouspikel N, Jungo F, Le Mercier P, Lieberherr D, Masson P, Morgat A, Pedruzzi I, Pourcel L, Poux S, Rivoire C, Sundaram S, Bateman A, Bowler-Barnett E, Bye-A-Jee H, Denny P, Ignatchenko A, Ishtiaq R, Lock A, Lussi Y, Magrane M, Martin MJ, Orchard S, Raposo P, Speretta E, Tyagi N, Warner K, Zaru R, Diehl AD, Lee R, Chan J, Diamantakis S, Raciti D, Zarowiecki M, Fisher M, James-Zorn C, Ponferrada V, Zorn A, Ramachandran S, Ruzicka L, Westerfield M. The Gene Ontology knowledgebase in 2023. Genetics 2023; 224:iyad031. 
[PMID: 36866529 PMCID: PMC10158837 DOI: 10.1093/genetics/iyad031] [Citation(s) in RCA: 218] [Impact Index Per Article: 218.0] [Received: 12/13/2022] [Revised: 02/10/2023] [Accepted: 02/11/2023] [Indexed: 03/04/2023] Open
Abstract
The Gene Ontology (GO) knowledgebase (http://geneontology.org) is a comprehensive resource concerning the functions of genes and gene products (proteins and noncoding RNAs). GO annotations cover genes from organisms across the tree of life as well as viruses, though most gene function knowledge currently derives from experiments carried out in a relatively small number of model organisms. Here, we provide an updated overview of the GO knowledgebase, as well as the efforts of the broad, international consortium of scientists that develops, maintains, and updates the GO knowledgebase. The GO knowledgebase consists of three components: (1) the GO, a computational knowledge structure describing the functional characteristics of genes; (2) GO annotations, evidence-supported statements asserting that a specific gene product has a particular functional characteristic; and (3) GO Causal Activity Models (GO-CAMs), mechanistic models of molecular "pathways" (GO biological processes) created by linking multiple GO annotations using defined relations. Each of these components is continually expanded, revised, and updated in response to newly published discoveries and receives extensive QA checks, reviews, and user feedback. For each of these components, we provide a description of the current contents, recent developments to keep the knowledgebase up to date with new discoveries, and guidance on how users can best make use of the data that we provide. We conclude with future directions for the project.
6
Lussi YC, Magrane M, Martin MJ, Orchard S. Searching and Navigating UniProt Databases. Curr Protoc 2023; 3:e700. [PMID: 36912607 DOI: 10.1002/cpz1.700] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/12/2023]
Abstract
The Universal Protein Resource (UniProt) is a comprehensive resource for protein sequence and annotation data. The UniProt website receives about 800,000 unique visitors per month and is the primary means to access UniProt. It provides 10 searchable datasets and four main tools. The key UniProt datasets are the UniProt Knowledgebase (UniProtKB), the UniProt Reference Clusters (UniRef), the UniProt Archive (UniParc), and protein sets for completely sequenced genomes (Proteomes). Other supporting datasets include information about proteins that is present in UniProtKB protein entries, such as literature citations, taxonomy, and subcellular locations, among others. This article focuses on how to use UniProt datasets. The first basic protocol describes navigation and searching mechanisms for the UniProt datasets, and two additional protocols build on the first protocol to describe advanced search and query building. © 2023 The Authors. Current Protocols published by Wiley Periodicals LLC. Basic Protocol 1: Searching UniProt datasets. Basic Protocol 2: Advanced search and query building. Basic Protocol 3: Adding parameters using advanced search.
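As a rough illustration of the query building these protocols cover, the sketch below assembles a fielded search string and a programmatic search URL. The abstract itself only discusses the website, so several things here are assumptions for illustration: the REST endpoint `https://rest.uniprot.org/uniprotkb/search`, the query fields `gene`, `organism_id`, and `reviewed`, the example search, and the helper names `build_query` and `build_url`. Check them against the current UniProt documentation before relying on them. The code only builds the URL and makes no network request.

```python
from urllib.parse import urlencode

# Assumed UniProtKB search endpoint; not mentioned in the abstract.
UNIPROT_SEARCH = "https://rest.uniprot.org/uniprotkb/search"


def build_query(terms):
    """Combine (field, value) pairs with AND, one pair per advanced-search row."""
    return " AND ".join(f"({field}:{value})" for field, value in terms)


def build_url(terms, fields=("accession", "gene_names"), fmt="tsv", size=10):
    """Return a search URL for the given terms (hypothetical helper)."""
    params = {
        "query": build_query(terms),
        "fields": ",".join(fields),  # columns to return
        "format": fmt,
        "size": size,                # number of results per page
    }
    return f"{UNIPROT_SEARCH}?{urlencode(params)}"


# Example: reviewed human entries for the gene MPL.
example_terms = [("gene", "MPL"), ("organism_id", "9606"), ("reviewed", "true")]
url = build_url(example_terms)
print(url)
```

Adding another `(field, value)` pair to `example_terms` narrows the search in the same way that adding a row does in an advanced-search form.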
Affiliation(s)
Yvonne C Lussi
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom
Michele Magrane
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom
Maria J Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom
Sandra Orchard
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom
7
Estravís M, Pérez-Pazos J, Martin MJ, Ramos-González J, Gil-Melcón M, Martín-García C, García-Sánchez A, Sanz C, Dávila I. Quantitative and qualitative methods of evaluating response to biologics in severe asthma patients: Results from a real-world study. J Allergy Clin Immunol Pract 2023; 11:949-951.e2. [PMID: 36423868 DOI: 10.1016/j.jaip.2022.11.009] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/30/2022] [Revised: 10/24/2022] [Accepted: 11/04/2022] [Indexed: 11/23/2022]
Affiliation(s)
Miguel Estravís
- Grupo de investigación en Alergia, Instituto de Investigación Biomédica de Salamanca (IBSAL), Salamanca, Spain; Red de Investigación Cooperativa Orientada a Resultados en Salud-RICORS, ISCIII, Salamanca, Spain; Red Cooperativa de Investigación en Salud-RETICS ARADyAL, ISCIII, Madrid, Spain
Jacqueline Pérez-Pazos
- Grupo de investigación en Alergia, Instituto de Investigación Biomédica de Salamanca (IBSAL), Salamanca, Spain; Unidad de Farmacogenética y Medicina de Precisión, Servicio de Bioquímica Clínica, Servicio de Alergología, Hospital Universitario de Salamanca, IBSAL, Salamanca, Spain
Maria J Martin
- Grupo de investigación en Alergia, Instituto de Investigación Biomédica de Salamanca (IBSAL), Salamanca, Spain; Red de Investigación Cooperativa Orientada a Resultados en Salud-RICORS, ISCIII, Salamanca, Spain; Red Cooperativa de Investigación en Salud-RETICS ARADyAL, ISCIII, Madrid, Spain; Departamento de Bioquímica y Biología Molecular, Universidad de Salamanca, Salamanca, Spain
Jacinto Ramos-González
- Servicio de Neumología, Complejo Asistencial Universitario de Salamanca, Salamanca, Spain
María Gil-Melcón
- Servicio de Otorrinolaringología, Complejo Asistencial Universitario de Salamanca, Salamanca, Spain
Cristina Martín-García
- Servicio de Inmunoalergia, Complejo Asistencial Universitario de Salamanca, Salamanca, Spain
Asunción García-Sánchez
- Grupo de investigación en Alergia, Instituto de Investigación Biomédica de Salamanca (IBSAL), Salamanca, Spain; Red de Investigación Cooperativa Orientada a Resultados en Salud-RICORS, ISCIII, Salamanca, Spain; Red Cooperativa de Investigación en Salud-RETICS ARADyAL, ISCIII, Madrid, Spain; Departamento de Ciencias Biomédicas y del Diagnóstico, Universidad de Salamanca, Salamanca, Spain.
Catalina Sanz
- Grupo de investigación en Alergia, Instituto de Investigación Biomédica de Salamanca (IBSAL), Salamanca, Spain; Red de Investigación Cooperativa Orientada a Resultados en Salud-RICORS, ISCIII, Salamanca, Spain; Red Cooperativa de Investigación en Salud-RETICS ARADyAL, ISCIII, Madrid, Spain; Departamento de Microbiología y Genética, Universidad de Salamanca, Salamanca, Spain
| | - Ignacio Dávila
- Grupo de investigación en Alergia, Instituto de Investigación Biomédica de Salamanca (IBSAL), Salamanca, Spain; Red de Investigación Cooperativa Orientada a Resultados en Salud-RICORS, ISCIII, Salamanca, Spain; Red Cooperativa de Investigación en Salud-RETICS ARADyAL, ISCIII, Madrid, Spain; Servicio de Inmunoalergia, Complejo Asistencial Universitario de Salamanca, Salamanca, Spain; Departamento de Ciencias Biomédicas y del Diagnóstico, Universidad de Salamanca, Salamanca, Spain
| |
Collapse
|
8
|
Estravís M, García-Sánchez A, Martin MJ, Pérez-Pazos J, Isidoro-García M, Dávila I, Sanz C. RNY3 modulates cell proliferation and IL13 mRNA levels in a T lymphocyte model: a possible new epigenetic mechanism of IL-13 regulation. J Physiol Biochem 2023; 79:59-69. [PMID: 36089628 PMCID: PMC9905197 DOI: 10.1007/s13105-022-00920-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2022] [Accepted: 08/29/2022] [Indexed: 11/30/2022]
Abstract
Allergic asthma is the most common type of asthma. It is characterized by TH2 cell-driven inflammation in which interleukin-13 (IL-13) plays a pivotal role. Cytoplasmic RNAs (Y-RNAs), a variety of non-coding RNAs that are dysregulated in many cancer types, are also differentially expressed in patients with allergic asthma. Their function in the development of the disease is still unknown. We investigated the potential role of RNY3 RNA (hY3) in the TH2 cell inflammatory response using the Jurkat cell line as a model. hY3 expression levels were modulated to mimic the upregulation observed in allergic disease. We evaluated the effect of hY3 on cell stimulation and on the expression of the TH2 cytokine IL13. Total RNA was isolated and reverse-transcribed, and RNA levels were assessed by qPCR. In Jurkat cells, hY3 levels increased upon stimulation with phorbol 12-myristate 13-acetate (PMA) and ionomycin. When cells were transfected with high levels of hY3 mimic molecules, the cell proliferation rate decreased while IL13 mRNA levels increased upon stimulation compared to stimulated control cells. Our results show the effect of increased hY3 levels on cell proliferation and the levels of IL13 mRNA in Jurkat cells. Also, we showed that hY3 could act on other cells via exosomes. This study opens up new ways to study the potential regulatory function of hY3 in IL-13 production and its implications for asthma development.
Affiliation(s)
- Miguel Estravís
- Instituto de Investigación Biomédica de Salamanca (IBSAL), Salamanca, Spain
- Red Cooperativa de Investigación en Salud-RETICS ARADyAL, ISCIII, Madrid, Spain
- Asunción García-Sánchez
- Instituto de Investigación Biomédica de Salamanca (IBSAL), Salamanca, Spain
- Red Cooperativa de Investigación en Salud-RETICS ARADyAL, ISCIII, Madrid, Spain
- Departamento de Ciencias Biomédicas y del Diagnóstico, Universidad de Salamanca, Salamanca, Spain
- Maria J Martin
- Instituto de Investigación Biomédica de Salamanca (IBSAL), Salamanca, Spain
- Red Cooperativa de Investigación en Salud-RETICS ARADyAL, ISCIII, Madrid, Spain
- Departamento de Bioquímica y Biología Molecular, Universidad de Salamanca, Salamanca, Spain
- Jacqueline Pérez-Pazos
- Unidad de Farmacogenética y Medicina de Precisión, Servicio de Bioquímica Clínica, Servicio de Alergología, Hospital Universitario de Salamanca, IBSAL, Salamanca, Spain
- María Isidoro-García
- Instituto de Investigación Biomédica de Salamanca (IBSAL), Salamanca, Spain
- Red Cooperativa de Investigación en Salud-RETICS ARADyAL, ISCIII, Madrid, Spain
- Servicio de Bioquímica Clínica, Complejo Asistencial Universitario de Salamanca, Salamanca, Spain
- Departamento de Medicina, Universidad de Salamanca, Salamanca, Spain
- Ignacio Dávila
- Instituto de Investigación Biomédica de Salamanca (IBSAL), Salamanca, Spain
- Red Cooperativa de Investigación en Salud-RETICS ARADyAL, ISCIII, Madrid, Spain
- Departamento de Ciencias Biomédicas y del Diagnóstico, Universidad de Salamanca, Salamanca, Spain
- Servicio de Inmunoalergia, Complejo Asistencial Universitario de Salamanca, Salamanca, Spain
- Catalina Sanz
- Instituto de Investigación Biomédica de Salamanca (IBSAL), Salamanca, Spain
- Red Cooperativa de Investigación en Salud-RETICS ARADyAL, ISCIII, Madrid, Spain
- Departamento de Microbiología y Genética, Universidad de Salamanca, Salamanca, Spain

9
Valenti-Quiroga M, Daunis-I-Estadella P, Emiliano P, Valero F, Martin MJ. NOM fractionation by HPSEC-DAD-OCD for predicting trihalomethane disinfection by-product formation potential in full-scale drinking water treatment plants. Water Res 2022; 227:119314. [PMID: 36351350 DOI: 10.1016/j.watres.2022.119314] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2022] [Revised: 10/04/2022] [Accepted: 10/29/2022] [Indexed: 06/16/2023]
Abstract
Chlorination is a common method for water disinfection; however, it leads to the formation of disinfection by-products (DBPs), which are undesirable toxic pollutants. To prevent their formation, it is crucial to understand the reactivity of natural organic matter (NOM), which is considered a dominant precursor of DBPs. We propose a novel size exclusion chromatography (SEC) approach to evaluate NOM reactivity and the formation potentials of total trihalomethanes (tTHMs-FP) and four regulated species (i.e. CHCl3, CHBrCl2, CHBr2Cl, and CHBr3). This method combines enhanced SEC separation, using two analytical columns working in tandem, with quantification of apparent molecular weight (AMW) NOM fractions using C content (organic carbon detector, OCD), 254-nm spectroscopic (diode-array detector, DAD) measurements, and spectral slopes at low (S206-240) and high (S350-380) wavelengths. Links between THMs-FP and NOM fractions from high performance size exclusion chromatography (HPSEC-DAD-OCD) were investigated using statistical modelling with multiple linear regressions for samples taken alongside conventional full-scale as well as full- and pilot-scale electrodialysis reversal and bench-scale ion exchange resins. The proposed models revealed promising correlations between the AMW NOM fractions and the THMs-FP. Methodological changes increased fractionated signal correlations relative to bulk regressions, especially in the proposed HPSEC-DAD-OCD method. Furthermore, spectroscopic models based on fractionated signals are presented, providing a promising approach to predict THMs-FP while simultaneously considering the effect of the dominant THMs precursors, NOM and Br-.
Affiliation(s)
- Meritxell Valenti-Quiroga
- LEQUIA. Institute of the Environment, Universitat de Girona, Carrer Maria Aurèlia Capmany, 69, Girona E-17003, Spain
- Pepus Daunis-I-Estadella
- Department of Computer Science, Applied Mathematics and Statistics, Universitat de Girona, Carrer Universitat de Girona, 6, Girona E-17003, Spain
- Pere Emiliano
- Ens d'Abastament d'Aigua Ter-Llobregat (ATL), Sant Martí de l'Erm 2, E-08970 Sant Joan Despí, Barcelona, Spain
- Fernando Valero
- Ens d'Abastament d'Aigua Ter-Llobregat (ATL), Sant Martí de l'Erm 2, E-08970 Sant Joan Despí, Barcelona, Spain
- Maria J Martin
- LEQUIA. Institute of the Environment, Universitat de Girona, Carrer Maria Aurèlia Capmany, 69, Girona E-17003, Spain

10
Merino GA, Saidi R, Milone DH, Stegmayer G, Martin MJ. Hierarchical deep learning for predicting GO annotations by integrating protein knowledge. Bioinformatics 2022; 38:4488-4496. [PMID: 35929781 PMCID: PMC9524999 DOI: 10.1093/bioinformatics/btac536] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2021] [Revised: 07/18/2022] [Indexed: 12/24/2022] Open
Abstract
MOTIVATION: Experimental testing and manual curation are the most precise ways of assigning Gene Ontology (GO) terms describing protein functions. However, they are expensive, time-consuming and cannot cope with the exponential growth of data generated by high-throughput sequencing methods. Hence, researchers need reliable computational systems to help fill the gap with automatic function prediction. The results of the last Critical Assessment of Function Annotation challenge revealed that GO-term prediction remains a very challenging task. Recent developments in deep learning are pushing the frontiers of protein research, leading to new knowledge thanks to the integration of data from multiple sources. However, the deep models developed so far for functional prediction focus mainly on sequence data and have not yet achieved breakthrough performance. RESULTS: We propose DeeProtGO, a novel deep-learning model for predicting GO annotations by integrating protein knowledge. DeeProtGO was trained to solve 18 different prediction problems, defined by the three GO sub-ontologies, the type of proteins, and the taxonomic kingdom. Our experiments showed higher prediction quality when more protein knowledge was integrated. We also benchmarked DeeProtGO against state-of-the-art methods on public datasets, and showed it can effectively improve the prediction of GO annotations. AVAILABILITY AND IMPLEMENTATION: DeeProtGO and a use case are available at https://github.com/gamerino/DeeProtGO. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Rabie Saidi
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge CB10 1SD, UK
- Diego H Milone
- Research Institute for Signals, Systems and Computational Intelligence (sinc(i)), FICH-UNL, CONICET, Ciudad Universitaria UNL, Santa Fe 3000, Argentina
- Georgina Stegmayer
- Research Institute for Signals, Systems and Computational Intelligence (sinc(i)), FICH-UNL, CONICET, Ciudad Universitaria UNL, Santa Fe 3000, Argentina
- Maria J Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge CB10 1SD, UK

11
Nevers Y, Jones TEM, Jyothi D, Yates B, Ferret M, Portell-Silva L, Codo L, Cosentino S, Marcet-Houben M, Vlasova A, Poidevin L, Kress A, Hickman M, Persson E, Piližota I, Guijarro-Clarke C, Iwasaki W, Lecompte O, Sonnhammer E, Roos DS, Gabaldón T, Thybert D, Thomas PD, Hu Y, Emms DM, Bruford E, Capella-Gutierrez S, Martin MJ, Dessimoz C, Altenhoff A. The Quest for Orthologs orthology benchmark service in 2022. Nucleic Acids Res 2022; 50:W623-W632. [PMID: 35552456 PMCID: PMC9252809 DOI: 10.1093/nar/gkac330] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 02/23/2022] [Revised: 04/07/2022] [Accepted: 04/30/2022] [Indexed: 11/15/2022] Open
Abstract
The Orthology Benchmark Service (https://orthology.benchmarkservice.org) is the gold standard for orthology inference evaluation, supported and maintained by the Quest for Orthologs consortium. It is an essential resource to compare existing and new methods of orthology inference (the bedrock for many comparative genomics and phylogenetic analysis) over a standard dataset and through common procedures. The Quest for Orthologs Consortium is dedicated to maintaining the resource up to date, through regular updates of the Reference Proteomes and increasingly accessible data through the OpenEBench platform. For this update, we have added a new benchmark based on curated orthology assertion from the Vertebrate Gene Nomenclature Committee, and provided an example meta-analysis of the public predictions present on the platform.
Affiliation(s)
- Yannis Nevers
- To whom correspondence should be addressed. Tel: +41 21 692 5449;
- Tamsin E M Jones
- HUGO Gene Nomenclature Committee (HGNC), European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
- Dushyanth Jyothi
- Protein Function development, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
- Bethan Yates
- HUGO Gene Nomenclature Committee (HGNC), European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
- Meritxell Ferret
- Barcelona Supercomputing Centre (BSC-CNS). Plaça Eusebi Güell, 1-3 08034 Barcelona, Spain
- Laura Portell-Silva
- Barcelona Supercomputing Centre (BSC-CNS). Plaça Eusebi Güell, 1-3 08034 Barcelona, Spain
- Laia Codo
- Barcelona Supercomputing Centre (BSC-CNS). Plaça Eusebi Güell, 1-3 08034 Barcelona, Spain
- Salvatore Cosentino
- Department of Biological Sciences, Graduate School of Science, the University of Tokyo, Tokyo, Japan
- Marina Marcet-Houben
- Barcelona Supercomputing Centre (BSC-CNS). Plaça Eusebi Güell, 1-3 08034 Barcelona, Spain; Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, 10, 08028 Barcelona, Spain
- Anna Vlasova
- Barcelona Supercomputing Centre (BSC-CNS). Plaça Eusebi Güell, 1-3 08034 Barcelona, Spain; Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, 10, 08028 Barcelona, Spain
- Laetitia Poidevin
- Department of Computer Science, ICube, UMR 7357, Centre de Recherche en Biomédecine de Strasbourg, University of Strasbourg, CNRS, Strasbourg, France; BiGEst-ICube Platform, ICube, UMR 7357, Centre de Recherche en Biomédecine de Strasbourg, University of Strasbourg, CNRS, Strasbourg, France
- Arnaud Kress
- Department of Computer Science, ICube, UMR 7357, Centre de Recherche en Biomédecine de Strasbourg, University of Strasbourg, CNRS, Strasbourg, France; BiGEst-ICube Platform, ICube, UMR 7357, Centre de Recherche en Biomédecine de Strasbourg, University of Strasbourg, CNRS, Strasbourg, France
- Mark Hickman
- Department of Biology, University of Pennsylvania, Philadelphia, PA 19104, USA
- Emma Persson
- Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, Solna, Sweden
- Ivana Piližota
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
- Cristina Guijarro-Clarke
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
- Wataru Iwasaki
- Department of Biological Sciences, Graduate School of Science, the University of Tokyo, Tokyo, Japan; Department of Integrated Biosciences, Graduate School of Frontier Sciences, the University of Tokyo, Kashiwa, Japan
- Odile Lecompte
- Department of Computer Science, ICube, UMR 7357, Centre de Recherche en Biomédecine de Strasbourg, University of Strasbourg, CNRS, Strasbourg, France
- Erik Sonnhammer
- Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, Solna, Sweden
- David S Roos
- Department of Biology, University of Pennsylvania, Philadelphia, PA 19104, USA
- Toni Gabaldón
- Barcelona Supercomputing Centre (BSC-CNS). Plaça Eusebi Güell, 1-3 08034 Barcelona, Spain; Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, 10, 08028 Barcelona, Spain; Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, Spain; Centro de Investigaciones Biomédicas en Red de Enfermedades Infecciosas, Barcelona, Spain
- David Thybert
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
- Paul D Thomas
- Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA 90032, USA
- Yanhui Hu
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Harvard University, Boston, MA 02115, USA
- David M Emms
- Department of Plant Sciences, University of Oxford, Oxford OX1 3RB, UK
- Elspeth Bruford
- HUGO Gene Nomenclature Committee (HGNC), European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK; Department of Haematology, University of Cambridge School of Clinical Medicine, Cambridge, UK
- Maria J Martin
- Protein Function development, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
- Christophe Dessimoz
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland; Swiss Institute for Bioinformatics, University of Lausanne, Lausanne, Switzerland; Department of Computer Science, University College London, London, UK; Centre for Life's Origins and Evolution, Department of Genetics, Evolution and Environment, University College London, London, UK
- Adrian Altenhoff
- Swiss Institute for Bioinformatics, University of Lausanne, Lausanne, Switzerland; Computer Science Department, ETH Zurich, Zurich, Switzerland

12
Kalyuzhnyy A, Eyers PA, Eyers CE, Bowler-Barnett E, Martin MJ, Sun Z, Deutsch EW, Jones AR. Profiling the Human Phosphoproteome to Estimate the True Extent of Protein Phosphorylation. J Proteome Res 2022; 21:1510-1524. [PMID: 35532924 PMCID: PMC9171898 DOI: 10.1021/acs.jproteome.2c00131] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Indexed: 12/11/2022]
Abstract
Public phosphorylation databases such as PhosphoSitePlus (PSP) and PeptideAtlas (PA) compile results from published papers or openly available mass spectrometry (MS) data. However, there is no database-level control for false discovery of sites, likely leading to the overestimation of true phosphosites. By profiling the human phosphoproteome, we estimate the false discovery rate (FDR) of phosphosites and predict a more realistic count of true identifications. We rank sites into phosphorylation likelihood sets and analyze them in terms of conservation across 100 species, sequence properties, and functional annotations. We demonstrate significant differences between the sets and develop a method for independent phosphosite FDR estimation. Remarkably, we report estimated FDRs of 84, 98, and 82% within sets of phosphoserine (pSer), phosphothreonine (pThr), and phosphotyrosine (pTyr) sites, respectively, that are supported by only a single piece of identification evidence, a category that covers the majority of sites in PSP. We estimate that around 62 000 Ser, 8000 Thr, and 12 000 Tyr phosphosites in the human proteome are likely to be true, which is lower than most published estimates. Furthermore, our analysis estimates that 86 000 Ser, 50 000 Thr, and 26 000 Tyr phosphosites are likely false-positive identifications, highlighting the significant potential of false-positive data to be present in phosphorylation databases.
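As a rough arithmetic check on the totals quoted in this abstract, the sketch below computes, per residue, the share of reported sites that the estimates would class as false positives. The dictionary and function names are illustrative assumptions, and this aggregate share is simple arithmetic on the quoted counts, not the set-level FDR method developed in the paper.

```python
# Counts quoted in the abstract: estimated true and false-positive phosphosites.
ESTIMATES = {
    "Ser": {"true": 62_000, "false": 86_000},
    "Thr": {"true": 8_000, "false": 50_000},
    "Tyr": {"true": 12_000, "false": 26_000},
}


def false_share(counts):
    """Return false / (true + false) for one residue type."""
    total = counts["true"] + counts["false"]
    return counts["false"] / total


def summarize(estimates):
    """Map each residue to its implied false-positive share, rounded to 3 places."""
    return {res: round(false_share(c), 3) for res, c in estimates.items()}


if __name__ == "__main__":
    for residue, share in summarize(ESTIMATES).items():
        print(f"{residue}: {share:.1%} of reported sites estimated false")
```

These aggregate shares are lower than the 84, 98, and 82% FDRs reported for single-evidence sites, which is expected because the totals also include better-supported sites.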
Affiliation(s)
- Anton Kalyuzhnyy
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7BE, U.K.; Computational Biology Facility, Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7BE, U.K.
- Patrick A Eyers
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7BE, U.K.
- Claire E Eyers
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7BE, U.K.; Centre for Proteome Research, Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7BE, U.K.
- Emily Bowler-Barnett
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge CB10 1SD, U.K.
- Maria J Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge CB10 1SD, U.K.
- Zhi Sun
- Institute for Systems Biology, Seattle, Washington 98109, United States
- Eric W Deutsch
- Institute for Systems Biology, Seattle, Washington 98109, United States
- Andrew R Jones
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7BE, U.K.; Computational Biology Facility, Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7BE, U.K.

13
Ponte-Fernandez C, Gonzalez-Dominguez J, Carvajal-Rodriguez A, Martin MJ. Evaluation of Existing Methods for High-Order Epistasis Detection. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:912-926. [PMID: 33055017 DOI: 10.1109/tcbb.2020.3030312] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 06/11/2023]
Abstract
Finding epistatic interactions among loci when expressing a phenotype is a widely employed strategy to understand the genetic architecture of complex traits in GWAS. The abundance of methods dedicated to the same purpose, however, makes it increasingly difficult for scientists to decide which method is more suitable for their studies. This work compares the different epistasis detection methods published during the last decade in terms of runtime, detection power and type I error rate, with a special emphasis on high-order interactions. Results show that in terms of detection power, the only methods that perform well across all experiments are the exhaustive methods, although their computational cost may be prohibitive in large-scale studies. Regarding non-exhaustive methods, not one could consistently find epistasis interactions when marginal effects are absent. If marginal effects are present, there are methods that perform well for high-order interactions, such as BADTrees, FDHE-IW, SingleMI or SNPHarvester. As for false-positive control, only SNPHarvester, FDHE-IW and DCHE show good results. The study concludes that there is no single epistasis detection method to recommend in all scenarios. Authors should prioritize exhaustive methods when sufficient computational resources are available considering the data set size, and resort to non-exhaustive methods when the analysis time is prohibitive.
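To illustrate why exhaustive methods may become prohibitive, the sketch below counts how many SNP combinations an exhaustive search of a given interaction order would need to examine. The function name and the example SNP counts are assumptions chosen for illustration, not figures from the study.

```python
from math import comb


def exhaustive_tests(num_snps: int, order: int) -> int:
    """Number of distinct SNP combinations of the given interaction order."""
    if order < 1 or num_snps < order:
        raise ValueError("order must be between 1 and num_snps")
    return comb(num_snps, order)


if __name__ == "__main__":
    # Hypothetical data set sizes; real GWAS panels can be far larger.
    for num_snps in (100, 1_000, 10_000):
        for order in (2, 3, 4):
            count = exhaustive_tests(num_snps, order)
            print(f"{num_snps} SNPs, order {order}: {count:,} combinations")
```

The count grows roughly as the number of SNPs raised to the interaction order, which is consistent with the abstract's advice to reserve exhaustive methods for cases where computational resources match the data set size.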

14
Lovering RC, Gaudet P, Acencio ML, Ignatchenko A, Jolma A, Fornes O, Kuiper M, Kulakovskiy IV, Lægreid A, Martin MJ, Logie C. A GO catalogue of human DNA-binding transcription factors. Biochim Biophys Acta Gene Regul Mech 2021; 1864:194765. [PMID: 34673265 DOI: 10.1016/j.bbagrm.2021.194765] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 12/28/2020] [Revised: 10/08/2021] [Accepted: 10/09/2021] [Indexed: 12/27/2022]
Abstract
To control gene transcription, DNA-binding transcription factors recognise specific sequence motifs in gene regulatory regions. A complete and reliable GO annotation of all DNA-binding transcription factors is key to investigating the delicate balance of gene regulation in response to environmental and developmental stimuli. The need for such information is demonstrated by the many lists of transcription factors that have been produced over the past decade. The COST Action Gene Regulation Ensemble Effort for the Knowledge Commons (GREEKC) Consortium brought together experts in the field of transcription with the aim of providing high quality and interoperable gene regulatory data. The Gene Ontology (GO) Consortium provides strict definitions for gene product function, including factors that regulate transcription. The collaboration between the GREEKC and GO Consortia has enabled the application of those definitions to produce a new curated catalogue of over 1400 human DNA-binding transcription factors, that can be accessed at https://www.ebi.ac.uk/QuickGO/targetset/dbTF. This catalogue has facilitated an improvement in the GO annotation of human DNA-binding transcription factors and led to the GO annotation of almost sixty thousand DNA-binding transcription factors in over a hundred species. Thus, this work will aid researchers investigating the regulation of transcription in both biomedical and basic science.
Affiliation(s)
- Ruth C Lovering
- Functional Gene Annotation, Preclinical and Fundamental Science, UCL Institute of Cardiovascular Science, University College London, London WC1E 6BT, United Kingdom.
- Pascale Gaudet
- Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, 1 Rue Michel-Servet, 1211 Geneve 4, Switzerland.
- Marcio L Acencio
- Department of Clinical and Molecular Medicine, Norwegian University of Science and Technology, Trondheim NO-7491, Norway.
- Alex Ignatchenko
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom.
- Arttu Jolma
- Donnelly Centre, University of Toronto, Toronto, Ontario M5S 3E1, Canada.
- Oriol Fornes
- Centre for Molecular Medicine and Therapeutics, Department of Medical Genetics, BC Children's Hospital Research Institute, University of British Columbia, 950 W 28th Ave, Vancouver, British Columbia V5Z 4H4, Canada.
- Martin Kuiper
- Department of Biology, Norwegian University of Science and Technology, Trondheim NO-7491, Norway.
- Ivan V Kulakovskiy
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, 119991, Russia; Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow 119991, Russia.
- Astrid Lægreid
- Department of Clinical and Molecular Medicine, Norwegian University of Science and Technology, Trondheim NO-7491, Norway.
- Maria J Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom.
- Colin Logie
- Molecular Biology Department, Faculty of Science, Radboud University, PO Box 9101, 6500HB Nijmegen, the Netherlands.

15
Zaru R, Onwubiko J, Ribeiro AJM, Cochrane K, Tyzack JD, Muthukrishnan V, Pravda L, Thornton JM, O'Donovan C, Velanker S, Orchard S, Leach A, Martin MJ. The Enzyme Portal: an integrative tool for enzyme information and analysis. FEBS J 2021; 289:5875-5890. [PMID: 34437766 DOI: 10.1111/febs.16168] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/11/2021] [Revised: 08/10/2021] [Accepted: 08/25/2021] [Indexed: 12/19/2022]
Abstract
Enzymes play essential roles in all life processes and are used extensively in the biomedical and biotechnological fields. However, enzyme-related information is spread across multiple resources making its retrieval time-consuming. In response to this challenge, the Enzyme Portal has been established to facilitate enzyme research, by providing a freely available hub where researchers can easily find and explore enzyme-related information. It integrates relevant enzyme data for a wide range of species from various resources such as UniProtKB, PDBe and ChEMBL. Here, we describe what type of enzyme-related data the Enzyme Portal provides, how the information is organized and, by show-casing two potential use cases, how to access and retrieve it.
Affiliation(s)
- Rossana Zaru
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
- Joseph Onwubiko
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
- Antonio J M Ribeiro
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
- Keeva Cochrane
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
- Jonathan D Tyzack
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
- Venkatesh Muthukrishnan
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
- Lukas Pravda
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
- Janet M Thornton
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
- Claire O'Donovan
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
- Sameer Velanker
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
- Sandra Orchard
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
- Andrew Leach
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
- Maria J Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK

16
Hufsky F, Lamkiewicz K, Almeida A, Aouacheria A, Arighi C, Bateman A, Baumbach J, Beerenwinkel N, Brandt C, Cacciabue M, Chuguransky S, Drechsel O, Finn RD, Fritz A, Fuchs S, Hattab G, Hauschild AC, Heider D, Hoffmann M, Hölzer M, Hoops S, Kaderali L, Kalvari I, von Kleist M, Kmiecinski R, Kühnert D, Lasso G, Libin P, List M, Löchel HF, Martin MJ, Martin R, Matschinske J, McHardy AC, Mendes P, Mistry J, Navratil V, Nawrocki EP, O’Toole ÁN, Ontiveros-Palacios N, Petrov AI, Rangel-Pineros G, Redaschi N, Reimering S, Reinert K, Reyes A, Richardson L, Robertson DL, Sadegh S, Singer JB, Theys K, Upton C, Welzel M, Williams L, Marz M. Computational strategies to combat COVID-19: useful tools to accelerate SARS-CoV-2 and coronavirus research. Brief Bioinform 2021; 22:642-663. [PMID: 33147627 PMCID: PMC7665365 DOI: 10.1093/bib/bbaa232] [Citation(s) in RCA: 81] [Impact Index Per Article: 27.0] [Received: 05/27/2020] [Revised: 07/28/2020] [Accepted: 08/26/2020] [Indexed: 12/16/2022] Open
Abstract
SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) is a novel virus of the family Coronaviridae. The virus causes the infectious disease COVID-19. The biology of coronaviruses has been studied for many years. However, bioinformatics tools designed explicitly for SARS-CoV-2 have only recently been developed as a rapid reaction to the need for fast detection, understanding and treatment of COVID-19. To control the ongoing COVID-19 pandemic, it is of utmost importance to get insight into the evolution and pathogenesis of the virus. In this review, we cover bioinformatics workflows and tools for the routine detection of SARS-CoV-2 infection, the reliable analysis of sequencing data, the tracking of the COVID-19 pandemic and evaluation of containment measures, the study of coronavirus evolution, the discovery of potential drug targets and development of therapeutic strategies. For each tool, we briefly describe its use case and how it advances research specifically for SARS-CoV-2. All tools are free to use and available online, either through web applications or public code repositories. Contact: evbc@unj-jena.de.
Affiliation(s)
- Christian Brandt
- Institute of Infectious Disease and Infection Control at Jena University Hospital, Germany
- Marco Cacciabue
- Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) working on FMDV virology at the Instituto de Agrobiotecnología y Biología Molecular (IABiMo, INTA-CONICET) and at the Departamento de Ciencias Básicas, Universidad Nacional de Luján (UNLu), Argentina
- Oliver Drechsel
- bioinformatics department at the Robert Koch-Institute, Germany
- Adrian Fritz
- Computational Biology of Infection Research group of Alice C. McHardy at the Helmholtz Centre for Infection Research, Germany
- Stephan Fuchs
- bioinformatics department at the Robert Koch-Institute, Germany
- Georges Hattab
- Bioinformatics Division at Philipps-University Marburg, Germany
- Dominik Heider
- Data Science in Biomedicine at the Philipps-University of Marburg, Germany
- Stefan Hoops
- Biocomplexity Institute and Initiative at the University of Virginia, USA
- Lars Kaderali
- Bioinformatics and head of the Institute of Bioinformatics at University Medicine Greifswald, Germany
- Max von Kleist
- bioinformatics department at the Robert Koch-Institute, Germany
- René Kmiecinski
- bioinformatics department at the Robert Koch-Institute, Germany
- Gorka Lasso
- Chandran Lab, Albert Einstein College of Medicine, USA
- Alice C McHardy
- Computational Biology of Infection Research Lab at the Helmholtz Centre for Infection Research in Braunschweig, Germany
- Pedro Mendes
- Center for Quantitative Medicine of the University of Connecticut School of Medicine, USA
- Vincent Navratil
- Bioinformatics and Systems Biology at the Rhône Alpes Bioinformatics core facility, Université de Lyon, France
- Nicole Redaschi
- Development of the Swiss-Prot group at the SIB for UniProt and SIB resources that cover viral biology (ViralZone)
- Susanne Reimering
- Computational Biology of Infection Research group of Alice C. McHardy at the Helmholtz Centre for Infection Research
- Sepideh Sadegh
- Chair of Experimental Bioinformatics at Technical University of Munich, Germany
- Joshua B Singer
- MRC-University of Glasgow Centre for Virus Research, Glasgow, Scotland, UK
- Chris Upton
- Department of Biochemistry and Microbiology, University of Victoria, Canada
- Manja Marz
- Friedrich Schiller University Jena, Germany
17
Elena-Pérez S, Heredero-Jung DH, García-Sánchez A, Estravís M, Martin MJ, Ramos-González J, Triviño JC, Isidoro-García M, Sanz C, Dávila I. Molecular Analysis of IL-5 Receptor Subunit Alpha as a Possible Pharmacogenetic Biomarker in Asthma. Front Med (Lausanne) 2021; 7:624576. [PMID: 33644088 PMCID: PMC7904892 DOI: 10.3389/fmed.2020.624576] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/31/2020] [Accepted: 12/30/2020] [Indexed: 01/17/2023]
Abstract
Background: Asthma is a heterogeneous syndrome with a broad clinical spectrum and high drug response variability. The inflammatory response in asthma involves multiple effector cells and mediator molecules. Based on asthma immunopathogenesis, precision medicine can be a promising strategy for identifying biomarkers. Biologic therapies acting on the IL-5/IL-5 receptor axis have been developed. IL-5 promotes proliferation, differentiation and activation of eosinophils by binding to the IL-5 receptor, located on the surface of eosinophils and basophils. This study aimed to investigate the expression of IL5RA in patients with several types of asthma and its expression after treatment with benralizumab, a biologic directed against IL-5 receptor subunit alpha. Methods: Sixty peripheral blood samples, 30 from healthy controls and 30 from asthmatic patients, were selected for a transcriptomic RNAseq study. Differential expression analysis was performed by statistical assessment of fold changes and P-values. A validation study of IL5RA expression was developed using qPCR in 100 controls and 187 asthmatic patients. The effect of benralizumab on IL5RA expression was evaluated in five patients by comparing expression levels between pretreatment and after 3 months of treatment. The IL5RA mRNA levels were normalized to GAPDH and TBP expression values for each sample. Calculations were made by the comparative ΔΔCt method. All procedures followed the MIQE guidelines. Results: IL5RA was one of the most differentially overexpressed coding transcripts in the peripheral blood of asthmatic patients (P = 8.63E-08 and fold change of 2.22). In the qPCR validation study, IL5RA expression levels were significantly higher in asthmatic patients than in controls (P < 0.001). Significant expression differences were present in different asthmatic types. In the biological drug study, patients treated with benralizumab showed a significant decrease in IL5RA expression and blood eosinophil counts. A notable improvement in ACT and lung function was also observed in these patients. Conclusions: These results indicate that IL5RA is overexpressed in patients with different types of asthma. It could help identify which asthmatic patients will respond more efficiently to benralizumab, moving toward a more personalized asthma management. Although further studies are required, IL5RA could play a role as a biomarker and pharmacogenetic factor in asthma.
Affiliation(s)
- Sandra Elena-Pérez
- Department of Clinical Biochemistry, University Hospital of Salamanca, Salamanca, Spain
- Asunción García-Sánchez
- Allergic Disease Research Group IIMD-01, Institute for Biomedical Research of Salamanca, Salamanca, Spain; Department of Biomedical Sciences and Diagnostics, University of Salamanca, Salamanca, Spain; Network for Cooperative Research in Health - RETICS ARADyAL, Carlos III Health Institute, Madrid, Spain
- Miguel Estravís
- Allergic Disease Research Group IIMD-01, Institute for Biomedical Research of Salamanca, Salamanca, Spain; Department of Biomedical Sciences and Diagnostics, University of Salamanca, Salamanca, Spain; Network for Cooperative Research in Health - RETICS ARADyAL, Carlos III Health Institute, Madrid, Spain
- Maria J Martin
- Allergic Disease Research Group IIMD-01, Institute for Biomedical Research of Salamanca, Salamanca, Spain; Network for Cooperative Research in Health - RETICS ARADyAL, Carlos III Health Institute, Madrid, Spain
- María Isidoro-García
- Department of Clinical Biochemistry, University Hospital of Salamanca, Salamanca, Spain; Allergic Disease Research Group IIMD-01, Institute for Biomedical Research of Salamanca, Salamanca, Spain; Network for Cooperative Research in Health - RETICS ARADyAL, Carlos III Health Institute, Madrid, Spain; Department of Medicine, University of Salamanca, Salamanca, Spain
- Catalina Sanz
- Allergic Disease Research Group IIMD-01, Institute for Biomedical Research of Salamanca, Salamanca, Spain; Network for Cooperative Research in Health - RETICS ARADyAL, Carlos III Health Institute, Madrid, Spain; Department of Microbiology and Genetics, University of Salamanca, Salamanca, Spain
- Ignacio Dávila
- Allergic Disease Research Group IIMD-01, Institute for Biomedical Research of Salamanca, Salamanca, Spain; Department of Biomedical Sciences and Diagnostics, University of Salamanca, Salamanca, Spain; Network for Cooperative Research in Health - RETICS ARADyAL, Carlos III Health Institute, Madrid, Spain; Department of Allergy, University Hospital of Salamanca, Salamanca, Spain
18
Martin MJ, Garcia-Sanchez A, Estravis M, Gil-Melcón M, Isidoro-Garcia M, Sanz C, Davila I. Genetics and Epigenetics of Nasal Polyposis: A Systematic Review. J Investig Allergol Clin Immunol 2021; 31:196-211. [PMID: 33502318 DOI: 10.18176/jiaci.0673] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/20/2022]
Abstract
Chronic rhinosinusitis (CRS) is an inflammatory disease of the nose and paranasal sinuses that is often associated with nasal polyposis (CRSwNP) in the most severe cases. As in other complex diseases, genetic factors are thought to play an important role in the risk and development of the disease. Environment may also modulate the epigenetic signature in affected patients. In the present systematic review, we aimed to compile all published data on genetic and epigenetic variations in CRSwNP since 2000. We found 104 articles, 24 of which were related to epigenetic studies. We identified more than 150 genetic variants in 99 genes involved in the pathogenesis of nasal polyposis. These were clustered into 8 main networks, linking genes involved in inflammation and immune response (eg, MHC), cytokine genes (eg, TNF), leukotriene metabolism, and the extracellular matrix. A total of 89 miRNAs were also identified; these are associated mainly with biological functions such as the cell cycle, inflammation, and the immune response. We propose a potential relationship between genes and the miRNAs identified that may open new lines of investigation. An in-depth knowledge of gene variants and epigenetic traits could help us to design more tailored treatment for patients with CRSwNP.
Affiliation(s)
- M J Martin
- IBSAL, Institute of Biomedical Research of Salamanca, Salamanca, Spain; Network for Cooperative Research in Health-RETICS ARADyAL, Salamanca, Spain; Department of Biochemistry and Molecular Biology, University of Salamanca, Salamanca, Spain
- A Garcia-Sanchez
- IBSAL, Institute of Biomedical Research of Salamanca, Salamanca, Spain; Network for Cooperative Research in Health-RETICS ARADyAL, Salamanca, Spain; Department of Biomedical and Diagnostics Sciences, University of Salamanca, Salamanca, Spain
- M Estravis
- IBSAL, Institute of Biomedical Research of Salamanca, Salamanca, Spain; Network for Cooperative Research in Health-RETICS ARADyAL, Salamanca, Spain; Department of Biomedical and Diagnostics Sciences, University of Salamanca, Salamanca, Spain
- M Gil-Melcón
- Department of Otorhinolaryngology/Servicio de Otorrinolaringología, Hospital Universitario de Salamanca, Salamanca, Spain
- M Isidoro-Garcia
- IBSAL, Institute of Biomedical Research of Salamanca, Salamanca, Spain; Network for Cooperative Research in Health-RETICS ARADyAL, Salamanca, Spain; Department of Clinical Biochemistry/Servicio de Bioquímica Clínica, Hospital Universitario de Salamanca, Salamanca, Spain; Department of Medicine, University of Salamanca, Salamanca, Spain
- C Sanz
- IBSAL, Institute of Biomedical Research of Salamanca, Salamanca, Spain; Network for Cooperative Research in Health-RETICS ARADyAL, Salamanca, Spain; Department of Microbiology and Genetics, University of Salamanca, Salamanca, Spain
- I Davila
- IBSAL, Institute of Biomedical Research of Salamanca, Salamanca, Spain; Network for Cooperative Research in Health-RETICS ARADyAL, Salamanca, Spain; Department of Biomedical and Diagnostics Sciences, University of Salamanca, Salamanca, Spain; Department of Immunoallergy/Servicio de Inmunoalergia, Hospital Universitario de Salamanca, Salamanca, Spain
19
Altenhoff AM, Garrayo-Ventas J, Cosentino S, Emms D, Glover NM, Hernández-Plaza A, Nevers Y, Sundesha V, Szklarczyk D, Fernández JM, Codó L, Quest for Orthologs Consortium, Gelpi JL, Huerta-Cepas J, Iwasaki W, Kelly S, Lecompte O, Muffato M, Martin MJ, Capella-Gutierrez S, Thomas PD, Sonnhammer E, Dessimoz C. The Quest for Orthologs benchmark service and consensus calls in 2020. Nucleic Acids Res 2020; 48:W538-W545. [PMID: 32374845 PMCID: PMC7319555 DOI: 10.1093/nar/gkaa308] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 02/20/2020] [Revised: 04/16/2020] [Accepted: 04/20/2020] [Indexed: 12/18/2022]
Abstract
The identification of orthologs—genes in different species which descended from the same gene in their last common ancestor—is a prerequisite for many analyses in comparative genomics and molecular evolution. Numerous algorithms and resources have been conceived to address this problem, but benchmarking and interpreting them is fraught with difficulties (need to compare them on a common input dataset, absence of ground truth, computational cost of calling orthologs). To address this, the Quest for Orthologs consortium maintains a reference set of proteomes and provides a web server for continuous orthology benchmarking (http://orthology.benchmarkservice.org). Furthermore, consensus ortholog calls derived from public benchmark submissions are provided on the Alliance of Genome Resources website, the joint portal of NIH-funded model organism databases.
Affiliation(s)
- Adrian M Altenhoff
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland; ETH Zurich, Department of Computer Science, Zurich, Switzerland
- Salvatore Cosentino
- Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, Japan
- David Emms
- Department of Plant Sciences, University of Oxford, South Parks Road, Oxford, UK
- Natasha M Glover
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland; Department of Computational Biology, University of Lausanne, Lausanne, Switzerland; Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
- Ana Hernández-Plaza
- Centro de Biotecnologia y Genomica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), Campus de Montegancedo-UPM, 28223, Pozuelo de Alarcón, Madrid, Spain
- Yannis Nevers
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland; Department of Computational Biology, University of Lausanne, Lausanne, Switzerland; Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland; Department of Computer Science, ICube, UMR 7357, University of Strasbourg, CNRS, Fédération de Médecine Translationnelle de Strasbourg, Strasbourg, France
- Vicky Sundesha
- Life Sciences Department, Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Damian Szklarczyk
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland; Institute of Molecular Life Sciences, University of Zurich, Winterthurerstrasse 190, Zurich, 8057, Switzerland
- José M Fernández
- Life Sciences Department, Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Laia Codó
- Life Sciences Department, Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Josep Ll Gelpi
- Life Sciences Department, Barcelona Supercomputing Center (BSC), Barcelona, Spain; Department of Biochemistry and Molecular Biomedicine. University of Barcelona. Barcelona, Spain
- Jaime Huerta-Cepas
- Centro de Biotecnologia y Genomica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), Campus de Montegancedo-UPM, 28223, Pozuelo de Alarcón, Madrid, Spain
- Wataru Iwasaki
- Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, Japan
- Steven Kelly
- Department of Plant Sciences, University of Oxford, South Parks Road, Oxford, UK
- Odile Lecompte
- Department of Computer Science, ICube, UMR 7357, University of Strasbourg, CNRS, Fédération de Médecine Translationnelle de Strasbourg, Strasbourg, France
- Matthieu Muffato
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Maria J Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Paul D Thomas
- Division of Bioinformatics, Department of Preventive Medicine, University of Southern California, Los Angeles, USA
- Erik Sonnhammer
- Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, Solna, Sweden
- Christophe Dessimoz
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland; Department of Computational Biology, University of Lausanne, Lausanne, Switzerland; Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland; Department of Genetics, Evolution & Environment, University College London, London, UK; Department of Computer Science, University College London, London, UK
20
Martin MJ, Estravís M, García-Sánchez A, Dávila I, Isidoro-García M, Sanz C. Genetics and Epigenetics of Atopic Dermatitis: An Updated Systematic Review. Genes (Basel) 2020; 11:E442. [PMID: 32325630 PMCID: PMC7231115 DOI: 10.3390/genes11040442] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Received: 03/12/2020] [Revised: 04/10/2020] [Accepted: 04/15/2020] [Indexed: 12/19/2022]
Abstract
BACKGROUND Atopic dermatitis is a common inflammatory skin disorder that affects up to 15-20% of the population and is characterized by recurrent eczematous lesions with intense itching. As a heterogeneous disease, multiple factors have been suggested to explain the nature of atopic dermatitis (AD), and its high prevalence makes it necessary to periodically compile and update the new information available. In this systematic review, the focus is on the genetic and epigenetic studies carried out in recent years. METHODS A systematic literature review was conducted in three scientific publication databases (PubMed, Cochrane Library, and Scopus). The search was restricted to publications indexed from July 2016 to December 2019, and keywords related to atopic dermatitis genetics and epigenetics were used. RESULTS A total of 73 original papers met the established inclusion criteria, including 9 epigenetic studies. A total of 62 genes and 5 intergenic regions were described as associated with AD. CONCLUSION Filaggrin (FLG) polymorphisms are confirmed as key genetic determinants for AD development, together with epigenetic regulation and other genes with functions mainly related to the immune system and extracellular matrix, reinforcing the notion of disrupted skin homeostasis in AD.
Affiliation(s)
- Maria J Martin
- Institute for Biomedical Research of Salamanca (IBSAL), 37007 Salamanca, Spain; (M.J.M.); (M.E.); (I.D.); (C.S.)
- Network for Cooperative Research in Health–RETICS ARADyAL, 37007 Salamanca, Spain
- Miguel Estravís
- Institute for Biomedical Research of Salamanca (IBSAL), 37007 Salamanca, Spain; (M.J.M.); (M.E.); (I.D.); (C.S.)
- Network for Cooperative Research in Health–RETICS ARADyAL, 37007 Salamanca, Spain
- Department of Biomedical and Diagnostics Sciences, University of Salamanca, 37007 Salamanca, Spain
- Asunción García-Sánchez
- Institute for Biomedical Research of Salamanca (IBSAL), 37007 Salamanca, Spain; (M.J.M.); (M.E.); (I.D.); (C.S.)
- Network for Cooperative Research in Health–RETICS ARADyAL, 37007 Salamanca, Spain
- Department of Biomedical and Diagnostics Sciences, University of Salamanca, 37007 Salamanca, Spain
- Ignacio Dávila
- Institute for Biomedical Research of Salamanca (IBSAL), 37007 Salamanca, Spain; (M.J.M.); (M.E.); (I.D.); (C.S.)
- Network for Cooperative Research in Health–RETICS ARADyAL, 37007 Salamanca, Spain
- Department of Immunoallergy, Salamanca University Hospital, 37007 Salamanca, Spain
- María Isidoro-García
- Institute for Biomedical Research of Salamanca (IBSAL), 37007 Salamanca, Spain; (M.J.M.); (M.E.); (I.D.); (C.S.)
- Network for Cooperative Research in Health–RETICS ARADyAL, 37007 Salamanca, Spain
- Department of Clinical Biochemistry, University Hospital of Salamanca, 37007 Salamanca, Spain
- Department of Medicine, University of Salamanca, 37007 Salamanca, Spain
- Catalina Sanz
- Institute for Biomedical Research of Salamanca (IBSAL), 37007 Salamanca, Spain; (M.J.M.); (M.E.); (I.D.); (C.S.)
- Network for Cooperative Research in Health–RETICS ARADyAL, 37007 Salamanca, Spain
- Department of Microbiology and Genetics, University of Salamanca, 37007 Salamanca, Spain
21
Zhou N, Jiang Y, Bergquist TR, Lee AJ, Kacsoh BZ, Crocker AW, Lewis KA, Georghiou G, Nguyen HN, Hamid MN, Davis L, Dogan T, Atalay V, Rifaioglu AS, Dalkıran A, Cetin Atalay R, Zhang C, Hurto RL, Freddolino PL, Zhang Y, Bhat P, Supek F, Fernández JM, Gemovic B, Perovic VR, Davidović RS, Sumonja N, Veljkovic N, Asgari E, Mofrad MRK, Profiti G, Savojardo C, Martelli PL, Casadio R, Boecker F, Schoof H, Kahanda I, Thurlby N, McHardy AC, Renaux A, Saidi R, Gough J, Freitas AA, Antczak M, Fabris F, Wass MN, Hou J, Cheng J, Wang Z, Romero AE, Paccanaro A, Yang H, Goldberg T, Zhao C, Holm L, Törönen P, Medlar AJ, Zosa E, Borukhov I, Novikov I, Wilkins A, Lichtarge O, Chi PH, Tseng WC, Linial M, Rose PW, Dessimoz C, Vidulin V, Dzeroski S, Sillitoe I, Das S, Lees JG, Jones DT, Wan C, Cozzetto D, Fa R, Torres M, Warwick Vesztrocy A, Rodriguez JM, Tress ML, Frasca M, Notaro M, Grossi G, Petrini A, Re M, Valentini G, Mesiti M, Roche DB, Reeb J, Ritchie DW, Aridhi S, Alborzi SZ, Devignes MD, Koo DCE, Bonneau R, Gligorijević V, Barot M, Fang H, Toppo S, Lavezzo E, Falda M, Berselli M, Tosatto SCE, Carraro M, Piovesan D, Ur Rehman H, Mao Q, Zhang S, Vucetic S, Black GS, Jo D, Suh E, Dayton JB, Larsen DJ, Omdahl AR, McGuffin LJ, Brackenridge DA, Babbitt PC, Yunes JM, Fontana P, Zhang F, Zhu S, You R, Zhang Z, Dai S, Yao S, Tian W, Cao R, Chandler C, Amezola M, Johnson D, Chang JM, Liao WH, Liu YW, Pascarelli S, Frank Y, Hoehndorf R, Kulmanov M, Boudellioua I, Politano G, Di Carlo S, Benso A, Hakala K, Ginter F, Mehryary F, Kaewphan S, Björne J, Moen H, Tolvanen MEE, Salakoski T, Kihara D, Jain A, Šmuc T, Altenhoff A, Ben-Hur A, Rost B, Brenner SE, Orengo CA, Jeffery CJ, Bosco G, Hogan DA, Martin MJ, O'Donovan C, Mooney SD, Greene CS, Radivojac P, Friedberg I. The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens. Genome Biol 2019; 20:244. 
[PMID: 31744546 PMCID: PMC6864930 DOI: 10.1186/s13059-019-1835-8] [Citation(s) in RCA: 166] [Impact Index Per Article: 33.2] [Received: 05/16/2019] [Accepted: 09/24/2019] [Indexed: 12/23/2022]
Abstract
BACKGROUND The Critical Assessment of Functional Annotation (CAFA) is an ongoing, global, community-driven effort to evaluate and improve the computational annotation of protein function. RESULTS Here, we report on the results of the third CAFA challenge, CAFA3, that featured an expanded analysis over the previous CAFA rounds, both in terms of volume of data analyzed and the types of analysis performed. In a novel and major new development, computational predictions and assessment goals drove some of the experimental assays, resulting in new functional annotations for more than 1000 genes. Specifically, we performed experimental whole-genome mutation screening in Candida albicans and Pseudomonas aeruginosa genomes, which provided us with genome-wide experimental data for genes associated with biofilm formation and motility. We further performed targeted assays on selected genes in Drosophila melanogaster, which we suspected of being involved in long-term memory. CONCLUSION We conclude that while predictions of the molecular function and biological process annotations have slightly improved over time, those of the cellular component have not. Term-centric prediction of experimental annotations remains equally challenging; although the performance of the top methods is significantly better than the expectations set by baseline methods in C. albicans and D. melanogaster, it leaves considerable room and need for improvement. Finally, we report that the CAFA community now involves a broad range of participants with expertise in bioinformatics, biological experimentation, biocuration, and bio-ontologies, working together to improve functional annotation, computational function prediction, and our ability to manage big data in the era of large experimental screens.
Affiliation(s)
- Naihui Zhou
- Veterinary Microbiology and Preventive Medicine, Iowa State University, Ames, IA, USA.,Program in Bioinformatics and Computational Biology, Ames, IA, USA
| | - Yuxiang Jiang
- Indiana University Bloomington, Bloomington, Indiana, USA
| | - Timothy R Bergquist
- Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, USA
| | - Alexandra J Lee
- Department of Systems Pharmacology and Translational Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
| | - Balint Z Kacsoh
- Geisel School of Medicine at Dartmouth, Hanover, NH, USA.,Department of Molecular and Systems Biology, Hanover, NH, USA
| | - Alex W Crocker
- Department of Microbiology and Immunology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
| | - Kimberley A Lewis
- Department of Microbiology and Immunology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
| | - George Georghiou
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, United Kingdom
| | - Huy N Nguyen
- Veterinary Microbiology and Preventive Medicine, Iowa State University, Ames, IA, USA.,Program in Computer Science, Ames, IA, USA
| | - Md Nafiz Hamid
- Veterinary Microbiology and Preventive Medicine, Iowa State University, Ames, IA, USA.,Program in Bioinformatics and Computational Biology, Ames, IA, USA
| | - Larry Davis
- Program in Bioinformatics and Computational Biology, Ames, IA, USA
| | - Tunca Dogan
- Department of Computer Engineering, Hacettepe University, Ankara, Turkey.,European Molecular Biolo gy Labora tory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
| | - Volkan Atalay
- Department of Computer Engineering, Middle East Technical University (METU), Ankara, Turkey
| | - Ahmet S Rifaioglu
- Department of Computer Engineering, Middle East Technical University (METU), Ankara, Turkey.,Department of Computer Engineering, Iskenderun Technical University, Hatay, Turkey
| | - Alperen Dalkıran
- Department of Computer Engineering, Middle East Technical University (METU), Ankara, Turkey
| | - Rengul Cetin Atalay
- CanSyL, Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
| | - Chengxin Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Rebecca L Hurto
- Department of Biological Chemistry, University of Michigan, Ann Arbor, MI, USA
| | - Peter L Freddolino
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA.,Department of Biological Chemistry, University of Michigan, Ann Arbor, MI, USA
| | - Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA.,Department of Biological Chemistry, University of Michigan, Ann Arbor, MI, USA
| | | | - Fran Supek
- Institute for Research in Biomedicine (IRB Barcelona), Barcelona, Spain.,Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
| | - José M Fernández
- INB Coordination Unit, Life Sciences Department, Barcelona Supercomputing Center, Barcelona, Catalonia, Spain.,(former) INB GN2, Structural and Computational Biology Programme, Spanish National Cancer Research Centre, Barcelona, Catalonia, Spain
| | - Branislava Gemovic
- Laboratory for Bioinformatics and Computational Chemistry, Institute of Nuclear Sciences VINCA, University of Belgrade, Belgrade, Serbia
| | - Vladimir R Perovic
- Laboratory for Bioinformatics and Computational Chemistry, Institute of Nuclear Sciences VINCA, University of Belgrade, Belgrade, Serbia
| | - Radoslav S Davidović
- Laboratory for Bioinformatics and Computational Chemistry, Institute of Nuclear Sciences VINCA, University of Belgrade, Belgrade, Serbia
| | - Neven Sumonja
- Laboratory for Bioinformatics and Computational Chemistry, Institute of Nuclear Sciences VINCA, University of Belgrade, Belgrade, Serbia
| | - Nevena Veljkovic
- Laboratory for Bioinformatics and Computational Chemistry, Institute of Nuclear Sciences VINCA, University of Belgrade, Belgrade, Serbia
| | - Ehsaneddin Asgari
- Molecular Cell Biomechanics Laboratory, Departments of Bioengineering, University of California Berkeley, Berkeley, CA, USA.,Computational Biology of Infection Research, Helmholtz Centre for Infection Research, Berkeley, CA, USA
| | | | - Giuseppe Profiti
- Bologna Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy.,National Research Council, IBIOM, Bologna, Italy
| | - Castrense Savojardo
- Bologna Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
| | - Pier Luigi Martelli
- Bologna Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
| | - Rita Casadio
- Bologna Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
| | - Florian Boecker
- University of Bonn: INRES Crop Bioinformatics, Bonn, North Rhine-Westphalia, Germany
| | - Heiko Schoof
- INRES Crop Bioinformatics, University of Bonn, Bonn, Germany
| | - Indika Kahanda
- Gianforte School of Computing, Montana State University, Bozeman, Montana, USA
| | - Natalie Thurlby
- University of Bristol, Computer Science, Bristol, Bristol, United Kingdom
| | - Alice C McHardy
- Computational Biology of Infection Research, Helmholtz Centre for Infection Research, Brunswick, Germany.,RESIST, DFG Cluster of Excellence 2155, Brunswick, Germany
| | - Alexandre Renaux
- Interuniversity Institute of Bioinformatics in Brussels, Université libre de Bruxelles - Vrije Universiteit Brussel, Brussels, Belgium.,Machine Learning Group, Université libre de Bruxelles, Brussels, Belgium.,Artificial Intelligence lab, Vrije Universiteit Brussel, Brussels, Belgium
| | - Rabie Saidi
- European Molecular Biolo gy Labora tory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
| | - Julian Gough
- MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
| | - Alex A Freitas
- University of Kent, School of Computing, Canterbury, United Kingdom
| | - Magdalena Antczak
- School of Biosciences, University of Kent, Canterbury, Kent, United Kingdom
| | - Fabio Fabris
- University of Kent, School of Computing, Canterbury, United Kingdom
| | - Mark N Wass
- School of Biosciences, University of Kent, Canterbury, Kent, United Kingdom
| | - Jie Hou
- University of Missouri, Computer Science, Columbia, Missouri, USA.,Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
| | - Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
| | - Zheng Wang
- University of Miami, Coral Gables, Florida, USA
| | - Alfonso E Romero
- Centre for Systems and Synthetic Biology, Department of Computer Science, Royal Holloway, University of London, Egham, Surrey, United Kingdom
| | - Alberto Paccanaro
- Centre for Systems and Synthetic Biology, Department of Computer Science, Royal Holloway, University of London, Egham, Surrey, United Kingdom
| | - Haixuan Yang
- School of Mathematics, Statistics and Applied Mathematics, National University of Ireland, Galway, Galway, Ireland.,Technical University of Munich, Garching, Germany
| | - Tatyana Goldberg
- Department of Informatics, Bioinformatics & Computational Biology-i12, Technische Universitat Munchen, Munich, Germany
| | - Chenguang Zhao
- Faculty for Informatics, Garching, Germany.,Department for Bioinformatics and Computational Biology, Garching, Germany.,School of Computing Sciences and Computer Engineering, Hattiesburg, Mississippi, USA
| | - Liisa Holm
- Institute of Biotechnology, Helsinki Institute of Life Sciences, University of Helsinki, Finland, Helsinki, Finland
| | - Petri Törönen
- Institute of Biotechnology, Helsinki Institute of Life Sciences, University of Helsinki, Finland, Helsinki, Finland
| | - Alan J Medlar
- Institute of Biotechnology, Helsinki Institute of Life Sciences, University of Helsinki, Finland, Helsinki, Finland
| | - Elaine Zosa
- Institute of Biotechnology, University of Helsinki, Helsinki, Finland
| | - Ilya Novikov
- Baylor College of Medicine, Department of Biochemistry and Molecular Biology, Houston, TX, USA
| | - Angela Wilkins
- Baylor College of Medicine, Department of Molecular and Human Genetics, Houston, TX, USA
| | - Olivier Lichtarge
- Baylor College of Medicine, Department of Molecular and Human Genetics, Houston, TX, USA
| | - Po-Han Chi
- National TsingHua University, Hsinchu, Taiwan
| | - Wei-Cheng Tseng
- Department of Electrical Engineering in National Tsing Hua University, Hsinchu City, Taiwan
| | - Michal Linial
- The Hebrew University of Jerusalem, Jerusalem, Israel
| | - Peter W Rose
- University of California San Diego, San Diego Supercomputer Center, La Jolla, California, USA
| | - Christophe Dessimoz
- Department of Computational Biology and Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland.,Department of Genetics, Evolution & Environment, and Department of Computer Science, University College London, London, UK.,Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Vedrana Vidulin
- Department of Knowledge Technologies, Jozef Stefan Institute, Ljubljana, Slovenia
| | - Saso Dzeroski
- Jozef Stefan Institute, Ljubljana, Slovenia.,Jozef Stefan International Postgraduate School, Ljubljana, Slovenia
| | - Ian Sillitoe
- Research Department of Structural and Molecular Biology, University College London, London, England
| | - Sayoni Das
- Research Department of Structural and Molecular Biology, University College London, London, United Kingdom
| | - Jonathan Gill Lees
- Research Department of Structural and Molecular Biology, University College London, London, United Kingdom.,Department of Health and Life Sciences, Oxford Brookes University, London, UK
| | - David T Jones
- The Francis Crick Institute, Biomedical Data Science Laboratory, London, United Kingdom.,Department of Genetics, Evolution and Environment, University College London, Gower Street, London, WC1E 6BT, United Kingdom
| | - Cen Wan
- Department of Computer Science, University College London, London, United Kingdom.,The Francis Crick Institute, Biomedical Data Science Laboratory, London, United Kingdom
| | - Domenico Cozzetto
- Department of Computer Science, University College London, London, United Kingdom.,The Francis Crick Institute, Biomedical Data Science Laboratory, London, United Kingdom
| | - Rui Fa
- Department of Computer Science, University College London, London, United Kingdom.,The Francis Crick Institute, Biomedical Data Science Laboratory, London, United Kingdom
| | - Mateo Torres
- Centre for Systems and Synthetic Biology, Department of Computer Science, Royal Holloway, University of London, Egham, Surrey, United Kingdom
| | - Alex Warwick Vesztrocy
- Department of Genetics, Evolution and Environment, University College London, Gower Street, London, WC1E 6BT, United Kingdom.,SIB Swiss Institute of Bioinformatics, Lausanne, 1015, Switzerland
| | - Jose Manuel Rodriguez
- Cardiovascular Proteomics Laboratory, Centro Nacional de Investigaciones Cardiovasculares Carlos III (CNIC), Madrid, Spain
| | - Michael L Tress
- Spanish National Cancer Research Centre (CNIO), Madrid, Spain
| | - Marco Frasca
- Università degli Studi di Milano - Computer Science Department - AnacletoLab, Milan, Milan, Italy
| | - Marco Notaro
- Università degli Studi di Milano - Computer Science Department - AnacletoLab, Milan, Milan, Italy
| | - Giuliano Grossi
- Università degli Studi di Milano - Computer Science Department - AnacletoLab, Milan, Milan, Italy
| | - Alessandro Petrini
- Università degli Studi di Milano - Computer Science Department - AnacletoLab, Milan, Milan, Italy
| | - Matteo Re
- Università degli Studi di Milano - Computer Science Department - AnacletoLab, Milan, Milan, Italy
| | - Giorgio Valentini
- Università degli Studi di Milano - Computer Science Department - AnacletoLab, Milan, Milan, Italy
| | - Marco Mesiti
- Università degli Studi di Milano - Computer Science Department - AnacletoLab, Milan, Milan, Italy.,Institut de Biologie Computationnelle, LIRMM, CNRS-UMR 5506, Universite de Montpellier, Montpellier, France
| | - Daniel B Roche
- Department of Informatics, Bioinformatics and Computational Biology-i12, Technische Universitat Munchen, Munich, Germany
| | - Jonas Reeb
- Department of Informatics, Bioinformatics and Computational Biology-i12, Technische Universitat Munchen, Munich, Germany
| | - David W Ritchie
- University of Lorraine, CNRS, Inria, LORIA, Nancy, 54000, France
| | - Sabeur Aridhi
- University of Lorraine, CNRS, Inria, LORIA, Nancy, 54000, France
| | - Marie-Dominique Devignes
- University of Lorraine, CNRS, Inria, LORIA, Nancy, 54000, France.,University of Lorraine, Nancy, Lorraine, France.,Inria, Nancy, France
| | - Richard Bonneau
- NYU Center for Data Science, New York, 10010, NY, USA.,Flatiron Institute, CCB, New York, 10010, NY, USA
| | - Vladimir Gligorijević
- Center for Computational Biology (CCB), Flatiron Institute, Simons Foundation, New York, New York, USA
| | - Meet Barot
- Center for Data Science, New York University, New York, 10011, NY, USA
| | - Hai Fang
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
| | - Stefano Toppo
- Department of Molecular Medicine, University of Padova, Padova, Italy
| | - Enrico Lavezzo
- Department of Molecular Medicine, University of Padova, Padova, Italy
| | - Marco Falda
- Department of Biology, University of Padova, Padova, Italy
| | - Michele Berselli
- Department of Molecular Medicine, University of Padova, Padova, Italy
| | - Silvio C E Tosatto
- CNR Institute of Neuroscience, Padova, Italy.,Department of Biomedical Sciences, University of Padua, Padova, Italy
| | - Marco Carraro
- Department of Biomedical Sciences, University of Padua, Padova, Italy
| | - Damiano Piovesan
- Department of Biomedical Sciences, University of Padua, Padova, Italy
| | - Hafeez Ur Rehman
- Department of Computer Science, National University of Computer and Emerging Sciences, Peshawar, Khyber Pakhtoonkhwa, Pakistan
| | - Qizhong Mao
- Department of Computer and Information Sciences, Temple University, Philadelphia, PA, USA.,University of California, Riverside, Philadelphia, PA, USA
| | - Shanshan Zhang
- Department of Computer and Information Sciences, Temple University, Philadelphia, PA, USA
| | - Slobodan Vucetic
- Department of Computer and Information Sciences, Temple University, Philadelphia, PA, USA
| | - Gage S Black
- Department of Biology, Brigham Young University, Provo, UT, USA.,Bioinformatics Research Group, Provo, UT, USA
| | - Dane Jo
- Department of Biology, Brigham Young University, Provo, UT, USA.,Bioinformatics Research Group, Provo, UT, USA
| | - Erica Suh
- Department of Biology, Brigham Young University, Provo, UT, USA
| | - Jonathan B Dayton
- Department of Biology, Brigham Young University, Provo, UT, USA.,Bioinformatics Research Group, Provo, UT, USA
| | - Dallas J Larsen
- Department of Biology, Brigham Young University, Provo, UT, USA.,Bioinformatics Research Group, Provo, UT, USA
| | - Ashton R Omdahl
- Department of Biology, Brigham Young University, Provo, UT, USA.,Bioinformatics Research Group, Provo, UT, USA
| | - Liam J McGuffin
- School of Biological Sciences, University of Reading, Reading, England, United Kingdom
| | - Patricia C Babbitt
- Department of Pharmaceutical Chemistry, San Francisco, CA, USA.,Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, 94158, CA, USA
| | - Jeffrey M Yunes
- UC Berkeley - UCSF Graduate Program in Bioengineering, University of California, San Francisco, 94158, CA, USA.,Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, 94158, CA, USA
| | - Paolo Fontana
- Research and Innovation Center, Edmund Mach Foundation, San Michele all'Adige, Italy
| | - Feng Zhang
- State Key Laboratory of Genetic Engineering and Collaborative Innovation Center for Genetics and Development, Fudan University, Shanghai, Shanghai, China.,Department of Biostatistics and Computational Biology, School of Life Sciences, Fudan University, Shanghai, Shanghai, China
| | - Shanfeng Zhu
- School of Computer Science and Shanghai Key Lab of Intelligent Information Processing, Fudan University, Shanghai, China.,Institute of Science and Technology for Brain-Inspired Intelligence and Shanghai Institute of Artificial Intelligence Algorithms, Fudan University, Shanghai, China.,Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence (Fudan University), Ministry of Education, Shanghai, China
| | - Ronghui You
- School of Computer Science and Shanghai Key Lab of Intelligent Information Processing, Fudan University, Shanghai, China.,Institute of Science and Technology for Brain-Inspired Intelligence and Shanghai Institute of Artificial Intelligence Algorithms, Fudan University, Shanghai, China.,Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence (Fudan University), Ministry of Education, Shanghai, China
| | - Zihan Zhang
- School of Computer Science and Shanghai Key Lab of Intelligent Information Processing, Fudan University, Shanghai, China.,Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence (Fudan University), Ministry of Education, Shanghai, China
| | - Suyang Dai
- School of Computer Science and Shanghai Key Lab of Intelligent Information Processing, Fudan University, Shanghai, China.,Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence (Fudan University), Ministry of Education, Shanghai, China
| | - Shuwei Yao
- School of Computer Science and Shanghai Key Lab of Intelligent Information Processing, Fudan University, Shanghai, China.,Institute of Science and Technology for Brain-Inspired Intelligence and Shanghai Institute of Artificial Intelligence Algorithms, Fudan University, Shanghai, China
| | - Weidong Tian
- State Key Laboratory of Genetic Engineering and Collaborative Innovation Center for Genetics and Development, Department of Biostatistics and Computational Biology, School of Life Sciences, Fudan University, Shanghai, Shanghai, China.,Department of Pediatrics, Brain Tumor Center, Division of Experimental Hematology and Cancer Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
| | - Renzhi Cao
- Department of Computer Science, Pacific Lutheran University, Tacoma, WA, USA
| | - Caleb Chandler
- Department of Computer Science, Pacific Lutheran University, Tacoma, WA, USA
| | - Miguel Amezola
- Department of Computer Science, Pacific Lutheran University, Tacoma, WA, USA
| | - Devon Johnson
- Department of Computer Science, Pacific Lutheran University, Tacoma, WA, USA
| | - Jia-Ming Chang
- Department of Computer Science, National Chengchi University, Taipei, Taiwan
| | - Wen-Hung Liao
- Department of Computer Science, National Chengchi University, Taipei, Taiwan
| | - Yi-Wei Liu
- Department of Computer Science, National Chengchi University, Taipei, Taiwan
| | - Robert Hoehndorf
- Computer, Electrical and Mathematical Sciences & Engineering Division, Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, Jeddah, Saudi Arabia
| | - Maxat Kulmanov
- Computer, Electrical and Mathematical Sciences & Engineering Division, Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, Jeddah, Saudi Arabia
| | - Imane Boudellioua
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, Thuwal, Saudi Arabia.,Computer, Electrical and Mathematical Sciences Engineering Division (CEMSE), King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
| | - Gianfranco Politano
- Control and Computer Engineering Department, Politecnico di Torino, Torino, TO, Italy
| | - Stefano Di Carlo
- Control and Computer Engineering Department, Politecnico di Torino, Torino, TO, Italy
| | - Alfredo Benso
- Control and Computer Engineering Department, Politecnico di Torino, Torino, TO, Italy
| | - Kai Hakala
- Department of Future Technologies, Turku NLP Group, University of Turku, Turku, Finland.,University of Turku Graduate School (UTUGS), Turku, Finland
| | - Filip Ginter
- Department of Future Technologies, Turku NLP Group, University of Turku, Turku, Finland.,University of Turku, Turku, Finland
| | - Farrokh Mehryary
- Department of Future Technologies, Turku NLP Group, University of Turku, Turku, Finland.,University of Turku Graduate School (UTUGS), Turku, Finland
| | - Suwisa Kaewphan
- Department of Future Technologies, Turku NLP Group, University of Turku, Turku, Finland.,University of Turku Graduate School (UTUGS), Turku, Finland.,Turku Centre for Computer Science (TUCS), Turku, Finland
| | - Jari Björne
- Department of Future Technologies, Faculty of Science and Engineering, University of Turku, Turku, FI-20014, Finland.,Turku Centre for Computer Science (TUCS), Agora, Vesilinnantie 3, Turku, FI-20500, Finland
| | - Tapio Salakoski
- Department of Future Technologies, Faculty of Science and Engineering, University of Turku, Turku, FI-20014, Finland.,Turku Centre for Computer Science (TUCS), Agora, Vesilinnantie 3, Turku, FI-20500, Finland
| | - Daisuke Kihara
- Department of Biological Sciences, Department of Computer Science, Purdue University, 47907, IN, USA.,Department of Pediatrics, University of Cincinnati, Cincinnati, 45229, OH, USA
| | - Aashish Jain
- Department of Computer Science, Purdue University, West Lafayette, IN, USA
| | - Tomislav Šmuc
- Division of Electronics, Rudjer Boskovic Institute, Zagreb, Croatia
| | - Adrian Altenhoff
- Department of Computer Science, ETH Zurich, Zurich, Switzerland.,SIB Swiss Institute of Bioinformatics, Zurich, Switzerland
| | - Asa Ben-Hur
- Department of Computer Science, Colorado State University, Fort Collins, CO, USA
| | - Burkhard Rost
- Department of Informatics, Bioinformatics & Computational Biology-i12, Technische Universitat Munchen, Munich, Germany.,Institute for Food and Plant Sciences WZW, Technische Universität München, Freising, Germany
| | - Christine A Orengo
- Research Department of Structural and Molecular Biology, University College London, London, United Kingdom
| | - Constance J Jeffery
- Biological Sciences, University of Illinois at Chicago, Chicago, Illinois, USA
| | - Giovanni Bosco
- Department of Molecular and Systems Biology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
| | - Deborah A Hogan
- Geisel School of Medicine at Dartmouth, Hanover, NH, USA.,Department of Microbiology and Immunology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
| | - Maria J Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, United Kingdom
| | - Claire O'Donovan
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, United Kingdom
| | - Sean D Mooney
- Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, USA
| | - Casey S Greene
- Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA.,Childhood Cancer Data Lab, Alex's Lemonade Stand Foundation, Philadelphia, Pennsylvania, USA
| | - Predrag Radivojac
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA.
| | - Iddo Friedberg
- Veterinary Microbiology and Preventive Medicine, Iowa State University, Ames, IA, USA.
|
22
|
McGarvey PB, Nightingale A, Luo J, Huang H, Martin MJ, Wu C, Consortium U. UniProt genomic mapping for deciphering functional effects of missense variants. Hum Mutat 2019; 40:694-705. [PMID: 30840782 PMCID: PMC6563471 DOI: 10.1002/humu.23738] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 06/23/2018] [Revised: 12/17/2018] [Accepted: 02/17/2019] [Indexed: 01/08/2023]
Abstract
Understanding the association of genetic variation with its functional consequences in proteins is essential for the interpretation of genomic data and identifying causal variants in diseases. Integration of protein function knowledge with genome annotation can assist in rapidly comprehending genetic variation within complex biological processes. Here, we describe mapping UniProtKB human sequences and positional annotations, such as active sites, binding sites, and variants to the human genome (GRCh38) and the release of a public genome track hub for genome browsers. To demonstrate the power of combining protein annotations with genome annotations for functional interpretation of variants, we present specific biological examples in disease-related genes and proteins. Computational comparisons of UniProtKB annotations and protein variants with ClinVar clinically annotated single nucleotide polymorphism (SNP) data show that 32% of UniProtKB variants colocate with 8% of ClinVar SNPs. The majority of colocated UniProtKB disease-associated variants (86%) map to 'pathogenic' ClinVar SNPs. UniProt and ClinVar are collaborating to provide a unified clinical variant annotation for genomic, protein, and clinical researchers. The genome track hubs, and related UniProtKB files, are downloadable from the UniProt FTP site and discoverable as public track hubs at the UCSC and Ensembl genome browsers.
Affiliation(s)
- Peter B. McGarvey
- Innovation Center for Biomedical Informatics, Georgetown University Medical Center, Washington, DC
- Protein Information Resource, Georgetown Medical Center, Washington, DC
| | - Andrew Nightingale
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, United Kingdom
| | - Jie Luo
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, United Kingdom
| | - Hongzhan Huang
- Center for Bioinformatics and Computational Biology, University of Delaware, Newark, Delaware
| | - Maria J. Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, United Kingdom
| | - Cathy Wu
- Center for Bioinformatics and Computational Biology, University of Delaware, Newark, Delaware
| | - UniProt Consortium
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, United Kingdom
- Swiss Institute of Bioinformatics, Centre Medical Universitaire, Geneva, Switzerland
- Protein Information Resource, Georgetown Medical Center, Washington, DC
|
23
|
Kramarz B, Roncaglia P, Meldal BHM, Huntley RP, Martin MJ, Orchard S, Parkinson H, Brough D, Bandopadhyay R, Hooper NM, Lovering RC. Improving the Gene Ontology Resource to Facilitate More Informative Analysis and Interpretation of Alzheimer's Disease Data. Genes (Basel) 2018; 9:E593. [PMID: 30501127 PMCID: PMC6315915 DOI: 10.3390/genes9120593] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 10/31/2018] [Revised: 11/22/2018] [Accepted: 11/23/2018] [Indexed: 12/28/2022] Open
Abstract
The analysis and interpretation of high-throughput datasets relies on access to high-quality bioinformatics resources, as well as processing pipelines and analysis tools. Gene Ontology (GO, geneontology.org) is a major resource for gene enrichment analysis. The aim of this project, funded by the Alzheimer's Research United Kingdom (ARUK) foundation and led by the University College London (UCL) biocuration team, was to enhance the GO resource by developing new neurological GO terms, and use GO terms to annotate gene products associated with dementia. Specifically, proteins and protein complexes relevant to processes involving amyloid-beta and tau have been annotated and the resulting annotations are denoted in GO databases as 'ARUK-UCL'. Biological knowledge presented in the scientific literature was captured through the association of GO terms with dementia-relevant protein records; GO itself was revised, and new GO terms were added. This literature biocuration increased the number of Alzheimer's-relevant gene products that were being associated with neurological GO terms, such as 'amyloid-beta clearance' or 'learning or memory', as well as neuronal structures and their compartments. Of the total 2055 annotations that we contributed for the prioritised gene products, 526 have associated proteins and complexes with neurological GO terms. To ensure that these descriptive annotations could be provided for Alzheimer's-relevant gene products, over 70 new GO terms were created. Here, we describe how the improvements in ontology development and biocuration resulting from this initiative can benefit the scientific community and enhance the interpretation of dementia data.
Affiliation(s)
- Barbara Kramarz
- UCL Institute of Cardiovascular Science, University College London, Rayne Building, 5 University Street, London WC1E 6JF, UK.
| | - Paola Roncaglia
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
| | - Birgit H M Meldal
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
| | - Rachael P Huntley
- UCL Institute of Cardiovascular Science, University College London, Rayne Building, 5 University Street, London WC1E 6JF, UK.
| | - Maria J Martin
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
| | - Sandra Orchard
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
| | - Helen Parkinson
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
| | - David Brough
- Division of Neuroscience and Experimental Psychology, School of Biological Sciences, Faculty of Biology, Medicine and Health, Manchester Academic Health Science Centre, University of Manchester, AV Hill Building, Oxford Road, Manchester M13 9PT, UK.
| | - Rina Bandopadhyay
- UCL Queen Square Institute of Neurology and Reta Lila Weston Institute of Neurological Studies, 1 Wakefield Street, London WC1N 1PJ, UK.
| | - Nigel M Hooper
- Division of Neuroscience and Experimental Psychology, School of Biological Sciences, Faculty of Biology, Medicine and Health, Manchester Academic Health Science Centre, University of Manchester, AV Hill Building, Oxford Road, Manchester M13 9PT, UK.
| | - Ruth C Lovering
- UCL Institute of Cardiovascular Science, University College London, Rayne Building, 5 University Street, London WC1E 6JF, UK.
|
24
|
Gonzalez-Dominguez J, Martin MJ. MPIGeneNet: Parallel Calculation of Gene Co-Expression Networks on Multicore Clusters. IEEE/ACM Trans Comput Biol Bioinform 2018; 15:1732-1737. [PMID: 29028205 DOI: 10.1109/tcbb.2017.2761340] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 06/07/2023]
Abstract
In this work, we present MPIGeneNet, a parallel tool that applies Pearson's correlation and Random Matrix Theory to construct gene co-expression networks. It is based on the state-of-the-art sequential tool RMTGeneNet, which provides networks with high robustness and sensitivity at the expense of relatively long runtimes for large-scale input datasets. MPIGeneNet returns the same results as RMTGeneNet but improves the memory management, reduces the I/O cost, and accelerates the two most computationally demanding steps of co-expression network construction by exploiting the compute capabilities of common multicore CPU clusters. Our performance evaluation on two different systems using three typical input datasets shows that MPIGeneNet is significantly faster than RMTGeneNet. As an example, our tool is up to 175.41 times faster on a cluster with eight nodes, each one containing two 12-core Intel Haswell processors. The source code of MPIGeneNet, as well as a reference manual, is available at https://sourceforge.net/projects/mpigenenet/.
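As a rough illustration of the first step this abstract names, the sketch below computes pairwise Pearson correlations between gene expression profiles and keeps the pairs whose absolute correlation reaches a cutoff. It is not MPIGeneNet's or RMTGeneNet's code: the function names, the toy data, and the fixed `threshold` value are all assumptions made for the example, and the Random Matrix Theory step that those tools use to choose the cutoff is not shown, nor is any parallelisation.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A constant profile has no defined correlation; treat it as 0 here.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def coexpression_edges(expression, threshold):
    """Return (gene1, gene2, r) for every gene pair with |r| >= threshold.

    `threshold` is a hypothetical fixed cutoff supplied by the caller.
    """
    edges = []
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges


# Toy expression profiles (made-up values, one list of samples per gene).
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 1.0, 3.0, 2.0],
}
edges = coexpression_edges(expression, threshold=0.9)
print(edges)
```

The all-pairs loop grows quadratically with the number of genes, which is consistent with the abstract's point that this kind of computation benefits from being spread across cluster nodes.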
|
25
|
Huntley RP, Kramarz B, Sawford T, Umrao Z, Kalea A, Acquaah V, Martin MJ, Mayr M, Lovering RC. Expanding the horizons of microRNA bioinformatics. RNA 2018; 24:1005-1017. [PMID: 29871895 PMCID: PMC6049505 DOI: 10.1261/rna.065565.118] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 01/08/2018] [Accepted: 06/01/2018] [Indexed: 06/08/2023]
Abstract
MicroRNA regulation of key biological and developmental pathways is a rapidly expanding area of research, accompanied by vast amounts of experimental data. This data, however, is not widely available in bioinformatic resources, making it difficult for researchers to find and analyze microRNA-related experimental data and define further research projects. We are addressing this problem by providing two new bioinformatics data sets that contain experimentally verified functional information for mammalian microRNAs involved in cardiovascular-relevant, and other, processes. To date, our resource provides over 4400 Gene Ontology annotations associated with over 500 microRNAs from human, mouse, and rat and over 2400 experimentally validated microRNA:target interactions. We illustrate how this resource can be used to create microRNA-focused interaction networks with a biological context using the known biological role of microRNAs and the mRNAs they regulate, enabling discovery of associations between gene products, biological pathways and, ultimately, diseases. This data will be crucial in advancing the field of microRNA bioinformatics and will establish consistent data sets for reproducible functional analysis of microRNAs across all biological research areas.
Affiliation(s)
- Rachael P Huntley
- Institute of Cardiovascular Science, University College London, London WC1E 6JF, United Kingdom
| | - Barbara Kramarz
- Institute of Cardiovascular Science, University College London, London WC1E 6JF, United Kingdom
| | - Tony Sawford
- European Bioinformatics Institute, European Molecular Biology Laboratory (EMBL-EBI), Wellcome Trust Genome Campus, Cambridge CB10 1SD, United Kingdom
| | - Zara Umrao
- Institute of Cardiovascular Science, University College London, London WC1E 6JF, United Kingdom
| | - Anastasia Kalea
- Institute of Cardiovascular Science, University College London, London WC1E 6JF, United Kingdom
| | - Vanessa Acquaah
- Institute of Cardiovascular Science, University College London, London WC1E 6JF, United Kingdom
| | - Maria J Martin
- European Bioinformatics Institute, European Molecular Biology Laboratory (EMBL-EBI), Wellcome Trust Genome Campus, Cambridge CB10 1SD, United Kingdom
| | - Manuel Mayr
- King's British Heart Foundation Centre, King's College London, London SE5 9NU, United Kingdom
| | - Ruth C Lovering
- Institute of Cardiovascular Science, University College London, London WC1E 6JF, United Kingdom
|
26
|
Abstract
Summary: ProtVista is a comprehensive visualization tool for the graphical representation of protein sequence features in the UniProt Knowledgebase, experimental proteomics and variation public datasets. The complexity and relationships in this wealth of data pose a challenge in interpretation. Integrative visualization approaches such as provided by ProtVista are thus essential for researchers to understand the data and, for instance, discover patterns affecting function and disease associations. Availability and Implementation: ProtVista is a JavaScript component released as an open source project under the Apache 2 License. Documentation and source code are available at http://ebi-uniprot.github.io/ProtVista/. Contact: martin@ebi.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Xavier Watkins
- EMBL-EBI, Hinxton, UK.,Open Targets Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
|
27
|
LeDuc RD, Schwämmle V, Shortreed MR, Cesnik AJ, Solntsev SK, Shaw JB, Martin MJ, Vizcaino JA, Alpi E, Danis P, Kelleher NL, Smith LM, Ge Y, Agar JN, Chamot-Rooke J, Loo JA, Pasa-Tolic L, Tsybin YO. ProForma: A Standard Proteoform Notation. J Proteome Res 2018; 17:1321-1325. [PMID: 29397739 PMCID: PMC5837035 DOI: 10.1021/acs.jproteome.7b00851] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Indexed: 01/24/2023]
Abstract
The Consortium for Top-Down Proteomics (CTDP) proposes a standardized notation, ProForma, for writing the sequence of fully characterized proteoforms. ProForma provides a means to communicate any proteoform by writing the amino acid sequence using standard one-letter notation and specifying modifications or unidentified mass shifts within brackets following certain amino acids. The notation is unambiguous, human-readable, and can easily be parsed and written by bioinformatic tools. This system uses seven rules and supports a wide range of possible use cases, ensuring compatibility and reproducibility of proteoform annotations. Standardizing proteoform sequences will simplify storage, comparison, and reanalysis of proteomic studies, and the Consortium welcomes input and contributions from the research community on the continued design and maintenance of this standard.
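To make the idea of bracketed annotations following one-letter residues concrete, here is a minimal sketch of a parser for that general shape. It is an illustration only and does not implement the seven ProForma rules, which the abstract does not spell out. The function name, the example string, and the tag contents (`Phospho`, `+15.995`) are hypothetical placeholders, and the parser accepts only uppercase residue letters, each optionally followed by one or more square-bracket tags.

```python
import re

# One uppercase residue letter, then zero or more [tag] groups.
TOKEN = re.compile(r"([A-Z])((?:\[[^\[\]]*\])*)")
TAG = re.compile(r"\[([^\[\]]*)\]")


def parse_bracketed_sequence(text):
    """Split text into (residue, [tags]) pairs; raise ValueError on bad input."""
    residues = []
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(
                f"unexpected character at index {pos}: {text[pos]!r}"
            )
        tags = TAG.findall(match.group(2))
        residues.append((match.group(1), tags))
        pos = match.end()
    return residues


# Hypothetical example string, not taken from the ProForma specification.
parsed = parse_bracketed_sequence("PEPT[Phospho]IDE[+15.995]K")
plain_sequence = "".join(residue for residue, _ in parsed)
print(plain_sequence)
print([(residue, tags) for residue, tags in parsed if tags])
```

Keeping the bare residue sequence separate from the attached tags mirrors the abstract's claim that such a notation can be both human-readable and easy for tools to parse and write.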
Affiliation(s)
- Richard D. LeDuc
- National Resource for Translational and Developmental Proteomics, Northwestern University, Evanston, Illinois 60208, United States
- Veit Schwämmle
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, DK-5230 Odense, Denmark
- Michael R. Shortreed
- Department of Chemistry, University of Wisconsin, Madison, Wisconsin 53706, United States
- Anthony J. Cesnik
- Department of Chemistry, University of Wisconsin, Madison, Wisconsin 53706, United States
- Stefan K. Solntsev
- Department of Chemistry, University of Wisconsin, Madison, Wisconsin 53706, United States
- Jared B. Shaw
- Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, Washington 99354, United States
- Maria J. Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Juan A. Vizcaino
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Emanuele Alpi
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Paul Danis
- Consortium for Top-Down Proteomics, Cambridge, Massachusetts 02142, United States
- Neil L. Kelleher
- National Resource for Translational and Developmental Proteomics, Northwestern University, Evanston, Illinois 60208, United States
- Lloyd M. Smith
- Department of Chemistry, University of Wisconsin, Madison, Wisconsin 53706, United States
- Genome Center of Wisconsin, University of Wisconsin, Madison, Wisconsin 53706, United States
- Ying Ge
- Department of Chemistry, University of Wisconsin, Madison, Wisconsin 53706, United States
- Jeffrey N. Agar
- Chemistry and Chemical Biology, Northeastern University, Boston, Massachusetts 02115, United States
- Julia Chamot-Rooke
- Mass Spectrometry for Biology Unit, Institut Pasteur, CNRS USR 2000, Paris Cedex 15, France
- Joseph A. Loo
- Department of Chemistry and Biochemistry and Department of Biological Chemistry, University of California, Los Angeles, Los Angeles, California 90095, United States
- Ljiljana Pasa-Tolic
- Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, Washington 99354, United States
28
Abstract
The Gene Ontology (GO) is widely recognised as the gold standard bioinformatics resource for summarizing functional knowledge of gene products in a consistent and computable, information-rich language. GO describes cellular and organismal processes across all species, yet until now there has been a considerable gene annotation deficit within the neurological and immunological domains, both of which are relevant to Parkinson’s disease. Here we introduce the Parkinson’s disease GO Annotation Project, funded by Parkinson’s UK and supported by the GO Consortium, which is addressing this deficit by providing GO annotation to Parkinson’s-relevant human gene products, principally through expert literature curation. We discuss the steps taken to prioritise proteins, publications and cellular processes for annotation, examples of how GO annotations capture Parkinson’s-relevant information, and the advantages that a topic-focused annotation approach offers to users. Building on the existing GO resource, this project collates a vast amount of Parkinson’s-relevant literature into a set of high-quality annotations to be utilized by the research community.
Affiliation(s)
- R E Foulger
- Centre for Cardiovascular Genetics, Institute of Cardiovascular Science, University College London, London, UK
- P Denny
- Centre for Cardiovascular Genetics, Institute of Cardiovascular Science, University College London, London, UK
- J Hardy
- Department of Molecular Neuroscience, Institute of Neurology, University College London, London, UK
- M J Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, UK
- T Sawford
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, UK
- R C Lovering
- Centre for Cardiovascular Genetics, Institute of Cardiovascular Science, University College London, London, UK
29
Martin MJ, Gekelman W, Van Compernolle B, Pribyl P, Carter T. Experimental Observation of Convective Cell Formation due to a Fast Wave Antenna in the Large Plasma Device. Phys Rev Lett 2017; 119:205002. [PMID: 29219335 DOI: 10.1103/physrevlett.119.205002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2017] [Indexed: 06/07/2023]
Abstract
An experiment in a linear device, the Large Plasma Device, is used to study sheaths caused by an actively powered radio frequency (rf) antenna. The rf antenna used in the experiment consists of a single current strap recessed inside a copper box enclosure without a Faraday screen. A large increase in the plasma potential was observed along magnetic field lines that connect to the antenna limiter. The electric field from the spatial variation of the rectified plasma potential generated $\vec{E} \times \vec{B}_0$ flows, often referred to as convective cells. The presence of the flows generated by these potentials is confirmed by Mach probes. The observed convective cell flows are seen to cause the plasma in front of the antenna to flow away and cause a density modification near the antenna edge. These can cause hot spots and damage to the antenna and can result in a decrease in the ion cyclotron range of frequencies antenna coupling.
Affiliation(s)
- M J Martin
- Department of Physics and Astronomy, University of California, Los Angeles, California 90095, USA
- W Gekelman
- Department of Physics and Astronomy, University of California, Los Angeles, California 90095, USA
- B Van Compernolle
- Department of Physics and Astronomy, University of California, Los Angeles, California 90095, USA
- P Pribyl
- Department of Physics and Astronomy, University of California, Los Angeles, California 90095, USA
- T Carter
- Department of Physics and Astronomy, University of California, Los Angeles, California 90095, USA
30
Pundir S, Onwubiko J, Zaru R, Rosanoff S, Antunes R, Bingley M, Watkins X, O'Donovan C, Martin MJ. An update on the Enzyme Portal: an integrative approach for exploring enzyme knowledge. Protein Eng Des Sel 2017; 30:245-251. [PMID: 28158609 PMCID: PMC5421622 DOI: 10.1093/protein/gzx008] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 10/17/2016] [Accepted: 12/14/2016] [Indexed: 01/28/2023]
Abstract
Enzymes are a key part of life processes and are increasingly important for various areas of research such as medicine, biotechnology, bioprocessing and drug research. The goal of the Enzyme Portal is to provide an interface to all European Bioinformatics Institute (EMBL-EBI) data about enzymes (de Matos, P., et al., (2013), BMC Bioinformatics, 14 (1), 103). These data include enzyme function, sequence features and family classification, protein structure, reactions, pathways, small molecules, diseases and the associated literature. The sources of enzyme data are: the UniProt Knowledgebase (UniProtKB) (UniProt Consortium, 2015); the Protein Data Bank in Europe (PDBe) (Velankar, S., et al., Nucleic Acids Res. 2016; 44, D385–D395); Rhea, a database of enzyme-catalysed reactions (Morgat, A., et al., Nucleic Acids Res. 2015; 43, D459–D464); Reactome, a database of biochemical pathways (Fabregat, A., et al., Nucleic Acids Res. 2016; 44, D481–D487); IntEnz, a resource with enzyme nomenclature information (Fleischmann, A., et al., Nucleic Acids Res. 2004; 32, D434–D437); and ChEBI (Hastings, J., et al., Nucleic Acids Res. 2013) and ChEMBL (Bento, A. P., et al., Nucleic Acids Res. 2014; 42, 1083–1090), resources which contain information about small-molecule chemistry and bioactivity. This article describes the redesign of the Enzyme Portal and the increased functionality added to maximise integration and interpretation of these data. Use case examples of the Enzyme Portal and the versatile workflows it supports are illustrated. We welcome the suggestion of new resources for integration.
Affiliation(s)
- S Pundir
- EMBL-European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- J Onwubiko
- EMBL-European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- R Zaru
- EMBL-European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- S Rosanoff
- EMBL-European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- R Antunes
- EMBL-European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- M Bingley
- EMBL-European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- X Watkins
- EMBL-European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- C O'Donovan
- EMBL-European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- M J Martin
- EMBL-European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
31
Abstract
It is becoming more evident that computational methods are needed for the identification and the mapping of pathways in new genomes. We introduce an automatic annotation system (ARBA4Path: Association Rule-Based Annotator for Pathways) that utilizes rule mining techniques to predict metabolic pathways across a wide range of prokaryotes. It was demonstrated that specific combinations of protein domains (recorded in our rules) strongly determine the pathways in which proteins are involved and thus provide information that lets us very accurately assign pathway membership (with precision of 0.999 and recall of 0.966) to proteins of a given prokaryotic taxon. Our system can be used to enhance the quality of automatically generated annotations as well as to annotate proteins with unknown function. The prediction models are represented in the form of human-readable rules, and they can be used effectively to add absent pathway information to many proteins in the UniProtKB/TrEMBL database.
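The reported precision and recall can be combined into a single F1 score, their harmonic mean. The abstract does not report F1 itself, so the value below is a derived figure offered only as a worked example; the function name is hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Precision and recall as quoted in the abstract above.
    print(round(f1_score(0.999, 0.966), 3))  # prints 0.982
```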
Affiliation(s)
- Rabie Saidi
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, CB10 1SD, UK
- Imane Boudellioua
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Kingdom of Saudi Arabia
- Maria J Martin
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Kingdom of Saudi Arabia
- Victor Solovyev
- Softberry Inc., 116 Radio Circle, Suite 400, Mount Kisco, NY, 10549, USA
32
Mirdita M, von den Driesch L, Galiez C, Martin MJ, Söding J, Steinegger M. Uniclust databases of clustered and deeply annotated protein sequences and alignments. Nucleic Acids Res 2016; 45:D170-D176. [PMID: 27899574 PMCID: PMC5614098 DOI: 10.1093/nar/gkw1081] [Citation(s) in RCA: 333] [Impact Index Per Article: 41.6] [Received: 08/15/2016] [Revised: 10/14/2016] [Accepted: 11/01/2016] [Indexed: 11/27/2022]
Abstract
We present three clustered protein sequence databases, Uniclust90, Uniclust50 and Uniclust30, and three databases of multiple sequence alignments (MSAs), Uniboost10, Uniboost20 and Uniboost30, as a resource for protein sequence analysis, function prediction and sequence searches. The Uniclust databases cluster UniProtKB sequences at the level of 90%, 50% and 30% pairwise sequence identity. Uniclust90 and Uniclust50 clusters showed better consistency of functional annotation than those of UniRef90 and UniRef50, owing to an optimised clustering pipeline that runs with our MMseqs2 software for fast and sensitive protein sequence searching and clustering. Uniclust sequences are annotated with matches to Pfam, SCOP domains, and proteins in the PDB, using our HHblits homology detection tool. Due to its high sensitivity, Uniclust contains 17% more Pfam domain annotations than UniProt. Uniboost MSAs of three diversities are built by enriching the Uniclust30 MSAs with local sequence matches from MMseqs2 profile searches through Uniclust30. All databases can be downloaded from the Uniclust server at uniclust.mmseqs.com. Users can search clusters by keywords and explore their MSAs, taxonomic representation, and annotations. Uniclust is updated every two months with the new UniProt release.
Affiliation(s)
- Milot Mirdita
- Quantitative and Computational Biology Group, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
- Lars von den Driesch
- Quantitative and Computational Biology Group, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Cambridge, UK
- Clovis Galiez
- Quantitative and Computational Biology Group, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
- Maria J Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Cambridge, UK
- Johannes Söding
- Quantitative and Computational Biology Group, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
- Martin Steinegger
- Quantitative and Computational Biology Group, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
- Department for Bioinformatics and Computational Biology, Technische Universität München, Munich, Germany
- Department of Chemistry, Seoul National University, Seoul, Korea
33
Jiang Y, Oron TR, Clark WT, Bankapur AR, D'Andrea D, Lepore R, Funk CS, Kahanda I, Verspoor KM, Ben-Hur A, Koo DCE, Penfold-Brown D, Shasha D, Youngs N, Bonneau R, Lin A, Sahraeian SME, Martelli PL, Profiti G, Casadio R, Cao R, Zhong Z, Cheng J, Altenhoff A, Skunca N, Dessimoz C, Dogan T, Hakala K, Kaewphan S, Mehryary F, Salakoski T, Ginter F, Fang H, Smithers B, Oates M, Gough J, Törönen P, Koskinen P, Holm L, Chen CT, Hsu WL, Bryson K, Cozzetto D, Minneci F, Jones DT, Chapman S, Bkc D, Khan IK, Kihara D, Ofer D, Rappoport N, Stern A, Cibrian-Uhalte E, Denny P, Foulger RE, Hieta R, Legge D, Lovering RC, Magrane M, Melidoni AN, Mutowo-Meullenet P, Pichler K, Shypitsyna A, Li B, Zakeri P, ElShal S, Tranchevent LC, Das S, Dawson NL, Lee D, Lees JG, Sillitoe I, Bhat P, Nepusz T, Romero AE, Sasidharan R, Yang H, Paccanaro A, Gillis J, Sedeño-Cortés AE, Pavlidis P, Feng S, Cejuela JM, Goldberg T, Hamp T, Richter L, Salamov A, Gabaldon T, Marcet-Houben M, Supek F, Gong Q, Ning W, Zhou Y, Tian W, Falda M, Fontana P, Lavezzo E, Toppo S, Ferrari C, Giollo M, Piovesan D, Tosatto SCE, Del Pozo A, Fernández JM, Maietta P, Valencia A, Tress ML, Benso A, Di Carlo S, Politano G, Savino A, Rehman HU, Re M, Mesiti M, Valentini G, Bargsten JW, van Dijk ADJ, Gemovic B, Glisic S, Perovic V, Veljkovic V, Veljkovic N, Almeida-E-Silva DC, Vencio RZN, Sharan M, Vogel J, Kansakar L, Zhang S, Vucetic S, Wang Z, Sternberg MJE, Wass MN, Huntley RP, Martin MJ, O'Donovan C, Robinson PN, Moreau Y, Tramontano A, Babbitt PC, Brenner SE, Linial M, Orengo CA, Rost B, Greene CS, Mooney SD, Friedberg I, Radivojac P. An expanded evaluation of protein function prediction methods shows an improvement in accuracy. Genome Biol 2016; 17:184. [PMID: 27604469 PMCID: PMC5015320 DOI: 10.1186/s13059-016-1037-6] [Citation(s) in RCA: 252] [Impact Index Per Article: 31.5] [Received: 10/26/2015] [Accepted: 08/04/2016] [Indexed: 12/02/2022]
Abstract
BACKGROUND: A major bottleneck in our understanding of the molecular underpinnings of life is the assignment of function to proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, assessing methods for protein function prediction and tracking progress in the field remain challenging. RESULTS: We conducted the second critical assessment of functional annotation (CAFA), a timed challenge to assess computational methods that automatically assign protein function. We evaluated 126 methods from 56 research groups for their ability to predict biological functions using Gene Ontology and gene-disease associations using Human Phenotype Ontology on a set of 3681 proteins from 18 species. CAFA2 featured expanded analysis compared with CAFA1, with regard to data set size, variety, and assessment metrics. To review progress in the field, the analysis compared the best methods from CAFA1 to those of CAFA2. CONCLUSIONS: The top-performing methods in CAFA2 outperformed those from CAFA1. This increased accuracy can be attributed to a combination of the growing number of experimental annotations and improved methods for function prediction. The assessment also revealed that the definition of top-performing algorithms is ontology specific, that different performance metrics can be used to probe the nature of accurate predictions, and the relative diversity of predictions in the biological process and human phenotype ontologies. While there was methodological improvement between CAFA1 and CAFA2, the interpretation of results and usefulness of individual methods remain context-dependent.
Affiliation(s)
- Yuxiang Jiang
- Department of Computer Science and Informatics, Indiana University, Bloomington, IN, USA
- Wyatt T Clark
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Asma R Bankapur
- Department of Microbiology, Miami University, Oxford, OH, USA
- Christopher S Funk
- Computational Bioscience Program, University of Colorado School of Medicine, Aurora, CO, USA
- Indika Kahanda
- Department of Computer Science, Colorado State University, Fort Collins, CO, USA
- Karin M Verspoor
- Department of Computing and Information Systems, University of Melbourne, Parkville, Victoria, Australia
- Health and Biomedical Informatics Centre, University of Melbourne, Parkville, Victoria, Australia
- Asa Ben-Hur
- Department of Computer Science, Colorado State University, Fort Collins, CO, USA
- Duncan Penfold-Brown
- Social Media and Political Participation Lab, New York University, New York, NY, USA
- CY Data Science, New York, NY, USA
- Dennis Shasha
- Department of Computer Science, New York University, New York, NY, USA
- Noah Youngs
- CY Data Science, New York, NY, USA
- Department of Computer Science, New York University, New York, NY, USA
- Simons Center for Data Analysis, New York, NY, USA
- Richard Bonneau
- Department of Computer Science, New York University, New York, NY, USA
- Simons Center for Data Analysis, New York, NY, USA
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY, USA
- Alexandra Lin
- Department of Electrical Engineering and Computer Sciences, University of California Berkeley, Berkeley, CA, USA
- Sayed M E Sahraeian
- Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA, USA
- Giuseppe Profiti
- Biocomputing Group, BiGeA, University of Bologna, Bologna, Italy
- Rita Casadio
- Biocomputing Group, BiGeA, University of Bologna, Bologna, Italy
- Renzhi Cao
- Computer Science Department, University of Missouri, Columbia, MO, USA
- Zhaolong Zhong
- Computer Science Department, University of Missouri, Columbia, MO, USA
- Jianlin Cheng
- Computer Science Department, University of Missouri, Columbia, MO, USA
- Adrian Altenhoff
- ETH Zurich, Zurich, Switzerland
- Swiss Institute of Bioinformatics, Zurich, Switzerland
- Nives Skunca
- ETH Zurich, Zurich, Switzerland
- Swiss Institute of Bioinformatics, Zurich, Switzerland
- Christophe Dessimoz
- Bioinformatics Group, Department of Computer Science, University College London, London, UK
- University of Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Tunca Dogan
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
- Kai Hakala
- Department of Information Technology, University of Turku, Turku, Finland
- University of Turku Graduate School, University of Turku, Turku, Finland
- Suwisa Kaewphan
- Department of Information Technology, University of Turku, Turku, Finland
- University of Turku Graduate School, University of Turku, Turku, Finland
- Turku Centre for Computer Science, Turku, Finland
- Farrokh Mehryary
- Department of Information Technology, University of Turku, Turku, Finland
- University of Turku Graduate School, University of Turku, Turku, Finland
- Tapio Salakoski
- Department of Information Technology, University of Turku, Turku, Finland
- Turku Centre for Computer Science, Turku, Finland
- Filip Ginter
- Department of Information Technology, University of Turku, Turku, Finland
- Hai Fang
- University of Bristol, Bristol, UK
- Petri Törönen
- Institute of Biotechnology, University of Helsinki, Helsinki, Finland
- Patrik Koskinen
- Institute of Biotechnology, University of Helsinki, Helsinki, Finland
- Liisa Holm
- Institute of Biotechnology, University of Helsinki, Helsinki, Finland
- Department of Biological and Environmental Sciences, University of Helsinki, Helsinki, Finland
- Ching-Tai Chen
- Institute of Information Science, Academia Sinica, Taipei, Taiwan
- Wen-Lian Hsu
- Institute of Information Science, Academia Sinica, Taipei, Taiwan
- Kevin Bryson
- Bioinformatics Group, Department of Computer Science, University College London, London, UK
- Domenico Cozzetto
- Bioinformatics Group, Department of Computer Science, University College London, London, UK
- Federico Minneci
- Bioinformatics Group, Department of Computer Science, University College London, London, UK
- David T Jones
- Bioinformatics Group, Department of Computer Science, University College London, London, UK
- Samuel Chapman
- Department of Computational Science and Engineering, North Carolina A&T State University, Greensboro, NC, USA
- Dukka Bkc
- Department of Computational Science and Engineering, North Carolina A&T State University, Greensboro, NC, USA
- Ishita K Khan
- Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Daisuke Kihara
- Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Dan Ofer
- Department of Biological Chemistry, Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
- Nadav Rappoport
- Department of Biological Chemistry, Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
- School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
- Amos Stern
- Department of Biological Chemistry, Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
- School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
- Elena Cibrian-Uhalte
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
- Paul Denny
- Centre for Cardiovascular Genetics, Institute of Cardiovascular Science, University College London, London, UK
- Rebecca E Foulger
- Centre for Cardiovascular Genetics, Institute of Cardiovascular Science, University College London, London, UK
- Reija Hieta
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
- Duncan Legge
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
- Ruth C Lovering
- Centre for Cardiovascular Genetics, Institute of Cardiovascular Science, University College London, London, UK
- Michele Magrane
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
- Anna N Melidoni
- Centre for Cardiovascular Genetics, Institute of Cardiovascular Science, University College London, London, UK
- Klemens Pichler
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
- Aleksandra Shypitsyna
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
- Biao Li
- Buck Institute for Research on Aging, Novato, CA, USA
- Pooya Zakeri
- Department of Electrical Engineering, STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, KU Leuven, Leuven, Belgium
- iMinds Department Medical Information Technologies, Leuven, Belgium
- Sarah ElShal
- Department of Electrical Engineering, STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, KU Leuven, Leuven, Belgium
- iMinds Department Medical Information Technologies, Leuven, Belgium
- Léon-Charles Tranchevent
- Inserm UMR-S1052, CNRS UMR5286, Cancer Research Centre of Lyon, Lyon, France
- Université de Lyon 1, Villeurbanne, France
- Centre Léon Bérard, Lyon, France
- Sayoni Das
- Institute of Structural and Molecular Biology, University College London, London, UK
- Natalie L Dawson
- Institute of Structural and Molecular Biology, University College London, London, UK
- David Lee
- Institute of Structural and Molecular Biology, University College London, London, UK
- Jonathan G Lees
- Institute of Structural and Molecular Biology, University College London, London, UK
- Ian Sillitoe
- Institute of Structural and Molecular Biology, University College London, London, UK
- Alfonso E Romero
- Department of Computer Science, Centre for Systems and Synthetic Biology, Royal Holloway University of London, Egham, UK
- Rajkumar Sasidharan
- Department of Molecular, Cell and Developmental Biology, University of California at Los Angeles, Los Angeles, CA, USA
- Haixuan Yang
- School of Mathematics, Statistics and Applied Mathematics, National University of Ireland, Galway, Ireland
- Alberto Paccanaro
- Department of Computer Science, Centre for Systems and Synthetic Biology, Royal Holloway University of London, Egham, UK
- Jesse Gillis
- Stanley Institute for Cognitive Genomics Cold Spring Harbor Laboratory, New York, NY, USA
- Paul Pavlidis
- Department of Psychiatry and Michael Smith Laboratories, University of British Columbia, Vancouver, Canada
- Shou Feng
- Department of Computer Science and Informatics, Indiana University, Bloomington, IN, USA
- Juan M Cejuela
- Department for Bioinformatics and Computational Biology-I12, Technische Universität München, Garching, Germany
- Tatyana Goldberg
- Department for Bioinformatics and Computational Biology-I12, Technische Universität München, Garching, Germany
- Tobias Hamp
- Department for Bioinformatics and Computational Biology-I12, Technische Universität München, Garching, Germany
- Lothar Richter
- Department for Bioinformatics and Computational Biology-I12, Technische Universität München, Garching, Germany
- Asaf Salamov
- DOE Joint Genome Institute, Walnut Creek, CA, USA
- Toni Gabaldon
- Bioinformatics and Genomics, Centre for Genomic Regulation, Barcelona, Spain
- Universitat Pompeu Fabra, Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats, Barcelona, Spain
- Marina Marcet-Houben
- Bioinformatics and Genomics, Centre for Genomic Regulation, Barcelona, Spain
- Universitat Pompeu Fabra, Barcelona, Spain
- Fran Supek
- Universitat Pompeu Fabra, Barcelona, Spain
- Division of Electronics, Rudjer Boskovic Institute, Zagreb, Croatia
- EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation, Barcelona, Spain
- Qingtian Gong
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Development, Department of Biostatistics and Computational Biology, School of Life Science, Fudan University, Shanghai, China
- Children's Hospital of Fudan University, Shanghai, China
- Wei Ning
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Development, Department of Biostatistics and Computational Biology, School of Life Science, Fudan University, Shanghai, China
- Children's Hospital of Fudan University, Shanghai, China
- Yuanpeng Zhou
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Development, Department of Biostatistics and Computational Biology, School of Life Science, Fudan University, Shanghai, China
- Children's Hospital of Fudan University, Shanghai, China
- Weidong Tian
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Development, Department of Biostatistics and Computational Biology, School of Life Science, Fudan University, Shanghai, China
- Children's Hospital of Fudan University, Shanghai, China
- Marco Falda
- Department of Molecular Medicine, University of Padua, Padua, Italy
- Paolo Fontana
- Research and Innovation Center, Edmund Mach Foundation, San Michele all'Adige, Italy
- Enrico Lavezzo
- Department of Molecular Medicine, University of Padua, Padua, Italy
- Stefano Toppo
- Department of Molecular Medicine, University of Padua, Padua, Italy
- Carlo Ferrari
- Department of Information Engineering, University of Padua, Padova, Italy
- Manuel Giollo
- Department of Information Engineering, University of Padua, Padova, Italy
- Department of Biomedical Sciences, University of Padua, Padova, Italy
- Damiano Piovesan
- Department of Information Engineering, University of Padua, Padova, Italy
- Silvio C E Tosatto
- Department of Information Engineering, University of Padua, Padova, Italy
- Angela Del Pozo
- Instituto De Genetica Medica y Molecular, Hospital Universitario de La Paz, Madrid, Spain
- José M Fernández
- Spanish National Bioinformatics Institute, Spanish National Cancer Research Institute, Madrid, Spain
- Paolo Maietta
- Structural and Computational Biology Programme, Spanish National Cancer Research Institute, Madrid, Spain
- Alfonso Valencia
- Structural and Computational Biology Programme, Spanish National Cancer Research Institute, Madrid, Spain
- Michael L Tress
- Structural and Computational Biology Programme, Spanish National Cancer Research Institute, Madrid, Spain
- Alfredo Benso
- Control and Computer Engineering Department, Politecnico di Torino, Torino, Italy
- Stefano Di Carlo
- Control and Computer Engineering Department, Politecnico di Torino, Torino, Italy
- Gianfranco Politano
- Control and Computer Engineering Department, Politecnico di Torino, Torino, Italy
- Alessandro Savino
- Control and Computer Engineering Department, Politecnico di Torino, Torino, Italy
- Hafeez Ur Rehman
- National University of Computer & Emerging Sciences, Islamabad, Pakistan
- Matteo Re
- Anacleto Lab, Dipartimento di informatica, Università degli Studi di Milano, Milan, Italy
- Marco Mesiti
- Anacleto Lab, Dipartimento di informatica, Università degli Studi di Milano, Milan, Italy
- Giorgio Valentini
- Anacleto Lab, Dipartimento di informatica, Università degli Studi di Milano, Milan, Italy
- Joachim W Bargsten
- Applied Bioinformatics, Bioscience, Wageningen University and Research Centre, Wageningen, Netherlands
- Aalt D J van Dijk
- Applied Bioinformatics, Bioscience, Wageningen University and Research Centre, Wageningen, Netherlands
- Biometris, Wageningen University, Wageningen, Netherlands
- Branislava Gemovic
- Center for Multidisciplinary Research, Institute of Nuclear Sciences Vinca, University of Belgrade, Belgrade, Serbia
- Sanja Glisic
- Center for Multidisciplinary Research, Institute of Nuclear Sciences Vinca, University of Belgrade, Belgrade, Serbia
- Vladmir Perovic
- Center for Multidisciplinary Research, Institute of Nuclear Sciences Vinca, University of Belgrade, Belgrade, Serbia
- Veljko Veljkovic
- Center for Multidisciplinary Research, Institute of Nuclear Sciences Vinca, University of Belgrade, Belgrade, Serbia
- Nevena Veljkovic
- Center for Multidisciplinary Research, Institute of Nuclear Sciences Vinca, University of Belgrade, Belgrade, Serbia
- Ricardo Z N Vencio
- Department of Computing and Mathematics FFCLRP-USP, University of Sao Paulo, Ribeirao Preto, Brazil
- Malvika Sharan
- Institute for Molecular Infection Biology, University of Würzburg, Würzburg, Germany
- Jörg Vogel
- Institute for Molecular Infection Biology, University of Würzburg, Würzburg, Germany
- Lakesh Kansakar
- Department of Computer and Information Sciences, Temple University, Philadelphia, PA, USA
- Shanshan Zhang
- Department of Computer and Information Sciences, Temple University, Philadelphia, PA, USA
- Slobodan Vucetic
- Department of Computer and Information Sciences, Temple University, Philadelphia, PA, USA
| | - Zheng Wang
- University of Southern Mississippi, Hattiesburg, MS, USA
| | - Michael J E Sternberg
- Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London, UK
| | - Mark N Wass
- School of Biosciences, University of Kent, Canterbury, Kent, UK
| | - Rachael P Huntley
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - Maria J Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - Claire O'Donovan
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - Peter N Robinson
- Institut für Medizinische Genetik und Humangenetik, Charité - Universitätsmedizin Berlin, Berlin, Germany
| | - Yves Moreau
- Department of Electrical Engineering ESAT-SCD and IBBT-KU Leuven Future Health Department, Katholieke Universiteit Leuven, Leuven, Belgium
| | | | - Patricia C Babbitt
- California Institute for Quantitative Biosciences, University of California San Francisco, San Francisco, CA, USA
| | - Steven E Brenner
- Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA, USA
| | - Michal Linial
- Department of Chemical Biology, The Hebrew University of Jerusalem, Jerusalem, Israel
| | - Christine A Orengo
- Institute of Structural and Molecular Biology, University College London, London, UK
| | - Burkhard Rost
- Department for Bioinformatics and Computational Biology-I12, Technische Universität München, Garching, Germany
| | - Casey S Greene
- Department of Systems Pharmacology and Translational Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
| | - Sean D Mooney
- Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, USA
| | - Iddo Friedberg
- Department of Microbiology, Miami University, Oxford, OH, USA
- Department of Computer Science, Miami University, Oxford, OH, USA
| | - Predrag Radivojac
- Department of Computer Science and Informatics, Indiana University, Bloomington, IN, USA
| |
|
34
|
Martin MJ, Lee H, Meakin G, Green A, Simms RL, Reynolds C, Winters S, Shaw DE, Soomro I, Harrison TW. Assessment of a rapid liquid-based cytology method for measuring sputum cell counts. Thorax 2016; 71:1163-1164. [PMID: 27503234 DOI: 10.1136/thoraxjnl-2016-208817] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2016] [Revised: 07/04/2016] [Accepted: 07/06/2016] [Indexed: 11/03/2022]
Abstract
Differential sputum cell counting is not widely available despite proven clinical utility in the management of asthma. We compared eosinophil counts obtained using liquid-based cytology (LBC), a routine histopathological processing method, and the current standard method. Eosinophil counts obtained using LBC were a strong predictor of sputum eosinophilia (≥3%) determined by the standard method, suggesting that LBC could be used in the management of asthma.
Affiliation(s)
- M J Martin
- The Asthma Centre, Nottingham Respiratory Research Unit, University of Nottingham, Nottingham City Hospital, Nottingham, UK
| | - H Lee
- The Asthma Centre, Nottingham Respiratory Research Unit, University of Nottingham, Nottingham City Hospital, Nottingham, UK
| | - G Meakin
- The Asthma Centre, Nottingham Respiratory Research Unit, University of Nottingham, Nottingham City Hospital, Nottingham, UK
| | - A Green
- The Asthma Centre, Nottingham Respiratory Research Unit, University of Nottingham, Nottingham City Hospital, Nottingham, UK
| | - R L Simms
- The Asthma Centre, Nottingham Respiratory Research Unit, University of Nottingham, Nottingham City Hospital, Nottingham, UK
| | - C Reynolds
- The Asthma Centre, Nottingham Respiratory Research Unit, University of Nottingham, Nottingham City Hospital, Nottingham, UK
| | - S Winters
- Department of Cellular Pathology, Nottingham University Hospitals NHS Trust, Nottingham City Hospital, Nottingham, UK
| | - D E Shaw
- The Asthma Centre, Nottingham Respiratory Research Unit, University of Nottingham, Nottingham City Hospital, Nottingham, UK
| | - I Soomro
- Department of Cellular Pathology, Nottingham University Hospitals NHS Trust, Nottingham City Hospital, Nottingham, UK
| | - T W Harrison
- The Asthma Centre, Nottingham Respiratory Research Unit, University of Nottingham, Nottingham City Hospital, Nottingham, UK
| |
|
35
|
Doğan T, MacDougall A, Saidi R, Poggioli D, Bateman A, O'Donovan C, Martin MJ. UniProt-DAAC: domain architecture alignment and classification, a new method for automatic functional annotation in UniProtKB. Bioinformatics 2016; 32:2264-71. [PMID: 27153729 PMCID: PMC4965628 DOI: 10.1093/bioinformatics/btw114] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 08/29/2015] [Revised: 01/22/2016] [Accepted: 02/25/2016] [Indexed: 11/17/2022] Open
Abstract
MOTIVATION: Similarity-based methods have been widely used in order to infer the properties of genes and gene products containing little or no experimental annotation. New approaches that overcome the limitations of methods that rely solely upon sequence similarity are attracting increased attention. One of these novel approaches is to use the organization of the structural domains in proteins. RESULTS: We propose a method for the automatic annotation of protein sequences in the UniProt Knowledgebase (UniProtKB) by comparing their domain architectures, classifying proteins based on the similarities and propagating functional annotation. The performance of this method was measured through a cross-validation analysis using the Gene Ontology (GO) annotation of a sub-set of UniProtKB/Swiss-Prot. The results demonstrate the effectiveness of this approach in detecting functional similarity, with an average F-score of 0.85. Applying the method to nearly 55.3 million uncharacterized proteins in UniProtKB/TrEMBL resulted in 44 818 178 GO term predictions for 12 172 114 proteins. 22% of these predictions were for 2 812 016 previously non-annotated protein entries, indicating the significance of the value added by this approach. AVAILABILITY AND IMPLEMENTATION: The results of the method are available at: ftp://ftp.ebi.ac.uk/pub/contrib/martin/DAAC/ CONTACT: tdogan@ebi.ac.uk SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Tunca Doğan
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
| | - Alistair MacDougall
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
| | - Rabie Saidi
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
| | - Diego Poggioli
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
| | - Alex Bateman
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
| | - Claire O'Donovan
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
| | - Maria J Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
| |
|
36
|
Boudellioua I, Saidi R, Hoehndorf R, Martin MJ, Solovyev V. Prediction of Metabolic Pathway Involvement in Prokaryotic UniProtKB Data by Association Rule Mining. PLoS One 2016; 11:e0158896. [PMID: 27390860 PMCID: PMC4938425 DOI: 10.1371/journal.pone.0158896] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 02/01/2016] [Accepted: 06/23/2016] [Indexed: 11/21/2022] Open
Abstract
The widening gap between known proteins and their functions has encouraged the development of methods to automatically infer annotations. Automatic functional annotation of proteins is expected to meet the conflicting requirements of maximizing annotation coverage while minimizing erroneous functional assignments. This trade-off poses a great challenge in designing intelligent systems to tackle the problem of automatic protein annotation. In this work, we present a system that utilizes rule mining techniques to predict metabolic pathways in prokaryotes. The resulting knowledge represents predictive models that assign pathway involvement to UniProtKB entries. We carried out an evaluation study of our system's performance using a cross-validation technique. We found that it achieved very promising results in pathway identification with an F1-measure of 0.982 and an AUC of 0.987. Our prediction models were then successfully applied to 6.2 million UniProtKB/TrEMBL reference proteome entries of prokaryotes. As a result, 663,724 entries were covered, of which 436,510 lacked any previous pathway annotations.
Affiliation(s)
- Imane Boudellioua
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
| | - Rabie Saidi
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom
| | - Robert Hoehndorf
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
| | - Maria J. Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom
| | - Victor Solovyev
- Softberry Inc., 116 Radio Circle, Suite 400, Mount Kisco, NY 10549, United States of America
| |
|
37
|
Kılıç S, Sagitova DM, Wolfish S, Bely B, Courtot M, Ciufo S, Tatusova T, O'Donovan C, Chibucos MC, Martin MJ, Erill I. From data repositories to submission portals: rethinking the role of domain-specific databases in CollecTF. Database (Oxford) 2016; 2016:baw055. [PMID: 27114493 PMCID: PMC4843526 DOI: 10.1093/database/baw055] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 11/06/2015] [Accepted: 03/20/2016] [Indexed: 11/12/2022]
Abstract
Domain-specific databases are essential resources for the biomedical community, leveraging expert knowledge to curate published literature and provide access to referenced data and knowledge. The limited scope of these databases, however, poses important challenges to their infrastructure, visibility, funding and usefulness to the broader scientific community. CollecTF is a community-oriented database documenting experimentally validated transcription factor (TF)-binding sites in the Bacteria domain. In its quest to become a community resource for the annotation of transcriptional regulatory elements in bacterial genomes, CollecTF aims to move away from the conventional data-repository paradigm of domain-specific databases. Through the adoption of well-established ontologies, identifiers and collaborations, CollecTF has also progressively become a portal for the annotation and submission of information on transcriptional regulatory elements to major biological sequence resources (RefSeq, UniProtKB and the Gene Ontology Consortium). This fundamental change in database conception capitalizes on the domain-specific knowledge of contributing communities to provide high-quality annotations, while leveraging the availability of stable information hubs to promote long-term access and provide high visibility to the data. As a submission portal, CollecTF generates TF-binding site information through direct annotation of RefSeq genome records, definition of TF-based regulatory networks in UniProtKB entries and submission of functional annotations to the Gene Ontology. As a database, CollecTF provides enhanced search and browsing, targeted data exports, binding motif analysis tools and integration with motif discovery and search platforms. This innovative approach will allow CollecTF to focus its limited resources on the generation of high-quality information and the provision of specialized access to the data. Database URL: http://www.collectf.org/.
Affiliation(s)
- Sefa Kılıç
- Department of Biological Sciences, University of Maryland Baltimore County (UMBC), 1000 Hilltop Circle, Baltimore, MD, 21250, USA
| | - Dinara M Sagitova
- Department of Biological Sciences, University of Maryland Baltimore County (UMBC), 1000 Hilltop Circle, Baltimore, MD, 21250, USA
| | - Shoshannah Wolfish
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, 21201, USA
| | - Benoit Bely
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Mélanie Courtot
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Stacy Ciufo
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, Rockville Pike, Bethesda, MD, 20894, USA
| | - Tatiana Tatusova
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, Rockville Pike, Bethesda, MD, 20894, USA
| | - Claire O'Donovan
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Marcus C Chibucos
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, 21201, USA
- Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD, 21201, USA
| | - Maria J Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Ivan Erill
- Department of Biological Sciences, University of Maryland Baltimore County (UMBC), 1000 Hilltop Circle, Baltimore, MD, 21250, USA
| |
|
38
|
Woolliams ER, Anhalt K, Ballico M, Bloembergen P, Bourson F, Briaudeau S, Campos J, Cox MG, del Campo D, Dong W, Dury MR, Gavrilov V, Grigoryeva I, Hernanz ML, Jahan F, Khlevnoy B, Khromchenko V, Lowe DH, Lu X, Machin G, Mantilla JM, Martin MJ, McEvoy HC, Rougié B, Sadli M, Salim SGR, Sasajima N, Taubert DR, Todd ADW, Van den Bossche R, van der Ham E, Wang T, Whittam A, Wilthan B, Woods DJ, Woodward JT, Yamada Y, Yamaguchi Y, Yoon HW, Yuan Z. Thermodynamic temperature assignment to the point of inflection of the melting curve of high-temperature fixed points. Philos Trans A Math Phys Eng Sci 2016; 374:20150044. [PMID: 26903099 DOI: 10.1098/rsta.2015.0044] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Accepted: 08/25/2015] [Indexed: 06/05/2023]
Abstract
The thermodynamic temperature of the point of inflection of the melting transition of Re-C, Pt-C and Co-C eutectics has been determined to be 2747.84 ± 0.35 K, 2011.43 ± 0.18 K and 1597.39 ± 0.13 K, respectively, and the thermodynamic temperature of the freezing transition of Cu has been determined to be 1357.80 ± 0.08 K, where the ± symbol represents 95% coverage. These results are the best consensus estimates obtained from measurements made using various spectroradiometric primary thermometry techniques by nine different national metrology institutes. The good agreement between the institutes suggests that spectroradiometric thermometry techniques are sufficiently mature (at least in those institutes) to allow the direct realization of thermodynamic temperature above 1234 K (rather than the use of a temperature scale) and that metal-carbon eutectics can be used as high-temperature fixed points for thermodynamic temperature dissemination. The results directly support the developing mise en pratique for the definition of the kelvin to include direct measurement of thermodynamic temperature.
Affiliation(s)
- E R Woolliams
- National Physical Laboratory (NPL), Hampton Road, Teddington TW11 0LW, UK
| | - K Anhalt
- Physikalisch-Technische Bundesanstalt (PTB), Abbestrasse 2-12, Berlin 10587, Germany
| | - M Ballico
- Temperature Standards, National Measurement Institute Australia (NMIA), Bradfield Road, West Lindfield, New South Wales 2070, Australia
| | - P Bloembergen
- Research Institute for Physical Measurement, National Metrology Institute of Japan, National Institute of Advanced Industrial Science and Technology (AIST), 1-1-1 Umezono, Tsukuba, Ibaraki 305-8563, Japan
- Division of Thermophysics and Process Measurements, National Institute of Metrology (NIM), No. 18 Bei San Huan Dong Lu, Beijing 100029, People's Republic of China
| | - F Bourson
- High Temperature Metrology Department, Laboratoire commun de métrologie (LNE-Cnam), 61 rue du Landy, Saint Denis 93210, France
| | - S Briaudeau
- High Temperature Metrology Department, Laboratoire commun de métrologie (LNE-Cnam), 61 rue du Landy, Saint Denis 93210, France
| | - J Campos
- Optical Institute, Spanish National Research Council (CSIC), Serrano, 144, Madrid 28006, Spain
| | - M G Cox
- National Physical Laboratory (NPL), Hampton Road, Teddington TW11 0LW, UK
| | - D del Campo
- Centro Español de Metrologia, C/del Alfar, 2, Tres Cantos 28760, Spain
| | - W Dong
- Division of Thermophysics and Process Measurements, National Institute of Metrology (NIM), No. 18 Bei San Huan Dong Lu, Beijing 100029, People's Republic of China
| | - M R Dury
- National Physical Laboratory (NPL), Hampton Road, Teddington TW11 0LW, UK
| | - V Gavrilov
- All-Russian Research Institute for Optical and Physical Measurements (VNIIOFI), Ozernaya 46, Moscow 119361, Russia
| | - I Grigoryeva
- All-Russian Research Institute for Optical and Physical Measurements (VNIIOFI), Ozernaya 46, Moscow 119361, Russia
| | - M L Hernanz
- Optical Institute, Spanish National Research Council (CSIC), Serrano, 144, Madrid 28006, Spain
| | - F Jahan
- Temperature Standards, National Measurement Institute Australia (NMIA), Bradfield Road, West Lindfield, New South Wales 2070, Australia
| | - B Khlevnoy
- All-Russian Research Institute for Optical and Physical Measurements (VNIIOFI), Ozernaya 46, Moscow 119361, Russia
| | - V Khromchenko
- Sensor Science Division, National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, MD 20899, USA
| | - D H Lowe
- National Physical Laboratory (NPL), Hampton Road, Teddington TW11 0LW, UK
| | - X Lu
- Division of Thermophysics and Process Measurements, National Institute of Metrology (NIM), No. 18 Bei San Huan Dong Lu, Beijing 100029, People's Republic of China
| | - G Machin
- National Physical Laboratory (NPL), Hampton Road, Teddington TW11 0LW, UK
| | - J M Mantilla
- Centro Español de Metrologia, C/del Alfar, 2, Tres Cantos 28760, Spain
| | - M J Martin
- Centro Español de Metrologia, C/del Alfar, 2, Tres Cantos 28760, Spain
| | - H C McEvoy
- National Physical Laboratory (NPL), Hampton Road, Teddington TW11 0LW, UK
| | - B Rougié
- High Temperature Metrology Department, Laboratoire commun de métrologie (LNE-Cnam), 61 rue du Landy, Saint Denis 93210, France
| | - M Sadli
- High Temperature Metrology Department, Laboratoire commun de métrologie (LNE-Cnam), 61 rue du Landy, Saint Denis 93210, France
| | - S G R Salim
- High Temperature Metrology Department, Laboratoire commun de métrologie (LNE-Cnam), 61 rue du Landy, Saint Denis 93210, France
- Radiometry and Photometry Division, National Institute of Standards (NIS), PO Box 136, President Sadat Street, El-Haram, Giza, Egypt
| | - N Sasajima
- Research Institute for Physical Measurement, National Metrology Institute of Japan, National Institute of Advanced Industrial Science and Technology (AIST), 1-1-1 Umezono, Tsukuba, Ibaraki 305-8563, Japan
| | - D R Taubert
- Physikalisch-Technische Bundesanstalt (PTB), Abbestrasse 2-12, Berlin 10587, Germany
| | - A D W Todd
- National Research Council Canada, 1200 Montreal Road, Ottawa, Ontario K1A 0R6, Canada
| | - R Van den Bossche
- National Physical Laboratory (NPL), Hampton Road, Teddington TW11 0LW, UK
- Department of Physics, University of Surrey, Guildford, Surrey GU2 7XH, UK
| | - E van der Ham
- Temperature Standards, National Measurement Institute Australia (NMIA), Bradfield Road, West Lindfield, New South Wales 2070, Australia
| | - T Wang
- Division of Thermophysics and Process Measurements, National Institute of Metrology (NIM), No. 18 Bei San Huan Dong Lu, Beijing 100029, People's Republic of China
| | - A Whittam
- National Physical Laboratory (NPL), Hampton Road, Teddington TW11 0LW, UK
| | - B Wilthan
- Physikalisch-Technische Bundesanstalt (PTB), Abbestrasse 2-12, Berlin 10587, Germany
| | - D J Woods
- National Research Council Canada, 1200 Montreal Road, Ottawa, Ontario K1A 0R6, Canada
| | - J T Woodward
- Sensor Science Division, National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, MD 20899, USA
| | - Y Yamada
- Research Institute for Physical Measurement, National Metrology Institute of Japan, National Institute of Advanced Industrial Science and Technology (AIST), 1-1-1 Umezono, Tsukuba, Ibaraki 305-8563, Japan
| | - Y Yamaguchi
- Research Institute for Physical Measurement, National Metrology Institute of Japan, National Institute of Advanced Industrial Science and Technology (AIST), 1-1-1 Umezono, Tsukuba, Ibaraki 305-8563, Japan
| | - H W Yoon
- Sensor Science Division, National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, MD 20899, USA
| | - Z Yuan
- Division of Thermophysics and Process Measurements, National Institute of Metrology (NIM), No. 18 Bei San Huan Dong Lu, Beijing 100029, People's Republic of China
| |
|
39
|
Sadli M, Machin G, Anhalt K, Bourson F, Briaudeau S, del Campo D, Diril A, Kozlova O, Lowe DH, Mantilla Amor JM, Martin MJ, McEvoy HC, Ojanen-Saloranta M, Pehlivan Ö, Rougié B, Salim SGR. Dissemination of thermodynamic temperature above the freezing point of silver. Philos Trans A Math Phys Eng Sci 2016; 374:20150043. [PMID: 26903097 DOI: 10.1098/rsta.2015.0043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/05/2016] [Indexed: 06/05/2023]
Abstract
The mise-en-pratique for the definition of the kelvin at high temperatures will formally allow dissemination of thermodynamic temperature either directly or mediated through high-temperature fixed points (HTFPs). In this paper, these two distinct dissemination methods are evaluated, namely source-based and detector-based. This was achieved by performing two distinct dissemination trials: one based on HTFPs, the other based on absolutely calibrated radiation thermometers or filter radiometers. These trials involved six national metrology institutes in Europe in the frame of the European Metrology Research Programme joint project 'Implementing the new kelvin' (InK). The results have shown that both dissemination routes are possible, with similar standard uncertainties of 1-2 K, over the range 1273-2773 K, showing that, depending on the facilities available in the laboratory, it will soon be possible to disseminate thermodynamic temperatures above 1273 K to users by either of the two methods with uncertainties comparable to the current temperature scale.
Affiliation(s)
- M Sadli
- High Temperature Metrology Department, Laboratoire Commun de Métrologie, LNE-Cnam, 61 rue du Landy, St Denis 93210, France
| | - G Machin
- Engineering Measurement Division, National Physical Laboratory (NPL), Hampton Road, Teddington TW11 0LW, UK
| | - K Anhalt
- Physikalisch-Technische Bundesanstalt (PTB), Abbestrasse 2-12, Berlin 10587, Germany
| | - F Bourson
- High Temperature Metrology Department, Laboratoire Commun de Métrologie, LNE-Cnam, 61 rue du Landy, St Denis 93210, France
| | - S Briaudeau
- High Temperature Metrology Department, Laboratoire Commun de Métrologie, LNE-Cnam, 61 rue du Landy, St Denis 93210, France
| | - D del Campo
- Centro Español de Metrologia (CEM), C/del Alfar, 2, Tres Cantos 28760, Spain
| | - A Diril
- Tubitak Ulusal Metroloji Enstitüsü (TUBITAK UME), Gebze/Kocaeli 41400, Turkey
| | - O Kozlova
- High Temperature Metrology Department, Laboratoire Commun de Métrologie, LNE-Cnam, 61 rue du Landy, St Denis 93210, France
| | - D H Lowe
- Engineering Measurement Division, National Physical Laboratory (NPL), Hampton Road, Teddington TW11 0LW, UK
| | - J M Mantilla Amor
- Centro Español de Metrologia (CEM), C/del Alfar, 2, Tres Cantos 28760, Spain
| | - M J Martin
- Centro Español de Metrologia (CEM), C/del Alfar, 2, Tres Cantos 28760, Spain
| | - H C McEvoy
- Engineering Measurement Division, National Physical Laboratory (NPL), Hampton Road, Teddington TW11 0LW, UK
| | - M Ojanen-Saloranta
- Thermal and Mass Metrology, VTT Technical Research Centre of Finland Ltd, Centre for Metrology MIKES, Espoo 02150, Finland
| | - Ö Pehlivan
- Tubitak Ulusal Metroloji Enstitüsü (TUBITAK UME), Gebze/Kocaeli 41400, Turkey
| | - B Rougié
- High Temperature Metrology Department, Laboratoire Commun de Métrologie, LNE-Cnam, 61 rue du Landy, St Denis 93210, France
| | - S G R Salim
- High Temperature Metrology Department, Laboratoire Commun de Métrologie, LNE-Cnam, 61 rue du Landy, St Denis 93210, France
- Radiometry and Photometry Division, National Institute of Standards (NIS), President Sadat Street, El-Haram, PO Box 136, Giza, Egypt
| |
|
40
|
Abstract
The Universal Protein Resource (UniProt) is a comprehensive resource for protein sequence and annotation data (UniProt Consortium, 2015). The UniProt Web site receives ∼400,000 unique visitors per month and is the primary means to access UniProt. Along with various datasets that you can search, UniProt provides three main tools. These are the 'BLAST' tool for sequence similarity searching, the 'Align' tool for multiple sequence alignment, and the 'Retrieve/ID Mapping' tool for using a list of identifiers to retrieve UniProtKB proteins and to convert database identifiers from UniProt to external databases or vice versa. This unit provides three basic protocols, three alternate protocols, and two support protocols for using UniProt tools.
Affiliation(s)
- Sangya Pundir
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Cambridge, United Kingdom
| | - Maria J Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Cambridge, United Kingdom
| | - Claire O'Donovan
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Cambridge, United Kingdom
| | -
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Cambridge, United Kingdom
- Swiss Institute of Bioinformatics, University Medical Center, Geneva, Switzerland
- Protein Information Resource, Georgetown University Medical Center, Washington, D.C.
- Protein Information Resource, University of Delaware, Newark, Delaware
| |
|
41
|
Da Silva S, Cal-Pereyra LG, Benech A, Acosta-Dibarrat J, Martin MJ, Abreu MC, Perini S, González-Montaña JR. Evaluation of a fibrate, specific stimulant of PPARα, as a therapeutic alternative to the treatment of clinical ovine pregnancy toxaemia. J Vet Pharmacol Ther 2016; 39:497-503. [PMID: 26969801 DOI: 10.1111/jvp.12304] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 07/16/2015] [Revised: 12/22/2015] [Accepted: 02/14/2016] [Indexed: 11/27/2022]
Abstract
Ovine pregnancy toxaemia is a metabolic disorder affecting sheep in their last 6 weeks of pregnancy as a result of their inability to maintain adequate energy homoeostasis. Different alternative treatments are available with variable results. The aim of this research was to evaluate a peroxisome proliferator-activated receptor alpha (PPARα) stimulant as an alternative to treat clinical pregnancy toxaemia. Thirty-three adult sheep, with known gestation date and carrying a single foetus, were fasted from day 130 of gestation until animals showed clinical disease. From that moment onwards, sheep were treated for 6 days with one of three therapeutic alternatives: 10 mg/kg of 2-methyl-2-phenoxy-propionic acid; 10 mg/kg of 2-methyl-2-phenoxy-propionic acid + 100 mL of oral propylene glycol; or 100 mL of oral propylene glycol. Glycaemia and serum β-hydroxybutyrate (BHOB) were determined daily. Liver biopsies were taken at day 130 of gestation, at the beginning and end of treatments and at 5 days postpartum, evaluating the extent and degree of the steatosis lesion. Even though serum concentrations of glucose and BHOB recovered more slowly in sheep treated with 2-methyl-2-phenoxy-propionic acid, we conclude that 2-methyl-2-phenoxy-propionic acid alone or combined with propylene glycol can be used as an alternative to effectively treat fatty liver, and therefore pregnancy toxaemia.
Affiliation(s)
- S Da Silva
- Pathology Department, Veterinary Faculty, University of La República, Montevideo, Uruguay
- L G Cal-Pereyra
- Pathology Department, Veterinary Faculty, University of La República, Montevideo, Uruguay
- A Benech
- Small Animals Department, Veterinary Faculty, University of La República, Montevideo, Uruguay
- J Acosta-Dibarrat
- Center for Research and Advanced Studies in Animal Health, Faculty of Veterinary Medicine, Autonomous University of Mexico State, Toluca, Mexico
- M J Martin
- Medicine, Surgery and Anatomy Veterinary Department, Veterinary Faculty, University of León, León, Spain
- M C Abreu
- Pathology Department, Veterinary Faculty, University of La República, Montevideo, Uruguay
- S Perini
- Pathology Department, Veterinary Faculty, University of La República, Montevideo, Uruguay
- J R González-Montaña
- Medicine, Surgery and Anatomy Veterinary Department, Veterinary Faculty, University of León, León, Spain
42
Abstract
The microbiota has recently been recognized as a driver of health that affects the immune, nervous, and metabolic systems. This influence is partially exerted through the metabolites produced, which may be relevant for optimal infant development and health. The gut microbiota begins developing early in life, and this initial colonization is remarkably important because it may influence long-term microbiota composition and activity. Considering that the microbiome may play a key role in health and disease, maintaining a protective microbiota could be critical in preventing dysbiosis-related diseases such as allergies, autoimmunity disorders, and metabolic syndrome. Breast milk and milk glycans in particular are thought to play a major role in shaping the early-life microbiota and promoting its development, thus affecting health. This review describes some of the effects the microbiota has on the host and discusses the role microbial metabolites play in shaping newborn health and development. We describe the gut microbiota structure and function during early life and the factors that determine its composition and hypothesize about the effects of human milk oligosaccharides and other prebiotic fibers on the neonatal microbiota.
Affiliation(s)
- Maria J Martin
- Discovery R&D Department, Abbott Nutrition, Granada, Spain
43
Martin MJ, Wilson E, Gerrard-Tarpey W, Meakin G, Hearson G, McKeever TM, Shaw DE, Harrison TW. The utility of exhaled nitric oxide in patients with suspected asthma. Thorax 2016; 71:562-4. [PMID: 26903595 DOI: 10.1136/thoraxjnl-2015-208014] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 11/04/2015] [Accepted: 01/28/2016] [Indexed: 11/03/2022]
Abstract
The value of FENO measurements in patients with symptoms suggestive of asthma is unclear. We performed an observational study to assess the ability of FENO to diagnose asthma and to predict response to inhaled corticosteroids (ICS). Our findings suggest FENO is not useful for asthma diagnosis but is accurate at predicting ICS response.
Affiliation(s)
- M J Martin
- Nottingham Respiratory Research Unit, University of Nottingham, Nottingham City Hospital, Nottingham, UK
- E Wilson
- Nottingham Respiratory Research Unit, University of Nottingham, Nottingham City Hospital, Nottingham, UK
- W Gerrard-Tarpey
- Nottingham Respiratory Research Unit, University of Nottingham, Nottingham City Hospital, Nottingham, UK
- G Meakin
- Nottingham Respiratory Research Unit, University of Nottingham, Nottingham City Hospital, Nottingham, UK
- G Hearson
- Nottingham Respiratory Research Unit, University of Nottingham, Nottingham City Hospital, Nottingham, UK
- T M McKeever
- Nottingham Respiratory Research Unit, University of Nottingham, Nottingham City Hospital, Nottingham, UK
- D E Shaw
- Nottingham Respiratory Research Unit, University of Nottingham, Nottingham City Hospital, Nottingham, UK
- T W Harrison
- Nottingham Respiratory Research Unit, University of Nottingham, Nottingham City Hospital, Nottingham, UK
44
Apweiler R, Bairoch A, Wu CH, Barker WC, Boeckmann B, Ferro S, Gasteiger E, Huang H, Lopez R, Magrane M, Martin MJ, Natale DA, O'Donovan C, Redaschi N, Yeh LSL. Host-virus interactions in hepatitis B and hepatitis C infection. J Gastroenterol 2016; 32:D115-9. [PMID: 14681372 PMCID: PMC308865 DOI: 10.1093/nar/gkh131] [Citation(s) in RCA: 2197] [Impact Index Per Article: 274.6] [Indexed: 12/13/2022]
Abstract
Hepatitis B virus (HBV) and hepatitis C virus (HCV) are among the most endemic pathogens worldwide, with more than 500 million people globally currently infected with these viruses. These pathogens can cause acute and chronic hepatitis that progress to liver cirrhosis or hepatocellular carcinoma. Both viruses utilize multifaceted strategies to evade the host surveillance system and fall below the immunological radar. HBV has developed specific strategies to evade recognition by the innate immune system and is acknowledged to be a stealth virus. However, extensive research has revealed that HBV is recognized by dendritic cells (DCs) and natural killer (NK) cells. Indoleamine-2,3-dioxygenase is an enforcer of sequential immune reactions in acute hepatitis B, and this molecule has been shown to be induced by the interaction of HBV-infected hepatocytes, DCs, and NK cells. The interleukin-28B genotype has been reported to influence HCV eradication either therapeutically or spontaneously, but the biological function of its gene product, a type-III interferon (IFN-λ3), remains to be elucidated. Human BDCA3+ DCs have also been shown to be potent producers of IFN-λ3 in HCV infection, suggesting that BDCA3+ DCs could play a key role in developing a therapeutic HCV vaccine. Here we review the current state of research on immune responses against HBV and HCV infection, with a specific focus on innate immunity. A comprehensive study based on clinical samples is urgently needed to improve our understanding of the immune mechanisms associated with viral control and thus to develop novel immune modulatory therapies to cure chronic HBV and HCV infection.
Affiliation(s)
- Rolf Apweiler
- The EMBL Outstation - European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
45
Camprubi Robles M, Campoy C, Garcia Fernandez L, Lopez-Pedrosa JM, Rueda R, Martin MJ. Maternal Diabetes and Cognitive Performance in the Offspring: A Systematic Review and Meta-Analysis. PLoS One 2015; 10:e0142583. [PMID: 26566144 PMCID: PMC4643884 DOI: 10.1371/journal.pone.0142583] [Citation(s) in RCA: 74] [Impact Index Per Article: 8.2] [Received: 07/31/2015] [Accepted: 10/24/2015] [Indexed: 12/02/2022] Open
Abstract
Objective: Diabetes during gestation is one of the most common pregnancy complications associated with adverse health effects for the mother and the child. Maternal diabetes has been proposed to negatively affect the cognitive abilities of the child, but experimental research assessing its impact is conflicting. The main aim of our study was to compare cognitive function in children of diabetic and healthy pregnant women. Methods: A systematic review and meta-analysis was conducted through a literature search using different electronic databases from the index date to January 31, 2015. We included studies that assessed cognitive abilities in children (up to 14 years) of diabetic and non-diabetic mothers using standardized and validated neuropsychological tests. Results: Of 7,698 references reviewed, 12 studies involving 6,140 infants met our inclusion criteria and contributed to the meta-analysis. A random-effects model was used to compute the standardized mean differences, and 95% confidence intervals (CIs) were calculated. Infants (1–2 years) of diabetic mothers had significantly lower scores of mental and psychomotor development compared to control infants. The effect size for mental development was -0.41 (95% CI -0.59, -0.24; p<0.0001) and for psychomotor development was -0.31 (95% CI -0.55, -0.07; p = 0.0125), with non-significant heterogeneity. Diabetes during pregnancy could be associated with decreased intelligence quotient scores in school-age children, although studies showed significant heterogeneity. Conclusion: The association between maternal diabetes and deleterious effects on mental/psychomotor development and overall intellectual function in the offspring must be taken with caution. Results are based on observational cohorts, and a direct causal influence of intrauterine hyperglycemia remains uncertain. Therefore, more trials that include larger populations are warranted to elucidate whether gestational diabetes mellitus (GDM) has a negative impact on the offspring central nervous system (CNS).
Affiliation(s)
- Cristina Campoy
- Department of Pediatrics, School of Medicine, University of Granada, Granada, Spain
- Ricardo Rueda
- Abbott Nutrition, Research and Development, Granada, Spain
- Maria J. Martin
- Abbott Nutrition, Research and Development, Granada, Spain
46
Abstract
The Universal Protein Resource (UniProt) is a comprehensive resource for protein sequence and annotation data. The UniProt Web site receives ∼400,000 unique visitors per month and is the primary means to access UniProt. It provides ten searchable datasets and three main tools. The key UniProt datasets are the UniProt Knowledgebase (UniProtKB), the UniProt Reference Clusters (UniRef), the UniProt Archive (UniParc), and protein sets for completely sequenced genomes (Proteomes). Other supporting datasets include information about proteins that is present in UniProtKB protein entries such as literature citations, taxonomy, and subcellular locations, among others. This paper focuses on how to use UniProt datasets. The basic protocol describes navigation and searching mechanisms for the UniProt datasets, while two alternative protocols build on the basic protocol to describe advanced search and query building.
Affiliation(s)
- Sangya Pundir
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom
- Michele Magrane
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom
- Maria J Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom
- Claire O'Donovan
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom
-
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom; Swiss Institute of Bioinformatics, Geneva, Switzerland; Protein Information Resource, Washington, D.C.
47
Abstract
The plasma potential, Vₚ, is a key quantity in experimental plasma physics. Its spatial gradients directly yield the electrostatic field present. Emissive probes operating under space-charge limited emission conditions float close to Vₚ even under time-varying conditions. Throughout their long history in plasma physics, they have mostly been constructed with resistively heated tungsten wire filaments. In high density plasmas (>10¹² cm⁻³), hexaboride emitters are required because tungsten filaments cannot be heated to sufficient emission without component failure. A resistively heated emissive probe with a cerium hexaboride, CeB₆, emitter has been developed to work in plasma densities up to 10¹³ cm⁻³. To show functionality, three spatial profiles of Vₚ are compared using the emissive probe, a cold floating probe, and a swept probe inside a plasma containing regions with and without current. The swept probe and emissive probe agree well across the profile while the floating cold probe fails in the current carrying region.
Affiliation(s)
- M J Martin
- Department of Physics and Astronomy, University of California, Los Angeles, California 90095, USA
- J Bonde
- Department of Physics and Astronomy, University of California, Los Angeles, California 90095, USA
- W Gekelman
- Department of Physics and Astronomy, University of California, Los Angeles, California 90095, USA
- P Pribyl
- Department of Physics and Astronomy, University of California, Los Angeles, California 90095, USA
48
Vega E, Monclús H, Gonzalez-Olmos R, Martin MJ. Optimizing chemical conditioning for odour removal of undigested sewage sludge in drying processes. J Environ Manage 2015; 150:111-119. [PMID: 25438118 DOI: 10.1016/j.jenvman.2014.11.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2014] [Revised: 11/03/2014] [Accepted: 11/08/2014] [Indexed: 06/04/2023]
Abstract
Emission of odours during thermal drying in sludge handling processes is one of the main sources of odour problems in wastewater treatment plants. The objective of this work was to assess the use of response surface methodology as a technique to optimize the chemical conditioning of undigested sewage sludge, in order to improve dewaterability and to reduce odour emissions during thermal drying of the sludge. Synergistic effects between inorganic conditioners (iron chloride and calcium oxide) were observed in terms of sulphur emissions and odour reduction. The developed quadratic models indicated that, by optimizing the conditioner dosage, it is possible to increase dewaterability by 70% and to reduce the emission of odour and volatile sulphur compounds by 50% and 54%, respectively. The optimization of the conditioning process was validated experimentally.
Affiliation(s)
- Esther Vega
- LEQUIA, Institute of the Environment, University of Girona, Campus Montilivi, E-17071 Girona, Catalonia, Spain
- Hèctor Monclús
- LEQUIA, Institute of the Environment, University of Girona, Campus Montilivi, E-17071 Girona, Catalonia, Spain
- Rafael Gonzalez-Olmos
- LEQUIA, Institute of the Environment, University of Girona, Campus Montilivi, E-17071 Girona, Catalonia, Spain; IQS School of Engineering, Universitat Ramon Llull, Via Augusta 390, 08017 Barcelona, Spain
- Maria J Martin
- LEQUIA, Institute of the Environment, University of Girona, Campus Montilivi, E-17071 Girona, Catalonia, Spain
49
Alpi E, Griss J, da Silva AWS, Bely B, Antunes R, Zellner H, Ríos D, O'Donovan C, Vizcaíno JA, Martin MJ. Analysis of the tryptic search space in UniProt databases. Proteomics 2014; 15:48-57. [PMID: 25307260 PMCID: PMC4298651 DOI: 10.1002/pmic.201400227] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 05/22/2014] [Revised: 08/07/2014] [Accepted: 10/06/2014] [Indexed: 01/23/2023]
Abstract
In this article, we provide a comprehensive study of the content of the Universal Protein Resource (UniProt) protein data sets for human and mouse. The tryptic search spaces of the UniProtKB (UniProt knowledgebase) complete proteome sets were compared with other data sets from UniProtKB and with the corresponding International Protein Index, reference sequence, Ensembl, and UniRef100 (where UniRef is UniProt reference clusters) organism-specific data sets. All protein forms annotated in UniProtKB (both the canonical sequences and isoforms) were evaluated in this study. In addition, natural and disease-associated amino acid variants annotated in UniProtKB were included in the evaluation. The peptide unicity was also evaluated for each data set. Furthermore, the peptide information in the UniProtKB data sets was also compared against the available peptide-level identifications in the main MS-based proteomics repositories. Identifying the peptides observed in these repositories is an important resource of information for protein databases as they provide supporting evidence for the existence of otherwise predicted proteins. Likewise, the repositories could use the information available in UniProtKB to direct reprocessing efforts on specific sets of peptides/proteins of interest. In summary, we provide comprehensive information about the different organism-specific sequence data sets available from UniProt, together with the pros and cons for each, in terms of search space for MS-based bottom-up proteomics workflows. The aim of the analysis is to provide a clear view of the tryptic search space of UniProt and other protein databases to enable scientists to select those most appropriate for their purposes.
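As a rough illustration of what a "tryptic search space" and "peptide unicity" mean in this abstract, the sketch below digests protein sequences in silico and keeps the peptides that occur in only one protein of a set. It is a minimal sketch, not the authors' pipeline: the cleavage rule (cut after K or R unless followed by P) is a common simplification of trypsin specificity, and the function names, length limits, and toy sequences are hypothetical.

```python
import re
from collections import defaultdict

# Assumed rule: cleave C-terminal to K or R, but not when the next residue is P.
# Splitting on a zero-width pattern requires Python 3.7 or later.
_CLEAVAGE_SITE = re.compile(r"(?<=[KR])(?!P)")


def tryptic_peptides(sequence, missed_cleavages=0, min_len=1, max_len=50):
    """Return the set of peptides from an in-silico tryptic digest."""
    fragments = [f for f in _CLEAVAGE_SITE.split(sequence) if f]
    peptides = set()
    for i in range(len(fragments)):
        # Join up to `missed_cleavages` additional consecutive fragments.
        last = min(i + missed_cleavages, len(fragments) - 1)
        for j in range(i, last + 1):
            peptide = "".join(fragments[i:j + 1])
            if min_len <= len(peptide) <= max_len:
                peptides.add(peptide)
    return peptides


def peptide_unicity(proteins, **digest_kwargs):
    """Map each peptide found in exactly one protein to that protein's name."""
    owners = defaultdict(set)
    for name, sequence in proteins.items():
        for peptide in tryptic_peptides(sequence, **digest_kwargs):
            owners[peptide].add(name)
    return {pep: next(iter(names)) for pep, names in owners.items() if len(names) == 1}


if __name__ == "__main__":
    # Toy sequences, purely illustrative (not real UniProtKB entries).
    toy_set = {"protein_a": "AKCDRPEFKGH", "protein_b": "AKLMNRGH"}
    print(sorted(tryptic_peptides(toy_set["protein_a"])))
    print(peptide_unicity(toy_set))
```

Allowing missed cleavages or adding isoform and variant sequences enlarges the set of candidate peptides, which is the kind of effect on search space that the comparison across data sets is concerned with.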
Affiliation(s)
- Emanuele Alpi
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
50
Huntley RP, Sawford T, Mutowo-Meullenet P, Shypitsyna A, Bonilla C, Martin MJ, O'Donovan C. The GOA database: Gene Ontology annotation updates for 2015. Nucleic Acids Res 2014; 43:D1057-63. [PMID: 25378336 PMCID: PMC4383930 DOI: 10.1093/nar/gku1113] [Citation(s) in RCA: 372] [Impact Index Per Article: 37.2] [Indexed: 01/01/2023] Open
Abstract
The Gene Ontology Annotation (GOA) resource (http://www.ebi.ac.uk/GOA) provides evidence-based Gene Ontology (GO) annotations to proteins in the UniProt Knowledgebase (UniProtKB). Manual annotations provided by UniProt curators are supplemented by manual and automatic annotations from model organism databases and specialist annotation groups. GOA currently supplies 368 million GO annotations to almost 54 million proteins in more than 480 000 taxonomic groups. The resource now provides annotations to five times the number of proteins it did 4 years ago. As a member of the GO Consortium, we adhere to the most up-to-date Consortium-agreed annotation guidelines via the use of quality control checks that ensure that the GOA resource supplies high-quality functional information to proteins from a wide range of species. Annotations from GOA are freely available and are accessible through a powerful web browser as well as a variety of annotation file formats.
Affiliation(s)
- Rachael P Huntley
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Tony Sawford
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Prudence Mutowo-Meullenet
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Aleksandra Shypitsyna
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Carlos Bonilla
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Maria J Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Claire O'Donovan
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK