1
Rahman N, O'Cathail C, Zyoud A, Sokolov A, Oude Munnink B, Grüning B, Cummins C, Amid C, Nieuwenhuijse DF, Visontai D, Yuan DY, Gupta D, Prasad DK, Gulyás GM, Rinck G, McKinnon J, Rajan J, Knaggs J, Skiby JE, Stéger J, Szarvas J, Gueye K, Papp K, Hoek M, Kumar M, Ventouratou MA, Bouquieaux MC, Koliba M, Mansurova M, Haseeb M, Worp N, Harrison PW, Leinonen R, Thorne R, Selvakumar S, Hunt S, Venkataraman S, Jayathilaka S, Cezard T, Maier W, Waheed Z, Iqbal Z, Aarestrup FM, Csabai I, Koopmans M, Burdett T, Cochrane G. Mobilisation and analyses of publicly available SARS-CoV-2 data for pandemic responses. Microb Genom 2024; 10:001188. [PMID: 38358325 PMCID: PMC10926692 DOI: 10.1099/mgen.0.001188] [Received: 05/04/2023] [Accepted: 01/14/2024] [Indexed: 02/16/2024]
Abstract
The COVID-19 pandemic has seen large-scale pathogen genomic sequencing efforts, becoming part of the toolbox for surveillance and epidemic research. This resulted in an unprecedented level of data sharing to open repositories, which has actively supported the identification of SARS-CoV-2 structure, molecular interactions, mutations and variants, and facilitated vaccine development and drug reuse studies and design. The European COVID-19 Data Platform was launched to support this data sharing, and has resulted in the deposition of several million SARS-CoV-2 raw reads. In this paper we describe (1) open data sharing, (2) tools for submission, analysis, visualisation and data claiming (e.g. ORCiD), (3) the systematic analysis of these datasets, at scale via the SARS-CoV-2 Data Hubs as well as (4) lessons learnt. This paper describes a component of the Platform, the SARS-CoV-2 Data Hubs, which enable the extension and set up of infrastructure that we intend to use more widely in the future for pathogen surveillance and pandemic preparedness.
Affiliation(s)
- Nadim Rahman
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Colman O'Cathail
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Ahmad Zyoud
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Alexey Sokolov
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Bas Oude Munnink
- Erasmus Medical Center, Wytemaweg 80, 3015 CN Rotterdam, Netherlands
- Björn Grüning
- University of Freiburg, Friedrichstr. 39, 79098 Freiburg, Germany
- Carla Cummins
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Clara Amid
- Erasmus Medical Center, Wytemaweg 80, 3015 CN Rotterdam, Netherlands
- Dávid Visontai
- Eötvös Loránd University, H-1053 Budapest, Egyetem tér 1-3, Hungary
- David Yu Yuan
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Dipayan Gupta
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Divyae K. Prasad
- Erasmus Medical Center, Wytemaweg 80, 3015 CN Rotterdam, Netherlands
- Gábor Máté Gulyás
- Technical University of Denmark, Anker Engelunds Vej 101, 2800 Kongens Lyngby, Denmark
- Gabriele Rinck
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Jasmine McKinnon
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Jeena Rajan
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Jeff Knaggs
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Jeffrey Edward Skiby
- Technical University of Denmark, Anker Engelunds Vej 101, 2800 Kongens Lyngby, Denmark
- József Stéger
- Eötvös Loránd University, H-1053 Budapest, Egyetem tér 1-3, Hungary
- Judit Szarvas
- Technical University of Denmark, Anker Engelunds Vej 101, 2800 Kongens Lyngby, Denmark
- Khadim Gueye
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Krisztián Papp
- Eötvös Loránd University, H-1053 Budapest, Egyetem tér 1-3, Hungary
- Maarten Hoek
- Erasmus Medical Center, Wytemaweg 80, 3015 CN Rotterdam, Netherlands
- Manish Kumar
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Marianna A. Ventouratou
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Martin Koliba
- Technical University of Denmark, Anker Engelunds Vej 101, 2800 Kongens Lyngby, Denmark
- Milena Mansurova
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Muhammad Haseeb
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Nathalie Worp
- Erasmus Medical Center, Wytemaweg 80, 3015 CN Rotterdam, Netherlands
- Peter W. Harrison
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Rasko Leinonen
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Ross Thorne
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Sandeep Selvakumar
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Sarah Hunt
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Sundar Venkataraman
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Suran Jayathilaka
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Timothée Cezard
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Wolfgang Maier
- University of Freiburg, Friedrichstr. 39, 79098 Freiburg, Germany
- Zahra Waheed
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Zamin Iqbal
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Istvan Csabai
- Eötvös Loránd University, H-1053 Budapest, Egyetem tér 1-3, Hungary
- Marion Koopmans
- Erasmus Medical Center, Wytemaweg 80, 3015 CN Rotterdam, Netherlands
- Tony Burdett
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Guy Cochrane
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
2
George N, Fexova S, Fuentes AM, Madrigal P, Bi Y, Iqbal H, Kumbham U, Nolte N, Zhao L, Thanki A, Yu I, Marugan Calles J, Erdos K, Vilmovsky L, Kurri S, Vathrakokoili-Pournara A, Osumi-Sutherland D, Prakash A, Wang S, Tello-Ruiz M, Kumari S, Ware D, Goutte-Gattat D, Hu Y, Brown N, Perrimon N, Vizcaíno JA, Burdett T, Teichmann S, Brazma A, Papatheodorou I. Expression Atlas update: insights from sequencing data at both bulk and single cell level. Nucleic Acids Res 2024; 52:D107-D114. [PMID: 37992296 PMCID: PMC10767917 DOI: 10.1093/nar/gkad1021] [Received: 09/15/2023] [Revised: 10/13/2023] [Accepted: 10/30/2023] [Indexed: 11/24/2023]
Abstract
Expression Atlas (www.ebi.ac.uk/gxa) and its newest counterpart the Single Cell Expression Atlas (www.ebi.ac.uk/gxa/sc) are EMBL-EBI's knowledgebases for gene and protein expression and localisation in bulk and at single cell level. These resources aim to allow users to investigate their expression in normal tissue (baseline) or in response to perturbations such as disease or changes to genotype (differential) across multiple species. Users are invited to search for genes or metadata terms across species or biological conditions in a standardised consistent interface. Alongside these data, new features in Single Cell Expression Atlas allow users to query metadata through our new cell type wheel search. At the experiment level data can be explored through two types of dimensionality reduction plots, t-distributed Stochastic Neighbor Embedding (tSNE) and Uniform Manifold Approximation and Projection (UMAP), overlaid with either clustering or metadata information to assist users' understanding. Data are also visualised as marker gene heatmaps identifying genes that help confer cluster identity. For some data, additional visualisations are available as interactive cell level anatomograms and cell type gene expression heatmaps.
Affiliation(s)
- Nancy George
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Silvie Fexova
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Alfonso Munoz Fuentes
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Pedro Madrigal
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Yalan Bi
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Haider Iqbal
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Upendra Kumbham
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Nadja Francesca Nolte
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Lingyun Zhao
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Anil S Thanki
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Iris D Yu
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Jose C Marugan Calles
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Karoly Erdos
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Liora Vilmovsky
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Sandeep R Kurri
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- David Osumi-Sutherland
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Ananth Prakash
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Shengbo Wang
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Marcela K Tello-Ruiz
- Cold Spring Harbour Laboratory, One Bungtown Road, Cold Spring Harbor, NY 11724, USA
- Sunita Kumari
- Cold Spring Harbour Laboratory, One Bungtown Road, Cold Spring Harbor, NY 11724, USA
- Doreen Ware
- Cold Spring Harbour Laboratory, One Bungtown Road, Cold Spring Harbor, NY 11724, USA
- USDA ARS NEA, Plant Soil & Nutrition Laboratory Research Unit, Ithaca, NY 14853, USA
- Damien Goutte-Gattat
- FlyBase-Cambridge, Department of Physiology, Development and Neuroscience, University of Cambridge Downing Street, Cambridge CB2 3DY, UK
- Yanhui Hu
- Perrimon Lab, Department of Genetics, Harvard Medical School, Boston MA 02115, USA
- Nick Brown
- FlyBase-Cambridge, Department of Physiology, Development and Neuroscience, University of Cambridge Downing Street, Cambridge CB2 3DY, UK
- Norbert Perrimon
- Perrimon Lab, Department of Genetics, Harvard Medical School, Boston MA 02115, USA
- FlyBase-Harvard Biological Laboratories, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA
- Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Tony Burdett
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Sarah Teichmann
- Wellcome Trust Sanger Institute. Wellcome Genome Campus, Hinxton CB10 1SA, UK
- Alvis Brazma
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Irene Papatheodorou
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
3
Yuan D, Ahamed A, Burgin J, Cummins C, Devraj R, Gueye K, Gupta D, Gupta V, Haseeb M, Ihsan M, Ivanov E, Jayathilaka S, Kadhirvelu VB, Kumar M, Lathi A, Leinonen R, McKinnon J, Meszaros L, O’Cathail C, Ouma D, Paupério J, Pesant S, Rahman N, Rinck G, Selvakumar S, Suman S, Sunthornyotin Y, Ventouratou M, Vijayaraja S, Waheed Z, Woollard P, Zyoud A, Burdett T, Cochrane G. The European Nucleotide Archive in 2023. Nucleic Acids Res 2024; 52:D92-D97. [PMID: 37956313 PMCID: PMC10767888 DOI: 10.1093/nar/gkad1067] [Received: 10/18/2023] [Revised: 10/23/2023] [Accepted: 10/25/2023] [Indexed: 11/15/2023]
Abstract
The European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena) is maintained by the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI). The ENA is one of the three members of the International Nucleotide Sequence Database Collaboration (INSDC). It serves the bioinformatics community worldwide via the submission, processing, archiving and dissemination of sequence data. The ENA supports data types ranging from raw reads, through alignments and assemblies to functional annotation. The data is enriched with contextual information relating to samples and experimental configurations. In this article, we describe recent progress and improvements to ENA services. In particular, we focus upon three areas of work in 2023: FAIRness of ENA data, pandemic preparedness and foundational technology. For FAIRness, we have introduced minimal requirements for spatiotemporal annotation, created a metadata-based classification system, incorporated third party metadata curations with archived records, and developed a new rapid visualisation platform, the ENA Notebooks. For foundational enhancements, we have improved the INSDC data exchange and synchronisation pipelines, and invested in site reliability engineering for ENA infrastructure. In order to support genomic surveillance efforts, we have continued to provide ENA services in support of SARS-CoV-2 data mobilisation and have adapted these for broader pathogen surveillance efforts.
Affiliation(s)
- David Yuan
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Alisha Ahamed
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Josephine Burgin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Carla Cummins
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Rajkumar Devraj
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Khadim Gueye
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Dipayan Gupta
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Vikas Gupta
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Muhammad Haseeb
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Maira Ihsan
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Eugene Ivanov
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Suran Jayathilaka
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Manish Kumar
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Ankur Lathi
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Rasko Leinonen
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Jasmine McKinnon
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Lili Meszaros
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Colman O’Cathail
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Dennis Ouma
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Joana Paupério
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Stephane Pesant
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Nadim Rahman
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Gabriele Rinck
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Sandeep Selvakumar
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Swati Suman
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Yanisa Sunthornyotin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Marianna Ventouratou
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Senthilnathan Vijayaraja
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Zahra Waheed
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Peter Woollard
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Ahmad Zyoud
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Tony Burdett
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Guy Cochrane
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
4
Wittner R, Holub P, Mascia C, Frexia F, Müller H, Plass M, Allocca C, Betsou F, Burdett T, Cancio I, Chapman A, Chapman M, Courtot M, Curcin V, Eder J, Elliot M, Exter K, Goble C, Golebiewski M, Kisler B, Kremer A, Leo S, Lin‐Gibson S, Marsano A, Mattavelli M, Moore J, Nakae H, Perseil I, Salman A, Sluka J, Soiland‐Reyes S, Strambio‐De‐Castillia C, Sussman M, Swedlow JR, Zatloukal K, Geiger J. Toward a common standard for data and specimen provenance in life sciences. Learn Health Syst 2024; 8:e10365. [PMID: 38249839 PMCID: PMC10797572 DOI: 10.1002/lrh2.10365] [Received: 12/29/2022] [Revised: 03/17/2023] [Accepted: 03/24/2023] [Indexed: 01/23/2024]
Abstract
Open and practical exchange, dissemination, and reuse of specimens and data have become a fundamental requirement for life sciences research. The quality of the data obtained, and thus of the findings and knowledge derived, is significantly influenced by the quality of the samples, the experimental methods, and the data analysis. Therefore, comprehensive and precise documentation of the pre-analytical conditions, the analytical procedures, and the data processing is essential to be able to assess the validity of the research results. With the increasing importance of the exchange, reuse, and sharing of data and samples, procedures are required that enable cross-organizational documentation, traceability, and non-repudiation. At present, this information on the provenance of samples and data is mostly either sparse, incomplete, or incoherent. Since there is no uniform framework, this information is usually only provided within the organization and not interoperably. At the same time, the collection and sharing of biological and environmental specimens increasingly require definition and documentation of benefit sharing and compliance with regulatory requirements rather than consideration of pure scientific needs. In this publication, we present an ongoing standardization effort to provide trustworthy machine-actionable documentation of the data lineage and specimens. We would like to invite experts from the biotechnology and biomedical fields to further contribute to the standard.
Affiliation(s)
- Rudolf Wittner
- BBMRI‐ERIC, Graz, Austria
- Institute of Computer Science & Faculty of Informatics, Masaryk University, Brno, Czechia
- Petr Holub
- BBMRI‐ERIC, Graz, Austria
- Institute of Computer Science & Faculty of Informatics, Masaryk University, Brno, Czechia
- Cecilia Mascia
- CRS4 (Center for Advanced Studies, Research and Development in Sardinia), Pula, Italy
- Francesca Frexia
- CRS4 (Center for Advanced Studies, Research and Development in Sardinia), Pula, Italy
- Clare Allocca
- National Institute of Standards and Technology, Gaithersburg, Maryland, USA
- Fay Betsou
- Biological Resource Center of Institut Pasteur (CRBIP), Paris, France
- Tony Burdett
- EMBL's European Bioinformatics Institute (EMBL‐EBI), Cambridge, UK
- Ibon Cancio
- Plentzia Marine Station (PiE‐UPV/EHU), University of the Basque Country, EMBRC‐Spain, Bilbao, Spain
- Mark Elliot
- Department of Social Statistics, School of Social Sciences, University of Manchester, Manchester, UK
- Katrina Exter
- Flanders Marine Institute (VLIZ), EMBRC‐Belgium, Ostend, Belgium
- Carole Goble
- Department of Computer Science, University of Manchester, Manchester, UK
- Martin Golebiewski
- Heidelberg Institute for Theoretical Studies (HITS gGmbH), Heidelberg, Germany
- Simone Leo
- CRS4 (Center for Advanced Studies, Research and Development in Sardinia), Pula, Italy
- Anna Marsano
- Department of Biomedicine, University of Basel, Basel, Switzerland
- Marco Mattavelli
- SCI‐STI‐MM, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Josh Moore
- Centre for Gene Regulation and Expression and Division of Computational Biology, School of Life Sciences, University of Dundee, Dundee, UK
- German BioImaging–Gesellschaft für Mikroskopie und Bildanalyse e.V., Konstanz, Germany
- Hiroki Nakae
- Japan bio‐Measurement and Analysis Consortium, Tokyo, Japan
- Isabelle Perseil
- INSERM–Institut National de la Sante et de la Recherche Medicale, Paris, France
- Ayat Salman
- Standards Council of Canada, Ottawa, Ontario, Canada
- Canadian Primary Care Sentinel Surveillance Network (CPCSSN), Department of Family Medicine, Queen's University, Kingston, Ontario, Canada
- James Sluka
- Biocomplexity Institute, Indiana University, Bloomington, Indiana, USA
- Stian Soiland‐Reyes
- Department of Computer Science, University of Manchester, Manchester, UK
- Informatics Institute, University of Amsterdam, Amsterdam, The Netherlands
- Michael Sussman
- US Department of Agriculture, Washington, District of Columbia, USA
- Jason R. Swedlow
- Centre for Gene Regulation and Expression and Division of Computational Biology, School of Life Sciences, University of Dundee, Dundee, UK
- Jörg Geiger
- Interdisciplinary Bank of Biomaterials and Data Würzburg (ibdw), Würzburg, Germany
5
Welter D, Rocca-Serra P, Grouès V, Sallam N, Ancien F, Shabani A, Asariardakani S, Alper P, Ghosh S, Burdett T, Sansone SA, Gu W, Satagopam V. The Translational Data Catalog - discoverable biomedical datasets. Sci Data 2023; 10:470. [PMID: 37474618 PMCID: PMC10359386 DOI: 10.1038/s41597-023-02258-0] [Received: 10/17/2022] [Accepted: 05/22/2023] [Indexed: 07/22/2023]
Abstract
The discoverability of datasets resulting from the diverse range of translational and biomedical projects remains sporadic. It is especially difficult for datasets emerging from pre-competitive projects, often due to the legal constraints of data-sharing agreements, and the different priorities of the private and public sectors. The Translational Data Catalog is a single discovery point for the projects and datasets produced by a number of major research programmes funded by the European Commission. Funded by and rooted in a number of these European private-public partnership projects, the Data Catalog is built on FAIR-enabling community standards, and its mission is to ensure that datasets are findable and accessible by machines. Here we present its creation, content, value and adoption, as well as the next steps for sustainability within the ELIXIR ecosystem.
Affiliation(s)
- Danielle Welter
- Luxembourg Centre for Systems Biomedicine, ELIXIR Luxembourg, University of Luxembourg, L-4367, Belval, Luxembourg
- Luxembourg National Data Service (PNED G.I.E), 6 avenue des Hauts-Fourneaux, L-4362, Esch-sur-Alzette, Luxembourg
- Philippe Rocca-Serra
- Oxford e-Research Centre, Department of Engineering Science, University of Oxford, 7 Keble Road, OX13QG, Oxford, UK
- AstraZeneca, Data Office, Data Science & AI unit R&D, 136 Hills Rd, Cambridge, UK
- Valentin Grouès
- Luxembourg Centre for Systems Biomedicine, ELIXIR Luxembourg, University of Luxembourg, L-4367, Belval, Luxembourg
- Nirmeen Sallam
- Luxembourg Centre for Systems Biomedicine, ELIXIR Luxembourg, University of Luxembourg, L-4367, Belval, Luxembourg
- François Ancien
- Luxembourg Centre for Systems Biomedicine, ELIXIR Luxembourg, University of Luxembourg, L-4367, Belval, Luxembourg
- Abetare Shabani
- Luxembourg Centre for Systems Biomedicine, ELIXIR Luxembourg, University of Luxembourg, L-4367, Belval, Luxembourg
- Saeideh Asariardakani
- Luxembourg Centre for Systems Biomedicine, ELIXIR Luxembourg, University of Luxembourg, L-4367, Belval, Luxembourg
- Pinar Alper
- Luxembourg Centre for Systems Biomedicine, ELIXIR Luxembourg, University of Luxembourg, L-4367, Belval, Luxembourg
- Luxembourg National Data Service (PNED G.I.E), 6 avenue des Hauts-Fourneaux, L-4362, Esch-sur-Alzette, Luxembourg
- Soumyabrata Ghosh
- Luxembourg Centre for Systems Biomedicine, ELIXIR Luxembourg, University of Luxembourg, L-4367, Belval, Luxembourg
- Tony Burdett
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, CB10 1SD, UK
- Susanna-Assunta Sansone
- Oxford e-Research Centre, Department of Engineering Science, University of Oxford, 7 Keble Road, OX13QG, Oxford, UK
- Wei Gu
- Luxembourg Centre for Systems Biomedicine, ELIXIR Luxembourg, University of Luxembourg, L-4367, Belval, Luxembourg
- Luxembourg National Data Service (PNED G.I.E), 6 avenue des Hauts-Fourneaux, L-4362, Esch-sur-Alzette, Luxembourg
- Venkata Satagopam
- Luxembourg Centre for Systems Biomedicine, ELIXIR Luxembourg, University of Luxembourg, L-4367, Belval, Luxembourg
- Frankfurt Institute for Advanced Studies (FIAS), Ruth-Moufang-Straße 1, D-60438, Frankfurt am Main, Germany
6
Gurbich TA, Almeida A, Beracochea M, Burdett T, Burgin J, Cochrane G, Raj S, Richardson L, Rogers AB, Sakharova E, Salazar GA, Finn RD. MGnify Genomes: A Resource for Biome-specific Microbial Genome Catalogues. J Mol Biol 2023; 435:168016. [PMID: 36806692 PMCID: PMC10318097 DOI: 10.1016/j.jmb.2023.168016] [Received: 12/14/2022] [Revised: 02/07/2023] [Accepted: 02/12/2023] [Indexed: 02/18/2023]
Abstract
An increasingly common output arising from the analysis of shotgun metagenomic datasets is the generation of metagenome-assembled genomes (MAGs), with tens of thousands of MAGs now described in the literature. However, the discovery and comparison of these MAG collections is hampered by the lack of uniformity in their generation, annotation and storage. To address this, we have developed MGnify Genomes, a growing collection of biome-specific non-redundant microbial genome catalogues generated using MAGs and publicly available isolate genomes. Genomes within a biome-specific catalogue are organised into species clusters. For species that contain multiple conspecific genomes, the highest quality genome is selected as the representative, always prioritising an isolate genome over a MAG. The species representative sequences and annotations can be visualised on the MGnify website and the full catalogue and associated analysis outputs can be downloaded from MGnify servers. A suite of online search tools is provided allowing users to compare their own sequences, ranging from a gene to sets of genomes, against the catalogues. Seven biomes are available currently, comprising over 300,000 genomes that represent 11,048 non-redundant species, and include 36 taxonomic classes not currently represented by cultured genomes. MGnify Genomes is available at https://www.ebi.ac.uk/metagenomics/browse/genomes/.
Affiliation(s)
- Tatiana A Gurbich
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK
| | - Alexandre Almeida
- Department of Veterinary Medicine, University of Cambridge, Cambridge, UK
| | - Martin Beracochea
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK
| | - Tony Burdett
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK
| | - Josephine Burgin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK
| | - Guy Cochrane
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK
| | - Shriya Raj
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK
| | - Lorna Richardson
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK
| | - Alexander B Rogers
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK
| | - Ekaterina Sakharova
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK
| | - Gustavo A Salazar
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK
| | - Robert D Finn
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK.
|
7
|
Welter D, Juty N, Rocca-Serra P, Xu F, Henderson D, Gu W, Strubel J, Giessmann RT, Emam I, Gadiya Y, Abbassi-Daloii T, Alharbi E, Gray AJG, Courtot M, Gribbon P, Ioannidis V, Reilly DS, Lynch N, Boiten JW, Satagopam V, Goble C, Sansone SA, Burdett T. FAIR in action - a flexible framework to guide FAIRification. Sci Data 2023; 10:291. [PMID: 37208349 PMCID: PMC10199076 DOI: 10.1038/s41597-023-02167-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2022] [Accepted: 03/28/2023] [Indexed: 05/21/2023] Open
Abstract
The COVID-19 pandemic has highlighted the need for FAIR (Findable, Accessible, Interoperable, and Reusable) data more than any other scientific challenge to date. We developed a flexible, multi-level, domain-agnostic FAIRification framework, providing practical guidance to improve the FAIRness for both existing and future clinical and molecular datasets. We validated the framework in collaboration with several major public-private partnership projects, demonstrating and delivering improvements across all aspects of FAIR and across a variety of datasets and their contexts. We therefore managed to establish the reproducibility and far-reaching applicability of our approach to FAIRification tasks.
Affiliation(s)
- Danielle Welter
- Luxembourg Centre for Systems Biomedicine, ELIXIR Luxembourg, University of Luxembourg, L-4367, Belval, Luxembourg
| | - Nick Juty
- University of Manchester, Department of Computer Science, The University of Manchester, Manchester, M13 9PL, UK
| | - Philippe Rocca-Serra
- Oxford e-Research Centre, Department of Engineering Science, University of Oxford, 7 Keble Road, OX13QG, Oxford, UK
| | - Fuqi Xu
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, CB10 1SD, UK
| | - David Henderson
- Bayer AG, Business Development & Licensing & OI, Muellerstrasse 178, 13353, Berlin, Germany
| | - Wei Gu
- Luxembourg Centre for Systems Biomedicine, ELIXIR Luxembourg, University of Luxembourg, L-4367, Belval, Luxembourg
| | - Jolanda Strubel
- The Hyve BV, Arthur van Schendelstraat 650, 3511 MJ, Utrecht, The Netherlands
| | - Robert T Giessmann
- Bayer AG, Business Development & Licensing & OI, Muellerstrasse 178, 13353, Berlin, Germany
- Institute for Globally Distributed Open Research and Education (IGDORE), Gothenburg, Sweden
| | - Ibrahim Emam
- Data Science Institute, Imperial College, London, UK
| | - Yojana Gadiya
- Fraunhofer Institute for Translational Medicine and Pharmacology (ITMP) and Fraunhofer Cluster of Excellence for Immune Mediated Diseases (CIMD), Schnackenburgallee 114, 22525 Hamburg, and Theodor Stern Kai 7, 60590, Frankfurt, Germany
| | - Tooba Abbassi-Daloii
- Department of Bioinformatics (BiGCaT), NUTRIM, FHML, Maastricht University, Maastricht, The Netherlands
| | - Ebtisam Alharbi
- College of Computer and Information Systems, Umm Al-Qura University, Mecca, Saudi Arabia
| | - Alasdair J G Gray
- Department of Computer Science, Heriot-Watt University, Edinburgh, EH14 4AS, Scotland, UK
| | - Melanie Courtot
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, CB10 1SD, UK
- Ontario Institute for Cancer Research MaRS Centre, 661 University Avenue, Suite 510, Toronto, Ontario, M5G 0A3, Canada
| | - Philip Gribbon
- Fraunhofer Institute for Translational Medicine and Pharmacology (ITMP) and Fraunhofer Cluster of Excellence for Immune Mediated Diseases (CIMD), Schnackenburgallee 114, 22525 Hamburg, and Theodor Stern Kai 7, 60590, Frankfurt, Germany
| | - Vassilios Ioannidis
- Vital-IT Group, SIB Swiss Institute of Bioinformatics, 1015, Lausanne, Switzerland
| | - Dorothy S Reilly
- Novartis Institutes for BioMedical Research, Novartis Pharma AG, Basel, Switzerland
| | | | | | - Venkata Satagopam
- Luxembourg Centre for Systems Biomedicine, ELIXIR Luxembourg, University of Luxembourg, L-4367, Belval, Luxembourg
| | - Carole Goble
- University of Manchester, Department of Computer Science, The University of Manchester, Manchester, M13 9PL, UK
| | - Susanna-Assunta Sansone
- Oxford e-Research Centre, Department of Engineering Science, University of Oxford, 7 Keble Road, OX13QG, Oxford, UK
| | - Tony Burdett
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, CB10 1SD, UK.
|
8
|
Rocca-Serra P, Gu W, Ioannidis V, Abbassi-Daloii T, Capella-Gutierrez S, Chandramouliswaran I, Splendiani A, Burdett T, Giessmann RT, Henderson D, Batista D, Emam I, Gadiya Y, Giovanni L, Willighagen E, Evelo C, Gray AJG, Gribbon P, Juty N, Welter D, Quast K, Peeters P, Plasterer T, Wood C, van der Horst E, Reilly D, van Vlijmen H, Scollen S, Lister A, Thurston M, Granell R, Sansone SA. The FAIR Cookbook - the essential resource for and by FAIR doers. Sci Data 2023; 10:292. [PMID: 37208467 PMCID: PMC10198982 DOI: 10.1038/s41597-023-02166-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 10/11/2022] [Accepted: 04/19/2023] [Indexed: 05/21/2023] Open
Abstract
The notion that data should be Findable, Accessible, Interoperable and Reusable, according to the FAIR Principles, has become a global norm for good data stewardship and a prerequisite for reproducibility. Nowadays, FAIR guides data policy actions and professional practices in the public and private sectors. Despite such global endorsements, however, the FAIR Principles are aspirational, remaining elusive at best, and intimidating at worst. To address the lack of practical guidance, and help with capability gaps, we developed the FAIR Cookbook, an open, online resource of hands-on recipes for "FAIR doers" in the Life Sciences. Created by researchers and data management professionals in academia, (bio)pharmaceutical companies and information service industries, the FAIR Cookbook covers the key steps in a FAIRification journey, the levels and indicators of FAIRness, the maturity model, the technologies, the tools and the standards available, as well as the skills required, and the challenges to achieve and improve data FAIRness. Part of the ELIXIR ecosystem, and recommended by funders, the FAIR Cookbook is open to contributions of new recipes.
Affiliation(s)
- Philippe Rocca-Serra
- Oxford e-Research Centre, Department of Engineering Science, University of Oxford, 7 Keble Road, OX13QG, Oxford, UK.
- AstraZeneca, Data Office, Data Science & AI unit R&D, 136 Hills Rd, Cambridge, UK.
| | - Wei Gu
- Luxembourg Centre for Systems Biomedicine, ELIXIR Luxembourg, University of Luxembourg, L-4367, Belval, Luxembourg
- Luxembourg National Data Service, 6 Avenue des Hauts-Fourneaux, Esch-sur-Alzette, Luxembourg, L-4362, Esch-sur-Alzette, Luxembourg
| | - Vassilios Ioannidis
- Vital-IT Group, SIB Swiss Institute of Bioinformatics, 1015, Lausanne, Switzerland
| | - Tooba Abbassi-Daloii
- Department of Bioinformatics (BiGCaT), NUTRIM, FHML, Maastricht University, Maastricht, the Netherlands
| | | | - Ishwar Chandramouliswaran
- Office of Data Science Strategy, National Institutes of Health, 9000 Rockville Pike, Bethesda, Maryland, 20892, USA
| | | | - Tony Burdett
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, CB10 1SD, UK
| | - Robert T Giessmann
- Bayer AG, Business Development & Licensing & OI, Pharmaceuticals, 13342, Berlin, Germany
- Institute for Globally Distributed Open Research and Education (IGDORE), Berlin, Germany
| | - David Henderson
- Bayer AG, Business Development & Licensing & OI, Pharmaceuticals, 13342, Berlin, Germany
| | - Dominique Batista
- Oxford e-Research Centre, Department of Engineering Science, University of Oxford, 7 Keble Road, OX13QG, Oxford, UK
| | - Ibrahim Emam
- Data Science Institute, Imperial College London, William Penney Laboratory, South Kensington Campus, London, SW7 2AZ, UK
| | - Yojana Gadiya
- Fraunhofer Institute for Translational Medicine and Pharmacology and Fraunhofer Cluster of Excellence for Immune Mediated Diseases, Schnackenburgallee 114, 22525 Hamburg, and Theodor Stern Kai 7, 60590, Frankfurt, Germany
| | - Lucas Giovanni
- Department of Bioinformatics (BiGCaT), NUTRIM, FHML, Maastricht University, Maastricht, the Netherlands
| | - Egon Willighagen
- Department of Bioinformatics (BiGCaT), NUTRIM, FHML, Maastricht University, Maastricht, the Netherlands
| | - Chris Evelo
- Department of Bioinformatics (BiGCaT), NUTRIM, FHML, Maastricht University, Maastricht, the Netherlands
| | - Alasdair J G Gray
- Department of Computer Science, Heriot-Watt University, Edinburgh, EH14 4AS, Scotland, UK
| | - Philip Gribbon
- Fraunhofer Institute for Translational Medicine and Pharmacology and Fraunhofer Cluster of Excellence for Immune Mediated Diseases, Schnackenburgallee 114, 22525 Hamburg, and Theodor Stern Kai 7, 60590, Frankfurt, Germany
| | - Nick Juty
- The University of Manchester, Department of Computer Science, The University of Manchester, Manchester, M13 9PL, UK
| | - Danielle Welter
- Luxembourg Centre for Systems Biomedicine, ELIXIR Luxembourg, University of Luxembourg, L-4367, Belval, Luxembourg
- Luxembourg National Data Service, 6 Avenue des Hauts-Fourneaux, Esch-sur-Alzette, Luxembourg, L-4362, Esch-sur-Alzette, Luxembourg
| | - Karsten Quast
- Boehringer Ingelheim Pharma GmbH & Co. KG, Birkendorfer Straße 65, 88397, Biberach an der Riss, Germany
| | - Paul Peeters
- Janssen, Turnhoutseweg 30, B-2340, Beerse, Belgium
| | - Tom Plasterer
- AstraZeneca Pharmaceuticals, 36 Gatehouse Drive, Waltham, MA, 02451, USA
| | - Colin Wood
- AstraZeneca, da Vinci Building, Melbourn Science Park, Cambridge Road, Royston, SG8 6HM, UK
| | - Eelke van der Horst
- The Hyve BV, Arthur van Schendelstraat 650, 3511 MJ, Utrecht, The Netherlands
| | - Dorothy Reilly
- Novartis Institutes for BioMedical Research, Novartis Pharma AG, Basel, Switzerland
| | | | - Serena Scollen
- ELIXIR Hub, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Allyson Lister
- Oxford e-Research Centre, Department of Engineering Science, University of Oxford, 7 Keble Road, OX13QG, Oxford, UK
| | - Milo Thurston
- Oxford e-Research Centre, Department of Engineering Science, University of Oxford, 7 Keble Road, OX13QG, Oxford, UK
| | - Ramon Granell
- Oxford e-Research Centre, Department of Engineering Science, University of Oxford, 7 Keble Road, OX13QG, Oxford, UK
| | - Susanna-Assunta Sansone
- Oxford e-Research Centre, Department of Engineering Science, University of Oxford, 7 Keble Road, OX13QG, Oxford, UK.
|
9
|
Fahlgren N, Kapoor M, Yordanova G, Papatheodorou I, Waese J, Cole B, Harrison P, Ware D, Tickle T, Paten B, Burdett T, Elsik CG, Tuggle CK, Provart NJ. Toward a data infrastructure for the Plant Cell Atlas. Plant Physiol 2023; 191:35-46. [PMID: 36200899 PMCID: PMC9806565 DOI: 10.1093/plphys/kiac468] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2022] [Accepted: 09/18/2022] [Indexed: 06/16/2023]
Abstract
We review how a data infrastructure for the Plant Cell Atlas might be built using existing infrastructure and platforms. The Human Cell Atlas has developed an extensive infrastructure for human and mouse single cell data, while the European Bioinformatics Institute has developed a Single Cell Expression Atlas, that currently houses several plant data sets. We discuss issues related to appropriate ontologies for describing a plant single cell experiment. We imagine how such an infrastructure will enable biologists and data scientists to glean new insights into plant biology in the coming decades, as long as such data are made accessible to the community in an open manner.
Affiliation(s)
- Noah Fahlgren
- Donald Danforth Plant Science Center, Saint Louis, Missouri 63132, USA
| | - Muskan Kapoor
- Bioinformatics and Computational Biology Program, Department of Animal Science, Iowa State University, Ames, Iowa 50011, USA
| | | | | | - Jamie Waese
- Department of Cell and Systems Biology/Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, Ontario M5S 3B2, Canada
| | - Benjamin Cole
- DOE-Joint Genome Institute, Lawrence Berkeley National Laboratory, 1, Cyclotron Road, Berkeley, California 94720, USA
| | - Peter Harrison
- EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK
| | - Doreen Ware
- Cold Spring Harbor Laboratory, One Bungtown Road, Cold Spring Harbor, New York 11724, USA
- USDA ARS NAA Robert W. Holley Center for Agriculture and Health, Ithaca, New York 14853, USA
| | - Timothy Tickle
- Data Sciences Platform, The Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, Massachusetts 02142, USA
| | - Benedict Paten
- UC Santa Cruz Genomics Institute, Baskin School of Engineering, 1156 High Street, Santa Cruz, California 95064, USA
| | - Tony Burdett
- EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK
| | - Christine G Elsik
- Division of Animal Sciences/Division of Plant Science & Technology/Institute for Data Science & Informatics, University of Missouri, Columbia, Missouri 65211, USA
| | - Christopher K Tuggle
- Bioinformatics and Computational Biology Program, Department of Animal Science, Iowa State University, Ames, Iowa 50011, USA
| | - Nicholas J Provart
- Department of Cell and Systems Biology/Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, Ontario M5S 3B2, Canada
|
10
|
Richardson L, Allen B, Baldi G, Beracochea M, Bileschi M, Burdett T, Burgin J, Caballero-Pérez J, Cochrane G, Colwell L, Curtis T, Escobar-Zepeda A, Gurbich T, Kale V, Korobeynikov A, Raj S, Rogers A, Sakharova E, Sanchez S, Wilkinson D, Finn R. MGnify: the microbiome sequence data analysis resource in 2023. Nucleic Acids Res 2022; 51:D753-D759. [PMID: 36477304 PMCID: PMC9825492 DOI: 10.1093/nar/gkac1080] [Citation(s) in RCA: 41] [Impact Index Per Article: 20.5] [Received: 09/18/2022] [Revised: 10/19/2022] [Accepted: 11/01/2022] [Indexed: 12/12/2022] Open
Abstract
The MGnify platform (https://www.ebi.ac.uk/metagenomics) facilitates the assembly, analysis and archiving of microbiome-derived nucleic acid sequences. The platform provides access to taxonomic assignments and functional annotations for nearly half a million analyses covering metabarcoding, metatranscriptomic, and metagenomic datasets, which are derived from a wide range of different environments. Over the past 3 years, MGnify has not only grown in terms of the number of datasets contained but also increased the breadth of analyses provided, such as the analysis of long-read sequences. The MGnify protein database now exceeds 2.4 billion non-redundant sequences predicted from metagenomic assemblies. This collection is now organised into a relational database making it possible to understand the genomic context of the protein through navigation back to the source assembly and sample metadata, marking a major improvement. To extend beyond the functional annotations already provided in MGnify, we have applied deep learning-based annotation methods. The technology underlying MGnify's Application Programming Interface (API) and website has been upgraded, and we have enabled the ability to perform downstream analysis of the MGnify data through the introduction of a coupled Jupyter Lab environment.
Affiliation(s)
- Lorna Richardson
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Ben Allen
- School of Engineering, Newcastle University, Newcastle upon Tyne, UK
| | - Germana Baldi
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Martin Beracochea
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
| | | | - Tony Burdett
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Josephine Burgin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Juan Caballero-Pérez
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Guy Cochrane
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Lucy J Colwell
- Google Research, Brain Team, Mountain View, CA, USA
- Department of Chemistry, University of Cambridge, Cambridge, UK
| | - Tom Curtis
- School of Engineering, Newcastle University, Newcastle upon Tyne, UK
| | - Alejandra Escobar-Zepeda
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Tatiana A Gurbich
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Varsha Kale
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Anton Korobeynikov
- Center for Algorithmic Biotechnology, St Petersburg State University, St Petersburg, Russia
| | - Shriya Raj
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Alexander B Rogers
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Ekaterina Sakharova
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Santiago Sanchez
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
| | | | - Robert D Finn
- To whom correspondence should be addressed. Tel: +44 1223 492679;
|
11
|
Burgin J, Ahamed A, Cummins C, Devraj R, Gueye K, Gupta D, Gupta V, Haseeb M, Ihsan M, Ivanov E, Jayathilaka S, Balavenkataraman Kadhirvelu V, Kumar M, Lathi A, Leinonen R, Mansurova M, McKinnon J, O’Cathail C, Paupério J, Pesant S, Rahman N, Rinck G, Selvakumar S, Suman S, Vijayaraja S, Waheed Z, Woollard P, Yuan D, Zyoud A, Burdett T, Cochrane G. The European Nucleotide Archive in 2022. Nucleic Acids Res 2022; 51:D121-D125. [PMID: 36399492 PMCID: PMC9825583 DOI: 10.1093/nar/gkac1051] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Received: 09/30/2022] [Revised: 10/21/2022] [Accepted: 10/25/2022] [Indexed: 11/19/2022] Open
Abstract
The European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena), maintained by the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI), offers those producing data an open and supported platform for the management, archiving, publication, and dissemination of data; and to the scientific community as a whole, it offers a globally comprehensive data set through a host of data discovery and retrieval tools. Here, we describe recent updates to the ENA's submission and retrieval services as well as focused efforts to improve connectivity, reusability, and interoperability of ENA data and metadata.
Affiliation(s)
- Josephine Burgin
- To whom correspondence should be addressed. Tel: +44 1223 49 4246; Fax: +44 1223 494 468;
| | - Alisha Ahamed
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Carla Cummins
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Rajkumar Devraj
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Khadim Gueye
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Dipayan Gupta
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Vikas Gupta
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Muhammad Haseeb
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Maira Ihsan
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Eugene Ivanov
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Suran Jayathilaka
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | | | - Manish Kumar
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Ankur Lathi
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Rasko Leinonen
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Milena Mansurova
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Jasmine McKinnon
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Colman O’Cathail
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Joana Paupério
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Stéphane Pesant
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Nadim Rahman
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Gabriele Rinck
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Sandeep Selvakumar
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Swati Suman
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Senthilnathan Vijayaraja
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Zahra Waheed
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Peter Woollard
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - David Yuan
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Ahmad Zyoud
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Tony Burdett
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Guy Cochrane
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
|
12
|
Sheffield NC, Bonazzi VR, Bourne PE, Burdett T, Clark T, Grossman RL, Spjuth O, Yates AD. From biomedical cloud platforms to microservices: next steps in FAIR data and analysis. Sci Data 2022; 9:553. [PMID: 36075919 PMCID: PMC9458632 DOI: 10.1038/s41597-022-01619-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2022] [Accepted: 08/08/2022] [Indexed: 11/29/2022] Open
Affiliation(s)
- Nathan C Sheffield
- Center for Public Health Genomics, School of Medicine, University of Virginia, 22908, Charlottesville, VA, USA.
- School of Data Science, University of Virginia, Charlottesville VA 22904, Charlottesville, VA, USA.
- Department of Biomedical Engineering, School of Medicine, University of Virginia, 22904, Charlottesville, VA, USA.
- Department of Public Health Sciences, School of Medicine, University of Virginia, 22908, Charlottesville, VA, USA.
- Department of Biochemistry and Molecular Genetics, School of Medicine, University of Virginia, 22908, Charlottesville, VA, USA.
| | | | - Philip E Bourne
- School of Data Science, University of Virginia, Charlottesville VA 22904, Charlottesville, VA, USA
- Department of Biomedical Engineering, School of Medicine, University of Virginia, 22904, Charlottesville, VA, USA
| | - Tony Burdett
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Timothy Clark
- School of Data Science, University of Virginia, Charlottesville VA 22904, Charlottesville, VA, USA
- Department of Public Health Sciences, School of Medicine, University of Virginia, 22908, Charlottesville, VA, USA
| | - Robert L Grossman
- Center for Translational Data Science, University of Chicago, Chicago, IL, 60615, USA
| | - Ola Spjuth
- Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, 75124, Uppsala, Sweden
| | - Andrew D Yates
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
|
13
|
Evans RA, Leavy OC, Richardson M, Elneima O, McAuley HJC, Shikotra A, Singapuri A, Sereno M, Saunders RM, Harris VC, Houchen-Wolloff L, Aul R, Beirne P, Bolton CE, Brown JS, Choudhury G, Diar-Bakerly N, Easom N, Echevarria C, Fuld J, Hart N, Hurst J, Jones MG, Parekh D, Pfeffer P, Rahman NM, Rowland-Jones SL, Shah AM, Wootton DG, Chalder T, Davies MJ, De Soyza A, Geddes JR, Greenhalf W, Greening NJ, Heaney LG, Heller S, Howard LS, Jacob J, Jenkins RG, Lord JM, Man WDC, McCann GP, Neubauer S, Openshaw PJM, Porter JC, Rowland MJ, Scott JT, Semple MG, Singh SJ, Thomas DC, Toshner M, Lewis KE, Thwaites RS, Briggs A, Docherty AB, Kerr S, Lone NI, Quint J, Sheikh A, Thorpe M, Zheng B, Chalmers JD, Ho LP, Horsley A, Marks M, Poinasamy K, Raman B, Harrison EM, Wain LV, Brightling CE, Abel K, Adamali H, Adeloye D, Adeyemi O, Adrego R, Aguilar Jimenez LA, Ahmad S, Ahmad Haider N, Ahmed R, Ahwireng N, Ainsworth M, Al-Sheklly B, Alamoudi A, Ali M, Aljaroof M, All AM, Allan L, Allen RJ, Allerton L, Allsop L, Almeida P, Altmann D, Alvarez Corral M, Amoils S, Anderson D, Antoniades C, Arbane G, Arias A, Armour C, Armstrong L, Armstrong N, Arnold D, Arnold H, Ashish A, Ashworth A, Ashworth M, Aslani S, Assefa-Kebede H, Atkin C, Atkin P, Aung H, Austin L, Avram C, Ayoub A, Babores M, Baggott R, Bagshaw J, Baguley D, Bailey L, Baillie JK, Bain S, Bakali M, Bakau M, Baldry E, Baldwin D, Ballard C, Banerjee A, Bang B, Barker RE, Barman L, Barratt S, Barrett F, Basire D, Basu N, Bates M, Bates A, Batterham R, Baxendale H, Bayes H, Beadsworth M, Beckett P, Beggs M, Begum M, Bell D, Bell R, Bennett K, Beranova E, Bermperi A, Berridge A, Berry C, Betts S, Bevan E, Bhui K, Bingham M, Birchall K, Bishop L, Bisnauthsing K, Blaikely J, Bloss A, Bolger A, Bonnington J, Botkai A, Bourne C, Bourne M, Bramham K, Brear L, Breen G, Breeze J, Bright E, Brill S, Brindle K, Broad L, Broadley A, Brookes C, Broome M, Brown A, Brown A, Brown J, Brown J, Brown M, Brown M, Brown V, Brugha T, Brunskill N, 
Buch M, Buckley P, Bularga A, Bullmore E, Burden L, Burdett T, Burn D, Burns G, Burns A, Busby J, Butcher R, Butt A, Byrne S, Cairns P, Calder PC, Calvelo E, Carborn H, Card B, Carr C, Carr L, Carson G, Carter P, Casey A, Cassar M, Cavanagh J, Chablani M, Chambers RC, Chan F, Channon KM, Chapman K, Charalambou A, Chaudhuri N, Checkley A, Chen J, Cheng Y, Chetham L, Childs C, Chilvers ER, Chinoy H, Chiribiri A, Chong-James K, Choudhury N, Chowienczyk P, Christie C, Chrystal M, Clark D, Clark C, Clarke J, Clohisey S, Coakley G, Coburn Z, Coetzee S, Cole J, Coleman C, Conneh F, Connell D, Connolly B, Connor L, Cook A, Cooper B, Cooper J, Cooper S, Copeland D, Cosier T, Coulding M, Coupland C, Cox E, Craig T, Crisp P, Cristiano D, Crooks MG, Cross A, Cruz I, Cullinan P, Cuthbertson D, Daines L, Dalton M, Daly P, Daniels A, Dark P, Dasgin J, David A, David C, Davies E, Davies F, Davies G, Davies GA, Davies K, Dawson J, Daynes E, Deakin B, Deans A, Deas C, Deery J, Defres S, Dell A, Dempsey K, Denneny E, Dennis J, Dewar A, Dharmagunawardena R, Dickens C, Dipper A, Diver S, Diwanji SN, Dixon M, Djukanovic R, Dobson H, Dobson SL, Donaldson A, Dong T, Dormand N, Dougherty A, Dowling R, Drain S, Draxlbauer K, Drury K, Dulawan P, Dunleavy A, Dunn S, Earley J, Edwards S, Edwardson C, El-Taweel H, Elliott A, Elliott K, Ellis Y, Elmer A, Evans D, Evans H, Evans J, Evans R, Evans RI, Evans T, Evenden C, Evison L, Fabbri L, Fairbairn S, Fairman A, Fallon K, Faluyi D, Favager C, Fayzan T, Featherstone J, Felton T, Finch J, Finney S, Finnigan J, Finnigan L, Fisher H, Fletcher S, Flockton R, Flynn M, Foot H, Foote D, Ford A, Forton D, Fraile E, Francis C, Francis R, Francis S, Frankel A, Fraser E, Free R, French N, Fu X, Furniss J, Garner L, Gautam N, George J, George P, Gibbons M, Gill M, Gilmour L, Gleeson F, Glossop J, Glover S, Goodman N, Goodwin C, Gooptu B, Gordon H, Gorsuch T, Greatorex M, Greenhaff PL, Greenhalgh A, Greenwood J, Gregory H, Gregory R, Grieve D, Griffin D, 
Griffiths L, Guerdette AM, Guillen Guio B, Gummadi M, Gupta A, Gurram S, Guthrie E, Guy Z, H Henson H, Hadley K, Haggar A, Hainey K, Hairsine B, Haldar P, Hall I, Hall L, Halling-Brown M, Hamil R, Hancock A, Hancock K, Hanley NA, Haq S, Hardwick HE, Hardy E, Hardy T, Hargadon B, Harrington K, Harris E, Harrison P, Harvey A, Harvey M, Harvie M, Haslam L, Havinden-Williams M, Hawkes J, Hawkings N, Haworth J, Hayday A, Haynes M, Hazeldine J, Hazelton T, Heeley C, Heeney JL, Heightman M, Henderson M, Hesselden L, Hewitt M, Highett V, Hillman T, Hiwot T, Hoare A, Hoare M, Hockridge J, Hogarth P, Holbourn A, Holden S, Holdsworth L, Holgate D, Holland M, Holloway L, Holmes K, Holmes M, Holroyd-Hind B, Holt L, Hormis A, Hosseini A, Hotopf M, Howard K, Howell A, Hufton E, Hughes AD, Hughes J, Hughes R, Humphries A, Huneke N, Hurditch E, Husain M, Hussell T, Hutchinson J, Ibrahim W, Ilyas F, Ingham J, Ingram L, Ionita D, Isaacs K, Ismail K, Jackson T, James WY, Jarman C, Jarrold I, Jarvis H, Jastrub R, Jayaraman B, Jezzard P, Jiwa K, Johnson C, Johnson S, Johnston D, Jolley CJ, Jones D, Jones G, Jones H, Jones H, Jones I, Jones L, Jones S, Jose S, Kabir T, Kaltsakas G, Kamwa V, Kanellakis N, Kaprowska S, Kausar Z, Keenan N, Kelly S, Kemp G, Kerslake H, Key AL, Khan F, Khunti K, Kilroy S, King B, King C, Kingham L, Kirk J, Kitterick P, Klenerman P, Knibbs L, Knight S, Knighton A, Kon O, Kon S, Kon SS, Koprowska S, Korszun A, Koychev I, Kurasz C, Kurupati P, Laing C, Lamlum H, Landers G, Langenberg C, Lasserson D, Lavelle-Langham L, Lawrie A, Lawson C, Lawson C, Layton A, Lea A, Lee D, Lee JH, Lee E, Leitch K, Lenagh R, Lewis D, Lewis J, Lewis V, Lewis-Burke N, Li X, Light T, Lightstone L, Lilaonitkul W, Lim L, Linford S, Lingford-Hughes A, Lipman M, Liyanage K, Lloyd A, Logan S, Lomas D, Loosley R, Lota H, Lovegrove W, Lucey A, Lukaschuk E, Lye A, Lynch C, MacDonald S, MacGowan G, Macharia I, Mackie J, Macliver L, Madathil S, Madzamba G, Magee N, Magtoto MM, Mairs N, Majeed 
N, Major E, Malein F, Malim M, Mallison G, Mandal S, Mangion K, Manisty C, Manley R, March K, Marciniak S, Marino P, Mariveles M, Marouzet E, Marsh S, Marshall B, Marshall M, Martin J, Martineau A, Martinez LM, Maskell N, Matila D, Matimba-Mupaya W, Matthews L, Mbuyisa A, McAdoo S, Weir McCall J, McAllister-Williams H, McArdle A, McArdle P, McAulay D, McCormick J, McCormick W, McCourt P, McGarvey L, McGee C, Mcgee K, McGinness J, McGlynn K, McGovern A, McGuinness H, McInnes IB, McIntosh J, McIvor E, McIvor K, McLeavey L, McMahon A, McMahon MJ, McMorrow L, Mcnally T, McNarry M, McNeill J, McQueen A, McShane H, Mears C, Megson C, Megson S, Mehta P, Meiring J, Melling L, Mencias M, Menzies D, Merida Morillas M, Michael A, Milligan L, Miller C, Mills C, Mills NL, Milner L, Misra S, Mitchell J, Mohamed A, Mohamed N, Mohammed S, Molyneaux PL, Monteiro W, Moriera S, Morley A, Morrison L, Morriss R, Morrow A, Moss AJ, Moss P, Motohashi K, Msimanga N, Mukaetova-Ladinska E, Munawar U, Murira J, Nanda U, Nassa H, Nasseri M, Neal A, Needham R, Neill P, Newell H, Newman T, Newton-Cox A, Nicholson T, Nicoll D, Nolan CM, Noonan MJ, Norman C, Novotny P, Nunag J, Nwafor L, Nwanguma U, Nyaboko J, O'Donnell K, O'Brien C, O'Brien L, O'Regan D, Odell N, Ogg G, Olaosebikan O, Oliver C, Omar Z, Orriss-Dib L, Osborne L, Osbourne R, Ostermann M, Overton C, Owen J, Oxton J, Pack J, Pacpaco E, Paddick S, Painter S, Pakzad A, Palmer S, Papineni P, Paques K, Paradowski K, Pareek M, Parfrey H, Pariante C, Parker S, Parkes M, Parmar J, Patale S, Patel B, Patel M, Patel S, Pattenadk D, Pavlides M, Payne S, Pearce L, Pearl JE, Peckham D, Pendlebury J, Peng Y, Pennington C, Peralta I, Perkins E, Peterkin Z, Peto T, Petousi N, Petrie J, Phipps J, Pimm J, Piper Hanley K, Pius R, Plant H, Plein S, Plekhanova T, Plowright M, Polgar O, Poll L, Porter J, Portukhay S, Powell N, Prabhu A, Pratt J, Price A, Price C, Price C, Price D, Price L, Price L, Prickett A, Propescu J, Pugmire S, Quaid S, Quigley J, 
Qureshi H, Qureshi IN, Radhakrishnan K, Ralser M, Ramos A, Ramos H, Rangeley J, Rangelov B, Ratcliffe L, Ravencroft P, Reddington A, Reddy R, Redfearn H, Redwood D, Reed A, Rees M, Rees T, Regan K, Reynolds W, Ribeiro C, Richards A, Richardson E, Rivera-Ortega P, Roberts K, Robertson E, Robinson E, Robinson L, Roche L, Roddis C, Rodger J, Ross A, Ross G, Rossdale J, Rostron A, Rowe A, Rowland A, Rowland J, Roy K, Roy M, Rudan I, Russell R, Russell E, Saalmink G, Sabit R, Sage EK, Samakomva T, Samani N, Sampson C, Samuel K, Samuel R, Sanderson A, Sapey E, Saralaya D, Sargant J, Sarginson C, Sass T, Sattar N, Saunders K, Saunders P, Saunders LC, Savill H, Saxon W, Sayer A, Schronce J, Schwaeble W, Scott K, Selby N, Sewell TA, Shah K, Shah P, Shankar-Hari M, Sharma M, Sharpe C, Sharpe M, Shashaa S, Shaw A, Shaw K, Shaw V, Shelton S, Shenton L, Shevket K, Short J, Siddique S, Siddiqui S, Sidebottom J, Sigfrid L, Simons G, Simpson J, Simpson N, Singh C, Singh S, Sissons D, Skeemer J, Slack K, Smith A, Smith D, Smith S, Smith J, Smith L, Soares M, Solano TS, Solly R, Solstice AR, Soulsby T, Southern D, Sowter D, Spears M, Spencer LG, Speranza F, Stadon L, Stanel S, Steele N, Steiner M, Stensel D, Stephens G, Stephenson L, Stern M, Stewart I, Stimpson R, Stockdale S, Stockley J, Stoker W, Stone R, Storrar W, Storrie A, Storton K, Stringer E, Strong-Sheldrake S, Stroud N, Subbe C, Sudlow CL, Suleiman Z, Summers C, Summersgill C, Sutherland D, Sykes DL, Sykes R, Talbot N, Tan AL, Tarusan L, Tavoukjian V, Taylor A, Taylor C, Taylor J, Te A, Tedd H, Tee CJ, Teixeira J, Tench H, Terry S, Thackray-Nocera S, Thaivalappil F, Thamu B, Thickett D, Thomas C, Thomas S, Thomas AK, Thomas-Woods T, Thompson T, Thompson AAR, Thornton T, Tilley J, Tinker N, Tiongson GF, Tobin M, Tomlinson J, Tong C, Touyz R, Tripp KA, Tunnicliffe E, Turnbull A, Turner E, Turner S, Turner V, Turner K, Turney S, Turtle L, Turton H, Ugoji J, Ugwuoke R, Upthegrove R, Valabhji J, Ventura M, Vere J, Vickers C, 
Vinson B, Wade E, Wade P, Wainwright T, Wajero LO, Walder S, Walker S, Walker S, Wall E, Wallis T, Walmsley S, Walsh JA, Walsh S, Warburton L, Ward TJC, Warwick K, Wassall H, Waterson S, Watson E, Watson L, Watson J, Welch C, Welch H, Welsh B, Wessely S, West S, Weston H, Wheeler H, White S, Whitehead V, Whitney J, Whittaker S, Whittam B, Whitworth V, Wight A, Wild J, Wilkins M, Wilkinson D, Williams N, Williams N, Williams J, Williams-Howard SA, Willicombe M, Willis G, Willoughby J, Wilson A, Wilson D, Wilson I, Window N, Witham M, Wolf-Roberts R, Wood C, Woodhead F, Woods J, Wormleighton J, Worsley J, Wraith D, Wrey Brown C, Wright C, Wright L, Wright S, Wyles J, Wynter I, Xu M, Yasmin N, Yasmin S, Yates T, Yip KP, Young B, Young S, Young A, Yousuf AJ, Zawia A, Zeidan L, Zhao B, Zongo O. Clinical characteristics with inflammation profiling of long COVID and association with 1-year recovery following hospitalisation in the UK: a prospective observational study. Lancet Respir Med 2022; 10:761-775. [PMID: 35472304 PMCID: PMC9034855 DOI: 10.1016/s2213-2600(22)00127-8] [Citation(s) in RCA: 144] [Impact Index Per Article: 72.0] [Received: 01/04/2022] [Revised: 03/23/2022] [Accepted: 03/31/2022] [Indexed: 11/25/2022]
Abstract
BACKGROUND No effective pharmacological or non-pharmacological interventions exist for patients with long COVID. We aimed to describe recovery 1 year after hospital discharge for COVID-19, identify factors associated with patient-perceived recovery, and identify potential therapeutic targets by describing the underlying inflammatory profiles of the previously described recovery clusters at 5 months after hospital discharge. METHODS The Post-hospitalisation COVID-19 study (PHOSP-COVID) is a prospective, longitudinal cohort study recruiting adults (aged ≥18 years) discharged from hospital with COVID-19 across the UK. Recovery was assessed using patient-reported outcome measures, physical performance, and organ function at 5 months and 1 year after hospital discharge, and stratified by both patient-perceived recovery and recovery cluster. Hierarchical logistic regression modelling was performed for patient-perceived recovery at 1 year. Cluster analysis was done using the clustering large applications k-medoids approach using clinical outcomes at 5 months. Inflammatory protein profiling was analysed from plasma at the 5-month visit. This study is registered on the ISRCTN Registry, ISRCTN10980107, and recruitment is ongoing. FINDINGS 2320 participants discharged from hospital between March 7, 2020, and April 18, 2021, were assessed at 5 months after discharge and 807 (32·7%) participants completed both the 5-month and 1-year visits. 279 (35·6%) of these 807 patients were women and 505 (64·4%) were men, with a mean age of 58·7 (SD 12·5) years, and 224 (27·8%) had received invasive mechanical ventilation (WHO class 7-9). The proportion of patients reporting full recovery was unchanged between 5 months (501 [25·5%] of 1965) and 1 year (232 [28·9%] of 804). Factors associated with being less likely to report full recovery at 1 year were female sex (odds ratio 0·68 [95% CI 0·46-0·99]), obesity (0·50 [0·34-0·74]) and invasive mechanical ventilation (0·42 [0·23-0·76]). 
Cluster analysis (n=1636) corroborated the previously reported four clusters: very severe, severe, moderate with cognitive impairment, and mild, relating to the severity of physical health, mental health, and cognitive impairment at 5 months. We found increased inflammatory mediators of tissue damage and repair in both the very severe and the moderate with cognitive impairment clusters compared with the mild cluster, including IL-6 concentration, which was increased in both comparisons (n=626 participants). We found a substantial deficit in median EQ-5D-5L utility index from before COVID-19 (retrospective assessment; 0·88 [IQR 0·74-1·00]), at 5 months (0·74 [0·64-0·88]) to 1 year (0·75 [0·62-0·88]), with minimal improvements across all outcome measures at 1 year after discharge in the whole cohort and within each of the four clusters. INTERPRETATION The sequelae of a hospital admission with COVID-19 were substantial 1 year after discharge across a range of health domains, with the minority in our cohort feeling fully recovered. Patient-perceived health-related quality of life was reduced at 1 year compared with before hospital admission. Systemic inflammation and obesity are potential treatable traits that warrant further investigation in clinical trials. FUNDING UK Research and Innovation and National Institute for Health Research.
|
14
|
Liyanage I, Burdett T, Droesbeke B, Erdos K, Fernandez R, Gray A, Haseeb M, Jupp S, Penim F, Pommier C, Rocca-Serra P, Courtot M, Coppens F. ELIXIR biovalidator for semantic validation of life science metadata. Bioinformatics 2022; 38:3141-3142. [PMID: 35380605 PMCID: PMC9154242 DOI: 10.1093/bioinformatics/btac195] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2021] [Revised: 02/25/2022] [Accepted: 04/01/2022] [Indexed: 01/14/2023] Open
Abstract
SUMMARY To advance biomedical research, increasingly large amounts of complex data need to be discovered and integrated. This requires syntactic and semantic validation to ensure shared understanding of relevant entities. This article describes the ELIXIR biovalidator, which extends the syntactic validation of the widely used AJV library with ontology-based validation of JSON documents. AVAILABILITY AND IMPLEMENTATION Source code: https://github.com/elixir-europe/biovalidator, Release: v1.9.1, License: Apache License 2.0, Deployed at: https://www.ebi.ac.uk/biosamples/schema/validator/validate. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Isuru Liyanage
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton CB10 1SD, UK
| | - Tony Burdett
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton CB10 1SD, UK
| | - Bert Droesbeke
- Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052 Ghent, Belgium; VIB Center for Plant Systems Biology, 9052 Ghent, Belgium
| | - Karoly Erdos
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton CB10 1SD, UK
| | - Rolando Fernandez
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton CB10 1SD, UK
| | - Alasdair Gray
- Department of Computer Science, Heriot-Watt University, Edinburgh EH14 4AS, UK
| | - Muhammad Haseeb
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton CB10 1SD, UK
| | - Simon Jupp
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton CB10 1SD, UK
| | - Flavia Penim
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton CB10 1SD, UK
| | - Cyril Pommier
- INRAE, BioinfOmics, Plant Bioinformatics Facility, Université Paris-Saclay, 78026 Versailles, France; INRAE, URGI, Université Paris-Saclay, 78026 Versailles, France
| | - Philippe Rocca-Serra
- Department of Engineering Science, University of Oxford e-Research Centre, University of Oxford, Oxford OX1 3QG, UK
| | - Mélanie Courtot
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton CB10 1SD, UK; Ontario Institute for Cancer Research, Toronto, ON M5G 0A3, Canada. To whom correspondence should be addressed.
| | - Frederik Coppens
- Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052 Ghent, Belgium; VIB Center for Plant Systems Biology, 9052 Ghent, Belgium
| |
|
15
|
Moreno P, Fexova S, George N, Manning JR, Miao Z, Mohammed S, Muñoz-Pomer A, Fullgrabe A, Bi Y, Bush N, Iqbal H, Kumbham U, Solovyev A, Zhao L, Prakash A, García-Seisdedos D, Kundu DJ, Wang S, Walzer M, Clarke L, Osumi-Sutherland D, Tello-Ruiz MK, Kumari S, Ware D, Eliasova J, Arends MJ, Nawijn MC, Meyer K, Burdett T, Marioni J, Teichmann S, Vizcaíno JA, Brazma A, Papatheodorou I. Expression Atlas update: gene and protein expression in multiple species. Nucleic Acids Res 2022; 50:D129-D140. [PMID: 34850121 PMCID: PMC8728300 DOI: 10.1093/nar/gkab1030] [Citation(s) in RCA: 63] [Impact Index Per Article: 31.5] [Received: 09/17/2021] [Revised: 10/11/2021] [Accepted: 11/19/2021] [Indexed: 01/21/2023] Open
Abstract
The EMBL-EBI Expression Atlas is an added value knowledge base that enables researchers to answer the question of where (tissue, organism part, developmental stage, cell type) and under which conditions (disease, treatment, gender, etc) a gene or protein of interest is expressed. Expression Atlas brings together data from >4500 expression studies from >65 different species, across different conditions and tissues. It makes these data freely available in an easy to visualise form, after expert curation to accurately represent the intended experimental design, re-analysed via standardised pipelines that rely on open-source community developed tools. Each study's metadata are annotated using ontologies. The data are re-analyzed with the aim of reproducing the original conclusions of the underlying experiments. Expression Atlas is currently divided into Bulk Expression Atlas and Single Cell Expression Atlas. Expression Atlas contains data from differential studies (microarray and bulk RNA-Seq) and baseline studies (bulk RNA-Seq and proteomics), whereas Single Cell Expression Atlas is currently dedicated to Single Cell RNA-Sequencing (scRNA-Seq) studies. The resource has been in continuous development since 2009 and it is available at https://www.ebi.ac.uk/gxa.
Affiliation(s)
- Pablo Moreno
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
| | - Silvie Fexova
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
| | - Nancy George
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
| | - Jonathan R Manning
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
| | - Zhichiao Miao
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
| | - Suhaib Mohammed
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
| | - Alfonso Muñoz-Pomer
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
| | - Anja Fullgrabe
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
| | - Yalan Bi
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
| | - Natassja Bush
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
| | - Haider Iqbal
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
| | - Upendra Kumbham
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
| | - Andrey Solovyev
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
| | - Lingyun Zhao
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
| | - Ananth Prakash
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
| | - David García-Seisdedos
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
| | - Deepti J Kundu
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
| | - Shengbo Wang
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
| | - Mathias Walzer
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
| | - Laura Clarke
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
| | - David Osumi-Sutherland
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
| | | | - Sunita Kumari
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Doreen Ware
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- USDA ARS NEA, Plant Soil & Nutrition Laboratory Research Unit, Ithaca, NY 14853, USA
| | - Jana Eliasova
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
| | - Mark J Arends
- Edinburgh Pathology, University of Edinburgh, Institute of Genetics & Cancer, Edinburgh, UK
| | - Martijn C Nawijn
- Department of Pathology and Medical Biology, GRIAC research institute, University of Groningen, University Medical Center Groningen, Groningen, Netherlands
| | - Kerstin Meyer
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
| | - Tony Burdett
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
| | - John Marioni
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
| | - Sarah Teichmann
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
| | - Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
| | - Alvis Brazma
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
| | - Irene Papatheodorou
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
| |
|
16
|
Cummins C, Ahamed A, Aslam R, Burgin J, Devraj R, Edbali O, Gupta D, Harrison PW, Haseeb M, Holt S, Ibrahim T, Ivanov E, Jayathilaka S, Kadhirvelu V, Kay S, Kumar M, Lathi A, Leinonen R, Madeira F, Madhusoodanan N, Mansurova M, O'Cathail C, Pearce M, Pesant S, Rahman N, Rajan J, Rinck G, Selvakumar S, Sokolov A, Suman S, Thorne R, Totoo P, Vijayaraja S, Waheed Z, Zyoud A, Lopez R, Burdett T, Cochrane G. The European Nucleotide Archive in 2021. Nucleic Acids Res 2021; 50:D106-D110. [PMID: 34850158 PMCID: PMC8728206 DOI: 10.1093/nar/gkab1051] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Received: 10/01/2021] [Revised: 10/14/2021] [Accepted: 10/18/2021] [Indexed: 12/02/2022] Open
Abstract
The European Nucleotide Archive (ENA, https://www.ebi.ac.uk/ena), maintained at the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI), provides freely accessible services, both for deposition of, and access to, open nucleotide sequencing data. Open scientific data are of paramount importance to the scientific community and contribute daily to the acceleration of scientific advance. Here, we outline the major updates to ENA’s services and infrastructure that have been delivered over the past year.
Affiliation(s)
- Carla Cummins
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Alisha Ahamed
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Raheela Aslam
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Josephine Burgin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Rajkumar Devraj
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Ossama Edbali
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Dipayan Gupta
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Peter W Harrison
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Muhammad Haseeb
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Sam Holt
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Talal Ibrahim
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Eugene Ivanov
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Suran Jayathilaka
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Vishnukumar Kadhirvelu
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Simon Kay
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Manish Kumar
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Ankur Lathi
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Rasko Leinonen
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Fabio Madeira
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Nandana Madhusoodanan
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Milena Mansurova
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Colman O'Cathail
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Matt Pearce
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Stéphane Pesant
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Nadim Rahman
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Jeena Rajan
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Gabriele Rinck
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Sandeep Selvakumar
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Alexey Sokolov
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Swati Suman
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Ross Thorne
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Prabhat Totoo
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Senthilnathan Vijayaraja
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Zahra Waheed
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Ahmad Zyoud
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Rodrigo Lopez
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Tony Burdett
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Guy Cochrane
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| |
|
17
|
Lawson J, Cabili MN, Kerry G, Boughtwood T, Thorogood A, Alper P, Bowers SR, Boyles RR, Brookes AJ, Brush M, Burdett T, Clissold H, Donnelly S, Dyke SO, Freeberg MA, Haendel MA, Hata C, Holub P, Jeanson F, Jene A, Kawashima M, Kawashima S, Konopko M, Kyomugisha I, Li H, Linden M, Rodriguez LL, Morita M, Mulder N, Muller J, Nagaie S, Nasir J, Ogishima S, Ota Wang V, Paglione LD, Pandya RN, Parkinson H, Philippakis AA, Prasser F, Rambla J, Reinold K, Rushton GA, Saltzman A, Saunders G, Sofia HJ, Spalding JD, Swertz MA, Tulchinsky I, van Enckevort EJ, Varma S, Voisin C, Yamamoto N, Yamasaki C, Zass L, Guidry Auvil JM, Nyrönen TH, Courtot M. The Data Use Ontology to streamline responsible access to human biomedical datasets. Cell Genom 2021; 1:100028. [PMID: 34820659 PMCID: PMC8591903 DOI: 10.1016/j.xgen.2021.100028] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 02/28/2021] [Revised: 07/02/2021] [Accepted: 08/09/2021] [Indexed: 11/25/2022]
Abstract
Human biomedical datasets that are critical for research and clinical studies to benefit human health also often contain sensitive or potentially identifying information of individual participants. Thus, care must be taken when they are processed and made available to comply with ethical and regulatory frameworks and informed consent data conditions. To enable and streamline data access for these biomedical datasets, the Global Alliance for Genomics and Health (GA4GH) Data Use and Researcher Identities (DURI) work stream developed and approved the Data Use Ontology (DUO) standard. DUO is a hierarchical vocabulary of human and machine-readable data use terms that consistently and unambiguously represents a dataset's allowable data uses. DUO has been implemented by major international stakeholders such as the Broad and Sanger Institutes and is currently used in annotation of over 200,000 datasets worldwide. Using DUO in data management and access facilitates researchers' discovery and access of relevant datasets. DUO annotations increase the FAIRness of datasets and support data linkages using common data use profiles when integrating the data for secondary analyses. DUO is implemented in the Web Ontology Language (OWL) and, to increase community awareness and engagement, hosted in an open, centralized GitHub repository. DUO, together with the GA4GH Passport standard, offers a new, efficient, and streamlined data authorization and access framework that has enabled increased sharing of biomedical datasets worldwide.
Affiliation(s)
- Jonathan Lawson
- Broad Institute of Harvard and the Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Moran N. Cabili
- Broad Institute of Harvard and the Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Giselle Kerry
- European Molecular Biology Laboratory—European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
| | - Tiffany Boughtwood
- Australian Genomics, Murdoch Children’s Research Institute, Parkville, VIC, Australia
| | - Adrian Thorogood
- Centre of Genomics and Policy, Department of Human Genetics, McGill University, Montreal, QC, Canada; ELIXIR-Luxembourg, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Belvaux, Luxembourg
| | - Pinar Alper
- ELIXIR-Luxembourg, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Belvaux, Luxembourg
| | | | | | | | - Matthew Brush
- University of Colorado Anschutz Medical Campus, Aurora, CO, USA
| | - Tony Burdett
- European Molecular Biology Laboratory—European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
| | - Hayley Clissold
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
| | - Stacey Donnelly
- Broad Institute of Harvard and the Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Stephanie O.M. Dyke
- McGill Centre for Integrative Neuroscience, Montreal Neurological Institute, Department of Neurology & Neurosurgery, Faculty of Medicine, McGill University, Montreal, QC, Canada
| | - Mallory A. Freeberg
- European Molecular Biology Laboratory—European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
| | | | - Chihiro Hata
- Bioinformation and DDBJ Center, National Institute of Genetics, Mishima, Japan
| | - Petr Holub
- BBMRI-ERIC, AT and Masaryk University, Brno, Czech Republic
| | | | - Aina Jene
- Centre de Regulació Genòmica (CRG), Barcelona, Spain
| | - Minae Kawashima
- National Bioscience Database Center, Japan Science and Technology Agency, Tokyo, Japan
| | - Shuichi Kawashima
- Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, Kashiwa, Japan
| | | | - Irene Kyomugisha
- Division of Human Genetics, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
| | - Haoyuan Li
- Canada’s Michael Smith Genome Sciences Centre, Vancouver, BC, Canada
| | - Mikael Linden
- ELIXIR-Finland, CSC - IT Center for Science Ltd, Espoo, Finland
| | | | | | - Nicola Mulder
- Computational Biology Division, IDM, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
| | - Jean Muller
- Laboratoire de Génétique Médicale, Institut de Génétique Médicale d’Alsace, INSERM U1112, Université de Strasbourg, Strasbourg, France; Laboratoire de Diagnostic Génétique, Institut de Génétique Médicale d’Alsace, Hôpitaux Universitaires de Strasbourg, Strasbourg, France
| | - Satoshi Nagaie
- Tohoku Medical Megabank Organization (ToMMo), Tohoku University, Sendai, Japan
| | - Jamal Nasir
- Department of Life Sciences, University of Northampton, Northampton, UK
| | - Soichi Ogishima
- Tohoku Medical Megabank Organization (ToMMo), Tohoku University, Sendai, Japan
| | - Vivian Ota Wang
- Office of Data Sharing, National Cancer Institute, NIH, Rockville, MD, USA
| | | | | | - Helen Parkinson
- European Molecular Biology Laboratory—European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
| | - Anthony A. Philippakis
- Broad Institute of Harvard and the Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Fabian Prasser
- Berlin Institute of Health at Charité—Universitätsmedizin Berlin, Berlin, Germany
| | - Jordi Rambla
- Centre de Regulació Genòmica (CRG), Barcelona, Spain
| | - Kathy Reinold
- Broad Institute of Harvard and the Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Gregory A. Rushton
- Broad Institute of Harvard and the Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Andrea Saltzman
- Broad Institute of Harvard and the Massachusetts Institute of Technology, Cambridge, MA, USA
| | | | - Heidi J. Sofia
- National Human Genome Research Institute, NIH, Bethesda, MD, USA
| | - John D. Spalding
- European Molecular Biology Laboratory—European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
| | - Morris A. Swertz
- Genomics Coordination Center, Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
| | | | - Esther J. van Enckevort
- Genomics Coordination Center, Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
| | - Susheel Varma
- Health Data Research UK, Gibbs Building, 215 Euston Road, London NW1 2BE, UK
| | | | | | | | - Lyndon Zass
- Computational Biology Division, IDM, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
| | | | | | - Mélanie Courtot
- European Molecular Biology Laboratory—European Bioinformatics Institute (EMBL-EBI), Hinxton, UK (corresponding author)
|
18
|
Rehm HL, Page AJ, Smith L, Adams JB, Alterovitz G, Babb LJ, Barkley MP, Baudis M, Beauvais MJ, Beck T, Beckmann JS, Beltran S, Bernick D, Bernier A, Bonfield JK, Boughtwood TF, Bourque G, Bowers SR, Brookes AJ, Brudno M, Brush MH, Bujold D, Burdett T, Buske OJ, Cabili MN, Cameron DL, Carroll RJ, Casas-Silva E, Chakravarty D, Chaudhari BP, Chen SH, Cherry JM, Chung J, Cline M, Clissold HL, Cook-Deegan RM, Courtot M, Cunningham F, Cupak M, Davies RM, Denisko D, Doerr MJ, Dolman LI, Dove ES, Dursi LJ, Dyke SO, Eddy JA, Eilbeck K, Ellrott KP, Fairley S, Fakhro KA, Firth HV, Fitzsimons MS, Fiume M, Flicek P, Fore IM, Freeberg MA, Freimuth RR, Fromont LA, Fuerth J, Gaff CL, Gan W, Ghanaim EM, Glazer D, Green RC, Griffith M, Griffith OL, Grossman RL, Groza T, Guidry Auvil JM, Guigó R, Gupta D, Haendel MA, Hamosh A, Hansen DP, Hart RK, Hartley DM, Haussler D, Hendricks-Sturrup RM, Ho CW, Hobb AE, Hoffman MM, Hofmann OM, Holub P, Hsu JS, Hubaux JP, Hunt SE, Husami A, Jacobsen JO, Jamuar SS, Janes EL, Jeanson F, Jené A, Johns AL, Joly Y, Jones SJ, Kanitz A, Kato K, Keane TM, Kekesi-Lafrance K, Kelleher J, Kerry G, Khor SS, Knoppers BM, Konopko MA, Kosaki K, Kuba M, Lawson J, Leinonen R, Li S, Lin MF, Linden M, Liu X, Liyanage IU, Lopez J, Lucassen AM, Lukowski M, Mann AL, Marshall J, Mattioni M, Metke-Jimenez A, Middleton A, Milne RJ, Molnár-Gábor F, Mulder N, Munoz-Torres MC, Nag R, Nakagawa H, Nasir J, Navarro A, Nelson TH, Niewielska A, Nisselle A, Niu J, Nyrönen TH, O’Connor BD, Oesterle S, Ogishima S, Ota Wang V, Paglione LA, Palumbo E, Parkinson HE, Philippakis AA, Pizarro AD, Prlic A, Rambla J, Rendon A, Rider RA, Robinson PN, Rodarmer KW, Rodriguez LL, Rubin AF, Rueda M, Rushton GA, Ryan RS, Saunders GI, Schuilenburg H, Schwede T, Scollen S, Senf A, Sheffield NC, Skantharajah N, Smith AV, Sofia HJ, Spalding D, Spurdle AB, Stark Z, Stein LD, Suematsu M, Tan P, Tedds JA, Thomson AA, Thorogood A, Tickle TL, Tokunaga K, Törnroos J, Torrents D, Upchurch S, Valencia A, 
Guimera RV, Vamathevan J, Varma S, Vears DF, Viner C, Voisin C, Wagner AH, Wallace SE, Walsh BP, Williams MS, Winkler EC, Wold BJ, Wood GM, Woolley JP, Yamasaki C, Yates AD, Yung CK, Zass LJ, Zaytseva K, Zhang J, Goodhand P, North K, Birney E. GA4GH: International policies and standards for data sharing across genomic research and healthcare. Cell Genom 2021; 1:100029. [PMID: 35072136 PMCID: PMC8774288 DOI: 10.1016/j.xgen.2021.100029] [Citation(s) in RCA: 64] [Impact Index Per Article: 21.3] [Indexed: 01/08/2023]
Abstract
The Global Alliance for Genomics and Health (GA4GH) aims to accelerate biomedical advances by enabling the responsible sharing of clinical and genomic data through both harmonized data aggregation and federated approaches. The decreasing cost of genomic sequencing (along with other genome-wide molecular assays) and increasing evidence of its clinical utility will soon drive the generation of sequence data from tens of millions of humans, with increasing levels of diversity. In this perspective, we present the GA4GH strategies for addressing the major challenges of this data revolution. We describe the GA4GH organization, which is fueled by the development efforts of eight Work Streams and informed by the needs of 24 Driver Projects and other key stakeholders. We present the GA4GH suite of secure, interoperable technical standards and policy frameworks and review the current status of standards, their relevance to key domains of research and clinical care, and future plans of GA4GH. Broad international participation in building, adopting, and deploying GA4GH standards and frameworks will catalyze an unprecedented effort in data sharing that will be critical to advancing genomic medicine and ensuring that all populations can access its benefits.
Affiliation(s)
- Heidi L. Rehm
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Massachusetts General Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
| | - Angela J.H. Page
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Global Alliance for Genomics and Health, Toronto, ON, Canada
| | - Lindsay Smith
- Global Alliance for Genomics and Health, Toronto, ON, Canada
- Ontario Institute for Cancer Research, Toronto, ON, Canada
| | - Jeremy B. Adams
- Global Alliance for Genomics and Health, Toronto, ON, Canada
- Ontario Institute for Cancer Research, Toronto, ON, Canada
| | - Gil Alterovitz
- Brigham and Women’s Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
| | | | | | - Michael Baudis
- University of Zurich, Zurich, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Michael J.S. Beauvais
- Global Alliance for Genomics and Health, Toronto, ON, Canada
- McGill University, Montreal, QC, Canada
| | - Tim Beck
- University of Leicester, Leicester, UK
| | | | - Sergi Beltran
- CNAG-CRG, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Universitat de Barcelona, Barcelona, Spain
| | - David Bernick
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | | | | | - Tiffany F. Boughtwood
- Australian Genomics, Parkville, VIC, Australia
- Murdoch Children’s Research Institute, Parkville, VIC, Australia
| | - Guillaume Bourque
- McGill University, Montreal, QC, Canada
- Canadian Center for Computational Genomics, Montreal, QC, Canada
| | | | | | - Michael Brudno
- Canadian Center for Computational Genomics, Montreal, QC, Canada
- University of Toronto, Toronto, ON, Canada
- University Health Network, Toronto, ON, Canada
- Vector Institute, Toronto, ON, Canada
- Canadian Distributed Infrastructure for Genomics (CanDIG), Toronto, ON, Canada
| | | | - David Bujold
- McGill University, Montreal, QC, Canada
- Canadian Center for Computational Genomics, Montreal, QC, Canada
- Canadian Distributed Infrastructure for Genomics (CanDIG), Toronto, ON, Canada
| | - Tony Burdett
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
| | | | | | - Daniel L. Cameron
- Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- University of Melbourne, Melbourne, VIC, Australia
| | | | | | | | - Bimal P. Chaudhari
- Nationwide Children’s Hospital, Columbus, OH, USA
- The Ohio State University, Columbus, OH, USA
| | - Shu Hui Chen
- National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
| | | | - Justina Chung
- Global Alliance for Genomics and Health, Toronto, ON, Canada
- Ontario Institute for Cancer Research, Toronto, ON, Canada
| | - Melissa Cline
- UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA
| | | | | | - Mélanie Courtot
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
| | - Fiona Cunningham
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
| | | | | | | | | | | | | | - L. Jonathan Dursi
- University Health Network, Toronto, ON, Canada
- Canadian Distributed Infrastructure for Genomics (CanDIG), Toronto, ON, Canada
| | | | | | | | | | - Susan Fairley
- Global Alliance for Genomics and Health, Toronto, ON, Canada
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
| | - Khalid A. Fakhro
- Sidra Medicine, Doha, Qatar
- Weill Cornell Medicine - Qatar, Doha, Qatar
| | - Helen V. Firth
- Wellcome Sanger Institute, Hinxton, UK
- Addenbrooke’s Hospital, Cambridge, UK
| | | | | | - Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
| | - Ian M. Fore
- National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Mallory A. Freeberg
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
| | | | - Lauren A. Fromont
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
| | | | - Clara L. Gaff
- Australian Genomics, Parkville, VIC, Australia
- Murdoch Children’s Research Institute, Parkville, VIC, Australia
- Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- University of Melbourne, Melbourne, VIC, Australia
| | - Weiniu Gan
- National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
| | - Elena M. Ghanaim
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - David Glazer
- Verily Life Sciences, South San Francisco, CA, USA
| | - Robert C. Green
- Brigham and Women’s Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
| | - Malachi Griffith
- Washington University School of Medicine in St. Louis, St. Louis, MO, USA
| | - Obi L. Griffith
- Washington University School of Medicine in St. Louis, St. Louis, MO, USA
| | | | | | | | - Roderic Guigó
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
| | - Dipayan Gupta
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
| | | | - Ada Hamosh
- Johns Hopkins University, Baltimore, MD, USA
| | - David P. Hansen
- Australian Genomics, Parkville, VIC, Australia
- The Australian e-Health Research Centre, CSIRO, Herston, QLD, Australia
| | - Reece K. Hart
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Invitae, San Francisco, CA, USA
- MyOme, Inc, San Bruno, CA, USA
| | | | - David Haussler
- UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA
- Howard Hughes Medical Institute, University of California, Santa Cruz, CA, USA
| | | | | | | | - Michael M. Hoffman
- University of Toronto, Toronto, ON, Canada
- University Health Network, Toronto, ON, Canada
- Vector Institute, Toronto, ON, Canada
| | - Oliver M. Hofmann
- University of Toronto, Toronto, ON, Canada
- University of Melbourne, Melbourne, VIC, Australia
| | - Petr Holub
- BBMRI-ERIC, Graz, Austria
- Masaryk University, Brno, Czech Republic
| | | | | | - Sarah E. Hunt
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
| | - Ammar Husami
- Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, USA
| | | | - Saumya S. Jamuar
- SingHealth Duke-NUS Genomic Medicine Centre, Singapore, Republic of Singapore
- SingHealth Duke-NUS Institute of Precision Medicine, Singapore, Republic of Singapore
| | - Elizabeth L. Janes
- Global Alliance for Genomics and Health, Toronto, ON, Canada
- University of Waterloo, Waterloo, ON, Canada
| | | | - Aina Jené
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
| | - Amber L. Johns
- Garvan Institute of Medical Research, Darlinghurst, NSW, Australia
| | - Yann Joly
- McGill University, Montreal, QC, Canada
| | - Steven J.M. Jones
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
| | - Alexander Kanitz
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- University of Basel, Basel, Switzerland
| | | | - Thomas M. Keane
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
- University of Nottingham, Nottingham, UK
| | - Kristina Kekesi-Lafrance
- Global Alliance for Genomics and Health, Toronto, ON, Canada
- McGill University, Montreal, QC, Canada
| | | | - Giselle Kerry
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
| | - Seik-Soon Khor
- National Center for Global Health and Medicine Hospital, Tokyo, Japan
- University of Tokyo, Tokyo, Japan
| | | | | | | | | | | | - Rasko Leinonen
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
| | - Stephanie Li
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Global Alliance for Genomics and Health, Toronto, ON, Canada
| | | | - Mikael Linden
- CSC–IT Center for Science, Espoo, Finland
- ELIXIR Finland, Espoo, Finland
| | | | - Isuru Udara Liyanage
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
| | | | | | | | - Alice L. Mann
- Global Alliance for Genomics and Health, Toronto, ON, Canada
- Wellcome Sanger Institute, Hinxton, UK
| | | | | | | | - Anna Middleton
- Wellcome Connecting Science, Hinxton, UK
- University of Cambridge, Cambridge, UK
| | - Richard J. Milne
- Wellcome Connecting Science, Hinxton, UK
- University of Cambridge, Cambridge, UK
| | | | - Nicola Mulder
- H3ABioNet, Computational Biology Division, IDM, Faculty of Health Sciences, Cape Town, South Africa
| | | | - Rishi Nag
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
| | - Hidewaki Nakagawa
- Japan Agency for Medical Research & Development (AMED), Tokyo, Japan
- RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | | | - Arcadi Navarro
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Institute of Evolutionary Biology (UPF-CSIC), Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats, Barcelona, Spain
- Barcelonaβeta Brain Research Center (BBRC), Pasqual Maragall Foundation, Barcelona, Spain
| | | | - Ania Niewielska
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
| | - Amy Nisselle
- Murdoch Children’s Research Institute, Parkville, VIC, Australia
- University of Melbourne, Melbourne, VIC, Australia
- Human Genetics Society of Australasia Education, Ethics & Social Issues Committee, Alexandria, NSW, Australia
| | - Jeffrey Niu
- University Health Network, Toronto, ON, Canada
| | - Tommi H. Nyrönen
- CSC–IT Center for Science, Espoo, Finland
- ELIXIR Finland, Espoo, Finland
| | | | - Sabine Oesterle
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | | | - Vivian Ota Wang
- National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | | | - Emilio Palumbo
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
| | - Helen E. Parkinson
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
| | | | | | | | - Jordi Rambla
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
| | | | - Renee A. Rider
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Peter N. Robinson
- The Jackson Laboratory, Farmington, CT, USA
- University of Connecticut, Farmington, CT, USA
| | - Kurt W. Rodarmer
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | | | - Alan F. Rubin
- Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- University of Melbourne, Melbourne, VIC, Australia
| | - Manuel Rueda
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
| | | | | | | | - Helen Schuilenburg
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
| | - Torsten Schwede
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- University of Basel, Basel, Switzerland
| | | | | | | | - Neerjah Skantharajah
- Global Alliance for Genomics and Health, Toronto, ON, Canada
- Ontario Institute for Cancer Research, Toronto, ON, Canada
| | | | - Heidi J. Sofia
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Dylan Spalding
- CSC–IT Center for Science, Espoo, Finland
- ELIXIR Finland, Espoo, Finland
| | | | - Zornitza Stark
- Australian Genomics, Parkville, VIC, Australia
- Murdoch Children’s Research Institute, Parkville, VIC, Australia
- University of Melbourne, Melbourne, VIC, Australia
| | - Lincoln D. Stein
- Ontario Institute for Cancer Research, Toronto, ON, Canada
- University of Toronto, Toronto, ON, Canada
| | | | - Patrick Tan
- SingHealth Duke-NUS Genomic Medicine Centre, Singapore, Republic of Singapore
- Precision Health Research Singapore, Singapore, Republic of Singapore
- Genome Institute of Singapore, Singapore, Republic of Singapore
| | | | - Alastair A. Thomson
- National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
| | - Adrian Thorogood
- McGill University, Montreal, QC, Canada
- University of Luxembourg, Esch-sur-Alzette, Luxembourg
| | | | - Katsushi Tokunaga
- University of Tokyo, Tokyo, Japan
- National Center for Global Health and Medicine, Tokyo, Japan
| | - Juha Törnroos
- CSC–IT Center for Science, Espoo, Finland
- ELIXIR Finland, Espoo, Finland
| | - David Torrents
- Institució Catalana de Recerca i Estudis Avançats, Barcelona, Spain
- Barcelona Supercomputing Center, Barcelona, Spain
| | - Sean Upchurch
- California Institute of Technology, Pasadena, CA, USA
| | - Alfonso Valencia
- Institució Catalana de Recerca i Estudis Avançats, Barcelona, Spain
- Barcelona Supercomputing Center, Barcelona, Spain
| | | | - Jessica Vamathevan
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
| | - Susheel Varma
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
- Health Data Research UK, London, UK
| | - Danya F. Vears
- Murdoch Children’s Research Institute, Parkville, VIC, Australia
- University of Melbourne, Melbourne, VIC, Australia
- Human Genetics Society of Australasia Education, Ethics & Social Issues Committee, Alexandria, NSW, Australia
- Melbourne Law School, University of Melbourne, Parkville, VIC, Australia
| | - Coby Viner
- University of Toronto, Toronto, ON, Canada
- University Health Network, Toronto, ON, Canada
| | | | - Alex H. Wagner
- Nationwide Children’s Hospital, Columbus, OH, USA
- The Ohio State University, Columbus, OH, USA
| | | | | | | | - Eva C. Winkler
- Section of Translational Medical Ethics, University Hospital Heidelberg, Heidelberg, Germany
| | | | | | | | | | - Andrew D. Yates
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
| | - Christina K. Yung
- Ontario Institute for Cancer Research, Toronto, ON, Canada
- Indoc Research, Toronto, ON, Canada
| | - Lyndon J. Zass
- H3ABioNet, Computational Biology Division, IDM, Faculty of Health Sciences, Cape Town, South Africa
| | - Ksenia Zaytseva
- McGill University, Montreal, QC, Canada
- Canadian Centre for Computational Genomics, Montreal, QC, Canada
| | - Junjun Zhang
- Ontario Institute for Cancer Research, Toronto, ON, Canada
| | - Peter Goodhand
- Global Alliance for Genomics and Health, Toronto, ON, Canada
- Ontario Institute for Cancer Research, Toronto, ON, Canada
| | - Kathryn North
- Murdoch Children’s Research Institute, Parkville, VIC, Australia
- University of Toronto, Toronto, ON, Canada
- University of Melbourne, Melbourne, VIC, Australia
| | - Ewan Birney
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
- European Molecular Biology Laboratory, Heidelberg, Germany
|
19
|
Courtot M, Gupta D, Liyanage I, Xu F, Burdett T. BioSamples database: FAIRer samples metadata to accelerate research data management. Nucleic Acids Res 2021; 50:D1500-D1507. [PMID: 34747489 PMCID: PMC8728232 DOI: 10.1093/nar/gkab1046] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 09/15/2021] [Revised: 10/13/2021] [Accepted: 10/14/2021] [Indexed: 12/04/2022] Open
Abstract
The BioSamples database at EMBL-EBI is the central institutional repository for sample metadata storage and connection to EMBL-EBI archives and other resources. The technical improvements to our infrastructure described in our last update have enabled us to scale and accommodate an increasing number of communities, resulting in a higher number of submissions and more heterogeneous data. The BioSamples database now has a valuable set of features and processes to improve data quality in BioSamples, and in particular enriching metadata content and following FAIR principles. In this manuscript, we describe how BioSamples in 2021 handles requirements from our community of users through exemplar use cases: increased findability of samples and improved data management practices support the goals of the ReSOLUTE project, how the plant community benefits from being able to link genotypic to phenotypic information, and we highlight how cumulatively those improvements contribute to more complex multi-omics data integration supporting COVID-19 research. Finally, we present underlying technical features used as pillars throughout those use cases and how they are reused for expanded engagement with communities such as FAIRplus and the Global Alliance for Genomics and Health. Availability: The BioSamples database is freely available at http://www.ebi.ac.uk/biosamples. Content is distributed under the EMBL-EBI Terms of Use available at https://www.ebi.ac.uk/about/terms-of-use. The BioSamples code is available at https://github.com/EBIBioSamples/biosamples-v4 and distributed under the Apache 2.0 license.
Affiliation(s)
- Mélanie Courtot
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
| | - Dipayan Gupta
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
| | - Isuru Liyanage
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
| | - Fuqi Xu
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
| | - Tony Burdett
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
|
20
|
Kerimov N, Hayhurst JD, Peikova K, Manning JR, Walter P, Kolberg L, Samoviča M, Sakthivel MP, Kuzmin I, Trevanion SJ, Burdett T, Jupp S, Parkinson H, Papatheodorou I, Yates AD, Zerbino DR, Alasoo K. A compendium of uniformly processed human gene expression and splicing quantitative trait loci. Nat Genet 2021; 53:1290-1299. [PMID: 34493866 PMCID: PMC8423625 DOI: 10.1038/s41588-021-00924-w] [Citation(s) in RCA: 128] [Impact Index Per Article: 42.7] [Received: 01/08/2021] [Accepted: 07/26/2021] [Indexed: 12/15/2022]
Abstract
Many gene expression quantitative trait locus (eQTL) studies have published their summary statistics, which can be used to gain insight into complex human traits by downstream analyses, such as fine mapping and co-localization. However, technical differences between these datasets are a barrier to their widespread use. Consequently, target genes for most genome-wide association study (GWAS) signals have still not been identified. In the present study, we present the eQTL Catalogue (https://www.ebi.ac.uk/eqtl), a resource of quality-controlled, uniformly re-computed gene expression and splicing QTLs from 21 studies. We find that, for matching cell types and tissues, the eQTL effect sizes are highly reproducible between studies. Although most QTLs were shared between most bulk tissues, we identified a greater diversity of cell-type-specific QTLs from purified cell types, a subset of which also manifested as new disease co-localizations. Our summary statistics are freely available to enable the systematic interpretation of human GWAS associations across many cell types and tissues.
Affiliation(s)
- Nurlan Kerimov
- Institute of Computer Science, University of Tartu, Tartu, Estonia
- Open Targets, Wellcome Genome Campus, Cambridge, UK
| | - James D Hayhurst
- Open Targets, Wellcome Genome Campus, Cambridge, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - Kateryna Peikova
- Institute of Computer Science, University of Tartu, Tartu, Estonia
| | - Jonathan R Manning
- Open Targets, Wellcome Genome Campus, Cambridge, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - Peter Walter
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - Liis Kolberg
- Institute of Computer Science, University of Tartu, Tartu, Estonia
| | - Marija Samoviča
- Institute of Computer Science, University of Tartu, Tartu, Estonia
| | - Manoj Pandian Sakthivel
- Open Targets, Wellcome Genome Campus, Cambridge, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - Ivan Kuzmin
- Institute of Computer Science, University of Tartu, Tartu, Estonia
| | - Stephen J Trevanion
- Open Targets, Wellcome Genome Campus, Cambridge, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - Tony Burdett
- Open Targets, Wellcome Genome Campus, Cambridge, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - Simon Jupp
- Open Targets, Wellcome Genome Campus, Cambridge, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - Helen Parkinson
- Open Targets, Wellcome Genome Campus, Cambridge, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - Irene Papatheodorou
- Open Targets, Wellcome Genome Campus, Cambridge, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - Andrew D Yates
- Open Targets, Wellcome Genome Campus, Cambridge, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - Daniel R Zerbino
- Open Targets, Wellcome Genome Campus, Cambridge, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - Kaur Alasoo
- Institute of Computer Science, University of Tartu, Tartu, Estonia
- Open Targets, Wellcome Genome Campus, Cambridge, UK
|
21
|
Burdett T, Knight A. COVID-19: the need to address health inequalities. Perspect Public Health 2021; 142:18-19. [PMID: 34044649 PMCID: PMC8755919 DOI: 10.1177/17579139211011497] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Affiliation(s)
- T Burdett
- Senior Academic in Integrated Health Care, Department of Nursing Science, Faculty of Health and Social Sciences, Bournemouth Gateway Building, Bournemouth University, Bournemouth, Dorset BH8 8GP, UK
| | - A Knight
- Senior Lecturer, Adult Nursing, Department of Nursing Science, Bournemouth University, Poole, UK
|
22
|
Harrison PW, Ahamed A, Aslam R, Alako BTF, Burgin J, Buso N, Courtot M, Fan J, Gupta D, Haseeb M, Holt S, Ibrahim T, Ivanov E, Jayathilaka S, Balavenkataraman Kadhirvelu V, Kumar M, Lopez R, Kay S, Leinonen R, Liu X, O'Cathail C, Pakseresht A, Park Y, Pesant S, Rahman N, Rajan J, Sokolov A, Vijayaraja S, Waheed Z, Zyoud A, Burdett T, Cochrane G. The European Nucleotide Archive in 2020. Nucleic Acids Res 2021; 49:D82-D85. [PMID: 33175160 PMCID: PMC7778925 DOI: 10.1093/nar/gkaa1028] [Citation(s) in RCA: 72] [Impact Index Per Article: 24.0] [Received: 09/18/2020] [Accepted: 10/20/2020] [Indexed: 11/12/2022] Open
Abstract
The European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena), provided by the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI), has for almost forty years continued in its mission to freely archive and present the world's public sequencing data for the benefit of the entire scientific community and for the acceleration of the global research effort. Here we highlight the major developments to ENA services and content in 2020, focussing in particular on the recently released updated ENA browser, modernisation of our release process and our data coordination collaborations with specific research communities.
Affiliation(s)
- Peter W Harrison
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Alisha Ahamed
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Raheela Aslam
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Blaise T F Alako
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Josephine Burgin
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Nicola Buso
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Mélanie Courtot
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Jun Fan
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Dipayan Gupta
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Muhammad Haseeb
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Sam Holt
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Talal Ibrahim
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Eugene Ivanov
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Suran Jayathilaka
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Manish Kumar
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Rodrigo Lopez
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Simon Kay
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Rasko Leinonen
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Xin Liu
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Colman O'Cathail
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Amir Pakseresht
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Youngmi Park
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Stephane Pesant
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Nadim Rahman
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Jeena Rajan
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Alexey Sokolov
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Senthilnathan Vijayaraja
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Zahra Waheed
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Ahmad Zyoud
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Tony Burdett
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Guy Cochrane
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
|
23
|
Ghoussaini M, Mountjoy E, Carmona M, Peat G, Schmidt E, Hercules A, Fumis L, Miranda A, Carvalho-Silva D, Buniello A, Burdett T, Hayhurst J, Baker J, Ferrer J, Gonzalez-Uriarte A, Jupp S, Karim M, Koscielny G, Machlitt-Northen S, Malangone C, Pendlington ZM, Roncaglia P, Suveges D, Wright D, Vrousgou O, Papa E, Parkinson H, MacArthur JAL, Todd J, Barrett JC, Schwartzentruber J, Hulcoop D, Ochoa D, McDonagh EM, Dunham I. Open Targets Genetics: systematic identification of trait-associated genes using large-scale genetics and functional genomics. Nucleic Acids Res 2021; 49:D1311-D1320. [PMID: 33045747 PMCID: PMC7778936 DOI: 10.1093/nar/gkaa840] [Citation(s) in RCA: 208] [Impact Index Per Article: 69.3] [Received: 08/14/2020] [Revised: 09/16/2020] [Accepted: 09/17/2020] [Indexed: 01/22/2023]
Abstract
Open Targets Genetics (https://genetics.opentargets.org) is an open-access integrative resource that aggregates human GWAS and functional genomics data including gene expression, protein abundance, chromatin interaction and conformation data from a wide range of cell types and tissues to make robust connections between GWAS-associated loci, variants and likely causal genes. This enables systematic identification and prioritisation of likely causal variants and genes across all published trait-associated loci. In this paper, we describe the public resources we aggregate, the technology and analyses we use, and the functionality that the portal offers. Open Targets Genetics can be searched by variant, gene or study/phenotype. It offers tools that enable users to prioritise causal variants and genes at disease-associated loci and access systematic cross-disease and disease-molecular trait colocalization analysis across 92 cell types and tissues including the eQTL Catalogue. Data visualizations such as Manhattan-like plots, regional plots, credible sets overlap between studies and PheWAS plots enable users to explore GWAS signals in depth. The integrated data is made available through the web portal, for bulk download and via a GraphQL API, and the software is open source. Applications of this integrated data include identification of novel targets for drug discovery and drug repurposing.
Affiliation(s)
- Maya Ghoussaini
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Edward Mountjoy
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Miguel Carmona
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Gareth Peat
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Ellen M Schmidt
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Andrew Hercules
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Luca Fumis
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Alfredo Miranda
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Denise Carvalho-Silva
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Annalisa Buniello
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Tony Burdett
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- James Hayhurst
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Jarrod Baker
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Javier Ferrer
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Asier Gonzalez-Uriarte
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Simon Jupp
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Mohd Anisul Karim
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Gautier Koscielny
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
  - GlaxoSmithKline plc, GSK Medicines Research Centre, Gunnels Wood Road, Stevenage SG1 2NY, UK
- Sandra Machlitt-Northen
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
  - GlaxoSmithKline plc, GSK Medicines Research Centre, Gunnels Wood Road, Stevenage SG1 2NY, UK
- Cinzia Malangone
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Zoe May Pendlington
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Paola Roncaglia
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Daniel Suveges
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Daniel Wright
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Olga Vrousgou
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Eliseo Papa
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
  - Systems Biology, Biogen, Cambridge, MA 02142, USA
- Helen Parkinson
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Jacqueline A L MacArthur
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- John A Todd
  - Wellcome Centre for Human Genetics, Nuffield Department of Medicine, NIHR Oxford Biomedical Research Centre, University of Oxford, Roosevelt Drive, Oxford OX3 7BN, UK
- Jeffrey C Barrett
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Jeremy Schwartzentruber
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- David G Hulcoop
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
  - GlaxoSmithKline plc, GSK Medicines Research Centre, Gunnels Wood Road, Stevenage SG1 2NY, UK
- David Ochoa
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Ellen M McDonagh
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Ian Dunham
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
|
24
|
Knight A, Burdett T. Achieving integrated care: the need for digital empowerment. Perspect Public Health 2020; 141:15-16. [PMID: 33079012 PMCID: PMC7770209 DOI: 10.1177/1757913920921422] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Affiliation(s)
- A Knight
  - Senior Lecturer, Adult Nursing, Department of Nursing Science, Bournemouth University, Bournemouth House, 19 Christchurch Road, Bournemouth, BH1 3LH, UK
- T Burdett
  - Senior Lecturer, Integrated Care, Department of Nursing Science, Bournemouth University, UK
|
25
|
Amid C, Alako BTF, Balavenkataraman Kadhirvelu V, Burdett T, Burgin J, Fan J, Harrison PW, Holt S, Hussein A, Ivanov E, Jayathilaka S, Kay S, Keane T, Leinonen R, Liu X, Martinez-Villacorta J, Milano A, Pakseresht A, Rahman N, Rajan J, Reddy K, Richards E, Smirnov D, Sokolov A, Vijayaraja S, Cochrane G. The European Nucleotide Archive in 2019. Nucleic Acids Res 2020; 48:D70-D76. [PMID: 31722421 PMCID: PMC7145635 DOI: 10.1093/nar/gkz1063] [Citation(s) in RCA: 55] [Impact Index Per Article: 13.8] [Received: 09/04/2019] [Revised: 10/25/2019] [Accepted: 11/07/2019] [Indexed: 11/12/2022]
Abstract
The European Nucleotide Archive (ENA, https://www.ebi.ac.uk/ena) at the European Molecular Biology Laboratory's European Bioinformatics Institute provides open and freely available data deposition and access services across the spectrum of nucleotide sequence data types. Making the world's public sequencing datasets available to the scientific community, the ENA represents a globally comprehensive nucleotide sequence resource. Here, we outline ENA services and content in 2019 and provide an insight into selected key areas of development in this period.
Affiliation(s)
- Clara Amid
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Blaise T F Alako
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Tony Burdett
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Josephine Burgin
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Jun Fan
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Peter W Harrison
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Sam Holt
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Abdulrahman Hussein
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Eugene Ivanov
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Suran Jayathilaka
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Simon Kay
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Thomas Keane
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Rasko Leinonen
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Xin Liu
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Josue Martinez-Villacorta
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Annalisa Milano
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Amir Pakseresht
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Nadim Rahman
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Jeena Rajan
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Kethi Reddy
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Edward Richards
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Dmitriy Smirnov
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Alexey Sokolov
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Senthilnathan Vijayaraja
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Guy Cochrane
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
|
26
|
Courtot M, Cherubin L, Faulconbridge A, Vaughan D, Green M, Richardson D, Harrison P, Whetzel PL, Parkinson H, Burdett T. BioSamples database: an updated sample metadata hub. Nucleic Acids Res 2020; 47:D1172-D1178. [PMID: 30407529 PMCID: PMC6323949 DOI: 10.1093/nar/gky1061] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Received: 09/21/2018] [Accepted: 10/18/2018] [Indexed: 12/23/2022]
Abstract
The BioSamples database at EMBL-EBI provides a central hub for sample metadata storage and linkage to other EMBL-EBI resources. BioSamples has recently undergone major changes, both in terms of data content and supporting infrastructure. The data content has more than doubled from around 2 million samples in 2014 to just over 5 million samples in 2018. Fast, reciprocal data exchange was fully established between sister Biosample databases and other INSDC partners, enabling a worldwide common representation and centralization of sample metadata. The BioSamples platform has been upgraded to accommodate anticipated increases in the number of submissions via GA4GH driver projects such as the Human Cell Atlas and the EGA, as well as from mirroring of NCBI dbGaP data. The BioSamples database is now the authoritative repository for all INSDC sample metadata, an ELIXIR Deposition Database for Biomolecular Data and the EMBL-EBI sample metadata hub. To support faster turnaround for sample submission, and to increase scalability and resilience, we have upgraded the BioSamples database backend storage, APIs and user interface. Finally, the website has been redesigned to allow search and retrieval of records based on specific filters, such as ‘disease’ or ‘organism’. These changes are targeted at answering current use cases as well as providing functionalities for future emerging and anticipated developments. Availability: The BioSamples database is freely available at http://www.ebi.ac.uk/biosamples. Content is distributed under the EMBL-EBI Terms of Use available at https://www.ebi.ac.uk/about/terms-of-use.
Affiliation(s)
- Luca Cherubin
  - EMBL-EBI, Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Matthew Green
  - EMBL-EBI, Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Tony Burdett
  - EMBL-EBI, Wellcome Genome Campus, Hinxton CB10 1SD, UK
|
27
|
Buniello A, MacArthur JAL, Cerezo M, Harris LW, Hayhurst J, Malangone C, McMahon A, Morales J, Mountjoy E, Sollis E, Suveges D, Vrousgou O, Whetzel PL, Amode R, Guillen JA, Riat HS, Trevanion SJ, Hall P, Junkins H, Flicek P, Burdett T, Hindorff LA, Cunningham F, Parkinson H. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res 2020; 47:D1005-D1012. [PMID: 30445434 PMCID: PMC6323933 DOI: 10.1093/nar/gky1120] [Citation(s) in RCA: 2277] [Impact Index Per Article: 569.3] [Received: 09/28/2018] [Accepted: 10/25/2018] [Indexed: 02/06/2023]
Abstract
The GWAS Catalog delivers a high-quality curated collection of all published genome-wide association studies enabling investigations to identify causal variants, understand disease mechanisms, and establish targets for novel therapies. The scope of the Catalog has also expanded to targeted and exome arrays with 1000 new associations added for these technologies. As of September 2018, the Catalog contains 5687 GWAS comprising 71673 variant-trait associations from 3567 publications. New content includes 284 full P-value summary statistics datasets for genome-wide and new targeted array studies, representing 6 × 10^9 individual variant-trait statistics. In the last 12 months, the Catalog's user interface was accessed by ∼90000 unique users who viewed >1 million pages. We have improved data access with the release of a new RESTful API to support high-throughput programmatic access, an improved web interface and a new summary statistics database. Summary statistics provision is supported by a new format proposed as a community standard for summary statistics data representation. This format was derived from our experience in standardizing heterogeneous submissions, mapping formats and in harmonizing content. Availability: https://www.ebi.ac.uk/gwas/.
Affiliation(s)
- Annalisa Buniello
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Jacqueline A L MacArthur
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Maria Cerezo
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Laura W Harris
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- James Hayhurst
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Cinzia Malangone
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Aoife McMahon
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Joannella Morales
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Edward Mountjoy
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
  - JDRF/Wellcome Trust Diabetes and Inflammation Laboratory, Wellcome Centre for Human Genetics, University of Oxford, NIHR Oxford Biomedical Research Centre, Nuffield Department of Medicine, Oxford, UK
- Elliot Sollis
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Daniel Suveges
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Olga Vrousgou
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Patricia L Whetzel
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Ridwan Amode
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Jose A Guillen
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Harpreet S Riat
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Stephen J Trevanion
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Peggy Hall
  - Division of Genomic Medicine, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Heather Junkins
  - Division of Genomic Medicine, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Paul Flicek
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Tony Burdett
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Lucia A Hindorff
  - Division of Genomic Medicine, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Fiona Cunningham
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Helen Parkinson
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
|
28
|
Morales J, Welter D, Bowler EH, Cerezo M, Harris LW, McMahon AC, Hall P, Junkins HA, Milano A, Hastings E, Malangone C, Buniello A, Burdett T, Flicek P, Parkinson H, Cunningham F, Hindorff LA, MacArthur JAL. A standardized framework for representation of ancestry data in genomics studies, with application to the NHGRI-EBI GWAS Catalog. Genome Biol 2018; 19:21. [PMID: 29448949 PMCID: PMC5815218 DOI: 10.1186/s13059-018-1396-2] [Citation(s) in RCA: 126] [Impact Index Per Article: 21.0] [Received: 08/21/2017] [Accepted: 01/19/2018] [Indexed: 12/23/2022]
Abstract
The accurate description of ancestry is essential to interpret, access, and integrate human genomics data, and to ensure that these benefit individuals from all ancestral backgrounds. However, there are no established guidelines for the representation of ancestry information. Here we describe a framework for the accurate and standardized description of sample ancestry, and validate it by application to the NHGRI-EBI GWAS Catalog. We confirm known biases and gaps in diversity, and find that African and Hispanic or Latin American ancestry populations contribute a disproportionately high number of associations. It is our hope that widespread adoption of this framework will lead to improved analysis, interpretation, and integration of human genomics data.
Affiliation(s)
- Joannella Morales
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
| | - Danielle Welter
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Emily H Bowler
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Maria Cerezo
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Laura W Harris
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Aoife C McMahon
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Peggy Hall
- Division of Genomic Medicine, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, 20892-9305, USA
| | - Heather A Junkins
- Division of Genomic Medicine, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, 20892-9305, USA
| | - Annalisa Milano
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Emma Hastings
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Cinzia Malangone
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Annalisa Buniello
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Tony Burdett
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Helen Parkinson
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Fiona Cunningham
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Lucia A Hindorff
- Division of Genomic Medicine, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, 20892-9305, USA
| | - Jacqueline A L MacArthur
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
| |
|
29
|
McMurry JA, Juty N, Blomberg N, Burdett T, Conlin T, Conte N, Courtot M, Deck J, Dumontier M, Fellows DK, Gonzalez-Beltran A, Gormanns P, Grethe J, Hastings J, Hériché JK, Hermjakob H, Ison JC, Jimenez RC, Jupp S, Kunze J, Laibe C, Le Novère N, Malone J, Martin MJ, McEntyre JR, Morris C, Muilu J, Müller W, Rocca-Serra P, Sansone SA, Sariyar M, Snoep JL, Soiland-Reyes S, Stanford NJ, Swainston N, Washington N, Williams AR, Wimalaratne SM, Winfree LM, Wolstencroft K, Goble C, Mungall CJ, Haendel MA, Parkinson H. Identifiers for the 21st century: How to design, provision, and reuse persistent identifiers to maximize utility and impact of life science data. PLoS Biol 2017; 15:e2001414. [PMID: 28662064 PMCID: PMC5490878 DOI: 10.1371/journal.pbio.2001414] [Citation(s) in RCA: 69] [Impact Index Per Article: 9.9] [Indexed: 11/18/2022]
Abstract
In many disciplines, data are highly decentralized across thousands of online databases (repositories, registries, and knowledgebases). Wringing value from such databases depends on the discipline of data science and on the humble bricks and mortar that make integration possible; identifiers are a core component of this integration infrastructure. Drawing on our experience and on work by other groups, we outline 10 lessons we have learned about the identifier qualities and best practices that facilitate large-scale data integration. Specifically, we propose actions that identifier practitioners (database providers) should take in the design, provision and reuse of identifiers. We also outline the important considerations for those referencing identifiers in various circumstances, including by authors and data generators. While the importance and relevance of each lesson will vary by context, there is a need for increased awareness about how to avoid and manage common identifier problems, especially those related to persistence and web-accessibility/resolvability. We focus strongly on web-based identifiers in the life sciences; however, the principles are broadly relevant to other disciplines.
Affiliation(s)
- Julie A. McMurry
- Department of Medical Informatics and Epidemiology and OHSU Library, Oregon Health & Science University, Portland, Oregon, United States of America
| | - Nick Juty
- European Bioinformatics Institute, European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Niklas Blomberg
- ELIXIR Hub, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Tony Burdett
- European Bioinformatics Institute, European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Tom Conlin
- Department of Medical Informatics and Epidemiology and OHSU Library, Oregon Health & Science University, Portland, Oregon, United States of America
| | - Nathalie Conte
- European Bioinformatics Institute, European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Mélanie Courtot
- European Bioinformatics Institute, European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
| | - John Deck
- Berkeley Natural History Museums, University of California at Berkeley, Berkeley, California, United States of America
| | - Michel Dumontier
- Institute of Data Science, Maastricht University, Maastricht, the Netherlands
| | - Donal K. Fellows
- School of Computer Science, The University of Manchester, Manchester, United Kingdom
| | | | - Philipp Gormanns
- Institute of Experimental Genetics, Helmholtz Centre Munich, German Research Center for Environmental Health, Neuherberg, Germany
| | - Jeffrey Grethe
- Center for Research in Biological Systems, University of California San Diego, La Jolla, California, United States of America
| | | | | | - Henning Hermjakob
- European Bioinformatics Institute, European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Jon C. Ison
- Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, Lyngby, Denmark
| | - Rafael C. Jimenez
- European Bioinformatics Institute, European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Simon Jupp
- European Bioinformatics Institute, European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
| | - John Kunze
- California Digital Library, Oakland, California, United States of America
| | - Camille Laibe
- European Bioinformatics Institute, European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
| | | | - James Malone
- European Bioinformatics Institute, European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Maria Jesus Martin
- European Bioinformatics Institute, European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Johanna R. McEntyre
- European Bioinformatics Institute, European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Chris Morris
- Science and Technology Facilities Council, Daresbury Laboratory, Warrington, United Kingdom
| | - Juha Muilu
- Genomics Coordination Center, Department of Genetics, University Medical Center Groningen and Groningen Bioinformatics Center, University of Groningen, Groningen, the Netherlands
| | - Wolfgang Müller
- Scientific Databases and Visualization at Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
| | | | | | - Murat Sariyar
- Institute for Medical Informatics, Bern University of Applied Sciences, Engineering and Information Technology, Bern, Switzerland
| | - Jacky L. Snoep
- Manchester Institute of Biology, University of Manchester, Manchester, United Kingdom
- Department of Biochemistry, Stellenbosch University, Stellenbosch, South Africa
| | - Stian Soiland-Reyes
- School of Computer Science, The University of Manchester, Manchester, United Kingdom
| | - Natalie J. Stanford
- School of Computer Science, The University of Manchester, Manchester, United Kingdom
| | - Neil Swainston
- Manchester Centre for Synthetic Biology of Fine and Speciality Chemicals, University of Manchester, Manchester, United Kingdom
| | - Nicole Washington
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
| | - Alan R. Williams
- School of Computer Science, The University of Manchester, Manchester, United Kingdom
| | - Sarala M. Wimalaratne
- European Bioinformatics Institute, European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Lilly M. Winfree
- Department of Medical Informatics and Epidemiology and OHSU Library, Oregon Health & Science University, Portland, Oregon, United States of America
| | - Katherine Wolstencroft
- Leiden Institute of Advanced Computer Science, Leiden University, Leiden, the Netherlands
| | - Carole Goble
- School of Computer Science, The University of Manchester, Manchester, United Kingdom
| | - Christopher J. Mungall
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
| | - Melissa A. Haendel
- Department of Medical Informatics and Epidemiology and OHSU Library, Oregon Health & Science University, Portland, Oregon, United States of America
| | - Helen Parkinson
- European Bioinformatics Institute, European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
| |
|
30
|
De Sousa PA, Steeg R, Wachter E, Bruce K, King J, Hoeve M, Khadun S, McConnachie G, Holder J, Kurtz A, Seltmann S, Dewender J, Reimann S, Stacey G, O'Shea O, Chapman C, Healy L, Zimmermann H, Bolton B, Rawat T, Atkin I, Veiga A, Kuebler B, Serano BM, Saric T, Hescheler J, Brüstle O, Peitz M, Thiele C, Geijsen N, Holst B, Clausen C, Lako M, Armstrong L, Gupta SK, Kvist AJ, Hicks R, Jonebring A, Brolén G, Ebneth A, Cabrera-Socorro A, Foerch P, Geraerts M, Stummann TC, Harmon S, George C, Streeter I, Clarke L, Parkinson H, Harrison PW, Faulconbridge A, Cherubin L, Burdett T, Trigueros C, Patel MJ, Lucas C, Hardy B, Predan R, Dokler J, Brajnik M, Keminer O, Pless O, Gribbon P, Claussen C, Ringwald A, Kreisel B, Courtney A, Allsopp TE. Rapid establishment of the European Bank for induced Pluripotent Stem Cells (EBiSC) - the Hot Start experience. Stem Cell Res 2017; 20:105-114. [PMID: 28334554 DOI: 10.1016/j.scr.2017.03.002] [Citation(s) in RCA: 43] [Impact Index Per Article: 6.1] [Received: 11/17/2016] [Revised: 02/17/2017] [Accepted: 03/03/2017] [Indexed: 10/20/2022]
Abstract
A fast-track "Hot Start" process was implemented to launch the European Bank for Induced Pluripotent Stem Cells (EBiSC) to provide early release of a range of established control and disease-linked human induced pluripotent stem cell (hiPSC) lines. Established practice amongst consortium members was surveyed to arrive at harmonised and publicly accessible Standard Operating Procedures (SOPs) for tissue procurement, bio-sample tracking, iPSC expansion, cryopreservation, qualification and distribution to the research community. These were implemented to create a quality-managed foundational collection of lines and associated data made available for distribution. Here we report on the successful outcome of this experience and workflow for banking and facilitating access to an otherwise disparate European resource, with lessons to benefit the international research community. ETOC: The report focuses on the EBiSC experience of rapidly establishing an operational capacity to procure, bank and distribute a foundational collection of established hiPSC lines. It validates the feasibility and defines the challenges of harnessing and integrating the capability and productivity of centres across Europe using commonly available resources currently in the field.
Affiliation(s)
- Paul A De Sousa
- Centre for Clinical Brain Sciences, Chancellors Building, 49 Little France Crescent, University of Edinburgh, Edinburgh EH16 4SB, UK; Roslin Cells Ltd(1), Head office, Nine Edinburgh Bioquarter, 9 Little France Rd, Edinburgh EH16 4UX, UK; EBiSC banking facility, Babraham Research Campus, B260 Meditrina, Cambridge CB22 3AT, UK.
| | - Rachel Steeg
- Roslin Cells Ltd(1), Head office, Nine Edinburgh Bioquarter, 9 Little France Rd, Edinburgh EH16 4UX, UK; EBiSC banking facility, Babraham Research Campus, B260 Meditrina, Cambridge CB22 3AT, UK
| | - Elisabeth Wachter
- Roslin Cells Ltd(1), Head office, Nine Edinburgh Bioquarter, 9 Little France Rd, Edinburgh EH16 4UX, UK; EBiSC banking facility, Babraham Research Campus, B260 Meditrina, Cambridge CB22 3AT, UK
| | - Kevin Bruce
- Roslin Cells Ltd(1), Head office, Nine Edinburgh Bioquarter, 9 Little France Rd, Edinburgh EH16 4UX, UK; EBiSC banking facility, Babraham Research Campus, B260 Meditrina, Cambridge CB22 3AT, UK
| | - Jason King
- Roslin Cells Ltd(1), Head office, Nine Edinburgh Bioquarter, 9 Little France Rd, Edinburgh EH16 4UX, UK; EBiSC banking facility, Babraham Research Campus, B260 Meditrina, Cambridge CB22 3AT, UK
| | - Marieke Hoeve
- Roslin Cells Ltd(1), Head office, Nine Edinburgh Bioquarter, 9 Little France Rd, Edinburgh EH16 4UX, UK; EBiSC banking facility, Babraham Research Campus, B260 Meditrina, Cambridge CB22 3AT, UK
| | - Shalinee Khadun
- Roslin Cells Ltd(1), Head office, Nine Edinburgh Bioquarter, 9 Little France Rd, Edinburgh EH16 4UX, UK; EBiSC banking facility, Babraham Research Campus, B260 Meditrina, Cambridge CB22 3AT, UK
| | - George McConnachie
- Roslin Cells Ltd(1), Head office, Nine Edinburgh Bioquarter, 9 Little France Rd, Edinburgh EH16 4UX, UK; EBiSC banking facility, Babraham Research Campus, B260 Meditrina, Cambridge CB22 3AT, UK
| | - Julie Holder
- Roslin Cells Ltd(1), Head office, Nine Edinburgh Bioquarter, 9 Little France Rd, Edinburgh EH16 4UX, UK; EBiSC banking facility, Babraham Research Campus, B260 Meditrina, Cambridge CB22 3AT, UK
| | - Andreas Kurtz
- Charité - Universitätsmedizin Berlin, Berlin-Brandenburg Center for Regenerative Therapies, Augustenburger Platz, Berlin 13353, Germany
| | - Stefanie Seltmann
- Charité - Universitätsmedizin Berlin, Berlin-Brandenburg Center for Regenerative Therapies, Augustenburger Platz, Berlin 13353, Germany
| | - Johannes Dewender
- Charité - Universitätsmedizin Berlin, Berlin-Brandenburg Center for Regenerative Therapies, Augustenburger Platz, Berlin 13353, Germany
| | - Sascha Reimann
- Charité - Universitätsmedizin Berlin, Berlin-Brandenburg Center for Regenerative Therapies, Augustenburger Platz, Berlin 13353, Germany
| | - Glyn Stacey
- UK Stem Cell Bank, Division of Advanced Therapies, National Institute for Biological Standards and Control, Medicines and Healthcare Products Regulatory Authority, Blanche Lane, South Mimms, Hertfordshire, ENG 3GQ, UK
| | - Orla O'Shea
- UK Stem Cell Bank, Division of Advanced Therapies, National Institute for Biological Standards and Control, Medicines and Healthcare Products Regulatory Authority, Blanche Lane, South Mimms, Hertfordshire, ENG 3GQ, UK
| | - Charlotte Chapman
- UK Stem Cell Bank, Division of Advanced Therapies, National Institute for Biological Standards and Control, Medicines and Healthcare Products Regulatory Authority, Blanche Lane, South Mimms, Hertfordshire, ENG 3GQ, UK
| | - Lyn Healy
- UK Stem Cell Bank, Division of Advanced Therapies, National Institute for Biological Standards and Control, Medicines and Healthcare Products Regulatory Authority, Blanche Lane, South Mimms, Hertfordshire, ENG 3GQ, UK
| | - Heiko Zimmermann
- Fraunhofer Institute for Biomedical Engineering (IBMT), Josef-von-Fraunhofer-Weg 1, 66280 Sulzbach, Germany; Molecular & Cellular Biotechnology/Nanotechnology, Saarland University, Campus, 66123 Saarbrücken, Germany
| | - Bryan Bolton
- European Collection of Authenticated Cell Cultures, Public Health England, Porton Down, Salisbury SP4 0JG, UK
| | - Trisha Rawat
- European Collection of Authenticated Cell Cultures, Public Health England, Porton Down, Salisbury SP4 0JG, UK
| | - Isobel Atkin
- European Collection of Authenticated Cell Cultures, Public Health England, Porton Down, Salisbury SP4 0JG, UK
| | - Anna Veiga
- Barcelona Stem Cell Bank, Centre for Regenerative Medicine in Barcelona, C/Dr. Aiguader 88, 08003 Barcelona, Spain
| | - Bernd Kuebler
- Barcelona Stem Cell Bank, Centre for Regenerative Medicine in Barcelona, C/Dr. Aiguader 88, 08003 Barcelona, Spain
| | - Blanca Miranda Serano
- Andalusian Public Health Care System, Avda Conocimiento sn, 18100 Armilla, Granada, Spain
| | - Tomo Saric
- Centre for Physiology and Pathophysiology, Institute for Neurophysiology, Medical Faculty, University of Cologne, 50931 Cologne, Germany
| | - Jürgen Hescheler
- Centre for Physiology and Pathophysiology, Institute for Neurophysiology, Medical Faculty, University of Cologne, 50931 Cologne, Germany
| | - Oliver Brüstle
- Institute of Reconstructive Neurobiology, LIFE & BRAIN Centre, University of Bonn, Sigmund-Freud-Strasse 25, 53105 Bonn, Germany
| | - Michael Peitz
- Institute of Reconstructive Neurobiology, LIFE & BRAIN Centre, University of Bonn, Sigmund-Freud-Strasse 25, 53105 Bonn, Germany
| | - Cornelia Thiele
- Institute of Reconstructive Neurobiology, LIFE & BRAIN Centre, University of Bonn, Sigmund-Freud-Strasse 25, 53105 Bonn, Germany
| | - Niels Geijsen
- Hubrecht Institute for developmental biology and stem cell research, Royal Netherlands Academy of Arts and Sciences (KNAW), Utrecht University, Department of Clinical Sciences of Companion Animals and UMC Utrecht, 3584CT Utrecht, The Netherlands
| | - Bjørn Holst
- Bioneer A/S, Kogle Alle 2, DK-2970 Hørsholm, Denmark
| | | | - Majlinda Lako
- Institute for Genetic Medicine, University of Newcastle, Newcastle NE1 3BZ, United Kingdom
| | - Lyle Armstrong
- Institute for Genetic Medicine, University of Newcastle, Newcastle NE1 3BZ, United Kingdom
| | - Shailesh K Gupta
- AstraZeneca, R&D, Innovative Medicines, Discovery Sciences, Reagents and Assay Development, HC3006, Pepparedsleden 1, SE-431 83 Mölndal, Sweden
| | - Alexander J Kvist
- AstraZeneca, R&D, Innovative Medicines, Discovery Sciences, Reagents and Assay Development, HC3006, Pepparedsleden 1, SE-431 83 Mölndal, Sweden
| | - Ryan Hicks
- AstraZeneca, R&D, Innovative Medicines, Discovery Sciences, Reagents and Assay Development, HC3006, Pepparedsleden 1, SE-431 83 Mölndal, Sweden
| | - Anna Jonebring
- AstraZeneca, R&D, Innovative Medicines, Discovery Sciences, Reagents and Assay Development, HC3006, Pepparedsleden 1, SE-431 83 Mölndal, Sweden
| | - Gabriella Brolén
- AstraZeneca, R&D, Innovative Medicines, Discovery Sciences, Reagents and Assay Development, HC3006, Pepparedsleden 1, SE-431 83 Mölndal, Sweden
| | - Andreas Ebneth
- Janssen Research & Development (A Division of Janssen Pharmaceutica N.V), Neuroscience Therapeutic Area, Turnhoutseweg 30, 2340 Beerse, Belgium
| | - Alfredo Cabrera-Socorro
- Janssen Research & Development (A Division of Janssen Pharmaceutica N.V), Neuroscience Therapeutic Area, Turnhoutseweg 30, 2340 Beerse, Belgium
| | - Patrik Foerch
- UCB Biopharma (since May 2014), Discovery Research, Chemin du Foriest, Braine l'Alleud B-1420, Belgium
| | - Martine Geraerts
- UCB Biopharma (since May 2014), Discovery Research, Chemin du Foriest, Braine l'Alleud B-1420, Belgium
| | | | - Shawn Harmon
- University of Edinburgh School of Law, Old College, South Bridge, Edinburgh EH8 9YL, UK
| | - Carol George
- University of Edinburgh School of Law, Old College, South Bridge, Edinburgh EH8 9YL, UK
| | - Ian Streeter
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Laura Clarke
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Helen Parkinson
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Peter W Harrison
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Adam Faulconbridge
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Luca Cherubin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Tony Burdett
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Cesar Trigueros
- Inbiomed, P° Mikeletegi, 81, 20009 San Sebastián, Gipuzkoa, Spain
| | - Minal J Patel
- Cellular Generation and Phenotyping (CGaP) facility, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
| | - Christa Lucas
- Cellular Generation and Phenotyping (CGaP) facility, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
| | - Barry Hardy
- Douglas Connect, Technology Park Basel, Hochbergerstrasse 60C, 4057 Basel, Switzerland
| | - Rok Predan
- Douglas Connect, Technology Park Basel, Hochbergerstrasse 60C, 4057 Basel, Switzerland
| | - Joh Dokler
- Douglas Connect, Technology Park Basel, Hochbergerstrasse 60C, 4057 Basel, Switzerland
| | - Maja Brajnik
- Douglas Connect, Technology Park Basel, Hochbergerstrasse 60C, 4057 Basel, Switzerland
| | - Oliver Keminer
- Fraunhofer IME ScreeningPort, Schnackenburgallee 114, D-22525 Hamburg, Germany
| | - Ole Pless
- Fraunhofer IME ScreeningPort, Schnackenburgallee 114, D-22525 Hamburg, Germany
| | - Philip Gribbon
- Fraunhofer IME ScreeningPort, Schnackenburgallee 114, D-22525 Hamburg, Germany
| | - Carsten Claussen
- Fraunhofer IME ScreeningPort, Schnackenburgallee 114, D-22525 Hamburg, Germany
| | | | - Beate Kreisel
- ARTTIC, 58A rue du Dessous des Berges, F-75013 Paris, France
| | - Aidan Courtney
- Roslin Cells Ltd(1), Head office, Nine Edinburgh Bioquarter, 9 Little France Rd, Edinburgh EH16 4UX, UK; EBiSC banking facility, Babraham Research Campus, B260 Meditrina, Cambridge CB22 3AT, UK
| | - Timothy E Allsopp
- Pfizer Ltd (Neusentis), The Portway Building, Granta Park, Great Abington, Cambridge, CB21 6GS, UK
| |
|
31
|
MacArthur J, Bowler E, Cerezo M, Gil L, Hall P, Hastings E, Junkins H, McMahon A, Milano A, Morales J, Pendlington ZM, Welter D, Burdett T, Hindorff L, Flicek P, Cunningham F, Parkinson H. The new NHGRI-EBI Catalog of published genome-wide association studies (GWAS Catalog). Nucleic Acids Res 2016; 45:D896-D901. [PMID: 27899670 PMCID: PMC5210590 DOI: 10.1093/nar/gkw1133] [Citation(s) in RCA: 1388] [Impact Index Per Article: 173.5] [Received: 10/12/2016] [Accepted: 11/02/2016] [Indexed: 02/02/2023]
Abstract
The NHGRI-EBI GWAS Catalog has provided data from published genome-wide association studies since 2008. In 2015, the database was redesigned and relocated to EMBL-EBI. The new infrastructure includes a new graphical user interface (www.ebi.ac.uk/gwas/), ontology-supported search functionality and an improved curation interface. These developments have improved the data release frequency by increasing automation of curation and providing scaling improvements. The range of available Catalog data has also been extended with structured ancestry and recruitment information added for all studies. The infrastructure improvements also support scaling for larger arrays, exome and sequencing studies, allowing the Catalog to adapt to evolving study designs, genotyping technologies and user needs in the future.
Affiliation(s)
- Jacqueline MacArthur
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Emily Bowler
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Maria Cerezo
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Laurent Gil
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Peggy Hall
- Division of Genomic Medicine, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Emma Hastings
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Heather Junkins
- Division of Genomic Medicine, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Aoife McMahon
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Annalisa Milano
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Joannella Morales
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Zoe May Pendlington
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Danielle Welter
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Tony Burdett
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Lucia Hindorff
- Division of Genomic Medicine, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Fiona Cunningham
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Helen Parkinson
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| |
|
32
|
Koscielny G, An P, Carvalho-Silva D, Cham JA, Fumis L, Gasparyan R, Hasan S, Karamanis N, Maguire M, Papa E, Pierleoni A, Pignatelli M, Platt T, Rowland F, Wankar P, Bento AP, Burdett T, Fabregat A, Forbes S, Gaulton A, Gonzalez CY, Hermjakob H, Hersey A, Jupe S, Kafkas Ş, Keays M, Leroy C, Lopez FJ, Magarinos MP, Malone J, McEntyre J, Munoz-Pomer Fuentes A, O'Donovan C, Papatheodorou I, Parkinson H, Palka B, Paschall J, Petryszak R, Pratanwanich N, Sarntivijal S, Saunders G, Sidiropoulos K, Smith T, Sondka Z, Stegle O, Tang YA, Turner E, Vaughan B, Vrousgou O, Watkins X, Martin MJ, Sanseau P, Vamathevan J, Birney E, Barrett J, Dunham I. Open Targets: a platform for therapeutic target identification and validation. Nucleic Acids Res 2016; 45:D985-D994. [PMID: 27899665 PMCID: PMC5210543 DOI: 10.1093/nar/gkw1055] [Citation(s) in RCA: 267] [Impact Index Per Article: 33.4] [Received: 08/19/2016] [Revised: 10/19/2016] [Accepted: 11/03/2016] [Indexed: 01/16/2023]
Abstract
We have designed and developed a data integration and visualization platform that provides evidence about the association of known and potential drug targets with diseases. The platform is designed to support identification and prioritization of biological targets for follow-up. Each drug target is linked to a disease using integrated genome-wide data from a broad range of data sources. The platform provides either a target-centric workflow to identify diseases that may be associated with a specific target, or a disease-centric workflow to identify targets that may be associated with a specific disease. Users can easily transition between these target- and disease-centric workflows. The Open Targets Validation Platform is accessible at https://www.targetvalidation.org.
Affiliation(s)
- Gautier Koscielny
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,GSK, Medicines Research Center, Gunnels Wood Road, Stevenage, SG1 2NY, UK
| | - Peter An
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,Biogen, Cambridge, MA 02142, USA
| | - Denise Carvalho-Silva
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Jennifer A Cham
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Luca Fumis
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Rippa Gasparyan
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,Biogen, Cambridge, MA 02142, USA
| | - Samiul Hasan
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,GSK, Medicines Research Center, Gunnels Wood Road, Stevenage, SG1 2NY, UK
| | - Nikiforos Karamanis
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Michael Maguire
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Eliseo Papa
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,Biogen, Cambridge, MA 02142, USA
| | - Andrea Pierleoni
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Miguel Pignatelli
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Theo Platt
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,Biogen, Cambridge, MA 02142, USA
| | - Francis Rowland
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Priyanka Wankar
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,Biogen, Cambridge, MA 02142, USA
| | - A Patrícia Bento
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Tony Burdett
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Antonio Fabregat
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Simon Forbes
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
| | - Anna Gaulton
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Cristina Yenyxe Gonzalez
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Henning Hermjakob
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,National Center for Protein Research, No. 38, Life Science Park Road, Changping District, 102206 Beijing, China
| | - Anne Hersey
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Steven Jupe
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Şenay Kafkas
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Maria Keays
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Catherine Leroy
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Francisco-Javier Lopez
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Maria Paula Magarinos
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - James Malone
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Johanna McEntyre
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Alfonso Munoz-Pomer Fuentes
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Claire O'Donovan
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Irene Papatheodorou
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Helen Parkinson
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Barbara Palka
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Justin Paschall
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Robert Petryszak
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Naruemon Pratanwanich
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Sirarat Sarntivijal
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Gary Saunders
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Konstantinos Sidiropoulos
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Thomas Smith
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Zbyslaw Sondka
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
| | - Oliver Stegle
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Y Amy Tang
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Edward Turner
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Brendan Vaughan
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Olga Vrousgou
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Xavier Watkins
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Maria-Jesus Martin
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Philippe Sanseau
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,GSK, Medicines Research Center, Gunnels Wood Road, Stevenage, SG1 2NY, UK
| | - Jessica Vamathevan
- European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Ewan Birney
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Jeffrey Barrett
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
| | - Ian Dunham
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.,European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| |
|
33
|
Adebayo S, McLeod K, Tudose I, Osumi-Sutherland D, Burdett T, Baldock R, Burger A, Parkinson H. PhenoImageShare: an image annotation and query infrastructure. J Biomed Semantics 2016; 7:35. [PMID: 27267125 PMCID: PMC4896029 DOI: 10.1186/s13326-016-0072-2] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 10/21/2015] [Accepted: 05/05/2016] [Indexed: 01/12/2023]
Abstract
Background: High throughput imaging is now available to many groups and it is possible to generate a large quantity of high quality images quickly. Managing this data, consistently annotating it, or making it available to the community are all challenges that come with these methods. Results: PhenoImageShare provides an ontology-enabled lightweight image data query, annotation service and a single point of access backed by a Solr server for programmatic access to an integrated image collection enabling improved community access. PhenoImageShare also provides an easy to use online image annotation tool with functionality to draw regions of interest on images and to annotate them with terms from an autosuggest-enabled ontology-lookup widget. The provenance of each image, and annotation, is kept and links to original resources are provided. The semantic and intuitive search interface is species and imaging technology neutral. PhenoImageShare now provides access to annotation for over 100,000 images for 2 species. Conclusion: The PhenoImageShare platform provides underlying infrastructure for both programmatic access and user-facing tools for biologists enabling the query and annotation of federated images. PhenoImageShare is accessible online at http://www.phenoimageshare.org.
Affiliation(s)
- Solomon Adebayo
- MRC Human Genetics Unit, IGMM, University of Edinburgh, Crewe Road, Edinburgh, UK
| | - Kenneth McLeod
- Department of Computer Science, Heriot-Watt University, Edinburgh, UK
| | - Ilinca Tudose
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
| | - David Osumi-Sutherland
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
| | - Tony Burdett
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
| | - Richard Baldock
- MRC Human Genetics Unit, IGMM, University of Edinburgh, Crewe Road, Edinburgh, UK
| | - Albert Burger
- Department of Computer Science, Heriot-Watt University, Edinburgh, UK
| | - Helen Parkinson
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
| |
|
34
|
Jupp S, Malone J, Burdett T, Heriche JK, Williams E, Ellenberg J, Parkinson H, Rustici G. The cellular microscopy phenotype ontology. J Biomed Semantics 2016; 7:28. [PMID: 27195102 PMCID: PMC4870745 DOI: 10.1186/s13326-016-0074-0] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 12/16/2015] [Accepted: 05/10/2016] [Indexed: 11/17/2022]
Abstract
Background: Phenotypic data derived from high content screening is currently annotated using free-text, thus preventing the integration of independent datasets, including those generated in different biological domains, such as cell lines, mouse and human tissues. Description: We present the Cellular Microscopy Phenotype Ontology (CMPO), a species neutral ontology for describing phenotypic observations relating to the whole cell, cellular components, cellular processes and cell populations. CMPO is compatible with related ontology efforts, allowing for future cross-species integration of phenotypic data. CMPO was developed following a curator-driven approach where phenotype data were annotated by expert biologists following the Entity-Quality (EQ) pattern. These EQs were subsequently transformed into new CMPO terms following an established post composition process. Conclusion: CMPO is currently being utilized to annotate phenotypes associated with high content screening datasets stored in several image repositories including the Image Data Repository (IDR), MitoSys project database and the Cellular Phenotype Database to facilitate data browsing and discoverability.
Affiliation(s)
- Simon Jupp
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton Cambridge, CB10 1SD UK
| | - James Malone
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton Cambridge, CB10 1SD UK
| | - Tony Burdett
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton Cambridge, CB10 1SD UK
| | - Jean-Karim Heriche
- European Molecular Biology Laboratory, Meyerhofstrasse 1, 69117 Heidelberg, Germany
| | - Eleanor Williams
- Centre for Gene Regulation and Expression, University of Dundee, Dundee, DD1 5EH UK
| | - Jan Ellenberg
- European Molecular Biology Laboratory, Meyerhofstrasse 1, 69117 Heidelberg, Germany
| | - Helen Parkinson
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton Cambridge, CB10 1SD UK
| | - Gabriella Rustici
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton Cambridge, CB10 1SD UK
| |
|
35
|
Jupp S, Burdett T, Welter D, Sarntivijai S, Parkinson H, Malone J. Webulous and the Webulous Google Add-On--a web service and application for ontology building from templates. J Biomed Semantics 2016; 7:17. [PMID: 27042287 PMCID: PMC4818523 DOI: 10.1186/s13326-016-0055-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 10/20/2015] [Accepted: 03/11/2016] [Indexed: 11/20/2022]
Abstract
BACKGROUND: Authoring bio-ontologies is a task that has traditionally been undertaken by skilled experts trained in understanding complex languages such as the Web Ontology Language (OWL), in tools designed for such experts. As requests for new terms are made, the need for expert ontologists represents a bottleneck in the development process. Furthermore, the ability to rigorously enforce ontology design patterns in large, collaboratively developed ontologies is difficult with existing ontology authoring software. DESCRIPTION: We present Webulous, an application suite for supporting ontology creation by design patterns. Webulous provides infrastructure to specify templates for populating ontology design patterns that get transformed into OWL assertions in a target ontology. Webulous provides programmatic access to the template server and a client application has been developed for Google Sheets that allows templates to be loaded, populated and resubmitted to the Webulous server for processing. CONCLUSIONS: The development and delivery of ontologies to the community requires software support that goes beyond the ontology editor. Building ontologies by design patterns and providing simple mechanisms for the addition of new content helps reduce the overall cost and effort required to develop an ontology. The Webulous system provides support for this process and is used as part of the development of several ontologies at the European Bioinformatics Institute.
Affiliation(s)
- Simon Jupp
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| | - Tony Burdett
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| | - Danielle Welter
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| | - Sirarat Sarntivijai
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| | - Helen Parkinson
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| | - James Malone
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| |
|
36
|
Petryszak R, Keays M, Tang YA, Fonseca NA, Barrera E, Burdett T, Füllgrabe A, Fuentes AMP, Jupp S, Koskinen S, Mannion O, Huerta L, Megy K, Snow C, Williams E, Barzine M, Hastings E, Weisser H, Wright J, Jaiswal P, Huber W, Choudhary J, Parkinson HE, Brazma A. Expression Atlas update--an integrated database of gene and protein expression in humans, animals and plants. Nucleic Acids Res 2016; 44:D746-52. [PMID: 26481351 PMCID: PMC4702781 DOI: 10.1093/nar/gkv1045] [Citation(s) in RCA: 396] [Impact Index Per Article: 49.5] [Received: 09/04/2015] [Revised: 09/25/2015] [Accepted: 09/29/2015] [Indexed: 11/12/2022]
Abstract
Expression Atlas (http://www.ebi.ac.uk/gxa) provides information about gene and protein expression in animal and plant samples of different cell types, organism parts, developmental stages, diseases and other conditions. It consists of selected microarray and RNA-sequencing studies from ArrayExpress, which have been manually curated, annotated with ontology terms, checked for high quality and processed using standardised analysis methods. Since the last update, Atlas has grown seven-fold (1572 studies as of August 2015), and incorporates baseline expression profiles of tissues from Human Protein Atlas, GTEx and FANTOM5, and of cancer cell lines from ENCODE, CCLE and Genentech projects. Plant studies constitute a quarter of Atlas data. For genes of interest, the user can view baseline expression in tissues, and differential expression for biologically meaningful pairwise comparisons, estimated using consistent methodology across all of Atlas. Our first proteomics study in human tissues is now displayed alongside transcriptomics data in the same tissues. Novel analyses and visualisations include: 'enrichment' in each differential comparison of GO terms, Reactome, Plant Reactome pathways and InterPro domains; hierarchical clustering (by baseline expression) of most variable genes and experimental conditions; and, for a given gene-condition, distribution of baseline expression across biological replicates.
Affiliation(s)
- Robert Petryszak
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton, UK
| | - Maria Keays
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton, UK
| | - Y Amy Tang
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton, UK
| | - Nuno A Fonseca
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton, UK
| | - Elisabet Barrera
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton, UK
| | - Tony Burdett
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton, UK
| | - Anja Füllgrabe
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton, UK
| | - Simon Jupp
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton, UK
| | - Satu Koskinen
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton, UK
| | - Oliver Mannion
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton, UK
| | - Laura Huerta
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton, UK
| | - Karine Megy
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton, UK
| | - Catherine Snow
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton, UK
| | - Eleanor Williams
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton, UK
| | - Mitra Barzine
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton, UK
| | - Emma Hastings
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton, UK
| | - Wolfgang Huber
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton, UK
| | - Helen E Parkinson
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton, UK
| | - Alvis Brazma
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton, UK
| |
|
37
|
Kolesnikov N, Hastings E, Keays M, Melnichuk O, Tang YA, Williams E, Dylag M, Kurbatova N, Brandizi M, Burdett T, Megy K, Pilicheva E, Rustici G, Tikhonov A, Parkinson H, Petryszak R, Sarkans U, Brazma A. ArrayExpress update--simplifying data submissions. Nucleic Acids Res 2014; 43:D1113-6. [PMID: 25361974 PMCID: PMC4383899 DOI: 10.1093/nar/gku1057] [Citation(s) in RCA: 499] [Impact Index Per Article: 49.9] [Indexed: 11/13/2022]
Abstract
The ArrayExpress Archive of Functional Genomics Data (http://www.ebi.ac.uk/arrayexpress) is an international functional genomics database at the European Bioinformatics Institute (EMBL-EBI) recommended by most journals as a repository for data supporting peer-reviewed publications. It contains data from over 7000 public sequencing and 42 000 array-based studies comprising over 1.5 million assays in total. The proportion of sequencing-based submissions has grown significantly over the last few years and has doubled in the last 18 months, whilst the rate of microarray submissions is growing slightly. All data in ArrayExpress are available in the MAGE-TAB format, which allows robust linking to data analysis and visualization tools and standardized analysis. The main development over the last two years has been the release of a new data submission tool Annotare, which has reduced the average submission time almost 3-fold. In the near future, Annotare will become the only submission route into ArrayExpress, alongside MAGE-TAB format-based pipelines. ArrayExpress is a stable and highly accessed resource. Our future tasks include automation of data flows and further integration with other EMBL-EBI resources for the representation of multi-omics data.
Affiliation(s)
- Nikolay Kolesnikov
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
| | - Emma Hastings
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
| | - Maria Keays
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
| | - Olga Melnichuk
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
| | - Y Amy Tang
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
| | - Eleanor Williams
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
| | - Miroslaw Dylag
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
| | - Natalja Kurbatova
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
| | - Marco Brandizi
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
| | - Tony Burdett
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
| | - Karyn Megy
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
| | - Ekaterina Pilicheva
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
| | - Gabriella Rustici
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK.,School of Biological Sciences, Cambridge Systems Biology Centre, Tennis Court Road, Cambridge, CB2 1QR, UK
| | - Andrew Tikhonov
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
| | - Helen Parkinson
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
| | - Robert Petryszak
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
| | - Ugis Sarkans
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
| | - Alvis Brazma
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
| |
|
38
|
Welter D, MacArthur J, Morales J, Burdett T, Hall P, Junkins H, Klemm A, Flicek P, Manolio T, Hindorff L, Parkinson H. The NHGRI GWAS Catalog, a curated resource of SNP-trait associations. Nucleic Acids Res 2014; 42:D1001-6. [PMID: 24316577 PMCID: PMC3965119 DOI: 10.1093/nar/gkt1229] [Citation(s) in RCA: 1981] [Impact Index Per Article: 198.1] [Received: 08/15/2013] [Revised: 11/06/2013] [Accepted: 11/07/2013] [Indexed: 12/15/2022]
Abstract
The National Human Genome Research Institute (NHGRI) Catalog of Published Genome-Wide Association Studies (GWAS Catalog) provides a publicly available, manually curated collection of published GWAS assaying at least 100,000 single-nucleotide polymorphisms (SNPs) and all SNP-trait associations with P < 1 × 10⁻⁵. The Catalog includes 1751 curated publications of 11 912 SNPs. In addition to the SNP-trait association data, the Catalog also publishes a quarterly diagram of all SNP-trait associations mapped to the SNPs' chromosomal locations. The Catalog can be accessed via a tabular web interface, via a dynamic visualization on the human karyotype, as a downloadable tab-delimited file and as an OWL knowledge base. This article presents a number of recent improvements to the Catalog, including novel ways for users to interact with the Catalog and changes to the curation infrastructure.
Affiliation(s)
- Danielle Welter
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK, Division of Genomic Medicine, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA and Division of Policy, Communication and Education, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Jacqueline MacArthur
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK, Division of Genomic Medicine, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA and Division of Policy, Communication and Education, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Joannella Morales
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK, Division of Genomic Medicine, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA and Division of Policy, Communication and Education, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Tony Burdett, Peggy Hall, Heather Junkins, Alan Klemm, Paul Flicek, Teri Manolio, Lucia Hindorff, Helen Parkinson
- Same affiliation as listed above
39
Welter D, MacArthur J, Morales J, Burdett T, Hall P, Junkins H, Klemm A, Flicek P, Manolio T, Hindorff L, Parkinson H. The NHGRI GWAS Catalog, a curated resource of SNP-trait associations. Nucleic Acids Res 2013. [PMID: 24316577 DOI: 10.1093/nar/gkt1229] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/13/2022] Open
Abstract
The National Human Genome Research Institute (NHGRI) Catalog of Published Genome-Wide Association Studies (GWAS Catalog) provides a publicly available, manually curated collection of published GWAS assaying at least 100,000 single-nucleotide polymorphisms (SNPs) and all SNP-trait associations with P < 1 × 10^-5. The Catalog includes 1751 curated publications of 11 912 SNPs. In addition to the SNP-trait association data, the Catalog also publishes a quarterly diagram of all SNP-trait associations mapped to the SNPs' chromosomal locations. The Catalog can be accessed via a tabular web interface, via a dynamic visualization on the human karyotype, as a downloadable tab-delimited file and as an OWL knowledge base. This article presents a number of recent improvements to the Catalog, including novel ways for users to interact with the Catalog and changes to the curation infrastructure.
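The two inclusion thresholds quoted in this abstract (at least 100,000 SNPs assayed, and P < 1 × 10^-5) can be illustrated with a minimal sketch that filters a tab-delimited table. The column names `SNPS_ASSAYED` and `P_VALUE` and the sample rows are hypothetical and are not taken from the real Catalog download; only the two threshold values come from the abstract.

```python
import csv
import io

# Thresholds quoted in the abstract above.
MIN_SNPS_ASSAYED = 100_000
P_VALUE_CUTOFF = 1e-5


def eligible_associations(tsv_text):
    """Return the rows that meet both inclusion thresholds.

    The column names SNPS_ASSAYED and P_VALUE are hypothetical; they are
    not the headers of the real Catalog file.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        row
        for row in reader
        if int(row["SNPS_ASSAYED"]) >= MIN_SNPS_ASSAYED
        and float(row["P_VALUE"]) < P_VALUE_CUTOFF
    ]


# Made-up rows, for illustration only.
SAMPLE = (
    "SNP\tSNPS_ASSAYED\tP_VALUE\n"
    "rs_example_1\t500000\t3e-8\n"
    "rs_example_2\t500000\t2e-4\n"
    "rs_example_3\t50000\t1e-9\n"
)

kept = [row["SNP"] for row in eligible_associations(SAMPLE)]
```

In this made-up sample only the first row passes: the second fails the P-value cutoff and the third comes from a study assaying too few SNPs.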
Affiliation(s)
- Danielle Welter
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK, Division of Genomic Medicine, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA and Division of Policy, Communication and Education, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
40
Petryszak R, Burdett T, Fiorelli B, Fonseca NA, Gonzalez-Porta M, Hastings E, Huber W, Jupp S, Keays M, Kryvych N, McMurry J, Marioni JC, Malone J, Megy K, Rustici G, Tang AY, Taubert J, Williams E, Mannion O, Parkinson HE, Brazma A. Expression Atlas update--a database of gene and transcript expression from microarray- and sequencing-based functional genomics experiments. Nucleic Acids Res 2013; 42:D926-32. [PMID: 24304889 PMCID: PMC3964963 DOI: 10.1093/nar/gkt1270] [Citation(s) in RCA: 251] [Impact Index Per Article: 22.8] [Indexed: 01/16/2023] Open
Abstract
Expression Atlas (http://www.ebi.ac.uk/gxa) is a value-added database providing information about gene, protein and splice variant expression in different cell types, organism parts, developmental stages, diseases and other biological and experimental conditions. The database consists of selected high-quality microarray and RNA-sequencing experiments from ArrayExpress that have been manually curated, annotated with Experimental Factor Ontology terms and processed using standardized microarray and RNA-sequencing analysis methods. The new version of Expression Atlas introduces the concept of 'baseline' expression, i.e. gene and splice variant abundance levels in healthy or untreated conditions, such as tissues or cell types. Differential gene expression data benefit from an in-depth curation of experimental intent, resulting in biologically meaningful 'contrasts', i.e. instances of differential pairwise comparisons between two sets of biological replicates. Other novel aspects of Expression Atlas are its strict quality control of raw experimental data, up-to-date RNA-sequencing analysis methods, expression data at the level of gene sets as well as genes, and a more powerful search interface designed to maximize the biological value provided to the user.
Affiliation(s)
- Robert Petryszak
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton, CB10 1SD, UK
41
Faulconbridge A, Burdett T, Brandizi M, Gostev M, Pereira R, Vasant D, Sarkans U, Brazma A, Parkinson H. Updates to BioSamples database at European Bioinformatics Institute. Nucleic Acids Res 2013; 42:D50-2. [PMID: 24265224 PMCID: PMC3965081 DOI: 10.1093/nar/gkt1081] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Indexed: 01/08/2023] Open
Abstract
The BioSamples database at the EBI (http://www.ebi.ac.uk/biosamples) provides an integration point for BioSamples information between technology-specific databases at the EBI, projects such as ENCODE and reference collections such as cell lines. The database delivers a unified query interface and API to query sample information across EBI's databases and provides links back to assay databases. Sample groups are used to manage related samples, e.g. those from an experimental submission, or a single reference collection. Infrastructural improvements include a new user interface with ontological and keyword queries, a new query API, a new data submission API, complete RDF data download and a supporting SPARQL endpoint, accessioning at the point of submission to the European Nucleotide Archive and European Genotype Phenotype Archives, and improved query response times.
Affiliation(s)
- Adam Faulconbridge
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
42
Rustici G, Kolesnikov N, Brandizi M, Burdett T, Dylag M, Emam I, Farne A, Hastings E, Ison J, Keays M, Kurbatova N, Malone J, Mani R, Mupo A, Pedro Pereira R, Pilicheva E, Rung J, Sharma A, Tang YA, Ternent T, Tikhonov A, Welter D, Williams E, Brazma A, Parkinson H, Sarkans U. ArrayExpress update--trends in database growth and links to data analysis tools. Nucleic Acids Res 2012. [PMID: 23193272 PMCID: PMC3531147 DOI: 10.1093/nar/gks1174] [Citation(s) in RCA: 299] [Impact Index Per Article: 24.9] [Indexed: 12/16/2022] Open
Abstract
The ArrayExpress Archive of Functional Genomics Data (http://www.ebi.ac.uk/arrayexpress) is one of three international functional genomics public data repositories, alongside the Gene Expression Omnibus at NCBI and the DDBJ Omics Archive, supporting peer-reviewed publications. It accepts data generated by sequencing or array-based technologies and currently contains data from almost a million assays, from over 30 000 experiments. The proportion of sequencing-based submissions has grown significantly over the last 2 years and has reached, in 2012, 15% of all new data. All data are available from ArrayExpress in MAGE-TAB format, which allows robust linking to data analysis and visualization tools, including Bioconductor and GenomeSpace. Additionally, R objects, for microarray data, and binary alignment format files, for sequencing data, have been generated for a significant proportion of ArrayExpress data.
Affiliation(s)
- Gabriella Rustici
- Functional Genomics Team, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK.
43
Abstract
Motivation: Meta-analysis of large gene expression datasets obtained from public repositories requires consistently annotated data. Curation of such experiments, however, is an expert activity which involves repetitive manipulation of text. Few tools exist for automated curation, and this scarcity creates a bottleneck in the analysis pipeline. Results: We present MageComet, a web application for biologists and annotators that facilitates the re-annotation of gene expression experiments in MAGE-TAB format. It incorporates data mining, automatic annotation, use of ontologies and data validation to improve the consistency and quality of experimental meta-data from the ArrayExpress Repository. Availability and implementation: Source and tutorials for MageComet are openly available at goo.gl/8LQPR under the GNU GPL v3 license. An implementation can be found at goo.gl/IdCuA. Contact: parkinson@ebi.ac.uk or xue.vin@gmail.com
Affiliation(s)
- Vincent Xue
- Department of Computer Science, Hunter College, City University of New York, NY 10065, USA.
44
Chen X, Burdett T, Desjardins C, Xu Y, Schwarzschild M. 3.245 EFFECTS OF URATE OXIDASE TRANSGENE OR KNOCKOUT IN A MOUSE MODEL OF PARKINSON'S DISEASE. Parkinsonism Relat Disord 2012. [DOI: 10.1016/s1353-8020(11)70917-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]
45
Kapushesky M, Adamusiak T, Burdett T, Culhane A, Farne A, Filippov A, Holloway E, Klebanov A, Kryvych N, Kurbatova N, Kurnosov P, Malone J, Melnichuk O, Petryszak R, Pultsin N, Rustici G, Tikhonov A, Travillian RS, Williams E, Zorin A, Parkinson H, Brazma A. Gene Expression Atlas update--a value-added database of microarray and sequencing-based functional genomics experiments. Nucleic Acids Res 2011; 40:D1077-81. [PMID: 22064864 PMCID: PMC3245177 DOI: 10.1093/nar/gkr913] [Citation(s) in RCA: 124] [Impact Index Per Article: 9.5] [Indexed: 12/16/2022] Open
Abstract
Gene Expression Atlas (http://www.ebi.ac.uk/gxa) is an added-value database providing information about gene expression in different cell types, organism parts, developmental stages, disease states, sample treatments and other biological/experimental conditions. The content of this database derives from curation, re-annotation and statistical analysis of selected data from the ArrayExpress Archive and the European Nucleotide Archive. A simple interface allows the user to query for differential gene expression either by gene names or attributes or by biological conditions, e.g. diseases, organism parts or cell types. Since our previous report we have made 20 monthly releases and, as of Release 11.08 (August 2011), the database supports 19 species and contains expression data measured for 19 014 biological conditions in 136 551 assays from 5598 independent studies.
Affiliation(s)
- Misha Kapushesky
- European Bioinformatics Institute, EMBL, Hinxton, UK and Dana-Farber Cancer Institute, Boston, MA, USA
46
Travillian RS, Adamusiak T, Burdett T, Gruenberger M, Hancock J, Mallon AM, Malone J, Schofield P, Parkinson H. Anatomy ontologies and potential users: bridging the gap. J Biomed Semantics 2011; 2 Suppl 4:S3. [PMID: 21995944 PMCID: PMC3194170 DOI: 10.1186/2041-1480-2-s4-s3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 01/07/2023] Open
Abstract
Motivation: To evaluate how well current anatomical ontologies fit the way real-world users apply anatomy terms in their data annotations. Methods: Annotations from three diverse multi-species public-domain datasets provided a set of use cases for matching anatomical terms in two major anatomical ontologies (the Foundational Model of Anatomy and Uberon), using two lexical-matching applications (Zooma and Ontology Mapper). Results: Approximately 1500 terms were identified; Uberon/Zooma mappings provided 286 matches, compared to the control and Ontology Mapper returned 319 matches. For the Foundational Model of Anatomy, Zooma returned 312 matches, and Ontology Mapper returned 397. Conclusions: Our results indicate that for our datasets the anatomical entities or concepts are embedded in user-generated complex terms, and while lexical mapping works, anatomy ontologies do not provide the majority of terms users supply when annotating data. Provision of searchable cross-products for compositional terms is a key requirement for using ontologies.
47
Adamusiak T, Burdett T, Kurbatova N, Joeri van der Velde K, Abeygunawardena N, Antonakaki D, Kapushesky M, Parkinson H, Swertz MA. OntoCAT--simple ontology search and integration in Java, R and REST/JavaScript. BMC Bioinformatics 2011; 12:218. [PMID: 21619703 PMCID: PMC3129328 DOI: 10.1186/1471-2105-12-218] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Received: 10/04/2010] [Accepted: 05/29/2011] [Indexed: 11/10/2022] Open
Abstract
Background: Ontologies have become an essential asset in the bioinformatics toolbox and a number of ontology access resources are now available, for example, the EBI Ontology Lookup Service (OLS) and the NCBO BioPortal. However, these resources differ substantially in mode, ease of access, and ontology content. This makes it relatively difficult to access each ontology source separately and map their contents to research data, and much of this effort is being replicated across different research groups. Results: OntoCAT provides a seamless programming interface to query heterogeneous ontology resources including OLS and BioPortal, as well as user-specified local OWL and OBO files. Each resource is wrapped behind easy-to-learn Java, Bioconductor/R and REST web service commands enabling reuse and integration of ontology software efforts despite variation in technologies. It is also available as a stand-alone MOLGENIS database and a Google App Engine application. Conclusions: OntoCAT provides a robust, configurable solution for accessing ontology terms specified locally and from remote services, is available as a stand-alone tool and has been tested thoroughly in the ArrayExpress, MOLGENIS, EFO and Gen2Phen phenotype use cases. Availability: http://www.ontocat.org
Affiliation(s)
- Tomasz Adamusiak
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, CB10 1SD, UK.
48
Parkinson H, Sarkans U, Kolesnikov N, Abeygunawardena N, Burdett T, Dylag M, Emam I, Farne A, Hastings E, Holloway E, Kurbatova N, Lukk M, Malone J, Mani R, Pilicheva E, Rustici G, Sharma A, Williams E, Adamusiak T, Brandizi M, Sklyar N, Brazma A. ArrayExpress update--an archive of microarray and high-throughput sequencing-based functional genomics experiments. Nucleic Acids Res 2010; 39:D1002-4. [PMID: 21071405 PMCID: PMC3013660 DOI: 10.1093/nar/gkq1040] [Citation(s) in RCA: 271] [Impact Index Per Article: 19.4] [Indexed: 12/25/2022] Open
Abstract
The ArrayExpress Archive (http://www.ebi.ac.uk/arrayexpress) is one of the three international public repositories of functional genomics data supporting publications. It includes data generated by sequencing or array-based technologies. Data are submitted by users and imported directly from the NCBI Gene Expression Omnibus. The ArrayExpress Archive is closely integrated with the Gene Expression Atlas and the sequence databases at the European Bioinformatics Institute. Advanced queries provided via ontology-enabled interfaces include queries based on technology and sample attributes such as disease, cell types and anatomy.
Affiliation(s)
- Helen Parkinson
- Functional Genomics Team, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK.
49
Shankar R, Parkinson H, Burdett T, Hastings E, Liu J, Miller M, Srinivasa R, White J, Brazma A, Sherlock G, Stoeckert CJ, Ball CA. Annotare--a tool for annotating high-throughput biomedical investigations and resulting data. Bioinformatics 2010; 26:2470-1. [PMID: 20733062 PMCID: PMC2944206 DOI: 10.1093/bioinformatics/btq462] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 11/14/2022] Open
Abstract
Summary: Computational methods in molecular biology will increasingly depend on standards-based annotations that describe biological experiments in an unambiguous manner. Annotare is a software tool that enables biologists to easily annotate their high-throughput experiments, biomaterials and data in a standards-compliant way that facilitates meaningful search and analysis. Availability and Implementation: Annotare is available from http://code.google.com/p/annotare/ under the terms of the open-source MIT License (http://www.opensource.org/licenses/mit-license.php). It has been tested on both Mac and Windows. Contact: rshankar@stanford.edu
Affiliation(s)
- Ravi Shankar
- Department of Biochemistry, Stanford University School of Medicine, Stanford, CA 94305-5307, USA.
50
Parkinson H, Kapushesky M, Kolesnikov N, Rustici G, Shojatalab M, Abeygunawardena N, Berube H, Dylag M, Emam I, Farne A, Holloway E, Lukk M, Malone J, Mani R, Pilicheva E, Rayner TF, Rezwan F, Sharma A, Williams E, Bradley XZ, Adamusiak T, Brandizi M, Burdett T, Coulson R, Krestyaninova M, Kurnosov P, Maguire E, Neogi SG, Rocca-Serra P, Sansone SA, Sklyar N, Zhao M, Sarkans U, Brazma A. ArrayExpress update--from an archive of functional genomics experiments to the atlas of gene expression. Nucleic Acids Res 2009; 37:D868-72. [PMID: 19015125 PMCID: PMC2686529 DOI: 10.1093/nar/gkn889] [Citation(s) in RCA: 346] [Impact Index Per Article: 23.1] [Received: 09/12/2008] [Revised: 10/17/2008] [Accepted: 10/20/2008] [Indexed: 11/13/2022] Open
Abstract
ArrayExpress (http://www.ebi.ac.uk/arrayexpress) consists of three components: the ArrayExpress Repository--a public archive of functional genomics experiments and supporting data, the ArrayExpress Warehouse--a database of gene expression profiles and other bio-measurements and the ArrayExpress Atlas--a new summary database and meta-analytical tool of ranked gene expression across multiple experiments and different biological conditions. The Repository contains data from over 6000 experiments comprising approximately 200,000 assays, and the database doubles in size every 15 months. The majority of the data are array based, but other data types are included, most recently ultra-high-throughput sequencing transcriptomics and epigenetic data. The Warehouse and Atlas allow users to query for differentially expressed genes by gene names and properties, experimental conditions and sample properties, or a combination of both. In this update, we describe the ArrayExpress developments over the last two years.
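As a rough illustration of what "doubles in size every 15 months" implies, the sketch below applies the standard doubling formula N(t) = N0 · 2^(t/T) with T = 15 months. It assumes the doubling rate stays constant, which the abstract reports as an observation and not as a forecast; the function name `projected_size` is made up for this example.

```python
def projected_size(current, months_ahead, doubling_months=15.0):
    """Project a count forward under a constant doubling time.

    Implements N(t) = N0 * 2 ** (t / T). The constant-rate assumption is
    ours, not a claim made in the abstract.
    """
    return current * 2 ** (months_ahead / doubling_months)


# Starting from the roughly 6000 experiments quoted in the abstract:
after_15_months = projected_size(6000, 15)  # one doubling period
after_30_months = projected_size(6000, 30)  # two doubling periods
```

Under that assumption, one doubling period takes 6000 experiments to 12 000 and two periods take it to 24 000.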
Affiliation(s)
- Helen Parkinson
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK, Institute for Information Technology, National Research Council Canada, Ottawa, Ontario, Canada, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, BioComputation Laboratory, University of Hertfordshire, Hatfield and Wellcome Trust Sanger Institute, Hinxton, UK
| | - Misha Kapushesky
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK, Institute for Information Technology, National Research Council Canada, Ottawa, Ontario, Canada, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, BioComputation Laboratory, University of Hertfordshire, Hatfield and Wellcome Trust Sanger Institute, Hinxton, UK
| | - Nikolay Kolesnikov
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK, Institute for Information Technology, National Research Council Canada, Ottawa, Ontario, Canada, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, BioComputation Laboratory, University of Hertfordshire, Hatfield and Wellcome Trust Sanger Institute, Hinxton, UK
| | - Gabriella Rustici
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK, Institute for Information Technology, National Research Council Canada, Ottawa, Ontario, Canada, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, BioComputation Laboratory, University of Hertfordshire, Hatfield and Wellcome Trust Sanger Institute, Hinxton, UK
| | - Mohammad Shojatalab
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK, Institute for Information Technology, National Research Council Canada, Ottawa, Ontario, Canada, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, BioComputation Laboratory, University of Hertfordshire, Hatfield and Wellcome Trust Sanger Institute, Hinxton, UK
| | - Niran Abeygunawardena
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK, Institute for Information Technology, National Research Council Canada, Ottawa, Ontario, Canada, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, BioComputation Laboratory, University of Hertfordshire, Hatfield and Wellcome Trust Sanger Institute, Hinxton, UK
| | - Hugo Berube
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK, Institute for Information Technology, National Research Council Canada, Ottawa, Ontario, Canada, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, BioComputation Laboratory, University of Hertfordshire, Hatfield and Wellcome Trust Sanger Institute, Hinxton, UK
| | - Miroslaw Dylag
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK, Institute for Information Technology, National Research Council Canada, Ottawa, Ontario, Canada, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, BioComputation Laboratory, University of Hertfordshire, Hatfield and Wellcome Trust Sanger Institute, Hinxton, UK
| | - Ibrahim Emam
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK, Institute for Information Technology, National Research Council Canada, Ottawa, Ontario, Canada, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, BioComputation Laboratory, University of Hertfordshire, Hatfield and Wellcome Trust Sanger Institute, Hinxton, UK
| | - Anna Farne
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK, Institute for Information Technology, National Research Council Canada, Ottawa, Ontario, Canada, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, BioComputation Laboratory, University of Hertfordshire, Hatfield and Wellcome Trust Sanger Institute, Hinxton, UK
| | - Ele Holloway
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK, Institute for Information Technology, National Research Council Canada, Ottawa, Ontario, Canada, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, BioComputation Laboratory, University of Hertfordshire, Hatfield and Wellcome Trust Sanger Institute, Hinxton, UK
| | - Margus Lukk
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK, Institute for Information Technology, National Research Council Canada, Ottawa, Ontario, Canada, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, BioComputation Laboratory, University of Hertfordshire, Hatfield and Wellcome Trust Sanger Institute, Hinxton, UK
| | - James Malone
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK, Institute for Information Technology, National Research Council Canada, Ottawa, Ontario, Canada, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, BioComputation Laboratory, University of Hertfordshire, Hatfield and Wellcome Trust Sanger Institute, Hinxton, UK
| | - Roby Mani
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK, Institute for Information Technology, National Research Council Canada, Ottawa, Ontario, Canada, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, BioComputation Laboratory, University of Hertfordshire, Hatfield and Wellcome Trust Sanger Institute, Hinxton, UK
| | - Ekaterina Pilicheva
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK, Institute for Information Technology, National Research Council Canada, Ottawa, Ontario, Canada, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, BioComputation Laboratory, University of Hertfordshire, Hatfield and Wellcome Trust Sanger Institute, Hinxton, UK
| | - Tim F. Rayner
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK, Institute for Information Technology, National Research Council Canada, Ottawa, Ontario, Canada, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, BioComputation Laboratory, University of Hertfordshire, Hatfield and Wellcome Trust Sanger Institute, Hinxton, UK
| | - Faisal Rezwan
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK, Institute for Information Technology, National Research Council Canada, Ottawa, Ontario, Canada, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, BioComputation Laboratory, University of Hertfordshire, Hatfield and Wellcome Trust Sanger Institute, Hinxton, UK
| | - Anjan Sharma
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK, Institute for Information Technology, National Research Council Canada, Ottawa, Ontario, Canada, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, BioComputation Laboratory, University of Hertfordshire, Hatfield and Wellcome Trust Sanger Institute, Hinxton, UK
| | - Eleanor Williams
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK, Institute for Information Technology, National Research Council Canada, Ottawa, Ontario, Canada, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, BioComputation Laboratory, University of Hertfordshire, Hatfield and Wellcome Trust Sanger Institute, Hinxton, UK
| | - Xiangqun Zheng Bradley
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK, Institute for Information Technology, National Research Council Canada, Ottawa, Ontario, Canada, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, BioComputation Laboratory, University of Hertfordshire, Hatfield and Wellcome Trust Sanger Institute, Hinxton, UK
| | - Tomasz Adamusiak
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK, Institute for Information Technology, National Research Council Canada, Ottawa, Ontario, Canada, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, BioComputation Laboratory, University of Hertfordshire, Hatfield and Wellcome Trust Sanger Institute, Hinxton, UK
| | - Marco Brandizi
| | - Tony Burdett
| | - Richard Coulson
| | - Maria Krestyaninova
| | - Pavel Kurnosov
| | - Eamonn Maguire
| | - Sudeshna Guha Neogi
| | - Philippe Rocca-Serra
| | - Susanna-Assunta Sansone
| | - Nataliya Sklyar
| | - Mengyao Zhao
| | - Ugis Sarkans
| | - Alvis Brazma
|