1
Saha S, Chatzimichali EA, Matthews DA, Bessant C. PITDB: a database of translated genomic elements. Nucleic Acids Res 2019; 46:D1223-D1228. [PMID: 30053269 PMCID: PMC5753392 DOI: 10.1093/nar/gkx906] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 08/15/2016] [Accepted: 09/28/2017] [Indexed: 12/02/2022] Open
Abstract
PITDB is a freely available database of translated genomic elements (TGEs) that have been observed in PIT (proteomics informed by transcriptomics) experiments. In PIT, a sample is analyzed using both RNA-seq transcriptomics and proteomic mass spectrometry. Transcripts assembled from RNA-seq reads are used to create a library of sample-specific amino acid sequences against which the acquired mass spectra are searched, permitting detection of any TGE, not just those in canonical proteome databases. At the time of writing, PITDB contains over 74 000 distinct TGEs from four species, supported by more than 600 000 peptide spectrum matches. The database, accessible via http://pitdb.org, provides supporting evidence for each TGE, often from multiple experiments, and an indication of the confidence in the TGE’s observation and its type, ranging from known protein (exact match to a UniProt protein sequence), through multiple types of protein variant including various splice isoforms, to a putative novel molecule. PITDB’s modern web interface allows TGEs to be viewed individually or by species or experiment, and downloaded for further analysis. PITDB is for bench scientists seeking to share their PIT results, for researchers investigating novel genome products in model organisms and for those wishing to construct proteomes for lesser studied species.
Affiliation(s)
- Shyamasree Saha
  - School of Biological and Chemical Sciences, Queen Mary University of London, Mile End, London E1 4NS, UK
- Eleni A Chatzimichali
  - School of Biological and Chemical Sciences, Queen Mary University of London, Mile End, London E1 4NS, UK
- David A Matthews
  - School of Cellular and Molecular Medicine, University of Bristol, University Walk, Bristol BS8 1TD, UK
- Conrad Bessant
  - School of Biological and Chemical Sciences, Queen Mary University of London, Mile End, London E1 4NS, UK; Centre for Computational Biology, Life Science Institute, Queen Mary University of London, Mile End, London E1 4NS, UK
2
Chatzimichali EA, Brent S, Hutton B, Perrett D, Wright CF, Bevan AP, Hurles ME, Firth HV, Swaminathan GJ. Facilitating collaboration in rare genetic disorders through effective matchmaking in DECIPHER. Hum Mutat 2015. [PMID: 26220709 PMCID: PMC4832335 DOI: 10.1002/humu.22842] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Key Words] [Indexed: 12/29/2022]
Abstract
DECIPHER (https://decipher.sanger.ac.uk) is a web‐based platform for secure deposition, analysis, and sharing of plausibly pathogenic genomic variants from well‐phenotyped patients suffering from genetic disorders. DECIPHER aids clinical interpretation of these rare sequence and copy‐number variants by providing tools for variant analysis and identification of other patients exhibiting similar genotype–phenotype characteristics. DECIPHER also provides mechanisms to encourage collaboration among a global community of clinical centers and researchers, as well as exchange of information between clinicians and researchers within a consortium, to accelerate discovery and diagnosis. DECIPHER has contributed to matchmaking efforts by enabling the global clinical genetics community to identify many previously undiagnosed syndromes and new disease genes, and has facilitated the publication of over 700 peer‐reviewed scientific publications since 2004. At the time of writing, DECIPHER contains anonymized data from ∼250 registered centers on more than 51,500 patients (∼18,000 patients with consent for data sharing and ∼25,000 anonymized records shared privately). In this paper, we describe salient features of the platform, with special emphasis on the tools and processes that aid interpretation, sharing, and effective matchmaking with other data held in the database and that make DECIPHER an invaluable clinical and research resource.
Affiliation(s)
- Eleni A Chatzimichali
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
- Simon Brent
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
- Benjamin Hutton
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
- Daniel Perrett
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
- Caroline F Wright
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
- Andrew P Bevan
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
- Matthew E Hurles
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
- Helen V Firth
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom; Cambridge University Department of Medical Genetics, Addenbrooke's Hospital, Cambridge, CB2 2QQ, United Kingdom
- Ganesh J Swaminathan
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
3
Wright CF, Fitzgerald TW, Jones WD, Clayton S, McRae JF, van Kogelenberg M, King DA, Ambridge K, Barrett DM, Bayzetinova T, Bevan AP, Bragin E, Chatzimichali EA, Gribble S, Jones P, Krishnappa N, Mason LE, Miller R, Morley KI, Parthiban V, Prigmore E, Rajan D, Sifrim A, Swaminathan GJ, Tivey AR, Middleton A, Parker M, Carter NP, Barrett JC, Hurles ME, FitzPatrick DR, Firth HV. Genetic diagnosis of developmental disorders in the DDD study: a scalable analysis of genome-wide research data. Lancet 2015; 385:1305-14. [PMID: 25529582 PMCID: PMC4392068 DOI: 10.1016/s0140-6736(14)61705-0] [Citation(s) in RCA: 527] [Impact Index Per Article: 58.6] [MESH Headings] [Grants] [Indexed: 12/28/2022]
Abstract
BACKGROUND Human genome sequencing has transformed our understanding of genomic variation and its relevance to health and disease, and is now starting to enter clinical practice for the diagnosis of rare diseases. The question of whether and how some categories of genomic findings should be shared with individual research participants is currently a topic of international debate, and development of robust analytical workflows to identify and communicate clinically relevant variants is paramount.
METHODS The Deciphering Developmental Disorders (DDD) study has developed a UK-wide patient recruitment network involving over 180 clinicians across all 24 regional genetics services, and has performed genome-wide microarray and whole exome sequencing on children with undiagnosed developmental disorders and their parents. After data analysis, pertinent genomic variants were returned to individual research participants via their local clinical genetics team.
FINDINGS Around 80,000 genomic variants were identified from exome sequencing and microarray analysis in each individual, of which on average 400 were rare and predicted to be protein altering. By focusing only on de novo and segregating variants in known developmental disorder genes, we achieved a diagnostic yield of 27% among 1133 previously investigated yet undiagnosed children with developmental disorders, whilst minimising incidental findings. In families with developmentally normal parents, whole exome sequencing of the child and both parents resulted in a 10-fold reduction in the number of potential causal variants that needed clinical evaluation compared to sequencing only the child. Most diagnostic variants identified in known genes were novel and not present in current databases of known disease variation.
INTERPRETATION Implementation of a robust translational genomics workflow is achievable within a large-scale rare disease research study to allow feedback of potentially diagnostic findings to clinicians and research participants. Systematic recording of relevant clinical data, curation of a gene-phenotype knowledge base, and development of clinical decision support software are needed in addition to automated exclusion of almost all variants, which is crucial for scalable prioritisation and review of possible diagnostic variants. However, the resource requirements of development and maintenance of a clinical reporting system within a research setting are substantial.
FUNDING Health Innovation Challenge Fund, a parallel funding partnership between the Wellcome Trust and the UK Department of Health.
Affiliation(s)
- Caroline F Wright
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Tomas W Fitzgerald
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Wendy D Jones
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Stephen Clayton
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Jeremy F McRae
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Daniel A King
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Kirsty Ambridge
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Daniel M Barrett
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Tanya Bayzetinova
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- A Paul Bevan
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Eugene Bragin
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Susan Gribble
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Philip Jones
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Laura E Mason
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Ray Miller
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Katherine I Morley
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK; Institute of Psychiatry, King's College London, London, UK; Melbourne School of Population and Global Health, The University of Melbourne, Melbourne, VIC, Australia
- Vijaya Parthiban
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Elena Prigmore
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Diana Rajan
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Alejandro Sifrim
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Adrian R Tivey
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Anna Middleton
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Michael Parker
  - The Ethox Centre, Nuffield Department of Population Health, University of Oxford, Old Road Campus, Oxford, UK
- Nigel P Carter
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Jeffrey C Barrett
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Matthew E Hurles
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- David R FitzPatrick
  - MRC Human Genetics Unit, MRC IGMM, University of Edinburgh, WGH, Edinburgh, UK
- Helen V Firth
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK; Cambridge University Hospitals Foundation Trust, Addenbrooke's Hospital, Cambridge, UK
4
Bragin E, Chatzimichali EA, Wright CF, Hurles ME, Firth HV, Bevan AP, Swaminathan GJ. DECIPHER: database for the interpretation of phenotype-linked plausibly pathogenic sequence and copy-number variation. Nucleic Acids Res 2013; 42:D993-D1000. [PMID: 24150940 PMCID: PMC3965078 DOI: 10.1093/nar/gkt937] [Citation(s) in RCA: 152] [Impact Index Per Article: 13.8] [Indexed: 12/12/2022] Open
Abstract
The DECIPHER database (https://decipher.sanger.ac.uk/) is an accessible online repository of genetic variation with associated phenotypes that facilitates the identification and interpretation of pathogenic genetic variation in patients with rare disorders. Contributing to DECIPHER is an international consortium of >200 academic clinical centres of genetic medicine and ≥1600 clinical geneticists and diagnostic laboratory scientists. Information integrated from a variety of bioinformatics resources, coupled with visualization tools, provides a comprehensive set of tools to identify other patients with similar genotype–phenotype characteristics and highlights potentially pathogenic genes. In a significant development, we have extended DECIPHER from a database of just copy-number variants to allow upload, annotation and analysis of sequence variants such as single nucleotide variants (SNVs) and InDels. Other notable developments in DECIPHER include a purpose-built, customizable and interactive genome browser to aid combined visualization and interpretation of sequence and copy-number variation against informative datasets of pathogenic and population variation. We have also introduced several new features to our deposition and analysis interface. This article provides an update to the DECIPHER database, an earlier instance of which has been described elsewhere [Swaminathan et al. (2012) DECIPHER: web-based, community resource for clinical interpretation of rare variants in developmental disorders. Hum. Mol. Genet., 21, R37–R44].
Affiliation(s)
- Eugene Bragin
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK and Cambridge University Department of Medical Genetics, Addenbrooke's Hospital, Cambridge CB2 2QQ, UK
5
Swaminathan GJ, Bragin E, Chatzimichali EA, Corpas M, Bevan AP, Wright CF, Carter NP, Hurles ME, Firth HV. DECIPHER: web-based, community resource for clinical interpretation of rare variants in developmental disorders. Hum Mol Genet 2012; 21:R37-44. [PMID: 22962312 DOI: 10.1093/hmg/dds362] [Citation(s) in RCA: 66] [Impact Index Per Article: 5.5] [Indexed: 12/24/2022] Open
Abstract
Patients with developmental disorders often harbour sub-microscopic deletions or duplications that lead to a disruption of normal gene expression or perturbation in the copy number of dosage-sensitive genes. Clinical interpretation for such patients in isolation is hindered by the rarity and novelty of such disorders. The DECIPHER project (https://decipher.sanger.ac.uk) was established in 2004 as an accessible online repository of genomic and associated phenotypic data with the primary goal of aiding the clinical interpretation of rare copy-number variants (CNVs). DECIPHER integrates information from a variety of bioinformatics resources and uses visualization tools to identify potential disease genes within a CNV. A two-tier access system permits clinicians and clinical scientists to maintain confidential linked anonymous records of phenotypes and CNVs for their patients that, with informed consent, can subsequently be shared with the wider clinical genetics and research communities. Advances in next-generation sequencing technologies are making it practical and affordable to sequence the whole exome/genome of patients who display features suggestive of a genetic disorder. This approach enables the identification of smaller intragenic mutations including single-nucleotide variants that are not accessible even with high-resolution genomic array analysis. This article briefly summarizes the current status and achievements of the DECIPHER project and looks ahead to the opportunities and challenges of jointly analysing structural and sequence variation in the human genome.
Affiliation(s)
- Ganesh J Swaminathan
  - The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK