101
Li Y, Mei S, Zhang X, Peng X, Liu G, Tao H, Wu H, Jiang S, Xiong Y, Li F. Identification of genome-wide copy number variations among diverse pig breeds by array CGH. BMC Genomics 2012; 13:725. [PMID: 23265576 PMCID: PMC3573951 DOI: 10.1186/1471-2164-13-725] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.1] [Received: 08/17/2012] [Accepted: 12/19/2012] [Indexed: 11/15/2022]
Abstract
Background Recent studies have shown that copy number variation (CNV) in mammalian genomes contributes to phenotypic diversity, including health and disease status. In domestic pigs, CNV has been catalogued by several reports, but the extent of CNV and the phenotypic effects are far from clear. The goal of this study was to identify CNV regions (CNVRs) in pigs based on array comparative genome hybridization (aCGH). Results Here a custom-made tiling oligo-nucleotide array was used with a median probe spacing of 2506 bp for screening 12 pigs including 3 Chinese native pigs (one Chinese Erhualian, one Tongcheng and one Yangxin pig), 5 European pigs (one Large White, one Pietrain, one White Duroc and two Landrace pigs), 2 synthetic pigs (Chinese new line DIV pigs) and 2 crossbred pigs (Landrace × DIV pigs) with a Duroc pig as the reference. Two hundred and fifty-nine CNVRs across chromosomes 1–18 and X were identified, with an average size of 65.07 kb and a median size of 98.74 kb, covering 16.85 Mb or 0.74% of the whole genome. Concerning copy number status, 93 (35.91%) CNVRs were called as gains, 140 (54.05%) were called as losses and the remaining 26 (10.04%) were called as both gains and losses. Of all detected CNVRs, 171 (66.02%) and 34 (13.13%) CNVRs directly overlapped with Sus scrofa duplicated sequences and pig QTLs, respectively. The CNVRs encompassed 372 full length Ensembl transcripts. Two CNVRs identified by aCGH were validated using real-time quantitative PCR (qPCR). Conclusions Using 720 K array CGH (aCGH) we described a map of porcine CNVs which facilitated the identification of structural variations for important phenotypes and the assessment of the genetic diversity of pigs.
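The percentages quoted in this abstract follow directly from the reported counts. A minimal sketch that recomputes them, using only the numbers stated above (the variable names are illustrative, not from the paper):

```python
# Counts reported in the abstract for the 259 CNVRs.
total_cnvrs = 259
counts = {
    "gain": 93,
    "loss": 140,
    "both": 26,
    "overlap_duplicated_sequences": 171,
    "overlap_qtls": 34,
}

def percent(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)

percentages = {name: percent(n, total_cnvrs) for name, n in counts.items()}

# The three copy number states partition the full CNVR set.
status_total = counts["gain"] + counts["loss"] + counts["both"]

for name, value in percentages.items():
    print(f"{name}: {value}%")
```

The gain, loss and both categories sum to 259, so the three status percentages account for every CNVR.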
Affiliation(s)
- Yan Li
- Key Laboratory of Pig Genetics and Breeding of Ministry of Agriculture & Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan, 430070, PR China
102
Meding S, Martin K, Gustafsson OJR, Eddes JS, Hack S, Oehler MK, Hoffmann P. Tryptic peptide reference data sets for MALDI imaging mass spectrometry on formalin-fixed ovarian cancer tissues. J Proteome Res 2012; 12:308-15. [PMID: 23214983 DOI: 10.1021/pr300996x] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.3] [Indexed: 12/17/2022]
Abstract
MALDI imaging mass spectrometry is a powerful tool for morphology-based proteomic tissue analysis. However, peptide identification is still a major challenge due to low S/N ratios, low mass accuracy and difficulties in correlating observed m/z species with peptide identities. To address this, we have analyzed tryptic digests of formalin-fixed paraffin-embedded tissue microarray cores, from 31 ovarian cancer patients, by LC-MS/MS. The sample preparation closely resembled the MALDI imaging workflow in order to create representative reference data sets containing peptides also observable in MALDI imaging experiments. This resulted in 3844 distinct peptide sequences, at a false discovery rate of 1%, for the entire cohort and an average of 982 distinct peptide sequences per sample. From this, a total of 840 proteins and, on average, 297 proteins per sample could be inferred. To support the efforts of the Chromosome-centric Human Proteome Project Consortium, we have annotated these proteins with their respective chromosome location. In the presented work, the benefit of using a large cohort of data sets was exemplified by correct identification of several m/z species observed in a MALDI imaging experiment. The tryptic peptide data sets generated will facilitate peptide identification in future MALDI imaging studies on ovarian cancer.
Affiliation(s)
- Stephan Meding
- Adelaide Proteomics Centre, School of Molecular and Biomedical Science, The University of Adelaide, SA 5005, Adelaide, Australia
103
Barshir R, Basha O, Eluk A, Smoly IY, Lan A, Yeger-Lotem E. The TissueNet database of human tissue protein-protein interactions. Nucleic Acids Res 2012. [PMID: 23193266 PMCID: PMC3531115 DOI: 10.1093/nar/gks1198] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.8] [Indexed: 12/30/2022]
Abstract
Knowledge of protein–protein interactions (PPIs) is important for identifying the functions of proteins and the processes they are involved in. Although data of human PPIs are easily accessible through several public databases, these databases do not specify the human tissues in which these PPIs take place. The TissueNet database of human tissue PPIs (http://netbio.bgu.ac.il/tissuenet/) associates each interaction with human tissues that express both pair mates. This was achieved by integrating current data of experimentally detected PPIs with extensive data of gene and protein expression across 16 main human tissues. Users can query TissueNet using a protein and retrieve its PPI partners per tissue, or using a PPI and retrieve the tissues expressing both pair mates. The graphical representation of the output highlights tissue-specific and tissue-wide PPIs. Thus, TissueNet provides a unique platform for assessing the roles of human proteins and their interactions across tissues.
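The core idea described here, associating each interaction with the tissues that express both pair mates, amounts to a set intersection. A minimal sketch under stated assumptions: the protein names, tissue names and function names below are hypothetical toy values, not TissueNet data or its API.

```python
# Hypothetical toy data: tissues in which each protein is expressed.
expression = {
    "P1": {"brain", "liver", "heart"},
    "P2": {"liver", "heart"},
    "P3": {"brain"},
}

# Hypothetical experimentally detected interactions (unordered pairs).
ppis = [("P1", "P2"), ("P1", "P3"), ("P2", "P3")]

def tissues_for_ppi(a, b, expression):
    """Tissues expressing both pair mates of the interaction a-b."""
    return expression.get(a, set()) & expression.get(b, set())

def partners_per_tissue(protein, ppis, expression):
    """Map each tissue to the partners of `protein` co-expressed there."""
    result = {}
    for a, b in ppis:
        if protein not in (a, b):
            continue
        partner = b if a == protein else a
        for tissue in tissues_for_ppi(a, b, expression):
            result.setdefault(tissue, set()).add(partner)
    return result

print(tissues_for_ppi("P1", "P2", expression))
print(partners_per_tissue("P1", ppis, expression))
```

The two functions mirror the two query modes mentioned in the abstract: by interaction (which tissues) and by protein (which partners per tissue). An interaction whose partners share no tissue, such as P2 and P3 here, maps to the empty set.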
Affiliation(s)
- Ruth Barshir
- Department of Clinical Biochemistry, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel
104
Alcántara R, Onwubiko J, Cao H, Matos PD, Cham JA, Jacobsen J, Holliday GL, Fischer JD, Rahman SA, Jassal B, Goujon M, Rowland F, Velankar S, López R, Overington JP, Kleywegt GJ, Hermjakob H, O'Donovan C, Martín MJ, Thornton JM, Steinbeck C. The EBI enzyme portal. Nucleic Acids Res 2012; 41:D773-80. [PMID: 23175605 PMCID: PMC3531056 DOI: 10.1093/nar/gks1112] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 11/22/2022]
Abstract
The availability of comprehensive information about enzymes plays an important role in answering questions relevant to interdisciplinary fields such as biochemistry, enzymology, biofuels, bioengineering and drug discovery. At the EMBL European Bioinformatics Institute, we have developed an enzyme portal (http://www.ebi.ac.uk/enzymeportal) to provide this wealth of information on enzymes from multiple in-house resources addressing particular data classes: protein sequence and structure, reactions, pathways and small molecules. The fact that these data reside in separate databases makes information discovery cumbersome. The main goal of the portal is to simplify this process for end users.
Affiliation(s)
- Rafael Alcántara
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK.
105
Gray KA, Daugherty LC, Gordon SM, Seal RL, Wright MW, Bruford EA. Genenames.org: the HGNC resources in 2013. Nucleic Acids Res 2012; 41:D545-52. [PMID: 23161694 PMCID: PMC3531211 DOI: 10.1093/nar/gks1066] [Citation(s) in RCA: 193] [Impact Index Per Article: 14.8] [Indexed: 02/02/2023]
Abstract
The HUGO Gene Nomenclature Committee situated at the European Bioinformatics Institute assigns unique symbols and names to human genes. Since 2011, the data within our database has expanded largely owing to an increase in naming pseudogenes and non-coding RNA genes, and we now have >33,500 approved symbols. Our gene families and groups have also increased to nearly 500, with ∼45% of our gene entries associated to at least one family or group. We have also redesigned the HUGO Gene Nomenclature Committee website http://www.genenames.org creating a constant look and feel across the site and improving usability and readability for our users. The site provides a public access portal to our database with no restrictions imposed on access or the use of the data. Within this article, we review our online resources and data with particular emphasis on the updates to our website.
Affiliation(s)
- Kristian A Gray
- HUGO Gene Nomenclature Committee, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK.
107
Zhang R, Yao F, Gao F, Abou-Samra AB. Nrac, a novel nutritionally-regulated adipose and cardiac-enriched gene. PLoS One 2012; 7:e46254. [PMID: 23029450 PMCID: PMC3459823 DOI: 10.1371/journal.pone.0046254] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 07/12/2012] [Accepted: 08/31/2012] [Indexed: 01/05/2023]
Abstract
Obesity increases the risk of multiple diseases, such as type 2 diabetes and coronary heart diseases, and therefore the current obesity epidemic poses a major public health issue. Therapeutic approaches are urgently needed to treat obesity as well as its complications. Plasma-membrane proteins with restricted tissue distributions are attractive drug targets, because of their accessibility to various drug delivery mechanisms and potentially alleviated side effects. To identify genes involved in metabolism, we performed RNA-Seq on fat in mice treated with a high-fat diet or fasting. Here we show that the gene A530016L24Rik (human ortholog C14orf180), named Nrac, is a novel nutritionally-regulated adipose and cardiac-enriched gene. Nrac is expressed specifically and abundantly in fat and the heart. Both fasting and obesity reduced Nrac expression in white adipose tissue, and fasting reduced its expression in brown fat. Nrac is localized to the plasma membrane, and highly induced during adipocyte differentiation. Nrac is therefore a novel adipocyte marker and has potential functions in metabolism.
Affiliation(s)
- Ren Zhang
- Center for Molecular Medicine and Genetics and Endocrine Division, School of Medicine, Wayne State University, Detroit, Michigan, United States of America.
108
Mohammad F, Flight RM, Harrison BJ, Petruska JC, Rouchka EC. AbsIDconvert: an absolute approach for converting genetic identifiers at different granularities. BMC Bioinformatics 2012; 13:229. [PMID: 22967011 PMCID: PMC3554462 DOI: 10.1186/1471-2105-13-229] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 02/15/2012] [Accepted: 08/09/2012] [Indexed: 11/10/2022]
Abstract
BACKGROUND High-throughput molecular biology techniques yield vast amounts of data, often by detecting small portions of ribonucleotides corresponding to specific identifiers. Existing bioinformatic methodologies categorize and compare these elements using inferred descriptive annotation given this sequence information irrespective of the fact that it may not be representative of the identifier as a whole. RESULTS All annotations, no matter the granularity, can be aligned to genomic sequences and therefore annotated by genomic intervals. We have developed AbsIDconvert, a methodology for converting between genomic identifiers by first mapping them onto a common universal coordinate system using an interval tree which is subsequently queried for overlapping identifiers. AbsIDconvert has many potential uses, including gene identifier conversion, identification of features within a genomic region, and cross-species comparisons. The utility is demonstrated in three case studies: 1) comparative genomic study mapping Plasmodium gene sequences to corresponding human and mosquito transcriptional regions; 2) cross-species study of Incyte clone sequences; and 3) analysis of human Ensembl transcripts mapped by Affymetrix® and Agilent microarray probes. AbsIDconvert currently supports ID conversion of 53 species for a given list of input identifiers, genomic sequence, or genome intervals. CONCLUSION AbsIDconvert provides an efficient and reliable mechanism for conversion between identifier domains of interest. The flexibility of this tool allows for custom definition identifier domains contingent upon the availability and determination of a genomic mapping interval. As the genomes and the sequences for genetic elements are further refined, this tool will become increasingly useful and accurate. AbsIDconvert is freely available as a web application or downloadable as a virtual machine at: http://bioinformatics.louisville.edu/abid/.
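The conversion strategy described here, placing identifiers on shared genomic coordinates and then querying for overlaps, can be illustrated with a small sketch. This is not the AbsIDconvert implementation: a sorted list with a linear scan stands in for the interval tree, the coordinates are assumed to be closed intervals, and all identifiers and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical features: (identifier, chromosome, start, end), closed intervals.
features = [
    ("gene_A", "chr1", 100, 500),
    ("probe_1", "chr1", 450, 475),
    ("probe_2", "chr1", 600, 650),
    ("gene_B", "chr2", 100, 200),
]

def build_index(features):
    """Group intervals by chromosome, sorted by start coordinate."""
    index = defaultdict(list)
    for ident, chrom, start, end in features:
        index[chrom].append((start, end, ident))
    for chrom in index:
        index[chrom].sort()
    return index

def overlapping(index, chrom, start, end):
    """Return identifiers whose interval overlaps [start, end] on chrom."""
    hits = []
    for s, e, ident in index.get(chrom, []):
        if s > end:
            break  # sorted by start, so no later interval can overlap
        if e >= start:
            hits.append(ident)
    return hits

index = build_index(features)
print(overlapping(index, "chr1", 440, 480))
```

A query region thus converts to every identifier that shares coordinates with it, regardless of whether those identifiers are genes, transcripts or probes. A real interval tree answers the same query in logarithmic rather than linear time per chromosome.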
Affiliation(s)
- Fahim Mohammad
- Department of Computer Engineering and Computer Science, University of Louisville, Louisville, KY 40292, USA
109
Zhang SJ, Liu CJ, Shi M, Kong L, Chen JY, Zhou WZ, Zhu X, Yu P, Wang J, Yang X, Hou N, Ye Z, Zhang R, Xiao R, Zhang X, Li CY. RhesusBase: a knowledgebase for the monkey research community. Nucleic Acids Res 2012; 41:D892-905. [PMID: 22965133 PMCID: PMC3531163 DOI: 10.1093/nar/gks835] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Indexed: 11/29/2022]
Abstract
Although the rhesus macaque is a unique model for the translational study of human diseases, currently its use in biomedical research is still in its infant stage due to error-prone gene structures and limited annotations. Here, we present RhesusBase for the monkey research community (http://www.rhesusbase.org). We performed strand-specific RNA-Seq studies in 10 macaque tissues and generated 1.2 billion 90-bp paired-end reads, covering >97.4% of the putative exon in macaque transcripts annotated by Ensembl. We found that at least 28.7% of the macaque transcripts were previously mis-annotated, mainly due to incorrect exon–intron boundaries, incomplete untranslated regions (UTRs) and missed exons. Compared with the previous gene models, the revised transcripts show clearer sequence motifs near splicing junctions and the end of UTRs, as well as cleaner patterns of exon–intron distribution for expression tags and cross-species conservation scores. Strikingly, 1292 exon–intron boundary revisions between coding exons corrected the previously mis-annotated open reading frames. The revised gene models were experimentally verified in randomly selected cases. We further integrated functional genomics annotations from >60 categories of public and in-house resources and developed an online accessible database. User-friendly interfaces were developed to update, retrieve, visualize and download the RhesusBase meta-data, providing a ‘one-stop’ resource for the monkey research community.
Affiliation(s)
- Shi-Jian Zhang
- Institute of Molecular Medicine, Peking University, Beijing, China
110
Thomson JM, Bowles V, Choi JW, Basu U, Meng Y, Stothard P, Moore S. The identification of candidate genes and SNP markers for classical bovine spongiform encephalopathy susceptibility. Prion 2012; 6:461-9. [PMID: 22918267 DOI: 10.4161/pri.21866] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/19/2022]
Abstract
Classical bovine spongiform encephalopathy is a transmissible prion disease that is fatal to cattle and is a human health risk due to its association with a strain of Creutzfeldt-Jakob disease (vCJD). Mutations to the coding region of the prion gene (PRNP) have been associated with susceptibility to transmissible spongiform encephalopathies in mammals including bovines and humans. Additional loci such as the retinoic acid receptor beta (RARB) and stathmin like 2 (STMN2) have also been associated with disease risk. The objective of this study was to refine previously identified regions associated with BSE susceptibility and to identify positional candidate genes and genetic variation that may be involved with the progression of classical BSE. The samples included 739 samples of either BSE infected animals (522 animals) or non-infected controls (207 animals). These were tested using a custom SNP array designed to narrow previously identified regions of importance in bovine genome. Thirty one single nucleotide polymorphisms were identified at p < 0.05 and a minor allele frequency greater than 5%. The chromosomal regions identified and the positional and functional candidate genes and regulatory elements identified within these regions warrant further research.
Affiliation(s)
- Jennifer M Thomson
- Department of Agricultural, Food, and Nutritional Science, University of Alberta, Edmonton, AB Canada
111
Essaghir A, Demoulin JB. A minimal connected network of transcription factors regulated in human tumors and its application to the quest for universal cancer biomarkers. PLoS One 2012; 7:e39666. [PMID: 22761861 PMCID: PMC3382591 DOI: 10.1371/journal.pone.0039666] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.7] [Received: 02/06/2012] [Accepted: 05/25/2012] [Indexed: 12/19/2022]
Abstract
A universal cancer biomarker candidate for diagnosis is supposed to distinguish, within a broad range of tumors, between healthy and diseased patients. Recently published studies have explored the universal usefulness of some biomarkers in human tumors. In this study, we present an integrative approach to search for potential common cancer biomarkers. Using the TFactS web-tool with a catalogue of experimentally established gene regulations, we could predict transcription factors (TFs) regulated in 305 different human cancer cell lines covering a large panel of tumor types. We also identified chromosomal regions having significant copy number variation (CNV) in these cell lines. Within the scope of TFactS catalogue, 88 TFs whose activity status were explained by their gene expressions and CNVs were identified. Their minimal connected network (MCN) of protein-protein interactions forms a significant module within the human curated TF proteome. Functional analysis of the proteins included in this MCN revealed enrichment in cancer pathways as well as inflammation. The ten most central proteins in MCN are TFs that trans-regulate 157 known genes encoding secreted and transmembrane proteins. In publicly available collections of gene expression data from 8,525 patient tissues, 86 genes were differentially regulated in cancer compared to inflammatory diseases and controls. From TCGA cancer gene expression data sets, 50 genes were significantly associated to patient survival in at least one tumor type. Enrichment analysis shows that these genes mechanistically interact in common cancer pathways. Among these cancer biomarker candidates, TFRC, MET and VEGFA are commonly amplified genes in tumors and their encoded proteins stained positive in more than 80% of malignancies from public databases. They are linked to angiogenesis and hypoxia, which are common in cancer. They could be interesting for further investigations in cancer diagnostic strategies.
Affiliation(s)
- Ahmed Essaghir
- de Duve institute, Université Catholique de Louvain, Brussels, Belgium.
112
Martínez-Barnetche J, Gómez-Barreto RE, Ovilla-Muñoz M, Téllez-Sosa J, López DEG, Dinglasan RR, Mohien CU, MacCallum RM, Redmond SN, Gibbons JG, Rokas A, Machado CA, Cazares-Raga FE, González-Cerón L, Hernández-Martínez S, López MHR. Transcriptome of the adult female malaria mosquito vector Anopheles albimanus. BMC Genomics 2012; 13:207. [PMID: 22646700 PMCID: PMC3442982 DOI: 10.1186/1471-2164-13-207] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.8] [Received: 02/15/2012] [Accepted: 05/30/2012] [Indexed: 11/22/2022]
Abstract
BACKGROUND Human Malaria is transmitted by mosquitoes of the genus Anopheles. Transmission is a complex phenomenon involving biological and environmental factors of humans, parasites and mosquitoes. Among more than 500 anopheline species, only a few species from different branches of the mosquito evolutionary tree transmit malaria, suggesting that their vectorial capacity has evolved independently. Anopheles albimanus (subgenus Nyssorhynchus) is an important malaria vector in the Americas. The divergence time between Anopheles gambiae, the main malaria vector in Africa, and the Neotropical vectors has been estimated to be 100 My. To better understand the biological basis of malaria transmission and to develop novel and effective means of vector control, there is a need to explore the mosquito biology beyond the An. gambiae complex. RESULTS We sequenced the transcriptome of the An. albimanus adult female. By combining Sanger, 454 and Illumina sequences from cDNA libraries derived from the midgut, cuticular fat body, dorsal vessel, salivary gland and whole body, we generated a single, high-quality assembly containing 16,669 transcripts, 92% of which mapped to the An. darlingi genome and covered 90% of the core eukaryotic genome. Bidirectional comparisons between the An. gambiae, An. darlingi and An. albimanus predicted proteomes allowed the identification of 3,772 putative orthologs. More than half of the transcripts had a match to proteins in other insect vectors and had an InterPro annotation. We identified several protein families that may be relevant to the study of Plasmodium-mosquito interaction. An open source transcript annotation browser called GDAV (Genome-Delinked Annotation Viewer) was developed to facilitate public access to the data generated by this and future transcriptome projects. CONCLUSIONS We have explored the adult female transcriptome of one important New World malaria vector, An. albimanus. 
We identified protein-coding transcripts involved in biological processes that may be relevant to the Plasmodium lifecycle and can serve as the starting point for searching targets for novel control strategies. Our data increase the available genomic information regarding An. albimanus several hundred-fold, and will facilitate molecular research in medical entomology, evolutionary biology, genomics and proteomics of anopheline mosquito vectors. The data reported in this manuscript is accessible to the community via the VectorBase website (http://www.vectorbase.org/Other/AdditionalOrganisms/).
Affiliation(s)
- Jesús Martínez-Barnetche
- Centro de Investigación sobre Enfermedades Infecciosas, Instituto Nacional de Salud Pública, Cuernavaca, Morelos, México
- Rosa E Gómez-Barreto
- Centro de Investigación sobre Enfermedades Infecciosas, Instituto Nacional de Salud Pública, Cuernavaca, Morelos, México
- Marbella Ovilla-Muñoz
- Centro de Investigación sobre Enfermedades Infecciosas, Instituto Nacional de Salud Pública, Cuernavaca, Morelos, México
- Juan Téllez-Sosa
- Centro de Investigación sobre Enfermedades Infecciosas, Instituto Nacional de Salud Pública, Cuernavaca, Morelos, México
- David E García López
- Centro de Investigación sobre Enfermedades Infecciosas, Instituto Nacional de Salud Pública, Cuernavaca, Morelos, México
- Rhoel R Dinglasan
- Johns Hopkins Bloomberg School of Public Health. Department of Molecular Microbiology & Immunology, Johns Hopkins Malaria Research Institute, Baltimore, MD, 21205, USA
- Ceereena Ubaida Mohien
- Johns Hopkins Bloomberg School of Public Health. Department of Molecular Microbiology & Immunology, Johns Hopkins Malaria Research Institute, Baltimore, MD, 21205, USA
- Department of Molecular & Comparative Pathobiology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Robert M MacCallum
- Division of Cell and Molecular Biology, Department of Life Sciences, Imperial College London, London, United Kingdom
- Seth N Redmond
- Pasteur Institut, 28 Rue Du Docteur Roux, Paris, 75015, France
- John G Gibbons
- Department of Biological Sciences, Vanderbilt University, Nashville, TN, USA
- Antonis Rokas
- Department of Biological Sciences, Vanderbilt University, Nashville, TN, USA
- Carlos A Machado
- Department of Biology, University of Maryland, College Park, MD, USA
- Febe E Cazares-Raga
- Departamento de Infectómica y Patogénesis Molecular, Cinvestav-IPN, México, DF, México
- Lilia González-Cerón
- Centro Regional de Investigación en Salud Pública, Instituto Nacional de Salud Pública, Tapachula, Chiapas, México
- Salvador Hernández-Martínez
- Centro de Investigación sobre Enfermedades Infecciosas, Instituto Nacional de Salud Pública, Cuernavaca, Morelos, México
- Mario H Rodríguez López
- Centro de Investigación sobre Enfermedades Infecciosas, Instituto Nacional de Salud Pública, Cuernavaca, Morelos, México
113
Gadaleta E, Cutts RJ, Sangaralingam A, Lemoine NR, Chelala C. An Integrated Systems Approach to the Study of Pancreatic Cancer. SYSTEMS BIOLOGY IN CANCER RESEARCH AND DRUG DISCOVERY 2012:83-111. [DOI: 10.1007/978-94-007-4819-4_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2025]
114
Di Génova A, Aravena A, Zapata L, González M, Maass A, Iturra P. SalmonDB: a bioinformatics resource for Salmo salar and Oncorhynchus mykiss. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2011; 2011:bar050. [PMID: 22120661 PMCID: PMC3225076 DOI: 10.1093/database/bar050] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.4] [Indexed: 12/27/2022]
Abstract
SalmonDB is a new multiorganism database containing EST sequences from Salmo salar, Oncorhynchus mykiss and the whole genome sequence of Danio rerio, Gasterosteus aculeatus, Tetraodon nigroviridis, Oryzias latipes and Takifugu rubripes, built with core components from GMOD project, GOPArc system and the BioMart project. The information provided by this resource includes Gene Ontology terms, metabolic pathways, SNP prediction, CDS prediction, orthologs prediction, several precalculated BLAST searches and domains. It also provides a BLAST server for matching user-provided sequences to any of the databases and an advanced query tool (BioMart) that allows easy browsing of EST databases with user-defined criteria. These tools make SalmonDB database a valuable resource for researchers searching for transcripts and genomic information regarding S. salar and other salmonid species. The database is expected to grow in the near future, particularly with the S. salar genome sequencing project. Database URL: http://genomicasalmones.dim.uchile.cl/
Affiliation(s)
- Alex Di Génova
- Laboratory of Bioinformatics and Mathematics of the Genome, Center for Mathematical Modeling (UMI 2807 CNRS) and Center for Genome Regulation (Fondap 15090007), University of Chile, Santiago, Chile
115
Kasprzyk A. BioMart: driving a paradigm change in biological data management. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2011; 2011:bar049. [PMID: 22083790 PMCID: PMC3215098 DOI: 10.1093/database/bar049] [Citation(s) in RCA: 235] [Impact Index Per Article: 16.8] [Indexed: 12/16/2022]
Affiliation(s)
- Arek Kasprzyk
- *Corresponding author: Tel: 647 258 4321; Fax: 647-258-4321;
116
Kapushesky M, Adamusiak T, Burdett T, Culhane A, Farne A, Filippov A, Holloway E, Klebanov A, Kryvych N, Kurbatova N, Kurnosov P, Malone J, Melnichuk O, Petryszak R, Pultsin N, Rustici G, Tikhonov A, Travillian RS, Williams E, Zorin A, Parkinson H, Brazma A. Gene Expression Atlas update--a value-added database of microarray and sequencing-based functional genomics experiments. Nucleic Acids Res 2011; 40:D1077-81. [PMID: 22064864 PMCID: PMC3245177 DOI: 10.1093/nar/gkr913] [Citation(s) in RCA: 124] [Impact Index Per Article: 8.9] [Indexed: 12/16/2022]
Abstract
Gene Expression Atlas (http://www.ebi.ac.uk/gxa) is an added-value database providing information about gene expression in different cell types, organism parts, developmental stages, disease states, sample treatments and other biological/experimental conditions. The content of this database derives from curation, re-annotation and statistical analysis of selected data from the ArrayExpress Archive and the European Nucleotide Archive. A simple interface allows the user to query for differential gene expression either by gene names or attributes or by biological conditions, e.g. diseases, organism parts or cell types. Since our previous report we made 20 monthly releases and, as of Release 11.08 (August 2011), the database supports 19 species, which contains expression data measured for 19 014 biological conditions in 136 551 assays from 5598 independent studies.
Affiliation(s)
- Misha Kapushesky
- European Bioinformatics Institute, EMBL, Hinxton, UK and Dana-Farber Cancer Institute, Boston, MA, USA
117
Haw RA, Croft D, Yung CK, Ndegwa N, D'Eustachio P, Hermjakob H, Stein LD. The Reactome BioMart. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2011; 2011:bar031. [PMID: 22012987 PMCID: PMC3197281 DOI: 10.1093/database/bar031] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.1] [Indexed: 12/03/2022]
Abstract
Reactome is an open source, expert-authored, manually curated and peer-reviewed database of reactions, pathways and biological processes. We provide an intuitive web-based user interface to pathway knowledge and a suite of data analysis tools. The Reactome BioMart provides biologists and bioinformaticians with a single web interface for performing simple or elaborate queries of the Reactome database, aggregating data from different sources and providing an opportunity to integrate experimental and computational results with information relating to biological pathways. Database URL: http://www.reactome.org
Affiliation(s)
- Robin A Haw
- Ontario Institute for Cancer Research, Toronto, ON, M5G0A3, Canada
118
|
Zhang J, Baran J, Cros A, Guberman JM, Haider S, Hsu J, Liang Y, Rivkin E, Wang J, Whitty B, Wong-Erasmus M, Yao L, Kasprzyk A. International Cancer Genome Consortium Data Portal--a one-stop shop for cancer genomics data. Database (Oxford) 2011; 2011:bar026. [PMID: 21930502 PMCID: PMC3263593 DOI: 10.1093/database/bar026] [Citation(s) in RCA: 402] [Impact Index Per Article: 28.7] [Received: 04/14/2011] [Revised: 05/11/2011] [Accepted: 05/17/2011] [Indexed: 11/22/2022]
Abstract
The International Cancer Genome Consortium (ICGC) is a collaborative effort to characterize genomic abnormalities in 50 different cancer types. To make this data available, the ICGC has created the ICGC Data Portal. Powered by the BioMart software, the Data Portal allows each ICGC member institution to manage and maintain its own databases locally, while seamlessly presenting all the data in a single access point for users. The Data Portal currently contains data from 24 cancer projects, including ICGC, The Cancer Genome Atlas (TCGA), Johns Hopkins University, and the Tumor Sequencing Project. It consists of 3478 genomes and 13 cancer types and subtypes. Available open access data types include simple somatic mutations, copy number alterations, structural rearrangements, gene expression, microRNAs, DNA methylation and exon junctions. Additionally, simple germline variations are available as controlled access data. The Data Portal uses a web-based graphical user interface (GUI) to offer researchers multiple ways to quickly and easily search and analyze the available data. The web interface can assist in constructing complicated queries across multiple data sets. Several application programming interfaces are also available for programmatic access. Here we describe the organization, functionality, and capabilities of the ICGC Data Portal.
Affiliation(s)
- Junjun Zhang, Joachim Baran, A. Cros, Jonathan M. Guberman, Syed Haider, Jack Hsu, Yong Liang, Elena Rivkin, Jianxin Wang, Brett Whitty, Marie Wong-Erasmus, Long Yao, Arek Kasprzyk
- Ontario Institute for Cancer Research, Toronto, Ontario M5G 0A3, Canada and Computer Laboratory, University of Cambridge, Cambridge CB3 0FD, UK
|
119
|
Zhang J, Haider S, Baran J, Cros A, Guberman JM, Hsu J, Liang Y, Yao L, Kasprzyk A. BioMart: a data federation framework for large collaborative projects. Database (Oxford) 2011; 2011:bar038. [PMID: 21930506 PMCID: PMC3175789 DOI: 10.1093/database/bar038] [Citation(s) in RCA: 62] [Impact Index Per Article: 4.4] [Received: 05/31/2011] [Revised: 07/26/2011] [Accepted: 07/27/2011] [Indexed: 01/20/2023]
Abstract
BioMart is a freely available, open source, federated database system that provides unified access to disparate, geographically distributed data sources. It is designed to be data agnostic and platform independent, such that existing databases can easily be incorporated into the BioMart framework. BioMart allows databases hosted on different servers to be presented seamlessly to users, facilitating collaborative projects between different research groups. BioMart contains several levels of query optimization to efficiently manage large data sets and offers a diverse selection of graphical user interfaces and application programming interfaces to ensure that queries can be performed in whatever manner is most convenient for the user. The software has now been adopted by a large number of different biological databases spanning a wide range of data types and providing a rich source of annotation available to bioinformaticians and biologists alike.
Affiliation(s)
- Junjun Zhang, Syed Haider, Joachim Baran, Anthony Cros, Jonathan M. Guberman, Jack Hsu, Yong Liang, Long Yao, Arek Kasprzyk
- Informatics and Biocomputing, Ontario Institute for Cancer Research, Toronto, Ontario M5G 0A3, Canada and Computer Laboratory, University of Cambridge, Cambridge CB3 0FD, UK
|
120
|
Jones P, Binns D, McMenamin C, McAnulla C, Hunter S. The InterPro BioMart: federated query and web service access to the InterPro Resource. Database (Oxford) 2011; 2011:bar033. [PMID: 21785143 PMCID: PMC3170169 DOI: 10.1093/database/bar033] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 04/08/2011] [Revised: 05/31/2011] [Accepted: 06/29/2011] [Indexed: 11/13/2022]
Abstract
The InterPro BioMart provides users with query-optimized access to predictions of family classification, protein domains and functional sites, based on a broad spectrum of integrated computational models ('signatures') that are generated by the InterPro member databases: Gene3D, HAMAP, PANTHER, Pfam, PIRSF, PRINTS, ProDom, PROSITE, SMART, SUPERFAMILY and TIGRFAMs. These predictions are provided for all protein sequences from both the UniProt Knowledge Base and the UniParc protein sequence archive. The InterPro BioMart is supplementary to the primary InterPro web interface (http://www.ebi.ac.uk/interpro), providing a web service and the ability to build complex, custom queries that can efficiently return thousands of rows of data in a variety of formats. This article describes the information available from the InterPro BioMart and illustrates its utility with examples of how to build queries that return useful biological information. Database URL: http://www.ebi.ac.uk/interpro/biomart/martview.
Affiliation(s)
- Philip Jones
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
|