1
Tian Z, Cao Z, Yang E, Li J, Liao D, Wang F, Wang T, Zhang Z, Zhang H, Jiang X, Li X, Luo P. Quantitative proteomic and phosphoproteomic analyses of the hippocampus reveal the involvement of NMDAR1 signaling in repetitive mild traumatic brain injury. Neural Regen Res 2023; 18:2711-2719. [PMID: 37449635 DOI: 10.4103/1673-5374.374654] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/18/2023] Open
Abstract
The cumulative damage caused by repetitive mild traumatic brain injury can cause long-term neurodegeneration leading to cognitive impairment. This cognitive impairment is thought to result specifically from damage to the hippocampus. In this study, we detected cognitive impairment in mice 6 weeks after repetitive mild traumatic brain injury using the novel object recognition test and the Morris water maze test. Immunofluorescence staining showed that p-tau expression was increased in the hippocampus after repetitive mild traumatic brain injury. Golgi staining showed a significant decrease in the total density of neuronal dendritic spines in the hippocampus, as well as in the density of mature dendritic spines. To investigate the specific molecular mechanisms underlying cognitive impairment due to hippocampal damage, we performed proteomic and phosphoproteomic analyses of the hippocampus with and without repetitive mild traumatic brain injury. The differentially expressed proteins were mainly enriched in inflammation, immunity, and coagulation, suggesting that non-neuronal cells are involved in the pathological changes that occur in the hippocampus in the chronic stage after repetitive mild traumatic brain injury. In contrast, differentially expressed phosphorylated proteins were mainly enriched in pathways related to neuronal function and structure, which is more consistent with neurodegeneration. We identified N-methyl-D-aspartate receptor 1 as a hub molecule involved in the response to repetitive mild traumatic brain injury, and western blotting showed that, while N-methyl-D-aspartate receptor 1 expression was not altered in the hippocampus after repetitive mild traumatic brain injury, its phosphorylation level was significantly increased, which is consistent with the omics results. Administration of GRP78608, an N-methyl-D-aspartate receptor 1 antagonist, to the hippocampus markedly improved repetitive mild traumatic brain injury-induced cognitive impairment.
In conclusion, our findings suggest that N-methyl-D-aspartate receptor 1 signaling in the hippocampus is involved in cognitive impairment in the chronic stage after repetitive mild traumatic brain injury and may be a potential target for intervention and treatment.
Affiliation(s)
- Zhicheng Tian
- Department of Neurosurgery, Xijing Hospital, Fourth Military Medical University, Xi'an, Shaanxi Province, China
- Zixuan Cao
- The Sixth Regiment, School of Basic Medicine, Fourth Military Medical University, Xi'an, Shaanxi Province, China
- Erwan Yang
- Department of Neurosurgery, Xijing Hospital, Fourth Military Medical University, Xi'an, Shaanxi Province, China
- Juan Li
- Department of Neurosurgery, Xijing Hospital, Fourth Military Medical University, Xi'an, Shaanxi Province, China
- Dan Liao
- Department of Neurosurgery, Xijing Hospital, Fourth Military Medical University, Xi'an, Shaanxi Province, China
- Fei Wang
- Department of Neurobiology, School of Basic Medicine, Fourth Military Medical University, Xi'an; Medical Experiment Center, Shaanxi University of Chinese Medicine, Xianyang, Shaanxi Province, China
- Taozhi Wang
- Department of Neurobiology, School of Basic Medicine, Fourth Military Medical University, Xi'an, Shaanxi Province; Department of Anesthesiology, The Second Hospital of Jilin University, Jilin University, Changchun, Jilin Province, China
- Zhuoyuan Zhang
- Department of Neurosurgery, Xijing Hospital, Fourth Military Medical University; School of Life Science, Northwest University, Xi'an, Shaanxi Province, China
- Haofuzi Zhang
- Department of Neurosurgery, Xijing Hospital, Fourth Military Medical University, Xi'an, Shaanxi Province, China
- Xiaofan Jiang
- Department of Neurosurgery, Xijing Hospital, Fourth Military Medical University, Xi'an, Shaanxi Province, China
- Xin Li
- Department of Anesthesiology, Xijing Hospital, Fourth Military Medical University, Xi'an, Shaanxi Province, China
- Peng Luo
- Department of Neurosurgery, Xijing Hospital, Fourth Military Medical University, Xi'an, Shaanxi Province, China
2
Bao Y, Marini S, Tamura T, Kamada M, Maegawa S, Hosokawa H, Song J, Akutsu T. Toward more accurate prediction of caspase cleavage sites: a comprehensive review of current methods, tools and features. Brief Bioinform 2020; 20:1669-1684. [PMID: 29860277 PMCID: PMC6917222 DOI: 10.1093/bib/bby041] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 02/26/2018] [Revised: 04/16/2018] [Indexed: 12/20/2022] Open
Abstract
As one of the few irreversible protein posttranslational modifications, proteolytic cleavage is involved in nearly all aspects of cellular activities, ranging from gene regulation to cell life-cycle regulation. Among the various protease-specific types of proteolytic cleavage, cleavages by caspases/granzyme B are considered essential in the initiation and execution of programmed cell death and inflammation processes. Although a number of substrates for both types of proteolytic cleavage have been experimentally identified, the complete repertoire of caspase and granzyme B substrates remains to be fully characterized. To tackle this issue and complement experimental efforts for substrate identification, systematic bioinformatics studies of known cleavage sites provide important insights into caspase/granzyme B substrate specificity, and facilitate the discovery of novel substrates. In this article, we review and benchmark 12 state-of-the-art sequence-based bioinformatics approaches and tools for caspase/granzyme B cleavage prediction. We evaluate and compare these methods in terms of their input/output, algorithms used, prediction performance, validation methods and software availability and utility. In addition, we construct independent data sets consisting of caspase/granzyme B substrates from different species and accordingly assess the predictive power of these different predictors for the identification of cleavage sites. We find that the prediction results are highly variable among different predictors. Furthermore, we experimentally validate the predictions of a case study by performing a caspase cleavage assay. We anticipate that this comprehensive review and survey analysis will provide an insightful resource for biologists and bioinformaticians who are interested in using and/or developing tools for caspase/granzyme B cleavage prediction.
Affiliation(s)
- Yu Bao
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto 611-0011, Japan
- Simone Marini
- Department of Computational Medicine and Bioinformatics, University of Michigan, 1241 E. Catherine St., 5940 Buhl, Ann Arbor 48109-5618, USA
- Takeyuki Tamura
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto 611-0011, Japan
- Mayumi Kamada
- Graduate School of Medicine, Kyoto University, Sakyo-ku, Kyoto 606-8507, Japan
- Shingo Maegawa
- Graduate School of Informatics, Kyoto University, Sakyo-ku, Kyoto 606-8501, Japan
- Hiroshi Hosokawa
- Graduate School of Informatics, Kyoto University, Sakyo-ku, Kyoto 606-8501, Japan
- Jiangning Song
- Monash Biomedicine Discovery Institute, Monash Centre for Data Science and ARC Centre of Excellence in Advanced Molecular Imaging, Monash University, Melbourne, VIC 3800, Australia
- Tatsuya Akutsu
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto 611-0011, Japan
3
Mondal SK, Roy S. Genome-wide sequential, evolutionary, organizational and expression analyses of phenylpropanoid biosynthesis associated MYB domain transcription factors in Arabidopsis. J Biomol Struct Dyn 2017; 36:1577-1601. [PMID: 28490275 DOI: 10.1080/07391102.2017.1329099] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Indexed: 02/07/2023]
Abstract
The MYB gene family represents one of the largest groups of transcription factors in plants. Recent evidence has also demonstrated a key role of MYB transcription factors in regulating the expression of major genes involved in the biosynthesis of phenylpropanoid compounds, which confer biotic and abiotic stress tolerance in plant species. However, no comprehensive genome-wide analysis of the phenylpropanoid pathway-associated MYB transcription factors has been reported thus far. In this study, 11 Arabidopsis MYB proteins, namely MYB3, MYB4, MYB7, MYB11, MYB12, MYB32, MYB75, MYB90, MYB111, MYB113, and MYB114, were initially identified considering their reported regulatory function in the phenylpropanoid biosynthesis pathway. Subsequent genome-wide analysis identified the corresponding homologues from Glycine max, Vigna radiata, Oryza sativa, and Zea mays, while homologues of Arabidopsis MYB75, MYB90, MYB113, and MYB114 were not detected in the rice and maize genomes. The identified MYB proteins were classified into three groups (I-III) based on phylogeny. Sequence and domain analysis revealed the presence of two conserved DNA-binding MYB domains in the selected MYB proteins. Promoter analysis indicated the presence of cis-regulatory elements related to light signaling, development, and stress response. Expression analysis of selected Arabidopsis MYB genes revealed their function in plant development and abiotic stress response, consistent with gene ontology annotations. Together, these results provide a useful framework for further experimental studies for the functional characterization of the target MYB genes in the context of regulation of phenylpropanoid biosynthesis and plant stress response.
Affiliation(s)
- Sunil Kanti Mondal
- Department of Biotechnology, The University of Burdwan, Burdwan, 713104, West Bengal, India
- Sujit Roy
- Department of Botany, UGC Centre of Advanced Studies, The University of Burdwan, Burdwan, 713104, West Bengal, India
4
Dunston CR, Herbert R, Griffiths HR. Improving T cell-induced response to subunit vaccines: opportunities for a proteomic systems approach. J Pharm Pharmacol 2015; 67:290-9. [PMID: 25708693 DOI: 10.1111/jphp.12383] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 05/18/2014] [Accepted: 11/23/2014] [Indexed: 11/30/2022]
Abstract
Prophylactic vaccines are an effective strategy to prevent development of many infectious diseases. With new and re-emerging infections posing increasing risks to food stocks and the health of the population in general, there is a need to improve the rationale of vaccine development. One key challenge lies in development of an effective T cell-induced response to subunit vaccines at specific sites and in different populations. OBJECTIVES In this review, we consider how a proteomic systems-based approach can be used to identify putative novel vaccine targets and may be adopted to characterise subunit vaccines and adjuvants fully. KEY FINDINGS Despite the extensive potential for proteomics to aid our understanding of subunit vaccine nature, little work has been reported in the literature to date on identifying MHC 1-binding peptides for subunit vaccines generating T cell responses. SUMMARY In combination with predictive and structural biology approaches to mapping antigen presentation, proteomics offers a powerful and as yet untapped addition to the armoury of vaccine discovery to predict T-cell subset responses and improve vaccine design strategies.
Affiliation(s)
- Christopher R Dunston
- Life & Health Sciences, Aston University, Birmingham, West Midlands, UK; Mologic, Bedford Technology Park, Thurleigh, Bedfordshire, MK44 2YP
5
Quantitative proteomics reveals the kinetics of trypsin-catalyzed protein digestion. Anal Bioanal Chem 2014; 406:6247-56. [DOI: 10.1007/s00216-014-8071-6] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 06/09/2014] [Revised: 07/14/2014] [Accepted: 07/25/2014] [Indexed: 11/25/2022]
6
Switzar L, Giera M, Niessen WMA. Protein Digestion: An Overview of the Available Techniques and Recent Developments. J Proteome Res 2013; 12:1067-77. [DOI: 10.1021/pr301201x] [Citation(s) in RCA: 164] [Impact Index Per Article: 14.9] [Indexed: 12/20/2022]
Affiliation(s)
- Linda Switzar
- AIMMS Division of BioMolecular Analysis, Faculty of Sciences, VU University Amsterdam, De Boelelaan 1083, 1081 HV, Amsterdam, The Netherlands
- Martin Giera
- Division of Molecular Cell Physiology, Faculty of Earth and Life Sciences, VU University Amsterdam, De Boelelaan 1085, 1081 HV, Amsterdam, The Netherlands
- Biomolecular Mass Spectrometry Unit, Department of Parasitology, Leiden University Medical Center, P.O. Box 9600, 2300 RC Leiden, The Netherlands
- Wilfried M. A. Niessen
- AIMMS Division of BioMolecular Analysis, Faculty of Sciences, VU University Amsterdam, De Boelelaan 1083, 1081 HV, Amsterdam, The Netherlands
- hyphen MassSpec, de Wetstraat 8, 2332 XT Leiden, The Netherlands
7
Santos CS, Pinheiro M, Silva AI, Egas C, Vasconcelos MW. Searching for resistance genes to Bursaphelenchus xylophilus using high throughput screening. BMC Genomics 2012; 13:599. [PMID: 23134679 PMCID: PMC3542250 DOI: 10.1186/1471-2164-13-599] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.2] [Received: 08/24/2012] [Accepted: 10/30/2012] [Indexed: 11/01/2022] Open
Abstract
BACKGROUND Pine wilt disease (PWD), caused by the pinewood nematode (PWN; Bursaphelenchus xylophilus), damages and kills pine trees and is causing serious economic damage worldwide. Although the ecological mechanism of infestation is well described, the plant's molecular response to the pathogen is not well known. This is due mainly to the lack of genomic information and the complexity of the disease. High throughput sequencing is now an efficient approach for detecting the expression of genes in non-model organisms, thus providing valuable information in spite of the lack of the genome sequence. In an attempt to unravel genes potentially involved in the pine defense against the pathogen, we hereby report the high throughput comparative sequence analysis of infested and non-infested stems of Pinus pinaster (very susceptible to PWN) and Pinus pinea (less susceptible to PWN). RESULTS Four cDNA libraries from infested and non-infested stems of P. pinaster and P. pinea were sequenced in a full 454 GS FLX run, producing a total of 2,083,698 reads. The putative amino acid sequences encoded by the assembled transcripts were annotated according to Gene Ontology, to assign Pinus contigs into Biological Processes, Cellular Components and Molecular Functions categories. Most of the annotated transcripts corresponded to Picea genes (25.4-39.7%), whereas a smaller percentage matched Pinus genes (1.8-12.8%), probably a consequence of more public genomic information being available for Picea than for Pinus. The comparative transcriptome analysis showed that when P. pinaster was infested with PWN, the genes malate dehydrogenase, ABA, water deficit stress related genes and PAR1 were highly expressed, while in PWN-infested P. pinea, the highly expressed genes were ricin B-related lectin, and genes belonging to the SNARE and high mobility group families. Quantitative PCR experiments confirmed the differential gene expression between the two pine species.
CONCLUSIONS Defense-related genes triggered by nematode infestation were detected in both P. pinaster and P. pinea transcriptomes utilizing 454 pyrosequencing technology. P. pinaster showed higher abundance of genes related to transcriptional regulation, terpenoid secondary metabolism (including some with nematicidal activity) and pathogen attack. P. pinea showed higher abundance of genes related to oxidative stress and higher levels of expression in general of stress responsive genes. This study provides essential information about the molecular defense mechanisms utilized by P. pinaster and P. pinea against PWN infestation and contributes to a better understanding of PWD.
Affiliation(s)
- Carla S Santos
- CBQF – Centro de Biotecnologia e Química Fina, Escola Superior de Biotecnologia, Centro Regional do Porto da Universidade Católica Portuguesa, Rua Dr. António Bernardino Almeida, Porto, 4200-072, Portugal
- Miguel Pinheiro
- Bioinformatics Unit, Biocant, Parque Tecnológico de Cantanhede, Núcleo 04, Lote 03, Cantanhede, 3060-197, Portugal
- Ana I Silva
- CBQF – Centro de Biotecnologia e Química Fina, Escola Superior de Biotecnologia, Centro Regional do Porto da Universidade Católica Portuguesa, Rua Dr. António Bernardino Almeida, Porto, 4200-072, Portugal
- Conceição Egas
- Advanced Services Unit, Biocant, Parque Tecnológico de Cantanhede, Núcleo 04, Lote 03, Cantanhede, 3060-197, Portugal
- Marta W Vasconcelos
- CBQF – Centro de Biotecnologia e Química Fina, Escola Superior de Biotecnologia, Centro Regional do Porto da Universidade Católica Portuguesa, Rua Dr. António Bernardino Almeida, Porto, 4200-072, Portugal
8
Chen MS, Wang GJ, Wang RL, Wang J, Song SQ, Xu ZF. Analysis of expressed sequence tags from biodiesel plant Jatropha curcas embryos at different developmental stages. Plant Sci 2011; 181:696-700. [PMID: 21958712 DOI: 10.1016/j.plantsci.2011.08.003] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 09/01/2010] [Revised: 08/07/2011] [Accepted: 08/10/2011] [Indexed: 05/04/2023]
Abstract
Jatropha curcas is considered a potential biodiesel feedstock plant whose seeds contain up to 40% oil. However, little is currently known about the seed biology of Jatropha. Therefore, it would be valuable to understand the mechanisms of development and lipid metabolism in Jatropha seeds. In the present study, three cDNA libraries were constructed with mRNA from Jatropha embryos at different stages of seed development. A total of 9844 expressed sequence tags (ESTs) were produced from these libraries, from which 1070 contigs and 3595 singletons were obtained. One hundred and seven unigenes were found to be differentially expressed in the three cDNA libraries of Jatropha embryos, indicating that these genes may play key roles in seed development. We have identified 59 and 61 unigenes that might be involved in the development and lipid metabolism in Jatropha seeds, respectively. Some of these genes may also play important roles in embryogenesis, morphogenesis, defense response and adaptive mechanisms in plants.
Affiliation(s)
- Mao-Sheng Chen
- Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, 88 Xuefu Road, Kunming, Yunnan, China
9
Morris JH, Apeltsin L, Newman AM, Baumbach J, Wittkop T, Su G, Bader GD, Ferrin TE. clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics 2011; 12:436. [PMID: 22070249 PMCID: PMC3262844 DOI: 10.1186/1471-2105-12-436] [Citation(s) in RCA: 389] [Impact Index Per Article: 29.9] [Received: 08/08/2011] [Accepted: 11/09/2011] [Indexed: 12/02/2022] Open
Abstract
Background In the post-genomic era, the rapid increase in high-throughput data calls for computational tools capable of integrating data of diverse types and facilitating recognition of biologically meaningful patterns within them. For example, protein-protein interaction data sets have been clustered to identify stable complexes, but scientists lack easily accessible tools to facilitate combined analyses of multiple data sets from different types of experiments. Here we present clusterMaker, a Cytoscape plugin that implements several clustering algorithms and provides network, dendrogram, and heat map views of the results. The Cytoscape network is linked to all of the other views, so that a selection in one is immediately reflected in the others. clusterMaker is the first Cytoscape plugin to implement such a wide variety of clustering algorithms and visualizations, including the only implementations of hierarchical clustering, dendrogram plus heat map visualization (tree view), k-means, k-medoid, SCPS, AutoSOME, and native (Java) MCL. Results Results are presented in the form of three scenarios of use: analysis of protein expression data using a recently published mouse interactome and a mouse microarray data set of nearly one hundred diverse cell/tissue types; the identification of protein complexes in the yeast Saccharomyces cerevisiae; and the cluster analysis of the vicinal oxygen chelate (VOC) enzyme superfamily. For scenario one, we explore functionally enriched mouse interactomes specific to particular cellular phenotypes and apply fuzzy clustering. For scenario two, we explore the prefoldin complex in detail using both physical and genetic interaction clusters. For scenario three, we explore the possible annotation of a protein as a methylmalonyl-CoA epimerase within the VOC superfamily. Cytoscape session files for all three scenarios are provided in the Additional Files section. 
Conclusions The Cytoscape plugin clusterMaker provides a number of clustering algorithms and visualizations that can be used independently or in combination for analysis and visualization of biological data sets, and for confirming or generating hypotheses about biological function. Several of these visualizations and algorithms are only available to Cytoscape users through the clusterMaker plugin. clusterMaker is available via the Cytoscape plugin manager.
Affiliation(s)
- John H Morris
- Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, California, USA.
10
Bettencourt R, Pinheiro M, Egas C, Gomes P, Afonso M, Shank T, Santos RS. High-throughput sequencing and analysis of the gill tissue transcriptome from the deep-sea hydrothermal vent mussel Bathymodiolus azoricus. BMC Genomics 2010; 11:559. [PMID: 20937131 PMCID: PMC3091708 DOI: 10.1186/1471-2164-11-559] [Citation(s) in RCA: 104] [Impact Index Per Article: 7.4] [Received: 07/19/2010] [Accepted: 10/11/2010] [Indexed: 01/03/2023] Open
Abstract
Background Bathymodiolus azoricus is a deep-sea hydrothermal vent mussel found in association with large faunal communities living in chemosynthetic environments on the sea floor near the Azores Islands. Investigation of the exceptional physiological reactions that vent mussels have adopted in their habitat, including responses to environmental microbes, remains a difficult challenge for deep-sea biologists. In an attempt to reveal genes potentially involved in deep-sea mussel innate immunity, we carried out a high-throughput sequence analysis of the transcriptome of freshly collected B. azoricus, using gill tissue as the primary source of immune transcripts given its strategic role in filtering potentially infectious waterborne microorganisms from the surrounding water. Additionally, a substantial EST data set was produced, from which a comprehensive collection of genes coding for putative proteins was organized in a dedicated database, "DeepSeaVent", the first deep-sea vent animal transcriptome database based on 454 pyrosequencing technology. Results A normalized cDNA library from gill tissue was sequenced in a full 454 GS-FLX run, producing 778,996 sequencing reads. Assembly of the high quality reads resulted in 75,407 contigs of which 3,071 were singletons. A total of 39,425 transcripts were conceptually translated into amino acid sequences, of which 22,023 matched known proteins in the NCBI non-redundant protein database, 15,839 revealed conserved protein domains through InterPro functional classification and 9,584 were assigned Gene Ontology terms. Queries conducted within the database enabled the identification of genes putatively involved in immune and inflammatory reactions which had not been previously evidenced in the vent mussel. Their physical counterpart was confirmed by semi-quantitative reverse transcription-polymerase chain reaction (RT-PCR) and their RNA transcription level by quantitative PCR (qPCR) experiments.
Conclusions We have established the first tissue transcriptional analysis of a deep-sea hydrothermal vent animal and generated a searchable catalog of genes that provides a direct method of identifying and retrieving vast numbers of novel coding sequences which can be applied in gene expression profiling experiments from a non-conventional model organism. This provides the most comprehensive sequence resource for identifying novel genes currently available for a deep-sea vent organism, in particular, genes putatively involved in immune and inflammatory reactions in vent mussels. The characterization of the B. azoricus transcriptome will facilitate research into biological processes underlying physiological adaptations to hydrothermal vent environments and will provide a basis for expanding our understanding of genes putatively involved in adaptation processes during post-capture long-term acclimatization experiments, at "sea-level" conditions, using B. azoricus as a model organism.
Affiliation(s)
- Raul Bettencourt
- Department of Oceanography and Fisheries, University of the Azores, 9901-861 Horta, Portugal.
11
Camon E, Barrell D, Brooksbank C, Magrane M, Apweiler R. The Gene Ontology Annotation (GOA) Project--Application of GO in SWISS-PROT, TrEMBL and InterPro. Comp Funct Genomics 2010; 4:71-4. [PMID: 18629103 PMCID: PMC2447390 DOI: 10.1002/cfg.235] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Received: 10/15/2002] [Accepted: 11/22/2002] [Indexed: 11/07/2022] Open
Affiliation(s)
- Evelyn Camon
- European Bioinformatics Institute (EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
12
Liang Y, Rahman MH, Strelkov SE, Kav NNV. Developmentally induced changes in the sclerotial proteome of Sclerotinia sclerotiorum. Fungal Biol 2010; 114:619-27. [PMID: 20943173 DOI: 10.1016/j.funbio.2010.05.003] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Received: 03/17/2010] [Revised: 05/07/2010] [Accepted: 05/11/2010] [Indexed: 12/01/2022]
Abstract
Sclerotinia sclerotiorum (Lib.) de Bary is a necrotrophic fungal phytopathogen with a broad host range. The fungus produces sclerotia, long-term survival and dissemination structures that serve as the primary source of inoculum during seasonal crop infection cycles. Herein, we report the first proteomics-based analysis of sclerotial development. A total of 88 protein spots were observed by two-dimensional gel electrophoresis (2-DE) to exhibit significant temporal differences in abundance at three representative stages of sclerotial development, and the identities of these proteins were established using LC-MS/MS. The proteins were classified into several functional categories including metabolism, energy, transcription and protein fate, cell defense, differentiation, and proteins with as of yet unknown functions. In addition, proteins involved in the process of melanogenesis were found to be differentially abundant during sclerotial development, as was the development-specific protein, Ssp. This study provides a starting point towards achieving a comprehensive understanding of the proteins and molecular events associated with sclerotial development.
Affiliation(s)
- Yue Liang
- Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, Alberta T6G 2P5, Canada
13
Liang Y, Strelkov SE, Kav NNV. The Proteome of Liquid Sclerotial Exudates from Sclerotinia sclerotiorum. J Proteome Res 2010; 9:3290-8. [DOI: 10.1021/pr900942w] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.6] [Indexed: 11/28/2022]
Affiliation(s)
- Yue Liang
- Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, Alberta, Canada
- Stephen E. Strelkov
- Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, Alberta, Canada
- Nat N. V. Kav
- Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, Alberta, Canada
14
Zhao YM, Basu U, Dodson MV, Basarb JA, Guan LL. Proteome differences associated with fat accumulation in bovine subcutaneous adipose tissues. Proteome Sci 2010; 8:14. [PMID: 20298566 PMCID: PMC2853513 DOI: 10.1186/1477-5956-8-14] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Received: 10/14/2009] [Accepted: 03/18/2010] [Indexed: 01/03/2023] Open
Abstract
Background The fat components of red meat products have been of interest to researchers due to the health aspects of excess fat consumption by humans. We hypothesized that differences in protein expression have an impact on adipose tissue formation during beef cattle development and growth. Therefore, in this study we evaluated the differences in the discernible proteome of subcutaneous adipose tissues of 35 beef crossbred steers [Charolais × Red Angus (CHAR) (n = 13) and Hereford × Angus (HEAN) (n = 22)] with different back fat (BF) thicknesses. The goal was to identify specific protein markers that could be associated with adipose tissue formation in beef cows. Results Approximately 541-580 protein spots were detected and compared in each crossbred group, and 33 and 36 protein spots showed expression differences between tissues with high and low BF thicknesses from HEAN and CHAR crossbreds, respectively. The annexin 1 protein was highly expressed in both crossbred steers that had a higher BF thickness (p < 0.05) and this was further validated by a western blot analysis. In 13 tissues of CHAR animals and 22 tissues of HEAN animals, the relative expression of annexin 1 was significantly different (p < 0.05) between tissues with high and low BF thicknesses. Conclusion The increased expression of annexin 1 protein has been found to be associated with higher BF thickness in both crossbred steers. This result lays the foundation for future studies to develop the protein marker for assessing animals with different BF thickness.
Affiliation(s)
- Yong Mei Zhao
- Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, Alberta T6G 2P5, Canada; Department of Life Science, Xi'an University of Arts and Science, Shaanxi, Xi'an 710065, PR China
- Urmila Basu
- Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, Alberta T6G 2P5, Canada
- Michael V Dodson
- Department of Animal Sciences, Washington State University, PO Box 646310, Pullman, Washington, 99164, USA
- John A Basarb
- Alberta Agriculture and Rural Development, Lacombe Research Centre, Lacombe, AB, T4L1W1, Canada
- Le Luo Guan
- Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, Alberta T6G 2P5, Canada
|
15
|
Liang Y, Strelkov SE, Kav NNV. Oxalic acid-mediated stress responses in Brassica napus L. Proteomics 2009; 9:3156-73. [PMID: 19526549 DOI: 10.1002/pmic.200800966] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.5] [Indexed: 11/11/2022]
Abstract
Oxalic acid (OA) occurs extensively in nature and plays diverse roles, especially in pathogenic processes involving various plant pathogens. However, proteome changes and modifications of signaling and oxidative network of plants in response to OA are not well understood. In order to investigate the responses of Brassica napus toward OA, a proteome analysis was conducted employing 2-DE with MS/MS. A total of 37 proteins were identified as responding to OA stress, of which 13 were up-regulated and 24 were down-regulated. These proteins were categorized into several functional groups including protein processing, RNA processing, photosynthesis, signal transduction, stress response, and redox homeostasis. Investigation of the effect of OA on phytohormone signaling and oxidative responses revealed that jasmonic acid-, ethylene-, and abscisic acid-mediated signaling pathways appear to increase at later time points, whereas those pathways mediated by salicylic acid appear to be suppressed. Moreover, the activities of the antioxidant enzymes catalase, peroxidase, superoxide dismutase and oxalic acid oxidase, but not NADPH oxidase, were suppressed by OA stress. Our findings are discussed within the context of the proposed role(s) of OA during infection by Sclerotinia sclerotiorum and subsequent disease progression.
Affiliation(s)
- Yue Liang
- Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, Alta, Canada
|
16
|
Varshavsky R, Horn D, Linial M. Global considerations in hierarchical clustering reveal meaningful patterns in data. PLoS One 2008; 3:e2247. [PMID: 18493326 PMCID: PMC2375056 DOI: 10.1371/journal.pone.0002247] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Received: 10/16/2007] [Accepted: 03/31/2008] [Indexed: 11/18/2022] Open
Abstract
Background: A hierarchy, characterized by tree-like relationships, is a natural method of organizing data in various domains. When considering an unsupervised machine learning routine, such as clustering, a bottom-up hierarchical (BU, agglomerative) algorithm is used as a default and is often the only method applied. Methodology/Principal Findings: We show that hierarchical clustering algorithms that involve global considerations, such as top-down (TD, divisive) or glocal (global-local) algorithms, are better suited to reveal meaningful patterns in the data. This is demonstrated by testing the correspondence between the results of several algorithms (TD, glocal and BU) and the correct annotations provided by experts. The correspondence was tested in multiple domains including gene expression experiments, stock trade records and functional protein families. The performance of each of the algorithms is evaluated by statistical criteria that are assigned to clusters (nodes of the hierarchy tree) based on expert-labeled data. Whereas TD algorithms perform better on global patterns, BU algorithms perform well and are advantageous when finer granularity of the data is sought. In addition, a novel TD algorithm that is based on genuine density of the data points is presented and is shown to outperform other divisive and agglomerative methods. Application of the algorithm to more than 500 protein sequences belonging to ion-channels illustrates the potential of the method for inferring overlooked functional annotations. ClustTree, a graphical Matlab toolbox for applying various hierarchical clustering algorithms and testing their quality, is made available. Conclusions: Although currently rarely used, global approaches, in particular TD or glocal algorithms, should be considered in the exploratory process of clustering. In general, applying unsupervised clustering methods can leverage the quality of manually created mappings of protein families. As demonstrated, it can also provide insights into erroneous and missed annotations.
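The two clustering directions named in this abstract can be illustrated with a minimal Python sketch on one-dimensional toy data. This is not the density-based TD algorithm of the paper, nor code from ClustTree; the function names `bottom_up` and `top_down`, the single-linkage merge rule, and the largest-gap split rule are all assumptions chosen for brevity.

```python
from itertools import combinations


def bottom_up(points, k):
    """Agglomerative (BU): start from singletons and repeatedly merge the
    two closest clusters (single linkage) until k clusters remain."""
    clusters = [[p] for p in sorted(points)]
    while len(clusters) > k:
        # Pick the pair of clusters whose nearest members are closest.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: min(
                abs(x - y) for x in clusters[ij[0]] for y in clusters[ij[1]]
            ),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return sorted(sorted(c) for c in clusters)


def top_down(points, k):
    """Divisive (TD): start from one cluster and repeatedly split at the
    widest gap between neighbouring points until k clusters exist."""
    clusters = [sorted(points)]
    while len(clusters) < k:
        best = None  # (gap width, cluster index, split position)
        for ci, c in enumerate(clusters):
            for g in range(1, len(c)):
                gap = c[g] - c[g - 1]
                if best is None or gap > best[0]:
                    best = (gap, ci, g)
        if best is None:  # only singletons left, nothing to split
            break
        _, ci, g = best
        c = clusters.pop(ci)
        clusters += [c[:g], c[g:]]
    return sorted(clusters)


data = [1, 2, 3, 10, 11, 12, 30]
print(bottom_up(data, 3))
print(top_down(data, 3))
```

On this well-separated toy input both directions recover the same three groups. The differences the abstract describes concern real, noisier data, where the order in which global and local structure is consulted matters.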
Affiliation(s)
- Roy Varshavsky
- School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel.
|
17
|
Mulder NJ, Fleischmann W, Kanapin A, Apweiler R. InterPro as a new tool for complete genome analysis: An example of comparative analysis. Biophysics (Nagoya-shi) 2006. [DOI: 10.1134/s0006350906040117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022] Open
|
18
|
Felipe MSS, Andrade RV, Arraes FBM, Nicola AM, Maranhão AQ, Torres FAG, Silva-Pereira I, Poças-Fonseca MJ, Campos EG, Moraes LMP, Andrade PA, Tavares AHFP, Silva SS, Kyaw CM, Souza DP, Pereira M, Jesuíno RSA, Andrade EV, Parente JA, Oliveira GS, Barbosa MS, Martins NF, Fachin AL, Cardoso RS, Passos GAS, Almeida NF, Walter MEMT, Soares CMA, Carvalho MJA, Brígido MM. Transcriptional Profiles of the Human Pathogenic Fungus Paracoccidioides brasiliensis in Mycelium and Yeast Cells. J Biol Chem 2005; 280:24706-14. [PMID: 15849188 DOI: 10.1074/jbc.m500625200] [Citation(s) in RCA: 128] [Impact Index Per Article: 6.7] [Indexed: 11/06/2022] Open
Abstract
Paracoccidioides brasiliensis is the causative agent of paracoccidioidomycosis, a disease that affects 10 million individuals in Latin America. This report depicts the results of the analysis of 6,022 assembled groups from mycelium and yeast phase expressed sequence tags, covering about 80% of the estimated genome of this dimorphic, thermo-regulated fungus. The data provide a comprehensive view of the fungal metabolism, including overexpressed transcripts, stage-specific genes, and also those that are up- or down-regulated as assessed by in silico electronic subtraction and cDNA microarrays. Also, a significant differential expression pattern in mycelium and yeast cells was detected, which was confirmed by Northern blot analysis, providing insights into differential metabolic adaptations. The overall transcriptome analysis provided information about sequences related to the cell cycle, stress response, drug resistance, and signal transduction pathways of the pathogen. Novel P. brasiliensis genes have been identified, probably corresponding to proteins that should be addressed as virulence factor candidates and potential new drug targets.
Affiliation(s)
- Maria Sueli S Felipe
- Departamento de Biologia Celular, Universidade de Brasília, 70910-900, Brasília, DF, Brazil.
|
19
|
Simpson F, Martin S, Evans TM, Kerr M, James DE, Parton RG, Teasdale RD, Wicking C. A Novel Hook-Related Protein Family and the Characterization of Hook-Related Protein 1. Traffic 2005; 6:442-58. [PMID: 15882442 DOI: 10.1111/j.1600-0854.2005.00289.x] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.1] [Indexed: 11/29/2022]
Abstract
The spatial organization of organelles within a cell is dependent on microtubules. Recently, members of the Hook family of proteins have been proposed to function in linking organelles to microtubules. We report the identification of a completely novel protein family, the Hook-related protein (HkRP) family, from which the Hook proteins have diverged. Bioinformatic analysis of the HkRP family revealed several conserved domains, including a unique C-terminal HkRP domain. The central region of each protein is comprised of an extensive coiled-coil domain, and the N-terminus contains a putative microtubule-binding domain. This domain has been shown to bind microtubules in the Hook protein, and we show that the HkRP1 protein is microtubule-associated. While endogenous HkRP1 has no distinct organelle association, expression of the C-terminal membrane-binding domain suggests a function of HkRP1 in the early endosome. Ultrastructural studies reveal that expression of the C-terminal HkRP1 domain causes an accumulation of internal membranes with an electron-dense coat. Co-localization studies show a concomitant redistribution of the early endosome marker sorting nexin-1 but not the early endosome antigen-1 (EEA1). The steady-state distribution of the epidermal growth factor receptor is also specifically disrupted by expression of the C-terminal domain. We propose that HkRP1 is involved in the process of tubulation of sorting nexin-1 positive membranes from early endosome subdomains.
MESH Headings
- Amino Acid Sequence
- Animals
- Blotting, Western
- Brain/metabolism
- Caenorhabditis elegans
- Carrier Proteins/metabolism
- Cell Membrane/metabolism
- Cloning, Molecular
- Computational Biology
- Cricetinae
- Cytosol/metabolism
- DNA, Complementary/metabolism
- Databases, Genetic
- Drosophila melanogaster
- Dynamins/metabolism
- Exons
- Gene Expression Regulation, Developmental
- Gene Library
- Green Fluorescent Proteins/metabolism
- HeLa Cells
- Humans
- Immunohistochemistry
- Immunoprecipitation
- In Situ Hybridization
- Membrane Proteins/chemistry
- Mice
- Microscopy, Confocal
- Microscopy, Electron
- Microscopy, Fluorescence
- Microtubule-Associated Proteins/chemistry
- Microtubule-Associated Proteins/genetics
- Microtubule-Associated Proteins/physiology
- Microtubules/metabolism
- Microtubules/ultrastructure
- Models, Genetic
- Molecular Sequence Data
- Multigene Family
- Paclitaxel/chemistry
- Protein Binding
- Protein Structure, Tertiary
- Rats
- Sequence Analysis, DNA
- Tubulin/metabolism
- Vesicular Transport Proteins/metabolism
Affiliation(s)
- Fiona Simpson
- Institute for Molecular Bioscience, University of Queensland, Brisbane, QLD 4072, Australia.
|
20
|
Kang L, Chen X, Zhou Y, Liu B, Zheng W, Li R, Wang J, Yu J. The analysis of large-scale gene expression correlated to the phase changes of the migratory locust. Proc Natl Acad Sci U S A 2004; 101:17611-5. [PMID: 15591108 PMCID: PMC535406 DOI: 10.1073/pnas.0407753101] [Citation(s) in RCA: 167] [Impact Index Per Article: 8.4] [Received: 07/20/2004] [Indexed: 11/18/2022] Open
Abstract
The migratory locust is one of the most notorious agricultural pests that undergo a well known reversible, density-dependent phase transition from the solitary to the gregarious. To demonstrate the underlying molecular mechanisms of the phase change, we generated 76,012 ESTs from the whole body and dissected organs in the two phases. Comparing 12,161 unigene clusters, we identified 532 genes as phase-related (P < 0.01). Comprehensive assessment of the phase-related expression revealed that, whereas most of the genes in various categories from hind legs and the midgut are down-regulated in the gregarious phase, several gene classes in the head are impressively up-regulated, including those with peptidase, receptor, and oxygen-binding activities and those related to development, cell growth, and responses to external stimuli. Among them, a superfamily of proteins, the JHPH super-family, which includes juvenile hormone-binding protein, hexamerins, prophenoloxidase, and hemocyanins, were highly expressed in the heads of the gregarious hoppers and hind legs of the solitary hoppers. Quantitative PCR experiments confirmed in part the EST results. These differentially regulated genes have strong functional implications that numerous molecular activities are involved in phase plasticity. This study provides ample molecular markers and genomic information on hemimetabolous insects and insights into the genetic and molecular mechanisms of phase changes in locusts.
Affiliation(s)
- Le Kang
- National Laboratory of Integrated Management of Insect Pests and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100080, China.
|
21
|
Shishkin SS, Kovalyov LI, Kovalyova MA. Proteomic studies of human and other vertebrate muscle proteins. BIOCHEMISTRY (MOSCOW) 2004; 69:1283-98. [PMID: 15627382 DOI: 10.1007/s10541-005-0074-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Indexed: 12/31/2022]
Abstract
This review summarizes the results of some systemic studies of muscle proteins of humans and some other vertebrates. The studies, started after the introduction of O'Farrell's two-dimensional gel electrophoresis, were significantly extended during the development of proteomics, a special branch of functional genomics. Special attention is paid to the characteristic features of the strategy for practical realization of the systemic approach during the three main stages of these studies: pre-genomic, genomic (with organizational registration of proteomics), and post-genomic, characterized by active use of structural genomics data. Proteomic technologies play an important role in detecting changes in isoforms of various muscle proteins (myosins, troponins, etc.). These changes, possibly reflecting tissue specificity of gene expression, may underlie the functional state of muscle tissues under normal and pathological conditions, and such proteomic analysis is now used in various fields of medicine.
Affiliation(s)
- S S Shishkin
- Bach Institute of Biochemistry, Russian Academy of Sciences, Moscow 119071, Russia.
|
22
|
Proteomic studies of human and other vertebrate muscle proteins. BIOCHEMISTRY (MOSCOW) 2004. [DOI: 10.1007/pl00021771] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/09/2023]
|
23
|
Cristillo AD, Nie L, Macri MJ, Bierer BE. Cloning and characterization of N4WBP5A, an inducible, cyclosporine-sensitive, Nedd4-binding protein in human T lymphocytes. J Biol Chem 2003; 278:34587-97. [PMID: 12796489 DOI: 10.1074/jbc.m304723200] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.8] [Indexed: 11/06/2022] Open
Abstract
We have cloned and characterized a human cDNA, designated N4WBP5A, that belongs to the family of Nedd4-binding proteins. We originally identified N4WBP5A as an unknown expressed sequence tag (AA770150) represented in a cDNA microarray analysis that was up-regulated upon activation of T cells and inhibited by cell treatment with the calcineurin phosphatase inhibitors, cyclosporine (CsA) and tacrolimus (FK506). The predicted N4WBP5A amino acid sequence of 242 amino acid residues reveals an open reading frame of 729 nucleotides with a corresponding molecular mass of 27.1 kDa. Detection of N4WBP5A mRNA by reverse transcription-PCR was consistent with the induction of N4WBP5A following mitogenic stimulation of T lymphocytes and inhibition by CsA. Immunoblot analysis revealed endogenous N4WBP5A protein to be up-regulated following T cell activation and inhibited by CsA. This regulation of N4WBP5A mRNA expression differed from that of its homologue (51% identical; 65% similar) N4WBP5. Like N4WBP5, however, expression of epitope-tagged N4WBP5A indicated that the protein is localized predominantly to the Golgi network. Here we show by co-precipitation experiments that N4WBP5A interacts with the WW domains of Nedd4, an E3 ubiquitin ligase. Taken together, our data suggest that N4WBP5A may play a regulatory role in modulating Nedd4 activity at the level of the Golgi apparatus in T lymphocytes.
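The figures quoted in this abstract are mutually consistent, which a short Python check can make explicit: 242 codons plus one stop codon give 3 × 243 = 729 nucleotides. The sketch assumes the reported reading frame counts the stop codon, and it uses a rule-of-thumb average residue mass of about 110 Da, which is an assumption and not a value from the paper; the helper names are hypothetical.

```python
# Rule-of-thumb average mass of one amino acid residue (assumption,
# not taken from the abstract).
AVG_RESIDUE_MASS_DA = 110.0


def orf_length_nt(n_residues, include_stop=True):
    """Nucleotides in an open reading frame encoding n_residues amino acids."""
    codons = n_residues + (1 if include_stop else 0)
    return 3 * codons


def approx_mass_kda(n_residues):
    """Very rough protein mass estimate from residue count alone."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


print(orf_length_nt(242))              # 729, matching the abstract
print(round(approx_mass_kda(242), 1))  # 26.6, close to the reported 27.1 kDa
```

The small difference between the rough estimate and the reported 27.1 kDa is expected, since the true mass depends on the actual amino acid composition.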
MESH Headings
- Amino Acid Sequence
- Animals
- Blotting, Northern
- COS Cells
- Calcium-Binding Proteins/chemistry
- Carrier Proteins/chemistry
- Carrier Proteins/genetics
- Cells, Cultured
- Cloning, Molecular
- Cyclosporine/pharmacology
- DNA, Complementary/metabolism
- Electrophoresis, Polyacrylamide Gel
- Endoplasmic Reticulum/metabolism
- Endosomal Sorting Complexes Required for Transport
- Epitopes
- Expressed Sequence Tags
- Golgi Apparatus/metabolism
- HeLa Cells
- Humans
- Immunoblotting
- Immunosuppressive Agents/pharmacology
- Ligases/chemistry
- Lymphocyte Activation
- Membrane Proteins
- Microscopy, Fluorescence
- Mitochondria/metabolism
- Models, Genetic
- Molecular Sequence Data
- Nedd4 Ubiquitin Protein Ligases
- Oligonucleotide Array Sequence Analysis
- Open Reading Frames
- Protein Binding
- Protein Structure, Tertiary
- RNA, Messenger/metabolism
- Reverse Transcriptase Polymerase Chain Reaction
- Sequence Homology, Amino Acid
- T-Lymphocytes/metabolism
- Tacrolimus/pharmacology
- Time Factors
- Tissue Distribution
- Transfection
- Ubiquitin-Protein Ligases
- Up-Regulation
Affiliation(s)
- Anthony D Cristillo
- Laboratory of Lymphocyte Biology, National Heart, Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
|
24
|
Affiliation(s)
- Erik van Nimwegen
- Center for Studies in Physics and Biology, the Rockefeller University, 1230 York Avenue, New York, NY 12001, USA.
|
25
|
Affiliation(s)
- Nicola J. Mulder
- The EMBL Outstation European Bioinformatics Institute Wellcome Trust Genome Campus Hinxton, Cambridge
- Rolf Apweiler
- The EMBL Outstation European Bioinformatics Institute Wellcome Trust Genome Campus Hinxton, Cambridge
|
26
|
Abstract
NEWT is a new taxonomy portal to the SWISS-PROT protein sequence knowledgebase. It contains taxonomy data, which is updated daily, for the complete set of species represented in SWISS-PROT, as well as those stored at the NCBI. Users can navigate through the taxonomy tree and access corresponding SWISS-PROT protein entries. In addition, a manually curated selection of external links allows access to specific information on selected species. NEWT is available at http://www.ebi.ac.uk/newt/.
Affiliation(s)
- I Q H Phan
- Swiss Institute of Bioinformatics, Geneva, Switzerland. European Bioinformatics Institute, Cambridge, UK
|
27
|
Semple CAM. The comparative proteomics of ubiquitination in mouse. Genome Res 2003; 13:1389-94. [PMID: 12819137 PMCID: PMC403670 DOI: 10.1101/gr.980303] [Citation(s) in RCA: 105] [Impact Index Per Article: 5.0] [Received: 11/08/2002] [Accepted: 03/06/2003] [Indexed: 11/24/2022]
Abstract
Ubiquitination is a common posttranslational modification in eukaryotic cells, influencing many fundamental cellular processes. Defects in ubiquitination and the processes it mediates are involved in many human disease states. The ubiquitination of a substrate involves four classes of enzymes: a ubiquitin-activating enzyme (E1), a ubiquitin-conjugating enzyme (E2), a ubiquitin protein ligase (E3), and a de-ubiquitinating enzyme (DUB). A substantial number of E1s (four), E2s (13), E3s (97), and DUBs (six) that were previously unknown in the mouse are included in the FANTOM2 Representative Transcript and Protein Set (RTPS). Many of the genes encoding these proteins will constitute promising candidates for involvement in disease. In addition, the RTPS provides the basis for the most comprehensive survey of ubiquitination-associated proteins across eukaryotes undertaken to date. Comparisons of these proteins across human and other organisms suggest that eukaryotic evolution has been associated with an increase in the number and diversity of E3s (possessing either zinc-finger RING, F-box, or HECT domains) and DUBs (containing the ubiquitin thiolesterase family 2 domain). These increases in numbers are too large to be accounted for by the presence of fragmentary proteins in the data sets examined. Much of this innovation appears to have been associated with the emergence of multicellular organisms, and subsequently of vertebrates, increasing the opportunity for complex regulation of ubiquitination-mediated cellular and developmental processes.
|
28
|
Grimmond SM, Miranda KC, Yuan Z, Davis MJ, Hume DA, Yagi K, Tominaga N, Bono H, Hayashizaki Y, Okazaki Y, Teasdale RD. The mouse secretome: functional classification of the proteins secreted into the extracellular environment. Genome Res 2003; 13:1350-9. [PMID: 12819133 PMCID: PMC403661 DOI: 10.1101/gr.983703] [Citation(s) in RCA: 64] [Impact Index Per Article: 3.0] [Received: 12/03/2002] [Accepted: 04/17/2003] [Indexed: 11/25/2022]
Abstract
We have developed a computational strategy to identify the set of soluble proteins secreted into the extracellular environment of a cell. Within the protein sequences predominantly derived from the RIKEN representative transcript and protein set, we identified 2033 unique soluble proteins that are potentially secreted from the cell. These proteins contain a signal peptide required for entry into the secretory pathway and lack any transmembrane domains or intracellular localization signals. This class of proteins, which we have termed the mouse secretome, included >500 novel proteins and 92 proteins <100 amino acids in length. Functional analysis of the secretome included identification of human orthologs, functional units based on InterPro and SCOP Superfamily predictions, and expression of the proteins within the RIKEN READ microarray database. To highlight the utility of this information, we discuss the CUB domain-containing protein family.
Affiliation(s)
- Sean M Grimmond
- Institute for Molecular Bioscience and ARC Special Research Centre for Functional and Applied Genomics, University of Queensland, St. Lucia 4072, Australia
|
29
|
Abstract
Whole genome sequencing of the free-living nematode Caenorhabditis elegans is a prominent achievement in genomics and has uncovered an enormous number of known and unknown gene products. Characterizing and linking all gene products is the next challenge for biology. Genome-wide research on C. elegans is already progressing, and the fruits of these efforts are accessible through the internet. To link sequence to function, proteomic research has been applied to provide comprehensive information on the worm proteins. In addition to 2-dimensional gel electrophoresis for visualization of the proteome, recent advances in liquid chromatography (LC)-based technologies have allowed the large-scale analysis of proteins and are at the cutting edge of high-throughput analysis of focused proteomes.
Affiliation(s)
- Hiroyuki Kaji
- Department of Chemistry, Graduate School of Science, Tokyo Metropolitan University, Minami-osawa 1-1, Hachioji, Tokyo 192-0397, Japan
|
30
|
Camon E, Magrane M, Barrell D, Binns D, Fleischmann W, Kersey P, Mulder N, Oinn T, Maslen J, Cox A, Apweiler R. The Gene Ontology Annotation (GOA) project: implementation of GO in SWISS-PROT, TrEMBL, and InterPro. Genome Res 2003; 13:662-72. [PMID: 12654719 PMCID: PMC430163 DOI: 10.1101/gr.461403] [Citation(s) in RCA: 255] [Impact Index Per Article: 12.1] [Indexed: 11/24/2022]
Abstract
Gene Ontology Annotation (GOA) is a project run by the European Bioinformatics Institute (EBI) that aims to provide assignments of terms from the Gene Ontology (GO) resource to gene products in a number of its databases (http://www.ebi.ac.uk/GOA). In the first stage of this project, GO assignments have been applied to a data set representing the complete human proteome by a combination of electronic mappings and manual curation. This vocabulary has also been applied to the nonredundant proteome sets for all other completely sequenced organisms as well as to proteins from a wide range of organisms where the proteome is not yet complete.
Affiliation(s)
- Evelyn Camon
- EMBL Outstation-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
|
31
|
Kriventseva EV, Servant F, Apweiler R. Improvements to CluSTr: the database of SWISS-PROT+TrEMBL protein clusters. Nucleic Acids Res 2003; 31:388-9. [PMID: 12520029 PMCID: PMC165482 DOI: 10.1093/nar/gkg035] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.2] [Indexed: 11/14/2022] Open
Abstract
The CluSTr database (http://www.ebi.ac.uk/clustr/) offers an automatic classification of SWISS-PROT+TrEMBL proteins into groups of related proteins. The clustering is based on analysis of all pair-wise sequence comparisons between proteins using the Smith-Waterman algorithm. The analysis, carried out on different levels of protein similarity, yields a hierarchical organization of clusters. Information about domain content of the clustered proteins is provided via the InterPro resource. The introduced InterPro 'condensed graphical view' simplifies the visual analysis of represented domain architectures. Integrated applications allow users to visualize and edit multiple alignments and build sequence divergence trees. Links to the relevant structural data in Protein Data Bank (PDB) and Homology derived Secondary Structure of Proteins (HSSP) are also provided.
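The pair-wise comparison step mentioned in this abstract rests on the Smith-Waterman local alignment recurrence, which can be sketched in a few lines of Python. This is an illustrative toy and not CluSTr's implementation: the function name `smith_waterman_score`, the flat match/mismatch scores, and the linear gap penalty are assumptions, whereas production protein comparisons typically use a substitution matrix and affine gap costs.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between sequences a and b.

    Uses the standard recurrence
    H[i][j] = max(0, H[i-1][j-1] + s(a_i, b_j), H[i-1][j] + gap, H[i][j-1] + gap)
    and keeps only two rows of the matrix in memory.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Clamping at zero is what makes the alignment local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


print(smith_waterman_score("MKTAYIAK", "TAYI"))  # shared block "TAYI" scores 4 * 2 = 8
```

Scores of this kind, computed for all sequence pairs, are what a clustering step can then threshold at different similarity levels to obtain a hierarchy of groups.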
Affiliation(s)
- E V Kriventseva
- EMBL Outstation-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
|
32
|
Buchan DWA, Rison SCG, Bray JE, Lee D, Pearl F, Thornton JM, Orengo CA. Gene3D: structural assignments for the biologist and bioinformaticist alike. Nucleic Acids Res 2003; 31:469-73. [PMID: 12520054 PMCID: PMC165498 DOI: 10.1093/nar/gkg051] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.0] [Indexed: 11/13/2022] Open
Abstract
The Gene3D database (http://www.biochem.ucl.ac.uk/bsm/cath_new/Gene3D/) provides structural assignments for genes within complete genomes. These are available via the internet from either the World Wide Web or FTP. Assignments are made using PSI-BLAST and subsequently processed using the DRange protocol. The DRange protocol is an empirically benchmarked method for assessing the validity of structural assignments made using sequence searching methods where appropriate assignment statistics are collected and made available. Gene3D links assignments to their appropriate entries in relevant structural and classification resources (PDBsum, CATH database and the Dictionary of Homologous Superfamilies). Release 2.0 of Gene3D includes 62 genomes, 2 eukaryotes, 10 archaea and 40 bacteria. Currently, structural assignments can be made for between 30 and 40 percent of any given genome. In any genome, around half of those genes assigned a structural domain are assigned a single domain and the other half of the genes are assigned multiple structural domains. Gene3D is linked to the CATH database and is updated with each new update of CATH.
Affiliation(s)
- Daniel W A Buchan
- Biomolecular Structure and Modelling Group, Department of Biochemistry and Molecular Biology, University College London, Gower Street, London WC1E 6BT, UK
|
33
|
|
34
|
Pruess M, Fleischmann W, Kanapin A, Karavidopoulou Y, Kersey P, Kriventseva E, Mittard V, Mulder N, Phan I, Servant F, Apweiler R. The Proteome Analysis database: a tool for the in silico analysis of whole proteomes. Nucleic Acids Res 2003; 31:414-7. [PMID: 12520037 PMCID: PMC165552 DOI: 10.1093/nar/gkg105] [Citation(s) in RCA: 55] [Impact Index Per Article: 2.6] [Indexed: 11/12/2022] Open
Abstract
The Proteome Analysis database (http://www.ebi.ac.uk/proteome/) has been developed by the Sequence Database Group at EBI utilizing existing resources and providing comparative analysis of the predicted protein coding sequences of the complete genomes of bacteria, archaea and eukaryotes. Three main projects are used, InterPro, CluSTr and GO Slim, to give an overview on families, domains, sites, and functions of the proteins from each of the complete genomes. Complete proteome analysis is available for a total of 89 proteome sets. A specifically designed application enables InterPro proteome comparisons for any one proteome against any other one or more of the proteomes in the database.
Affiliation(s)
- Manuela Pruess
- EMBL Outstation, The European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
|
35
|
Pruess M, Apweiler R. Bioinformatics Resources for In Silico Proteome Analysis. J Biomed Biotechnol 2003; 2003:231-236. [PMID: 14615630 PMCID: PMC514268 DOI: 10.1155/s1110724303209219] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Received: 06/26/2002] [Accepted: 12/10/2002] [Indexed: 11/17/2022] Open
Abstract
In the growing field of proteomics, tools for the in silico analysis of proteins and even of whole proteomes are of crucial importance to make best use of the accumulating amount of data. To utilise this data for healthcare and drug development, first the characteristics of proteomes of entire species, mainly the human, have to be understood, before secondly differentiation between individuals can be surveyed. Specialised databases about nucleic acid sequences, protein sequences, protein tertiary structure, genome analysis, and proteome analysis represent useful resources for analysis, characterisation, and classification of protein sequences. Unlike most proteomics tools, which each focus on similarity searches, structure analysis and prediction, detection of specific regions, alignments, data mining, 2D PAGE analysis, or protein modelling, comprehensive databases like the proteome analysis database benefit from the information stored in different databases and make use of different protein analysis tools to provide computational analysis of whole proteomes.
Affiliation(s)
- Manuela Pruess
- EMBL Outstation, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Rolf Apweiler
- EMBL Outstation, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
|
36
|
Schneider D, Liu Y, Gerstein M, Engelman DM. Thermostability of membrane protein helix-helix interaction elucidated by statistical analysis. FEBS Lett 2002; 532:231-6. [PMID: 12459496 DOI: 10.1016/s0014-5793(02)03687-6] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.7] [Indexed: 11/26/2022]
Abstract
A prerequisite for the survival of (micro)organisms at high temperatures is an adaptation of protein stability to extreme environmental conditions. In contrast to soluble proteins, where many factors have already been identified, the mechanisms by which the thermostability of membrane proteins is enhanced are almost unknown. The hydrophobic membrane environment constrains possible stabilizing factors for transmembrane domains, so that a difference might be expected between soluble and membrane proteins. Here we present sequence analysis of predicted transmembrane helices of the genomes from eight thermophilic and 12 mesophilic organisms. A comparison of the amino acid compositions indicates that more polar residues can be found in the transmembrane helices of thermophilic organisms. Particularly, the amino acids aspartic acid and glutamic acid replace the corresponding amides. Cysteine residues are found to be significantly decreased by about 70% in thermophilic membrane domains suggesting a non-specific function of most cysteine residues in transmembrane domains of mesophilic organisms. By a pair-motif analysis of the two sets of transmembrane helices, we found that the small residues glycine and serine contribute more to transmembrane helix-helix interactions in thermophilic organisms. This may result in a tighter packing of the helices allowing more hydrogen bond formation.
Affiliation(s)
- Dirk Schneider
- Department of Molecular Biophysics and Biochemistry, Yale University, P.O. Box 208114, New Haven, CT 06520-8114, USA
|
37
|
Grützner F, Roest Crollius H, Lütjens G, Jaillon O, Weissenbach J, Ropers HH, Haaf T. Four-hundred million years of conserved synteny of human Xp and Xq genes on three Tetraodon chromosomes. Genome Res 2002; 12:1316-22. [PMID: 12213768 PMCID: PMC186653 DOI: 10.1101/gr.222402] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.0] [Indexed: 11/24/2022]
Abstract
The freshwater pufferfish Tetraodon nigroviridis (TNI) has become highly attractive as a compact reference vertebrate genome for gene finding and validation. We have mapped genes, which are more or less evenly spaced on the human chromosomes 9 and X, on Tetraodon chromosomes using fluorescence in situ hybridization (FISH), to establish syntenic relationships between Tetraodon and other key vertebrate genomes. PufferFISH revealed that the human X is an orthologous mosaic of three Tetraodon chromosomes. More than 350 million years ago, an ancestral vertebrate autosome shared orthologous Xp and Xq genes with Tetraodon chromosomes 1 and 7. The shuffled order of Xp and Xq orthologs on their syntenic Tetraodon chromosomes can be explained by the prevalence of evolutionary inversions. The Tetraodon 2 orthologous genes are clustered in human Xp11 and represent a recent addition to the eutherian X sex chromosome. The human chromosome 9 and the avian Z sex chromosome show a much lower degree of synteny conservation in the pufferfish than the human X chromosome. We propose that a special selection process during vertebrate evolution has shaped a highly conserved array(s) of X-linked genes long before the X was used as a mammalian sex chromosome and many X chromosomal genes were recruited for reproduction and/or the development of cognitive abilities. [Sequence data reported in this paper have been deposited in GenBank and assigned the following accession no: AJ308098.]
Affiliation(s)
- Frank Grützner
- Comparative Genomics Group, Research School of Biological Sciences, Australian National University, Canberra, ACT 2601, Australia
|
38
|
Fassler J, Landsman D, Acharya A, Moll JR, Bonovich M, Vinson C. B-ZIP proteins encoded by the Drosophila genome: evaluation of potential dimerization partners. Genome Res 2002; 12:1190-200. [PMID: 12176927 PMCID: PMC186634 DOI: 10.1101/gr.67902] [Citation(s) in RCA: 62] [Impact Index Per Article: 2.8] [Indexed: 11/24/2022]
Abstract
The basic region-leucine zipper (B-ZIP) (bZIP) protein motif dimerizes to bind specific DNA sequences. We have identified 27 B-ZIP proteins in the recently sequenced Drosophila melanogaster genome. The dimerization specificity of these 27 B-ZIP proteins was evaluated using two structural criteria: (1) the presence of attractive or repulsive interhelical g↔e' electrostatic interactions and (2) the presence of polar or charged amino acids in the 'a' and 'd' positions of the hydrophobic interface. None of the B-ZIP proteins contain only aliphatic amino acids in the 'a' and 'd' positions. Only six of the Drosophila B-ZIP proteins contain a "canonical" hydrophobic interface like the yeast GCN4, and the mammalian JUN, ATF2, CREB, C/EBP, and PAR leucine zippers, characterized by asparagine in the second 'a' position. Twelve leucine zippers contain polar amino acids in the first, third, and fourth 'a' positions. Circular dichroism spectroscopy, used to monitor thermal denaturations of a heterodimerizing leucine zipper system containing either valine (V) or asparagine (N) in the 'a' position, indicates that the V-N interaction is 2.3 kcal/mole less stable than an N-N interaction and 5.3 kcal/mole less stable than a V-V interaction. Thus, we propose that the presence of polar amino acids in novel positions of the 'a' position of Drosophila B-ZIP proteins has led to leucine zippers that homodimerize rather than heterodimerize.
Affiliation(s)
- Jan Fassler
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20814, USA
|
39
|
Abstract
Some molecules, particularly aromatics, have high molar extinction coefficients at wavelengths in the damaging ultraviolet radiation region of the spectrum between 200 and 400 nm. Thus, under a UV radiation flux in which these wavelengths are represented, it could be argued that a selection pressure would exist for a UV transparent biochemistry in which they were not represented. This hypothesis is explored using data made available from proteomics, focusing particularly on tryptophan, against which a selection pressure could exist on present-day Earth as a result of its absorbance shoulder at wavelengths greater than 290 nm. The abundance of tryptophan in whole proteomes is lower than expected from the degeneracy of the genetic code. A lower usage of tryptophan is found in the cytochrome c oxidase polypeptide I of UV-exposed organisms compared to nocturnal and subterranean organisms, but not in ATP synthase chain A. Examination of the amino acid composition of photolyase, an enzyme that requires exposure to light to function, shows that the tryptophan abundances exceed those of the total proteome of most organisms and the abundances expected from the degeneracy of the genetic code. This is also true for cytochrome c oxidase, another enzyme that makes extensive use of the electron transfer properties of tryptophan. We suggest that the selection pressure for the use of tryptophan caused, among other factors, by the uses of delocalised pi-electrons that this aromatic provides in active sites and binding motifs outweighs the selection pressure for UV transparency. This trade-off explains the lack of conclusive evidence for a UV transparent selection pressure. We suggest that this trade-off applies to the stacked pi-electrons of DNA. It offers a solution to the long-standing paradox of why the macromolecule responsible for the faithful replication of information has high absorbance in the damaging UV radiation region of the spectrum.
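The baseline "expected from the degeneracy of the genetic code" can be illustrated with a small calculation. The sketch below assumes the simplest possible null model, in which every one of the 61 sense codons of the standard genetic code is used equally often; the abstract does not state the authors' exact model, so the function name and the model itself are illustrative assumptions.

```python
# Number of sense codons per amino acid in the standard genetic code
# (one-letter amino acid codes; stop codons excluded).
CODONS_PER_AA = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "F": 2, "Y": 2, "H": 2, "Q": 2, "N": 2,
    "K": 2, "D": 2, "E": 2, "C": 2,
    "M": 1, "W": 1,
}

TOTAL_SENSE_CODONS = sum(CODONS_PER_AA.values())  # 61 in the standard code


def expected_frequency(aa: str) -> float:
    """Expected fraction of residues that are `aa`, ASSUMING every sense
    codon is equally likely (a deliberately naive null model)."""
    return CODONS_PER_AA[aa] / TOTAL_SENSE_CODONS


# Tryptophan is encoded by a single codon (UGG).
print(f"Expected Trp frequency: {expected_frequency('W'):.4f}")
```

Under this assumption tryptophan, with one codon out of 61, would make up roughly 1.6% of residues; an observed proteome-wide abundance can then be compared against that figure.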
|
40
|
Snel B, Bork P, Huynen MA. The identification of functional modules from the genomic association of genes. Proc Natl Acad Sci U S A 2002; 99:5890-5. [PMID: 11983890 PMCID: PMC122872 DOI: 10.1073/pnas.092632599] [Citation(s) in RCA: 191] [Impact Index Per Article: 8.7] [Received: 11/28/2001] [Indexed: 11/18/2022] Open
Abstract
By combining the pairwise interactions between proteins, as predicted by the conserved co-occurrence of their genes in operons, we obtain protein interaction networks. Here we study the properties of such networks to identify functional modules: sets of proteins that together are involved in a biological process. The complete network contains 3,033 orthologous groups of proteins in 38 genomes. It consists of one giant component, containing 1,611 orthologous groups, and of 516 small disjointed clusters that, on average, contain only 2.7 orthologous groups. These small clusters have a homogeneous functional composition and thus represent functional modules in themselves. Analysis of the giant component reveals that it is a scale-free, small-world network with a high degree of local clustering (C = 0.6). It consists of locally highly connected subclusters that are connected to each other by linker proteins. The linker proteins tend to have multiple functions, or are involved in multiple processes and have an above average probability of being essential. By splitting up the giant component at these linker proteins, we identify 265 subclusters that tend to have a homogeneous functional composition. The rare functional inhomogeneities in our subclusters reflect the mixing of different types of (molecular) functions in a single cellular process, exemplified by subclusters containing both metabolic enzymes as well as the transcription factors that regulate them. Comparative genome analysis, thus, allows identification of a level of functional interaction between that of pairwise interactions, and of the complete genome.
Affiliation(s)
- Berend Snel
- European Molecular Biology Laboratory, Meyerhofstrasse 1, 69117 Heidelberg, Germany.
|
41
|
Yu J, Hu S, Wang J, Wong GKS, Li S, Liu B, Deng Y, Dai L, Zhou Y, Zhang X, Cao M, Liu J, Sun J, Tang J, Chen Y, Huang X, Lin W, Ye C, Tong W, Cong L, Geng J, Han Y, Li L, Li W, Hu G, Huang X, Li W, Li J, Liu Z, Li L, Liu J, Qi Q, Liu J, Li L, Li T, Wang X, Lu H, Wu T, Zhu M, Ni P, Han H, Dong W, Ren X, Feng X, Cui P, Li X, Wang H, Xu X, Zhai W, Xu Z, Zhang J, He S, Zhang J, Xu J, Zhang K, Zheng X, Dong J, Zeng W, Tao L, Ye J, Tan J, Ren X, Chen X, He J, Liu D, Tian W, Tian C, Xia H, Bao Q, Li G, Gao H, Cao T, Wang J, Zhao W, Li P, Chen W, Wang X, Zhang Y, Hu J, Wang J, Liu S, Yang J, Zhang G, Xiong Y, Li Z, Mao L, Zhou C, Zhu Z, Chen R, Hao B, Zheng W, Chen S, Guo W, Li G, Liu S, Tao M, Wang J, Zhu L, Yuan L, Yang H. A draft sequence of the rice genome (Oryza sativa L. ssp. indica). Science 2002; 296:79-92. [PMID: 11935017 DOI: 10.1126/science.1068037] [Citation(s) in RCA: 1760] [Impact Index Per Article: 80.0] [Indexed: 11/02/2022]
Abstract
We have produced a draft sequence of the rice genome for the most widely cultivated subspecies in China, Oryza sativa L. ssp. indica, by whole-genome shotgun sequencing. The genome was 466 megabases in size, with an estimated 46,022 to 55,615 genes. Functional coverage in the assembled sequences was 92.0%. About 42.2% of the genome was in exact 20-nucleotide oligomer repeats, and most of the transposons were in the intergenic regions between genes. Although 80.6% of predicted Arabidopsis thaliana genes had a homolog in rice, only 49.4% of predicted rice genes had a homolog in A. thaliana. The large proportion of rice genes with no recognizable homologs is due to a gradient in the GC content of rice coding sequences.
MESH Headings
- Arabidopsis/genetics
- Base Composition
- Computational Biology
- Contig Mapping
- DNA Transposable Elements
- DNA, Intergenic
- DNA, Plant/chemistry
- DNA, Plant/genetics
- Databases, Nucleic Acid
- Exons
- Gene Duplication
- Genes, Plant
- Genome, Plant
- Genomics
- Introns
- Molecular Sequence Data
- Oryza/genetics
- Plant Proteins/chemistry
- Plant Proteins/genetics
- Polymorphism, Genetic
- Repetitive Sequences, Nucleic Acid
- Sequence Analysis, DNA
- Sequence Homology, Nucleic Acid
- Software
- Species Specificity
- Synteny
Affiliation(s)
- Jun Yu
- Beijing Genomics Institute/Center of Genomics and Bioinformatics, Chinese Academy of Sciences, Beijing 101300, China
|
42
|
Buchan DWA, Shepherd AJ, Lee D, Pearl FMG, Rison SCG, Thornton JM, Orengo CA. Gene3D: structural assignment for whole genes and genomes using the CATH domain structure database. Genome Res 2002; 12:503-14. [PMID: 11875040 PMCID: PMC155287 DOI: 10.1101/gr.213802] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.3] [Indexed: 11/25/2022]
Abstract
We present a novel web-based resource, Gene3D, of precalculated structural assignments to gene sequences and whole genomes. This resource assigns structural domains from the CATH database to whole genes and links these to their curated functional and structural annotations within the CATH domain structure database, the functional Dictionary of Homologous Superfamilies (DHS) and PDBsum. Currently Gene3D provides annotation for 36 complete genomes (two eukaryotes, six archaea, and 28 bacteria). On average, between 30% and 40% of the genes of a given genome can be structurally annotated. Matches to structural domains are found using the profile-based method (PSI-BLAST), and a novel protocol, DRange, is used to resolve conflicts in matches involving different homologous superfamilies.
Affiliation(s)
- Daniel W A Buchan
- Biomolecular Structure and Modelling Group, Department of Biochemistry and Molecular Biology, University College London, London, WC1E 6BT, United Kingdom
|
43
|
Martinez-Cruz LA, Dreyer MK, Boisvert DC, Yokota H, Martinez-Chantar ML, Kim R, Kim SH. Crystal structure of MJ1247 protein from M. jannaschii at 2.0 Å resolution infers a molecular function of 3-hexulose-6-phosphate isomerase. Structure 2002; 10:195-204. [PMID: 11839305 DOI: 10.1016/s0969-2126(02)00701-3] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.2] [Indexed: 11/19/2022]
Abstract
The crystal structure of the hypothetical protein MJ1247 from Methanococcus jannaschii at 2 Å resolution, a detailed sequence analysis, and biochemical assays infer its molecular function to be 3-hexulose-6-phosphate isomerase (PHI). In the dissimilatory ribulose monophosphate (RuMP) cycle, ribulose-5-phosphate is coupled to formaldehyde by the 3-hexulose-6-phosphate synthase (HPS), yielding hexulose-6-phosphate, which is then isomerized to fructose-6-phosphate by the enzyme 3-hexulose-6-phosphate isomerase. MJ1247 is an alpha/beta structure consisting of a five-stranded parallel beta sheet flanked on both sides by alpha helices, forming a three-layered alpha-beta-alpha sandwich. The fold represents the nucleotide binding motif of a flavodoxin type. MJ1247 is a tetramer in the crystal and in solution, and each monomer has a fold similar to the isomerase domain of glucosamine-6-phosphate synthase (GlmS).
Affiliation(s)
- Luis Alfonso Martinez-Cruz
- Physical Biosciences Division, Lawrence Berkeley National Laboratory and Department of Chemistry, University of California, Berkeley, 94720, USA.
|
44
|
Brooks DJ, Fresco JR. Increased frequency of cysteine, tyrosine, and phenylalanine residues since the last universal ancestor. Mol Cell Proteomics 2002; 1:125-31. [PMID: 12096130 DOI: 10.1074/mcp.m100001-mcp200] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.0] [Indexed: 11/06/2022] Open
Abstract
Analysis of extant proteomes has the potential of revealing how amino acid frequencies within proteins have evolved over biological time. Evidence is presented here that cysteine, tyrosine, and phenylalanine residues have substantially increased in frequency since the three primary lineages diverged more than three billion years ago. This inference was derived from a comparison of amino acid frequencies within conserved and non-conserved residues of a set of proteins dating to the last universal ancestor in the face of empirical knowledge of the relative mutability of these amino acids. The under-representation of these amino acids within last universal ancestor proteins relative to their modern descendants suggests their late introduction into the genetic code. Thus, it appears that extant ancient proteins contain evidence pertaining to early events in the formation of biological systems.
Affiliation(s)
- Dawn J Brooks
- Department of Molecular Biology, Princeton University, Princeton, New Jersey 08544, USA
|
45
|
Luscombe NM, Qian J, Zhang Z, Johnson T, Gerstein M. The dominance of the population by a selected few: power-law behaviour applies to a wide variety of genomic properties. Genome Biol 2002; 3:RESEARCH0040. [PMID: 12186647 PMCID: PMC126234 DOI: 10.1186/gb-2002-3-8-research0040] [Citation(s) in RCA: 74] [Impact Index Per Article: 3.4] [Received: 03/05/2002] [Revised: 04/19/2002] [Accepted: 05/21/2002] [Indexed: 11/24/2022] Open
Abstract
BACKGROUND The sequencing of genomes provides us with an inventory of the 'molecular parts' in nature, such as protein families and folds, and their functions in living organisms. Through the analysis of such inventories, it has been shown that different genomes have very different usage of parts; for example, the common folds in the worm are very different from those in Escherichia coli. RESULTS Despite these differences, we find that the genomic occurrence of generalized parts follows a well-known mathematical framework called the power law, with a few parts occurring many times and most occurring only a few times. This observation is true in a wide variety of genomic contexts. Earlier studies found power laws in a few specific cases, such as the occurrence of protein families. Here, we find many further cases of power-law behavior, for example in the occurrence of pseudogenes and in levels of gene expression. We show comprehensively that this behavior applies across many different genomes, for many different types of parts (DNA words, InterPro families, protein superfamilies and folds, pseudogene families and pseudomotifs), and for the many disparate attributes associated with these parts (their functions, interactions and expression levels). CONCLUSIONS Power-law behavior provides a concise mathematical description of an important biological feature: the sheer dominance of a few members over the overall population. We present this behavior in a unified framework and propose that all these observations are connected to an underlying DNA duplication process as genomes evolved to their current state.
Affiliation(s)
- Nicholas M Luscombe
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520-8114, USA
- Jiang Qian
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520-8114, USA
- Zhaolei Zhang
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520-8114, USA
- Ted Johnson
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520-8114, USA
- Mark Gerstein
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520-8114, USA
|
46
|
Liu Y, Engelman DM, Gerstein M. Genomic analysis of membrane protein families: abundance and conserved motifs. Genome Biol 2002; 3:research0054. [PMID: 12372142 PMCID: PMC134483 DOI: 10.1186/gb-2002-3-10-research0054] [Citation(s) in RCA: 51] [Impact Index Per Article: 2.3] [Received: 05/21/2002] [Revised: 07/26/2002] [Accepted: 08/07/2002] [Indexed: 11/17/2022] Open
Abstract
BACKGROUND Polytopic membrane proteins can be related to each other on the basis of the number of transmembrane helices and sequence similarities. Building on the Pfam classification of protein domain families, and using transmembrane-helix prediction and sequence-similarity searching, we identified a total of 526 well-characterized membrane protein families in 26 recently sequenced genomes. To this we added a clustering of a number of predicted but unclassified membrane proteins, resulting in a total of 637 membrane protein families. RESULTS Analysis of the occurrence and composition of these families revealed several interesting trends. The number of assigned membrane protein domains has an approximately linear relationship to the total number of open reading frames (ORFs) in 26 genomes studied. Caenorhabditis elegans is an apparent outlier, because of its high representation of seven-span transmembrane (7-TM) chemoreceptor families. In all genomes, including that of C. elegans, the number of distinct membrane protein families has a logarithmic relation to the number of ORFs. Glycine, proline, and tyrosine locations tend to be conserved in transmembrane regions within families, whereas isoleucine, valine, and methionine locations are relatively mutable. Analysis of motifs in putative transmembrane helices reveals that GxxxG and GxxxxxxG (which can be written GG4 and GG7, respectively; see Materials and methods) are among the most prevalent. This was noted in earlier studies; we now find these motifs are particularly well conserved in families, however, especially those corresponding to transporters, symporters, and channels. CONCLUSIONS We carried out a genome-wide analysis on patterns of the classified polytopic membrane protein families and analyzed the distribution of conserved amino acids and motifs in the transmembrane helix regions in these families.
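The GG4 and GG7 motifs named in the abstract are simple spacing patterns, so counting them in a helix sequence can be sketched with regular expressions. This is only an illustration of the pattern definitions (GxxxG and GxxxxxxG); it is not the authors' pipeline, and the function name and example sequence are hypothetical.

```python
import re

# GG4 = GxxxG (two glycines four positions apart);
# GG7 = GxxxxxxG (two glycines seven positions apart).
# A zero-width lookahead lets overlapping occurrences be counted.
GG4 = re.compile(r"(?=G.{3}G)")
GG7 = re.compile(r"(?=G.{6}G)")


def count_motifs(helix: str) -> dict:
    """Count overlapping GG4 and GG7 occurrences in one helix sequence."""
    return {
        "GG4": len(GG4.findall(helix)),
        "GG7": len(GG7.findall(helix)),
    }


# Hypothetical helix sequence, invented for illustration only.
example_helix = "LIGAVLGAAVGLLAG"
print(count_motifs(example_helix))
```

In a real analysis the input would be the predicted transmembrane segments rather than whole protein sequences, and counts would be compared against a background expectation before calling a motif prevalent.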
Affiliation(s)
- Yang Liu
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520-8114, USA
- Donald M Engelman
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520-8114, USA
- Mark Gerstein
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520-8114, USA
|
47
|
Qian J, Luscombe NM, Gerstein M. Protein family and fold occurrence in genomes: power-law behaviour and evolutionary model. J Mol Biol 2001; 313:673-81. [PMID: 11697896 DOI: 10.1006/jmbi.2001.5079] [Citation(s) in RCA: 206] [Impact Index Per Article: 9.0] [Indexed: 11/22/2022]
Abstract
Global surveys of genomes measure the usage of essential molecular parts, defined here as protein families, superfamilies or folds, in different organisms. Based on surveys of the first 20 completely sequenced genomes, we observe that the occurrence of these parts follows a power-law distribution. That is, the number of distinct parts (F) with a given genomic occurrence (V) decays as $F = aV^{-b}$, with a few parts occurring many times and most occurring infrequently. For a given organism, the distributions of families, superfamilies and folds are nearly identical, and this is reflected in the size of the decay exponent b. Moreover, the exponent varies between different organisms, with those of smaller genomes displaying a steeper decay (i.e. larger b). Clearly, the power law indicates a preference to duplicate genes that encode for molecular parts which are already common. Here, we present a minimal, but biologically meaningful model that accurately describes the observed power law. Although the model performs equally well for all three protein classes, we focus on the occurrence of folds in preference to families and superfamilies. This is because folds are comparatively insensitive to the effects of point mutations that can cause a family member to diverge beyond detectable similarity. In the model, genomes evolve through two basic operations: (i) duplication of existing genes; (ii) net flow of new genes. The flow term is closely related to the exponent b and can accommodate considerable gene loss; however, we demonstrate that the observed data is reproduced best with a net inflow, i.e. with more gene gain than loss. Moreover, we show that prokaryotes have much higher rates of gene acquisition than eukaryotes, probably reflecting lateral transfer. A further natural outcome from our model is an estimation of the fold composition of the initial genome, which potentially relates to the common ancestor for modern organisms.
Supplementary material pertaining to this work is available from www.partslist.org/powerlaw.
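The power-law relation between the number of distinct parts F and their genomic occurrence V becomes a straight line after taking logarithms, log F = log a - b log V, so a and b can be estimated by ordinary least squares on the log-log values. The sketch below shows that generic approach on synthetic data; it is an assumption chosen for illustration and is not necessarily the fitting procedure used in the paper.

```python
import math


def fit_power_law(occurrences, counts):
    """Estimate (a, b) in F = a * V**(-b) by least squares on log-log data.

    occurrences: genomic occurrence values V (all > 0)
    counts:      number of distinct parts F at each V (all > 0)
    """
    xs = [math.log(v) for v in occurrences]
    ys = [math.log(f) for f in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # slope of the log-log line is -b
    a = math.exp(mean_y - slope * mean_x)  # intercept of the line is log a
    return a, -slope


# Synthetic data generated from F = 100 * V**-2 (not taken from the paper).
V = [1, 2, 4, 8]
F = [100.0 * v ** -2 for v in V]
a, b = fit_power_law(V, F)
print(f"a = {a:.2f}, b = {b:.2f}")
```

With real survey data the low-count tail is noisy, so log-log regression is best read as a quick estimate of the exponent rather than a rigorous fit.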
Affiliation(s)
- J Qian
- Department of Molecular Biophysics and Biochemistry, Yale University, 266 Whitney Avenue, New Haven, CT 06520-8114, USA
|
48
|
Abstract
In the post-genomic era, the new discipline of functional genomics is now facing the challenge of associating a function (as well as estimating its relevance to industrial applications) to about 100,000 microbial, plant or animal genes of known sequence but unknown function. Besides the design of databases, computational methods are increasingly becoming intimately linked with the various experimental approaches. Consequently, bioinformatics is rapidly evolving into independent fields addressing the specific problems of interpreting i) genomic sequences, ii) protein sequences and 3D-structures, as well as iii) transcriptome and macromolecular interaction data. It is thus increasingly difficult for the biologist to choose the computational approaches that perform best in these various areas. This paper attempts to review the most useful developments of the last 2 years.
Affiliation(s)
- J M Claverie
- Structural and Genetic Information Laboratory, UMR 1889 CNRS-AVENTIS, 31 Chemin Joseph Aiguier, 13402 Marseille Cedex 20, France.
|
49
|
Abstract
Several technical, social, and biological networks were recently found to demonstrate scale-free and small-world behavior instead of random graph characteristics. In this work, the topology of protein domain networks generated with data from the ProDom, Pfam, and Prosite domain databases was studied. It was found that these networks exhibited small-world and scale-free topologies with a high degree of local clustering accompanied by a few long-distance connections. Moreover, these observations apply not only to the complete databases, but also to the domain distributions in proteomes of different organisms. The extent of connectivity among domains reflects the evolutionary complexity of the organisms considered.
Affiliation(s)
- S Wuchty
- European Media Laboratory, Heidelberg, Germany.
|
50
|
Nakajima T, Matsumoto K, Suto H, Tanaka K, Ebisawa M, Tomita H, Yuki K, Katsunuma T, Akasawa A, Hashida R, Sugita Y, Ogawa H, Ra C, Saito H. Gene expression screening of human mast cells and eosinophils using high-density oligonucleotide probe arrays: abundant expression of major basic protein in mast cells. Blood 2001; 98:1127-34. [PMID: 11493461 DOI: 10.1182/blood.v98.4.1127] [Citation(s) in RCA: 71] [Impact Index Per Article: 3.1] [Indexed: 12/23/2022] Open
Abstract
Mast cells (MCs) and eosinophils are thought to play important roles in evoking allergic inflammation. Cell-type-specific gene expression was screened among 12,000 genes in human MCs and eosinophils with the use of high-density oligonucleotide probe arrays. In comparison with other leukocytes, MCs expressed 140 cell-type-specific transcripts, whereas eosinophils expressed only 34. Among the transcripts for expected MC-specific proteins such as tryptase, major basic protein (MBP), which had been thought to be eosinophil specific, was ranked fourth in terms of amounts of increased MC-specific messenger RNA. Mature eosinophils were almost lacking this transcript. MCs obtained from 4 different sources (ie, lung, skin, adult peripheral blood progenitor-derived and cord blood progenitor-derived MCs, and eosinophils) were found to have high protein levels of MBP in their granules with the use of flow cytometric and confocal laser scanning microscopic analyses. The present finding that MCs can produce abundant MBP is crucial because many reports regarding allergic pathogenesis have been based on earlier findings that MBP was almost unique to eosinophils and not produced by MCs. (Blood. 2001;98:1127-1134)
Affiliation(s)
- T Nakajima
- Department of Allergy & Immunology, National Children's Medical Research Center, Tokyo, Japan
|