1
|
Molecular Interaction Search Tool (MIST): an integrated resource for mining gene and protein interaction data. Nucleic Acids Res 2019; 46:D567-D574. [PMID: 29155944 PMCID: PMC5753374 DOI: 10.1093/nar/gkx1116] [Citation(s) in RCA: 49] [Impact Index Per Article: 9.8] [Received: 08/14/2017] [Accepted: 10/25/2017] [Indexed: 12/16/2022]
Abstract
Model organism and human databases are rich with information about genetic and physical interactions. These data can be used to interpret and guide the analysis of results from new studies and develop new hypotheses. Here, we report the development of the Molecular Interaction Search Tool (MIST; http://fgrtools.hms.harvard.edu/MIST/). The MIST database integrates biological interaction data from yeast, nematode, fly, zebrafish, frog, rat and mouse model systems, as well as human. For individual or short gene lists, the MIST user interface can be used to identify interacting partners based on protein–protein and genetic interaction (GI) data from the species of interest as well as inferred interactions, known as interologs, and to view a corresponding network. The data, interologs and search tools at MIST are also useful for analyzing ‘omics datasets. In addition to describing the integrated database, we also demonstrate how MIST can be used to identify an appropriate cut-off value that balances false positive and negative discovery, and present use-cases for additional types of analysis. Altogether, the MIST database and search tools support visualization and navigation of existing protein and GI data, as well as comparison of new and existing data.
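The interolog idea mentioned in this abstract can be sketched in a few lines: an interaction observed between two genes in one species is transferred to another species when both partners have orthologs there. This is only a minimal illustration under assumed inputs. The gene identifiers, the ortholog map, and the function name `infer_interologs` are hypothetical and are not taken from MIST itself.

```python
from itertools import product


def infer_interologs(interactions, orthologs):
    """Map source-species interactions onto a target species.

    interactions: iterable of (gene_a, gene_b) pairs in the source species.
    orthologs: dict mapping a source gene to a set of target-species genes.
    Returns a set of frozensets, each an inferred target-species pair.
    """
    inferred = set()
    for a, b in interactions:
        # Every combination of orthologs of a and b is a candidate interolog.
        for ta, tb in product(orthologs.get(a, ()), orthologs.get(b, ())):
            if ta != tb:  # skip self-pairs created by shared orthologs
                inferred.add(frozenset((ta, tb)))
    return inferred


# Toy data with hypothetical identifiers.
fly_ppi = [("flyA", "flyB"), ("flyB", "flyC")]
fly_to_human = {"flyA": {"HUM1"}, "flyB": {"HUM2", "HUM3"}}
human_interologs = infer_interologs(fly_ppi, fly_to_human)
```

In this toy case the flyB–flyC interaction yields nothing because flyC has no mapped ortholog, which shows how interolog coverage depends on the ortholog map.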
|
2
|
Proteomic and Metabolomic Characterization of a Mammalian Cellular Transition from Quiescence to Proliferation. Cell Rep 2018; 20:721-736. [PMID: 28723573 DOI: 10.1016/j.celrep.2017.06.074] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 03/20/2017] [Revised: 05/22/2017] [Accepted: 06/25/2017] [Indexed: 12/28/2022]
Abstract
There exist similarities and differences in metabolism and physiology between normal proliferative cells and tumor cells. Once a cell enters the cell cycle, metabolic machinery is engaged to facilitate various processes. The kinetics and regulation of these metabolic changes have not been properly evaluated. To correlate the orchestration of these processes with the cell cycle, we analyzed the transition from quiescence to proliferation of a non-malignant murine pro-B lymphocyte cell line in response to IL-3. Using multiplex mass-spectrometry-based proteomics, we show that the transition to proliferation shares features generally attributed to cancer cells: upregulation of glycolysis, lipid metabolism, amino-acid synthesis, and nucleotide synthesis and downregulation of oxidative phosphorylation and the urea cycle. Furthermore, metabolomic profiling of this transition reveals similarities to cancer-related metabolic pathways. In particular, we find that methionine is consumed at a higher rate than that of other essential amino acids, with a potential link to maintenance of the epigenome.
|
3
|
Abstract
Normal cellular functioning is maintained by macromolecular machines that control both core and specialized molecular tasks. These machines are in large part multi-subunit protein complexes that undergo regulation at multiple levels, from expression of requisite components to a vast array of post-translational modifications (PTMs). PTMs such as phosphorylation, ubiquitination, and acetylation currently number more than 200,000 in the human proteome and function within all molecular pathways. Here we provide a framework for systematically studying these PTMs in the context of global protein-protein interaction networks. This analytical framework allows insight into which functions specific PTMs tend to cluster in, and furthermore which complexes either single or multiple PTM signaling pathways converge on.
|
4
|
FlyRNAi.org-the database of the Drosophila RNAi screening center and transgenic RNAi project: 2017 update. Nucleic Acids Res 2016; 45:D672-D678. [PMID: 27924039 PMCID: PMC5210654 DOI: 10.1093/nar/gkw977] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Received: 08/15/2016] [Revised: 10/05/2016] [Accepted: 10/11/2016] [Indexed: 12/11/2022]
Abstract
The FlyRNAi database of the Drosophila RNAi Screening Center (DRSC) and Transgenic RNAi Project (TRiP) at Harvard Medical School and associated DRSC/TRiP Functional Genomics Resources website (http://fgr.hms.harvard.edu) serve as a reagent production tracking system, screen data repository, and portal to the community. Through this portal, we make available protocols, online tools, and other resources useful to researchers at all stages of high-throughput functional genomics screening, from assay design and reagent identification to data analysis and interpretation. In this update, we describe recent changes and additions to our website, database and suite of online tools. Recent changes reflect a shift in our focus from a single technology (RNAi) and model species (Drosophila) to the application of additional technologies (e.g. CRISPR) and support of integrated, cross-species approaches to uncovering gene function using functional genomics and other approaches.
|
5
|
Separation and identification of phenolic acid and flavonoids from Nerium indicum flowers. Indian J Pharm Sci 2015; 77:91-5. [PMID: 25767323 PMCID: PMC4355888 DOI: 10.4103/0250-474x.151603] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 02/05/2014] [Revised: 10/24/2014] [Accepted: 01/22/2015] [Indexed: 11/25/2022]
Abstract
Four major compounds were separated and identified from the methanol extracts of Nerium indicum flowers (Arali) using HPLC and mass spectral data. From the mass data, the chemical structures were elucidated as: trans-5-O-caffeoylquinic acid (1), quercetin-3-O-rutinoside (2), luteolin-5-O-rutinoside (3) and luteolin-7-O-rutinoside (4). In addition, the cis isomers of 5-O-caffeoylquinic acid in Nerium indicum flowers were confirmed by mass spectrometry, HPLC and UV. The structures of these compounds were confirmed with the help of liquid chromatography mass spectrometry.
|
6
|
Combining genetic perturbations and proteomics to examine kinase-phosphatase networks in Drosophila embryos. Dev Cell 2014; 31:114-27. [PMID: 25284370 DOI: 10.1016/j.devcel.2014.07.027] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Received: 02/28/2014] [Revised: 06/24/2014] [Accepted: 07/28/2014] [Indexed: 02/07/2023]
Abstract
Connecting phosphorylation events to kinases and phosphatases is key to understanding the molecular organization and signaling dynamics of networks. We have generated a validated set of transgenic RNA-interference reagents for knockdown and characterization of all protein kinases and phosphatases present during early Drosophila melanogaster development. These genetic tools enable collection of sufficient quantities of embryos depleted of single gene products for proteomics. As a demonstration of an application of the collection, we have used multiplexed isobaric labeling for quantitative proteomics to derive global phosphorylation signatures associated with kinase-depleted embryos to systematically link phosphosites with relevant kinases. We demonstrate how this strategy uncovers kinase consensus motifs and prioritizes phosphoproteins for kinase target validation. We validate this approach by providing auxiliary evidence for Wee kinase-directed regulation of the chromatin regulator Stonewall. Further, we show how correlative phosphorylation at the site level can indicate function, as exemplified by Sterile20-like kinase-dependent regulation of Stat92E.
|
7
|
Erratum: Integrating protein-protein interaction networks with phenotypes reveals signs of interactions. Nat Methods 2014. [DOI: 10.1038/nmeth0714-773a] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/02/2023]
|
8
|
A rapid genome-wide microRNA screen identifies miR-14 as a modulator of Hedgehog signaling. Cell Rep 2014; 7:2066-77. [PMID: 24931604 DOI: 10.1016/j.celrep.2014.05.025] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 02/25/2014] [Revised: 05/07/2014] [Accepted: 05/12/2014] [Indexed: 12/21/2022]
Abstract
MicroRNAs (miRNAs) are small noncoding RNAs that regulate gene expression by binding to sequences within the 3' UTR of mRNAs. Because miRNAs bind to short sequences with partial complementarity, target identification is challenging. To complement the existing target prediction algorithms, we devised a systematic "reverse approach" screening platform that allows the empirical prediction of miRNA-target interactions. Using Drosophila cells, we screened the 3' untranslated regions (3' UTRs) of the Hedgehog pathway genes against a genome-wide miRNA library and identified both predicted and many nonpredicted miRNA-target interactions. We demonstrate that miR-14 is essential for maintaining the proper level of Hedgehog signaling activity by regulating its physiological target, hedgehog. Furthermore, elevated levels of miR-14 suppress Hedgehog signaling activity by cotargeting its apparent nonphysiological targets, patched and smoothened. Altogether, our systematic screening platform is a powerful approach to identifying both physiological and apparent nonphysiological targets of miRNAs, which are relevant in both normal and diseased tissues.
|
9
|
Abstract
The Hippo pathway controls metazoan organ growth by regulating cell proliferation and apoptosis. Many components have been identified, but our knowledge of the composition and structure of this pathway is still incomplete. Using existing pathway components as baits, we generated by mass spectrometry a high-confidence Drosophila Hippo protein-protein interaction network (Hippo-PPIN) consisting of 153 proteins and 204 interactions. Depletion of 67% of the proteins by RNA interference regulated the transcriptional coactivator Yorkie (Yki) either positively or negatively. We selected for further characterization a new member of the alpha-arrestin family, Leash, and show that it promotes degradation of Yki through the lysosomal pathway. Given the importance of the Hippo pathway in tumor development, the Hippo-PPIN will contribute to our understanding of this network in both normal growth and cancer.
|
10
|
Abstract
Regulation of cell growth is a fundamental process in development and disease that integrates a vast array of extra- and intracellular information. A central player in this process is RNA polymerase I (Pol I), which transcribes ribosomal RNA (rRNA) genes in the nucleolus. Rapidly growing cancer cells are characterized by increased Pol I-mediated transcription and, consequently, nucleolar hypertrophy. To map the genetic network underlying the regulation of nucleolar size and of Pol I-mediated transcription, we performed comparative, genome-wide loss-of-function analyses of nucleolar size in Saccharomyces cerevisiae and Drosophila melanogaster coupled with mass spectrometry-based analyses of the ribosomal DNA (rDNA) promoter. With this approach, we identified a set of conserved and nonconserved molecular complexes that control nucleolar size. Furthermore, we characterized a direct role of the histone information regulator (HIR) complex in repressing rRNA transcription in yeast. Our study provides a full-genome, cross-species analysis of a nuclear subcompartment and shows that this approach can identify conserved molecular modules.
|
11
|
Abstract
Phosphate is required for many important cellular processes and having too little phosphate or too much can cause disease and reduce life span in humans. However, the mechanisms underlying homeostatic control of extracellular phosphate levels and cellular effects of phosphate are poorly understood. Here, we establish Drosophila melanogaster as a model system for the study of phosphate effects. We found that Drosophila larval development depends on the availability of phosphate in the medium. Conversely, life span is reduced when adult flies are cultured on high phosphate medium or when hemolymph phosphate is increased in flies with impaired Malpighian tubules. In addition, RNAi-mediated inhibition of MAPK-signaling by knockdown of Ras85D, phl/D-Raf or Dsor1/MEK affects larval development, adult life span and hemolymph phosphate, suggesting that some in vivo effects involve activation of this signaling pathway by phosphate. To identify novel genetic determinants of phosphate responses, we used Drosophila hemocyte-like cultured cells (S2R+) to perform a genome-wide RNAi screen using MAPK activation as the readout. We identified a number of candidate genes potentially important for the cellular response to phosphate. Evaluation of 51 genes in live flies revealed some that affect larval development, adult life span and hemolymph phosphate levels.
|
12
|
Abstract
Analysis of high-throughput data increasingly relies on pathway annotation and functional information derived from Gene Ontology. This approach has limitations, in particular for the analysis of network dynamics over time or under different experimental conditions, in which modules within a network rather than complete pathways might respond and change. We report an analysis framework based on protein complexes, which are at the core of network reorganization. We generated a protein complex resource for human, Drosophila, and yeast from the literature and databases of protein-protein interaction networks, with each species having thousands of complexes. We developed COMPLEAT (http://www.flyrnai.org/compleat), a tool for data mining and visualization for complex-based analysis of high-throughput data sets, as well as analysis and integration of heterogeneous proteomics and gene expression data sets. With COMPLEAT, we identified dynamically regulated protein complexes among genome-wide RNA interference data sets that used the abundance of phosphorylated extracellular signal-regulated kinase in cells stimulated with either insulin or epidermal growth factor as the output. The analysis predicted that the Brahma complex participated in the insulin response.
|
13
|
Identification of human proteins that modify misfolding and proteotoxicity of pathogenic ataxin-1. PLoS Genet 2012; 8:e1002897. [PMID: 22916034 PMCID: PMC3420947 DOI: 10.1371/journal.pgen.1002897] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Received: 09/06/2011] [Accepted: 07/02/2012] [Indexed: 02/06/2023]
Abstract
Proteins with long, pathogenic polyglutamine (polyQ) sequences have an enhanced propensity to spontaneously misfold and self-assemble into insoluble protein aggregates. Here, we have identified 21 human proteins that influence polyQ-induced ataxin-1 misfolding and proteotoxicity in cell model systems. By analyzing the protein sequences of these modifiers, we discovered a recurrent presence of coiled-coil (CC) domains in ataxin-1 toxicity enhancers, while such domains were not present in suppressors. This suggests that CC domains contribute to the aggregation- and toxicity-promoting effects of modifiers in mammalian cells. We found that the ataxin-1-interacting protein MED15, computationally predicted to possess an N-terminal CC domain, enhances spontaneous ataxin-1 aggregation in cell-based assays, while no such effect was observed with the truncated protein MED15ΔCC, lacking such a domain. Studies with recombinant proteins confirmed these results and demonstrated that the N-terminal CC domain of MED15 (MED15CC) per se is sufficient to promote spontaneous ataxin-1 aggregation in vitro. Moreover, we observed that a hybrid Pum1 protein harboring the MED15CC domain promotes ataxin-1 aggregation in cell model systems. In strong contrast, wild-type Pum1 lacking a CC domain did not stimulate ataxin-1 polymerization. These results suggest that proteins with CC domains are potent enhancers of polyQ-mediated protein misfolding and aggregation in vitro and in vivo.
|
14
|
HIPPIE: Integrating protein interaction networks with experiment based quality scores. PLoS One 2012; 7:e31826. [PMID: 22348130 PMCID: PMC3279424 DOI: 10.1371/journal.pone.0031826] [Citation(s) in RCA: 225] [Impact Index Per Article: 18.8] [Received: 10/18/2011] [Accepted: 01/12/2012] [Indexed: 01/03/2023]
Abstract
Protein function is often modulated by protein-protein interactions (PPIs) and therefore defining the partners of a protein helps to understand its activity. PPIs can be detected through different experimental approaches and are collected in several expert curated databases. These databases are used by researchers interested in examining detailed information on particular proteins. In many analyses the reliability of the characterization of the interactions becomes important and it might be necessary to select sets of PPIs of different confidence levels. To this goal, we generated HIPPIE (Human Integrated Protein-Protein Interaction rEference), a human PPI dataset with a normalized scoring scheme that integrates multiple experimental PPI datasets. HIPPIE's scoring scheme has been optimized by human experts and a computer algorithm to reflect the amount and quality of evidence for a given PPI and we show that these scores correlate to the quality of the experimental characterization. The HIPPIE web tool (available at http://cbdm.mdc-berlin.de/tools/hippie) allows researchers to do network analyses focused on likely true PPI sets by generating subnetworks around proteins of interest at a specified confidence level.
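The last step described in this abstract, generating a subnetwork around proteins of interest at a chosen confidence level, can be illustrated with a small filter. This is a sketch only: the edge list, the protein names, the assumption that scores lie between 0 and 1, and the function name `subnetwork` are all hypothetical and do not reproduce the HIPPIE web tool.

```python
def subnetwork(scored_edges, seeds, min_score):
    """Return edges that touch a seed protein and meet the score cut-off.

    scored_edges: iterable of (protein_a, protein_b, score) tuples.
    seeds: proteins of interest.
    min_score: minimum confidence score an edge must reach (assumed 0..1).
    """
    seeds = set(seeds)
    return [
        (a, b, s)
        for a, b, s in scored_edges
        if s >= min_score and (a in seeds or b in seeds)
    ]


# Toy scored interactions with hypothetical identifiers.
edges = [
    ("P1", "P2", 0.9),
    ("P1", "P3", 0.4),
    ("P2", "P4", 0.8),
    ("P3", "P4", 0.95),
]
high_conf = subnetwork(edges, seeds=["P1"], min_score=0.7)
```

Raising `min_score` trades coverage for reliability: here the low-scoring P1–P3 edge is dropped, and edges that do not involve a seed protein are excluded regardless of score.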
|
15
|
A directed protein interaction network for investigating intracellular signal transduction. Sci Signal 2011; 4:rs8. [PMID: 21900206 DOI: 10.1126/scisignal.2001699] [Citation(s) in RCA: 245] [Impact Index Per Article: 18.8] [Indexed: 11/02/2022]
Abstract
Cellular signal transduction is a complex process involving protein-protein interactions (PPIs) that transmit information. For example, signals from the plasma membrane may be transduced to transcription factors to regulate gene expression. To obtain a global view of cellular signaling and to predict potential signal modulators, we searched for protein interaction partners of more than 450 signaling-related proteins by means of automated yeast two-hybrid interaction mating. The resulting PPI network connected 1126 proteins through 2626 PPIs. After expansion of this interaction map with publicly available PPI data, we generated a directed network resembling the signal transduction flow between proteins with a naïve Bayesian classifier. We exploited information on the shortest PPI paths from membrane receptors to transcription factors to predict input and output relationships between interacting proteins. Integration of directed PPI with time-resolved protein phosphorylation data revealed network structures that dynamically conveyed information from the activated epidermal growth factor and extracellular signal-regulated kinase (EGF/ERK) signaling cascade to directly associated proteins and more distant proteins in the network. From the model network, we predicted 18 previously unknown modulators of EGF/ERK signaling, which we validated in mammalian cell-based assays. This generic experimental and computational approach provides a framework for elucidating causal connections between signaling proteins and facilitates the identification of proteins that modulate the flow of information in signaling networks.
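The abstract mentions using shortest PPI paths from membrane receptors to transcription factors. The sketch below illustrates only that path-finding step with a breadth-first search on a toy undirected network. It does not implement the naïve Bayesian classifier the authors describe, and the node names, the adjacency structure, and the function name `shortest_path` are hypothetical.

```python
from collections import deque


def shortest_path(adjacency, source, target):
    """Breadth-first search for one shortest path in an unweighted graph.

    adjacency: dict mapping each protein to a list of interaction partners.
    Returns the path as a list of proteins, or None if target is unreachable.
    """
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk back through the parent links to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in adjacency.get(node, ()):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


# Toy undirected PPI network with hypothetical protein names.
ppi = {
    "Receptor": ["Adaptor"],
    "Adaptor": ["Receptor", "Kinase", "Other"],
    "Kinase": ["Adaptor", "TF"],
    "Other": ["Adaptor"],
    "TF": ["Kinase"],
}
path = shortest_path(ppi, "Receptor", "TF")
# Consecutive proteins on the path suggest a candidate direction of signal flow.
directed_edges = list(zip(path, path[1:]))
```

Orienting edges along receptor-to-transcription-factor paths is one simple source of direction evidence; a real analysis would combine many such paths with other features.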
|
16
|
Proteomic and functional genomic landscape of receptor tyrosine kinase and ras to extracellular signal-regulated kinase signaling. Sci Signal 2011; 4:rs10. [PMID: 22028469 DOI: 10.1126/scisignal.2002029] [Citation(s) in RCA: 67] [Impact Index Per Article: 5.2] [Indexed: 12/25/2022]
Abstract
Characterizing the extent and logic of signaling networks is essential to understanding specificity in such physiological and pathophysiological contexts as cell fate decisions and mechanisms of oncogenesis and resistance to chemotherapy. Cell-based RNA interference (RNAi) screens enable the inference of large numbers of genes that regulate signaling pathways, but these screens cannot provide network structure directly. We describe an integrated network around the canonical receptor tyrosine kinase (RTK)-Ras-extracellular signal-regulated kinase (ERK) signaling pathway, generated by combining parallel genome-wide RNAi screens with protein-protein interaction (PPI) mapping by tandem affinity purification-mass spectrometry. We found that only a small fraction of the total number of PPI or RNAi screen hits was isolated under all conditions tested and that most of these represented the known canonical pathway components, suggesting that much of the core canonical ERK pathway is known. Because most of the newly identified regulators are likely cell type- and RTK-specific, our analysis provides a resource for understanding how output through this clinically relevant pathway is regulated in different contexts. We report in vivo roles for several of the previously unknown regulators, including CG10289 and PpV, the Drosophila orthologs of two components of the serine/threonine-protein phosphatase 6 complex; the Drosophila ortholog of TepIV, a glycophosphatidylinositol-linked protein mutated in human cancers; CG6453, a noncatalytic subunit of glucosidase II; and Rtf1, a histone methyltransferase.
|
17
|
An integrative approach to ortholog prediction for disease-focused and other functional studies. BMC Bioinformatics 2011; 12:357. [PMID: 21880147 PMCID: PMC3179972 DOI: 10.1186/1471-2105-12-357] [Citation(s) in RCA: 471] [Impact Index Per Article: 36.2] [Received: 05/11/2011] [Accepted: 08/31/2011] [Indexed: 12/12/2022]
Abstract
Background: Mapping of orthologous genes among species serves an important role in functional genomics by allowing researchers to develop hypotheses about gene function in one species based on what is known about the functions of orthologs in other species. Several tools for predicting orthologous gene relationships are available. However, these tools can give different results and identification of predicted orthologs is not always straightforward. Results: We report a simple but effective tool, the Drosophila RNAi Screening Center Integrative Ortholog Prediction Tool (DIOPT; http://www.flyrnai.org/diopt), for rapid identification of orthologs. DIOPT integrates existing approaches, facilitating rapid identification of orthologs among human, mouse, zebrafish, C. elegans, Drosophila, and S. cerevisiae. As compared to individual tools, DIOPT shows increased sensitivity with only a modest decrease in specificity. Moreover, the flexibility built into the DIOPT graphical user interface allows researchers with different goals to appropriately 'cast a wide net' or limit results to highest confidence predictions. DIOPT also displays protein and domain alignments, including percent amino acid identity, for predicted ortholog pairs. This helps users identify the most appropriate matches among multiple possible orthologs. To facilitate using model organisms for functional analysis of human disease-associated genes, we used DIOPT to predict high-confidence orthologs of disease genes in Online Mendelian Inheritance in Man (OMIM) and genes in genome-wide association study (GWAS) data sets. The results are accessible through the DIOPT diseases and traits query tool (DIOPT-DIST; http://www.flyrnai.org/diopt-dist). Conclusions: DIOPT and DIOPT-DIST are useful resources for researchers working with model organisms, especially those who are interested in exploiting model organisms such as Drosophila to study the functions of human disease genes.
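One simple way to integrate the output of several ortholog prediction tools, and to let a user either 'cast a wide net' or keep only well-supported pairs, is to count how many tools report each pair. The sketch below assumes that approach for illustration. The abstract does not specify DIOPT's actual scoring, and the tool names, gene identifiers, and function names here are hypothetical.

```python
from collections import Counter


def vote_orthologs(predictions):
    """Count how many tools support each (source_gene, target_gene) pair.

    predictions: dict mapping a tool name to a list of predicted pairs.
    """
    counts = Counter()
    for pairs in predictions.values():
        # set() ensures a tool counts once even if it lists a pair twice.
        for pair in set(pairs):
            counts[pair] += 1
    return counts


def filter_by_support(counts, min_tools):
    """Keep pairs reported by at least min_tools tools."""
    return {pair for pair, n in counts.items() if n >= min_tools}


# Toy predictions from three hypothetical tools.
predictions = {
    "toolA": [("fly1", "HUM1"), ("fly1", "HUM2")],
    "toolB": [("fly1", "HUM1")],
    "toolC": [("fly1", "HUM1"), ("fly2", "HUM3")],
}
counts = vote_orthologs(predictions)
wide_net = filter_by_support(counts, min_tools=1)
high_confidence = filter_by_support(counts, min_tools=3)
```

A low `min_tools` favors sensitivity, while a high value favors specificity, mirroring the trade-off the abstract describes.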
|
18
|
Abstract
Several attempts have been made at systematically mapping protein-protein interaction, or “interactome” networks. However, it remains difficult to assess the quality and coverage of existing datasets. We describe a framework that uses an empirically-based approach to rigorously dissect quality parameters of currently available human interactome maps. Our results indicate that high-throughput yeast two-hybrid (HT-Y2H) interactions for human are superior in precision to literature-curated interactions supported by only a single publication, suggesting that HT-Y2H is suitable to map a significant portion of the human interactome. We estimate that the human interactome contains ~130,000 binary interactions, most of which remain to be mapped. Similar to estimates of DNA sequence data quality and genome size early in the human genome project, estimates of protein interaction data quality and interactome size are critical to establish the magnitude of the task of comprehensive human interactome mapping and to illuminate a path towards this goal.
|
19
|
Native and modeled disulfide bonds in proteins: knowledge-based approaches toward structure prediction of disulfide-rich polypeptides. Proteins 2006; 58:866-79. [PMID: 15645448 DOI: 10.1002/prot.20369] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Indexed: 11/07/2022]
Abstract
Structure prediction and three-dimensional modeling of disulfide-rich systems are challenging due to the limited number of such folds in the structural databank. We exploit the stereochemical compatibility of substructures in known protein structures to accommodate disulfide bonds in predicting the structures of disulfide-rich polypeptides directly from disulfide connectivity pattern and amino acid sequence in the absence of structural homologs and any other structural information. This knowledge-based approach is illustrated using structure prediction of 40 nonredundant bioactive disulfide-rich polypeptides such as toxins, growth factors, and endothelins available in the structural databank. The polypeptide conformation could be predicted in 35 out of 40 nonredundant entries (87%). Nonhomologous templates could be identified and models could be obtained within 2 Å deviation from the query in 29 peptides (72%). This procedure can be accessed from the World Wide Web (http://www.ncbs.res.in/~faculty/mini/dsdbase/dsdbase.html).
|
20
|
Genomic analysis of Xenopus organizer function. BMC Dev Biol 2006; 6:27. [PMID: 16756679 PMCID: PMC1513553 DOI: 10.1186/1471-213x-6-27] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Received: 02/06/2006] [Accepted: 06/06/2006] [Indexed: 11/15/2022]
Abstract
Background: Studies of the Xenopus organizer have laid the foundation for our understanding of the conserved signaling pathways that pattern vertebrate embryos during gastrulation. The two primary activities of the organizer, BMP and Wnt inhibition, can regulate a spectrum of genes that pattern essentially all aspects of the embryo during gastrulation. As our knowledge of organizer signaling grows, it is imperative that we begin knitting together our gene-level knowledge into genome-level signaling models. The goal of this paper was to identify complete lists of genes regulated by different aspects of organizer signaling, thereby providing a deeper understanding of the genomic mechanisms that underlie these complex and fundamental signaling events. Results: To this end, we ectopically overexpress Noggin and Dkk-1, inhibitors of the BMP and Wnt pathways, respectively, within ventral tissues. After isolating embryonic ventral halves at early and late gastrulation, we analyze the transcriptional response to these molecules within the generated ectopic organizers using oligonucleotide microarrays. An efficient statistical analysis scheme, combined with a new Gene Ontology biological process annotation of the Xenopus genome, allows reliable and faithful clustering of molecules based upon their roles during gastrulation. From this data, we identify new organizer-related expression patterns for 19 genes. Moreover, our data sub-divides organizer genes into separate head and trunk organizing groups, which each show distinct responses to Noggin and Dkk-1 activity during gastrulation. Conclusion: Our data provides a genomic view of the cohorts of genes that respond to Noggin and Dkk-1 activity, allowing us to separate the role of each in organizer function. These patterns demonstrate a model where BMP inhibition plays a largely inductive role during early developmental stages, thereby initiating the suites of genes needed to pattern dorsal tissues. Meanwhile, Wnt inhibition acts later during gastrulation, and is essential for maintenance of organizer gene expression throughout gastrulation, a role which may depend on its ability to block the expression of a host of ventral, posterior, and lateral fate-specifying factors.
|
21
|
GOPET: a tool for automated predictions of Gene Ontology terms. BMC Bioinformatics 2006; 7:161. [PMID: 16549020 PMCID: PMC1434778 DOI: 10.1186/1471-2105-7-161] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.3] [Received: 10/07/2005] [Accepted: 03/20/2006] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Vast progress in sequencing projects has called for annotation on a large scale. A number of methods have been developed to address this challenging task. These methods, however, either apply to specific subsets, or their predictions are not formalised, or they do not provide precise confidence values for their predictions. DESCRIPTION: We recently established a learning system for automated annotation, trained with a broad variety of different organisms to predict the standardised annotation terms from Gene Ontology (GO). Now, this method has been made available to the public via our web-service GOPET (Gene Ontology term Prediction and Evaluation Tool). It supplies annotation for sequences of any organism. For each predicted term an appropriate confidence value is provided. The basic method had been developed for predicting molecular function GO-terms. It is now expanded to predict biological process terms. This web service is available via http://genius.embnet.dkfz-heidelberg.de/menu/biounit/open-husar. CONCLUSION: Our web service gives experimental researchers as well as the bioinformatics community a valuable sequence annotation device. Additionally, GOPET also provides less significant annotation data which may serve as an extended discovery platform for the user.
|
22
|
Global gene expression profiling and cluster analysis in Xenopus laevis. Mech Dev 2005; 122:441-75. [PMID: 15763214 DOI: 10.1016/j.mod.2004.11.007] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.5] [Received: 03/21/2004] [Revised: 10/04/2004] [Accepted: 11/07/2004] [Indexed: 01/12/2023]
Abstract
We have undertaken a large-scale microarray gene expression analysis using cDNAs corresponding to 21,000 Xenopus laevis ESTs. mRNAs from 37 samples, including embryos and adult organs, were profiled. Cluster analysis of embryos of different stages was carried out and revealed expected affinities between gastrulae and neurulae, as well as between advanced neurulae and tadpoles, while egg and feeding larvae were clearly separated. Cluster analysis of adult organs showed some unexpected tissue-relatedness, e.g. kidney is more related to endodermal than to mesodermal tissues and the brain is separated from other neuroectodermal derivatives. Cluster analysis of genes revealed major phases of co-ordinate gene expression between egg and adult stages. During the maternal-early embryonic phase, genes maintaining a rapidly dividing cell state are predominantly expressed (cell cycle regulators, chromatin proteins). Genes involved in protein biosynthesis are progressively induced from mid-embryogenesis onwards. The larval-adult phase is characterised by expression of genes involved in metabolism and terminal differentiation. Thirteen potential synexpression groups were identified, which encompass components of diverse molecular processes or supra-molecular structures, including chromatin, RNA processing and nucleolar function, cell cycle, respiratory chain/Krebs cycle, protein biosynthesis, endoplasmic reticulum, vesicle transport, synaptic vesicle, microtubule, intermediate filament, epithelial proteins and collagen. Data filtering identified genes with potential stage-, region- and organ-specific expression. The dataset was assembled in the iChip microarray database, which allows user-defined queries. The study provides insights into the higher order of vertebrate gene expression, identifies synexpression groups and marker genes, and makes predictions for the biological role of numerous uncharacterized genes.
|
23
|
Applying Support Vector Machines for Gene Ontology based gene function prediction. BMC Bioinformatics 2004; 5:116. [PMID: 15333146 PMCID: PMC517617 DOI: 10.1186/1471-2105-5-116] [Citation(s) in RCA: 55] [Impact Index Per Article: 2.8] [Received: 05/11/2004] [Accepted: 08/26/2004] [Indexed: 11/23/2022]
Abstract
Background: The current progress in sequencing projects calls for rapid, reliable and accurate function assignments of gene products. A variety of methods has been designed to annotate sequences on a large scale. However, these methods can either be applied only to specific subsets, or their results are not formalised, or they do not provide precise confidence estimates for their predictions. Results: We have developed a large-scale annotation system that tackles all of these shortcomings. In our approach, annotation was provided through Gene Ontology terms by applying multiple Support Vector Machines (SVM) for the classification of correct and false predictions. The general performance of the system was benchmarked with a large dataset. An organism-wise cross-validation was performed to define confidence estimates, resulting in an average precision of 80% for 74% of all test sequences. The validation results show that the prediction performance was organism-independent and could reproduce the annotation of other automated systems as well as high-quality manual annotations. We applied our trained classification system to Xenopus laevis sequences, yielding functional annotation for more than half of the known expressed genome. Compared to the currently available annotation, we provided more than twice the number of contigs with good quality annotation, and additionally we assigned a confidence value to each predicted GO term. Conclusions: We present a complete automated annotation system that overcomes many of the usual problems by applying a controlled vocabulary of Gene Ontology and an established classification method on large and well-described sequence data sets. In a case study, the function for Xenopus laevis contig sequences was predicted and the results are publicly available.
|
24
|
Abstract
MOTIVATION: Although many methods are available for the identification of structural domains from protein three-dimensional structures, accurate definition of protein domains and the curation of such data for a large number of proteins are often possible only after manual intervention. The availability of domain definitions for protein structural entries is useful for the sequence analysis of aligned domains, structure comparison, fold recognition procedures and understanding protein folding, domain stability and flexibility. RESULTS: We have improved our method of domain identification, starting from the concept of clustering secondary structural elements but with the intention of reducing the number of discontinuous segments in identified domains. The results of our modified and automatic approach have been compared with the domain definitions from other databases. On a test data set of 55 proteins, this method achieves high agreement (88%) in the number of domains with the crystallographers' definition and resources such as the SCOP, CATH, DALI, 3Dee and PDP databases. This method also obtains a 98% overlap score with the other resources in the definition of domain boundaries of the 55 proteins. We have examined the domain arrangements of 4592 non-redundant protein chains using the improved method to include 5409 domains, leading to an update of the structural domain database. AVAILABILITY: The latest version of the domain database and online domain identification methods are available from http://www.ncbs.res.in/~faculty/mini/ddbase/ddbase.html. SUPPLEMENTARY INFORMATION: http://www.ncbs.res.in/~faculty/mini/ddbase/supplementary/supplementary.html
|
25
|
DSDBASE: a consortium of native and modelled disulphide bonds in proteins. Nucleic Acids Res 2004; 32:D200-2. [PMID: 14681394 PMCID: PMC308760 DOI: 10.1093/nar/gkh026] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.4] [Received: 07/22/2003] [Revised: 08/16/2003] [Accepted: 09/03/2003] [Indexed: 11/13/2022]
Abstract
DSDBASE is a database of disulphide bonds in proteins, which provides information on native disulphides and those that are stereochemically possible between pairs of residues for all known protein structural entries. The modelling of disulphides has been performed, using MODIP, by the identification of residue pairs that can strainlessly accommodate a covalent cross-link. We also assess the stereochemical quality of the covalent cross-links and grade them appropriately. One of the potential uses of the database is to design site-directed mutants in order to enhance the thermal stability of a protein. The proposed sites of mutations can be viewed specifically with respect to active sites of enzymes and across physiological dimers. The occurrence of native and modelled disulphides increases the dimensions of the database enormously. This database can also be employed for proposing three-dimensional models of disulphide-rich short polypeptides. The database can be accessed from http://www.ncbs.res.in/~faculty/mini/dsdbase/dsdbase.html. Supplementary information can be accessed from http://www.ncbs.res.in/~faculty/mini/dsdbase/nar/suppl.htm.
|
26
|
Abstract
This rapid and sensitive method for localizing tyrosinase in polyacrylamide slab gels is based on the condensation of Besthorn's hydrazone (3-methyl-2-benzothiazolinone hydrazone hydrochloride) with the quinone obtained by enzymatic oxidation of phenol. Both monophenolase and diphenolase activities are localized by this method.
|