1
Kotlyar M, Pastrello C, Pivetta F, Lo Sardo A, Cumbaa C, Li H, Naranian T, Niu Y, Ding Z, Vafaee F, Broackes-Carter F, Petschnigg J, Mills GB, Jurisicova A, Stagljar I, Maestro R, Jurisica I. In silico prediction of physical protein interactions and characterization of interactome orphans. Nat Methods 2014; 12:79-84. [PMID: 25402006 DOI: 10.1038/nmeth.3178] [Citation(s) in RCA: 112] [Impact Index Per Article: 11.2] [Received: 12/10/2013] [Accepted: 08/14/2014] [Indexed: 12/12/2022]
Abstract
Protein-protein interactions (PPIs) are useful for understanding signaling cascades, predicting protein function, associating proteins with disease and fathoming drug mechanism of action. Currently, only ∼10% of human PPIs may be known, and about one-third of human proteins have no known interactions. We introduce FpClass, a data mining-based method for proteome-wide PPI prediction. At an estimated false discovery rate of 60%, we predicted 250,498 PPIs among 10,531 human proteins; 10,647 PPIs involved 1,089 proteins without known interactions. We experimentally tested 233 high- and medium-confidence predictions and validated 137 interactions, including seven novel putative interactors of the tumor suppressor p53. Compared to previous PPI prediction methods, FpClass achieved better agreement with experimentally detected PPIs. We provide an online database of annotated PPI predictions (http://ophid.utoronto.ca/fpclass/) and the prediction software (http://www.cs.utoronto.ca/~juris/data/fpclass/).
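The figures quoted in this abstract can be related with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical and are not part of FpClass, and it assumes the estimated false discovery rate applies uniformly to all predictions.

```python
def validation_rate(validated: int, tested: int) -> float:
    """Fraction of experimentally tested predictions that were confirmed."""
    return validated / tested


def expected_true_positives(predictions: int, fdr: float) -> float:
    """Expected number of correct predictions, assuming the FDR holds exactly."""
    return predictions * (1.0 - fdr)


# Numbers taken from the abstract above.
rate = validation_rate(137, 233)
expected = expected_true_positives(250_498, 0.60)

print(f"validated fraction of tested predictions: {rate:.3f}")
print(f"expected true PPIs among all predictions: {expected:.0f}")
```

The two quantities are not directly comparable: the 233 tested predictions were drawn from the high- and medium-confidence range, whereas the 60% false discovery rate is an estimate for the full set of 250,498 predictions.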
Affiliation(s)
- Max Kotlyar
- Princess Margaret Cancer Center, University Health Network, Toronto, Ontario, Canada
- Chiara Pastrello
- [1] Princess Margaret Cancer Center, University Health Network, Toronto, Ontario, Canada. [2] Centro Riferimento Oncologico, Istituto Nazionale Tumori, Aviano, Italy
- Flavia Pivetta
- Centro Riferimento Oncologico, Istituto Nazionale Tumori, Aviano, Italy
- Christian Cumbaa
- Princess Margaret Cancer Center, University Health Network, Toronto, Ontario, Canada
- Han Li
- Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario, Canada
- Taline Naranian
- Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario, Canada
- Yun Niu
- [1] Princess Margaret Cancer Center, University Health Network, Toronto, Ontario, Canada. [2] Nanjing University of Aeronautics and Astronautics, Nanjing, China
- Zhiyong Ding
- Department of Systems Biology, Division of Cancer Medicine, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Fatemeh Vafaee
- [1] Princess Margaret Cancer Center, University Health Network, Toronto, Ontario, Canada. [2] Charles Perkins Centre, The University of Sydney, Sydney, New South Wales, Australia
- Fiona Broackes-Carter
- Princess Margaret Cancer Center, University Health Network, Toronto, Ontario, Canada
- Julia Petschnigg
- Donnelly Centre, Departments of Molecular Genetics and Biochemistry, University of Toronto, Toronto, Ontario, Canada
- Gordon B Mills
- Department of Systems Biology, Division of Cancer Medicine, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Andrea Jurisicova
- Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario, Canada
- Igor Stagljar
- Donnelly Centre, Departments of Molecular Genetics and Biochemistry, University of Toronto, Toronto, Ontario, Canada
- Roberta Maestro
- Centro Riferimento Oncologico, Istituto Nazionale Tumori, Aviano, Italy
- Igor Jurisica
- [1] Princess Margaret Cancer Center, University Health Network, Toronto, Ontario, Canada. [2] Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada. [3] Department of Computer Science, University of Toronto, Toronto, Ontario, Canada. [4] TECHNA Institute for the Advancement of Technology for Health, Toronto, Ontario, Canada
2
Snell EH, Luft JR, Potter SA, Lauricella AM, Gulde SM, Malkowski MG, Koszelak-Rosenblum M, Said MI, Smith JL, Veatch CK, Collins RJ, Franks G, Thayer M, Cumbaa C, Jurisica I, Detitta GT. Establishing a training set through the visual analysis of crystallization trials. Part I: approximately 150,000 images. Acta Crystallogr D Biol Crystallogr 2008; 64:1123-30. [PMID: 19020350 PMCID: PMC2631114 DOI: 10.1107/s0907444908028047] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.2] [Received: 07/01/2008] [Accepted: 09/02/2008] [Indexed: 11/12/2022]
Abstract
As part of a training set for automated image analysis, ∼150 000 images of crystallization experiments from 96 diverse macromolecules have been visually classified within seven categories. Outcomes and trends are analyzed. Structural crystallography aims to provide a three-dimensional representation of macromolecules. Many parts of the multistep process to produce the three-dimensional structural model have been automated, especially through various structural genomics projects. A key step is the production of crystals for diffraction. The target macromolecule is combined with a large and chemically diverse set of cocktails with some leading ideally, but infrequently, to crystallization. A variety of outcomes will be observed during these screening experiments that typically require human interpretation for classification. Human interpretation is neither scalable nor objective, highlighting the need to develop an automatic computer-based image classification. As a first step towards automated image classification, 147 456 images representing crystallization experiments from 96 different macromolecular samples were manually classified. Each image was classified by three experts into seven predefined categories or their combinations. The resulting data where all three observers are in agreement provides one component of a truth set for the development and rigorous testing of automated image-classification systems and provides information about the chemical cocktails used for crystallization. In this paper, the details of this study are presented.
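The agreement rule described in this abstract (an image enters the truth set only when all three observers assign the same categories) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name, the image identifiers and the category labels used here are hypothetical.

```python
def unanimous_truth_set(annotations):
    """Keep images whose three observers chose exactly the same set of categories.

    `annotations` maps an image id to a list of three label collections,
    one per observer. A label collection may hold several categories,
    since combinations of categories were allowed.
    """
    truth = {}
    for image_id, observer_labels in annotations.items():
        label_sets = [frozenset(labels) for labels in observer_labels]
        if len(label_sets) == 3 and len(set(label_sets)) == 1:
            truth[image_id] = label_sets[0]
    return truth


# Hypothetical toy annotations; the category names are placeholders.
annotations = {
    "img_001": [{"crystal"}, {"crystal"}, {"crystal"}],
    "img_002": [{"precipitate"}, {"precipitate", "skin"}, {"precipitate"}],
    "img_003": [{"clear"}, {"clear"}, {"clear"}],
}

truth = unanimous_truth_set(annotations)
print(sorted(truth))
```

Comparing frozen sets rather than single labels means that two observers who agree on one category but differ on a second are treated as disagreeing, which is the stricter reading of "all three observers are in agreement".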
Affiliation(s)
- Edward H Snell
- Hauptman-Woodward Medical Research Institute, 700 Ellicott Street, Buffalo, NY 14203, USA.
3
Snell EH, Lauricella AM, Potter SA, Luft JR, Gulde SM, Collins RJ, Franks G, Malkowski MG, Cumbaa C, Jurisica I, DeTitta GT. Establishing a training set through the visual analysis of crystallization trials. Part II: crystal examples. Acta Crystallogr D Biol Crystallogr 2008; 64:1131-7. [PMID: 19020351 PMCID: PMC2631118 DOI: 10.1107/s0907444908028059] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 07/01/2008] [Accepted: 09/02/2008] [Indexed: 11/17/2022]
Abstract
As part of a training set for automated image analysis, crystallization screening experiments for 269 different macromolecules were visually analyzed and a set of crystal images extracted. Outcomes and trends are analyzed. In the automated image analysis of crystallization experiments, representative examples of outcomes can be obtained rapidly. However, while the outcomes appear to be diverse, the number of crystalline outcomes can be small. To complement a training set from the visual observation of 147 456 crystallization outcomes, a set of crystal images was produced from 106 and 163 macromolecules under study for the North East Structural Genomics Consortium (NESG) and Structural Genomics of Pathogenic Protozoa (SGPP) groups, respectively. These crystal images have been combined with the initial training set. A description of the crystal-enriched data set and a preliminary analysis of outcomes from the data are described.
Affiliation(s)
- Edward H Snell
- Hauptman-Woodward Medical Research Institute, 700 Ellicott Street, Buffalo, NY 14203, USA.
4
Cumbaa C, Jurisica I. Automatic Classification and Pattern Discovery in High-throughput Protein Crystallization Trials. J Struct Funct Genomics 2005; 6:195-202. [PMID: 16211519 DOI: 10.1007/s10969-005-5243-9] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Received: 07/31/2004] [Accepted: 05/04/2005] [Indexed: 10/25/2022]
Abstract
Conceptually, protein crystallization can be divided into two phases: search and optimization. Robotic protein crystallization screening can speed up the search phase, and has a potential to increase process quality. Automated image classification helps to increase throughput and consistently generate objective results. Although the classification accuracy can always be improved, our image analysis system can classify images from 1,536-well plates with high classification accuracy (85%) and ROC score (0.87), as evaluated on 127 human-classified protein screens containing 5,600 crystal images and 189,472 non-crystal images. Data mining can integrate results from high-throughput screens with information about crystallizing conditions, intrinsic protein properties, and results from crystallization optimization. We apply association mining, a data mining approach that identifies frequently occurring patterns among variables and their values. This approach segregates proteins into groups based on how they react in a broad range of conditions, and clusters cocktails to reflect their potential to achieve crystallization. These results may lead to crystallization screen optimization, and reveal associations between protein properties and crystallization conditions. We also postulate that past experience may lead us to the identification of initial conditions favorable to crystallization for novel proteins.
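Association mining, as named in this abstract, looks for variable-value combinations that co-occur frequently. The sketch below shows the two basic quantities, support and confidence, on a toy data set. It is a generic illustration rather than the authors' system: the function names, the variables and all values in the toy records are made up for the example.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return item pairs whose support (fraction of records containing both) meets the threshold."""
    n = len(transactions)
    counts = Counter()
    for record in transactions:
        # Sort so each unordered pair is always counted under the same key.
        for pair in combinations(sorted(set(record)), 2):
            counts[pair] += 1
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


def confidence(transactions, antecedent, consequent):
    """Fraction of records containing `antecedent` that also contain `consequent`."""
    with_antecedent = [r for r in transactions if antecedent in r]
    if not with_antecedent:
        return 0.0
    return sum(consequent in r for r in with_antecedent) / len(with_antecedent)


# Hypothetical screening records: each is a set of "variable=value" items.
transactions = [
    {"salt=high", "pH=acidic", "outcome=crystal"},
    {"salt=high", "pH=acidic", "outcome=precipitate"},
    {"salt=high", "pH=basic", "outcome=crystal"},
    {"salt=low", "pH=acidic", "outcome=clear"},
]

pairs = frequent_pairs(transactions, min_support=0.5)
rule_conf = confidence(transactions, "salt=high", "outcome=crystal")
print(pairs)
print(f"confidence(salt=high -> outcome=crystal) = {rule_conf:.2f}")
```

Counting only pairs keeps the example short. A full association miner would extend frequent pairs to larger item sets, for example with the Apriori strategy of growing only those sets whose subsets are already frequent.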
Affiliation(s)
- Christian Cumbaa
- Ontario Cancer Institute, Northeast Structural Genomics Consortium, 610 University Avenue, Toronto, Ontario M5G 2M9, Canada