1
Soleymani F, Paquet E, Viktor HL, Michalowski W, Spinello D. ProtInteract: A deep learning framework for predicting protein-protein interactions. Comput Struct Biotechnol J 2023; 21:1324-1348. [PMID: 36817951 PMCID: PMC9929211 DOI: 10.1016/j.csbj.2023.01.028] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 12/05/2022] [Revised: 01/20/2023] [Accepted: 01/20/2023] [Indexed: 01/26/2023]
Abstract
Proteins mainly perform their functions by interacting with other proteins. Protein-protein interactions underpin various biological activities such as metabolic cycles, signal transduction, and immune response. However, due to the sheer number of proteins, experimental methods for finding interacting and non-interacting protein pairs are time-consuming and costly. We therefore developed the ProtInteract framework to predict protein-protein interaction. ProtInteract comprises two components: first, a novel autoencoder architecture that encodes each protein's primary structure to a lower-dimensional vector while preserving its underlying sequence attributes. This leads to faster training of the second network, a deep convolutional neural network (CNN) that receives encoded proteins and predicts their interaction under three different scenarios. In each scenario, the deep CNN predicts the class of a given encoded protein pair. Each class indicates different ranges of confidence scores corresponding to the probability of whether a predicted interaction occurs or not. The proposed framework features significantly low computational complexity and relatively fast response. The contributions of this work are twofold. First, ProtInteract assimilates the protein's primary structure into a pseudo-time series. Therefore, we leverage the nature of the time series of proteins and their physicochemical properties to encode a protein's amino acid sequence into a lower-dimensional vector space. This approach enables extracting highly informative sequence attributes while reducing computational complexity. Second, the ProtInteract framework utilises this information to identify protein interactions with other proteins based on its amino acid configuration. Our results suggest that the proposed framework performs with high accuracy and efficiency in predicting protein-protein interactions.
Affiliation(s)
- Farzan Soleymani
- Department of Mechanical Engineering, University of Ottawa, Ottawa, ON K1N 6N5, Canada
- Eric Paquet
- National Research Council, 1200 Montreal Road, Ottawa, ON K1A 0R6, Canada. Corresponding author.
- Herna Lydia Viktor
- School of Electrical Engineering and Computer Science, University of Ottawa, ON K1N 6N5, Canada
- Davide Spinello
- Department of Mechanical Engineering, University of Ottawa, Ottawa, ON K1N 6N5, Canada
2
Soleymani F, Paquet E, Viktor H, Michalowski W, Spinello D. Protein-protein interaction prediction with deep learning: A comprehensive review. Comput Struct Biotechnol J 2022; 20:5316-5341. [PMID: 36212542 PMCID: PMC9520216 DOI: 10.1016/j.csbj.2022.08.070] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Received: 06/22/2022] [Revised: 08/29/2022] [Accepted: 08/30/2022] [Indexed: 11/15/2022]
Abstract
Most proteins perform their biological function by interacting with themselves or other molecules. Thus, one may obtain biological insights into protein functions, disease prevalence, and therapy development by identifying protein-protein interactions (PPI). However, finding the interacting and non-interacting protein pairs through experimental approaches is labour-intensive and time-consuming, owing to the variety of proteins. Hence, protein-protein interaction and protein-ligand binding problems have drawn attention in the fields of bioinformatics and computer-aided drug discovery. Deep learning methods paved the way for scientists to predict the 3-D structure of proteins from genomes, predict the functions and attributes of a protein, and modify and design new proteins to provide desired functions. This review focuses on recent deep learning methods applied to problems including predicting protein functions, protein-protein interaction and their sites, protein-ligand binding, and protein design.
Affiliation(s)
- Farzan Soleymani
- Department of Mechanical Engineering, University of Ottawa, Ottawa, ON, Canada
- Eric Paquet
- National Research Council, 1200 Montreal Road, Ottawa, ON K1A 0R6, Canada
- Herna Viktor
- School of Electrical Engineering and Computer Science, University of Ottawa, ON, Canada
- Davide Spinello
- Department of Mechanical Engineering, University of Ottawa, Ottawa, ON, Canada
3
Kritzer JA, Freyzon Y, Lindquist S. Yeast can accommodate phosphotyrosine: v-Src toxicity in yeast arises from a single disrupted pathway. FEMS Yeast Res 2018; 18:4931722. [PMID: 29546391 PMCID: PMC6454501 DOI: 10.1093/femsyr/foy027] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 01/04/2018] [Accepted: 03/08/2018] [Indexed: 12/29/2022]
Abstract
Tyrosine phosphorylation is a key biochemical signal that controls growth and differentiation in multicellular organisms. Saccharomyces cerevisiae and nearly all other unicellular eukaryotes lack intact phosphotyrosine signaling pathways. However, many of these organisms have primitive phosphotyrosine-binding proteins and tyrosine phosphatases, leading to the assumption that the major barrier for emergence of phosphotyrosine signaling was the negative consequences of promiscuous tyrosine kinase activity. In this work, we reveal that the classic oncogene v-Src, which phosphorylates many dozens of proteins in yeast, is toxic because it disrupts a specific spore wall remodeling pathway. Using genetic selections, we find that expression of a specific cyclic peptide, or overexpression of SMK1, a MAP kinase that controls spore wall assembly, both lead to robust growth despite a continuous high level of phosphotyrosine in the yeast proteome. Thus, minimal genetic manipulations allow yeast to tolerate high levels of phosphotyrosine. These results indicate that the introduction of tyrosine kinases within single-celled organisms may not have been a major obstacle to the evolution of phosphotyrosine signaling.
Affiliation(s)
- Joshua A Kritzer
- Department of Chemistry, Tufts University, Medford MA 02155, USA
- Yelena Freyzon
- Whitehead Institute for Biomedical Research, 9 Cambridge Center, Cambridge MA 02142, USA
- Howard Hughes Medical Institute, Department of Biology, Massachusetts Institute of Technology, Cambridge MA 02139, USA
- Susan Lindquist
- Whitehead Institute for Biomedical Research, 9 Cambridge Center, Cambridge MA 02142, USA
- Howard Hughes Medical Institute, Department of Biology, Massachusetts Institute of Technology, Cambridge MA 02139, USA
4
Yu X, Bian X, Throop A, Song L, Moral LD, Park J, Seiler C, Fiacco M, Steel J, Hunter P, Saul J, Wang J, Qiu J, Pipas JM, LaBaer J. Exploration of panviral proteome: high-throughput cloning and functional implications in virus-host interactions. Theranostics 2014; 4:808-22. [PMID: 24955142 PMCID: PMC4063979 DOI: 10.7150/thno.8255] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Received: 12/01/2013] [Accepted: 04/27/2014] [Indexed: 12/24/2022]
Abstract
Throughout the long history of virus-host co-evolution, viruses have developed delicate strategies to facilitate their invasion and replication of their genome, while silencing the host immune responses through various mechanisms. The systematic characterization of viral protein-host interactions would yield invaluable information in the understanding of viral invasion/evasion, diagnosis and therapeutic treatment of a viral infection, and mechanisms of host biology. With more than 2,000 viral genomes sequenced, only a small percent of them are well investigated. The access of these viral open reading frames (ORFs) in a flexible cloning format would greatly facilitate both in vitro and in vivo virus-host interaction studies. However, the overall progress of viral ORF cloning has been slow. To facilitate viral studies, we are releasing the initiation of our panviral proteome collection of 2,035 ORF clones from 830 viral genes in the Gateway® recombinational cloning system. Here, we demonstrate several uses of our viral collection including highly efficient production of viral proteins using human cell-free expression system in vitro, global identification of host targets for rubella virus using Nucleic Acid Programmable Protein Arrays (NAPPA) containing 10,000 unique human proteins, and detection of host serological responses using micro-fluidic multiplexed immunoassays. The studies presented here begin to elucidate host-viral protein interactions with our systemic utilization of viral ORFs, high-throughput cloning, and proteomic technologies. These valuable plasmid resources will be available to the research community to enable continued viral functional studies.
5
Abstract
High-throughput, automated or semiautomated methodologies implemented by companies and structural genomics initiatives have accelerated the process of acquiring structural information for proteins via x-ray crystallography. This has enabled the application of structure-based drug design technologies to a variety of new structures that have potential pharmacologic relevance. Although there remain major challenges to applying these approaches more broadly to all classes of drug discovery targets, clearly the continued development and implementation of these structure-based drug design methodologies by the scientific community at large will help to address and provide solutions to these hurdles. The result will be a growing number of protein structures of important pharmacologic targets that will help to streamline the process of identification and optimization of lead compounds for drug development. These lead agonist and antagonist pharmacophores should, in turn, help to alleviate one of the current critical bottlenecks in the drug discovery process; that is, defining the functional relevance of potential novel targets to disease modification. The prospect of generating an increasing number of potential drug candidates will serve to highlight perhaps the most significant future bottleneck for drug development, the cost and complexity of the drug approval process.
Affiliation(s)
- Leslie W Tari
- ActiveSight, 4045 Sorrento Valley Blvd, San Diego, CA 92121, USA.
6
Im H, Snyder M. Preparation of recombinant protein spotted arrays for proteome-wide identification of kinase targets. Curr Protoc Protein Sci 2013; Chapter 27:Unit 27.4. [PMID: 23546622 DOI: 10.1002/0471140864.ps2704s72] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/09/2022]
Abstract
Protein microarrays allow unique approaches for interrogating global protein interaction networks. Protein arrays can be divided into two categories: antibody arrays and functional protein arrays. Antibody arrays consist of various antibodies and are appropriate for profiling protein abundance and modifications. Functional full-length protein arrays employ full-length proteins with various post-translational modifications. A key advantage of the latter is rapid parallel processing of large number of proteins for studying highly controlled biochemical activities, protein-protein interactions, protein-nucleic acid interactions, and protein-small molecule interactions. This unit presents a protocol for constructing functional yeast protein microarrays for global kinase substrate identification. This approach enables the rapid determination of protein interaction networks in yeast on a proteome-wide level. The same methodology can be readily applied to higher eukaryotic systems with careful consideration of overexpression strategy.
Affiliation(s)
- Hogune Im
- Department of Genetics, Stanford University, Stanford, California, USA
7
Smallwood SE, Rahman MM, Werden SJ, Martino MF, McFadden G. Production of Myxoma virus gateway entry and expression libraries and validation of viral protein expression. Curr Protoc Microbiol 2011; Chapter 14:Unit 14A.2. [PMID: 21538302 PMCID: PMC3104670 DOI: 10.1002/9780471729259.mc14a02s21] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/08/2022]
Abstract
Invitrogen's Gateway technology is a recombination-based cloning method that allows for rapid transfer of numerous open reading frames (ORFs) into multiple plasmid vectors, making it useful for diverse high-throughput applications. Gateway technology has been utilized to create an ORF library for Myxoma virus (MYXV), a member of the Poxviridae family of DNA viruses. MYXV is the prototype virus for the genus Leporipoxvirus, and is pathogenic only in European rabbits. MYXV replicates exclusively in the host cell cytoplasm, and its genome encodes 171 ORFs. A number of these ORFs encode proteins that interfere with or modulate host defense mechanisms, particularly the inflammatory responses. Furthermore, MYXV is able to productively infect a variety of human cancer cell lines and is being developed as an oncolytic virus for treating human cancers. MYXV is therefore an excellent model for studying poxvirus biology, pathogenesis, and host tropism, and a good candidate for ORFeome development.
Affiliation(s)
- Sherin E Smallwood
- Department of Molecular Genetics and Microbiology, College of Medicine, University of Florida, Gainesville, Florida, USA
8
Weinrich D, Jonkheijm P, Niemeyer CM, Waldmann H. Applications of protein biochips in biomedical and biotechnological research. Angew Chem Int Ed Engl 2009; 48:7744-51. [PMID: 19757463 PMCID: PMC7159567 DOI: 10.1002/anie.200901480] [Citation(s) in RCA: 86] [Impact Index Per Article: 5.4] [Indexed: 01/09/2023]
Abstract
Progress in the development of protein‐immobilization strategies and methods has made protein biochips increasingly accessible. The integration of these assay and analysis platforms into biomedical and biotechnological research has substantially expanded the repertoire of methods available for proteomics and biomarker research and for drug development. This Minireview highlights selected developments in the application of protein biochips in these fields.
Affiliation(s)
- Dirk Weinrich
- Max-Planck-Institut für molekulare Physiologie and Technische Universität Dortmund, Fachbereich Chemie, Otto-Hahn-Strasse 11, 44227 Dortmund, Germany
9
Weinrich D, Jonkheijm P, Niemeyer C, Waldmann H. Proteinbiochips in der Biomedizin und Biotechnologie [Protein biochips in biomedicine and biotechnology]. Angew Chem Int Ed Engl 2009. [DOI: 10.1002/ange.200901480] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.2] [Indexed: 11/07/2022]
10
Paulovich AG, Whiteaker JR, Hoofnagle AN, Wang P. The interface between biomarker discovery and clinical validation: The tar pit of the protein biomarker pipeline. Proteomics Clin Appl 2008; 2:1386-1402. [PMID: 20976028 PMCID: PMC2957839 DOI: 10.1002/prca.200780174] [Citation(s) in RCA: 172] [Impact Index Per Article: 10.1] [Received: 12/31/2007] [Indexed: 12/26/2022]
Abstract
The application of "omics" technologies to biological samples generates hundreds to thousands of biomarker candidates; however, a discouragingly small number make it through the pipeline to clinical use. This is in large part due to the incredible mismatch between the large numbers of biomarker candidates and the paucity of reliable assays and methods for validation studies. We desperately need a pipeline that relieves this bottleneck between biomarker discovery and validation. This paper reviews the requirements for technologies to adequately credential biomarker candidates for costly clinical validation and proposes methods and systems to verify biomarker candidates. Models involving pooling of clinical samples, where appropriate, are discussed. We conclude that current proteomic technologies are on the cusp of significantly affecting translation of molecular diagnostics into the clinic.
Affiliation(s)
- Andrew N. Hoofnagle
- Department of Laboratory Medicine, University of Washington, Seattle, WA, USA
- Pei Wang
- Fred Hutchinson Cancer Research Center, Seattle, WA, USA
11
Production and sequence validation of a complete full length ORF collection for the pathogenic bacterium Vibrio cholerae. Proc Natl Acad Sci U S A 2008; 105:4364-9. [PMID: 18337508 DOI: 10.1073/pnas.0712049105] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.6] [Indexed: 01/08/2023]
Abstract
Cholera, an infectious disease with global impact, is caused by pathogenic strains of the bacterium Vibrio cholerae. High-throughput functional proteomics technologies now offer the opportunity to investigate all aspects of the proteome, which has led to an increased demand for comprehensive protein expression clone resources. Genome-scale reagents for cholera would encourage comprehensive analyses of immune responses and systems-wide functional studies that could lead to improved vaccine and therapeutic strategies. Here, we report the production of the FLEXGene clone set for V. cholerae O1 biovar eltor str. N16961: a complete-genome collection of ORF clones. This collection includes 3,761 sequence-verified clones from 3,887 targeted ORFs (97%). The ORFs were captured in a recombinational cloning vector to facilitate high-throughput transfer of ORF inserts into suitable expression vectors. To demonstrate its application, approximately 15% of the collection was transferred into the relevant expression vector and used to produce a protein microarray by transcribing, translating, and capturing the proteins in situ on the array surface with 92% success. In a second application, a method to screen for protein triggers of Toll-like receptors (TLRs) was developed. We tested in vitro-synthesized proteins for their ability to stimulate TLR5 in A549 cells. This approach appropriately identified FlaC, and previously uncharacterized TLR5 agonist activities. These data suggest that the genome-scale, fully sequenced ORF collection reported here will be useful for high-throughput functional proteomic assays, immune response studies, structure biology, and other applications.
12
Gagliardi S, Rapone B, Mosiello L, Luciani D, Gerardino A, Morales P. Laser-Assisted Fabrication of Biomolecular Sensing Microarrays. IEEE Trans Nanobioscience 2007; 6:242-8. [DOI: 10.1109/tnb.2007.903485] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/05/2022]
13
Abstract
Protein chips have emerged as a promising approach for a wide variety of applications including the identification of protein-protein interactions, protein-phospholipid interactions, small molecule targets, and substrates of protein kinases. They can also be used for clinical diagnostics and monitoring disease states. This article reviews current methods in the generation and applications of protein microarrays.
Affiliation(s)
- Michael Snyder
- Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, CT 06520, United States
14
Zuo D, Mohr SE, Hu Y, Taycher E, Rolfs A, Kramer J, Williamson J, LaBaer J. PlasmID: a centralized repository for plasmid clone information and distribution. Nucleic Acids Res 2006; 35:D680-4. [PMID: 17132831 PMCID: PMC1716714 DOI: 10.1093/nar/gkl898] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.4] [Indexed: 12/16/2022]
Abstract
The Plasmid Information Database (PlasmID) was developed as a community-based resource portal to facilitate search and request of plasmid clones shared with the Dana-Farber/Harvard Cancer Center (DF/HCC) DNA Resource Core. PlasmID serves as a central data repository and enables researchers to search the collection online using common gene names and identifiers, keywords, vector features, author names and PubMed IDs. As of October 2006, the repository contains >46 000 plasmids in 98 different vectors, including cloned cDNA and genomic fragments from 26 different species. Moreover, the clones include plasmid vectors useful for routine and cutting-edge techniques; functionally related sets of human cDNA clones; and genome-scale gene collections for Saccharomyces cerevisiae, Pseudomonas aeruginosa, Yersinia pestis, Francisella tularensis, Bacillus anthracis and Vibrio cholerae. Information about the plasmids has been fully annotated in adherence with a high-quality standard, and clone samples are stored as glycerol stocks in a state-of-the-art automated −80°C freezer storage system. Clone replication and distribution is highly automated to minimize human error. Information about vectors and plasmid clones, including downloadable maps and sequence data, is freely available online. Researchers interested in requesting clone samples or sharing their own plasmids with the repository can visit the PlasmID website for more information.
Affiliation(s)
- Dongmei Zuo
- Harvard Institute of Proteomics, Harvard Medical School, 320 Charles Street, Cambridge, MA 02141, USA
- DF/HCC DNA Resource Core, Harvard Medical School, 320 Charles Street, Cambridge, MA 02141, USA
- Stephanie E. Mohr
- DF/HCC DNA Resource Core, Harvard Medical School, 320 Charles Street, Cambridge, MA 02141, USA
- Yanhui Hu
- Harvard Institute of Proteomics, Harvard Medical School, 320 Charles Street, Cambridge, MA 02141, USA
- Elena Taycher
- Harvard Institute of Proteomics, Harvard Medical School, 320 Charles Street, Cambridge, MA 02141, USA
- Andreas Rolfs
- Harvard Institute of Proteomics, Harvard Medical School, 320 Charles Street, Cambridge, MA 02141, USA
- Jason Kramer
- DF/HCC DNA Resource Core, Harvard Medical School, 320 Charles Street, Cambridge, MA 02141, USA
- Janice Williamson
- Harvard Institute of Proteomics, Harvard Medical School, 320 Charles Street, Cambridge, MA 02141, USA
- Joshua LaBaer
- Harvard Institute of Proteomics, Harvard Medical School, 320 Charles Street, Cambridge, MA 02141, USA
- DF/HCC DNA Resource Core, Harvard Medical School, 320 Charles Street, Cambridge, MA 02141, USA
- To whom correspondence should be addressed. Tel: +1 6173240816; Fax: +1 6173240824
15
Ramachandran N, Larson DN, Stark PRH, Hainsworth E, LaBaer J. Emerging tools for real-time label-free detection of interactions on functional protein microarrays. FEBS J 2005; 272:5412-25. [PMID: 16262683 DOI: 10.1111/j.1742-4658.2005.04971.x] [Citation(s) in RCA: 98] [Impact Index Per Article: 4.9] [Indexed: 11/30/2022]
Abstract
The availability of extensive genomic information and content has spawned an era of high-throughput screening that is generating large sets of functional genomic data. In particular, the need to understand the biochemical wiring within a cell has introduced novel approaches to map the intricate networks of biological interactions arising from the interactions of proteins. The current technologies for assaying protein interactions--yeast two-hybrid and immunoprecipitation with mass spectrometric detection--have met with considerable success. However, the parallel use of these approaches has identified only a small fraction of physiologically relevant interactions among proteins, neglecting all nonprotein interactions, such as with metabolites, lipids, DNA and small molecules. This highlights the need for further development of proteome scale technologies that enable the study of protein function. Here we discuss recent advances in high-throughput technologies for displaying proteins on functional protein microarrays and the real-time label-free detection of interactions using probes of the local index of refraction, carbon nanotubes and nanowires, or microelectromechanical systems cantilevers. The combination of these technologies will facilitate the large-scale study of protein interactions with proteins as well as with other biomolecules.
Affiliation(s)
- Niroshan Ramachandran
- Harvard Institute of Proteomics, Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Cambridge, MA 02141, USA
16
Strömberg P, Rotticci-Mulder J, Björnestedt R, Schmidt SR. Preparative parallel protein purification (P4). J Chromatogr B Analyt Technol Biomed Life Sci 2005; 818:11-8. [PMID: 15722038 DOI: 10.1016/j.jchromb.2004.09.035] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Received: 06/16/2004] [Accepted: 09/20/2004] [Indexed: 11/22/2022]
Abstract
In state of the art drug discovery, it is essential to gain structural information of pharmacologically relevant proteins. Increasing the output of novel protein structures requires improved preparative methods for high throughput (HT) protein purification. Currently, most HT platforms are limited to small-scale and available technology for increasing throughput at larger scales is scarce. We have adapted a 10-channel parallel flash chromatography system for protein purification applications. The system enables us to perform 10 different purifications in parallel with individual gradients and UV monitoring. Typical protein purification applications were set up. Methods for ion exchange chromatography were developed for different sample proteins and columns. Affinity chromatography was optimized for His-tagged proteins using metal chelating media and buffer exchange by gel filtration was also tested. The results from the present system were comparable, with respect to resolution and reproducibility, with those from control experiments on an AKTA purifier system. Finally, lysates from 10 E. coli cultures expressing different His-tagged proteins were subjected to a three-step parallel purification procedure, combining the above-mentioned procedures. Nine proteins were successfully purified whereas one failed probably due to lack of expression.
Affiliation(s)
- Patrik Strömberg
- Global Protein Science and Supply, AstraZeneca R&D Södertälje, SE-15185 Södertälje, Sweden
17
Murthy TVS, Wu W, Qiu QQ, Shi Z, LaBaer J, Brizuela L. Bacterial cell-free system for high-throughput protein expression and a comparative analysis of Escherichia coli cell-free and whole cell expression systems. Protein Expr Purif 2005; 36:217-25. [PMID: 15249043 DOI: 10.1016/j.pep.2004.04.002] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.7] [Received: 01/08/2004] [Revised: 03/26/2004] [Indexed: 11/16/2022]
Abstract
Sixty-three proteins of Pseudomonas aeruginosa in the size range of 18-159 kDa were tested for expression in a bacterial cell-free system. Fifty-one of the 63 proteins could be expressed and partially purified under denaturing conditions. Most of the expressed proteins showed yields greater than 500 ng after a single affinity purification step from 50 µl in vitro protein synthesis reactions. The in vitro protein expression plus purification in a 96-well format and analysis of the proteins by SDS-PAGE were performed by one person in 4 h. A comparison of in vitro and in vivo expression suggests that despite lower yields and less pure protein preparations, bacterial in vitro protein expression coupled with single-step affinity purification offers a rapid, efficient alternative for the high-throughput screening of clones for protein expression and solubility.
Affiliation(s)
- T V S Murthy
- Harvard Institute of Proteomics, 320 Charles Street, Cambridge, MA 02141, USA
18
LaBaer J, Qiu Q, Anumanthan A, Mar W, Zuo D, Murthy TVS, Taycher H, Halleck A, Hainsworth E, Lory S, Brizuela L. The Pseudomonas aeruginosa PA01 gene collection. Genome Res 2004; 14:2190-200. [PMID: 15489342 PMCID: PMC528936 DOI: 10.1101/gr.2482804] [Citation(s) in RCA: 71] [Impact Index Per Article: 3.4] [Indexed: 11/25/2022]
Abstract
Pseudomonas aeruginosa, a common inhabitant of soil and water, is an opportunistic pathogen of growing clinical relevance. Its genome, one of the largest among bacteria [5570 open reading frames (ORFs)] approaches that of simple eukaryotes. We have constructed a comprehensive gene collection for this organism utilizing the annotated genome of P. aeruginosa PA01 and a highly automated and laboratory information management system (LIMS)-supported production line. All the individual ORFs have been successfully PCR-amplified and cloned into a recombination-based cloning system. We have isolated and archived four independent isolates of each individual ORF. Full sequence analysis of the first isolate for one-third of the ORFs in the collection has been completed. We used two sets of genes from this repository for high-throughput expression and purification of recombinant proteins in different systems. The purified proteins have been used to set up biochemical and immunological assays directed towards characterization of histidine kinases and identification of bacterial proteins involved in the immune response of cystic fibrosis patients. This gene repository provides a powerful tool for proteome- and genome-scale research of this organism, and the strategies adopted to generate this repository serve as a model for building clone sets for other bacteria.
Affiliation(s)
- Joshua LaBaer
- Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Institute of Proteomics, Cambridge, Massachusetts 02141, USA
19
Affiliation(s)
- Yanhui Hu
- Institute of Proteomics, Harvard Medical School, Cambridge, Mass, USA
20
Ramachandran N, Hainsworth E, Bhullar B, Eisenstein S, Rosen B, Lau AY, Walter JC, LaBaer J. Self-assembling protein microarrays. Science 2004; 305:86-90. [PMID: 15232106 DOI: 10.1126/science.1097639] [Citation(s) in RCA: 423] [Impact Index Per Article: 20.1] [Indexed: 11/02/2022]
Abstract
Protein microarrays provide a powerful tool for the study of protein function. However, they are not widely used, in part because of the challenges in producing proteins to spot on the arrays. We generated protein microarrays by printing complementary DNAs onto glass slides and then translating target proteins with mammalian reticulocyte lysate. Epitope tags fused to the proteins allowed them to be immobilized in situ. This obviated the need to purify proteins, avoided protein stability problems during storage, and captured sufficient protein for functional studies. We used the technology to map pairwise interactions among 29 human DNA replication initiation proteins, recapitulate the regulation of Cdt1 binding to select replication proteins, and map its geminin-binding domain.
Affiliation(s)
- Niroshan Ramachandran
- Harvard Institute of Proteomics, Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, 320 Charles Street, Cambridge, MA 02141, USA
|
21
|
Abstract
The manufacture and use of protein microarrays with correctly folded and functional content presents significant challenges. Despite this, the feasibility and utility of such undertakings are now clear, and exciting progress has recently been demonstrated in the areas of content generation, printing strategies and protein immobilization. More importantly, we are now beginning to enjoy the fruits of these efforts as functional protein microarrays are being increasingly employed for biological discovery purposes. Recent examples of this include the characterization of autoantibody responses, antibody specificity profiling, protein-protein domain interaction profiling and a comprehensive characterization of coiled-coil interactions. The best, however, is yet to come.
Affiliation(s)
- Paul F Predki
- Protometrix Inc., 688 E Main St, Branford, CT 06405, USA.
|
22
|
Abstract
The information generated from the sequence of the human genome has inspired efforts to systematically develop organized collections of human cDNA clones for use in expression screens in mammalian cells. These high-throughput cloning initiatives offer significant advantages over the cDNA libraries that have been used in the past, including greater experimental flexibility, immediate identification of hits, information regarding all tested proteins (even for those giving no response) and eventually more comprehensive coverage. Some of the lessons learned and the considerations that underlie the creation of genome-wide cDNA repositories are discussed here. Although still inchoate, these resources are already impacting the manner in which high-throughput functional screens are performed.
Affiliation(s)
- Joseph Pearlberg
- Harvard Institute of Proteomics, and Department of Biochemistry and Molecular Biology, Harvard Medical School, Boston, MA, USA
|
23
|
Doolan DL, Aguiar JC, Weiss WR, Sette A, Felgner PL, Regis DP, Quinones-Casas P, Yates JR, Blair PL, Richie TL, Hoffman SL, Carucci DJ. Utilization of genomic sequence information to develop malaria vaccines. J Exp Biol 2003; 206:3789-802. [PMID: 14506214 DOI: 10.1242/jeb.00615] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.2] [Indexed: 11/20/2022]
Abstract
Recent advances in the fields of genomics, proteomics and molecular immunology offer tremendous opportunities for the development of novel interventions against public health threats, including malaria. However, there is currently no algorithm that can effectively identify the targets of protective T cell or antibody responses from genomic data. Furthermore, the identification of antigens that will stimulate the most effective immunity against the target pathogen is problematic, particularly if the genome is large. Malaria is an attractive model for the development and validation of approaches to translate genomic information to vaccine development because of the critical need for effective anti-malarial interventions and because the Plasmodium parasite is a complex multistage pathogen targeted by multiple immune responses. Sterile protective immunity can be achieved by immunization with radiation-attenuated sporozoites, and anti-disease immunity can be induced in residents in malaria-endemic areas. However, the 23 Mb Plasmodium falciparum genome encodes more than 5300 proteins, each of which is a potential target of protective immune responses. The current generation of subunit vaccines is based on a single or few antigens and therefore might elicit too narrow a breadth of response. We are working towards the development of a new generation vaccine based on the presumption that duplicating the protection induced by the whole organism may require a vaccine nearly as complex as the organism itself. Here, we present our strategy to exploit the genomic sequence of P. falciparum for malaria vaccine development.
Affiliation(s)
- D L Doolan
- Malaria Program, Naval Medical Research Center, Silver Spring, MD 20910-7500, USA.
|
24
|
Sharma A, Antoku S, Fujiwara K, Mayer BJ. Functional interaction trap: a strategy for validating the functional consequences of tyrosine phosphorylation of specific substrates in vivo. Mol Cell Proteomics 2003; 2:1217-24. [PMID: 14519720 DOI: 10.1074/mcp.m300078-mcp200] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.6] [Indexed: 11/06/2022] Open
Abstract
Protein tyrosine phosphorylation controls diverse signaling pathways, and dysregulated tyrosine kinase activity plays a direct role in human diseases such as cancer. Because activated kinases exert their effects by phosphorylating multiple substrate proteins, it is difficult or impossible to assess experimentally the contribution of a particular substrate to a cellular response or activity. To overcome this problem, we have developed a novel approach termed the "functional interaction trap," in which two proteins are induced to interact in a pairwise fashion through an engineered, highly specific binding interface. We show that the functional interaction trap can be used to direct a modified tyrosine kinase to specifically phosphorylate a single substrate of choice in vivo, permitting analysis of the resulting biological output. This strategy provides a powerful tool for validating the functional significance of tyrosine phosphorylation and other post-translational modifications identified by proteomic discovery efforts.
Affiliation(s)
- Alok Sharma
- Department of Genetics and Developmental Biology, University of Connecticut Health Center, 263 Farmington Avenue, Farmington, CT 06030-3301, USA
|
25
|
Zacchi P, Sblattero D, Florian F, Marzari R, Bradbury ARM. Selecting open reading frames from DNA. Genome Res 2003; 13:980-90. [PMID: 12727911 PMCID: PMC430925 DOI: 10.1101/gr.861503] [Citation(s) in RCA: 41] [Impact Index Per Article: 1.9] [Indexed: 11/24/2022]
Abstract
We describe a method to select DNA encoding functional open reading frames (ORFs) from noncoding DNA within the context of a specific vector. Phage display has been used as an example, but any system requiring DNA encoding protein fragments, for example, the yeast two-hybrid system, could be used. By cloning DNA fragments upstream of a fusion gene, consisting of the beta-lactamase gene flanked by lox recombination sites, which is, in turn, upstream of gene 3 from fd phage, only those clones containing DNA fragments encoding ORFs confer ampicillin resistance and survive. After selection, the beta-lactamase gene can be removed by Cre recombinase, leaving a standard phage display vector with ORFs fused to gene 3. This vector has been tested on a plasmid containing tissue transglutaminase. All surviving clones analyzed by sequencing were found to contain ORFs, of which 83% were localized to known genes, and at least 80% produced immunologically detectable polypeptides. Use of a specific anti-tTG monoclonal antibody allowed the identification of clones containing the correct epitope. This approach could be applicable to the efficient selection of random ORFs representing the coding potential of whole organisms, and their subsequent downstream use in a number of different systems.
|
26
|
Hosfield D, Palan J, Hilgers M, Scheibe D, McRee DE, Stevens RC. A fully integrated protein crystallization platform for small-molecule drug discovery. J Struct Biol 2003; 142:207-17. [PMID: 12718932 DOI: 10.1016/s1047-8477(03)00051-0] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.2] [Indexed: 11/28/2022]
Abstract
Structure-based drug discovery in the pharmaceutical industry benefits from cost-efficient methodologies that quickly assess the feasibility of specific, often refractory, protein targets to form well-diffracting crystals. By tightly coupling construct and purification diversity with nanovolume crystallization, the Structural Biology Group at Syrrx has developed such a platform to support its small-molecule drug-discovery program. During the past 18 months of operation at Syrrx, the Structural Biology Group has executed several million crystallization and imaging trials on over 400 unique drug-discovery targets. Here, key components of the platform, as well as an analysis of some experimental results that allowed for platform optimization, will be described.
Affiliation(s)
- David Hosfield
- Syrrx, Inc., 10410 Science Center Drive, San Diego, CA 92121, USA
|
27
|
Mandell JW, Manabe RI, Horwitz AF, Baumgart JP. Fluorescence imaging of mobility shifts: an expression cloning method for identification of cell signaling targets. J Transl Med 2002; 82:1631-6. [PMID: 12480913 DOI: 10.1097/01.lab.0000041711.57606.ab] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022] Open
Abstract
There is a need for a simple global approach to identify signaling targets that are posttranslationally modified in response to physiologic or pathologic stimuli within living cells. Reported here is a simple method, fluorescence imaging of mobility shifts (FIMS), which relies on in-gel detection of cell-expressed green fluorescent protein fusion proteins undergoing electrophoretic mobility shifts. This detection method is applied to a small pool cDNA library screening protocol. The readout is essentially a differential display of posttranslational modifications. Unlike biochemical approaches to identifying signaling targets, the screen is performed in living cells using standard methods for transient transfection. This enables detection of intracellular targets modified in response to either molecularly defined stimuli, such as growth factors or drugs, or complex pathologic stimuli, such as oxidative stress or hypoglycemia. FIMS is rapid, sensitive, inexpensive, and nonradioactive and easily adapted to automated high throughput methods, including capillary electrophoresis. The technique is sufficiently sensitive to easily detect fluorescent proteins expressed in a single well in 384-well format. FIMS is applicable to traditional cDNA library screening, but the method will be especially attractive for screening preselected collections of autofluorescent fusion proteins. A bonus of the technique is that examination of transfected cells by fluorescence microscopy provides immediate information about intracellular localization and stimulus-induced translocation of putative targets. We illustrate the utility of the technique with pilot screens for apoptotic and mitogenic targets modified by staurosporine and serum stimulation, respectively.
Affiliation(s)
- James W Mandell
- Department of Pathology, University of Virginia, Charlottesville, Virginia 22908, USA
|
28
|
Abstract
The system-wide study of proteins presents an exciting challenge in this information-rich age of whole-genome biology. Although traditional investigations have yielded abundant information about individual proteins, they have been less successful at providing us with an integrated understanding of biological systems. The promise of proteomics is that, by studying many components simultaneously, we will learn how proteins interact with each other, as well as with non-proteinaceous molecules, to control complex processes in cells, tissues and even whole organisms. Here, I discuss the role of microarray technology in this burgeoning area.
Affiliation(s)
- Gavin MacBeath
- Department of Chemistry and Chemical Biology, and Bauer Center for Genomics Research, Harvard University, 12 Oxford Street, Cambridge, Massachusetts 02138, USA.
|
29
|
Ilag LL, Ng JH, Beste G, Henning SW. Emerging high-throughput drug target validation technologies. Drug Discov Today 2002; 7:S136-42. [PMID: 12546880 DOI: 10.1016/s1359-6446(02)02429-7] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.7] [Indexed: 10/27/2022]
Abstract
Identifying the right target for drug development is a critical bottleneck in the pharmaceutical and biotech industries. The genomics revolution has shifted the problem from a scarcity of targets to a surplus of putative drug targets. As the validity of a target cannot be simply inferred from correlative data, the key is confirmation of the causative role of a gene product in a particular disease. It should therefore be recognized that an effective therapeutic strategy requires an appropriate target validation technology to verify the right target.
Affiliation(s)
- Leodevico L Ilag
- Xerion Pharmaceuticals, Fraunhoferstr. 9, 82152 Martinsried, Germany.
|
30
|
Abstract
In the past, protein expression has been perceived as the principal bottleneck in protein characterization and structure determination. The challenge now is to rapidly express large numbers of genes in the search for new drug targets and therapeutic proteins encoded by the human genome. In this competitive environment, several high-throughput expression strategies for protein production are being used to industrialize the process of protein expression.
Affiliation(s)
- Stephen P Chambers
- Vertex Pharmaceuticals, 130 Washington Street, Cambridge, MA 02139, USA.
|
31
|
Brizuela L, Richardson A, Marsischky G, Labaer J. The FLEXGene repository: exploiting the fruits of the genome projects by creating a needed resource to face the challenges of the post-genomic era. Arch Med Res 2002; 33:318-24. [PMID: 12234520 DOI: 10.1016/s0188-4409(02)00372-7] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.5] [Indexed: 11/21/2022]
Abstract
Thanks to the results of the multiple completed and ongoing genome sequencing projects and to the newly available recombination-based cloning techniques, it is now possible to build gene repositories with no precedent in their composition, formatting, and potential. This new type of gene repository is necessary to address the challenges imposed by the post-genomic era, i.e., experimentation on a genome-wide scale. We are building the FLEXGene (Full Length EXpression-ready) repository. This unique resource will contain clones representing the complete ORFeome of different organisms, including Homo sapiens as well as several pathogens and model organisms. It will consist of a comprehensive, characterized (sequence-verified), and arrayed gene repository. This resource will allow full exploitation of the genomic information by enabling genome-wide scale experimentation at the level of functional/phenotypic assays as well as at the level of protein expression, purification, and analysis. Here we describe the rationale and construction of this resource and focus on the data obtained from the Saccharomyces cerevisiae project.
Affiliation(s)
- Leonardo Brizuela
- Department of Biological Chemistry and Molecular Pharmacology, Institute of Proteomics, Harvard Medical School, Boston, MA 02115, USA
|
32
|
Braun P, Hu Y, Shen B, Halleck A, Koundinya M, Harlow E, LaBaer J. Proteome-scale purification of human proteins from bacteria. Proc Natl Acad Sci U S A 2002; 99:2654-9. [PMID: 11880620 PMCID: PMC122403 DOI: 10.1073/pnas.042684199] [Citation(s) in RCA: 215] [Impact Index Per Article: 9.3] [Indexed: 11/18/2022] Open
Abstract
The completion of the human genome project and the development of high-throughput approaches herald a dramatic acceleration in the pace of biological research. One of the most compelling next steps will be learning the functional roles of all proteins. Achievement of this goal depends in part on the rapid expression and isolation of proteins at large scale. We exploited recombinational cloning to facilitate the development of methods for the high-throughput purification of human proteins. cDNAs were introduced into a master vector from which they could be rapidly transferred into a variety of protein expression vectors for further analysis. A test set of 32 sequence-verified human cDNAs of various sizes and activities was moved into four different expression vectors encoding different affinity-purification tags. By means of an automatable 2-hr protein purification procedure, all 128 proteins were purified and subsequently characterized for yield, purity, and steps at which losses occurred. Under denaturing conditions when the His6 tag was used, 84% of samples were purified. Under nondenaturing conditions, both the glutathione S-transferase and maltose-binding protein tags were successful in 81% of samples. The developed methods were applied to a larger set of 336 randomly selected cDNAs. Sixty percent of these proteins were successfully purified under denaturing conditions and 82% of these under nondenaturing conditions. A relational database, FLEXProt, was built to compare properties of proteins that were successfully purified and proteins that were not. We observed that some domains in the Pfam database were found almost exclusively in proteins that were successfully purified and thus may have predictive character.
Affiliation(s)
- Pascal Braun
- Institute of Proteomics, Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, 240 Longwood Avenue, Boston, MA 02115, USA
|
33
|
Current Awareness on Comparative and Functional Genomics. Comp Funct Genomics 2002. [PMCID: PMC2448432 DOI: 10.1002/cfg.119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022] Open
|