1
Banerjee S, Fraser K, Crone DE, Patel JC, Bondos SE, Bystroff C. Challenges and Solutions for Leave-One-Out Biosensor Design in the Context of a Rugged Fitness Landscape. Sensors (Basel) 2024; 24:6380. [PMID: 39409420 PMCID: PMC11478963 DOI: 10.3390/s24196380] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2024] [Revised: 09/29/2024] [Accepted: 09/29/2024] [Indexed: 10/20/2024]
Abstract
The leave-one-out (LOO) green fluorescent protein (GFP) approach to biosensor design combines computational protein design with split protein reconstitution. LOO-GFPs reversibly fold and gain fluorescence upon encountering the target peptide, which can be redefined by computational design of the LOO site. Such an approach can be used to create reusable biosensors for the early detection of emerging biological threats. Enlightening biophysical inferences for nine LOO-GFP biosensor libraries are presented, with target sequences from dengue, influenza, or HIV, replacing beta strands 7, 8, or 11. An initially low hit rate was traced to components of the energy function, manifesting in the over-rewarding of over-tight side chain packing. Also, screening by colony picking required a low library complexity, but designing a biosensor against a peptide of at least 12 residues requires a high-complexity library. This double-bind was solved using a "piecemeal" iterative design strategy. Also, designed LOO-GFPs fluoresced in the unbound state due to unwanted dimerization, but this was solved by fusing a fully functional prototype LOO-GFP to a fiber-forming protein, Drosophila ultrabithorax, creating a biosensor fiber. One influenza hemagglutinin biosensor is characterized here in detail, showing a shifted excitation/emission spectrum, a micromolar affinity for the target peptide, and an unexpected photo-switching ability.
Affiliation(s)
- Shounak Banerjee
  - Los Alamos National Laboratory, Los Alamos, NM 87545, USA
  - Biological Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180, USA
- Keith Fraser
  - Biological Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180, USA
- Donna E. Crone
  - Biological Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180, USA
- Jinal C. Patel
  - Biological Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180, USA
- Sarah E. Bondos
  - Medical Physiology, Texas A&M University, College Station, TX 77843, USA
- Christopher Bystroff
  - Biological Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180, USA
  - Computer Science, Rensselaer Polytechnic Institute, Troy, NY 12180, USA
2
Pal A, Mulumudy R, Mitra P. Modularity-based parallel protein design algorithm with an implementation using shared memory programming. Proteins 2021; 90:658-669. [PMID: 34651333 DOI: 10.1002/prot.26263] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/17/2021] [Revised: 09/23/2021] [Accepted: 10/01/2021] [Indexed: 01/08/2023]
Abstract
Given a target protein structure, the prime objective of protein design is to find amino acid sequences that will fold/acquire to the given three-dimensional structure. The protein design problem belongs to the non-deterministic polynomial-time-hard class as sequence search space increases exponentially with protein length. To ensure better search space exploration and faster convergence, we propose a protein modularity-based parallel protein design algorithm. The modular architecture of the protein structure is exploited by considering an intermediate structural organization between secondary structure and domain defined as protein unit (PU). Here, we have incorporated a divide-and-conquer approach where a protein is split into PUs and each PU region is explored in a parallel fashion. It has been further analyzed that our shared memory implementation of modularity-based parallel sequence search leads to better search space exploration compared to the case of traditional full protein design. Sequence-based analysis on design sequences depicts an average of 39.7% sequence similarity on the benchmark data set. Structure-based comparison of the modeled structures of the design protein with the target structure exhibited an average root-mean-square deviation of 1.17 Å and an average template modeling score of 0.89. The selected modeled structures of the design protein sequences are validated using 100 ns molecular dynamics simulations where 80% of the proteins have shown better or similar stability to the respective target proteins. Our study informs that our modularity-based protein design algorithm can be extended to protein interaction design as well.
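The divide-and-conquer idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the scoring function `toy_score`, the fixed-size split in `split_units`, and the hill-climbing search in `optimise_unit` are all hypothetical stand-ins, and the sketch ignores interactions between units, which a real design method would have to account for. Threads are used only because they share memory within one process.

```python
import random
from concurrent.futures import ThreadPoolExecutor

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_score(seq, ideal):
    """Hypothetical stand-in for an energy function: mismatches to an 'ideal' sequence."""
    return sum(a != b for a, b in zip(seq, ideal))


def split_units(length, unit_size):
    """Split positions 0..length-1 into contiguous units (assumed fixed size here)."""
    return [(s, min(s + unit_size, length)) for s in range(0, length, unit_size)]


def optimise_unit(job):
    """Hill-climb one unit independently; each job carries its own RNG seed."""
    (start, end), ideal, steps, seed = job
    rng = random.Random(seed)
    seg = [rng.choice(AMINO_ACIDS) for _ in range(end - start)]
    best = toy_score(seg, ideal[start:end])
    for _ in range(steps):
        pos = rng.randrange(len(seg))
        old = seg[pos]
        seg[pos] = rng.choice(AMINO_ACIDS)
        new = toy_score(seg, ideal[start:end])
        if new <= best:
            best = new        # keep the mutation
        else:
            seg[pos] = old    # revert the mutation
    return start, "".join(seg), best


def design_in_parallel(ideal, unit_size=8, steps=500, seed=0):
    """Search every unit concurrently, then stitch the segments back in order."""
    units = split_units(len(ideal), unit_size)
    jobs = [(u, ideal, steps, seed + k) for k, u in enumerate(units)]
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(optimise_unit, jobs))
    results.sort()  # order by start position
    designed = "".join(r[1] for r in results)
    return designed, sum(r[2] for r in results)


if __name__ == "__main__":
    designed, score = design_in_parallel("MKTAYIAKQRQISFVKSHFSRQ")
    print(designed, score)
```

Because each unit is searched over its own small sequence space, the per-unit searches are independent and can run side by side; the price in this simplified form is that contacts spanning two units are never scored.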
Affiliation(s)
- Abantika Pal
  - Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal, India
- Rohith Mulumudy
  - Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal, India
- Pralay Mitra
  - Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal, India
3
Dauzhenka T, Kundrotas PJ, Vakser IA. Computational Feasibility of an Exhaustive Search of Side-Chain Conformations in Protein-Protein Docking. J Comput Chem 2018; 39:2012-2021. [PMID: 30226647 DOI: 10.1002/jcc.25381] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 09/29/2017] [Revised: 03/24/2018] [Accepted: 05/26/2018] [Indexed: 11/07/2022]
Abstract
Protein-protein docking procedures typically perform the global scan of the proteins' relative positions, followed by the local refinement of the putative matches. Because of the size of the search space, the global scan is usually implemented as rigid-body search, using computationally inexpensive intermolecular energy approximations. An adequate refinement has to take into account structural flexibility. Since the refinement performs conformational search of the interacting proteins, it is extremely computationally challenging, given the enormous amount of the internal degrees of freedom. Different approaches limit the search space by restricting the search to the side chains, rotameric states, coarse-grained structure representation, principal normal modes, and so on. Still, even with the approximations, the refinement presents an extreme computational challenge due to the very large number of the remaining degrees of freedom. Given the complexity of the search space, the advantage of the exhaustive search is obvious. The obstacle to such search is computational feasibility. However, the growing computational power of modern computers, especially due to the increasing utilization of Graphics Processing Unit (GPU) with large amount of specialized computing cores, extends the ranges of applicability of the brute-force search methods. This proof-of-concept study demonstrates computational feasibility of an exhaustive search of side-chain conformations in protein docking. The procedure, implemented on the GPU architecture, was used to generate the optimal conformations in a large representative set of protein-protein complexes. © 2018 Wiley Periodicals, Inc.
Affiliation(s)
- Taras Dauzhenka
  - Center for Computational Biology, The University of Kansas, Lawrence, Kansas, 66047
- Petras J Kundrotas
  - Center for Computational Biology, The University of Kansas, Lawrence, Kansas, 66047
- Ilya A Vakser
  - Center for Computational Biology, The University of Kansas, Lawrence, Kansas, 66047
  - Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas, 66047
4
Using natural sequences and modularity to design common and novel protein topologies. Curr Opin Struct Biol 2016; 38:26-36. [PMID: 27270240 DOI: 10.1016/j.sbi.2016.05.007] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 03/01/2016] [Revised: 05/13/2016] [Accepted: 05/18/2016] [Indexed: 02/07/2023]
Abstract
Protein design is still a challenging undertaking, often requiring multiple attempts or iterations for success. Typically, the source of failure is unclear, and scoring metrics appear similar between successful and failed cases. Nevertheless, the use of sequence statistics, modularity and symmetry from natural proteins, combined with computational design both at the coarse-grained and atomistic levels is propelling a new wave of design efforts to success. Here we highlight recent examples of design, showing how the wealth of natural protein sequence and topology data may be leveraged to reduce the search space and increase the likelihood of achieving desired outcomes.
5
Huang YM, Banerjee S, Crone DE, Schenkelberg CD, Pitman DJ, Buck PM, Bystroff C. Toward Computationally Designed Self-Reporting Biosensors Using Leave-One-Out Green Fluorescent Protein. Biochemistry 2015; 54:6263-73. [PMID: 26397806 DOI: 10.1021/acs.biochem.5b00786] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 11/29/2022]
Abstract
Leave-one-out green fluorescent protein (LOOn-GFP) is a circularly permuted and truncated GFP lacking the nth β-strand element. LOO7-GFP derived from the wild-type sequence (LOO7-WT) folds and reconstitutes fluorescence upon addition of β-strand 7 (S7) as an exogenous peptide. Computational protein design may be used to modify the sequence of LOO7-GFP to fit a different peptide sequence, while retaining the reconstitution activity. Here we present a computationally designed leave-one-out GFP in which wild-type strand 7 has been replaced by a 12-residue peptide (HA) from the H5 antigenic region of the Thailand strain of H5N1 influenza virus hemagglutinin. The DEEdesign software was used to generate a sequence library with mutations at 13 positions around the peptide, coding for approximately 3 × 10⁵ sequence combinations. The library was coexpressed with the HA peptide in E. coli and colonies were screened for in vivo fluorescence. Glowing colonies were sequenced, and one (LOO7-HA4) with 7 mutations was purified and characterized. LOO7-HA4 folds, fluoresces in vivo and in vitro, and binds HA. However, binding results in a decrease in fluorescence instead of the expected increase, caused by the peptide-induced dissociation of a novel, glowing oligomeric complex instead of the reconstitution of the native structure. Efforts to improve binding and recover reconstitution using in vitro evolution produced colonies that glowed brighter and matured faster. Two of these were characterized. One lost all affinity for the HA peptide but glowed more brightly in the unbound oligomeric state. The other increased in affinity to the HA peptide but still did not reconstitute the fully folded state. Despite failing to fold completely, peptide binding by computational design was observed and was improved by directed evolution. The ratio of HA to S7 binding increased from 0.0 for the wild-type sequence (no binding) to 0.01 after computational design (weak binding) and to 0.48 (comparable binding) after in vitro evolution. The novel oligomeric state is composed of an open barrel.
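The library size quoted in this abstract (about 3 × 10⁵ combinations over 13 mutated positions) can be put in perspective with a short calculation. A combinatorial library's size is the product of the number of residues allowed at each position, so the geometric mean number of choices per position follows directly. The per-position counts in `hypothetical_choices` below are invented for illustration and are not taken from the paper.

```python
import math


def library_size(choices_per_position):
    """Number of distinct sequences: product of the allowed residues at each position."""
    return math.prod(choices_per_position)


def mean_choices(total_size, n_positions):
    """Geometric mean number of allowed residues per position for a given library size."""
    return total_size ** (1.0 / n_positions)


# Hypothetical allowed-residue counts for 13 positions (illustration only).
hypothetical_choices = [2, 3, 2, 4, 3, 2, 3, 2, 3, 2, 4, 3, 2]

if __name__ == "__main__":
    print(library_size(hypothetical_choices))
    print(round(mean_choices(3e5, 13), 2))
```

Under these assumptions, a library of roughly 3 × 10⁵ sequences across 13 positions corresponds to fewer than three allowed residues per position on average, which shows how quickly complexity grows as positions or choices are added.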
Affiliation(s)
- Yao-Ming Huang
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, California 94158, United States
6
Zhou Y, Xu W, Donald BR, Zeng J. An efficient parallel algorithm for accelerating computational protein design. Bioinformatics 2014; 30:i255-i263. [PMID: 24931991 PMCID: PMC4058937 DOI: 10.1093/bioinformatics/btu264] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Indexed: 01/21/2023]
Abstract
Motivation: Structure-based computational protein design (SCPR) is an important topic in protein engineering. Under the assumption of a rigid backbone and a finite set of discrete conformations of side-chains, various methods have been proposed to address this problem. A popular method is to combine the dead-end elimination (DEE) and A* tree search algorithms, which provably finds the global minimum energy conformation (GMEC) solution. Results: In this article, we improve the efficiency of computing A* heuristic functions for protein design and propose a variant of A* algorithm in which the search process can be performed on a single GPU in a massively parallel fashion. In addition, we make some efforts to address the memory exceeding problem in A* search. As a result, our enhancements can achieve a significant speedup of the A*-based protein design algorithm by four orders of magnitude on large-scale test data through pre-computation and parallelization, while still maintaining an acceptable memory overhead. We also show that our parallel A* search algorithm could be successfully combined with iMinDEE, a state-of-the-art DEE criterion, for rotamer pruning to further improve SCPR with the consideration of continuous side-chain flexibility. Availability: Our software is available and distributed open-source under the GNU Lesser General License Version 2.1 (GNU, February 1999). The source code can be downloaded from http://www.cs.duke.edu/donaldlab/osprey.php or http://iiis.tsinghua.edu.cn/~compbio/software.html. Contact: zengjy321@tsinghua.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online.
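The rotamer-pruning step mentioned in this abstract can be sketched in a few lines of Python. The sketch below uses the classic Goldstein form of dead-end elimination, not iMinDEE, and it replaces the A* search with a brute-force enumeration on a tiny random instance so that the result can be checked. All function names and the random energy tables are hypothetical; none of this reflects the OSPREY code base.

```python
import itertools
import random


def pair(pair_e, i, r, j, s):
    """Pairwise energy between rotamer r at position i and rotamer s at position j."""
    return pair_e[(i, j)][r][s] if i < j else pair_e[(j, i)][s][r]


def total_energy(assign, self_e, pair_e):
    """Energy of a full assignment: self terms plus all pairwise terms."""
    n = len(assign)
    e = sum(self_e[i][assign[i]] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            e += pair_e[(i, j)][assign[i]][assign[j]]
    return e


def goldstein_dee(self_e, pair_e):
    """Repeatedly drop rotamer r at position i if some rotamer t always beats it."""
    n = len(self_e)
    alive = [set(range(len(e))) for e in self_e]
    changed = True
    while changed:
        changed = False
        for i in range(n):
            for r in sorted(alive[i]):
                for t in sorted(alive[i] - {r}):
                    # Best case for r relative to t over every surviving neighbour rotamer.
                    gap = self_e[i][r] - self_e[i][t]
                    for j in range(n):
                        if j != i:
                            gap += min(pair(pair_e, i, r, j, s) - pair(pair_e, i, t, j, s)
                                       for s in alive[j])
                    if gap > 0:  # r is worse than t in every context: dead end
                        alive[i].discard(r)
                        changed = True
                        break
    return alive


def brute_force_gmec(self_e, pair_e, allowed):
    """Enumerate every allowed assignment; return (energy, assignment) of the minimum."""
    return min((total_energy(a, self_e, pair_e), a)
               for a in itertools.product(*[sorted(s) for s in allowed]))


def random_instance(n_pos, n_rot, seed):
    """Random toy energy tables (illustration only, not physical energies)."""
    rng = random.Random(seed)
    self_e = [[rng.uniform(-1, 1) for _ in range(n_rot)] for _ in range(n_pos)]
    pair_e = {(i, j): [[rng.uniform(-1, 1) for _ in range(n_rot)] for _ in range(n_rot)]
              for i in range(n_pos) for j in range(i + 1, n_pos)}
    return self_e, pair_e
```

A typical use is to call `goldstein_dee` first and then search only the surviving rotamers. Because a rotamer is removed only when another rotamer is strictly better in every context, the minimum found in the pruned space matches the minimum of the full space.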
Affiliation(s)
- Yichao Zhou
  - Institute for Theoretical Computer Science (ITCS), Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, P. R. China, Department of Computer Science, Duke University, Durham, NC 27708, USA and Department of Biochemistry, Duke University Medical Center, Durham, NC 27708, USA
- Wei Xu
  - Institute for Theoretical Computer Science (ITCS), Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, P. R. China, Department of Computer Science, Duke University, Durham, NC 27708, USA and Department of Biochemistry, Duke University Medical Center, Durham, NC 27708, USA
- Bruce R Donald
  - Institute for Theoretical Computer Science (ITCS), Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, P. R. China, Department of Computer Science, Duke University, Durham, NC 27708, USA and Department of Biochemistry, Duke University Medical Center, Durham, NC 27708, USA
- Jianyang Zeng
  - Institute for Theoretical Computer Science (ITCS), Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, P. R. China, Department of Computer Science, Duke University, Durham, NC 27708, USA and Department of Biochemistry, Duke University Medical Center, Durham, NC 27708, USA