1
Gusev VV, Adamson D, Deligkas A, Antypov D, Collins CM, Krysta P, Potapov I, Darling GR, Dyer MS, Spirakis P, Rosseinsky MJ. Optimality guarantees for crystal structure prediction. Nature 2023; 619:68-72. [PMID: 37407679 DOI: 10.1038/s41586-023-06071-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2022] [Accepted: 04/04/2023] [Indexed: 07/07/2023]
Abstract
Crystalline materials enable essential technologies, and their properties are determined by their structures. Crystal structure prediction can thus play a central part in the design of new functional materials [1,2]. Researchers have developed efficient heuristics to identify structural minima on the potential energy surface [3-5]. Although these methods can often access all configurations in principle, there is no guarantee that the lowest energy structure has been found. Here we show that the structure of a crystalline material can be predicted with energy guarantees by an algorithm that finds all the unknown atomic positions within a unit cell by combining combinatorial and continuous optimization. We encode the combinatorial task of finding the lowest energy periodic allocation of all atoms on a lattice as a mathematical optimization problem of integer programming [6,7], enabling guaranteed identification of the global optimum using well-developed algorithms. A single subsequent local minimization of the resulting atom allocations then reaches the correct structures of key inorganic materials directly, proving their energetic optimality under clear assumptions. This formulation of crystal structure prediction establishes a connection to the theory of algorithms and provides the absolute energetic status of observed or predicted materials. It provides the ground truth for heuristic or data-driven structure prediction methods and is uniquely suitable for quantum annealers [8-10], opening a path to overcome the combinatorial explosion of atomic configurations.
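The combinatorial step described in this abstract, choosing the lowest-energy periodic allocation of atoms on a lattice, can be illustrated with a toy sketch. This is not the authors' integer-programming formulation: it uses plain exhaustive enumeration in place of an integer-programming solver, and the one-dimensional ring, the unit charges, the `1/d` pair energy, and all function names are assumptions made up for illustration.

```python
from itertools import combinations


def ring_distance(i, j, n_sites):
    """Shortest separation between sites i and j on a periodic ring."""
    d = abs(i - j)
    return min(d, n_sites - d)


def energy(charges):
    """Toy Coulomb-like energy: sum of q_i * q_j / d_ij over all site pairs."""
    n = len(charges)
    return sum(
        charges[i] * charges[j] / ring_distance(i, j, n)
        for i, j in combinations(range(n), 2)
    )


def best_allocation(n_sites, n_cations):
    """Try every placement of cations (+1); the remaining sites hold anions (-1).

    Exhaustive search guarantees the global minimum of this toy energy,
    but its cost grows combinatorially with the number of sites.
    """
    best = None
    for cation_sites in combinations(range(n_sites), n_cations):
        charges = [1 if s in cation_sites else -1 for s in range(n_sites)]
        e = energy(charges)
        # Small tolerance so a tie caused by rounding does not replace the first optimum.
        if best is None or e < best[0] - 1e-12:
            best = (e, charges)
    return best


best_energy, best_charges = best_allocation(n_sites=6, n_cations=3)
print(best_energy, best_charges)
```

Enumeration is only feasible for very small cells; the abstract's point is that encoding the same discrete choice as an integer program lets established solvers certify the optimum without visiting every allocation.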
Affiliation(s)
- Vladimir V Gusev
  - Leverhulme Research Centre for Functional Materials Design, Materials Innovation Factory, University of Liverpool, Liverpool, UK
  - Department of Computer Science, University of Liverpool, Liverpool, UK
- Duncan Adamson
  - Leverhulme Research Centre for Functional Materials Design, Materials Innovation Factory, University of Liverpool, Liverpool, UK
- Argyrios Deligkas
  - Leverhulme Research Centre for Functional Materials Design, Materials Innovation Factory, University of Liverpool, Liverpool, UK
  - Department of Computer Science, Royal Holloway, University of London, London, UK
- Dmytro Antypov
  - Leverhulme Research Centre for Functional Materials Design, Materials Innovation Factory, University of Liverpool, Liverpool, UK
- Piotr Krysta
  - Department of Computer Science, University of Liverpool, Liverpool, UK
- Igor Potapov
  - Department of Computer Science, University of Liverpool, Liverpool, UK
- Matthew S Dyer
  - Leverhulme Research Centre for Functional Materials Design, Materials Innovation Factory, University of Liverpool, Liverpool, UK
  - Department of Chemistry, University of Liverpool, Liverpool, UK
- Paul Spirakis
  - Leverhulme Research Centre for Functional Materials Design, Materials Innovation Factory, University of Liverpool, Liverpool, UK
  - Department of Computer Science, University of Liverpool, Liverpool, UK
- Matthew J Rosseinsky
  - Leverhulme Research Centre for Functional Materials Design, Materials Innovation Factory, University of Liverpool, Liverpool, UK
  - Department of Chemistry, University of Liverpool, Liverpool, UK
2
Wüthrich K. Brownian motion, spin diffusion and protein structure determination in solution. Journal of Magnetic Resonance (San Diego, Calif. : 1997) 2021; 331:107031. [PMID: 34391647 DOI: 10.1016/j.jmr.2021.107031] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/25/2021] [Revised: 06/28/2021] [Accepted: 06/29/2021] [Indexed: 06/13/2023]
Abstract
This paper presents my recollections on the development of protein structure determination by NMR in solution from 1968 to 1992. The key to success was to identify NMR-accessible parameters that unambiguously determine the spatial arrangement of polypeptide chains. Inspired by work with cyclopeptides, model considerations showed that enforcing short non-bonding interatomic distances imposes «ring closure conditions» on polypeptide chains. Given that distances are scalar parameters, this indicated an avenue for studies of proteins in solution, i.e., under the regime of stochastic rotational and translational motions at frequencies in the nanosecond range (Brownian motion), where sharp pictures could not be obtained by photography-related methods. Later on, we used distance geometry calculations with sets of interatomic distances derived from protein crystal structures to confirm that measurements of short proton-proton distances could provide atomic-resolution structures of globular proteins. During the years 1976-1984, the following four lines of research then led to protein structure determination by NMR in solution. First, the development of NMR experiments enabled the use of the nuclear Overhauser effect (NOE) for measurements of interatomic distances between pairs of hydrogen atoms in proteins. Second, obtaining sequence-specific resonance assignments solved the "phase problem" for protein structure determination by NMR. Third, generating and programming novel distance geometry algorithms enabled the calculation of atomic-resolution protein structures from limited sets of distance constraints measured by NMR. Fourth, the introduction of two-dimensional NMR provided greatly improved spectral resolution of the complex spectra of proteins as well as efficient delineation of scalar and dipole-dipole 1H-1H connectivities, thus making protein structure determination in solution viable and attractive.
Affiliation(s)
- Kurt Wüthrich
  - ETH Zürich, Zürich, Switzerland and Scripps Research, La Jolla, CA, USA
3
Influence of cross-linker polarity on selectivity towards lysine side chains. J Proteomics 2020; 218:103716. [DOI: 10.1016/j.jprot.2020.103716] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/06/2019] [Revised: 02/02/2020] [Accepted: 02/19/2020] [Indexed: 11/19/2022]
4
Quantum computing based hybrid solution strategies for large-scale discrete-continuous optimization problems. Comput Chem Eng 2020. [DOI: 10.1016/j.compchemeng.2019.106630] [Citation(s) in RCA: 59] [Impact Index Per Article: 14.8] [Indexed: 01/08/2023]
5
Yoshikawa N, Hutchison GR. Fast, efficient fragment-based coordinate generation for Open Babel. J Cheminform 2019; 11:49. [PMID: 31372768 PMCID: PMC6676618 DOI: 10.1186/s13321-019-0372-5] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 03/30/2019] [Accepted: 07/23/2019] [Indexed: 12/19/2022]
Abstract
Rapidly predicting an accurate three-dimensional geometry of a molecule is a crucial task for cheminformatics and across a wide range of molecular modeling. Consequently, developing a fast, accurate, and open implementation of structure prediction is necessary for reproducible cheminformatics research. We introduce a fragment-based coordinate generation implementation for Open Babel, a widely-used open source toolkit for cheminformatics. The new implementation improves speed and stereochemical accuracy, while retaining or improving accuracy of bond lengths, bond angles, and dihedral torsions. Input molecules are broken into fragments by cutting at rotatable bonds. The coordinates of fragments are set according to a fragment library, prepared from open crystallographic databases. Since the coordinates of multiple atoms are decided at once, coordinate prediction is accelerated over the previous rules-based implementation in Open Babel, as well as the widely-used distance geometry methods in RDKit. This new implementation will be beneficial for a wide range of applications, including computational property prediction in polymers, molecular materials and drug design.
Affiliation(s)
- Naruki Yoshikawa
  - Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba, Japan
- Geoffrey R Hutchison
  - Department of Chemistry and Chemical Engineering, University of Pittsburgh, 219 Parkman Avenue, Pittsburgh, PA, 15260, USA
6
Lavor C, Liberti L, Donald B, Worley B, Bardiaux B, Malliavin TE, Nilges M. Minimal NMR distance information for rigidity of protein graphs. Discrete Applied Mathematics (Amsterdam, Netherlands : 1988) 2019; 256:91-104. [PMID: 30799888 PMCID: PMC6380886 DOI: 10.1016/j.dam.2018.03.071] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 06/09/2023]
Abstract
Nuclear Magnetic Resonance (NMR) experiments provide distances between nearby atoms of a protein molecule. The corresponding structure determination problem is to determine the 3D protein structure by exploiting such distances. We present a new order on the atoms of the protein, based on information from the chemistry of proteins and NMR experiments, which allows us to formulate the problem as a combinatorial search. Additionally, this order tells us what kind of NMR distance information is crucial to understand the cardinality of the solution set of the problem and its computational complexity.
Affiliation(s)
- Carlile Lavor
  - University of Campinas (IMECC-UNICAMP), 13081-970, Campinas - SP, Brazil
- Leo Liberti
  - CNRS LIX, École Polytechnique, 91128 Palaiseau, France
- Bruce Donald
  - Duke University, Department of Computer Science, Durham, NC 27708-0129, USA
- Bradley Worley
  - Institut Pasteur, Structural Bioinformatics Unit, 25 rue du Dr. Roux, 75015 Paris, France
  - CNRS UMR3528, 25 rue du Dr. Roux, 75015 Paris, France
- Benjamin Bardiaux
  - Institut Pasteur, Structural Bioinformatics Unit, 25 rue du Dr. Roux, 75015 Paris, France
  - CNRS UMR3528, 25 rue du Dr. Roux, 75015 Paris, France
- Thérèse E Malliavin
  - Institut Pasteur, Structural Bioinformatics Unit, 25 rue du Dr. Roux, 75015 Paris, France
  - CNRS UMR3528, 25 rue du Dr. Roux, 75015 Paris, France
- Michael Nilges
  - Institut Pasteur, Structural Bioinformatics Unit, 25 rue du Dr. Roux, 75015 Paris, France
  - CNRS UMR3528, 25 rue du Dr. Roux, 75015 Paris, France
7
Rozbeský D, Rosůlek M, Kukačka Z, Chmelík J, Man P, Novák P. Impact of Chemical Cross-Linking on Protein Structure and Function. Anal Chem 2018; 90:1104-1113. [DOI: 10.1021/acs.analchem.7b02863] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Indexed: 11/28/2022]
Affiliation(s)
- Daniel Rozbeský
  - Institute of Microbiology, v.v.i., Czech Academy of Sciences, 14220 Prague, Czech Republic
  - Department of Biochemistry, Faculty of Science, Charles University in Prague, 12843 Prague, Czech Republic
- Michal Rosůlek
  - Institute of Microbiology, v.v.i., Czech Academy of Sciences, 14220 Prague, Czech Republic
  - Department of Biochemistry, Faculty of Science, Charles University in Prague, 12843 Prague, Czech Republic
- Zdeněk Kukačka
  - Institute of Microbiology, v.v.i., Czech Academy of Sciences, 14220 Prague, Czech Republic
  - Department of Biochemistry, Faculty of Science, Charles University in Prague, 12843 Prague, Czech Republic
- Josef Chmelík
  - Institute of Microbiology, v.v.i., Czech Academy of Sciences, 14220 Prague, Czech Republic
  - Department of Biochemistry, Faculty of Science, Charles University in Prague, 12843 Prague, Czech Republic
- Petr Man
  - Institute of Microbiology, v.v.i., Czech Academy of Sciences, 14220 Prague, Czech Republic
  - Department of Biochemistry, Faculty of Science, Charles University in Prague, 12843 Prague, Czech Republic
- Petr Novák
  - Institute of Microbiology, v.v.i., Czech Academy of Sciences, 14220 Prague, Czech Republic
  - Department of Biochemistry, Faculty of Science, Charles University in Prague, 12843 Prague, Czech Republic
8
An Exhaustive Search Algorithm to Aid NMR-Based Structure Determination of Rotationally Symmetric Transmembrane Oligomers. Sci Rep 2017; 7:17373. [PMID: 29234103 PMCID: PMC5727114 DOI: 10.1038/s41598-017-17639-w] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 09/14/2017] [Accepted: 11/15/2017] [Indexed: 11/26/2022]
Abstract
Nuclear magnetic resonance (NMR) has been an important source of structural restraints for solving structures of oligomeric transmembrane domains (TMDs) of cell surface receptors and viral membrane proteins. In NMR studies, oligomers are assembled using inter-protomer distance restraints. However, for oligomers larger than a dimer, these distance restraints all have two-fold directional ambiguity, and resolving such ambiguity often requires time-consuming trial-and-error calculations using restrained molecular dynamics (MD) with simulated annealing (SA). We report an Exhaustive Search algorithm for Symmetric Oligomer (ExSSO), which can perform a near-complete search of the symmetric conformational space in a very short time. In this approach, the predetermined protomer model is subjected to a full angular and spatial search within the symmetry space. This approach, which can be applied to any rotationally symmetric oligomer, was validated using the structures of the Fas death receptor, the HIV-1 gp41 fusion protein, the influenza proton channel, and the MCU pore. The algorithm is able to generate approximate oligomer solutions quickly as initial inputs for further refinement using the MD/SA method.
9
Konopka BM, Ciombor M, Kurczynska M, Kotulska M. Automated procedure for contact-map-based protein structure reconstruction. J Membr Biol 2014; 247:409-20. [PMID: 24682239 PMCID: PMC3983884 DOI: 10.1007/s00232-014-9648-x] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 01/10/2014] [Accepted: 03/04/2014] [Indexed: 11/25/2022]
Abstract
Knowledge of the three-dimensional structures of ion channels allows for modeling their conductivity characteristics using biophysical models and can lead to discovering their cellular functionality. Recent studies show that the quality of structure predictions can be significantly improved using protein contact site information. Therefore, a number of procedures for protein structure prediction based on contact maps have been proposed. Their comparison is difficult due to the different methodologies used for validation. In this work, a Contact Map-to-Structure pipeline (C2S_pipeline) for contact-based protein structure reconstruction is designed and validated. The C2S_pipeline can be used to reconstruct monomeric and multimeric proteins. The median RMSD of structures obtained during validation on a representative set of protein structures was 5.27 Å, and the best structure was reconstructed with an RMSD of 1.59 Å. The validation is followed by a detailed case study on the KcsA ion channel. Models of KcsA are reconstructed based on different portions of contact site information. Structural feature analysis of the acquired KcsA models is supported by a thorough analysis of electrostatic potential distributions inside the channels. The study shows that electrostatic parameters are correlated with the structural quality of models. Therefore, they can be used to discriminate between high and low quality structures. We show that 30% of contact information is needed to obtain accurate structures of KcsA, if contacts are selected randomly. This number increases to 70% in the case of erroneous maps in which the remaining contacts or non-contacts are changed to the opposite. Furthermore, the study reveals that local reconstruction accuracy is correlated with the number of contacts in which amino acids are involved. This results in higher reconstruction accuracy in the structure core than in peripheral regions.
Affiliation(s)
- Bogumil M Konopka
  - Institute of Biomedical Engineering and Instrumentation, Wroclaw University of Technology, Wybrzeze Wyspianskiego 27, 50-370, Wrocław, Poland
10
Sandler I, Zigdon N, Levy E, Aharoni A. The functional importance of co-evolving residues in proteins. Cell Mol Life Sci 2014; 71:673-82. [PMID: 23995987 PMCID: PMC11113390 DOI: 10.1007/s00018-013-1458-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 05/12/2013] [Revised: 07/26/2013] [Accepted: 08/13/2013] [Indexed: 10/26/2022]
Abstract
Computational approaches for detecting co-evolution in proteins allow for the identification of protein-protein interaction networks in different organisms and the assignment of function to under-explored proteins. The detection of co-variation of amino acids within or between proteins, moreover, allows for the discovery of residue-residue contacts and highlights functional residues that can affect the binding affinity, catalytic activity, or substrate specificity of a protein. To explore the functional impact of co-evolutionary changes in proteins, a combined experimental and computational approach must be recruited. Here, we review recent studies that apply computational and experimental tools to obtain novel insight into the structure, function, and evolution of proteins. Specifically, we describe the application of co-evolutionary analysis for predicting high-resolution three-dimensional structures of proteins. In addition, we describe computational approaches followed by experimental analysis for identifying specificity-determining residues in proteins. Finally, we discuss studies addressing the importance of such residues in terms of the functional divergence of proteins, allowing proteins to evolve new functions while avoiding crosstalk with existing cellular pathways or forming reproductive barriers and hence promoting speciation.
Affiliation(s)
- Inga Sandler
  - Department of Life Sciences, Ben-Gurion University of the Negev, 84105 Be’er Sheva, Israel
- Nitzan Zigdon
  - Department of Life Sciences, Ben-Gurion University of the Negev, 84105 Be’er Sheva, Israel
- Efrat Levy
  - Department of Life Sciences, Ben-Gurion University of the Negev, 84105 Be’er Sheva, Israel
- Amir Aharoni
  - Department of Life Sciences, Ben-Gurion University of the Negev, 84105 Be’er Sheva, Israel
  - National Institute for Biotechnology in the Negev (NIBN), Ben-Gurion University of the Negev, 84105 Be’er Sheva, Israel
11
Abstract
There is a wide gap between the generation of large-scale biological data sets and more detailed structural and mechanistic studies. However, recent studies that explicitly combine data from systems and structural biological approaches are having a profound effect on our ability to predict how mutations and small molecules affect atomic-level mechanisms, disrupt systems-level networks, and ultimately lead to changes in organismal fitness. In fact, we argue that a shared framework for analysis of nonadditive genetic and thermodynamic responses to perturbations will accelerate the integration of reductionist and global approaches. A stronger bridge between these two areas will allow for a deeper and more complete understanding of complex biological phenomena and ultimately provide needed breakthroughs in biomedical research.
Affiliation(s)
- James S Fraser
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
12
Protein structure prediction from sequence variation. Nat Biotechnol 2013; 30:1072-80. [PMID: 23138306 DOI: 10.1038/nbt.2419] [Citation(s) in RCA: 423] [Impact Index Per Article: 38.5] [Received: 08/28/2012] [Accepted: 10/15/2012] [Indexed: 02/07/2023]
Abstract
Genomic sequences contain rich evolutionary information about functional constraints on macromolecules such as proteins. This information can be efficiently mined to detect evolutionary couplings between residues in proteins and address the long-standing challenge to compute protein three-dimensional structures from amino acid sequences. Substantial progress has recently been made on this problem owing to the explosive growth in available sequences and the application of global statistical methods. In addition to three-dimensional structure, the improved understanding of covariation may help identify functional residues involved in ligand binding, protein-complex formation and conformational changes. We expect computation of covariation patterns to complement experimental structural biology in elucidating the full spectrum of protein structures, their functional interactions and evolutionary dynamics.
13
Hopf TA, Colwell LJ, Sheridan R, Rost B, Sander C, Marks DS. Three-dimensional structures of membrane proteins from genomic sequencing. Cell 2012; 149:1607-21. [PMID: 22579045 DOI: 10.1016/j.cell.2012.04.012] [Citation(s) in RCA: 378] [Impact Index Per Article: 31.5] [Received: 03/09/2012] [Revised: 04/12/2012] [Accepted: 04/23/2012] [Indexed: 01/21/2023]
Abstract
We show that amino acid covariation in proteins, extracted from the evolutionary sequence record, can be used to fold transmembrane proteins. We use this technique to predict previously unknown 3D structures for 11 transmembrane proteins (with up to 14 helices) from their sequences alone. The prediction method (EVfold_membrane) applies a maximum entropy approach to infer evolutionary covariation in pairs of sequence positions within a protein family and then generates all-atom models with the derived pairwise distance constraints. We benchmark the approach with blinded de novo computation of known transmembrane protein structures from 23 families, demonstrating unprecedented accuracy of the method for large transmembrane proteins. We show how the method can predict oligomerization, functional sites, and conformational changes in transmembrane proteins. With the rapid rise in large-scale sequencing, more accurate and more comprehensive information on evolutionary constraints can be decoded from genetic variation, greatly expanding the repertoire of transmembrane proteins amenable to modeling by this method.
Affiliation(s)
- Thomas A Hopf
  - Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
14
Marks DS, Colwell LJ, Sheridan R, Hopf TA, Pagnani A, Zecchina R, Sander C. Protein 3D structure computed from evolutionary sequence variation. PLoS One 2011; 6:e28766. [PMID: 22163331 PMCID: PMC3233603 DOI: 10.1371/journal.pone.0028766] [Citation(s) in RCA: 731] [Impact Index Per Article: 56.2] [Received: 11/10/2011] [Accepted: 11/14/2011] [Indexed: 11/19/2022]
Abstract
The evolutionary trajectory of a protein through sequence space is constrained by its function. Collections of sequence homologs record the outcomes of millions of evolutionary experiments in which the protein evolves according to these constraints. Deciphering the evolutionary record held in these sequences and exploiting it for predictive and engineering purposes presents a formidable challenge. The potential benefit of solving this challenge is amplified by the advent of inexpensive high-throughput genomic sequencing. In this paper we ask whether we can infer evolutionary constraints from a set of sequence homologs of a protein. The challenge is to distinguish true co-evolution couplings from the noisy set of observed correlations. We address this challenge using a maximum entropy model of the protein sequence, constrained by the statistics of the multiple sequence alignment, to infer residue pair couplings. Surprisingly, we find that the strength of these inferred couplings is an excellent predictor of residue-residue proximity in folded structures. Indeed, the top-scoring residue couplings are sufficiently accurate and well-distributed to define the 3D protein fold with remarkable accuracy. We quantify this observation by computing, from sequence alone, all-atom 3D structures of fifteen test proteins from different fold classes, ranging in size from 50 to 260 residues, including a G-protein coupled receptor. These blinded inferences are de novo, i.e., they do not use homology modeling or sequence-similar fragments from known structures. The co-evolution signals provide sufficient information to determine accurate 3D protein structure to 2.7–4.8 Å Cα-RMSD error relative to the observed structure, over at least two-thirds of the protein (method called EVfold, details at http://EVfold.org).
This discovery provides insight into essential interactions constraining protein evolution and will facilitate a comprehensive survey of the universe of protein structures, new strategies in protein and drug design, and the identification of functional genetic variants in normal and disease genomes.
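The "noisy set of observed correlations" that this abstract starts from can be made concrete with a small sketch. It computes plain mutual information between alignment columns, which is only the naive pairwise statistic; it is not the maximum entropy model that the authors use to separate true couplings from indirect ones. The four-column toy alignment and all names are invented for illustration.

```python
from collections import Counter
from itertools import combinations
from math import log


def mutual_information(col_a, col_b):
    """Mutual information (in nats) between two alignment columns."""
    n = len(col_a)
    freq_a = Counter(col_a)
    freq_b = Counter(col_b)
    freq_ab = Counter(zip(col_a, col_b))
    return sum(
        (count / n) * log((count / n) / ((freq_a[a] / n) * (freq_b[b] / n)))
        for (a, b), count in freq_ab.items()
    )


# Hypothetical toy alignment: one sequence per row, one residue position per column.
alignment = [
    "AKLE",
    "AKLD",
    "GRLE",
    "GRLD",
    "AKLE",
    "GRLD",
]

# Transpose rows into columns so each entry holds one position across all sequences.
columns = list(zip(*alignment))

# Score every pair of positions and rank them from most to least covarying.
scores = {
    (i, j): mutual_information(columns[i], columns[j])
    for i, j in combinations(range(len(columns)), 2)
}
ranked_pairs = sorted(scores, key=scores.get, reverse=True)
print(ranked_pairs[0], scores[ranked_pairs[0]])
```

In this toy alignment, positions 0 and 1 always change together and so rank first, while the fully conserved position 2 carries no covariation signal. Real alignments also contain indirect (transitive) correlations, and the global model described in the abstract is aimed at that limitation.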
Affiliation(s)
- Debora S Marks
  - Department of Systems Biology, Harvard Medical School, Boston, Massachusetts, United States of America
15
Shibberu Y, Holder A. A spectral approach to protein structure alignment. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2011; 8:867-875. [PMID: 21301031 DOI: 10.1109/tcbb.2011.24] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 05/30/2023]
Abstract
A new intrinsic geometry based on a spectral analysis is used to motivate methods for aligning protein folds. The geometry is induced by the fact that a distance matrix can be scaled so that its eigenvalues are positive. We provide a mathematically rigorous development of the intrinsic geometry underlying our spectral approach and use it to motivate two alignment algorithms. The first uses eigenvalues alone and dynamic programming to quickly compute a fold alignment. Family identification results are reported for the Skolnick40 and Proteus300 data sets. The second algorithm extends our spectral method by iterating between our intrinsic geometry and the 3D geometry of a fold to make high-quality alignments. Results and comparisons are reported for several difficult fold alignments. The second algorithm's ability to correctly identify fold families in the Skolnick40 and Proteus300 data sets is also established.
Affiliation(s)
- Yosi Shibberu
  - Department of Mathematics, Rose-Hulman Institute of Technology, 5500 Wabash Avenue, Terre Haute, IN 47803, USA
16
Liu T, Horst JA, Samudrala R. A novel method for predicting and using distance constraints of high accuracy for refining protein structure prediction. Proteins 2009; 77:220-34. [PMID: 19422061 DOI: 10.1002/prot.22434] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/10/2022]
Abstract
The principal bottleneck in protein structure prediction is the refinement of models from lower accuracies to the resolution observed by experiment. We developed a novel constraints-based refinement method that identifies a high number of accurate input constraints from initial models and rebuilds them using restrained torsion angle dynamics (rTAD). We previously created a Bayesian statistics-based residue-specific all-atom probability discriminatory function (RAPDF) to discriminate native-like models by measuring the probability of accuracy for atom type distances within a given model. Here, we exploit RAPDF to score (i.e., filter) constraints from initial predictions that may or may not be close to a native-like state, obtain consensus of top scoring constraints amongst five initial models, and compile sets with no redundant residue pair constraints. We find that this method consistently produces a large and highly accurate set of distance constraints from which to build refinement models. We further optimize the balance between accuracy and coverage of constraints by producing multiple structure sets using different constraint distance cutoffs, and note that the cutoff governs spatially near versus distant effects in model generation. This complete procedure of deriving distance constraints for rTAD simulations improves the quality of initial predictions significantly in all cases evaluated by us. Our procedure represents a significant step in solving the protein structure prediction and refinement problem, by enabling the use of consensus constraints, RAPDF, and rTAD for protein structure modeling and refinement.
Affiliation(s)
- Tianyun Liu
  - Department of Genetics, Stanford University, Stanford, California, USA
17
Kloczkowski A, Jernigan RL, Wu Z, Song G, Yang L, Kolinski A, Pokarowski P. Distance matrix-based approach to protein structure prediction. J Struct Funct Genomics 2009; 10:67-81. [PMID: 19224393 DOI: 10.1007/s10969-009-9062-2] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.9] [Received: 09/23/2008] [Accepted: 02/01/2009] [Indexed: 10/21/2022]
Abstract
Much structural information is encoded in the internal distances; a distance matrix-based approach can be used to predict protein structure and dynamics, and for structural refinement. Our approach is based on the square distance matrix $D = [r_{ij}^2]$ containing all square distances between residues in proteins. This distance matrix contains more information than the contact matrix $C$, that has elements of either 0 or 1 depending on whether the distance $r_{ij}$ is greater or less than a cutoff value $r_{\mathrm{cutoff}}$. We have performed spectral decomposition of the distance matrices, $D = \sum_k \lambda_k v_k v_k^T$, in terms of eigenvalues $\lambda_k$ and the corresponding eigenvectors $v_k$, and found that it contains at most five nonzero terms. A dominant eigenvector is proportional to $r^2$, the square distance of points from the center of mass, with the next three being the principal components of the system of points. By predicting $r^2$ from the sequence we can approximate a distance matrix of a protein with an expected RMSD value of about 7.3 Å, and by combining it with the prediction of the first principal component we can improve this approximation to 4.0 Å. We can also explain the role of hydrophobic interactions for the protein structure, because $r$ is highly correlated with the hydrophobic profile of the sequence. Moreover, $r$ is highly correlated with several sequence profiles which are useful in protein structure prediction, such as contact number, the residue-wise contact order (RWCO) or mean square fluctuations (i.e. crystallographic temperature factors). We have also shown that the next three components are related to spatial directionality of the secondary structure elements, and they may be also predicted from the sequence, improving overall structure prediction.
We have also shown that the large number of available HIV-1 protease structures provides a remarkable sampling of conformations, which can be viewed as direct structural information about the dynamics. After structure matching, we apply principal component analysis (PCA) to obtain the important apparent motions for both bound and unbound structures. There are significant similarities between the first few key motions and the first few low-frequency normal modes calculated from a static representative structure with an elastic network model (ENM) that is based on the contact matrix C (related to D), strongly suggesting that the variations among the observed structures and the corresponding conformational changes are facilitated by the low-frequency, global motions intrinsic to the structure. Similarities are also found when the approach is applied to an NMR ensemble, as well as to atomic molecular dynamics (MD) trajectories. Thus, a sufficiently large number of experimental structures can directly provide important information about protein dynamics, but ENM can also provide a similar sampling of conformations. Finally, we use distance constraints from databases of known protein structures for structure refinement. We use the distributions of distances of various types in known protein structures to obtain the most probable ranges or the mean-force potentials for the distances. We then impose these constraints on structures to be refined or include the mean-force potentials directly in the energy minimization so that more plausible structural models can be built. This approach has been successfully used by us in 2006 in the CASPR structure refinement (http://predictioncenter.org/caspR).
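The claim that the squared-distance matrix D has at most five nonzero spectral terms can be checked on toy data. The sketch below is not the authors' code: it assumes random integer coordinates in place of residue positions, and it tests the bound through the exact rank of D (for a symmetric matrix, the rank equals the number of nonzero eigenvalues) instead of computing the eigen-decomposition itself. The function names are hypothetical.

```python
from fractions import Fraction
import random

def squared_distance_matrix(points):
    """Return D with D[i][j] = squared Euclidean distance between points i and j."""
    return [[sum((a - b) ** 2 for a, b in zip(p, q)) for q in points] for p in points]

def exact_rank(matrix):
    """Rank by Gaussian elimination over exact fractions (no rounding error)."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    rank = 0
    for col in range(len(rows[0])):
        # Find a row at or below the current rank with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            if factor != 0:
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank

# Assumed toy input: 12 points with integer coordinates in three dimensions.
rng = random.Random(0)
points = [tuple(rng.randint(-20, 20) for _ in range(3)) for _ in range(12)]
D = squared_distance_matrix(points)
rank_D = exact_rank(D)
print("rank of the 12 x 12 squared-distance matrix:", rank_D)
```

The bound follows from writing D as the sum of two rank-one terms built from the squared norms of the points and one term built from their coordinates, which has rank at most three in three dimensions.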
Affiliation(s)
- Andrzej Kloczkowski
- Laurence H. Baker Center for Bioinformatics and Biological Statistics, Iowa State University, 112 Office and Lab Bldg, Ames, IA 50011-3020, USA.
|
18
|
Schedlbauer A, Auer R, Ledolter K, Tollinger M, Kloiber K, Lichtenecker R, Ruedisser S, Hommel U, Schmid W, Konrat R, Kontaxis G. Direct methods and residue type specific isotope labeling in NMR structure determination and model-driven sequential assignment. J Biomol NMR 2008; 42:111-127. [PMID: 18762865 DOI: 10.1007/s10858-008-9268-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 05/01/2008] [Revised: 08/08/2008] [Accepted: 08/13/2008] [Indexed: 05/26/2023]
Abstract
Direct methods in NMR-based structure determination start from an unassigned ensemble of unconnected gaseous hydrogen atoms. Under favorable conditions they can produce low resolution structures of proteins. Usually a prohibitively large number of NOEs is required to solve a protein structure ab initio, but even with a much smaller set of distance restraints, low resolution models can be obtained which resemble a protein fold. One problem is that at such low resolution and in the absence of a force field it is impossible to distinguish the correct protein fold from its mirror image. In a hybrid approach these ambiguous models have the potential to aid in the process of sequential backbone chemical shift assignment when ¹³Cβ and ¹³C′ shifts are not available for sensitivity reasons. Regardless of the overall fold they enhance the information content of the NOE spectra. These, combined with residue specific labeling and minimal triple-resonance data using ¹³Cα connectivity, can provide almost complete sequential assignment. Strategies for residue type specific labeling with customized isotope labeling patterns are of great advantage in this context. Furthermore, this approach is to some extent error-tolerant with respect to data incompleteness, limited precision of the peak picking, and structural errors caused by misassignment of NOEs.
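The mirror-image ambiguity mentioned in the abstract follows from the fact that distance restraints are unchanged by reflection. A minimal sketch, assuming a four-atom toy model in place of a protein and hypothetical helper names: the pairwise distances of a model and of its mirror image are identical, while a chirality-sensitive quantity (the signed volume of a tetrahedron) changes sign.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two points given as coordinate tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def pairwise_distances(coords):
    """Full matrix of interatomic distances."""
    return [[euclidean(p, q) for q in coords] for p in coords]

def mirror(coords):
    """Reflect every point through the xy-plane (z becomes -z)."""
    return [(x, y, -z) for x, y, z in coords]

def signed_volume(a, b, c, d):
    """Six times the signed volume of tetrahedron abcd; the sign encodes handedness."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    w = [d[i] - a[i] for i in range(3)]
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))

# Assumed toy model: four atoms forming a right-handed tetrahedron.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
mirrored = mirror(model)

same_distances = all(
    math.isclose(d1, d2)
    for row1, row2 in zip(pairwise_distances(model), pairwise_distances(mirrored))
    for d1, d2 in zip(row1, row2)
)
print("distances identical:", same_distances)
print("handedness:", signed_volume(*model), "vs", signed_volume(*mirrored))
```

Any restraint set built only from distances therefore fits both models equally well, so extra information such as a force field is needed to pick the correct hand.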
Affiliation(s)
- Andreas Schedlbauer
- Institute of Biomolecular Structural Chemistry, Max F. Perutz Laboratories, University of Vienna, Campus Vienna Biocenter 5/1, Vienna, Austria
|
19
|
Latek D, Kolinski A. Contact prediction in protein modeling: scoring, folding and refinement of coarse-grained models. BMC Struct Biol 2008; 8:36. [PMID: 18694501 PMCID: PMC2527566 DOI: 10.1186/1472-6807-8-36] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Received: 02/16/2008] [Accepted: 08/11/2008] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Several different methods for contact prediction succeeded within the Sixth Critical Assessment of Techniques for Protein Structure Prediction (CASP6). The most relevant were non-local contact predictions for targets from the most difficult categories: fold recognition-analogy and new fold. Such contacts could provide valuable structural information in case a template structure cannot be found in the PDB. RESULTS: We described comprehensive tests of the effectiveness of contact data in various aspects of de novo modeling with CABS, an algorithm which was used successfully in CASP6 by the Kolinski-Bujnicki group. We used the predicted contacts in a simple scoring function for the post-simulation ranking of protein models and as a soft bias in the folding simulations and in the fold-refinement procedure. The latter approach turned out to be the most successful. The CABS force field used in the Replica Exchange Monte Carlo simulations cooperated with the true contacts and discriminated the false ones, which resulted in an improvement of the majority of Kolinski-Bujnicki's protein models. In the modeling we tested different sets of predicted contact data submitted to the CASP6 server. According to our results, the best performing were the contacts with the accuracy balanced with the coverage, obtained either from the best two predictors only or by a consensus from as many predictors as possible. CONCLUSION: Our tests have shown that theoretically predicted contacts can be very beneficial for protein structure prediction. Depending on the protein modeling method, a contact data set applied should be prepared with differently balanced coverage and accuracy of predicted contacts. Namely, high coverage of contact data is important for the model ranking and high accuracy for the folding simulations.
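The trade-off between accuracy and coverage of a contact set can be made concrete with a small sketch. It assumes the common definitions (accuracy as the fraction of predicted contacts that are native, coverage as the fraction of native contacts that are predicted) and a simple vote-count consensus; the abstract does not spell out either, and the contact lists and function names below are invented for illustration.

```python
from collections import Counter

def accuracy_and_coverage(predicted, native):
    """Accuracy = correct / predicted; coverage = correct / native (assumed definitions)."""
    predicted, native = set(predicted), set(native)
    correct = predicted & native
    accuracy = len(correct) / len(predicted) if predicted else 0.0
    coverage = len(correct) / len(native) if native else 0.0
    return accuracy, coverage

def consensus(predictions, min_votes):
    """Keep contacts proposed by at least min_votes predictors."""
    votes = Counter(contact for pred in predictions for contact in set(pred))
    return {contact for contact, n in votes.items() if n >= min_votes}

# Hypothetical residue-pair contacts; not data from the paper.
native = {(1, 10), (2, 20), (3, 30), (4, 40)}
predictor_a = {(1, 10), (2, 20), (5, 50)}
predictor_b = {(1, 10), (3, 30), (6, 60)}
predictor_c = {(1, 10), (2, 20), (3, 30), (7, 70)}
all_predictions = [predictor_a, predictor_b, predictor_c]

union_contacts = consensus(all_predictions, min_votes=1)
consensus_contacts = consensus(all_predictions, min_votes=2)

print("union:    ", accuracy_and_coverage(union_contacts, native))
print("consensus:", accuracy_and_coverage(consensus_contacts, native))
```

In this toy case the consensus set drops the false contacts that only one predictor proposed, so its accuracy rises while its coverage stays the same.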
Affiliation(s)
- Dorota Latek
- Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland.
|
20
|
|
21
|
Choi IG, Kim SH. Evolution of protein structural classes and protein sequence families. Proc Natl Acad Sci U S A 2006; 103:14056-61. [PMID: 16959887 PMCID: PMC1560931 DOI: 10.1073/pnas.0606239103] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.2] [Indexed: 11/18/2022]
Abstract
In protein structure space, protein structures cluster into four elongated regions when mapped based solely on similarity among the 3D structures. These four regions correspond to the four major classes of present-day proteins defined by the contents of secondary structure types and their topological arrangement. Evolution of and restriction to these four classes suggest that, in most cases, the evolution of genes may have been constrained or selected to those genetic changes that result in structurally stable proteins occupying one of the four "allowed" regions of the protein structure space, "structural selection," an important component of natural selection in gene evolution. Our studies on tracing the "common structural ancestor" for each protein sequence family of known structure suggest that: (i) recently emerged proteins belong mostly to three classes; (ii) the proteins that emerged earlier evolved to gain a new class; and (iii) the proteins that emerged earliest evolved to become the present-day proteins in the four major classes, with the fourth-class proteins becoming the most dominant population. Furthermore, our studies also show that not all present-day proteins evolved from one single set of proteins in the last common ancestral organism, but new common ancestral proteins were "born" at different evolutionary times, not traceable to one or two ancestral proteins: "the multiple birth model" for the evolution of protein sequence families.
Affiliation(s)
- In-Geol Choi
- *Physical Biosciences Division, Lawrence Berkeley National Laboratory, and
- Sung-Hou Kim
- *Physical Biosciences Division, Lawrence Berkeley National Laboratory, and
- Department of Chemistry, University of California, Berkeley, CA 94720
- To whom correspondence should be addressed. E-mail:
|
22
|
Huang YJ, Tejero R, Powers R, Montelione GT. A topology-constrained distance network algorithm for protein structure determination from NOESY data. Proteins 2006; 62:587-603. [PMID: 16374783 DOI: 10.1002/prot.20820] [Citation(s) in RCA: 113] [Impact Index Per Article: 6.3] [Indexed: 11/09/2022]
Abstract
This article formulates the multidimensional nuclear Overhauser effect spectroscopy (NOESY) interpretation problem using graph theory and presents a novel, bottom-up, topology-constrained distance network analysis algorithm for NOESY cross peak interpretation using assigned resonances. AutoStructure is a software suite that implements this topology-constrained distance network analysis algorithm and iteratively generates structures using the three-dimensional (3D) protein structure calculation programs XPLOR/CNS or DYANA. The minimum input for AutoStructure includes the amino acid sequence, a list of resonance assignments, and lists of 2D, 3D, and/or 4D-NOESY cross peaks. AutoStructure can also analyze homodimeric proteins when X-filtered NOESY experiments are available. The quality of input data and final 3D structures is evaluated using recall, precision, and F-measure (RPF) scores, a statistical measure of goodness of fit with the input data. AutoStructure has been tested on three protein NMR data sets for which high-quality structures have previously been solved by an expert, and yields comparable high-quality distance constraint lists and 3D protein structures in hours. We also compare several protein structures determined using AutoStructure with corresponding homologous proteins determined with other independent methods. The program has been used in more than two dozen protein structure determinations, several of which have already been published.
Affiliation(s)
- Yuanpeng Janet Huang
- Center for Advanced Biotechnology and Medicine and Department of Molecular Biology and Biochemistry, Rutgers University, Piscataway, New Jersey 08854-5638, USA
|
23
|
Abstract
The peptide bond quenches tryptophan fluorescence by excited-state electron transfer, which probably accounts for most of the variation in fluorescence intensity of peptides and proteins. A series of seven peptides was designed with a single tryptophan, identical amino acid composition, and peptide bond as the only known quenching group. The solution structure and side-chain χ₁ rotamer populations of the peptides were determined by one-dimensional and two-dimensional ¹H-NMR. All peptides have a single backbone conformation. The φ-, ψ-angles and χ₁ rotamer populations of tryptophan vary with position in the sequence. The peptides have fluorescence emission maxima of 350-355 nm, quantum yields of 0.04-0.24, and triple exponential fluorescence decays with lifetimes of 4.4-6.6, 1.4-3.2, and 0.2-1.0 ns at 5 °C. Lifetimes were correlated with ground-state conformers in six peptides by assigning the major lifetime component to the major NMR-determined χ₁ rotamer. In five peptides the χ₁ = -60° rotamer of tryptophan has lifetimes of 2.7-5.5 ns, depending on local backbone conformation. In one peptide the χ₁ = 180° rotamer has a 0.5-ns lifetime. This series of small peptides vividly demonstrates the dominant role of peptide bond quenching in tryptophan fluorescence.
Affiliation(s)
- Chia-Pin Pan
- Department of Chemistry, Case Western Reserve University, Cleveland, Ohio, USA
|
24
|
Choi IG, Kwon J, Kim SH. Local feature frequency profile: a method to measure structural similarity in proteins. Proc Natl Acad Sci U S A 2004; 101:3797-802. [PMID: 14985506 PMCID: PMC374324 DOI: 10.1073/pnas.0308656100] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.1] [Indexed: 11/18/2022]
Abstract
Measures of structural similarity between known protein structures provide an objective basis for classifying protein folds and for revealing a global view of the protein structure universe. Here, we describe a rapid method to measure structural similarity based on the profiles of representative local features of Cα distance matrices of compared protein structures. We first extract a finite number of representative local feature (LF) patterns from the distance matrices of all protein fold families by medoid analysis. Then, each Cα distance matrix of a protein structure is encoded by labeling all its submatrices by the index of the nearest representative LF patterns. Finally, the structure is represented by the frequency distribution of these indices, which we call the LF frequency (LFF) profile of the protein. The LFF profile allows one to calculate structural similarity scores among a large number of protein structures quickly, and also to construct and update the "map" of the protein structure universe easily. The LFF profile method efficiently maps complex protein structures into a common Euclidean space without prior assignment of secondary structure information or structural alignment.
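The encoding steps described in the abstract (label each submatrix by its nearest representative pattern, then take the frequency distribution of the labels) can be sketched as follows. This is an illustration under stated assumptions, not the published implementation: the two 2 x 2 patterns are hand-picked placeholders where the paper derives its patterns by medoid analysis, the submatrix size and the toy chains are invented, and all function names are hypothetical.

```python
import math
from collections import Counter

def euclidean(p, q):
    """Euclidean distance between two equal-length sequences of numbers."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def distance_matrix(coords):
    """All pairwise distances between C-alpha positions."""
    return [[euclidean(p, q) for q in coords] for p in coords]

def submatrices(matrix, size):
    """Yield every size-by-size block of the matrix as a flat list."""
    n = len(matrix)
    for i in range(n - size + 1):
        for j in range(n - size + 1):
            yield [matrix[i + a][j + b] for a in range(size) for b in range(size)]

def nearest_pattern(block, patterns):
    """Index of the representative pattern closest to this block."""
    return min(range(len(patterns)), key=lambda k: euclidean(block, patterns[k]))

def lff_profile(coords, patterns, size):
    """Frequency of each pattern index over all submatrices of the distance matrix."""
    blocks = submatrices(distance_matrix(coords), size)
    counts = Counter(nearest_pattern(block, patterns) for block in blocks)
    total = sum(counts.values())
    return [counts[k] / total for k in range(len(patterns))]

# Hypothetical inputs: a "short-range" and a "long-range" pattern, and two toy chains.
patterns = [[0.0, 3.8, 3.8, 0.0], [15.0, 15.0, 15.0, 15.0]]
extended = [(3.8 * i, 0.0, 0.0) for i in range(8)]
compact = [(3.8 * (i % 2), 3.8 * ((i // 2) % 2), 3.8 * (i // 4)) for i in range(8)]

profile_extended = lff_profile(extended, patterns, size=2)
profile_compact = lff_profile(compact, patterns, size=2)
similarity_distance = euclidean(profile_extended, profile_compact)
print(profile_extended, profile_compact, similarity_distance)
```

Because every structure becomes a fixed-length vector, two structures can be compared with an ordinary vector distance and no alignment step, which is the property the abstract highlights.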
Affiliation(s)
- In-Geol Choi
- Department of Chemistry, University of California, and Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
|
25
|
Hou J, Sims GE, Zhang C, Kim SH. A global representation of the protein fold space. Proc Natl Acad Sci U S A 2003; 100:2386-90. [PMID: 12606708 PMCID: PMC151350 DOI: 10.1073/pnas.2628030100] [Citation(s) in RCA: 112] [Impact Index Per Article: 5.3] [Indexed: 11/18/2022]
Abstract
One of the principal goals of the structural genomics initiative is to identify the total repertoire of protein folds and obtain a global view of the "protein structure universe." Here, we present a 3D map of the protein fold space in which structurally related folds are represented by spatially adjacent points. Such a representation reveals a high-level organization of the fold space that is intuitively interpretable. The shape of the fold space and the overall distribution of the folds are defined by three dominant trends: secondary structure class, chain topology, and protein domain size. Random coil-like structures of small proteins and peptides are mapped to a region where the three trends converge, offering an interesting perspective on both the demography of fold space and the evolution of protein structures.
Affiliation(s)
- Jingtong Hou
- Department of Chemistry and Lawrence Berkeley National Laboratory, University of California, Berkeley, CA 94720, USA
|
26
|
Mariappan SV, Catasti P, Silks LA, Bradbury EM, Gupta G. The high-resolution structure of the triplex formed by the GAA/TTC triplet repeat associated with Friedreich's ataxia. J Mol Biol 1999; 285:2035-52. [PMID: 9925783 DOI: 10.1006/jmbi.1998.2435] [Citation(s) in RCA: 71] [Impact Index Per Article: 2.8] [Indexed: 11/22/2022]
Abstract
Expansions of the triplet repeat, GAA/TTC, inside the first intron of the frataxin gene cause Friedreich's ataxia (FRDA). It was of interest to us to examine whether the FRDA repeat forms an unusual DNA structure, since formation of such a structure during replication may cause its expansion. Here, we show that the FRDA repeat forms a triplex in which the TTC strand folds on either side of the same GAA strand. We have determined the high-resolution NMR structures of two intramolecularly folded FRDA triplexes, (GAA)₂T₄(TTC)₂T₄(CTT)₂ and (GAA)₂T₄(TTC)₂T₂CT₂(CTT)₂, with T·A·T and C⁺·G·C triads. T₄ represents a synthetic loop sequence, whereas T₂CT₂ is the natural loop-folding sequence of the TTC strand. We have also made use of site-specific ¹⁵N-labeling of the cytosine residues to investigate their protonation status and their interaction with other protons. We show that the cytosine residues of the Hoogsteen C⁺·G pairs in this triplex are protonated close to physiological pH. Therefore, it appears that the triplex formation offers a plausible explanation for the expansion of the GAA/TTC repeats in FRDA.
Affiliation(s)
- S V Mariappan
- Life Sciences Division, LS-2 MS 880, Los Alamos National Laboratory, Los Alamos, NM, 87545, USA
|
27
|
Mazur J, Jernigan RL, Sarai A. Constructing optimal backbone segments for joining fixed DNA base pairs. Biophys J 1996; 71:1493-506. [PMID: 8874023 PMCID: PMC1233616 DOI: 10.1016/s0006-3495(96)79352-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 02/02/2023]
Abstract
A method is presented to link a sequence of space-fixed base pairs by the sugar-phosphate segments of single nucleotides and to evaluate the effects in the backbone caused by this positioning of the bases. The entire computational unit comprises several nucleotides that are energy-minimized, subject to constraints imposed by the sugar-phosphate backbone segments being anchored to space-fixed base pairs. The minimization schemes are based on two stages, a conjugate gradient method followed by a Newton-Raphson algorithm. Because our purpose is to examine the response, or relaxation, of an artificially stressed backbone, it is essential to be able to obtain, as closely as possible, a lowest minimum energy conformation of the backbone segment in conformational space. For this purpose, an algorithm is developed that leads to the generation of an assembly of many local energy minima. From these sets of local minima, one conformation corresponding to the one with the lowest minimum is then selected and designated to represent the backbone segment at its minimum. The effective electrostatic potential of mean force is expressed in terms of adjustable parameters that incorporate solvent screening action in the Coulombic interactions between charged backbone atoms; these parameters are adjusted to obtain the best fit of the nearest-neighbor phosphorus atoms in an x-ray structure.
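The overall strategy in the abstract (a two-stage minimization run from many starting points, followed by selection of the lowest local minimum) can be illustrated on a toy problem. The sketch below is an assumption-laden stand-in: the energy is an invented one-dimensional polynomial and not a backbone force field, and plain gradient descent is swapped in for the conjugate gradient stage, since the two coincide in direction for a single variable. Only the Newton-Raphson refinement and the multistart selection mirror the described scheme.

```python
def energy(x):
    """Toy one-dimensional energy with two local minima (illustrative only)."""
    return x**4 - 3 * x**2 + x

def gradient(x):
    """First derivative of the toy energy."""
    return 4 * x**3 - 6 * x + 1

def curvature(x):
    """Second derivative of the toy energy."""
    return 12 * x**2 - 6

def minimize(x, rate=0.01, descent_steps=200, newton_steps=20):
    """Two-stage local minimization from the starting point x."""
    # Stage 1: gradient descent (stand-in for conjugate gradient) enters a basin.
    for _ in range(descent_steps):
        x -= rate * gradient(x)
    # Stage 2: Newton-Raphson on the gradient polishes the stationary point.
    for _ in range(newton_steps):
        if curvature(x) <= 0:
            break  # not in a convex region; stop rather than step toward a maximum
        x -= gradient(x) / curvature(x)
    return x

# Assumed starting points spread over the region of interest.
starts = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
local_minima = [minimize(x0) for x0 in starts]

# Select the conformation (here, the value of x) with the lowest energy.
best_x = min(local_minima, key=energy)
print("local minima found:", sorted({round(x, 4) for x in local_minima}))
print("lowest minimum at x =", best_x, "with energy", energy(best_x))
```

As with any multistart scheme, the result is the lowest minimum among those sampled, which is why the abstract speaks of approaching the lowest minimum "as closely as possible".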
Affiliation(s)
- J Mazur
- Frederick Biomedical Super Computing Laboratory, SAIC, NCI-FCRDC, Maryland 21701, USA
|
28
|
Bruccoleri RE. Application of Systematic Conformational Search to Protein Modeling. Mol Simul 1993. [DOI: 10.1080/08927029308022163] [Citation(s) in RCA: 28] [Impact Index Per Article: 0.9] [Indexed: 10/23/2022]
|
29
|
Purisima EO, Scheraga HA. An approach to the multiple-minima problem in protein folding by relaxing dimensionality. Tests on enkephalin. J Mol Biol 1987; 196:697-709. [PMID: 3681972 DOI: 10.1016/0022-2836(87)90041-6] [Citation(s) in RCA: 74] [Impact Index Per Article: 2.0] [Indexed: 01/06/2023]
Abstract
An algorithm for locating the region in conformational space containing the global energy minimum of a polypeptide is described. Distances are used as the primary variables in the minimization of an objective function that incorporates both energetic and distance-geometric terms. The latter are obtained from geometry and energy functions, rather than nuclear magnetic resonance experiments, although the algorithm can incorporate distances from nuclear magnetic resonance data if desired. The polypeptide is generated originally in a space of high dimensionality. This has two important consequences. First, all interatomic distances are initially at their energetically most favorable values; i.e. the polypeptide is initially at a global minimum-energy conformation, albeit a high-dimensional one. Second, the relaxation of dimensionality constraints in the early stages of the minimization removes many potential energy barriers that exist in three dimensions, thereby allowing a means of escaping from three-dimensional local minima. These features are used in an algorithm that produces short trajectories of three-dimensional minimum-energy conformations. A conformation in the trajectory is generated by allowing the previous conformation in the trajectory to evolve in a high-dimensional space before returning to three dimensions. The resulting three-dimensional structure is taken to be the next conformation in the trajectory, and the process is iterated. This sequence of conformations results in a limited but efficient sampling of conformational space. Results for test calculations on Met-enkephalin, a pentapeptide with the amino acid sequence H-Tyr-Gly-Gly-Phe-Met-OH, are presented. A tight cluster of conformations (in three-dimensional space) is found with ECEPP energies (Empirical Conformational Energy Program for Peptides) lower than any previously reported. 
This cluster of conformations defines a region in conformational space in which the global-minimum-energy conformation of enkephalin appears to lie.
Affiliation(s)
- E O Purisima
- Baker Laboratory of Chemistry, Cornell University, Ithaca, NY 14853-1301
|
30
|
Havel TF, Crippen GM, Kuntz ID, Blaney JM. The combinatorial distance geometry method for the calculation of molecular conformation. II. Sample problems and computational statistics. J Theor Biol 1983; 104:383-400. [PMID: 6197591 DOI: 10.1016/0022-5193(83)90113-3] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.5] [Indexed: 01/18/2023]
Abstract
The performance of a branch and bound algorithm for molecular energy minimization is evaluated on a variety of test problems. Although the algorithm is not at present efficient enough for use in most practical situations, we show that it has distinct advantages over more conventional methods of global minimization. In addition, this study illustrates the technique on which the present algorithm is based, and the problems which must be overcome in developing an efficient algorithm based on similar principles.
|