1
Noriega HA, Wang Q, Yu D, Wang XS. Structural studies of Parvoviridae capsid assembly and evolution: implications for novel AAV vector design. Front Artif Intell 2025; 8:1559461. [PMID: 40242328 PMCID: PMC12000042 DOI: 10.3389/frai.2025.1559461] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2025] [Accepted: 03/20/2025] [Indexed: 04/18/2025] Open
Abstract
Adeno-associated virus (AAV) vectors have emerged as powerful tools in gene therapy, with the potential to treat various genetic disorders. Engineering the AAV capsids through computational methods enables the customization of these vectors to enhance their effectiveness and safety. This engineering allows for the development of gene therapies that are not only more efficient but also personalized to unique genetic profiles. When developing such vectors, it is essential to understand the structural biology and the wide range of techniques used to guide vector design. This review covers the fundamental biology of the Parvoviridae capsids, focusing on modern structural study techniques, including (a) cryo-electron microscopy and X-ray crystallography studies and (b) comparative analysis of capsid structures across different Parvoviridae species. Along with the structure and evolution of the Parvoviridae capsids, computational methods have provided significant insights into novel AAV vector design techniques, which include (a) structure-guided design of AAV capsids with improved properties, (b) directed evolution of AAV capsids for specific applications, and (c) computational prediction of AAV capsid-receptor interactions. Further discussion addresses the ongoing challenges in AAV vector design and proposes future directions for exploring enhanced computational tools, such as artificial intelligence/machine learning and deep learning.
Affiliation(s)
- Heather A. Noriega
- Department of Pharmaceutical Sciences, Artificial Intelligence and Drug Discovery Core Laboratory for District of Columbia Center for AIDS Research (DC CFAR), College of Pharmacy, Howard University, Washington, DC, United States
- Qizhao Wang
- AAVnerGene Inc., Rockville, MD, United States
- Daozhan Yu
- AAVnerGene Inc., Rockville, MD, United States
- Xiang Simon Wang
- Department of Pharmaceutical Sciences, Artificial Intelligence and Drug Discovery Core Laboratory for District of Columbia Center for AIDS Research (DC CFAR), College of Pharmacy, Howard University, Washington, DC, United States
2
Mansurov DA, Khaitbaev AK, Khaitbaev KK, Toshov KS, Benassi E. Relationship between structural properties and biological activity of (-)-menthol and some menthyl esters. Comput Biol Chem 2025; 115:108357. [PMID: 39869952 DOI: 10.1016/j.compbiolchem.2025.108357] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2024] [Revised: 01/12/2025] [Accepted: 01/15/2025] [Indexed: 01/29/2025]
Abstract
Menthol is a naturally occurring cyclic terpene alcohol and is the major component of peppermint and corn mint essential oils extracted from Mentha piperita L. and Mentha arvensis L. Menthol and its derivatives are widely used in the pharmaceutical, cosmetic and food industries. Among its eight isomers, (-)-menthol is the most effective one in terms of refreshing effect. While the invigorating property of (-)-menthol is generally known, this claim is based on a substantial amount of literature and experience. (-)-Menthol has consistently been reported to possess better cooling and refreshing qualities in comparison to its isomers, making it the preferred choice in a broad range of applications such as personal care products, pharmaceuticals and food additives. Additionally, the molecular structure of (-)-menthol allows it to fit more tightly to the thermoreceptors in the skin and mucous membranes, and thus to provide a more intense cooling feeling. Thus, although the other isomers have similar properties to a degree, (-)-menthol surpasses them all in refreshing capacity. This study focuses on menthol and some of its esters, viz. menthyl acetate, propionate, butyrate, valerate and hexanoate, with the purpose of establishing a connection between structural, electrostatic and electronic characteristics and biological effects. The most favoured interactions of the esters with biotargets were investigated at a molecular level, offering a plausible foundation for elucidating their bioactivity. This study is conducted at a quantum mechanical and molecular docking level. The results may be of possible usefulness in areas of applications, such as pharmacological research and drug.
Affiliation(s)
- Khamid Kh Khaitbaev
- Institute of Bioorganic Chemistry named after O. Sodikov, Academy of Sciences of the Republic of Uzbekistan, Tashkent 100057, Uzbekistan
- Khamza S Toshov
- National University of Uzbekistan, Tashkent 100057, Uzbekistan
- Enrico Benassi
- Department of Natural Sciences, Novosibirsk State University, Novosibirsk 630090, Russia.
3
De Salis SKF, Chen JZ, Skarratt KK, Fuller SJ, Balle T. Deep learning structural insights into heterotrimeric alternatively spliced P2X7 receptors. Purinergic Signal 2024; 20:431-447. [PMID: 38032425 PMCID: PMC11928719 DOI: 10.1007/s11302-023-09978-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Accepted: 10/31/2023] [Indexed: 12/01/2023] Open
Abstract
P2X7 receptors (P2X7Rs) are membrane-bound ATP-gated ion channels that are composed of three subunits. Different subunit structures may be expressed due to alternative splicing of the P2RX7 gene, altering the receptor's function when combined with the wild-type P2X7A subunits. In this study, the application of the deep-learning method AlphaFold2-Multimer (AF2M) for the generation of trimeric P2X7Rs was validated by comparing an AF2M-generated rat wild-type P2X7A receptor with a structure determined by cryogenic electron microscopy (cryo-EM) (Protein Data Bank Identification: 6U9V). The results suggested that AF2M could, firstly, accurately predict the structures of P2X7Rs and, secondly, accurately identify the highest quality model through its ranking system. Subsequently, AF2M was used to generate models of heterotrimeric alternatively spliced P2X7Rs consisting of one or two wild-type P2X7A subunits in combination with one or two P2X7B, P2X7E, P2X7J, and P2X7L splice variant subunits. The top-ranking models were deemed valid based on AF2M's confidence measures, stability in molecular dynamics simulations, and consistent flexibility of the conserved regions between the models. The structures of the heterotrimeric receptors, which were missing key residues in the ATP binding sites and carboxyl terminal domains (CTDs) compared to the wild-type receptor, help to explain their observed functions. Overall, the models produced in this study (available as supplementary material) unlock the possibility of structure-based studies into the heterotrimeric P2X7Rs.
Affiliation(s)
- Sophie K F De Salis
- Brain and Mind Centre, The University of Sydney, Camperdown, NSW, 2050, Australia
- Sydney Pharmacy School, The University of Sydney, Camperdown, NSW, 2050, Australia
- Jake Zheng Chen
- Brain and Mind Centre, The University of Sydney, Camperdown, NSW, 2050, Australia
- Sydney Pharmacy School, The University of Sydney, Camperdown, NSW, 2050, Australia
- Kristen K Skarratt
- The University of Sydney, Nepean Clinical School, Kingswood, NSW, 2747, Australia
- Stephen J Fuller
- The University of Sydney, Nepean Clinical School, Kingswood, NSW, 2747, Australia
- Thomas Balle
- Brain and Mind Centre, The University of Sydney, Camperdown, NSW, 2050, Australia.
- Sydney Pharmacy School, The University of Sydney, Camperdown, NSW, 2050, Australia.
4
Agarwal V, McShan AC. The power and pitfalls of AlphaFold2 for structure prediction beyond rigid globular proteins. Nat Chem Biol 2024; 20:950-959. [PMID: 38907110 PMCID: PMC11956457 DOI: 10.1038/s41589-024-01638-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2023] [Accepted: 04/29/2024] [Indexed: 06/23/2024]
Abstract
Artificial intelligence-driven advances in protein structure prediction in recent years have raised the question: has the protein structure-prediction problem been solved? Here, with a focus on nonglobular proteins, we highlight the many strengths and potential weaknesses of DeepMind's AlphaFold2 in the context of its biological and therapeutic applications. We summarize the subtleties associated with evaluation of AlphaFold2 model quality and reliability using the predicted local distance difference test (pLDDT) and predicted aligned error (PAE) values. We highlight various classes of proteins that AlphaFold2 can be applied to and the caveats involved. Concrete examples of how AlphaFold2 models can be integrated with experimental data in the form of small-angle X-ray scattering (SAXS), solution NMR, cryo-electron microscopy (cryo-EM) and X-ray diffraction are discussed. Finally, we highlight the need to move beyond structure prediction of rigid, static structural snapshots toward conformational ensembles and alternate biologically relevant states. The overarching theme is that careful consideration is due when using AlphaFold2-generated models to generate testable hypotheses and structural models, rather than treating predicted models as de facto ground truth structures.
Affiliation(s)
- Vinayak Agarwal
- School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, GA, USA.
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA.
- Andrew C McShan
- School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, GA, USA.
5
Li Z, Fan H, Ding W. Solving protein structures by combining structure prediction, molecular replacement and direct-methods-aided model completion. IUCRJ 2024; 11:152-167. [PMID: 38214490 PMCID: PMC10916285 DOI: 10.1107/s2052252523010291] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Accepted: 11/29/2023] [Indexed: 01/13/2024]
Abstract
Highly accurate protein structure prediction can generate accurate models of protein and protein-protein complexes in X-ray crystallography. However, the question of how to make more effective use of predicted models for completing structure analysis, and which strategies should be employed for the more challenging cases such as multi-helical structures, multimeric structures and extremely large structures, both in the model preparation and in the completion steps, remains open for discussion. In this paper, a new strategy is proposed based on the framework of direct methods and dual-space iteration, which can greatly simplify the pre-processing steps of predicted models both in normal and in challenging cases. Following this strategy, full-length models or the conservative structural domains could be used directly as the starting model, and the phase error and the model bias between the starting model and the real structure would be modified in the direct-methods-based dual-space iteration. Many challenging cases (from CASP14) have been tested for the general applicability of this constructive strategy, and almost complete models have been generated with reasonable statistics. The hybrid strategy therefore provides a meaningful scheme for X-ray structure determination using a predicted model as the starting point.
Affiliation(s)
- Zengru Li
- Beijing National Laboratory for Condensed Matter Physics, Institute of Physics, Chinese Academy of Sciences, Beijing 100190, People’s Republic of China
- School of Physical Sciences, University of Chinese Academy of Sciences, Beijing 100049, People’s Republic of China
- Haifu Fan
- Beijing National Laboratory for Condensed Matter Physics, Institute of Physics, Chinese Academy of Sciences, Beijing 100190, People’s Republic of China
- Wei Ding
- Beijing National Laboratory for Condensed Matter Physics, Institute of Physics, Chinese Academy of Sciences, Beijing 100190, People’s Republic of China
6
Sonani RR, Palmer LK, Esteves NC, Horton AA, Sebastian AL, Kelly RJ, Wang F, Kreutzberger MAB, Russell WK, Leiman PG, Scharf BE, Egelman EH. An extensive disulfide bond network prevents tail contraction in Agrobacterium tumefaciens phage Milano. Nat Commun 2024; 15:756. [PMID: 38272938 PMCID: PMC10811340 DOI: 10.1038/s41467-024-44959-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2023] [Accepted: 01/10/2024] [Indexed: 01/27/2024] Open
Abstract
A contractile sheath and rigid tube assembly is a widespread apparatus used by bacteriophages, tailocins, and the bacterial type VI secretion system to penetrate cell membranes. In this mechanism, contraction of an external sheath powers the motion of an inner tube through the membrane. The structure, energetics, and mechanism of the machinery imply rigidity and straightness. The contractile tail of Agrobacterium tumefaciens bacteriophage Milano is flexible and bent to varying degrees, which sets it apart from other contractile tail-like systems. Here, we report structures of the Milano tail including the sheath-tube complex, baseplate, and putative receptor-binding proteins. The flexible-to-rigid transformation of the Milano tail upon contraction can be explained by unique electrostatic properties of the tail tube and sheath. All components of the Milano tail, including sheath subunits, are crosslinked by disulfides, some of which must be reduced for contraction to occur. The putative receptor-binding complex of Milano contains a tailspike, a tail fiber, and at least two small proteins that form a garland around the distal ends of the tailspikes and tail fibers. Despite being flagellotropic, Milano lacks thread-like tail filaments that can wrap around the flagellum, and is thus likely to employ a different binding mechanism.
Affiliation(s)
- Ravi R Sonani
- Department of Biochemistry and Molecular Genetics, University of Virginia School of Medicine, Charlottesville, VA, 22903, USA
- Lee K Palmer
- Mass Spectrometry Facility, University of Texas Medical Branch, Galveston, TX, 77555, USA
- Nathaniel C Esteves
- Department of Biological Sciences, Virginia Tech, Blacksburg, VA, 24061, USA
- Abigail A Horton
- Department of Biological Sciences, Virginia Tech, Blacksburg, VA, 24061, USA
- Amanda L Sebastian
- Department of Biological Sciences, Virginia Tech, Blacksburg, VA, 24061, USA
- Rebecca J Kelly
- Department of Biological Sciences, Virginia Tech, Blacksburg, VA, 24061, USA
- Fengbin Wang
- Department of Biochemistry and Molecular Genetics, University of Virginia School of Medicine, Charlottesville, VA, 22903, USA
- Department of Biochemistry and Molecular Genetics, University of Alabama at Birmingham, Birmingham, AL, 35233, USA
- Mark A B Kreutzberger
- Department of Biochemistry and Molecular Genetics, University of Virginia School of Medicine, Charlottesville, VA, 22903, USA
- William K Russell
- Mass Spectrometry Facility, University of Texas Medical Branch, Galveston, TX, 77555, USA
- Petr G Leiman
- Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, TX, 77555, USA.
- Birgit E Scharf
- Department of Biological Sciences, Virginia Tech, Blacksburg, VA, 24061, USA.
- Edward H Egelman
- Department of Biochemistry and Molecular Genetics, University of Virginia School of Medicine, Charlottesville, VA, 22903, USA.
7
Zhang Z, Cai Y, Zhang B, Zheng W, Freddolino L, Zhang G, Zhou X. DEMO-EM2: assembling protein complex structures from cryo-EM maps through intertwined chain and domain fitting. Brief Bioinform 2024; 25:bbae113. [PMID: 38517699 PMCID: PMC10959074 DOI: 10.1093/bib/bbae113] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/13/2023] [Revised: 02/10/2024] [Accepted: 02/25/2024] [Indexed: 03/24/2024] Open
Abstract
The breakthrough in cryo-electron microscopy (cryo-EM) technology has led to an increasing number of density maps of biological macromolecules. However, constructing accurate protein complex atomic structures from cryo-EM maps remains a challenge. In this study, we extend our previously developed DEMO-EM to present DEMO-EM2, an automated method for constructing protein complex models from cryo-EM maps through an iterative assembly procedure intertwining chain- and domain-level matching and fitting for predicted chain models. The method was carefully evaluated on 27 cryo-electron tomography (cryo-ET) maps and 16 single-particle EM maps, where DEMO-EM2 models achieved an average TM-score of 0.92, outperforming those of state-of-the-art methods. The results demonstrate an efficient method that enables the rapid and reliable solution of challenging cryo-EM structure modeling problems.
Affiliation(s)
- Ziying Zhang
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Yaxian Cai
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Biao Zhang
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Wei Zheng
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Lydia Freddolino
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Guijun Zhang
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Xiaogen Zhou
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
8
Terashi G, Wang X, Prasad D, Nakamura T, Kihara D. DeepMainmast: integrated protocol of protein structure modeling for cryo-EM with deep learning and structure prediction. Nat Methods 2024; 21:122-131. [PMID: 38066344 DOI: 10.1038/s41592-023-02099-0] [Citation(s) in RCA: 17] [Impact Index Per Article: 17.0] [Received: 03/15/2023] [Accepted: 10/22/2023] [Indexed: 12/19/2023]
Abstract
Three-dimensional structure modeling from maps is an indispensable step for studying proteins and their complexes with cryogenic electron microscopy. Although the resolution of determined cryogenic electron microscopy maps has generally improved, there are still many cases where tracing protein main chains is difficult, even in maps determined at a near-atomic resolution. Here we developed a protein structure modeling method, DeepMainmast, which employs deep learning to capture the local map features of amino acids and atoms to assist main-chain tracing. Moreover, we integrated AlphaFold2 with the de novo density tracing protocol to combine their complementary strengths and achieved even higher accuracy than each method alone. Additionally, the protocol is able to accurately assign the chain identity to the structure models of homo-multimers, which is not a trivial task for existing methods.
Affiliation(s)
- Genki Terashi
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Xiao Wang
- Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Devashish Prasad
- Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Tsukasa Nakamura
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA.
- Department of Computer Science, Purdue University, West Lafayette, IN, USA.
9
Terashi G, Wang X, Prasad D, Nakamura T, Zhu H, Kihara D. Integrated Protocol of Protein Structure Modeling for Cryo-EM with Deep Learning and Structure Prediction. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.10.19.563151. [PMID: 37904978 PMCID: PMC10614963 DOI: 10.1101/2023.10.19.563151] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/02/2023]
Abstract
Structure modeling from maps is an indispensable step for studying proteins and their complexes with cryogenic electron microscopy (cryo-EM). Although the resolution of determined cryo-EM maps has generally improved, there are still many cases where tracing protein main-chains is difficult, even in maps determined at a near-atomic resolution. Here, we have developed a protein structure modeling method, called DeepMainmast, which employs deep learning to capture the local map features of amino acids and atoms to assist main-chain tracing. Moreover, since AlphaFold2 demonstrates high accuracy in protein structure prediction, we have integrated the complementary strengths of de novo density tracing using deep learning with AlphaFold2's structure modeling to achieve even higher accuracy than each method alone. Additionally, the protocol is able to accurately assign chain identity to the structure models of homo-multimers.
Affiliation(s)
- Genki Terashi
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907, USA
- Xiao Wang
- Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
- Devashish Prasad
- Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
- Tsukasa Nakamura
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907, USA
- Han Zhu
- Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
- Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907, USA
- Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
10
Kulczyk AW. Artificial intelligence and the analysis of cryo-EM data provide structural insight into the molecular mechanisms underlying LN-lamininopathies. Sci Rep 2023; 13:17825. [PMID: 37857770 PMCID: PMC10587063 DOI: 10.1038/s41598-023-45200-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Accepted: 10/17/2023] [Indexed: 10/21/2023] Open
Abstract
Laminins (Lm) are major components of basement membranes (BM), which polymerize to form a planar lattice on the cell surface. Genetic alterations of Lm affect their oligomerization patterns and lead to failures in BM assembly, manifesting in a group of human disorders collectively defined as Lm N-terminal domain lamininopathies (LN-lamininopathies). We have employed a recently determined cryo-EM structure of the Lm polymer node, the basic repeating unit of the Lm lattice, along with structure prediction and modeling to systematically analyze structures of twenty-three pathogenic Lm polymer nodes implicated in human disease. Our analysis provides a detailed mechanistic explanation of how Lm mutations lead to the failures in Lm polymerization underlying LN-lamininopathies. We propose a new categorization scheme of LN-lamininopathies based on the insight gained from the structural analysis. Our results can help to facilitate rational drug design aimed at the treatment of Lm deficiencies.
Affiliation(s)
- Arkadiusz W Kulczyk
- Institute for Quantitative Biomedicine, Rutgers University, 174 Frelinghuysen Road, Piscataway, NJ, 08854, USA.
- Department of Biochemistry & Microbiology, Rutgers University, 75 Lipman Drive, New Brunswick, NJ, 08901, USA.
11
Huang B, Kong L, Wang C, Ju F, Zhang Q, Zhu J, Gong T, Zhang H, Yu C, Zheng WM, Bu D. Protein Structure Prediction: Challenges, Advances, and the Shift of Research Paradigms. GENOMICS, PROTEOMICS & BIOINFORMATICS 2023; 21:913-925. [PMID: 37001856 PMCID: PMC10928435 DOI: 10.1016/j.gpb.2022.11.014] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/15/2022] [Revised: 11/23/2022] [Accepted: 11/30/2022] [Indexed: 03/31/2023]
Abstract
Protein structure prediction is an interdisciplinary research topic that has attracted researchers from multiple fields, including biochemistry, medicine, physics, mathematics, and computer science. These researchers adopt various research paradigms to attack the same structure prediction problem: biochemists and physicists attempt to reveal the principles governing protein folding; mathematicians, especially statisticians, usually start from assuming a probability distribution of protein structures given a target sequence and then find the most likely structure, while computer scientists formulate protein structure prediction as an optimization problem - finding the structural conformation with the lowest energy or minimizing the difference between predicted structure and native structure. These research paradigms fall into the two statistical modeling cultures proposed by Leo Breiman, namely, data modeling and algorithmic modeling. Recently, we have also witnessed the great success of deep learning in protein structure prediction. In this review, we present a survey of the efforts for protein structure prediction. We compare the research paradigms adopted by researchers from different fields, with an emphasis on the shift of research paradigms in the era of deep learning. In short, the algorithmic modeling techniques, especially deep neural networks, have considerably improved the accuracy of protein structure prediction; however, theories interpreting the neural networks and knowledge on protein folding are still highly desired.
Affiliation(s)
- Bin Huang
- Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Lupeng Kong
- Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; Changping Laboratory, Beijing 102206, China
- Chao Wang
- Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- Fusong Ju
- Microsoft Research AI4Science, Beijing 100080, China
- Qi Zhang
- Huawei Noah's Ark Lab, Wuhan 430206, China
- Jianwei Zhu
- Microsoft Research AI4Science, Beijing 100080, China
- Tiansu Gong
- Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Haicang Zhang
- Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China; Zhongke Big Data Academy, Zhengzhou 450046, China.
- Chungong Yu
- Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China; Zhongke Big Data Academy, Zhengzhou 450046, China.
- Wei-Mou Zheng
- Institute of Theoretical Physics, Chinese Academy of Sciences, Beijing 100190, China.
- Dongbo Bu
- Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China; Zhongke Big Data Academy, Zhengzhou 450046, China.
12
Wodak SJ, Velankar S. Structural biology: The transformational era. Proteomics 2023; 23:e2200084. [PMID: 37667815 DOI: 10.1002/pmic.202200084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Accepted: 07/26/2023] [Indexed: 09/06/2023]
Affiliation(s)
- Sameer Velankar
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
13
Stiefel J, Zimmer J, Schloßhauer JL, Vosen A, Kilz S, Balakin S. Just Keep Rolling?-An Encompassing Review towards Accelerated Vaccine Product Life Cycles. Vaccines (Basel) 2023; 11:1287. [PMID: 37631855 PMCID: PMC10459022 DOI: 10.3390/vaccines11081287] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2023] [Revised: 07/20/2023] [Accepted: 07/24/2023] [Indexed: 08/27/2023] Open
Abstract
In light of the recent pandemic, several COVID-19 vaccines were developed, tested and approved in a very short time, a process that otherwise takes many years. Above all, these efforts have also unmistakably revealed the capacity limits and potential for improvement in vaccine production. This review aims to emphasize recent approaches for the targeted rapid adaptation and production of vaccines from an interdisciplinary, multifaceted perspective. Using research from the literature, stakeholder analysis and a value proposition canvas, we reviewed technological innovations on the pharmacological level, formulation, validation and resilient vaccine production to supply bottlenecks and logistic networks. We identified four main drivers to accelerate the vaccine product life cycle: computerized candidate screening, modular production, digitized quality management and a resilient business model with corresponding transparent supply chains. In summary, the results presented here can serve as a guide and implementation tool for flexible, scalable vaccine production to swiftly respond to pandemic situations in the future.
Affiliation(s)
- Janis Stiefel
- Fraunhofer Institute for Microengineering and Microsystems IMM, Carl-Zeiss-Straße 18-20, 55129 Mainz, Germany
- Jan Zimmer
- Fraunhofer Institute for Microengineering and Microsystems IMM, Carl-Zeiss-Straße 18-20, 55129 Mainz, Germany
- Jeffrey L. Schloßhauer
- Fraunhofer Institute for Cell Therapy and Immunology, Branch Bioanalytics and Bioprocesses IZI-BB, Am Mühlenberg 13, 14476 Potsdam, Germany
- Agnes Vosen
- Fraunhofer Center for International Management and Knowledge Economy IMW, Neumarkt 20, 04109 Leipzig, Germany
- Sarah Kilz
- Fraunhofer Center for International Management and Knowledge Economy IMW, Neumarkt 20, 04109 Leipzig, Germany
- Sascha Balakin
- Fraunhofer Institute for Ceramic Technologies and Systems IKTS Material Diagnostics, Bio- and Nanotechnology, Maria-Reiche-Straße 2, 01109 Dresden, Germany
- Max Bergmann Center of Biomaterials (MBC), Technical University of Dresden, Budapester Strasse 27, 01069 Dresden, Germany
14
|
Wodak SJ, Vajda S, Lensink MF, Kozakov D, Bates PA. Critical Assessment of Methods for Predicting the 3D Structure of Proteins and Protein Complexes. Annu Rev Biophys 2023; 52:183-206. [PMID: 36626764 PMCID: PMC10885158 DOI: 10.1146/annurev-biophys-102622-084607] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Indexed: 01/11/2023]
Abstract
Advances in a scientific discipline are often measured by small, incremental steps. In this review, we report on two intertwined disciplines in the protein structure prediction field, modeling of single chains and modeling of complexes, that have over decades emulated this pattern, as monitored by the community-wide blind prediction experiments CASP and CAPRI. However, over the past few years, dramatic advances were observed for the accurate prediction of single protein chains, driven by a surge of deep learning methodologies entering the prediction field. We review the main scientific developments that enabled these recent breakthroughs and feature the important role of blind prediction experiments in building up and nurturing the structure prediction field. We discuss how the new wave of artificial intelligence-based methods is impacting the fields of computational and experimental structural biology and highlight areas in which deep learning methods are likely to lead to future developments, provided that major challenges are overcome.
Affiliation(s)
- Shoshana J Wodak
  - VIB-VUB Center for Structural Biology, Vrije Universiteit Brussel, Brussels, Belgium
- Sandor Vajda
  - Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
  - Department of Chemistry, Boston University, Boston, Massachusetts, USA
- Marc F Lensink
  - Univ. Lille, CNRS, UMR 8576-UGSF-Unité de Glycobiologie Structurale et Fonctionnelle, Lille, France
- Dima Kozakov
  - Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, USA
  - Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA
- Paul A Bates
  - Biomolecular Modelling Laboratory, The Francis Crick Institute, London, United Kingdom

15
Dong WB, Jiang YL, Zhu ZL, Zhu J, Li Y, Xia R, Zhou K. Structural and enzymatic characterization of the sialidase SiaPG from Porphyromonas gingivalis. Acta Crystallogr F Struct Biol Commun 2023; 79:87-94. [PMID: 36995120 PMCID: PMC10071834 DOI: 10.1107/s2053230x23001735] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2022] [Accepted: 02/24/2023] [Indexed: 03/31/2023]
Abstract
The sialidases, which catalyze the hydrolysis of sialic acid from extracellular glycoconjugates, are a group of major virulence factors in various pathogenic bacteria. In Porphyromonas gingivalis, which causes human periodontal disease, sialidase contributes to bacterial pathogenesis via promoting the formation of biofilms and capsules, reducing the ability for macrophage clearance, and providing nutrients for bacterial colonization. Here, the crystal structure of the P. gingivalis sialidase SiaPG is reported at 2.1 Å resolution, revealing an N-terminal carbohydrate-binding domain followed by a canonical C-terminal catalytic domain. Simulation of the product sialic acid in the active-site pocket together with functional analysis enables clear identification of the key residues that are required for substrate binding and catalysis. Moreover, structural comparison with other sialidases reveals distinct features of the active-site pocket which might confer substrate specificity. These findings provide the structural basis for the further design and optimization of effective inhibitors to target SiaPG to fight against P. gingivalis-derived oral diseases.
Affiliation(s)
- Wen-Bo Dong
  - Department of Stomatology, The Second Affiliated Hospital of Anhui Medical University, Hefei, Anhui 230601, People’s Republic of China
- Yong-Liang Jiang
  - School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230026, People’s Republic of China
- Zhong-Liang Zhu
  - School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230026, People’s Republic of China
- Jie Zhu
  - School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230026, People’s Republic of China
- Yang Li
  - School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230026, People’s Republic of China
- Rong Xia
  - Department of Stomatology, The Second Affiliated Hospital of Anhui Medical University, Hefei, Anhui 230601, People’s Republic of China
- Kang Zhou
  - School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230026, People’s Republic of China

16
Miller JM, Knyazhanskaya ES, Buth SA, Prokhorov NS, Leiman PG. Function of the bacteriophage P2 baseplate central spike Apex domain in the infection process. bioRxiv 2023:2023.02.25.529910. [PMID: 36865152 PMCID: PMC9980179 DOI: 10.1101/2023.02.25.529910] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/13/2024]
Abstract
The contractile tail of bacteriophage P2 functions to drive the tail tube across the outer membrane of its host bacterium, a prerequisite event for subsequent translocation of phage genomic DNA into the host cell. The tube is equipped with a spike-shaped protein (product of P2 gene V, gpV or Spike) that contains a membrane-attacking Apex domain carrying a centrally positioned Fe ion. The ion is enclosed in a histidine cage that is formed by three symmetry-related copies of a conserved HxH (histidine, any residue, histidine) sequence motif. Here, we used solution biophysics and X-ray crystallography to characterize the structure and properties of Spike mutants in which the Apex domain was either deleted or its histidine cage was either destroyed or replaced with a hydrophobic core. We found that the Apex domain is not required for the folding of full-length gpV or its middle intertwined β-helical domain. Furthermore, despite its high conservation, the Apex domain is dispensable for infection in laboratory conditions. Collectively, our results show that the diameter of the Spike but not the nature of its Apex domain determines the efficiency of infection, which further strengthens the earlier hypothesis of a drill bit-like function of the Spike in host envelope disruption.

17
Zhao H, Zhang H, She Z, Gao Z, Wang Q, Geng Z, Dong Y. Exploring AlphaFold2's Performance on Predicting Amino Acid Side-Chain Conformations and Its Utility in Crystal Structure Determination of B318L Protein. Int J Mol Sci 2023; 24:2740. [PMID: 36769074 PMCID: PMC9916901 DOI: 10.3390/ijms24032740] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 12/01/2022] [Revised: 01/10/2023] [Accepted: 01/12/2023] [Indexed: 02/04/2023]
Abstract
Recent technological breakthroughs in machine-learning-based AlphaFold2 (AF2) are pushing the prediction accuracy of protein structures to an unprecedented level that is on par with experimental structural quality. Despite its outstanding structural modeling capability, further experimental validations and performance assessments of AF2 predictions are still required, thus necessitating the development of integrative structural biology in synergy with both computational and experimental methods. Focusing on the B318L protein that plays an essential role in the African swine fever virus (ASFV) for viral replication, we experimentally demonstrate the high quality of the AF2 predicted model and its practical utility in crystal structural determination. Structural alignment implies that the AF2 model shares nearly the same atomic arrangement as the B318L crystal structure except for some flexible and disordered regions. More importantly, side-chain-based analysis at the individual residue level reveals that AF2's performance is likely dependent on the specific amino acid type and that hydrophobic residues tend to be more accurately predicted by AF2 than hydrophilic residues. Quantitative per-residue RMSD comparisons and further molecular replacement trials suggest that AF2 has a large potential to outperform other computational modeling methods in terms of structural determination. Additionally, it is numerically confirmed that the AF2 model is accurate enough so that it may well potentially withstand experimental data quality to a large extent for structural determination. Finally, an overall structural analysis and molecular docking simulation of the B318L protein are performed. Taken together, our study not only provides new insights into AF2's performance in predicting side-chain conformations but also sheds light upon the significance of AF2 in promoting crystal structural determination, especially when the experimental data quality of the protein crystal is poor.
Affiliation(s)
- Haifan Zhao
  - School of Life Sciences, University of Science and Technology of China, Hefei 230027, China
  - Beijing Synchrotron Radiation Facility, Institute of High Energy Physics, Chinese Academy of Sciences, Beijing 100049, China
- Heng Zhang
  - Beijing Synchrotron Radiation Facility, Institute of High Energy Physics, Chinese Academy of Sciences, Beijing 100049, China
- Zhun She
  - Beijing Synchrotron Radiation Facility, Institute of High Energy Physics, Chinese Academy of Sciences, Beijing 100049, China
- Zengqiang Gao
  - Beijing Synchrotron Radiation Facility, Institute of High Energy Physics, Chinese Academy of Sciences, Beijing 100049, China
- Qi Wang
  - Beijing Synchrotron Radiation Facility, Institute of High Energy Physics, Chinese Academy of Sciences, Beijing 100049, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Zhi Geng
  - Beijing Synchrotron Radiation Facility, Institute of High Energy Physics, Chinese Academy of Sciences, Beijing 100049, China
- Yuhui Dong
  - Beijing Synchrotron Radiation Facility, Institute of High Energy Physics, Chinese Academy of Sciences, Beijing 100049, China
  - University of Chinese Academy of Sciences, Beijing 100049, China

18
Fieulaine S, Tubiana T, Bressanelli S. De novo modelling of HEV replication polyprotein: Five-domain breakdown and involvement of flexibility in functional regulation. Virology 2023; 578:128-140. [PMID: 36527931 DOI: 10.1016/j.virol.2022.12.002] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 10/04/2022] [Revised: 12/01/2022] [Accepted: 12/02/2022] [Indexed: 12/14/2022]
Abstract
Hepatitis E virus (HEV), a major cause of acute viral hepatitis, is a single-stranded, positive-sense RNA virus. As such, it encodes a 1700-residue replication polyprotein pORF1 that directs synthesis of new viral RNA in infected cells. Here we report extensive modeling with AlphaFold2 of the full-length pORF1, and its production by in vitro translation. From this, we give a detailed update on the breakdown into domains of HEV pORF1. We also provide evidence that pORF1's N-terminal domain is likely to oligomerize to form a dodecameric pore, homologously to what has been described for Chikungunya virus. Beyond providing accurate folds for its five domains, our work highlights that there is no canonical protease encoded in pORF1 and that flexibility in several functionally important regions rather than proteolytic processing may serve to regulate HEV RNA synthesis.
Affiliation(s)
- Sonia Fieulaine
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
- Thibault Tubiana
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
- Stéphane Bressanelli
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France

19
Edich M, Briggs DC, Kippes O, Gao Y, Thorn A. The impact of AlphaFold2 on experimental structure solution. Faraday Discuss 2022; 240:184-195. [PMID: 35943157 PMCID: PMC10231047 DOI: 10.1039/d2fd00072e] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 04/05/2022] [Accepted: 05/03/2022] [Indexed: 01/09/2023]
Abstract
AlphaFold2 is a machine-learning based program that predicts a protein structure based on the amino acid sequence. In this article, we report on the current usages of this new tool and give examples from our work in the Coronavirus Structural Task Force. With its unprecedented accuracy, it can be utilized for the design of expression constructs, de novo protein design and the interpretation of Cryo-EM data with an atomic model. However, these methods are limited by their training data and are of limited use to predict conformational variability and fold flexibility; they also lack co-factors, post-translational modifications and multimeric complexes with oligonucleotides. They also are not always perfect in terms of chemical geometry. Nevertheless, machine learning-based fold prediction is a game changer for structural bioinformatics and experimentalists alike, with exciting developments ahead.
Affiliation(s)
- Maximilian Edich
  - Institute for Nanostructure and Solid State Physics, Universität Hamburg, Luruper Chaussee 149, 22761 Hamburg, Germany
- David C Briggs
  - The Francis Crick Institute, 1 Midland Road, London, NW1 1AT, UK
- Oliver Kippes
  - Institute for Nanostructure and Solid State Physics, Universität Hamburg, Luruper Chaussee 149, 22761 Hamburg, Germany
- Yunyun Gao
  - Institute for Nanostructure and Solid State Physics, Universität Hamburg, Luruper Chaussee 149, 22761 Hamburg, Germany
- Andrea Thorn
  - Institute for Nanostructure and Solid State Physics, Universität Hamburg, Luruper Chaussee 149, 22761 Hamburg, Germany

20
Goulet A, Cambillau C. Present Impact of AlphaFold2 Revolution on Structural Biology, and an Illustration With the Structure Prediction of the Bacteriophage J-1 Host Adhesion Device. Front Mol Biosci 2022; 9:907452. [PMID: 35615740 PMCID: PMC9124777 DOI: 10.3389/fmolb.2022.907452] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 03/29/2022] [Accepted: 04/21/2022] [Indexed: 12/26/2022]
Abstract
In 2021, the release of AlphaFold2 - the DeepMind's machine-learning protein structure prediction program - revolutionized structural biology. Results of the CASP14 contest were an immense surprise as AlphaFold2 successfully predicted 3D structures of nearly all submitted protein sequences. The AlphaFold2 craze has rapidly spread the life science community since structural biologists as well as untrained biologists have now the possibility to obtain high-confidence protein structures. This revolution is opening new avenues to address challenging biological questions. Moreover, AlphaFold2 is imposing itself as an essential step of any structural biology project, and requires us to revisit our structural biology workflows. On one hand, AlphaFold2 synergizes with experimental methods including X-ray crystallography and cryo-electron microscopy. On the other hand, it is, to date, the only method enabling structural analyses of large and flexible assemblies resistant to experimental approaches. We illustrate this valuable application of AlphaFold2 with the structure prediction of the whole host adhesion device from the Lactobacillus casei bacteriophage J-1. With the ongoing improvement of AlphaFold2 algorithms and notebooks, there is no doubt that AlphaFold2-driven biological stories will increasingly be reported, which questions the future directions of experimental structural biology.
Affiliation(s)
- Adeline Goulet
  - Laboratoire D’Ingénierie des Systèmes Macromoléculaires (LISM), Institut de Microbiologie, Aix-Marseille Université, CNRS, Marseille, France
- Christian Cambillau
  - Laboratoire D’Ingénierie des Systèmes Macromoléculaires (LISM), Institut de Microbiologie, Aix-Marseille Université, CNRS, Marseille, France
  - School of Microbiology, University College Cork, Cork, Ireland

21
A database of calculated solution parameters for the AlphaFold predicted protein structures. Sci Rep 2022; 12:7349. [PMID: 35513443 PMCID: PMC9072687 DOI: 10.1038/s41598-022-10607-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/23/2021] [Accepted: 04/07/2022] [Indexed: 12/22/2022]
Abstract
Recent spectacular advances by AI programs in 3D structure predictions from protein sequences have revolutionized the field in terms of accuracy and speed. The resulting “folding frenzy” has already produced predicted protein structure databases for the entire human and other organisms’ proteomes. However, rapidly ascertaining a predicted structure’s reliability based on measured properties in solution should be considered. Shape-sensitive hydrodynamic parameters such as the diffusion and sedimentation coefficients ($D_{t(20,w)}^{0}$, $s_{(20,w)}^{0}$) and the intrinsic viscosity ([η]) can provide a rapid assessment of the overall structure likeliness, and SAXS would yield the structure-related pair-wise distance distribution function p(r) vs. r. Using the extensively validated UltraScan SOlution MOdeler (US-SOMO) suite, a database was implemented calculating from AlphaFold structures the corresponding $D_{t(20,w)}^{0}$, $s_{(20,w)}^{0}$, [η], p(r) vs. r, and other parameters. Circular dichroism spectra were computed using the SESCA program. Some of AlphaFold’s drawbacks were mitigated, such as generating whenever possible a protein’s mature form. Others, like the AlphaFold direct applicability to single-chain structures only, the absence of prosthetic groups, or flexibility issues, are discussed. Overall, this implementation of the US-SOMO-AF database should already aid in rapidly evaluating the consistency in solution of a relevant portion of AlphaFold predicted protein structures.

22
Simpkin AJ, Thomas JMH, Keegan RM, Rigden DJ. MrParse: finding homologues in the PDB and the EBI AlphaFold database for molecular replacement and more. Acta Crystallogr D Struct Biol 2022; 78:553-559. [PMID: 35503204 PMCID: PMC9063843 DOI: 10.1107/s2059798322003576] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 08/31/2021] [Accepted: 03/29/2022] [Indexed: 11/10/2022]
Abstract
Crystallographers have an array of search-model options for structure solution by molecular replacement (MR). The well established options of homologous experimental structures and regular secondary-structure elements or motifs are increasingly supplemented by computational modelling. Such modelling may be carried out locally or may use pre-calculated predictions retrieved from databases such as the EBI AlphaFold database. MrParse is a new pipeline to help to streamline the decision process in MR by consolidating bioinformatic predictions in one place. When reflection data are provided, MrParse can rank any experimental homologues found using eLLG, which indicates the likelihood that a given search model will work in MR. Inbuilt displays of predicted secondary structure, coiled-coil and transmembrane regions further inform the choice of MR protocol. MrParse can also identify and rank homologues in the EBI AlphaFold database, a function that will also interest other structural biologists and bioinformaticians.
Affiliation(s)
- Adam J. Simpkin
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
- Jens M. H. Thomas
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
- Ronan M. Keegan
  - UKRI–STFC, Rutherford Appleton Laboratory, Research Complex at Harwell, Didcot OX11 0FA, United Kingdom
- Daniel J. Rigden
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom

23
Aderinwale T, Bharadwaj V, Christoffer C, Terashi G, Zhang Z, Jahandideh R, Kagaya Y, Kihara D. Real-time structure search and structure classification for AlphaFold protein models. Commun Biol 2022; 5:316. [PMID: 35383281 PMCID: PMC8983703 DOI: 10.1038/s42003-022-03261-8] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 11/09/2021] [Accepted: 03/11/2022] [Indexed: 11/17/2022]
Abstract
Last year saw a breakthrough in protein structure prediction, where the AlphaFold2 method showed a substantial improvement in the modeling accuracy. Following the software release of AlphaFold2, predicted structures by AlphaFold2 for proteins in 21 species were made publicly available via the AlphaFold Database. Here, to facilitate structural analysis and application of AlphaFold2 models, we provide the infrastructure, 3D-AF-Surfer, which allows real-time structure-based search for the AlphaFold2 models. In 3D-AF-Surfer, structures are represented with 3D Zernike descriptors (3DZD), which is a rotationally invariant, mathematical representation of 3D shapes. We developed a neural network that takes 3DZDs of proteins as input and retrieves proteins of the same fold more accurately than direct comparison of 3DZDs. Using 3D-AF-Surfer, we report structure classifications of AlphaFold2 models and discuss the correlation between confidence levels of AlphaFold2 models and intrinsic disordered regions.
Affiliation(s)
- Tunde Aderinwale
  - Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Vijay Bharadwaj
  - Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Charles Christoffer
  - Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Genki Terashi
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
- Zicong Zhang
  - Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Yuki Kagaya
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
- Daisuke Kihara
  - Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA

24
Barbarin-Bocahu I, Graille M. The X-ray crystallography phase problem solved thanks to AlphaFold and RoseTTAFold models: a case-study report. Acta Crystallogr D Struct Biol 2022; 78:517-531. [PMID: 35362474 DOI: 10.1107/s2059798322002157] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 12/14/2021] [Accepted: 02/23/2022] [Indexed: 11/10/2022]
Abstract
The breakthrough recently made in protein structure prediction by deep-learning programs such as AlphaFold and RoseTTAFold will certainly revolutionize biology over the coming decades. The scientific community is only starting to appreciate the various applications, benefits and limitations of these protein models. Yet, after the first thrills due to this revolution, it is important to evaluate the impact of the proposed models and their overall quality to avoid the misinterpretation or overinterpretation of these models by biologists. One of the first applications of these models is in solving the 'phase problem' encountered in X-ray crystallography in calculating electron-density maps from diffraction data. Indeed, the most frequently used technique to derive electron-density maps is molecular replacement. As this technique relies on knowledge of the structure of a protein that shares strong structural similarity with the studied protein, the availability of high-accuracy models is then definitely critical for successful structure solution. After the collection of a 2.45 Å resolution data set, we struggled for two years in trying to solve the crystal structure of a protein involved in the nonsense-mediated mRNA decay pathway, an mRNA quality-control pathway dedicated to the elimination of eukaryotic mRNAs harboring premature stop codons. We used different methods (isomorphous replacement, anomalous diffraction and molecular replacement) to determine this structure, but all failed until we straightforwardly succeeded thanks to both AlphaFold and RoseTTAFold models. Here, we describe how these new models helped us to solve this structure and conclude that in our case the AlphaFold model largely outcompetes the other models. We also discuss the importance of search-model generation for successful molecular replacement.

25
Yamamori Y, Tomii K. Application of Homology Modeling by Enhanced Profile-Profile Alignment and Flexible-Fitting Simulation to Cryo-EM Based Structure Determination. Int J Mol Sci 2022; 23:1977. [PMID: 35216093 PMCID: PMC8879198 DOI: 10.3390/ijms23041977] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/17/2021] [Revised: 02/07/2022] [Accepted: 02/09/2022] [Indexed: 12/03/2022]
Abstract
Application of cryo-electron microscopy (cryo-EM) is crucially important for ascertaining the atomic structure of large biomolecules such as ribosomes and protein complexes in membranes. Advances in cryo-EM technology and software have made it possible to obtain data with near-atomic resolution, but the method is still often capable of producing only a density map with up to medium resolution, either partially or entirely. Therefore, bridging the gap separating the density map and the atomic model is necessary. Herein, we propose a methodology for constructing atomic structure models based on cryo-EM maps with low-to-medium resolution. The method is a combination of sensitive and accurate homology modeling using our profile-profile alignment method with a flexible-fitting method using molecular dynamics simulation. As described herein, this study used benchmark applications to evaluate the model constructions of human two-pore channel 2 (one target protein in CASP13 with its structure determined using cryo-EM data) and the overall structure of Enterococcus hirae V-ATPase complex.
Affiliation(s)
- Yu Yamamori
  - Artificial Intelligence Research Center (AIRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan
- Kentaro Tomii
  - Artificial Intelligence Research Center (AIRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan
  - AIST-Tokyo Tech Real World Big-Data Computation Open Innovation Laboratory (RWBC-OIL), National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan

26

27
McCoy AJ, Sammito MD, Read RJ. Implications of AlphaFold2 for crystallographic phasing by molecular replacement. Acta Crystallogr D Struct Biol 2022; 78:1-13. [PMID: 34981757 PMCID: PMC8725160 DOI: 10.1107/s2059798321012122] [Citation(s) in RCA: 64] [Impact Index Per Article: 21.3] [Received: 05/18/2021] [Accepted: 11/13/2021] [Indexed: 12/11/2022]
Abstract
The AlphaFold2 results in the 14th edition of Critical Assessment of Structure Prediction (CASP14) showed that accurate (low root-mean-square deviation) in silico models of protein structure domains are on the horizon, whether or not the protein is related to known structures through high-coverage sequence similarity. As highly accurate models become available, generated by harnessing the power of correlated mutations and deep learning, one of the aspects of structural biology to be impacted will be methods of phasing in crystallography. Here, the data from CASP14 are used to explore the prospects for changes in phasing methods, and in particular to explore the prospects for molecular-replacement phasing using in silico models.
Affiliation(s)
- Airlie J. McCoy
  - Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Hills Road, Cambridge CB2 0XY, United Kingdom
- Massimo D. Sammito
  - Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Hills Road, Cambridge CB2 0XY, United Kingdom
- Randy J. Read
  - Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Hills Road, Cambridge CB2 0XY, United Kingdom

28
Kryshtafovych A, Schwede T, Topf M, Fidelis K, Moult J. Critical assessment of methods of protein structure prediction (CASP)-Round XIV. Proteins 2021; 89:1607-1617. [PMID: 34533838 PMCID: PMC8726744 DOI: 10.1002/prot.26237] [Citation(s) in RCA: 273] [Impact Index Per Article: 68.3] [Received: 07/23/2021] [Accepted: 07/28/2021] [Indexed: 01/14/2023]
Abstract
Critical assessment of structure prediction (CASP) is a community experiment to advance methods of computing three-dimensional protein structure from amino acid sequence. Core components are rigorous blind testing of methods and evaluation of the results by independent assessors. In the most recent experiment (CASP14), deep-learning methods from one research group consistently delivered computed structures rivaling the corresponding experimental ones in accuracy. In this sense, the results represent a solution to the classical protein-folding problem, at least for single proteins. The models have already been shown to be capable of providing solutions for problematic crystal structures, and there are broad implications for the rest of structural biology. Other research groups also substantially improved performance. Here, we describe these results and outline some of the many implications. Other related areas of CASP, including modeling of protein complexes, structure refinement, estimation of model accuracy, and prediction of inter-residue contacts and distances, are also described.
Affiliation(s)
- Andriy Kryshtafovych
- Genome Center, University of California, Davis, 451 Health Sciences Drive, Davis, CA 95616, USA
- Torsten Schwede
- University of Basel, Biozentrum & SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Maya Topf
- Centre for Structural Systems Biology, Leibniz-Institut für Experimentelle Virologie and Universitätsklinikum Hamburg-Eppendorf (UKE), Hamburg, Germany
- Krzysztof Fidelis
- Genome Center, University of California, Davis, 451 Health Sciences Drive, Davis, CA 95616, USA
- John Moult
- Institute for Bioscience and Biotechnology Research, 9600 Gudelsky Drive, Rockville, MD 20850, USA, Department of Cell Biology and Molecular Genetics, University of Maryland
|
29
|
Cragnolini T, Kryshtafovych A, Topf M. Cryo-EM targets in CASP14. Proteins 2021; 89:1949-1958. [PMID: 34398978 PMCID: PMC8630773 DOI: 10.1002/prot.26216] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/07/2021] [Revised: 07/27/2021] [Accepted: 08/06/2021] [Indexed: 11/22/2022]
Abstract
Structures of seven CASP14 targets were determined using cryo-electron microscopy (cryo-EM) technique with resolution between 2.1 and 3.8 Å. We provide an evaluation of the submitted models versus the experimental data (cryo-EM density maps) and experimental reference structures built into the maps. The accuracy of models is measured in terms of coordinate-to-density and coordinate-to-coordinate fit. A-posteriori refinement of the most accurate models in their corresponding cryo-EM density resulted in structures that are close to the reference structure, including some regions with better fit to the density. Regions that were found to be less "refineable" correlate well with regions of high diversity between the CASP models and low goodness-of-fit to density in the reference structure.
Affiliation(s)
- Tristan Cragnolini
- Institute of Structural and Molecular Biology, Birkbeck, University College London, London, UK
- Maya Topf
- Center for Structural Systems Biology, Leibniz-Institut für Experimentelle Virologie and Universitätsklinikum Hamburg-Eppendorf (UKE), Hamburg, Germany
|
30
|
Alexander LT, Lepore R, Kryshtafovych A, Adamopoulos A, Alahuhta M, Arvin AM, Bomble YJ, Böttcher B, Breyton C, Chiarini V, Chinnam NB, Chiu W, Fidelis K, Grinter R, Gupta GD, Hartmann MD, Hayes CS, Heidebrecht T, Ilari A, Joachimiak A, Kim Y, Linares R, Lovering AL, Lunin VV, Lupas AN, Makbul C, Michalska K, Moult J, Mukherjee PK, Nutt W(S), Oliver SL, Perrakis A, Stols L, Tainer JA, Topf M, Tsutakawa SE, Valdivia‐Delgado M, Schwede T. Target highlights in CASP14: Analysis of models by structure providers. Proteins 2021; 89:1647-1672. [PMID: 34561912 PMCID: PMC8616854 DOI: 10.1002/prot.26247] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 05/08/2021] [Revised: 09/13/2021] [Accepted: 09/16/2021] [Indexed: 12/11/2022]
Abstract
The biological and functional significance of selected Critical Assessment of Techniques for Protein Structure Prediction 14 (CASP14) targets are described by the authors of the structures. The authors highlight the most relevant features of the target proteins and discuss how well these features were reproduced in the respective submitted predictions. The overall ability to predict three-dimensional structures of proteins has improved remarkably in CASP14, and many difficult targets were modeled with impressive accuracy. For the first time in the history of CASP, the experimentalists not only highlighted that computational models can accurately reproduce the most critical structural features observed in their targets, but also envisaged that models could serve as a guidance for further studies of biologically-relevant properties of proteins.
Affiliation(s)
- Leila T. Alexander
- Biozentrum, University of Basel, Basel, Switzerland
- Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Athanassios Adamopoulos
- Oncode Institute and Division of Biochemistry, Netherlands Cancer Institute, Amsterdam, The Netherlands
- Markus Alahuhta
- Bioscience Center, National Renewable Energy Laboratory, Golden, Colorado, USA
- Ann M. Arvin
- Department of Pediatrics, Stanford University School of Medicine, Stanford, California, USA
- Microbiology and Immunology, Stanford University School of Medicine, Stanford, California, USA
- Yannick J. Bomble
- Bioscience Center, National Renewable Energy Laboratory, Golden, Colorado, USA
- Bettina Böttcher
- Biocenter and Rudolf Virchow Center, Julius‐Maximilians Universität Würzburg, Würzburg, Germany
- Cécile Breyton
- Univ. Grenoble Alpes, CNRS, CEA, Institute for Structural Biology, Grenoble, France
- Valerio Chiarini
- Program in Structural Biology and Biophysics, Institute of Biotechnology, University of Helsinki, Helsinki, Finland
- Naga babu Chinnam
- Department of Molecular and Cellular Oncology, The University of Texas M.D. Anderson Cancer Center, Houston, Texas, USA
- Wah Chiu
- Microbiology and Immunology, Stanford University School of Medicine, Stanford, California, USA
- Bioengineering, Stanford University School of Medicine, Stanford, California, USA
- Division of Cryo‐EM and Bioimaging SSRL, SLAC National Accelerator Laboratory, Menlo Park, California, USA
- Rhys Grinter
- Infection and Immunity Program, Biomedicine Discovery Institute and Department of Microbiology, Monash University, Clayton, Australia
- Gagan D. Gupta
- Radiation Biology & Health Sciences Division, Bhabha Atomic Research Centre, Mumbai, India
- Marcus D. Hartmann
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, Tübingen, Germany
- Christopher S. Hayes
- Department of Molecular, Cellular and Developmental Biology, University of California, Santa Barbara, Santa Barbara, California, USA
- Biomolecular Science and Engineering Program, University of California, Santa Barbara, Santa Barbara, California, USA
- Tatjana Heidebrecht
- Oncode Institute and Division of Biochemistry, Netherlands Cancer Institute, Amsterdam, The Netherlands
- Andrea Ilari
- Institute of Molecular Biology and Pathology of the National Research Council of Italy (CNR), Rome, Italy
- Andrzej Joachimiak
- Center for Structural Genomics of Infectious Diseases, Consortium for Advanced Science and Engineering, University of Chicago, Chicago, Illinois, USA
- X‐ray Science Division, Argonne National Laboratory, Structural Biology Center, Argonne, Illinois, USA
- Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois, USA
- Youngchang Kim
- Center for Structural Genomics of Infectious Diseases, Consortium for Advanced Science and Engineering, University of Chicago, Chicago, Illinois, USA
- X‐ray Science Division, Argonne National Laboratory, Structural Biology Center, Argonne, Illinois, USA
- Romain Linares
- Univ. Grenoble Alpes, CNRS, CEA, Institute for Structural Biology, Grenoble, France
- Vladimir V. Lunin
- Bioscience Center, National Renewable Energy Laboratory, Golden, Colorado, USA
- Andrei N. Lupas
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, Tübingen, Germany
- Cihan Makbul
- Biocenter and Rudolf Virchow Center, Julius‐Maximilians Universität Würzburg, Würzburg, Germany
- Karolina Michalska
- Center for Structural Genomics of Infectious Diseases, Consortium for Advanced Science and Engineering, University of Chicago, Chicago, Illinois, USA
- X‐ray Science Division, Argonne National Laboratory, Structural Biology Center, Argonne, Illinois, USA
- John Moult
- Department of Cell Biology and Molecular Genetics, Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland, USA
- Prasun K. Mukherjee
- Nuclear Agriculture & Biotechnology Division, Bhabha Atomic Research Centre, Mumbai, India
- William (Sam) Nutt
- Center for Structural Genomics of Infectious Diseases, Consortium for Advanced Science and Engineering, University of Chicago, Chicago, Illinois, USA
- X‐ray Science Division, Argonne National Laboratory, Structural Biology Center, Argonne, Illinois, USA
- Stefan L. Oliver
- Department of Pediatrics, Stanford University School of Medicine, Stanford, California, USA
- Anastassis Perrakis
- Oncode Institute and Division of Biochemistry, Netherlands Cancer Institute, Amsterdam, The Netherlands
- Lucy Stols
- Center for Structural Genomics of Infectious Diseases, Consortium for Advanced Science and Engineering, University of Chicago, Chicago, Illinois, USA
- X‐ray Science Division, Argonne National Laboratory, Structural Biology Center, Argonne, Illinois, USA
- John A. Tainer
- Department of Molecular and Cellular Oncology, The University of Texas M.D. Anderson Cancer Center, Houston, Texas, USA
- Department of Cancer Biology, University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Maya Topf
- Institute of Structural and Molecular Biology, Birkbeck, University College London, London, UK
- Centre for Structural Systems Biology, Leibniz‐Institut für Experimentelle Virologie, Hamburg, Germany
- Susan E. Tsutakawa
- Molecular Biophysics and Integrated Bioimaging, Lawrence Berkeley National Laboratory, Berkeley, California, USA
- Torsten Schwede
- Biozentrum, University of Basel, Basel, Switzerland
- Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
|
31
|
Kryshtafovych A, Moult J, Albrecht R, Chang GA, Chao K, Fraser A, Greenfield J, Hartmann MD, Herzberg O, Josts I, Leiman PG, Linden SB, Lupas AN, Nelson DC, Rees SD, Shang X, Sokolova ML, Tidow H. Computational models in the service of X-ray and cryo-electron microscopy structure determination. Proteins 2021; 89:1633-1646. [PMID: 34449113 PMCID: PMC8616789 DOI: 10.1002/prot.26223] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 05/07/2021] [Revised: 08/11/2021] [Accepted: 08/17/2021] [Indexed: 01/20/2023]
Abstract
Critical assessment of structure prediction (CASP) conducts community experiments to determine the state of the art in computing protein structure from amino acid sequence. The process relies on the experimental community providing information about not yet public or about to be solved structures, for use as targets. For some targets, the experimental structure is not solved in time for use in CASP. Calculated structure accuracy improved dramatically in this round, implying that models should now be much more useful for resolving many sorts of experimental difficulties. To test this, selected models for seven unsolved targets were provided to the experimental groups. These models were from the AlphaFold2 group, who overall submitted the most accurate predictions in CASP14. Four targets were solved with the aid of the models, and, additionally, the structure of an already solved target was improved. An a posteriori analysis showed that, in some cases, models from other groups would also be effective. This paper provides accounts of the successful application of models to structure determination, including molecular replacement for X-ray crystallography, backbone tracing and sequence positioning in a cryo-electron microscopy structure, and correction of local features. The results suggest that, in future, there will be greatly increased synergy between computational and experimental approaches to structure determination.
Affiliation(s)
- John Moult
- Institute for Bioscience and Biotechnology Research, Department of Cell Biology and Molecular genetics, University of Maryland, 9600 Gudelsky Drive, Rockville, MD 20850, USA
- Reinhard Albrecht
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
- Geoffrey A. Chang
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California-San Diego, La Jolla, CA, 92093, USA
- Department of Pharmacology, University of California-San Diego, La Jolla, CA, 92093, USA
- Kinlin Chao
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD 20850, USA
- Alec Fraser
- Department of Biochemistry and Molecular Biology, Sealy Center for Structural Biology and Molecular Biophysics (SCSB), The University of Texas Medical Branch at Galveston, TX 77555, USA
- Julia Greenfield
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD 20850, USA
- Marcus D. Hartmann
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
- Osnat Herzberg
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD 20850, USA
- Department of Chemistry and Biochemistry, University of Maryland, College Park, MD 20742, USA
- Inokentijs Josts
- The Hamburg Advanced Research Center for Bioorganic Chemistry (HARBOR) & Department of Chemistry, Institute for Biochemistry and Molecular Biology, University of Hamburg, Luruper Chaussee 149, 22761 Hamburg, Germany
- Petr G. Leiman
- Department of Biochemistry and Molecular Biology, Sealy Center for Structural Biology and Molecular Biophysics (SCSB), The University of Texas Medical Branch at Galveston, TX 77555, USA
- Sara B. Linden
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD 20850, USA
- Andrei N. Lupas
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
- Daniel C. Nelson
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD 20850, USA
- Department of Veterinary Medicine, University of Maryland, College Park, MD 20742, USA
- Steven D. Rees
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California-San Diego, La Jolla, CA, 92093, USA
- Xiaoran Shang
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD 20850, USA
- Maria L. Sokolova
- Center of Life Sciences, Skolkovo Institute of Science and Technology, Moscow, 121205, Russia
- Henning Tidow
- The Hamburg Advanced Research Center for Bioorganic Chemistry (HARBOR) & Department of Chemistry, Institute for Biochemistry and Molecular Biology, University of Hamburg, Luruper Chaussee 149, 22761 Hamburg, Germany
|
32
|
Millán C, Keegan RM, Pereira J, Sammito MD, Simpkin AJ, McCoy AJ, Lupas AN, Hartmann MD, Rigden DJ, Read RJ. Assessing the utility of CASP14 models for molecular replacement. Proteins 2021; 89:1752-1769. [PMID: 34387010 PMCID: PMC8881082 DOI: 10.1002/prot.26214] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Received: 06/18/2021] [Revised: 07/20/2021] [Accepted: 07/27/2021] [Indexed: 11/21/2022]
Abstract
The assessment of CASP models for utility in molecular replacement is a measure of their use in a valuable real‐world application. In CASP7, the metric for molecular replacement assessment involved full likelihood‐based molecular replacement searches; however, this restricted the assessable targets to crystal structures with only one copy of the target in the asymmetric unit, and to those where the search found the correct pose. In CASP10, full molecular replacement searches were replaced by likelihood‐based rigid‐body refinement of models superimposed on the target using the LGA algorithm, with the metric being the refined log‐likelihood‐gain (LLG) score. This enabled multi‐copy targets and very poor models to be evaluated, but a significant further issue remained: the requirement of diffraction data for assessment. We introduce here the relative‐expected‐LLG (reLLG), which is independent of diffraction data. This reLLG is also independent of any crystal form, and can be calculated regardless of the source of the target, be it X‐ray, NMR or cryo‐EM. We calibrate the reLLG against the LLG for targets in CASP14, showing that it is a robust measure of both model and group ranking. Like the LLG, the reLLG shows that accurate coordinate error estimates add substantial value to predicted models. We find that refinement by CASP groups can often convert an inadequate initial model into a successful MR search model. Consistent with findings from others, we show that the AlphaFold2 models are sufficiently good, and reliably so, to surpass other current model generation strategies for attempting molecular replacement phasing.
Affiliation(s)
- Claudia Millán
- Department of Haematology, University of Cambridge, Cambridge Institute for Medical Research, Cambridge, United Kingdom
- Ronan M Keegan
- Scientific Computing Dept., Science and Technologies Facilities Council, UK Research and Innovation, Didcot, Oxfordshire, United Kingdom
- Joana Pereira
- Max Planck Institute for Developmental Biology, Max-Planck-Ring 5, Tübingen, Germany
- Massimo D Sammito
- Department of Haematology, University of Cambridge, Cambridge Institute for Medical Research, Cambridge, United Kingdom
- Adam J Simpkin
- Institute of Systems, Molecular and Integrative Biology, Biosciences Building, Crown Street, Liverpool L69 7BE, United Kingdom
- Airlie J McCoy
- Department of Haematology, University of Cambridge, Cambridge Institute for Medical Research, Cambridge, United Kingdom
- Andrei N Lupas
- Max Planck Institute for Developmental Biology, Max-Planck-Ring 5, Tübingen, Germany
- Marcus D Hartmann
- Max Planck Institute for Developmental Biology, Max-Planck-Ring 5, Tübingen, Germany
- Daniel J Rigden
- Institute of Systems, Molecular and Integrative Biology, Biosciences Building, Crown Street, Liverpool L69 7BE, United Kingdom
- Randy J Read
- Department of Haematology, University of Cambridge, Cambridge Institute for Medical Research, Cambridge, United Kingdom
|