1. Wang J, Zhang Q, Fan W, Shi Q, Mao J, Xie J, Chai G, Zhang C. Deciphering olfactory receptor binding mechanisms: a structural and dynamic perspective on olfactory receptors. Front Mol Biosci 2025; 11:1498796. [PMID: 39845900 PMCID: PMC11751049 DOI: 10.3389/fmolb.2024.1498796] [Received: 09/20/2024] [Accepted: 12/23/2024] [Indexed: 01/24/2025]
Abstract
Olfactory receptors, classified as G-protein coupled receptors (GPCRs), have been a subject of scientific inquiry since the early 1950s. Historically, investigations into the sensory mechanisms of olfactory receptors were often confined to behavioral characteristics in model organisms or the expression of related proteins and genes. However, with the development of cryo-electron microscopy techniques, it has gradually become possible to decipher the specific structures of olfactory receptors in insects and humans. This has provided new insights into the binding mechanisms between odor molecules and olfactory receptors. Furthermore, due to the rapid advancements in related fields such as computer simulations, the prediction and exploration of odor molecule binding to olfactory receptors have been progressively achieved through molecular dynamics simulations. Through this comprehensive review, we aim to provide a thorough analysis of research related to the binding mechanisms between odor molecules and olfactory receptors from the perspectives of structural biology and molecular dynamics simulations. Finally, we will provide an outlook on the future of research in the field of olfactory receptor sensory mechanisms.
Affiliation(s)
- Jingtao Wang
  - College of Chemistry, Zhengzhou University, Zhengzhou, Henan, China
  - Department of tobacco flavor, Zhengzhou Tobacco Research Institute of CNTC, Zhengzhou, Henan, China
- Qidong Zhang
  - Department of tobacco flavor, Zhengzhou Tobacco Research Institute of CNTC, Zhengzhou, Henan, China
- Wu Fan
  - Department of tobacco flavor, Zhengzhou Tobacco Research Institute of CNTC, Zhengzhou, Henan, China
- Qingzhao Shi
  - Department of tobacco flavor, Zhengzhou Tobacco Research Institute of CNTC, Zhengzhou, Henan, China
- Jian Mao
  - Department of tobacco flavor, Zhengzhou Tobacco Research Institute of CNTC, Zhengzhou, Henan, China
- Jianping Xie
  - Department of tobacco flavor, Zhengzhou Tobacco Research Institute of CNTC, Zhengzhou, Henan, China
- Guobi Chai
  - Department of tobacco flavor, Zhengzhou Tobacco Research Institute of CNTC, Zhengzhou, Henan, China
  - Food Laboratory of Zhongyuan, Flavour Science Research Center of Zhengzhou University, Zhengzhou, Henan, China
- Chenglei Zhang
  - Medical Laboratory, General Hospital of Ningxia Medical University, Yinchuan, Ningxia, China
2. Gut JA, Lemmin T. Dissecting AlphaFold2's capabilities with limited sequence information. Bioinform Adv 2024; 5:vbae187. [PMID: 39846081 PMCID: PMC11751578 DOI: 10.1093/bioadv/vbae187] [Received: 08/06/2024] [Revised: 10/24/2024] [Accepted: 11/21/2024] [Indexed: 01/24/2025]
Abstract
Summary: Protein structure prediction aims to infer a protein's three-dimensional (3D) structure from its amino acid sequence. Protein structure is pivotal for elucidating protein functions and interactions and for driving biotechnological innovation. The deep learning model AlphaFold2 has revolutionized this field by leveraging phylogenetic information from multiple sequence alignments (MSAs) to achieve remarkable accuracy in protein structure prediction. However, a key question remains: how well does AlphaFold2 understand protein structures? This study investigates AlphaFold2's capabilities when relying primarily on high-quality template structures, without the additional information provided by MSAs. By designing experiments that probe local and global structural understanding, we aimed to dissect its dependence on specific features and its ability to handle missing information. Our findings revealed AlphaFold2's reliance on sterically valid Cβ atoms for correctly interpreting structural templates. Additionally, we observed its remarkable ability to recover 3D structures from certain perturbations and the negligible impact of the previous structure in recycling. Collectively, these results support the hypothesis that AlphaFold2 has learned an accurate biophysical energy function. However, this function seems most effective for local interactions. Our work advances understanding of how deep learning models predict protein structures and provides guidance for researchers aiming to overcome limitations in these models. Availability and implementation: Data and implementation are available at https://github.com/ibmm-unibe-ch/template-analysis.
Affiliation(s)
- Jannik Adrian Gut
  - Institute of Biochemistry and Molecular Medicine, University of Bern, Bern 3012, Switzerland
  - Graduate School for Cellular and Biomedical Sciences (GCB), University of Bern, Bern 3012, Switzerland
- Thomas Lemmin
  - Institute of Biochemistry and Molecular Medicine, University of Bern, Bern 3012, Switzerland
3. Zhang F, Li Z, Zhao K, Zhao P, Zhang G. Prediction of Inter-Residue Multiple Distances and Exploration of Protein Multiple Conformations by Deep Learning. IEEE/ACM Trans Comput Biol Bioinform 2024; 21:1731-1739. [PMID: 38857126 DOI: 10.1109/tcbb.2024.3411825] [Indexed: 06/12/2024]
Abstract
AlphaFold2 has achieved a major breakthrough in end-to-end prediction of static protein structures. However, protein conformational change is considered to be a key factor in protein biological function. Predicting multiple inter-residue distances is therefore of great significance for exploring multiple protein conformations. In this study, we proposed an inter-residue multiple distances prediction method, DeepMDisPre, based on an improved network which integrates triangle update, axial attention and ResNet to predict multiple distances of residue pairs. We built a dataset which contains proteins with a single structure and proteins with multiple conformations to train the network. We tested DeepMDisPre on 114 proteins with multiple conformations. The results show that the inter-residue distance distribution predicted by DeepMDisPre is more likely to have multiple peaks for flexible residue pairs than for rigid residue pairs. In two cases of proteins with multiple conformations, we modeled the multiple conformations relatively accurately by using the predicted inter-residue multiple distances. In addition, we tested the performance of DeepMDisPre on 279 proteins with a single structure. Experimental results demonstrate that the average contact accuracy of DeepMDisPre is higher than that of the comparative method. In terms of static protein modeling, the average TM-score of the 3D models built by DeepMDisPre is also improved compared with the comparative method.
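The observation that flexible residue pairs tend to have multi-peaked distance distributions can be illustrated with a simple peak count over a binned distribution. This is only a sketch under stated assumptions: the function `count_peaks`, the `min_height` cutoff, and the example distributions are hypothetical and do not come from DeepMDisPre.

```python
def count_peaks(probs, min_height=0.1):
    """Count strict local maxima in a binned distance distribution.

    probs: probabilities per distance bin (hypothetical input format).
    min_height: assumed cutoff so that small bumps are ignored.
    Flat plateaus are not counted, because both comparisons are strict.
    """
    peaks = 0
    for i, p in enumerate(probs):
        left = probs[i - 1] if i > 0 else 0.0
        right = probs[i + 1] if i < len(probs) - 1 else 0.0
        if p >= min_height and p > left and p > right:
            peaks += 1
    return peaks


# Made-up distributions: one dominant distance versus two alternative distances.
rigid_pair = [0.0, 0.1, 0.7, 0.2, 0.0]
flexible_pair = [0.05, 0.4, 0.05, 0.05, 0.4, 0.05]

rigid_peaks = count_peaks(rigid_pair)
flexible_peaks = count_peaks(flexible_pair)
```

Under these assumptions, a pair with more than one peak would be a candidate for having several distinct distances across conformations.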
4. Cheng J, Liang T, Xie XQ, Feng Z, Meng L. A new era of antibody discovery: an in-depth review of AI-driven approaches. Drug Discov Today 2024; 29:103984. [PMID: 38642702 DOI: 10.1016/j.drudis.2024.103984] [Received: 10/12/2023] [Revised: 04/02/2024] [Accepted: 04/15/2024] [Indexed: 04/22/2024]
Abstract
Given their high affinity and specificity for a range of macromolecules, antibodies are widely used in the treatment of autoimmune diseases, cancers, inflammatory diseases, and Alzheimer's disease (AD). Traditional experimental methods are time-consuming, expensive, and labor-intensive. Recent advances in artificial intelligence (AI) technologies provide complementary methods that can reduce the time and costs required for antibody design by minimizing failures and increasing the success rate of experimental tests. In this review, we scrutinize the plethora of AI-driven methodologies that have been deployed over the past 4 years for modeling antibody structures, predicting antibody-antigen interactions, optimizing antibody affinity, and generating novel antibody candidates. We also briefly address the challenges faced in integrating AI-based models with traditional antibody discovery pipelines and highlight the potential future directions in this burgeoning field.
Affiliation(s)
- Jin Cheng
  - School of Pharmacy, Jiangsu Vocational College of Medicine, Yancheng, 224005, China
- Tianjian Liang
  - Department of Pharmaceutical Sciences, Computational Chemical Genomics Screening Center, and Pharmacometrics & System Pharmacology PharmacoAnalytics, School of Pharmacy, University of Pittsburgh, Pittsburgh, PA 15261, USA; Center of Excellence for Computational Drug Abuse Research, University of Pittsburgh, Pittsburgh, PA 15261, USA
- Xiang-Qun Xie
  - Department of Pharmaceutical Sciences, Computational Chemical Genomics Screening Center, and Pharmacometrics & System Pharmacology PharmacoAnalytics, School of Pharmacy, University of Pittsburgh, Pittsburgh, PA 15261, USA; Center of Excellence for Computational Drug Abuse Research, University of Pittsburgh, Pittsburgh, PA 15261, USA; Drug Discovery Institute, University of Pittsburgh, Pittsburgh, PA 15261, USA; Department of Computational Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15261, USA; Department of Structural Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15261, USA
- Zhiwei Feng
  - Department of Pharmaceutical Sciences, Computational Chemical Genomics Screening Center, and Pharmacometrics & System Pharmacology PharmacoAnalytics, School of Pharmacy, University of Pittsburgh, Pittsburgh, PA 15261, USA; Center of Excellence for Computational Drug Abuse Research, University of Pittsburgh, Pittsburgh, PA 15261, USA
- Li Meng
  - School of Pharmacy, Jiangsu Vocational College of Medicine, Yancheng, 224005, China
5. Li Z, Fan H, Ding W. Solving protein structures by combining structure prediction, molecular replacement and direct-methods-aided model completion. IUCrJ 2024; 11:152-167. [PMID: 38214490 PMCID: PMC10916285 DOI: 10.1107/s2052252523010291] [Received: 09/05/2023] [Accepted: 11/29/2023] [Indexed: 01/13/2024]
Abstract
Highly accurate protein structure prediction can generate accurate models of protein and protein-protein complexes in X-ray crystallography. However, the question of how to make more effective use of predicted models for completing structure analysis, and which strategies should be employed for the more challenging cases such as multi-helical structures, multimeric structures and extremely large structures, both in the model preparation and in the completion steps, remains open for discussion. In this paper, a new strategy is proposed based on the framework of direct methods and dual-space iteration, which can greatly simplify the pre-processing steps of predicted models both in normal and in challenging cases. Following this strategy, full-length models or the conservative structural domains could be used directly as the starting model, and the phase error and the model bias between the starting model and the real structure would be modified in the direct-methods-based dual-space iteration. Many challenging cases (from CASP14) have been tested for the general applicability of this constructive strategy, and almost complete models have been generated with reasonable statistics. The hybrid strategy therefore provides a meaningful scheme for X-ray structure determination using a predicted model as the starting point.
Affiliation(s)
- Zengru Li
  - Beijing National Laboratory for Condensed Matter Physics, Institute of Physics, Chinese Academy of Sciences, Beijing 100190, People’s Republic of China
  - School of Physical Sciences, University of Chinese Academy of Sciences, Beijing 100049, People’s Republic of China
- Haifu Fan
  - Beijing National Laboratory for Condensed Matter Physics, Institute of Physics, Chinese Academy of Sciences, Beijing 100190, People’s Republic of China
- Wei Ding
  - Beijing National Laboratory for Condensed Matter Physics, Institute of Physics, Chinese Academy of Sciences, Beijing 100190, People’s Republic of China
6. Das A, Cheng H, Wang Y, Kinch LN, Liang G, Hong S, Hobbs HH, Cohen JC. The ubiquitin E3 ligase BFAR promotes degradation of PNPLA3. Proc Natl Acad Sci U S A 2024; 121:e2312291121. [PMID: 38294943 PMCID: PMC10861911 DOI: 10.1073/pnas.2312291121] [Received: 07/19/2023] [Accepted: 12/26/2023] [Indexed: 02/02/2024]
Abstract
A missense variant in patatin-like phospholipase domain-containing protein 3 [PNPLA3(I148M)] is the most impactful genetic risk factor for fatty liver disease (FLD). We previously showed that PNPLA3 is ubiquitylated and subsequently degraded by proteasomes and autophagosomes and that the PNPLA3(148M) variant interferes with this process. To define the machinery responsible for PNPLA3 turnover, we used small interfering (si)RNAs to inactivate components of the ubiquitin proteasome system. Inactivation of bifunctional apoptosis regulator (BFAR), a membrane-bound E3 ubiquitin ligase, reproducibly increased PNPLA3 levels in two lines of cultured hepatocytes. Conversely, overexpression of BFAR decreased levels of endogenous PNPLA3 in HuH7 cells. BFAR and PNPLA3 co-immunoprecipitated when co-expressed in cells. BFAR promoted ubiquitylation of PNPLA3 in vitro in a reconstitution assay using purified, epitope-tagged recombinant proteins. To confirm that BFAR targets PNPLA3, we inactivated Bfar in mice. Levels of PNPLA3 protein were increased twofold in hepatic lipid droplets of Bfar-/- mice with no associated increase in PNPLA3 mRNA levels. Taken together these data are consistent with a model in which BFAR plays a role in the post-translational degradation of PNPLA3. The identification of BFAR provides a potential target to enhance PNPLA3 turnover and prevent FLD.
Affiliation(s)
- Avash Das
  - Department of Molecular Genetics, University of Texas Southwestern Medical Center, Dallas, TX 75390
- Haili Cheng
  - Department of Molecular Genetics, University of Texas Southwestern Medical Center, Dallas, TX 75390
- Yang Wang
  - Department of Molecular Genetics, University of Texas Southwestern Medical Center, Dallas, TX 75390
- Lisa N. Kinch
  - HHMI, University of Texas Southwestern Medical Center, Dallas, TX 75390
- Guosheng Liang
  - Department of Molecular Genetics, University of Texas Southwestern Medical Center, Dallas, TX 75390
- Sen Hong
  - Department of Molecular Genetics, University of Texas Southwestern Medical Center, Dallas, TX 75390
- Helen H. Hobbs
  - Department of Molecular Genetics, University of Texas Southwestern Medical Center, Dallas, TX 75390
  - HHMI, University of Texas Southwestern Medical Center, Dallas, TX 75390
- Jonathan C. Cohen
  - Department of Molecular Genetics, University of Texas Southwestern Medical Center, Dallas, TX 75390
  - Center for Human Nutrition, University of Texas Southwestern Medical Center, Dallas, TX 75390
7. Kinch LN, Schaeffer RD, Zhang J, Cong Q, Orth K, Grishin N. Insights into virulence: structure classification of the Vibrio parahaemolyticus RIMD mobilome. mSystems 2023; 8:e0079623. [PMID: 38014954 PMCID: PMC10734457 DOI: 10.1128/msystems.00796-23] [Received: 08/03/2023] [Accepted: 10/17/2023] [Indexed: 11/29/2023]
Abstract
IMPORTANCE: The pandemic Vpar strain RIMD causes seafood-borne illness worldwide. Previous comparative genomic studies have revealed pathogenicity islands in RIMD that contribute to the success of the strain in infection. However, not all virulence determinants have been identified, and many of the proteins encoded in known pathogenicity islands are of unknown function. Based on the ECOD database, we used evolution-based classification of structure models for the RIMD proteome to improve our functional understanding of virulence determinants acquired by the pandemic strain. We further identify and classify previously unknown mobile protein domains as well as fast-evolving residue positions in structure models that contribute to virulence and adaptation with respect to a pre-pandemic strain. Our work highlights key contributions of phage in mediating seafood-borne illness, suggesting this strain balances its avoidance of phage predators with its successful colonization of human hosts.
Affiliation(s)
- Lisa N. Kinch
  - Department of Molecular Biology, University of Texas Southwestern Medical Center, Dallas, Texas, USA
  - Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- R. Dustin Schaeffer
  - Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Jing Zhang
  - Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, USA
  - Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Qian Cong
  - Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, USA
  - Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, Texas, USA
  - Harold C. Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Kim Orth
  - Department of Molecular Biology, University of Texas Southwestern Medical Center, Dallas, Texas, USA
  - Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas, USA
  - Department of Biochemistry, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Nick Grishin
  - Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, USA
  - Department of Biochemistry, University of Texas Southwestern Medical Center, Dallas, Texas, USA
8. Kryshtafovych A, Rigden DJ. To split or not to split: CASP15 targets and their processing into tertiary structure evaluation units. Proteins 2023; 91:1558-1570. [PMID: 37254889 PMCID: PMC10687315 DOI: 10.1002/prot.26533] [Received: 03/10/2023] [Revised: 05/02/2023] [Accepted: 05/18/2023] [Indexed: 06/01/2023]
Abstract
Processing of CASP15 targets into evaluation units (EUs) and assigning them to evolutionary-based prediction classes is presented in this study. The targets were first split into structural domains based on compactness and similarity to other proteins. Models were then evaluated against these domains and their combinations. The domains were joined into larger EUs if predictors' performance on the combined units was similar to that on individual domains. Alternatively, if most predictors performed better on the individual domains, then they were retained as EUs. As a result, 112 evaluation units were created from 77 tertiary structure prediction targets. The EUs were assigned to four prediction classes roughly corresponding to target difficulty categories in previous CASPs: TBM (template-based modeling, easy or hard), FM (free modeling), and the TBM/FM overlap category. More than a third of CASP15 EUs were attributed to the historically most challenging FM class, where homology or structural analogy to proteins of known fold cannot be detected.
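The join-or-split rule described above can be sketched as a majority vote over predictors. This is a minimal illustration only: the abstract gives no thresholds, so the function name `should_split`, the `tolerance` value, the use of a mean over domain scores, and the example scores are all assumptions.

```python
def should_split(combined, per_domain, tolerance=5.0):
    """Decide whether domains stay as separate evaluation units.

    combined: dict mapping predictor -> score on the combined unit.
    per_domain: dict mapping predictor -> list of scores on individual domains.
    tolerance: assumed margin below which the two scores count as "similar".
    Returns True when most predictors score clearly better on the
    individual domains than on the combined unit.
    """
    better_apart = 0
    for predictor, whole_score in combined.items():
        domain_scores = per_domain[predictor]
        domain_mean = sum(domain_scores) / len(domain_scores)
        if domain_mean - whole_score > tolerance:
            better_apart += 1
    return better_apart > len(combined) / 2


# Made-up scores for three hypothetical predictors on a two-domain target.
combined_scores = {"g1": 50.0, "g2": 48.0, "g3": 80.0}
domain_scores = {"g1": [85.0, 80.0], "g2": [78.0, 82.0], "g3": [82.0, 84.0]}

split = should_split(combined_scores, domain_scores)
```

Here two of the three hypothetical predictors do much better on the separate domains, so the sketch keeps the domains as individual units; if the scores were similar, the domains would be joined.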
Affiliation(s)
- Daniel J. Rigden
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, England
9. Huang GJ, Parry TK, McLaughlin WA. Assessment of the Performances of the Protein Modeling Techniques Participating in CASP15 Using a Structure-Based Functional Site Prediction Approach: ResiRole. Bioengineering (Basel) 2023; 10:1377. [PMID: 38135968 PMCID: PMC10740689 DOI: 10.3390/bioengineering10121377] [Received: 10/17/2023] [Revised: 11/27/2023] [Accepted: 11/28/2023] [Indexed: 12/24/2023]
Abstract
BACKGROUND: Model quality assessments via computational methods which entail comparisons of the modeled structures to the experimentally determined structures are essential in the field of protein structure prediction. The assessments provide means to benchmark the accuracies of the modeling techniques and to aid with their development. We previously described the ResiRole method to gauge model quality principally based on the preservation of the structural characteristics described in SeqFEATURE functional site prediction models.
METHODS: We apply ResiRole to benchmark modeling group performances in the Critical Assessment of Structure Prediction experiment, round 15. To gauge model quality, a normalized Predicted Functional site Similarity Score (PFSS) was calculated as the average of one minus the absolute values of the differences of the functional site prediction probabilities, as found for the experimental structures versus those found at the corresponding sites in the structure models.
RESULTS: The average PFSS per modeling group (gPFSS) correlates with standard quality metrics, and can effectively be used to rank the accuracies of the groups. For the free modeling (FM) category, correlation coefficients of the Local Distance Difference Test (LDDT) and Global Distance Test-Total Score (GDT-TS) metrics with gPFSS were 0.98239 and 0.87691, respectively. An example finding for a specific group is that the gPFSS for EMBER3D was higher than expected based on the predictive relationship between gPFSS and LDDT. We infer the result is due to the use of constraints imprinted by function that are a part of the EMBER3D methodology. Also, we find functional site predictions that may guide further functional characterizations of the respective proteins.
CONCLUSION: The gPFSS metric provides an effective means to assess and rank the performances of the structure prediction techniques according to their abilities to accurately recount the structural features at predicted functional sites.
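The PFSS definition given in the Methods can be written out as a short calculation. The sketch below assumes one prediction probability per functional site for the experimental structure and one for the model; the function names `pfss` and `gpfss` and the example probabilities are hypothetical, and any further normalization used by ResiRole is not described in the abstract and is left out.

```python
from statistics import mean


def pfss(p_experimental, p_model):
    """Average of (1 - |p_exp - p_model|) over corresponding functional sites."""
    if len(p_experimental) != len(p_model):
        raise ValueError("probability lists must have the same length")
    if not p_experimental:
        raise ValueError("at least one site is required")
    return mean(1.0 - abs(pe - pm) for pe, pm in zip(p_experimental, p_model))


def gpfss(per_model_scores):
    """Average PFSS over the models of one group (hypothetical helper)."""
    return mean(per_model_scores)


# Made-up probabilities for three functional sites.
exp_probs = [0.9, 0.2, 0.6]
model_probs = [0.8, 0.4, 0.6]

score = pfss(exp_probs, model_probs)  # per-site terms are about 0.9, 0.8 and 1.0
```

A score near 1 means the model reproduces the functional-site probabilities of the experimental structure; larger probability differences pull the score down.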
Affiliation(s)
- William A. McLaughlin
  - Department of Medical Education, Geisinger Commonwealth School of Medicine, 525 Pine Street, Scranton, PA 18509, USA (T.K.P.)
10. Polonsky K, Pupko T, Freund NT. Evaluation of the Ability of AlphaFold to Predict the Three-Dimensional Structures of Antibodies and Epitopes. J Immunol 2023; 211:1578-1588. [PMID: 37782047 DOI: 10.4049/jimmunol.2300150] [Received: 03/03/2023] [Accepted: 09/06/2023] [Indexed: 10/03/2023]
Abstract
Being able to accurately predict the three-dimensional structure of an Ab can facilitate Ab characterization and epitope prediction, with important diagnostic and clinical implications. In this study, we evaluated the ability of AlphaFold to predict the structures of 222 recently published, high-resolution Fab H and L chain structures of Abs from different species directed against different Ags. We show that although the overall Ab prediction quality is in line with the results of CASP14, regions such as the complementarity-determining regions (CDRs) of the H chain, which are prone to higher variation, are predicted less accurately. Moreover, we discovered that AlphaFold mispredicts the bending angles between the variable and constant domains. To evaluate the ability of AlphaFold to model Ab-Ag interactions based only on sequence, we used AlphaFold-Multimer in combination with ZDOCK to predict the structures of 26 known Ab-Ag complexes. ZDOCK, which was applied on bound components of both the Ab and the Ag, succeeded in assembling 11 complexes, whereas AlphaFold succeeded in predicting only 2 of 26 models, with significant deviations in the docking contacts predicted in the rest of the molecules. Within the 11 complexes that were successfully predicted by ZDOCK, 9 involved short-peptide Ags (18-mer or less), whereas only 2 were complexes of Ab with a full-length protein. Docking of modeled unbound Ab and Ag was unsuccessful. In summary, our study provides important information about the abilities and limitations of using AlphaFold to predict Ab-Ag interactions and suggests areas for possible improvement.
Affiliation(s)
- Ksenia Polonsky
  - Department of Clinical Microbiology and Immunology, Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
  - Shmunis School of Biomedicine and Cancer Research, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Tal Pupko
  - Shmunis School of Biomedicine and Cancer Research, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Natalia T Freund
  - Department of Clinical Microbiology and Immunology, Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
11. Gil Zuluaga FH, D’Arminio N, Bardozzo F, Tagliaferri R, Marabotti A. An automated pipeline integrating AlphaFold 2 and MODELLER for protein structure prediction. Comput Struct Biotechnol J 2023; 21:5620-5629. [PMID: 38047234 PMCID: PMC10690423 DOI: 10.1016/j.csbj.2023.10.056] [Received: 07/24/2023] [Revised: 10/31/2023] [Accepted: 10/31/2023] [Indexed: 12/05/2023]
Abstract
The ability to predict a protein's three-dimensional conformation represents a crucial starting point for investigating evolutionary connections with other members of the corresponding protein family, examining interactions with other proteins, and potentially utilizing this knowledge for the purpose of rational drug design. In this work, we evaluated the feasibility of improving AlphaFold2's three-dimensional protein predictions by developing a novel pipeline (AlphaMod) that combines AlphaFold2 with MODELLER, a template-based modeling program. Additionally, our tool can drive a comprehensive quality assessment of the tertiary protein structure by incorporating and comparing a set of different quality assessment tools. The outcomes of selected tools are combined into a composite score (BORDASCORE) that exhibits a meaningful correlation with GDT_TS and facilitates the selection of optimal models in the absence of a reference structure. To validate AlphaMod's results, we conducted evaluations using two distinct datasets summing up to 72 targets, previously used to independently assess AlphaFold2's performance. The generated models underwent evaluation through two methods: i) averaging the GDT_TS scores across all produced structures for a single target sequence, and ii) a pairwise comparison of the best structures generated by AlphaFold2 and AlphaMod. The latter, within the unsupervised setups, shows an accuracy gain of approximately 34% over AlphaFold2, whereas in the supervised setup AlphaMod surpasses AlphaFold2 in 18% of the instances. Finally, there is an 11% correspondence in outcomes between the diverse methodologies. Consequently, AlphaMod's best-predicted tertiary structures in several cases exhibited a significant improvement in the accuracy of the predictions with respect to the best models obtained by AlphaFold2. This pipeline paves the way for the integration of additional data and AI-based algorithms to further improve the reliability of the predictions.
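The abstract does not give the formula for BORDASCORE; the name suggests a Borda-count style rank aggregation, and that is the assumption behind the sketch below. It shows a generic Borda count over several quality assessment tools, not AlphaMod's implementation; the function name `borda_scores`, the tool names, and the scores are hypothetical.

```python
def borda_scores(tool_scores):
    """Generic Borda count over several quality assessment tools.

    tool_scores: dict mapping tool name -> dict mapping model name -> score,
    where a higher score is assumed to mean a better model.
    Returns a dict mapping model name -> total Borda points.
    Ties are broken by insertion order, which is a simplification.
    """
    totals = {}
    for scores in tool_scores.values():
        # Rank models from worst to best; the worst model gets 0 points.
        ranked = sorted(scores, key=scores.get)
        for points, model in enumerate(ranked):
            totals[model] = totals.get(model, 0) + points
    return totals


# Made-up scores from three hypothetical assessment tools for three models.
example = {
    "tool_a": {"m1": 0.7, "m2": 0.9, "m3": 0.5},
    "tool_b": {"m1": 0.8, "m2": 0.6, "m3": 0.4},
    "tool_c": {"m1": 0.3, "m2": 0.9, "m3": 0.5},
}

totals = borda_scores(example)
best_model = max(totals, key=totals.get)
```

Aggregating ranks rather than raw scores avoids having to put tools with different scales on a common scale, which is one reason a rank-based composite can be used to pick a model when no reference structure is available.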
Affiliation(s)
- Fabio Hernan Gil Zuluaga
  - Department of Management & Innovation Systems, University of Salerno, Via Giovanni Paolo II, 132, 84084 Fisciano, SA, Italy
- Nancy D’Arminio
  - Department of Chemistry and Biology “A. Zambelli”, University of Salerno, Via Giovanni Paolo II, 132, 84084 Fisciano, SA, Italy
- Francesco Bardozzo
  - Department of Management & Innovation Systems, University of Salerno, Via Giovanni Paolo II, 132, 84084 Fisciano, SA, Italy
- Roberto Tagliaferri
  - Department of Management & Innovation Systems, University of Salerno, Via Giovanni Paolo II, 132, 84084 Fisciano, SA, Italy
- Anna Marabotti
  - Department of Chemistry and Biology “A. Zambelli”, University of Salerno, Via Giovanni Paolo II, 132, 84084 Fisciano, SA, Italy
12. Herzberg O, Moult J. More than just pattern recognition: Prediction of uncommon protein structure features by AI methods. Proc Natl Acad Sci U S A 2023; 120:e2221745120. [PMID: 37399411 PMCID: PMC10334792 DOI: 10.1073/pnas.2221745120] [Received: 12/22/2022] [Accepted: 06/01/2023] [Indexed: 07/05/2023]
Abstract
The CASP14 experiment demonstrated the extraordinary structure modeling capabilities of artificial intelligence (AI) methods. That result has ignited a fierce debate about what these methods are actually doing. One of the criticisms has been that the AI does not have any sense of the underlying physics but is merely performing pattern recognition. Here, we address that issue by analyzing the extent to which the methods identify rare structural motifs. The rationale underlying the approach is that a pattern recognition machine tends to choose the more frequently occurring motifs, whereas some sense of subtle energetic factors is required to choose infrequently occurring ones. To reduce the possibility of bias from related experimental structures and to minimize the effect of experimental errors, we examined only CASP14 target protein crystal structures determined to a resolution limit better than 2 Å, which lacked significant amino acid sequence homology to proteins of known structure. In those experimental structures and in the corresponding models, we track cis peptides, π-helices, 3₁₀-helices, and other small 3D motifs that occur in the PDB database at a frequency lower than 1% of total amino acid residues. The best-performing AI method, AlphaFold2, captured these uncommon structural elements exquisitely well. All discrepancies appeared to be a consequence of crystal environment effects. We propose that the neural network learned a protein structure potential of mean force, enabling it to correctly identify situations where unusual structural features represent the lowest local free energy because of subtle influences from the atomic environment.
Affiliation(s)
- Osnat Herzberg
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD 20850
- Chemistry and Biochemistry Department, University of Maryland, Chemistry Building, College Park, MD 20742
| | - John Moult
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD 20850
- Department of Cell Biology and Molecular Genetics, University of Maryland, Microbiology Building, College Park, MD 20742
|
13
|
Read RJ, Baker EN, Bond CS, Garman EF, van Raaij MJ. AlphaFold and the future of structural biology. Acta Crystallogr F Struct Biol Commun 2023; 79:166-168. [PMID: 37358500 PMCID: PMC10327576 DOI: 10.1107/s2053230x23004934] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/27/2023]
Abstract
This editorial acknowledges the transformative impact of new machine-learning methods, such as the use of AlphaFold, but also makes the case for the continuing need for experimental structural biology.
Affiliation(s)
- Randy J. Read
- Cambridge Institute for Medical Research, University of Cambridge, The Keith Peters Building, Hills Road, Cambridge CB2 0XY, United Kingdom
| | - Edward N. Baker
- School of Biological Sciences, University of Auckland, Auckland, New Zealand
| | - Charles S. Bond
- School of Molecular Sciences, University of Western Australia, 35 Stirling Highway, Crawley, WA 6009, Australia
| | - Elspeth F. Garman
- Department of Biochemistry, University of Oxford, Dorothy Crowfoot Hodgkin Building, South Parks Road, Oxford OX1 3QU, United Kingdom
| | - Mark J. van Raaij
- Departamento de Estructura de Macromoleculas, Centro Nacional de Biotecnologia, Consejo Superior de Investigaciones Cientificas, 28049 Madrid, Spain
|
14
|
Read RJ, Baker EN, Bond CS, Garman EF, van Raaij MJ. AlphaFold and the future of structural biology. IUCrJ 2023; 10:377-379. [PMID: 37358477 PMCID: PMC10324484 DOI: 10.1107/s2052252523004943] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/27/2023]
Abstract
This editorial acknowledges the transformative impact of new machine-learning methods, such as the use of AlphaFold, but also makes the case for the continuing need for experimental structural biology.
Affiliation(s)
- Randy J. Read
- Cambridge Institute for Medical Research, University of Cambridge, The Keith Peters Building, Hills Road, Cambridge CB2 0XY, United Kingdom
| | - Edward N. Baker
- School of Biological Sciences, University of Auckland, Auckland, New Zealand
| | - Charles S. Bond
- School of Molecular Sciences, University of Western Australia, 35 Stirling Highway, Crawley, WA 6009, Australia
| | - Elspeth F. Garman
- Department of Biochemistry, University of Oxford, Dorothy Crowfoot Hodgkin Building, South Parks Road, Oxford OX1 3QU, United Kingdom
| | - Mark J. van Raaij
- Departamento de Estructura de Macromoleculas, Centro Nacional de Biotecnologia, Consejo Superior de Investigaciones Cientificas, 28049 Madrid, Spain
|
15
|
Read RJ, Baker EN, Bond CS, Garman EF, van Raaij MJ. AlphaFold and the future of structural biology. Acta Crystallogr D Struct Biol 2023; 79:556-558. [PMID: 37378959 DOI: 10.1107/s2059798323004928] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 06/29/2023]
Abstract
This editorial acknowledges the transformative impact of new machine-learning methods, such as the use of AlphaFold, but also makes the case for the continuing need for experimental structural biology.
Affiliation(s)
- Randy J Read
- Cambridge Institute for Medical Research, University of Cambridge, The Keith Peters Building, Hills Road, Cambridge CB2 0XY, United Kingdom
| | - Edward N Baker
- School of Biological Sciences, University of Auckland, Auckland, New Zealand
| | - Charles S Bond
- School of Molecular Sciences, University of Western Australia, 35 Stirling Highway, Crawley, WA 6009, Australia
| | - Elspeth F Garman
- Department of Biochemistry, University of Oxford, Dorothy Crowfoot Hodgkin Building, South Parks Road, Oxford OX1 3QU, United Kingdom
| | - Mark J van Raaij
- Departamento de Estructura de Macromoleculas, Centro Nacional de Biotecnologia, Consejo Superior de Investigaciones Cientificas, 28049 Madrid, Spain
|
16
|
Zhang J, Schaeffer RD, Durham J, Cong Q, Grishin NV. DPAM: A domain parser for AlphaFold models. Protein Sci 2023; 32:e4548. [PMID: 36539305 PMCID: PMC9850437 DOI: 10.1002/pro.4548] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 08/12/2022] [Revised: 12/06/2022] [Accepted: 12/13/2022] [Indexed: 01/20/2023]
Abstract
The recent breakthroughs in structure prediction, where methods such as AlphaFold demonstrated near-atomic accuracy, herald a paradigm shift in structural biology. The 200 million high-accuracy models released in the AlphaFold Database are expected to guide protein science in the coming decades. Partitioning these AlphaFold models into domains and assigning them to an evolutionary hierarchy provide an efficient way to gain functional insights into proteins. However, classifying such a large number of predicted structures challenges the infrastructure of current structure classifications, including our Evolutionary Classification of protein Domains (ECOD). Better computational tools are urgently needed to parse and classify domains from AlphaFold models automatically. Here we present a Domain Parser for AlphaFold Models (DPAM) that can automatically recognize globular domains from these models based on inter-residue distances in 3D structures, predicted aligned errors, and ECOD domains found by sequence (HHsuite) and structural (Dali) similarity searches. Based on a benchmark of 18,759 AlphaFold models, we demonstrate that DPAM can recognize 98.8% of domains and assign correct boundaries for 87.5%, significantly outperforming structure-based domain parsers and homology-based domain assignment using ECOD domains found by HHsuite or Dali. Application of DPAM to the massive AlphaFold models will enable efficient classification of domains, providing evolutionary contexts and facilitating functional studies.
Affiliation(s)
- Jing Zhang
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Harold C. Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - R. Dustin Schaeffer
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - Jesse Durham
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Harold C. Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - Qian Cong
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Harold C. Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - Nick V. Grishin
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Department of Biochemistry, University of Texas Southwestern Medical Center, Dallas, Texas, USA
|
17
|
Heo L, Feig M. Multi-state modeling of G-protein coupled receptors at experimental accuracy. Proteins 2022; 90:1873-1885. [PMID: 35510704 PMCID: PMC9561049 DOI: 10.1002/prot.26382] [Citation(s) in RCA: 113] [Impact Index Per Article: 37.7] [Received: 01/06/2022] [Revised: 04/07/2022] [Accepted: 04/26/2022] [Indexed: 12/30/2022]
Abstract
The family of G-protein coupled receptors (GPCRs) is one of the largest protein families in the human genome. GPCRs transduce chemical signals from extracellular to intracellular regions via a conformational switch between active and inactive states upon ligand binding. While experimental structures of GPCRs remain limited, high-accuracy computational predictions are now possible with AlphaFold2. However, AlphaFold2 only predicts one state and is biased toward either the active or inactive conformation depending on the GPCR class. Here, a multi-state prediction protocol is introduced that extends AlphaFold2 to predict either active or inactive states at very high accuracy using state-annotated templated GPCR databases. The predicted models accurately capture the main structural changes upon activation of the GPCR at the atomic level. For most of the benchmarked GPCRs (10 out of 15), models in the active and inactive states were closer to their corresponding activation state structures. Median RMSDs of the transmembrane regions were 1.12 Å and 1.41 Å for the active and inactive state models, respectively. The models were more suitable for protein-ligand docking than the original AlphaFold2 models and template-based models. Finally, our prediction protocol predicted accurate GPCR structures and GPCR-peptide complex structures in GPCR Dock 2021, a blind GPCR-ligand complex modeling competition. We expect that high-accuracy GPCR models in both activation states will promote understanding of GPCR activation mechanisms and drug discovery for GPCRs. At the same time, the new protocol paves the way towards capturing the dynamics of proteins at high accuracy via machine-learning methods.
Affiliation(s)
- Lim Heo
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan, USA
| | - Michael Feig
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan, USA
|
18
|
Öten AM, Atak E, Taktak Karaca B, Fırtına S, Kutlu A. Discussing the roles of proline and glycine from the perspective of cold adaptation in lipases and cellulases. BIOCATAL BIOTRANSFOR 2022. [DOI: 10.1080/10242422.2022.2124111] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
Affiliation(s)
- Ahmet Melih Öten
- Biology Education Center, Faculty of Science and Technology, Uppsala University, Uppsala, Sweden
| | - Evren Atak
- Bioinformatics and System Biology, Bioengineering Department, Gebze Technical University, Kocaeli, Turkey
| | - Banu Taktak Karaca
- Molecular Biology & Genetics Department, Faculty of Natural Science and Engineering, Atlas University, Istanbul, Turkey
| | - Sinem Fırtına
- Bioinformatics & Genetics, Faculty of Natural Science and Engineering, İstinye University, Istanbul, Turkey
| | - Aslı Kutlu
- Bioinformatics & Genetics, Faculty of Natural Science and Engineering, İstinye University, Istanbul, Turkey
|
19
|
Della Corte D, Morris CJ, Billings WM, Stern J, Jarrett AJ, Hedelius B, Bennion A. Training undergraduate research assistants with an outcome-oriented and skill-based mentoring strategy. Acta Crystallogr D Struct Biol 2022; 78:936-944. [PMID: 35916219 PMCID: PMC9344475 DOI: 10.1107/s2059798322005861] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2021] [Accepted: 06/01/2022] [Indexed: 11/10/2022]
Abstract
Effective mentoring of undergraduate students is a growing requirement for the promotion of faculty at many universities. It is often challenging for young investigators to define a successful mentoring strategy, partially due to the absence of a broadly accepted definition of what mentoring should entail. To overcome this, an outcome-oriented mentoring framework was developed and used with more than 25 students over three years. It was found that a systematic mentoring approach can help students quickly realize their scientific potential and result in meaningful contributions to science. This report especially shows how the Critical Assessment of Protein Structure Prediction (CASP14) challenge was used to amplify student research efforts. As a result of this challenge, multiple publications, presentations and scholarships were awarded to the participating students. The mentoring framework continues to see much success in allowing undergraduate students, including students from underrepresented groups, to foster scientific talent and make meaningful contributions to the scientific community.
Affiliation(s)
- Dennis Della Corte
- Department of Physics and Astronomy, Brigham Young University, Provo, Utah, USA
| | - Connor J. Morris
- Department of Physics and Astronomy, Brigham Young University, Provo, Utah, USA
| | - Wendy M. Billings
- Department of Physics and Astronomy, Brigham Young University, Provo, Utah, USA
| | - Jacob Stern
- Department of Physics and Astronomy, Brigham Young University, Provo, Utah, USA
| | - Austin J. Jarrett
- Department of Physics and Astronomy, Brigham Young University, Provo, Utah, USA
| | - Bryce Hedelius
- Department of Physics and Astronomy, Brigham Young University, Provo, Utah, USA
| | - Adam Bennion
- Department of Physics and Astronomy, Brigham Young University, Provo, Utah, USA
|
20
|
Kinch LN, Cong Q, Jaishankar J, Orth K. Co-component signal transduction systems: Fast-evolving virulence regulation cassettes discovered in enteric bacteria. Proc Natl Acad Sci U S A 2022; 119:e2203176119. [PMID: 35648808 PMCID: PMC9214523 DOI: 10.1073/pnas.2203176119] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 02/21/2022] [Accepted: 04/08/2022] [Indexed: 01/31/2023]
Abstract
Bacterial signal transduction systems sense changes in the environment and transmit these signals to control cellular responses. The simplest one-component signal transduction systems include an input sensor domain and an output response domain encoded in a single protein chain. Alternatively, two-component signal transduction systems transmit signals by phosphorelay between input and output domains from separate proteins. The membrane-tethered periplasmic bile acid sensor that activates the Vibrio parahaemolyticus type III secretion system adopts an obligate heterodimer of two proteins encoded by partially overlapping VtrA and VtrC genes. This co-component signal transduction system binds bile acid using a lipocalin-like domain in VtrC and transmits the signal through the membrane to a cytoplasmic DNA-binding transcription factor in VtrA. Using the domain and operon organization of VtrA/VtrC, we identify a fast-evolving superfamily of co-component systems in enteric bacteria. Accurate machine learning–based fold predictions for the candidate co-components support their homology in the twilight zone of rapidly evolving sequences and provide mechanistic hypotheses about previously unrecognized lipid-sensing functions.
Affiliation(s)
- Lisa N. Kinch
- Department of Molecular Biology, University of Texas Southwestern Medical Center, Dallas, TX 75390
- HHMI, University of Texas Southwestern Medical Center, Dallas, TX 75390
| | - Qian Cong
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, TX 75390
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX 75390
| | - Jananee Jaishankar
- Department of Molecular Biology, University of Texas Southwestern Medical Center, Dallas, TX 75390
- HHMI, University of Texas Southwestern Medical Center, Dallas, TX 75390
| | - Kim Orth
- Department of Molecular Biology, University of Texas Southwestern Medical Center, Dallas, TX 75390
- HHMI, University of Texas Southwestern Medical Center, Dallas, TX 75390
- Department of Biochemistry, University of Texas Southwestern Medical Center, Dallas, TX 75390
|
21
|
Hong Y, Lee J, Ko J. A-Prot: protein structure modeling using MSA transformer. BMC Bioinformatics 2022; 23:93. [PMID: 35296230 PMCID: PMC8925138 DOI: 10.1186/s12859-022-04628-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/11/2021] [Accepted: 03/03/2022] [Indexed: 11/18/2022]
Abstract
Background: The accuracy of protein 3D structure prediction has been dramatically improved with the help of advances in deep learning. In the recent CASP14, DeepMind demonstrated that their new version of AlphaFold (AF) produces highly accurate 3D models close to experimental structures. The success of AF shows that the multiple sequence alignment of a sequence contains rich evolutionary information, leading to accurate 3D models. Despite the success of AF, only the prediction code is open, and training a similar model requires a vast amount of computational resources. Thus, developing a lighter prediction model is still necessary. Results: In this study, we propose a new protein 3D structure modeling method, A-Prot, using MSA Transformer, one of the state-of-the-art protein language models. An MSA feature tensor and row attention maps are extracted and converted into 2D residue-residue distance and dihedral angle predictions for a given MSA. We demonstrated that A-Prot predicts long-range contacts better than the existing methods. Additionally, we modeled the 3D structures of the free modeling and hard template-based modeling targets of CASP14. The assessment shows that the A-Prot models are more accurate than most top server groups of CASP14. Conclusion: These results imply that A-Prot accurately captures the evolutionary and structural information of proteins with relatively low computational cost. Thus, A-Prot can provide a clue for the development of other protein property prediction methods. Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-022-04628-8.
Affiliation(s)
- Yiyu Hong
- Arontier Co, Seoul, Republic of Korea
| | - Juyong Lee
- Arontier Co, Seoul, Republic of Korea; Department of Chemistry, Division of Chemistry and Biochemistry, Kangwon National University, Chuncheon, Republic of Korea
| | - Junsu Ko
- Arontier Co, Seoul, Republic of Korea
|
22
|
Ankrah NYD, Bernstein DB, Biggs M, Carey M, Engevik M, García-Jiménez B, Lakshmanan M, Pacheco AR, Sulheim S, Medlock GL. Enhancing Microbiome Research through Genome-Scale Metabolic Modeling. mSystems 2021; 6:e0059921. [PMID: 34904863 PMCID: PMC8670372 DOI: 10.1128/msystems.00599-21] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Indexed: 01/02/2023]
Abstract
Construction and analysis of genome-scale metabolic models (GEMs) is a well-established systems biology approach that can be used to predict metabolic and growth phenotypes. The ability of GEMs to produce mechanistic insight into microbial ecological processes makes them appealing tools that can open a range of exciting opportunities in microbiome research. Here, we briefly outline these opportunities, present current rate-limiting challenges for the trustworthy application of GEMs to microbiome research, and suggest approaches for moving the field forward.
Affiliation(s)
- Nana Y. D. Ankrah
- State University of New York at Plattsburgh, Plattsburgh, New York, USA
| | | | | | - Maureen Carey
- University of Virginia, Charlottesville, Virginia, USA
| | - Melinda Engevik
- Medical University of South Carolina, Charleston, South Carolina, USA
| | | | - Meiyappan Lakshmanan
- Bioprocessing Technology Institute, Agency for Science, Technology and Research (A*STAR), Singapore
- Bioinformatics Institute, Agency for Science, Technology and Research (A*STAR), Singapore
|
23
|
Kryshtafovych A, Schwede T, Topf M, Fidelis K, Moult J. Critical assessment of methods of protein structure prediction (CASP)-Round XIV. Proteins 2021; 89:1607-1617. [PMID: 34533838 PMCID: PMC8726744 DOI: 10.1002/prot.26237] [Citation(s) in RCA: 273] [Impact Index Per Article: 68.3] [Received: 07/23/2021] [Accepted: 07/28/2021] [Indexed: 01/14/2023]
Abstract
Critical assessment of structure prediction (CASP) is a community experiment to advance methods of computing three-dimensional protein structure from amino acid sequence. Core components are rigorous blind testing of methods and evaluation of the results by independent assessors. In the most recent experiment (CASP14), deep-learning methods from one research group consistently delivered computed structures rivaling the corresponding experimental ones in accuracy. In this sense, the results represent a solution to the classical protein-folding problem, at least for single proteins. The models have already been shown to be capable of providing solutions for problematic crystal structures, and there are broad implications for the rest of structural biology. Other research groups also substantially improved performance. Here, we describe these results and outline some of the many implications. Other related areas of CASP, including modeling of protein complexes, structure refinement, estimation of model accuracy, and prediction of inter-residue contacts and distances, are also described.
Affiliation(s)
- Andriy Kryshtafovych
- Genome Center, University of California, Davis, 451 Health Sciences Drive, Davis, CA 95616, USA
| | - Torsten Schwede
- University of Basel, Biozentrum & SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Maya Topf
- Centre for Structural Systems Biology, Leibniz-Institut für Experimentelle Virologie and Universitätsklinikum Hamburg-Eppendorf (UKE), Hamburg, Germany
| | - Krzysztof Fidelis
- Genome Center, University of California, Davis, 451 Health Sciences Drive, Davis, CA 95616, USA
| | - John Moult
- Institute for Bioscience and Biotechnology Research, 9600 Gudelsky Drive, Rockville, MD 20850, USA, Department of Cell Biology and Molecular Genetics, University of Maryland
|
24
|
Cragnolini T, Kryshtafovych A, Topf M. Cryo-EM targets in CASP14. Proteins 2021; 89:1949-1958. [PMID: 34398978 PMCID: PMC8630773 DOI: 10.1002/prot.26216] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/07/2021] [Revised: 07/27/2021] [Accepted: 08/06/2021] [Indexed: 11/22/2022]
Abstract
Structures of seven CASP14 targets were determined using cryo-electron microscopy (cryo-EM) technique with resolution between 2.1 and 3.8 Å. We provide an evaluation of the submitted models versus the experimental data (cryo-EM density maps) and experimental reference structures built into the maps. The accuracy of models is measured in terms of coordinate-to-density and coordinate-to-coordinate fit. A-posteriori refinement of the most accurate models in their corresponding cryo-EM density resulted in structures that are close to the reference structure, including some regions with better fit to the density. Regions that were found to be less "refineable" correlate well with regions of high diversity between the CASP models and low goodness-of-fit to density in the reference structure.
Affiliation(s)
- Tristan Cragnolini
- Institute of Structural and Molecular Biology, Birkbeck, University College London, London, UK
| | | | - Maya Topf
- Center for Structural Systems Biology, Leibniz-Institut für Experimentelle Virologie and Universitätsklinikum Hamburg-Eppendorf (UKE), Hamburg, Germany
|
25
|
Ruiz-Serra V, Pontes C, Milanetti E, Kryshtafovych A, Lepore R, Valencia A. Assessing the accuracy of contact and distance predictions in CASP14. Proteins 2021; 89:1888-1900. [PMID: 34595772 DOI: 10.1002/prot.26248] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 06/11/2021] [Revised: 09/06/2021] [Accepted: 09/21/2021] [Indexed: 12/26/2022]
Abstract
We present the results of the assessment of the intramolecular residue-residue contact and distance predictions from groups participating in the 14th round of the CASP experiment. The performance of contact prediction methods was evaluated with the measures used in previous CASPs, while distance predictions were assessed based on a new protocol, which considers individual distance pairs as well as the whole predicted distance matrix, using a graph-based framework. The results of the evaluation indicate that predictions by the tFold framework, TripletRes and DeepPotential were the most accurate in both categories. With regards to progress in method performance, the results of the assessment in contact prediction did not reveal any discernible difference when compared to CASP13. Arguably, this could be due to CASP14 FM targets being more challenging than ever before.
Affiliation(s)
| | - Camila Pontes
- Barcelona Supercomputing Center (BSC), Barcelona, Spain
| | - Edoardo Milanetti
- Department of Physics, Sapienza Università di Roma, Rome, Italy; Center for Life Nano- & Neuro-Science, Fondazione Istituto Italiano di Tecnologia (IIT), Rome, Italy
| | | | | | - Alfonso Valencia
- Barcelona Supercomputing Center (BSC), Barcelona, Spain; ICREA, Pg. Lluís Companys, Barcelona, Spain
|
26
|
Schaeffer RD, Kinch L, Kryshtafovych A, Grishin NV. Assessment of domain interactions in the fourteenth round of the Critical Assessment of Structure Prediction (CASP14). Proteins 2021; 89:1700-1710. [PMID: 34455641 PMCID: PMC8616818 DOI: 10.1002/prot.26225] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 04/30/2021] [Revised: 08/07/2021] [Accepted: 08/24/2021] [Indexed: 12/29/2022]
Abstract
The high accuracy of some CASP14 models at the domain level prompted a more detailed evaluation of structure predictions on whole targets. For the first time in critical assessment of structure prediction (CASP), we evaluated accuracy of difficult domain assembly in models submitted for multidomain targets where the community predicted individual evaluation units (EUs) with greater accuracy than full-length targets. Ten proteins with domain interactions that did not show evidence of conformational change and were not involved in significant oligomeric contacts were chosen as targets for the domain interaction assessment. Groups were ranked using complementary interaction scores (F1, QS score, and Jaccard coefficient), and their predictions were evaluated for their ability to correctly model inter-domain interfaces and overall protein folds. Target performance was broadly grouped into two clusters. The first consisted primarily of targets containing two EUs wherein predictors more broadly predicted domain positioning and interfacial contacts correctly. The other consisted of complex two-EU and three-EU targets where few predictors performed well. The highest ranked predictor, AlphaFold2, produced high-accuracy models on eight out of 10 targets. Their interdomain scores on three of these targets were significantly higher than all other groups and were responsible for their overall outperformance in the category. We further highlight the performance of AlphaFold2 and the next best group, BAKER-experimental on several interesting targets.
Affiliation(s)
- R Dustin Schaeffer
- Department of Biophysics, UT Southwestern Medical Center, Dallas, Texas, USA
| | - Lisa Kinch
- Howard Hughes Medical Institute, UT Southwestern Medical Center, Dallas, Texas, USA
| | - Andriy Kryshtafovych
- Protein Structure Prediction Center, Genome and Biomedical Sciences Facilities, University of California, Davis, California, USA
| | - Nick V Grishin
- Department of Biophysics, UT Southwestern Medical Center, Dallas, Texas, USA; Howard Hughes Medical Institute, UT Southwestern Medical Center, Dallas, Texas, USA
|
27
|
Millán C, Keegan RM, Pereira J, Sammito MD, Simpkin AJ, McCoy AJ, Lupas AN, Hartmann MD, Rigden DJ, Read RJ. Assessing the utility of CASP14 models for molecular replacement. Proteins 2021; 89:1752-1769. [PMID: 34387010 PMCID: PMC8881082 DOI: 10.1002/prot.26214] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Received: 06/18/2021] [Revised: 07/20/2021] [Accepted: 07/27/2021] [Indexed: 11/21/2022]
Abstract
The assessment of CASP models for utility in molecular replacement is a measure of their use in a valuable real‐world application. In CASP7, the metric for molecular replacement assessment involved full likelihood‐based molecular replacement searches; however, this restricted the assessable targets to crystal structures with only one copy of the target in the asymmetric unit, and to those where the search found the correct pose. In CASP10, full molecular replacement searches were replaced by likelihood‐based rigid‐body refinement of models superimposed on the target using the LGA algorithm, with the metric being the refined log‐likelihood‐gain (LLG) score. This enabled multi‐copy targets and very poor models to be evaluated, but a significant further issue remained: the requirement of diffraction data for assessment. We introduce here the relative‐expected‐LLG (reLLG), which is independent of diffraction data. This reLLG is also independent of any crystal form, and can be calculated regardless of the source of the target, be it X‐ray, NMR or cryo‐EM. We calibrate the reLLG against the LLG for targets in CASP14, showing that it is a robust measure of both model and group ranking. Like the LLG, the reLLG shows that accurate coordinate error estimates add substantial value to predicted models. We find that refinement by CASP groups can often convert an inadequate initial model into a successful MR search model. Consistent with findings from others, we show that the AlphaFold2 models are sufficiently good, and reliably so, to surpass other current model generation strategies for attempting molecular replacement phasing.
Affiliation(s)
- Claudia Millán
  - Department of Haematology, University of Cambridge, Cambridge Institute for Medical Research, Cambridge, United Kingdom
- Ronan M Keegan
  - Scientific Computing Dept., Science and Technologies Facilities Council, UK Research and Innovation, Didcot, Oxfordshire, United Kingdom
- Joana Pereira
  - Max Planck Institute for Developmental Biology, Max-Planck-Ring 5, Tübingen, Germany
- Massimo D Sammito
  - Department of Haematology, University of Cambridge, Cambridge Institute for Medical Research, Cambridge, United Kingdom
- Adam J Simpkin
  - Institute of Systems, Molecular and Integrative Biology, Biosciences Building, Crown Street, Liverpool L69 7BE, United Kingdom
- Airlie J McCoy
  - Department of Haematology, University of Cambridge, Cambridge Institute for Medical Research, Cambridge, United Kingdom
- Andrei N Lupas
  - Max Planck Institute for Developmental Biology, Max-Planck-Ring 5, Tübingen, Germany
- Marcus D Hartmann
  - Max Planck Institute for Developmental Biology, Max-Planck-Ring 5, Tübingen, Germany
- Daniel J Rigden
  - Institute of Systems, Molecular and Integrative Biology, Biosciences Building, Crown Street, Liverpool L69 7BE, United Kingdom
- Randy J Read
  - Department of Haematology, University of Cambridge, Cambridge Institute for Medical Research, Cambridge, United Kingdom
|
28
|
Kinch LN, Pei J, Kryshtafovych A, Schaeffer RD, Grishin NV. Topology evaluation of models for difficult targets in the 14th round of the critical assessment of protein structure prediction. Proteins 2021; 89:1673-1686. [PMID: 34240477 DOI: 10.1002/prot.26172] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 05/01/2021] [Revised: 06/28/2021] [Accepted: 07/01/2021] [Indexed: 12/25/2022]
Abstract
This report describes the tertiary structure prediction assessment of difficult modeling targets in the 14th round of the Critical Assessment of Structure Prediction (CASP14). We implemented an official ranking scheme that used the same scores as the previous CASP topology-based assessment, but combined these scores with one that emphasized physically realistic models. The top performing AlphaFold2 group outperformed the rest of the prediction community on all but two of the difficult targets considered in this assessment. They provided high quality models for most of the targets (86% over GDT_TS 70), including larger targets above 150 residues, and they correctly predicted the topology of almost all the rest. AlphaFold2 performance was followed by two manual Baker methods, a Feig method that refined Zhang-server models, two notable automated Zhang server methods (QUARK and Zhang-server), and a Zhang manual group. Despite the remarkable progress in protein structure prediction of difficult targets, both the prediction community and AlphaFold2, to a lesser extent, faced challenges with flexible regions and obligate oligomeric assemblies. The official ranking of top-performing methods was supported by performance generated PCA and heatmap clusters that gave insight into target difficulties and the most successful state-of-the-art structure prediction methodologies.
Affiliation(s)
- Lisa N Kinch
  - Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Jimin Pei
  - Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- R Dustin Schaeffer
  - Department of Biophysics and Biochemistry, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Nick V Grishin
  - Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas, USA; Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, USA; Department of Biochemistry, University of Texas Southwestern Medical Center, Dallas, Texas, USA
|