1. Liang F, Sun M, Xie L, Zhao X, Liu D, Zhao K, Zhang G. Recent advances and challenges in protein complex model accuracy estimation. Comput Struct Biotechnol J 2024; 23:1824-1832. [PMID: 38707538 PMCID: PMC11066466 DOI: 10.1016/j.csbj.2024.04.049] [Received: 01/27/2024] [Revised: 04/18/2024] [Accepted: 04/18/2024] [Indexed: 05/07/2024] [Open Access]
Abstract
Estimation of model accuracy (EMA) plays a crucial role in protein structure prediction, aiming to evaluate the quality of predicted protein structure models accurately and objectively. This process is not only key to screening candidate models that are close to the real structure, but also provides guidance for further optimization of protein structures. With the significant advancements made by AlphaFold2 in monomer structure prediction, the problem of single-domain protein structure prediction has been largely solved. Correspondingly, the importance of assessing the quality of single-domain protein models has decreased, and the research focus has shifted to estimation of model accuracy of protein complexes. In this review, our goal is to provide a comprehensive overview of the reference and statistical metrics, as well as representative methods, and the current challenges within four distinct facets (Topology Global Score, Interface Total Score, Interface Residue-Wise Score, and Tertiary Residue-Wise Score) in the field of complex EMA.
Affiliation(s)
- Lei Xie
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Xuanfeng Zhao
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Dong Liu
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Kailong Zhao
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Guijun Zhang
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
2. Li M, Qing R, Tao F, Xu P, Zhang S. Inhibitory effect of truncated isoforms on GPCR dimerization predicted by combinatorial computational strategy. Comput Struct Biotechnol J 2024; 23:278-286. [PMID: 38173876 PMCID: PMC10762321 DOI: 10.1016/j.csbj.2023.12.008] [Received: 10/20/2023] [Revised: 12/07/2023] [Accepted: 12/07/2023] [Indexed: 01/05/2024] [Open Access]
Abstract
G protein-coupled receptors (GPCRs) play a pivotal role in fundamental biological processes and disease development. GPCR isoforms, derived from alternative splicing, can exhibit distinct signaling patterns. Some highly truncated isoforms can affect the functional performance of full-length receptors, suggesting intriguing regulatory roles. However, how these truncated isoforms interact with their full-length counterparts remains largely unexplored. Here, we computationally investigated the interaction patterns of three human GPCRs from three different classes, ADORA1 (Class A), mGlu2 (Class C) and SMO (Class F), with their respective truncated isoforms. These receptors were chosen because their homodimer structures have been experimentally determined and their truncated isoforms have been deposited and identified at the protein level in the UniProt database. Combining the neural network-based AlphaFold2 and two physics-based protein-protein docking tools, we generated multiple complex structures and assessed the binding affinity in the context of atomistic molecular dynamics simulations. Our computational results suggested that all four studied truncated isoforms showed potent binding to their counterparts and interfaces overlapping with those of the homodimers, indicating their strong potential to block homodimerization of their counterparts. Our study offers insights into the functional significance of GPCR truncated isoforms and supports the ubiquity of their regulatory roles.
Affiliation(s)
- Mengke Li
  - State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic and Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
  - Laboratory of Molecular Architecture, Media Lab, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
- Rui Qing
  - State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic and Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Fei Tao
  - State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic and Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Ping Xu
  - State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic and Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Shuguang Zhang
  - Laboratory of Molecular Architecture, Media Lab, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
3. Zheng F, Jiang X, Wen Y, Yang Y, Li M. Systematic investigation of machine learning on limited data: A study on predicting protein-protein binding strength. Comput Struct Biotechnol J 2024; 23:460-472. [PMID: 38235359 PMCID: PMC10792694 DOI: 10.1016/j.csbj.2023.12.018] [Received: 10/03/2023] [Revised: 12/14/2023] [Accepted: 12/16/2023] [Indexed: 01/19/2024] [Open Access]
Abstract
The application of machine learning techniques in biological research, especially when dealing with limited data availability, poses significant challenges. In this study, we leveraged advancements in method development for predicting protein-protein binding strength to conduct a systematic investigation into the application of machine learning on limited data. The binding strength, quantitatively measured as binding affinity, is vital for understanding the processes of recognition, association, and dysfunction that occur within protein complexes. By incorporating transfer learning, integrating domain knowledge, and employing both deep learning and traditional machine learning algorithms, we mitigated the impact of data limitations and made significant advancements in predicting protein-protein binding affinity. In particular, we developed over 20 models, ultimately selecting three representative best-performing ones that belong to distinct categories. The first model is structure-based, consisting of a random forest regression and thirteen handcrafted features. The second model is sequence-based, employing an architecture that combines transferred embedding features with a multilayer perceptron. Finally, we created an ensemble model by averaging the predictions of the two aforementioned models. The comparison with other predictors on three independent datasets confirms the significant improvements achieved by our models in predicting protein-protein binding affinity. The programs for running these three models are available at https://github.com/minghuilab/BindPPI.
Affiliation(s)
- Feifan Zheng
  - MOE Key Laboratory of Geriatric Diseases and Immunology, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou, Jiangsu Province 215123, China
- Xin Jiang
  - MOE Key Laboratory of Geriatric Diseases and Immunology, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou, Jiangsu Province 215123, China
- Yuhao Wen
  - MOE Key Laboratory of Geriatric Diseases and Immunology, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou, Jiangsu Province 215123, China
- Yan Yang
  - MOE Key Laboratory of Geriatric Diseases and Immunology, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou, Jiangsu Province 215123, China
- Minghui Li
  - MOE Key Laboratory of Geriatric Diseases and Immunology, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou, Jiangsu Province 215123, China
4. Urvas L, Chiesa L, Bret G, Jacquemard C, Kellenberger E. Benchmarking AlphaFold-Generated Structures of Chemokine-Chemokine Receptor Complexes. J Chem Inf Model 2024; 64:4587-4600. [PMID: 38809680 DOI: 10.1021/acs.jcim.3c01835] [Indexed: 05/31/2024]
Abstract
AlphaFold and AlphaFold-Multimer have become two essential tools for the modeling of unknown structures of proteins and protein complexes. In this work, we extensively benchmarked the quality of chemokine-chemokine receptor structures generated by AlphaFold-Multimer against experimentally determined structures. Our analysis considered both the global quality of the model, as well as key structural features for chemokine recognition. To study the effects of template and multiple sequence alignment parameters on the results, a new prediction pipeline called LIT-AlphaFold (https://github.com/LIT-CCM-lab/LIT-AlphaFold) was developed, allowing extensive input customization. AlphaFold-Multimer correctly predicted differences in chemokine binding orientation and accurately reproduced the unique binding orientation of the CXCL12-ACKR3 complex. Further, the predictions of the full receptor N-terminus provided insights into a putative chemokine recognition site 0.5. The accuracy of chemokine N-terminus binding mode prediction varied between complexes, but the confidence score permitted the distinguishing of residues that were very likely well positioned. Finally, we generated a high-confidence model of the unsolved CXCL12-CXCR4 complex, which agreed with experimental mutagenesis and cross-linking data.
Affiliation(s)
- Lauri Urvas
  - Laboratoire d'Innovation Thérapeutique, UMR 7200 CNRS, Université de Strasbourg, 67400 Illkirch, France
- Luca Chiesa
  - Laboratoire d'Innovation Thérapeutique, UMR 7200 CNRS, Université de Strasbourg, 67400 Illkirch, France
- Guillaume Bret
  - Laboratoire d'Innovation Thérapeutique, UMR 7200 CNRS, Université de Strasbourg, 67400 Illkirch, France
- Célien Jacquemard
  - Laboratoire d'Innovation Thérapeutique, UMR 7200 CNRS, Université de Strasbourg, 67400 Illkirch, France
- Esther Kellenberger
  - Laboratoire d'Innovation Thérapeutique, UMR 7200 CNRS, Université de Strasbourg, 67400 Illkirch, France
5. Chen J, Li Q, Xia S, Arsala D, Sosa D, Wang D, Long M. The Rapid Evolution of De Novo Proteins in Structure and Complex. Genome Biol Evol 2024; 16:evae107. [PMID: 38753069 PMCID: PMC11149777 DOI: 10.1093/gbe/evae107] [Accepted: 05/10/2024] [Indexed: 06/06/2024] [Open Access]
Abstract
Recent genome-wide studies in rice have established that de novo genes, evolving from noncoding sequences, enhance protein diversity through a stepwise process. However, the pattern and rate of their evolution in protein structure over time remain unclear. Here, we addressed these issues within a surprisingly short evolutionary timescale (<1 million years for 97% of Oryza de novo genes) with comparative approaches to gene duplicates. We found that de novo genes evolve faster than gene duplicates in the intrinsically disordered regions (such as random coils), secondary structure elements (such as α helix and β strand), hydrophobicity, and molecular recognition features. In de novo proteins, specifically, we observed an 8% to 14% decay in random coils and intrinsically disordered region lengths and a 2.3% to 6.5% increase in structured elements, hydrophobicity, and molecular recognition features, per million years on average. These patterns of structural evolution align with changes in amino acid composition over time as well. We also revealed higher positive charges but smaller molecular weights for de novo proteins than duplicates. Tertiary structure predictions showed that most de novo proteins, though not typically well folded on their own, readily form low-energy and compact complexes with other proteins facilitated by extensive residue contacts and conformational flexibility, suggesting a faster-binding scenario in de novo proteins to promote interaction. These analyses illuminate a rapid evolution of protein structure in de novo genes in rice genomes, originating from noncoding sequences, highlighting their quick transformation into active, protein complex-forming components within a remarkably short evolutionary timeframe.
Affiliation(s)
- Jianhai Chen
  - Department of Ecology and Evolution, The University of Chicago, Chicago, IL 60637, USA
- Qingrong Li
  - Division of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA 92093, USA
  - Department of Cellular & Molecular Medicine, School of Medicine, University of California San Diego, La Jolla, CA 92093, USA
- Shengqian Xia
  - Department of Ecology and Evolution, The University of Chicago, Chicago, IL 60637, USA
- Deanna Arsala
  - Department of Ecology and Evolution, The University of Chicago, Chicago, IL 60637, USA
- Dylan Sosa
  - Department of Ecology and Evolution, The University of Chicago, Chicago, IL 60637, USA
- Dong Wang
  - Division of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA 92093, USA
  - Department of Cellular & Molecular Medicine, School of Medicine, University of California San Diego, La Jolla, CA 92093, USA
- Manyuan Long
  - Department of Ecology and Evolution, The University of Chicago, Chicago, IL 60637, USA
6. Carugo O. Accuracy of AlphaFold models: Comparison with short N…O contacts in atomic resolution protein crystal structures. Comput Biol Chem 2024; 110:108069. [PMID: 38581839 DOI: 10.1016/j.compbiolchem.2024.108069] [Received: 01/16/2024] [Revised: 03/29/2024] [Accepted: 04/04/2024] [Indexed: 04/08/2024]
Abstract
Artificial intelligence (AI) has revolutionized structural biology by predicting protein 3D structures with near-experimental accuracy. Here, short backbone N-O distances in high-resolution crystal structures were compared to those in three-dimensional models based on AI AlphaFold/ColabFold, specifically considering their estimated standard errors. Experimental and computationally modeled distances very often differ significantly, showing that these models' precision is inadequate to reproduce experimental results at high resolution. T-tests and normal probability plots showed that these computational methods predict atomic position standard errors 3.5-6 times bigger than experimental errors. SYNOPSIS: Positional standard errors in AI-based protein 3D models are 3.5-6 times larger than in atomic resolution crystal structures.
Affiliation(s)
- Oliviero Carugo
  - Department of Chemistry, University of Pavia, Pavia, Italy
  - Max Perutz Labs University of Vienna, Department of Structural and Computational Biology, Vienna, Austria
7. Shor B, Schneidman-Duhovny D. Integrative modeling meets deep learning: Recent advances in modeling protein assemblies. Curr Opin Struct Biol 2024; 87:102841. [PMID: 38795564 DOI: 10.1016/j.sbi.2024.102841] [Received: 02/29/2024] [Revised: 04/24/2024] [Accepted: 04/27/2024] [Indexed: 05/28/2024]
Abstract
Recent progress in protein structure prediction based on deep learning revolutionized the field of Structural Biology. Beyond single proteins, it also enabled high-throughput prediction of structures of protein-protein interactions. Despite the success in predicting complex structures, large macromolecular assemblies still require specialized approaches. Here we describe recent advances in modeling macromolecular assemblies using integrative and hierarchical approaches. We highlight applications that predict protein-protein interactions and challenges in modeling complexes based on the interaction networks, including the prediction of complex stoichiometry and heterogeneity.
Affiliation(s)
- Ben Shor
  - The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel. https://twitter.com/ben_shor
- Dina Schneidman-Duhovny
  - The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel.
8. Xia S, Li D, Deng X, Liu Z, Zhu H, Liu Y, Li D. Integration of protein sequence and protein-protein interaction data by hypergraph learning to identify novel protein complexes. Brief Bioinform 2024; 25:bbae274. [PMID: 38851299 PMCID: PMC11162299 DOI: 10.1093/bib/bbae274] [Received: 02/22/2024] [Revised: 05/22/2024] [Accepted: 05/24/2024] [Indexed: 06/10/2024] [Open Access]
Abstract
Protein-protein interactions (PPIs) are the basis of many important biological processes, with protein complexes being the key forms implementing these interactions. Understanding protein complexes and their functions is critical for elucidating mechanisms of life processes, disease diagnosis and treatment and drug development. However, experimental methods for identifying protein complexes have many limitations. Therefore, it is necessary to use computational methods to predict protein complexes. Protein sequences can indicate the structure and biological functions of proteins, while also determining their binding abilities with other proteins, influencing the formation of protein complexes. Integrating these characteristics to predict protein complexes is very promising, but currently there is no effective framework that can utilize both protein sequence and PPI network topology for complex prediction. To address this challenge, we have developed HyperGraphComplex, a method based on a hypergraph variational autoencoder that can capture expressive features from protein sequences without feature engineering, while also considering topological properties in PPI networks, to predict protein complexes. Experimental results demonstrated that HyperGraphComplex achieves satisfactory predictive performance when compared with state-of-the-art methods. Further bioinformatics analysis shows that the predicted protein complexes have similar attributes to known ones. Moreover, case studies corroborated the remarkable predictive capability of our model in identifying protein complexes, including 3 that were not only experimentally validated by recent studies but also exhibited high-confidence structural predictions from AlphaFold-Multimer.
We believe that the HyperGraphComplex algorithm and our provided proteome-wide high-confidence protein complex prediction dataset will help elucidate how proteins regulate cellular processes in the form of complexes, and facilitate disease diagnosis and treatment and drug development. Source codes are available at https://github.com/LiDlab/HyperGraphComplex.
Affiliation(s)
- Simin Xia
  - School of Basic Medical Sciences, Anhui Medical University, 81 Meishan Road, Shushan District, Hefei 230032, China
  - State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, 38 Life Science Park, Changping District, Beijing 102206, China
- Dianke Li
  - State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, 38 Life Science Park, Changping District, Beijing 102206, China
  - State Key Laboratory of Farm Animal Biotech Breeding, College of Biological Sciences, China Agricultural University, 2 Yuanmingyuan West Road, Haidian District, Beijing 100193, China
- Xinru Deng
  - State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, 38 Life Science Park, Changping District, Beijing 102206, China
- Zhongyang Liu
  - State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, 38 Life Science Park, Changping District, Beijing 102206, China
- Huaqing Zhu
  - School of Basic Medical Sciences, Anhui Medical University, 81 Meishan Road, Shushan District, Hefei 230032, China
- Yuan Liu
  - State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, 38 Life Science Park, Changping District, Beijing 102206, China
- Dong Li
  - School of Basic Medical Sciences, Anhui Medical University, 81 Meishan Road, Shushan District, Hefei 230032, China
  - State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, 38 Life Science Park, Changping District, Beijing 102206, China
9. Bosart K, Petreaca RC, Bouley RA. In silico analysis of several frequent SLX4 mutations appearing in human cancers. microPublication Biology 2024; 2024:10.17912/micropub.biology.001216. [PMID: 38828439 PMCID: PMC11143449 DOI: 10.17912/micropub.biology.001216] [Received: 04/30/2024] [Revised: 05/16/2024] [Accepted: 05/16/2024] [Indexed: 06/05/2024]
Abstract
SLX4 is an interactor and activator of structure-specific exonuclease that helps resolve tangled recombination intermediates arising at stalled replication forks. It is one of the many factors that assist with homologous recombination, the major mechanism for restarting replication. SLX4 mutations have been reported in many cancers, but a pan-cancer map of all the mutations has not been produced. Here, using data from the Catalogue of Somatic Mutations in Cancers (COSMIC), we show that mutations occur in almost every cancer and that many of them truncate the protein, which should severely alter the function of the enzyme. We identified a frequent R1779W point mutation that occurs in the SLX4 domain required for heterodimerization with its partner, SLX1. In silico protein structure analysis of this mutation shows that it significantly alters the protein structure and is likely to destabilize the interaction with SLX1. Although this brief communication is limited to in silico analysis, it identifies certain high-frequency SLX4 mutations in human cancers that would warrant further in vivo studies. Additionally, these mutations may be potentially actionable for drug therapies.
Affiliation(s)
- Korey Bosart
  - James Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio, United States
- Ruben C Petreaca
  - James Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio, United States
  - Molecular Genetics, The Ohio State University at Marion, Marion, Ohio, United States
- Renee A Bouley
  - Chemistry and Biochemistry, The Ohio State University at Marion, Marion, Ohio, United States
10. Tominaga K, Ozaki S, Sato S, Katayama T, Nishimura Y, Omae K, Iwasaki W. Frequent nonhomologous replacement of replicative helicase loaders by viruses in Vibrionaceae. Proc Natl Acad Sci U S A 2024; 121:e2317954121. [PMID: 38683976 PMCID: PMC11087808 DOI: 10.1073/pnas.2317954121] [Received: 10/18/2023] [Accepted: 03/14/2024] [Indexed: 05/02/2024] [Open Access]
Abstract
Several microbial genomes lack textbook-defined essential genes. If an essential gene is absent from a genome, then an evolutionarily independent gene of unknown function complements its function. Here, we identified frequent nonhomologous replacement of an essential component of DNA replication initiation, a replicative helicase loader gene, in Vibrionaceae. Our analysis of Vibrionaceae genomes revealed two genes of unknown function, named vdhL1 and vdhL2, that were substantially enriched in genomes without the known helicase-loader genes. These genes showed no sequence similarity to genes of known function but encoded proteins structurally similar to a viral helicase loader. Analyses of genomic syntenies and coevolution with helicase genes suggested that vdhL1/2 encodes a helicase loader. The in vitro assay showed that Vibrio harveyi VdhL1 and Vibrio ezurae VdhL2 promote the helicase activity of DnaB. Furthermore, molecular phylogenetics suggested that vdhL1/2 were derived from phages and replaced an intrinsic helicase loader gene of Vibrionaceae over 20 times. This high replacement frequency implies the host's advantage in acquiring a viral helicase loader gene.
Affiliation(s)
- Kento Tominaga
  - Department of Integrated Biosciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-0882, Japan
- Shogo Ozaki
  - Department of Molecular Biology, Graduate School of Pharmaceutical Sciences, Kyushu University, Fukuoka 812-8582, Japan
- Shohei Sato
  - Department of Molecular Biology, Graduate School of Pharmaceutical Sciences, Kyushu University, Fukuoka 812-8582, Japan
- Tsutomu Katayama
  - Department of Molecular Biology, Graduate School of Pharmaceutical Sciences, Kyushu University, Fukuoka 812-8582, Japan
- Yuki Nishimura
  - Department of Integrated Biosciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-0882, Japan
- Kimiho Omae
  - Department of Integrated Biosciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-0882, Japan
- Wataru Iwasaki
  - Department of Integrated Biosciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-0882, Japan
  - Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo 113-0032, Japan
  - Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-0882, Japan
  - Atmosphere and Ocean Research Institute, The University of Tokyo, Chiba 277-8564, Japan
  - Institute for Quantitative Biosciences, The University of Tokyo, Tokyo 113-0032, Japan
  - Collaborative Research Institute for Innovative Microbiology, The University of Tokyo, Tokyo 113-8657, Japan
11. Launay R, Chobert SC, Abby SS, Pierrel F, André I, Esque J. Structural Reconstruction of E. coli Ubi Metabolon Using an AlphaFold2-Based Computational Framework. J Chem Inf Model 2024. [PMID: 38710096 DOI: 10.1021/acs.jcim.4c00304] [Indexed: 05/08/2024]
Abstract
Ubiquinone (UQ) is a redox polyisoprenoid lipid found in the membranes of bacteria and eukaryotes that has important roles, notably one in respiratory metabolism, which sustains cellular bioenergetics. In Escherichia coli, several steps of UQ biosynthesis take place in the cytosol. These reactions are carried out by a supramolecular assembly called the Ubi metabolon. The latter is composed of seven proteins (UbiE, UbiG, UbiF, UbiH, UbiI, UbiJ, and UbiK), and both its structural organization and its protein stoichiometry are unknown. In this study, a computational framework has been designed to predict the structure of this macromolecular assembly. In several successive steps, we explored the possible protein interactions as well as the protein stoichiometry, to finally obtain a structural organization of the complex. The use of AlphaFold2-based methods combined with evolutionary information enabled us to predict several models whose quality and confidence were further analyzed using different metrics and scores. Our work led to the identification of a "core assembly" that will guide functional and structural characterization of the Ubi metabolon.
Affiliation(s)
- Romain Launay
  - Toulouse Biotechnology Institute, TBI, Université de Toulouse, CNRS, INRAE, INSA, 31077 Toulouse, France
- Sophie-Carole Chobert
  - Univ. Grenoble Alpes, CNRS, UMR 5525, VetAgro Sup, Grenoble INP, TIMC, 38000 Grenoble, France
- Sophie S Abby
  - Univ. Grenoble Alpes, CNRS, UMR 5525, VetAgro Sup, Grenoble INP, TIMC, 38000 Grenoble, France
- Fabien Pierrel
  - Univ. Grenoble Alpes, CNRS, UMR 5525, VetAgro Sup, Grenoble INP, TIMC, 38000 Grenoble, France
- Isabelle André
  - Toulouse Biotechnology Institute, TBI, Université de Toulouse, CNRS, INRAE, INSA, 31077 Toulouse, France
- Jérémy Esque
  - Toulouse Biotechnology Institute, TBI, Université de Toulouse, CNRS, INRAE, INSA, 31077 Toulouse, France
12. Azzam T, Du JJ, Flowers MW, Ali AV, Hunn JC, Vijayvargiya N, Knagaram R, Bogacz M, Maravillas KE, Sastre DE, Fields JK, Mirzaei A, Pierce BG, Sundberg EJ. Combinatorially restricted computational design of protein-protein interfaces to produce IgG heterodimers. Science Advances 2024; 10:eadk8157. [PMID: 38598628 PMCID: PMC11006224 DOI: 10.1126/sciadv.adk8157] [Received: 09/20/2023] [Accepted: 03/08/2024] [Indexed: 04/12/2024]
Abstract
Redesigning protein-protein interfaces is an important tool for developing therapeutic strategies. Interfaces can be redesigned by in silico screening, which allows for efficient sampling of a large protein space before experimental validation. However, computational costs limit the number of combinations that can be reasonably sampled. Here, we present combinatorial tyrosine (Y)/serine (S) selection (combYSelect), a computational approach combining in silico determination of the change in binding free energy (ΔΔG) of an interface with a highly restricted library composed of just two amino acids, tyrosine and serine. We used combYSelect to design two immunoglobulin G (IgG) heterodimers, combYSelect1 (L368S/D399Y-K409S/T411Y) and combYSelect2 (D399Y/K447S-K409S/T411Y), that exhibit near-optimal heterodimerization without affecting IgG stability or function. We solved the crystal structures of these heterodimers and found that dynamic π-stacking interactions and polar contacts drive preferential heterodimeric interactions. Finally, we demonstrated the utility of our combYSelect heterodimers by engineering both a bispecific antibody and a cytokine trap for two unique therapeutic applications.
Affiliation(s)
- Tala Azzam
- Department of Biochemistry, Emory University School of Medicine, Atlanta, GA 30322, USA
- Jonathan J. Du
- Department of Biochemistry, Emory University School of Medicine, Atlanta, GA 30322, USA
- Maria W. Flowers
- Department of Biochemistry, Emory University School of Medicine, Atlanta, GA 30322, USA
- Adeela V. Ali
- Department of Biochemistry, Emory University School of Medicine, Atlanta, GA 30322, USA
- Jeremy C. Hunn
- Department of Biochemistry, Emory University School of Medicine, Atlanta, GA 30322, USA
- Nina Vijayvargiya
- Department of Biochemistry, Emory University School of Medicine, Atlanta, GA 30322, USA
- Rushil Knagaram
- Department of Biochemistry, Emory University School of Medicine, Atlanta, GA 30322, USA
- Marek Bogacz
- Department of Biochemistry, Emory University School of Medicine, Atlanta, GA 30322, USA
- Kino E. Maravillas
- Department of Biochemistry, Emory University School of Medicine, Atlanta, GA 30322, USA
- Diego E. Sastre
- Department of Biochemistry, Emory University School of Medicine, Atlanta, GA 30322, USA
- James K. Fields
- Institute of Human Virology, University of Maryland School of Medicine, Baltimore, MD 21201, USA
- Ardalan Mirzaei
- Sydney Pharmacy School, Faculty of Medicine and Health, The University of Sydney, Camperdown, NSW, Australia
- Brian G. Pierce
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20850, USA
- Eric J. Sundberg
- Department of Biochemistry, Emory University School of Medicine, Atlanta, GA 30322, USA
13
Schmid EW, Walter JC. Predictomes: A classifier-curated database of AlphaFold-modeled protein-protein interactions. bioRxiv 2024:2024.04.09.588596. [PMID: 38645019 PMCID: PMC11030396 DOI: 10.1101/2024.04.09.588596] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/23/2024]
Abstract
Protein-protein interactions (PPIs) are ubiquitous in biology, yet a comprehensive structural characterization of the PPIs underlying biochemical processes is lacking. Although AlphaFold-Multimer (AF-M) has the potential to fill this knowledge gap, standard AF-M confidence metrics do not reliably separate relevant PPIs from an abundance of false positive predictions. To address this limitation, we used machine learning on well curated datasets to train a Structure Prediction and Omics informed Classifier called SPOC that shows excellent performance in separating true and false PPIs, including in proteome-wide screens. We applied SPOC to an all-by-all matrix of nearly 300 human genome maintenance proteins, generating ~40,000 predictions that can be viewed at predictomes.org, where users can also score their own predictions with SPOC. High confidence PPIs discovered using our approach suggest novel hypotheses in genome maintenance. Our results provide a framework for interpreting large scale AF-M screens and help lay the foundation for a proteome-wide structural interactome.
Affiliation(s)
- Ernst W. Schmid
- Department of Biological Chemistry & Molecular Pharmacology, Blavatnik Institute, Harvard Medical School, Boston, MA 02115, USA
- Johannes C. Walter
- Department of Biological Chemistry & Molecular Pharmacology, Blavatnik Institute, Harvard Medical School, Boston, MA 02115, USA
- Howard Hughes Medical Institute, Boston, MA 02115, USA
14
Goh KJ, Stubenrauch CJ, Lithgow T. The TAM, a Translocation and Assembly Module for protein assembly and potential conduit for phospholipid transfer. EMBO Rep 2024; 25:1711-1720. [PMID: 38467907 PMCID: PMC11014939 DOI: 10.1038/s44319-024-00111-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Revised: 02/08/2024] [Accepted: 02/20/2024] [Indexed: 03/13/2024] Open
Abstract
The assembly of β-barrel proteins into the bacterial outer membrane is an essential process enabling the colonization of new environmental niches. The TAM was discovered as a module of the β-barrel protein assembly machinery; it is a heterodimeric complex composed of an outer membrane protein (TamA) bound to an inner membrane protein (TamB). The TAM spans the periplasm, providing a scaffold through the peptidoglycan layer and catalyzing the translocation and assembly of β-barrel proteins into the outer membrane. Recently, studies on another membrane protein (YhdP) have suggested that TamB might play a role in phospholipid transport to the outer membrane. Here we review and re-evaluate the literature covering the experimental studies on the TAM over the past decade, to reconcile what appear to be conflicting claims on the function of the TAM.
Affiliation(s)
- Kwok Jian Goh
- Centre to Impact AMR, Monash University, Melbourne, VIC, 3800, Australia
- Infection Program, Biomedicine Discovery Institute and Department of Microbiology, Monash University, Melbourne, VIC, 3800, Australia
- Christopher J Stubenrauch
- Centre to Impact AMR, Monash University, Melbourne, VIC, 3800, Australia
- Infection Program, Biomedicine Discovery Institute and Department of Microbiology, Monash University, Melbourne, VIC, 3800, Australia
- Trevor Lithgow
- Centre to Impact AMR, Monash University, Melbourne, VIC, 3800, Australia
- Infection Program, Biomedicine Discovery Institute and Department of Microbiology, Monash University, Melbourne, VIC, 3800, Australia
15
Shor B, Schneidman-Duhovny D. CombFold: predicting structures of large protein assemblies using a combinatorial assembly algorithm and AlphaFold2. Nat Methods 2024; 21:477-487. [PMID: 38326495 PMCID: PMC10927564 DOI: 10.1038/s41592-024-02174-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Accepted: 01/09/2024] [Indexed: 02/09/2024]
Abstract
Deep learning models, such as AlphaFold2 and RosettaFold, enable high-accuracy protein structure prediction. However, large protein complexes are still challenging to predict due to their size and the complexity of interactions between multiple subunits. Here we present CombFold, a combinatorial and hierarchical assembly algorithm for predicting structures of large protein complexes utilizing pairwise interactions between subunits predicted by AlphaFold2. CombFold accurately predicted (TM-score >0.7) 72% of the complexes among the top-10 predictions in two datasets of 60 large, asymmetric assemblies. Moreover, the structural coverage of predicted complexes was 20% higher compared to corresponding Protein Data Bank entries. We applied the method on complexes from Complex Portal with known stoichiometry but without known structure and obtained high-confidence predictions. CombFold supports the integration of distance restraints based on crosslinking mass spectrometry and fast enumeration of possible complex stoichiometries. CombFold's high accuracy makes it a promising tool for expanding structural coverage beyond monomeric proteins.
Affiliation(s)
- Ben Shor
- The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
- Dina Schneidman-Duhovny
- The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
16
Corum MR, Venkannagari H, Hryc CF, Baker ML. Predictive modeling and cryo-EM: A synergistic approach to modeling macromolecular structure. Biophys J 2024; 123:435-450. [PMID: 38268190 PMCID: PMC10912932 DOI: 10.1016/j.bpj.2024.01.021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2023] [Revised: 01/09/2024] [Accepted: 01/18/2024] [Indexed: 01/26/2024] Open
Abstract
Over the last 15 years, structural biology has seen unprecedented development and improvement in two areas: electron cryo-microscopy (cryo-EM) and predictive modeling. Once relegated to low resolutions, single-particle cryo-EM is now capable of achieving near-atomic resolutions of a wide variety of macromolecular complexes. Ushered in by AlphaFold, machine learning has powered the current generation of predictive modeling tools, which can accurately and reliably predict models for proteins and some complexes directly from the sequence alone. Although they offer new opportunities individually, there is an inherent synergy between these techniques, allowing for the construction of large, complex macromolecular models. Here, we give a brief overview of these approaches in addition to illustrating works that combine these techniques for model building. These examples provide insight into model building, assessment, and limitations when integrating predictive modeling with cryo-EM density maps. Together, these approaches offer the potential to greatly accelerate the generation of macromolecular structural insights, particularly when coupled with experimental data.
Affiliation(s)
- Michael R Corum
- Department of Biochemistry and Molecular Biology, McGovern Medical School at the University of Texas Health Science Center, Houston, Texas
- Harikanth Venkannagari
- Department of Biochemistry and Molecular Biology, McGovern Medical School at the University of Texas Health Science Center, Houston, Texas
- Corey F Hryc
- Department of Biochemistry and Molecular Biology, McGovern Medical School at the University of Texas Health Science Center, Houston, Texas
- Matthew L Baker
- Department of Biochemistry and Molecular Biology, McGovern Medical School at the University of Texas Health Science Center, Houston, Texas
17
Guo D, De Sciscio ML, Chi-Fung Ng J, Fraternali F. Modelling the assembly and flexibility of antibody structures. Curr Opin Struct Biol 2024; 84:102757. [PMID: 38118364 DOI: 10.1016/j.sbi.2023.102757] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Revised: 11/29/2023] [Accepted: 11/30/2023] [Indexed: 12/22/2023]
Abstract
Antibodies are large protein assemblies capable of both specifically recognising antigens and engaging with other proteins and receptors to coordinate immune action. Traditionally, structural studies have been dedicated to antibody variable regions, but efforts to determine and model full-length antibody structures are emerging. Here we review the current knowledge on modelling the structures of antibody assemblies, focusing on their conformational flexibility and the challenge this poses to obtaining and evaluating structural models. Integrative modelling approaches, combining experiments (cryo-electron microscopy, mass spectrometry, etc.) and computational methods (molecular dynamics simulations, deep-learning based approaches, etc.), hold the promise to map the complex conformational landscape of full-length antibody structures.
Affiliation(s)
- Dongjun Guo
- Institute of Structural and Molecular Biology, University College London, Darwin Building, Gower Street, London, WC1E 6BT, United Kingdom; Randall Centre for Cell & Molecular Biophysics, King's College London, New Hunt's House, Guy's Campus, London, SE1 1UL, United Kingdom
- Maria Laura De Sciscio
- Institute of Structural and Molecular Biology, University College London, Darwin Building, Gower Street, London, WC1E 6BT, United Kingdom; Department of Chemistry, Sapienza University of Rome, P.le A. Moro 5, Rome, 00185, Italy
- Joseph Chi-Fung Ng
- Institute of Structural and Molecular Biology, University College London, Darwin Building, Gower Street, London, WC1E 6BT, United Kingdom
- Franca Fraternali
- Institute of Structural and Molecular Biology, University College London, Darwin Building, Gower Street, London, WC1E 6BT, United Kingdom
18
Ragonis-Bachar P, Axel G, Blau S, Ben-Tal N, Kolodny R, Landau M. What can AlphaFold do for antimicrobial amyloids? Proteins 2024; 92:265-281. [PMID: 37855235 DOI: 10.1002/prot.26618] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Revised: 09/05/2023] [Accepted: 10/05/2023] [Indexed: 10/20/2023]
Abstract
Amyloids, protein and peptide assemblies found in various organisms, are crucial in physiological and pathological processes. Their intricate structures, however, present significant challenges, limiting our understanding of their functions, regulatory mechanisms, and potential applications in biomedicine and technology. This study evaluated the AlphaFold2 ColabFold method's structure predictions for antimicrobial amyloids, using eight antimicrobial peptides (AMPs), including those with experimentally determined structures and AMPs known for their distinct amyloidogenic morphological features. Additionally, two well-known human amyloids, amyloid-β and islet amyloid polypeptide, were included in the analysis due to their disease relevance, short sequences, and antimicrobial properties. Amyloids typically exhibit tightly mated β-strand sheets forming a cross-β configuration. However, certain amphipathic α-helical subunits can also form amyloid fibrils adopting a cross-α structure. Some AMPs in the study exhibited a combination of cross-α and cross-β amyloid fibrils, adding complexity to structure prediction. The results showed that the AlphaFold2 ColabFold models favored α-helical structures in the tested amyloids, successfully predicting the presence of α-helical mated sheets and a hydrophobic core resembling the cross-α configuration. This implies that the AI-based algorithms prefer assemblies of the monomeric state, which was frequently predicted as helical, or capture an α-helical membrane-active form of toxic peptides, which is triggered upon interaction with lipid membranes.
Affiliation(s)
- Gabriel Axel
- George S. Wise Faculty of Life Sciences, Department of Biochemistry and Molecular Biology, Tel Aviv University, Tel Aviv, Israel
- Shahar Blau
- Department of Biology, Technion-Israel Institute of Technology, Haifa, Israel
- Nir Ben-Tal
- George S. Wise Faculty of Life Sciences, Department of Biochemistry and Molecular Biology, Tel Aviv University, Tel Aviv, Israel
- Rachel Kolodny
- Department of Computer Science, University of Haifa, Haifa, Israel
- Meytal Landau
- Department of Biology, Technion-Israel Institute of Technology, Haifa, Israel
- CSSB Centre for Structural Systems Biology, Deutsches Elektronen-Synchrotron DESY, Hamburg, Germany
- The Center for Experimental Medicine, Universitätsklinikum Hamburg-Eppendorf (UKE), Hamburg, Germany
- European Molecular Biology Laboratory (EMBL), Hamburg, Germany
19
Gómez Borrego J, Torrent Burgas M. Structural assembly of the bacterial essential interactome. eLife 2024; 13:e94919. [PMID: 38226900 PMCID: PMC10863985 DOI: 10.7554/elife.94919] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Accepted: 12/22/2023] [Indexed: 01/17/2024] Open
Abstract
The study of protein interactions in living organisms is fundamental for understanding biological processes and central metabolic pathways. Yet, our knowledge of the bacterial interactome remains limited. Here, we combined gene deletion mutant analysis with deep-learning protein folding using AlphaFold2 to predict the core bacterial essential interactome. We predicted and modeled 1402 interactions between essential proteins in bacteria and generated 146 high-accuracy models. Our analysis reveals previously unknown details about the assembly mechanisms of these complexes, highlighting the importance of specific structural features in their stability and function. Our work provides a framework for predicting the essential interactomes of bacteria and highlights the potential of deep-learning algorithms in advancing our understanding of the complex biology of living organisms. Also, the results presented here offer a promising approach to identify novel antibiotic targets.
Affiliation(s)
- Jordi Gómez Borrego
- Systems Biology of Infection Lab, Department of Biochemistry and Molecular Biology, Biosciences Faculty, Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Spain
- Marc Torrent Burgas
- Systems Biology of Infection Lab, Department of Biochemistry and Molecular Biology, Biosciences Faculty, Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Spain
20
Krokidis MG, Dimitrakopoulos GN, Vrahatis AG, Exarchos TP, Vlamos P. Challenges and limitations in computational prediction of protein misfolding in neurodegenerative diseases. Front Comput Neurosci 2024; 17:1323182. [PMID: 38250244 PMCID: PMC10796696 DOI: 10.3389/fncom.2023.1323182] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Accepted: 12/19/2023] [Indexed: 01/23/2024] Open
Affiliation(s)
- Panagiotis Vlamos
- Bioinformatics and Human Electrophysiology Laboratory, Department of Informatics, Ionian University, Corfu, Greece
21
Chaulagain D, Schnabel E, Lin EX, Garcia RR, Noorai RE, Müller LM, Frugoli JA. TML1 and TML2 synergistically regulate nodulation but not arbuscular mycorrhiza in Medicago truncatula. bioRxiv 2023:2023.12.07.570674. [PMID: 38106087 PMCID: PMC10723381 DOI: 10.1101/2023.12.07.570674] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2023]
Abstract
Two symbiotic processes, nodulation and arbuscular mycorrhiza, are primarily controlled by the plant's need for nitrogen (N) and phosphorus (P), respectively. Autoregulation of Nodulation (AON) and Autoregulation of Mycorrhization (AOM) share multiple components - plants that make too many nodules usually have higher arbuscule density. The protein TML (TOO MUCH LOVE) was shown to function in roots to maintain susceptibility to rhizobial infection under low N conditions and control nodule number through AON in Lotus japonicus. M. truncatula has two sequence homologs: MtTML1 and MtTML2. We report the generation of stable single and double mutants harboring multiple allelic variations in MtTML1 and MtTML2 using CRISPR-Cas9 targeted mutagenesis and screening of a transposon mutagenesis library. Plants containing single mutations in either gene produced twice the nodules of wild type plants whereas plants containing mutations in both genes displayed a synergistic effect, forming 20x more nodules and short roots compared to wild type plants. The synergistic effect on nodulation was maintained in the presence of 10 mM nitrogen, but not observed in root length phenotypes. Examination of expression and heterozygote effects suggests genetic compensation may play a role in the observed synergy. However, plants with mutations in both TMLs had no detectable change in arbuscular mycorrhizal associations, suggesting that MtTMLs are specific to nodulation and nitrate signaling. The mutants created will be useful tools to dissect the mechanism of synergistic action of MtTML1 and MtTML2 in M. truncatula nodulation as well as the separation of AON from AOM.
22
Li M, Qing R, Tao F, Xu P, Zhang S. Dynamic Dimerization of Chemokine Receptors and Potential Inhibitory Role of Their Truncated Isoforms Revealed through Combinatorial Prediction. Int J Mol Sci 2023; 24:16266. [PMID: 38003455 PMCID: PMC10671024 DOI: 10.3390/ijms242216266] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2023] [Revised: 11/03/2023] [Accepted: 11/08/2023] [Indexed: 11/26/2023] Open
Abstract
Chemokine receptors play crucial roles in fundamental biological processes. Their malfunction may result in many diseases, including cancer, autoimmune diseases, and HIV. The oligomerization of chemokine receptors holds significant functional implications that directly affect their signaling patterns and pharmacological responses. However, the oligomerization patterns of many chemokine receptors remain poorly understood. Furthermore, several chemokine receptors have highly truncated isoforms whose functional role is not yet clear. Here, we computationally show homo- and heterodimerization patterns of four human chemokine receptors, namely CXCR2, CXCR7, CCR2, and CCR7, along with their interaction patterns with their respective truncated isoforms. By combining the neural network-based AlphaFold2 and physics-based protein-protein docking tool ClusPro, we predicted 15 groups of complex structures and assessed the binding affinities in the context of atomistic molecular dynamics simulations. Our results are in agreement with previous experimental observations and support the dynamic and diverse nature of chemokine receptor dimerization, suggesting possible patterns of higher-order oligomerization. Additionally, we uncover the strong potential of truncated isoforms to block homo- and heterodimerization of chemokine receptors, also in a dynamic manner. Our study provides insights into the dimerization patterns of chemokine receptors and the functional significance of their truncated isoforms.
Affiliation(s)
- Mengke Li
- Laboratory of Molecular Architecture, Media Lab, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic and Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Rui Qing
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic and Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Fei Tao
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic and Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Ping Xu
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic and Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Shuguang Zhang
- Laboratory of Molecular Architecture, Media Lab, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
23
Sledzieski S, Kshirsagar M, Baek M, Berger B, Dodhia R, Ferres JL. Democratizing Protein Language Models with Parameter-Efficient Fine-Tuning. bioRxiv 2023:2023.11.09.566187. [PMID: 37986761 PMCID: PMC10659351 DOI: 10.1101/2023.11.09.566187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2023]
Abstract
Proteomics has been revolutionized by large pre-trained protein language models, which learn unsupervised representations from large corpora of sequences. The parameters of these models are then fine-tuned in a supervised setting to tailor the model to a specific downstream task. However, as model size increases, the computational and memory footprint of fine-tuning becomes a barrier for many research groups. In the field of natural language processing, which has seen a similar explosion in the size of models, these challenges have been addressed by methods for parameter-efficient fine-tuning (PEFT). In this work, we newly bring parameter-efficient fine-tuning methods to proteomics. Using the parameter-efficient method LoRA, we train new models for two important proteomic tasks: predicting protein-protein interactions (PPI) and predicting the symmetry of homooligomers. We show that for homooligomer symmetry prediction, these approaches achieve performance competitive with traditional fine-tuning while requiring reduced memory and using three orders of magnitude fewer parameters. On the PPI prediction task, we surprisingly find that PEFT models actually outperform traditional fine-tuning while using two orders of magnitude fewer parameters. Here, we go even further to show that freezing the parameters of the language model and training only a classification head also outperforms fine-tuning, using five orders of magnitude fewer parameters, and that both of these models outperform state-of-the-art PPI prediction methods with substantially reduced compute. We also demonstrate that PEFT is robust to variations in training hyper-parameters, and elucidate where best practices for PEFT in proteomics differ from in natural language processing. Thus, we provide a blueprint to democratize the power of protein language model tuning to groups which have limited computational resources.
Affiliation(s)
- Samuel Sledzieski
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge MA 02139, USA
- AI for Good Research Lab, Microsoft Corporation, Redmond WA 98052, USA
- Minkyung Baek
- Department of Biological Sciences, Seoul National University, Seoul 08826, South Korea
- Bonnie Berger
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge MA 02139, USA
- Department of Mathematics, Massachusetts Institute of Technology, Cambridge MA 02139, USA
- Rahul Dodhia
- AI for Good Research Lab, Microsoft Corporation, Redmond WA 98052, USA
24
Sledzieski S, Devkota K, Singh R, Cowen L, Berger B. TT3D: Leveraging precomputed protein 3D sequence models to predict protein-protein interactions. Bioinformatics 2023; 39:btad663. [PMID: 37897686 PMCID: PMC10640393 DOI: 10.1093/bioinformatics/btad663] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Revised: 09/24/2023] [Accepted: 10/27/2023] [Indexed: 10/30/2023] Open
Abstract
Motivation: High-quality computational structural models are now precomputed and available for nearly every protein in UniProt. However, the best way to leverage these models to predict which pairs of proteins interact in a high-throughput manner is not immediately clear. The recent Foldseek method of van Kempen et al. encodes the structural information of distances and angles along the protein backbone into a linear string of the same length as the protein string, using tokens from a 21-letter discretized structural alphabet (3Di).
Results: We show that using both the amino acid sequence and the 3Di sequence generated by Foldseek as inputs to our recent deep-learning method, Topsy-Turvy, substantially improves the performance of predicting protein-protein interactions cross-species. Thus TT3D (Topsy-Turvy 3D) presents a way to reuse all the computational effort going into producing high-quality structural models from sequence, while being sufficiently lightweight so that high-quality binary protein-protein interaction predictions across all protein pairs can be made genome-wide.
Availability and implementation: TT3D is available at https://github.com/samsledje/D-SCRIPT. An archived version of the code at time of submission can be found at https://zenodo.org/records/10037674.
Affiliation(s)
- Samuel Sledzieski
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, United States
- Kapil Devkota
- Department of Computer Science, Tufts University, 177 College Avenue, Medford, MA 02155, United States
- Rohit Singh
- Department of Biostatistics & Bioinformatics, Duke University, Durham, NC 27705, United States
- Department of Cell Biology, Duke University, Durham, NC 27705, United States
- Lenore Cowen
- Department of Computer Science, Tufts University, 177 College Avenue, Medford, MA 02155, United States
- Bonnie Berger
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, United States
- Department of Mathematics, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, United States
25
Gordon CH, Hendrix E, He Y, Walker MC. AlphaFold Accurately Predicts the Structure of Ribosomally Synthesized and Post-Translationally Modified Peptide Biosynthetic Enzymes. Biomolecules 2023; 13:1243. [PMID: 37627309 PMCID: PMC10452190 DOI: 10.3390/biom13081243] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2023] [Revised: 08/08/2023] [Accepted: 08/10/2023] [Indexed: 08/27/2023] Open
Abstract
Ribosomally synthesized and post-translationally modified peptides (RiPPs) are a growing class of natural products biosynthesized from a genetically encoded precursor peptide. The enzymes that install the post-translational modifications on these peptides have the potential to be useful catalysts in the production of natural-product-like compounds and can install non-proteinogenic amino acids in peptides and proteins. However, engineering these enzymes has been somewhat limited, due in part to limited structural information on enzymes in the same families that nonetheless exhibit different substrate selectivities. Despite AlphaFold2's superior performance in single-chain protein structure prediction, its multimer version lacks accuracy and requires high-end GPUs, which are not typically available to most research groups. Additionally, the default parameters of AlphaFold2 may not be optimal for predicting complex structures like RiPP biosynthetic enzymes, due to their dynamic binding and substrate-modifying mechanisms. This study assessed the efficacy of the structure prediction program ColabFold (a variant of AlphaFold2) in modeling RiPP biosynthetic enzymes in both monomeric and dimeric forms. After extensive benchmarking, it was found that there were no statistically significant differences in the accuracy of the predicted structures, regardless of the various possible prediction parameters that were examined, and that with the default parameters, ColabFold was able to produce accurate models. We then generated additional structural predictions for select RiPP biosynthetic enzymes from multiple protein families and biosynthetic pathways. Our findings can serve as a reference for future enzyme engineering complemented by AlphaFold-related tools.
Affiliation(s)
- Mark C. Walker
- Department of Chemistry and Chemical Biology, University of New Mexico, Albuquerque, NM 87131, USA
26
Rogers JR, Nikolényi G, AlQuraishi M. Growing ecosystem of deep learning methods for modeling protein-protein interactions. Protein Eng Des Sel 2023; 36:gzad023. [PMID: 38102755 DOI: 10.1093/protein/gzad023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Revised: 12/06/2023] [Accepted: 12/07/2023] [Indexed: 12/17/2023] Open
Abstract
Numerous cellular functions rely on protein-protein interactions. Efforts to comprehensively characterize them remain challenged, however, by the diversity of molecular recognition mechanisms employed within the proteome. Deep learning has emerged as a promising approach for tackling this problem by exploiting both experimental data and basic biophysical knowledge about protein interactions. Here, we review the growing ecosystem of deep learning methods for modeling protein interactions, highlighting the diversity of these biophysically informed models and their respective trade-offs. We discuss recent successes in using representation learning to capture complex features pertinent to predicting protein interactions and interaction sites, geometric deep learning to reason over protein structures and predict complex structures, and generative modeling to design de novo protein assemblies. We also outline some of the outstanding challenges and promising new directions. Opportunities abound to discover novel interactions, elucidate their physical mechanisms, and engineer binders to modulate their functions using deep learning and, ultimately, unravel how protein interactions orchestrate complex cellular behaviors.
Affiliation(s)
- Julia R Rogers
- Department of Systems Biology, Columbia University, New York, NY 10032, USA
- Gergő Nikolényi
- Department of Systems Biology, Columbia University, New York, NY 10032, USA