1
Murali H, Wang P, Liao EC, Wang K. Genetic variant classification by predicted protein structure: A case study on IRF6. Comput Struct Biotechnol J 2024; 23:892-904. [PMID: 38370976 PMCID: PMC10869248 DOI: 10.1016/j.csbj.2024.01.019] [Received: 12/02/2023] [Revised: 01/24/2024] [Accepted: 01/25/2024] [Indexed: 02/20/2024]
Abstract
Next-generation genome sequencing has revolutionized genetic testing, identifying numerous rare disease-associated gene variants. However, computational approaches remain inadequate for imputing pathogenicity, and functional testing of gene variants is required to provide the highest level of evidence. The emergence of AlphaFold2 has transformed the field of protein structure determination, and here we outline a strategy that leverages predicted protein structure to enhance genetic variant classification. We used the gene IRF6 as a case study due to its clinical relevance, its critical role in cleft lip/palate malformation, and the availability of experimental data on the pathogenicity of IRF6 gene variants through phenotype rescue experiments in irf6-/- zebrafish. We compared results from over 30 pathogenicity prediction tools on 37 IRF6 missense variants. IRF6 lacks an experimentally derived structure, so we used predicted structures to explore associations between mutational clustering and pathogenicity. We found that among these variants, 19 of 37 were unanimously predicted as deleterious by computational tools. Comparing in silico predictions with experimental findings, 12 variants predicted as pathogenic were experimentally determined as benign. Even with the recently published AlphaMissense model, 15/18 (83%) of the predicted pathogenic variants were experimentally determined as benign. In comparison, mapping variants to the protein revealed deleterious mutation clusters around the protein binding domain, whereas N-terminal variants tend to be benign, suggesting the importance of structural information in determining pathogenicity of mutations in this gene. In conclusion, incorporating gene-specific structural features of known pathogenic/benign mutations may provide meaningful insights into pathogenicity predictions in a gene-specific manner and facilitate the interpretation of variant pathogenicity.
Affiliation(s)
- Hemma Murali
  - Graduate Program in Biochemistry and Molecular Biophysics, University of Pennsylvania, Philadelphia, PA 19104, United States
  - Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, United States
- Peng Wang
  - Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, United States
  - Master of Biotechnology Program, University of Pennsylvania, Philadelphia, PA 19104, United States
- Eric C. Liao
  - Department of Surgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, United States
  - Center for Craniofacial Innovation, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, United States
- Kai Wang
  - Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, United States
  - Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA 19104, United States
2
Villamor-Payà M, Sanchiz-Calvo M, Smak J, Pais L, Sud M, Shankavaram U, Lovgren AK, Austin-Tse C, Ganesh VS, Gay M, Vilaseca M, Arauz-Garofalo G, Palenzuela L, VanNoy G, O’Donnell-Luria A, Stracker TH. De novo TLK1 and MDM1 mutations in a patient with a neurodevelopmental disorder and immunodeficiency. iScience 2024; 27:109984. [PMID: 38868186 PMCID: PMC11166698 DOI: 10.1016/j.isci.2024.109984] [Received: 02/12/2024] [Revised: 04/08/2024] [Accepted: 05/13/2024] [Indexed: 06/14/2024]
Abstract
The Tousled-like kinases 1 and 2 (TLK1/TLK2) regulate DNA replication, repair and chromatin maintenance. TLK2 variants underlie the neurodevelopmental disorder (NDD) 'Intellectual Disability, Autosomal Dominant 57' (MRD57), characterized by intellectual disability (ID) and microcephaly. Several TLK1 variants have been reported in NDDs but their functional significance is unknown. A male patient presenting with ID, seizures, global developmental delay, hypothyroidism, and primary immunodeficiency was determined to have a heterozygous TLK1 variant (c.1435C>G, p.Q479E), as well as a mutation in MDM1 (c.1197dupT, p.K400*). Cells expressing TLK1 p.Q479E exhibited reduced cytokine responses and elevated DNA damage, but not increased radiation sensitivity or DNA repair defects. The TLK1 p.Q479E variant impaired kinase activity but not proximal protein interactions. Our study provides the first functional characterization of NDD-associated TLK1 variants and suggests that, like TLK2 variants, TLK1 variants may impact development in multiple tissues and should be considered in the diagnosis of rare NDDs.
Affiliation(s)
- Marina Villamor-Payà
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, 08028 Barcelona, Spain
  - National Cancer Institute, Center for Cancer Research, Radiation Oncology Branch, Bethesda, MD 20892, USA
- María Sanchiz-Calvo
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, 08028 Barcelona, Spain
- Jordann Smak
  - National Cancer Institute, Center for Cancer Research, Radiation Oncology Branch, Bethesda, MD 20892, USA
- Lynn Pais
  - Division of Genetics & Genomics, Department of Pediatrics, Boston Children’s Hospital, Boston, MA 02115, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Malika Sud
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Uma Shankavaram
  - National Cancer Institute, Center for Cancer Research, Radiation Oncology Branch, Bethesda, MD 20892, USA
- Alysia Kern Lovgren
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Christina Austin-Tse
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Vijay S. Ganesh
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - Department of Neurology, Brigham and Women’s Hospital, Boston, MA 02115, USA
- Marina Gay
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, 08028 Barcelona, Spain
- Marta Vilaseca
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, 08028 Barcelona, Spain
- Gianluca Arauz-Garofalo
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, 08028 Barcelona, Spain
- Lluís Palenzuela
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, 08028 Barcelona, Spain
- Grace VanNoy
  - Division of Genetics & Genomics, Department of Pediatrics, Boston Children’s Hospital, Boston, MA 02115, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Anne O’Donnell-Luria
  - Division of Genetics & Genomics, Department of Pediatrics, Boston Children’s Hospital, Boston, MA 02115, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Travis H. Stracker
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, 08028 Barcelona, Spain
  - National Cancer Institute, Center for Cancer Research, Radiation Oncology Branch, Bethesda, MD 20892, USA
3
Sass MI, Wang S, Mack D, Cottam SL, Shen PS, Willardson BM. Protocol to study CCT-mediated folding of Gβ5 by single-particle cryo-EM. STAR Protoc 2024; 5:103116. [PMID: 38848218 DOI: 10.1016/j.xpro.2024.103116] [Received: 03/29/2024] [Revised: 04/30/2024] [Accepted: 05/16/2024] [Indexed: 06/09/2024]
Abstract
The chaperonin CCT mediates folding of many cytosolic proteins, including G protein β subunits (Gβs). Here, we present a protocol for isolating Gβ5 bound to CCT and its co-chaperone PhLP1 and determining the CCT-mediated folding trajectory of Gβ5 using single-particle cryoelectron microscopy (cryo-EM) techniques. We describe steps for purifying CCT-Gβ5-PhLP1 from human cells, stabilizing the closed CCT conformation, preparing and imaging cryo-EM specimens, and processing data to recover multiple Gβ5 folding intermediates. This protocol permits visualization of protein folding by CCT. For complete details on the use and execution of this protocol, please refer to Sass et al.¹
Affiliation(s)
- Mikaila I Sass
  - Department of Chemistry and Biochemistry, Brigham Young University, C100 BNSN, Provo, UT 84602, USA
- Shuxin Wang
  - Department of Biochemistry, School of Medicine, University of Utah, 15 N. Medical Drive East, Salt Lake City, UT 84112, USA
- Deirdre Mack
  - Department of Biochemistry, School of Medicine, University of Utah, 15 N. Medical Drive East, Salt Lake City, UT 84112, USA
- Samuel L Cottam
  - Department of Chemistry and Biochemistry, Brigham Young University, C100 BNSN, Provo, UT 84602, USA
- Peter S Shen
  - Department of Biochemistry, School of Medicine, University of Utah, 15 N. Medical Drive East, Salt Lake City, UT 84112, USA
- Barry M Willardson
  - Department of Chemistry and Biochemistry, Brigham Young University, C100 BNSN, Provo, UT 84602, USA
4
Duan M, Plemel RL, Takenaka T, Lin A, Delgado BM, Nattermann U, Nickerson DP, Mima J, Miller EA, Merz AJ. SNARE chaperone Sly1 directly mediates close-range vesicle tethering. J Cell Biol 2024; 223:e202001032. [PMID: 38478018 PMCID: PMC10943277 DOI: 10.1083/jcb.202001032] [Received: 01/16/2020] [Revised: 12/20/2023] [Accepted: 02/22/2024] [Indexed: 03/17/2024]
Abstract
The essential Golgi protein Sly1 is a member of the Sec1/mammalian Unc-18 (SM) family of SNARE chaperones. Sly1 was originally identified through remarkable gain-of-function alleles that bypass requirements for diverse vesicle tethering factors. Employing genetic analyses and chemically defined reconstitutions of ER-Golgi fusion, we discovered that a loop conserved among Sly1 family members is not only autoinhibitory but also acts as a positive effector. An amphipathic lipid packing sensor (ALPS)-like helix within the loop directly binds high-curvature membranes. Membrane binding is required for relief of Sly1 autoinhibition and also allows Sly1 to directly tether incoming vesicles to the Qa-SNARE on the target organelle. The SLY1-20 mutation bypasses requirements for diverse tethering factors but loses this ability if the tethering activity is impaired. We propose that long-range tethers, including Golgins and multisubunit tethering complexes, hand off vesicles to Sly1, which then tethers at close range to initiate trans-SNARE complex assembly and fusion in the early secretory pathway.
Affiliation(s)
- Mengtong Duan
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
- Rachael L. Plemel
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
- Ariel Lin
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
  - Department of Biology, California State University, San Bernardino, CA, USA
- Una Nattermann
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
  - Biophysics, Structure, and Design Graduate Program, University of Washington, Seattle, WA, USA
  - Institute for Protein Design, University of Washington, Seattle, WA, USA
- Joji Mima
  - Institute for Protein Research, Osaka University, Osaka, Japan
- Alexey J. Merz
  - Department of Biochemistry, University of Washington, Seattle, WA, USA
5
Mishra S, Rout M, Singh MK, Dehury B, Pati S. Classical molecular dynamics simulation identifies catechingallate as a promising antiviral polyphenol against MPOX palmitoylated surface protein. Comput Biol Chem 2024; 110:108070. [PMID: 38678726 DOI: 10.1016/j.compbiolchem.2024.108070] [Received: 01/24/2024] [Revised: 04/04/2024] [Accepted: 04/06/2024] [Indexed: 05/01/2024]
Abstract
The cumulative global prevalence of the emergent monkeypox (MPX) infection in non-endemic countries has been recognised as a global public health concern. The lack of effective MPX-specific treatments motivates the current study. This work explores the use of known antiviral polyphenols against MPX viral infection and characterises their mode of interaction with the target F13 protein, which plays a crucial role in the formation of enveloped virions. Herein, we have employed the state-of-the-art machine learning-based AlphaFold2 to predict the three-dimensional structure of F13, followed by molecular docking and all-atom molecular dynamics (MD) simulations to investigate the differential mode of F13-polyphenol interactions. Our extensive computational approach identifies six potent polyphenols (Rutin, Epicatechingallate, Catechingallate, Quercitrin, Isoquercitrin and Hyperoside) exhibiting higher binding affinity towards F13, buried inside a positively charged binding groove. Intermolecular contact analysis of the docked and MD-simulated complexes reveals three important residues, Asp134, Ser137 and Ser321, that are involved in ligand binding through hydrogen bonds. Our findings suggest that ligand binding induces minor conformational changes in F13 that affect the conformation of the binding site. Concomitantly, essential dynamics of the six MD-simulated complexes reveals Catechin gallate, a known antiviral agent, as a promising polyphenol targeting the F13 protein, dominated by a dense network of hydrophobic contacts. However, the biological activities of these polyphenols need to be confirmed through in vitro and in vivo assays, which may pave the way for the development of novel antiviral drugs.
Affiliation(s)
- Sarbani Mishra
  - Bioinformatics Division, ICMR-Regional Medical Research Centre, Nalco Square, Chandrasekharpur, Bhubaneswar, Odisha 751023, India
- Madhusmita Rout
  - Bioinformatics Division, ICMR-Regional Medical Research Centre, Nalco Square, Chandrasekharpur, Bhubaneswar, Odisha 751023, India
- Mahender Kumar Singh
  - Data Science Laboratory, National Brain Research Centre, Gurgaon, Haryana 122052, India
- Budheswar Dehury
  - Bioinformatics Division, ICMR-Regional Medical Research Centre, Nalco Square, Chandrasekharpur, Bhubaneswar, Odisha 751023, India
  - Department of Bioinformatics, Manipal School of Life Sciences, Manipal Academy of Higher Education, Manipal 576104, India
- Sanghamitra Pati
  - Bioinformatics Division, ICMR-Regional Medical Research Centre, Nalco Square, Chandrasekharpur, Bhubaneswar, Odisha 751023, India
6
Wang L, Wen Z, Liu SW, Zhang L, Finley C, Lee HJ, Fan HJS. Overview of AlphaFold2 and breakthroughs in overcoming its limitations. Comput Biol Med 2024; 176:108620. [PMID: 38761500 DOI: 10.1016/j.compbiomed.2024.108620] [Received: 10/29/2023] [Revised: 05/01/2024] [Accepted: 05/14/2024] [Indexed: 05/20/2024]
Abstract
Predicting three-dimensional (3D) protein structures has been challenging for decades. The emergence of AlphaFold2 (AF2), a deep learning-based method developed by DeepMind, became a game changer in the protein folding community. AF2 can predict a protein's three-dimensional structure with high confidence based on its amino acid sequence. Accurate prediction of protein structures can dramatically accelerate our understanding of biological mechanisms and provide a solid foundation for reliable drug design. Although AF2 breaks through the barriers in predicting protein structures, much room remains for further study. This review provides a brief historical overview of the development of protein structure prediction, covering template-based, template-free, and machine learning-based methods. In addition to reviewing the potential benefits (Pros) and considerations (Cons) of using AF2, this review summarizes its diverse applications, including protein structure prediction, dynamic changes, point mutations, integration of language models and experimental data, protein complexes, and protein-peptide interactions. It underscores recent advancements in the efficiency, reliability, and broad application of AF2. This comprehensive review offers valuable insights into the applications of AF2 and AF2-inspired AI methods in structural biology and their potential for clinically significant drug target discovery.
Affiliation(s)
- Lei Wang
  - College of Chemical Engineering, Sichuan University of Science and Engineering, Zigong City, Sichuan Province, 64300, China
- Zehua Wen
  - College of Chemical Engineering, Sichuan University of Science and Engineering, Zigong City, Sichuan Province, 64300, China
- Shi-Wei Liu
  - College of Chemical Engineering, Sichuan University of Science and Engineering, Zigong City, Sichuan Province, 64300, China
- Lihong Zhang
  - Digestive Department, Binhai New Area Hospital of TCM Tianjin, Tianjin, 300451, China
- Cierra Finley
  - Department of Natural Sciences, Southwest Tennessee Community College, Memphis, TN, 38015, USA
- Ho-Jin Lee
  - Department of Natural Sciences, Southwest Tennessee Community College, Memphis, TN, 38015, USA
  - Division of Natural & Mathematical Sciences, LeMoyne-Owen College, Memphis, TN, 38126, USA
- Hua-Jun Shawn Fan
  - College of Chemical Engineering, Sichuan University of Science and Engineering, Zigong City, Sichuan Province, 64300, China
7
Nandigrami P, Fiser A. Assessing the functional impact of protein binding site definition. Protein Sci 2024; 33:e5026. [PMID: 38757384 PMCID: PMC11099757 DOI: 10.1002/pro.5026] [Received: 10/06/2023] [Revised: 05/01/2024] [Accepted: 05/03/2024] [Indexed: 05/18/2024]
Abstract
Many biomedical applications, such as classification of binding specificities or bioengineering, depend on the accurate definition of protein binding interfaces. Depending on the choice of method used, substantially different sets of residues can be classified as belonging to the interface of a protein. A typical approach used to verify these definitions is to mutate residues and measure the impact of these changes on binding. Besides the lack of exhaustive data, this approach also suffers from the fundamental problem that a mutation introduces an unknown amount of alteration into an interface, which potentially alters the binding characteristics of the interface. In this study we explore the impact of alternative binding site definitions on the ability of a protein to recognize its cognate ligand using a pharmacophore approach, which does not affect the interface. The study also shows that methods for protein binding interface predictions should perform above approximately F-score = 0.7 accuracy level to capture the biological function of a protein.
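As a rough illustration of the F-score threshold mentioned in this abstract, the sketch below scores a predicted interface against a reference interface, treating each as a set of residue numbers. It assumes the balanced F1 form of the F-score, which the abstract does not specify; the function name `interface_f_score` and the residue numbers are hypothetical and not taken from the paper.

```python
def interface_f_score(predicted, reference):
    """Return (precision, recall, F1) for two sets of interface residue numbers.

    Assumption: F-score means the balanced F1 (harmonic mean of precision
    and recall); the abstract does not state which variant was used.
    """
    predicted, reference = set(predicted), set(reference)
    true_pos = len(predicted & reference)  # residues found in both definitions
    if true_pos == 0:
        return 0.0, 0.0, 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(reference)
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Hypothetical residue numbers, for illustration only.
reference = {45, 46, 49, 52, 80, 83}
predicted = {45, 46, 49, 52, 81, 90}
precision, recall, f_score = interface_f_score(predicted, reference)
print(f"precision={precision:.2f} recall={recall:.2f} F={f_score:.2f}")
```

In this toy case four of six predicted residues are correct, so the score falls just below the approximate 0.7 level quoted above.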
Affiliation(s)
- Prithviraj Nandigrami
  - Departments of Systems and Computational Biology, and Biochemistry, Albert Einstein College of Medicine, Bronx, New York, USA
- Andras Fiser
  - Departments of Systems and Computational Biology, and Biochemistry, Albert Einstein College of Medicine, Bronx, New York, USA
8
Cai SW, Takai H, Zaug AJ, Dilgen TC, Cech TR, Walz T, de Lange T. POT1 recruits and regulates CST-Polα/primase at human telomeres. Cell 2024:S0092-8674(24)00493-8. [PMID: 38838667 DOI: 10.1016/j.cell.2024.05.002] [Received: 06/12/2023] [Revised: 03/12/2024] [Accepted: 05/01/2024] [Indexed: 06/07/2024]
Abstract
Telomere maintenance requires the extension of the G-rich telomeric repeat strand by telomerase and the fill-in synthesis of the C-rich strand by Polα/primase. At telomeres, Polα/primase is bound to Ctc1/Stn1/Ten1 (CST), a single-stranded DNA-binding complex. Like mutations in telomerase, mutations affecting CST-Polα/primase result in pathological telomere shortening and cause a telomere biology disorder, Coats plus (CP). We determined cryogenic electron microscopy structures of human CST bound to the shelterin heterodimer POT1/TPP1 that reveal how CST is recruited to telomeres by POT1. Our findings suggest that POT1 hinge phosphorylation is required for CST recruitment, and the complex is formed through conserved interactions involving several residues mutated in CP. Our structural and biochemical data suggest that phosphorylated POT1 holds CST-Polα/primase in an inactive, autoinhibited state until telomerase has extended the telomere ends. We propose that dephosphorylation of POT1 releases CST-Polα/primase into an active state that completes telomere replication through fill-in synthesis.
Affiliation(s)
- Sarah W Cai
  - Laboratory of Cell Biology and Genetics, The Rockefeller University, New York, NY 10065, USA
  - Laboratory of Molecular Electron Microscopy, The Rockefeller University, New York, NY 10065, USA
- Hiroyuki Takai
  - Laboratory of Cell Biology and Genetics, The Rockefeller University, New York, NY 10065, USA
- Arthur J Zaug
  - Department of Biochemistry, University of Colorado Boulder, Boulder, CO 80303, USA
  - BioFrontiers Institute, University of Colorado Boulder, Boulder, CO 80303, USA
  - Howard Hughes Medical Institute, University of Colorado Boulder, Boulder, CO 80303, USA
- Teague C Dilgen
  - Laboratory of Cell Biology and Genetics, The Rockefeller University, New York, NY 10065, USA
- Thomas R Cech
  - Department of Biochemistry, University of Colorado Boulder, Boulder, CO 80303, USA
  - BioFrontiers Institute, University of Colorado Boulder, Boulder, CO 80303, USA
  - Howard Hughes Medical Institute, University of Colorado Boulder, Boulder, CO 80303, USA
- Thomas Walz
  - Laboratory of Molecular Electron Microscopy, The Rockefeller University, New York, NY 10065, USA
- Titia de Lange
  - Laboratory of Cell Biology and Genetics, The Rockefeller University, New York, NY 10065, USA
9
Wu D, Yin R, Chen G, Ribeiro-Filho HV, Cheung M, Robbins PF, Mariuzza RA, Pierce BG. Structural characterization and AlphaFold modeling of human T cell receptor recognition of NRAS cancer neoantigens. bioRxiv 2024:2024.05.21.595215. [PMID: 38826362 PMCID: PMC11142219 DOI: 10.1101/2024.05.21.595215] [Indexed: 06/04/2024]
Abstract
T cell receptors (TCRs) that recognize cancer neoantigens are important for anti-cancer immune responses and immunotherapy. Understanding the structural basis of TCR recognition of neoantigens provides insights into their exquisite specificity and can enable design of optimized TCRs. We determined crystal structures of a human TCR in complex with NRAS Q61K and Q61R neoantigen peptides and HLA-A1 MHC, revealing the molecular underpinnings for dual recognition and specificity versus wild-type NRAS peptide. We then used multiple versions of AlphaFold to model the corresponding complex structures, given the challenge of immune recognition for such methods. Interestingly, one implementation of AlphaFold2 (TCRmodel2) was able to generate accurate models of the complexes, while AlphaFold3 also showed strong performance, although success was lower for other complexes. This study provides insights into TCR recognition of a shared cancer neoantigen, as well as the utility and practical considerations for using AlphaFold to model TCR-peptide-MHC complexes.
Affiliation(s)
- Daichao Wu
  - Department of Hepatopancreatobiliary Surgery, The First Affiliated Hospital, Laboratory of Structural Immunology, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
  - W.M. Keck Laboratory for Structural Biology, University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA
- Rui Yin
  - W.M. Keck Laboratory for Structural Biology, University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA
  - Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742, USA
- Guodong Chen
  - Department of Hepatopancreatobiliary Surgery, The First Affiliated Hospital, Laboratory of Structural Immunology, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- Helder V. Ribeiro-Filho
  - W.M. Keck Laboratory for Structural Biology, University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA
  - Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742, USA
  - Brazilian Biosciences National Laboratory, Brazilian Center for Research in Energy and Materials, Campinas 13083-100, Brazil
- Melyssa Cheung
  - W.M. Keck Laboratory for Structural Biology, University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA
  - Department of Chemistry and Biochemistry, University of Maryland, College Park, MD 20742, USA
- Paul F. Robbins
  - Surgery Branch, Center for Cancer Research, National Cancer Institute, Bethesda, MD 20892, USA
- Roy A. Mariuzza
  - W.M. Keck Laboratory for Structural Biology, University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA
  - Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742, USA
- Brian G. Pierce
  - W.M. Keck Laboratory for Structural Biology, University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA
  - Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742, USA
10
Schmid N, Brandt D, Walasek C, Rolland C, Wittmann J, Fischer D, Müsken M, Kalinowski J, Thormann K. An autonomous plasmid as an inovirus phage satellite. Appl Environ Microbiol 2024; 90:e0024624. [PMID: 38597658 PMCID: PMC11107163 DOI: 10.1128/aem.00246-24] [Received: 02/09/2024] [Accepted: 03/20/2024] [Indexed: 04/11/2024]
Abstract
Bacterial viruses (phages) are potent agents of lateral gene transfer and thus are important drivers of evolution. A group of mobile genetic elements, referred to as phage satellites, exploits phages to disseminate their own genetic material. Here, we isolated a novel member of the family Inoviridae, Shewanella phage Dolos, along with an autonomously replicating plasmid, pDolos. Dolos causes a chronic infection in its host Shewanella oneidensis by phage production with only minor effects on the host cell proliferation. When present, plasmid pDolos hijacks Dolos functions to be predominantly packaged into phage virions and released into the environment and, thus, acts as a phage satellite. pDolos can disseminate further genetic material encoding, e.g., resistances or fluorophores to host cells sensitive to Dolos infection. Given the rather simple requirements of a plasmid for takeover of an inovirus and the wide distribution of phages of this group, we speculate that similar phage-satellite systems are common among bacteria.
IMPORTANCE: Phage satellites are mobile genetic elements, which hijack phages to be transferred to other host cells. The vast majority of these phage satellites integrate within the host's chromosome, and they all carry remaining phage genes. Here, we identified a novel phage satellite, pDolos, which uses an inovirus for dissemination. pDolos (i) remains as an autonomously replicating plasmid within its host, (ii) does not carry recognizable phage genes, and (iii) is smaller than any other phage satellites identified so far. Thus, pDolos is the first member of a new class of phage satellites, which resemble natural versions of phagemids.
Affiliation(s)
- Nicole Schmid
  - Institute for Microbiology and Molecular Biology, Justus-Liebig-Universität Gießen, Gießen, Germany
- David Brandt
  - Center for Biotechnology, Bielefeld University, Bielefeld, Germany
- Claudia Walasek
  - Institute for Microbiology and Molecular Biology, Justus-Liebig-Universität Gießen, Gießen, Germany
- Clara Rolland
  - Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany
- Johannes Wittmann
  - Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany
- Dorian Fischer
  - Institute for Microbiology and Molecular Biology, Justus-Liebig-Universität Gießen, Gießen, Germany
- Mathias Müsken
  - Central Facility for Microscopy, Helmholtz Centre for Infection Research GmbH, Braunschweig, Germany
- Jörn Kalinowski
  - Center for Biotechnology, Bielefeld University, Bielefeld, Germany
- Kai Thormann
  - Institute for Microbiology and Molecular Biology, Justus-Liebig-Universität Gießen, Gießen, Germany
11
Zheng W. Predicting hotspots for disease-causing single nucleotide variants using sequences-based coevolution, network analysis, and machine learning. PLoS One 2024; 19:e0302504. [PMID: 38743747 PMCID: PMC11093321 DOI: 10.1371/journal.pone.0302504] [Received: 10/17/2023] [Accepted: 04/05/2024] [Indexed: 05/16/2024]
Abstract
To enable personalized medicine, it is important yet highly challenging to accurately predict disease-causing mutations in target proteins at high throughput. Previous computational methods have been developed using evolutionary information in combination with various biochemical and structural features of protein residues to discriminate neutral vs. deleterious mutations. However, the power of these methods is often limited because they either assume known protein structures or treat residues independently without fully considering their interactions. To address the above limitations, we build upon recent progress in machine learning, network analysis, and protein language models, and develop a sequences-based variant site prediction workflow based on the protein residue contact networks: 1. We employ and integrate various methods of building protein residue networks using state-of-the-art coevolution analysis tools (RaptorX, DeepMetaPSICOV, and SPOT-Contact) powered by deep learning. 2. We use machine learning algorithms (Random Forest, Gradient Boosting, and Extreme Gradient Boosting) to optimally combine 20 network centrality scores to jointly predict key residues as hot spots for disease mutations. 3. Using a dataset of 107 proteins rich in disease mutations, we rigorously evaluate the network scores individually and collectively (via machine learning). This work supports a promising strategy of combining an ensemble of network scores based on different coevolution analysis methods (and optionally predictive scores from other methods) via machine learning to predict hotspot sites of disease mutations, which will inform downstream applications of disease diagnosis and targeted drug design.
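To make the idea of scoring residues by network centrality more concrete, here is a minimal sketch that builds a residue contact network from a list of contact pairs and computes two centrality scores per residue. It is not the paper's workflow: the contact pairs are invented rather than predicted by a coevolution tool, only two centralities are used instead of 20, and a plain average stands in for the machine-learning step. All function names are hypothetical.

```python
from collections import deque


def build_contact_graph(contacts):
    """Build an undirected adjacency map from (residue_i, residue_j) pairs."""
    graph = {}
    for i, j in contacts:
        graph.setdefault(i, set()).add(j)
        graph.setdefault(j, set()).add(i)
    return graph


def degree_centrality(graph):
    """Fraction of the other residues that each residue contacts directly."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


def closeness_centrality(graph):
    """Reachable residues divided by the summed shortest-path lengths (BFS)."""
    scores = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


# Invented contact pairs for a five-residue toy network.
contacts = [(1, 2), (2, 3), (2, 4), (4, 5)]
graph = build_contact_graph(contacts)
degree = degree_centrality(graph)
closeness = closeness_centrality(graph)

# Stand-in for the learned combination: average the two scores and rank.
ranked = sorted(graph, key=lambda r: (degree[r] + closeness[r]) / 2, reverse=True)
print(ranked[0], degree[ranked[0]], closeness[ranked[0]])
```

In the toy network residue 2 ranks first because it touches three of the four other residues. In the workflow the abstract describes, a trained classifier would replace the simple average.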
Affiliation(s)
- Wenjun Zheng
- Department of Physics, State University of New York at Buffalo, Buffalo, NY, United States of America
|
12
|
Gu S, Yang Y, Zhao Y, Qiu J, Wang X, Tong HHY, Liu L, Wan X, Liu H, Hou T, Kang Y. Evaluation of AlphaFold2 Structures for Hit Identification across Multiple Scenarios. J Chem Inf Model 2024; 64:3630-3639. [PMID: 38630855 DOI: 10.1021/acs.jcim.3c01976] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/19/2024]
Abstract
The introduction of AlphaFold2 (AF2) has sparked significant enthusiasm and generated extensive discussion within the scientific community, particularly among drug discovery researchers. Although previous studies have addressed the performance of AF2 structures in virtual screening (VS), a more comprehensive investigation is still necessary considering the paramount importance of structural accuracy in drug design. In this study, we evaluate the performance of AF2 structures in VS across three common drug discovery scenarios: targets with holo, apo, and AF2 structures; targets with only apo and AF2 structures; and targets exclusively with AF2 structures. We utilized both the traditional physics-based Glide and the deep-learning-based scoring function RTMscore to rank the compounds in the DUD-E, DEKOIS 2.0, and DECOY data sets. The results demonstrate that, overall, the performance of VS on AF2 structures is comparable to that on apo structures but notably inferior to that on holo structures across diverse scenarios. Moreover, when a target has only an AF2 structure, selecting the holo structure of the target from different subtypes within the same protein family produces comparable results with the AF2 structure for VS on the data set of the AF2 structures, and significantly better results than the AF2 structures on its own data set. This indicates that utilizing AF2 structures for docking-based VS may not yield the most satisfactory outcomes, even when only AF2 structures are available. Moreover, we rule out the possibility that the variations in VS performance between the binding pockets of AF2 and holo structures arise from the differences in their biological assembly composition.
Affiliation(s)
- Shukai Gu
- Faculty of Applied Science, Macao Polytechnic University, Macao 999078, SAR, China
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Yuwei Yang
- Faculty of Applied Science, Macao Polytechnic University, Macao 999078, SAR, China
- Yihao Zhao
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Jiayue Qiu
- Faculty of Applied Science, Macao Polytechnic University, Macao 999078, SAR, China
- Xiaorui Wang
- State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, Macao 999078, China
- Henry Hoi Yee Tong
- Faculty of Applied Science, Macao Polytechnic University, Macao 999078, SAR, China
- Liwei Liu
- Advanced Computing and Storage Laboratory, Central Research Institute, 2012 Laboratories, Huawei Technologies Co., Ltd., Nanjing 210000, Jiangsu, China
- Xiaozhe Wan
- Advanced Computing and Storage Laboratory, Central Research Institute, 2012 Laboratories, Huawei Technologies Co., Ltd., Nanjing 210000, Jiangsu, China
- Huanxiang Liu
- Faculty of Applied Science, Macao Polytechnic University, Macao 999078, SAR, China
- Tingjun Hou
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Yu Kang
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
|
13
|
Ahmad S, Demneh FM, Rehman B, Almanaa TN, Akhtar N, Pazoki-Toroudi H, Shojaeian A, Ghatrehsamani M, Sanami S. In silico design of a novel multi-epitope vaccine against HCV infection through immunoinformatics approaches. Int J Biol Macromol 2024; 267:131517. [PMID: 38621559 DOI: 10.1016/j.ijbiomac.2024.131517] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2023] [Revised: 04/08/2024] [Accepted: 04/09/2024] [Indexed: 04/17/2024]
Abstract
Infection with the hepatitis C virus (HCV) is one of the causes of liver cancer, which is the world's sixth most prevalent and third most lethal cancer. The current treatments do not prevent reinfection; because they are expensive, their usage is limited to developed nations. Therefore, a prophylactic vaccine is essential to control this virus. Hence, in this study, an immunoinformatics method was applied to design a multi-epitope vaccine against HCV. The best B- and T-cell epitopes from conserved regions of the E2 protein of seven HCV genotypes were joined with the appropriate linkers to design a multi-epitope vaccine. In addition, cholera enterotoxin subunit B (CtxB) was included as an adjuvant in the vaccine construct. This study is the first to present this epitopes-adjuvant combination. The vaccine had acceptable physicochemical characteristics. The vaccine's 3D structure was predicted and validated. The vaccine's binding stability with Toll-like receptor 2 (TLR2) and TLR4 was confirmed using molecular docking and molecular dynamics (MD) simulation. The immune simulation revealed the vaccine's efficacy by increasing the population of B and T cells in response to vaccination. In silico expression in Escherichia coli (E. coli) was also successful.
Affiliation(s)
- Sajjad Ahmad
- Department of Health and Biological Sciences, Abasyn University, Peshawar 25000, Pakistan; Gilbert and Rose-Marie Chagoury School of Medicine, Lebanese American University, Beirut, P.O. Box 36, Lebanon; Department of Natural Sciences, Lebanese American University, Beirut, P.O. Box 36, Lebanon
- Fatemeh Mobini Demneh
- Cellular and Molecular Research Center, Basic Health Sciences Institute, Shahrekord University of Medical Sciences, Shahrekord, Iran
- Bushra Rehman
- Institute of Biotechnology and Microbiology, Bacha khan University, Charsadda, Pakistan
- Taghreed N Almanaa
- Department of Botany and Microbiology, College of Science, King Saud University, Riyadh 11451, Saudi Arabia
- Nahid Akhtar
- School of Bioengineering and Biosciences, Lovely Professional University, Phagwara 144411, India
- Hamidreza Pazoki-Toroudi
- Department of Physiology & Physiology Research Center, Faculty of Medicine, Iran University of Medical Sciences, Tehran, Iran
- Ali Shojaeian
- Research Center for Molecular Medicine, Hamadan University of Medical Sciences, Hamadan, Iran
- Mahdi Ghatrehsamani
- Cellular and Molecular Research Center, Basic Health Sciences Institute, Shahrekord University of Medical Sciences, Shahrekord, Iran.
- Samira Sanami
- Abnormal Uterine Bleeding Research Center, Semnan University of Medical Sciences, Semnan, Iran.
|
14
|
Roterman I, Stapor K, Dułak D, Konieczny L. External Force Field for Protein Folding in Chaperonins-Potential Application in In Silico Protein Folding. ACS OMEGA 2024; 9:18412-18428. [PMID: 38680295 PMCID: PMC11044213 DOI: 10.1021/acsomega.4c00409] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Revised: 03/26/2024] [Accepted: 03/29/2024] [Indexed: 05/01/2024]
Abstract
The present study discusses the influence of the TRiC chaperonin involved in the folding of the component of reovirus mu1/σ3. The TRiC chaperone is treated as a provider of a specific external force field in the fuzzy oil drop model during the structural formation of a target folded protein. The model also determines the status of the final product, which represents the structure directed by an external force field in the form of a chaperonin. This can be used for in silico folding as the process is environment-dependent. The application of the model enables the quantitative assessment of the folding dependence of an external force field, which appears to have universal application.
Affiliation(s)
- Irena Roterman
- Department of Bioinformatics and Telemedicine, Jagiellonian University - Medical College, Medyczna 7, Kraków 30-688, Poland
- Katarzyna Stapor
- Faculty of Automatic, Electronics and Computer Science, Department of Applied Informatics, Silesian University of Technology, Akademicka 16, Gliwice 44-100, Poland
- Dawid Dułak
- ABB Business Services Sp. z o.o, ul Żegańska 1, Warszawa 04-713, Poland
- Leszek Konieczny
- Chair of Medical Biochemistry - Jagiellonian University - Medical College, Kopernika 7, Kraków 31-034, Poland
|
15
|
Græsholt C, Brembu T, Volpe C, Bartosova Z, Serif M, Winge P, Nymark M. Zeaxanthin epoxidase 3 Knockout Mutants of the Model Diatom Phaeodactylum tricornutum Enable Commercial Production of the Bioactive Carotenoid Diatoxanthin. Mar Drugs 2024; 22:185. [PMID: 38667802 PMCID: PMC11051370 DOI: 10.3390/md22040185] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2024] [Revised: 04/14/2024] [Accepted: 04/15/2024] [Indexed: 04/28/2024] Open
Abstract
Carotenoids are pigments that have a range of functions in human health. The carotenoid diatoxanthin is suggested to have antioxidant, anti-inflammatory and chemo-preventive properties. Diatoxanthin is only produced by a few groups of microalgae, where it functions in photoprotection. Its large-scale production in microalgae is currently not feasible. In fact, rapid conversion into the inactive pigment diadinoxanthin is triggered when cells are removed from a high-intensity light source, which is the case during large-scale harvesting of microalgae biomass. Zeaxanthin epoxidase (ZEP) 2 and/or ZEP3 have been suggested to be responsible for the back-conversion of high-light accumulated diatoxanthin to diadinoxanthin in low-light in diatoms. Using CRISPR/Cas9 gene editing technology, we knocked out the ZEP2 and ZEP3 genes in the marine diatom Phaeodactylum tricornutum to investigate their role in the diadinoxanthin-diatoxanthin cycle and determine if one of the mutant strains could function as a diatoxanthin production line. Light-shift experiments proved that ZEP3 encodes the enzyme converting diatoxanthin to diadinoxanthin in low light. Loss of ZEP3 caused the high-light-accumulated diatoxanthin to be stable for several hours after the cultures had been returned to low light, suggesting that zep3 mutant strains could be suitable as commercial production lines of diatoxanthin.
Affiliation(s)
- Cecilie Græsholt
- Department of Biology, Norwegian University of Science and Technology, 7491 Trondheim, Norway
- Tore Brembu
- Department of Biology, Norwegian University of Science and Technology, 7491 Trondheim, Norway
- Charlotte Volpe
- Department of Fisheries and New Biomarine Industry, SINTEF Ocean, 7010 Trondheim, Norway
- Zdenka Bartosova
- Department of Biology, Norwegian University of Science and Technology, 7491 Trondheim, Norway
- Manuel Serif
- Department of Biology, Norwegian University of Science and Technology, 7491 Trondheim, Norway
- Per Winge
- Department of Biology, Norwegian University of Science and Technology, 7491 Trondheim, Norway
- Marianne Nymark
- Department of Biology, Norwegian University of Science and Technology, 7491 Trondheim, Norway
- Department of Fisheries and New Biomarine Industry, SINTEF Ocean, 7010 Trondheim, Norway
|
16
|
Wesołowski P, Wales DJ, Pracht P. Multilevel Framework for Analysis of Protein Folding Involving Disulfide Bond Formation. J Phys Chem B 2024; 128:3145-3156. [PMID: 38512062 PMCID: PMC11000224 DOI: 10.1021/acs.jpcb.4c00104] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/05/2024] [Revised: 03/06/2024] [Accepted: 03/06/2024] [Indexed: 03/22/2024]
Abstract
In this study, a three-layered multicenter ONIOM approach is implemented to characterize the naive folding pathway of bovine pancreatic trypsin inhibitor (BPTI). Each layer represents a distinct level of theory, where the initial layer, encompassing the entire protein, is modeled by a general all-atom force-field GFN-FF. An intermediate electronic structure layer consisting of three multicenter fragments is introduced with the state-of-the-art semiempirical tight-binding method GFN2-xTB. Higher accuracy, specifically addressing the breaking and formation of the three disulfide bonds, is achieved at the innermost layer using the composite DFT method r2SCAN-3c. Our analysis sheds light on the structural stability of BPTI, particularly the significance of interlinking disulfide bonds. The accuracy and efficiency of the multicenter QM/SQM/MM approach are benchmarked using the oxidative formation of cystine. For the folding pathway of BPTI, relative stabilities are investigated through the calculation of free energy contributions for selected intermediates, focusing on the impact of the disulfide bond. Our results highlight the intricate trade-off between accuracy and computational cost, demonstrating that the multicenter ONIOM approach provides a well-balanced and comprehensive solution to describe electronic structure effects in biomolecular systems. We conclude that multiscale energy landscape exploration provides a robust methodology for the study of intriguing biological targets.
Affiliation(s)
- Patryk A. Wesołowski
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K.
- David J. Wales
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K.
- Philipp Pracht
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K.
|
17
|
Wallace NS, Gadbery JE, Cohen CI, Kendall AK, Jackson LP. Tepsin binds LC3B to promote ATG9A trafficking and delivery. Mol Biol Cell 2024; 35:ar56. [PMID: 38381558 PMCID: PMC11064669 DOI: 10.1091/mbc.e23-09-0359-t] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Revised: 02/07/2024] [Accepted: 02/16/2024] [Indexed: 02/23/2024] Open
Abstract
Tepsin is an established accessory protein found in Adaptor Protein 4 (AP-4) coated vesicles, but the biological role of tepsin remains unknown. AP-4 vesicles originate at the trans-Golgi network (TGN) and target the delivery of ATG9A, a scramblase required for autophagosome biogenesis, to the cell periphery. Using in silico methods, we identified a putative LC3-Interacting Region (LIR) motif in tepsin. Biochemical experiments using purified recombinant proteins indicate tepsin directly binds LC3B preferentially over other members of the mammalian ATG8 family. Calorimetry and structural modeling data indicate this interaction occurs with micromolar affinity using the established LC3B LIR docking site. Loss of tepsin in cultured cells dysregulates ATG9A export from the TGN as well as ATG9A distribution at the cell periphery. Tepsin depletion in an mRFP-GFP-LC3B HeLa reporter cell line using siRNA knockdown increases autophagosome volume and number, but does not appear to affect flux through the autophagic pathway. Reintroduction of wild-type tepsin partially rescues ATG9A cargo trafficking defects. In contrast, reintroducing tepsin with a mutated LIR motif or missing N-terminus drives diffuse ATG9A subcellular distribution. Together, these data suggest roles for tepsin in cargo export from the TGN, in ensuring delivery of ATG9A-positive vesicles, and in overall maintenance of autophagosome structure.
Affiliation(s)
- Natalie S. Wallace
- Department of Biological Sciences, Vanderbilt University, Nashville, TN 37232
- Center for Structural Biology, Vanderbilt University, Nashville, TN 37232
- John E. Gadbery
- Department of Biological Sciences, Vanderbilt University, Nashville, TN 37232
- Center for Structural Biology, Vanderbilt University, Nashville, TN 37232
- Cameron I. Cohen
- Department of Biological Sciences, Vanderbilt University, Nashville, TN 37232
- Center for Structural Biology, Vanderbilt University, Nashville, TN 37232
- Amy K. Kendall
- Department of Biological Sciences, Vanderbilt University, Nashville, TN 37232
- Center for Structural Biology, Vanderbilt University, Nashville, TN 37232
- Lauren P. Jackson
- Department of Biological Sciences, Vanderbilt University, Nashville, TN 37232
- Center for Structural Biology, Vanderbilt University, Nashville, TN 37232
- Department of Biochemistry, Vanderbilt University, Nashville, TN 37232
|
18
|
Chen C, van der Hoorn RAL, Buscaill P. Releasing hidden MAMPs from precursor proteins in plants. TRENDS IN PLANT SCIENCE 2024; 29:428-436. [PMID: 37945394 DOI: 10.1016/j.tplants.2023.09.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Revised: 09/16/2023] [Accepted: 09/21/2023] [Indexed: 11/12/2023]
Abstract
The recognition of pathogens by plants at the cell surface is crucial for activating plant immunity. Plants employ pattern recognition receptors (PRRs) to detect microbe-associated molecular patterns (MAMPs). However, our knowledge of the release of peptide MAMPs from their precursor proteins is very limited. Here, we explore seven protein precursors of well-known MAMP peptides and discuss the likelihood of processing being required for their recognition based on structural models and public knowledge. This analysis indicates the existence of multiple extracellular events that are likely pivotal for pathogen perception but remain to be uncovered.
Affiliation(s)
- Changlong Chen
- Institute of Biotechnology, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China; The Plant Chemetics Laboratory, Department of Biology, University of Oxford, Oxford, UK
- Pierre Buscaill
- The Plant Chemetics Laboratory, Department of Biology, University of Oxford, Oxford, UK
|
19
|
Döring M, Brux M, Paszkowski-Rogacz M, Guillem-Gloria PM, Buchholz F, Pisabarro MT, Theis M. Nucleolar protein TAAP1/C22orf46 confers pro-survival signaling in non-small cell lung cancer. Life Sci Alliance 2024; 7:e202302257. [PMID: 38228372 DOI: 10.26508/lsa.202302257] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Revised: 01/04/2024] [Accepted: 01/04/2024] [Indexed: 01/18/2024] Open
Abstract
Tumor cells subvert immune surveillance or lytic stress by harnessing inhibitory signals. Hence, bispecific antibodies have been developed to direct CTLs to the tumor site and foster immune-dependent cytotoxicity. Although applied with success, T cell-based immunotherapies are not universally effective partially because of the expression of pro-survival factors by tumor cells protecting them from apoptosis. Here, we report a CRISPR/Cas9 screen in human non-small cell lung cancer cells designed to identify genes that confer tumors with the ability to evade the cytotoxic effects of CD8+ T lymphocytes engaged by bispecific antibodies. We show that the gene C22orf46 facilitates pro-survival signals and that tumor cells devoid of C22orf46 expression exhibit increased susceptibility to T cell-induced apoptosis and stress by genotoxic agents. Although annotated as a non-coding gene, we demonstrate that C22orf46 encodes a nucleolar protein, hereafter referred to as "Tumor Apoptosis Associated Protein 1," up-regulated in lung cancer, which displays remote homologies to the BH domain containing Bcl-2 family of apoptosis regulators. Collectively, the findings establish TAAP1/C22orf46 as a pro-survival oncogene with implications to therapy.
Affiliation(s)
- Marietta Döring
- https://ror.org/042aqky30 National Center for Tumor Diseases/University Cancer Center (NCT/UCC): German Cancer Research Center (DKFZ) Heidelberg, Faculty of Medicine and University Hospital Carl Gustav Carus, Technische Universität Dresden, Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Dresden, Germany
- Melanie Brux
- https://ror.org/042aqky30 National Center for Tumor Diseases/University Cancer Center (NCT/UCC): German Cancer Research Center (DKFZ) Heidelberg, Faculty of Medicine and University Hospital Carl Gustav Carus, Technische Universität Dresden, Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Dresden, Germany
- https://ror.org/00e7dfm13 https://ror.org/042aqky30 Medical Systems Biology, Medical Faculty Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Maciej Paszkowski-Rogacz
- https://ror.org/00e7dfm13 https://ror.org/042aqky30 Medical Systems Biology, Medical Faculty Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Pedro M Guillem-Gloria
- https://ror.org/042aqky30 Structural Bioinformatics, BIOTEC, Technische Universität Dresden, Dresden, Germany
- Frank Buchholz
- https://ror.org/042aqky30 National Center for Tumor Diseases/University Cancer Center (NCT/UCC): German Cancer Research Center (DKFZ) Heidelberg, Faculty of Medicine and University Hospital Carl Gustav Carus, Technische Universität Dresden, Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Dresden, Germany
- https://ror.org/00e7dfm13 https://ror.org/042aqky30 Medical Systems Biology, Medical Faculty Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- German Cancer Research Center (DKFZ), Heidelberg and German Cancer Consortium (DKTK) Partner Site, Dresden, Germany
- M Teresa Pisabarro
- https://ror.org/042aqky30 Structural Bioinformatics, BIOTEC, Technische Universität Dresden, Dresden, Germany
- Mirko Theis
- https://ror.org/042aqky30 National Center for Tumor Diseases/University Cancer Center (NCT/UCC): German Cancer Research Center (DKFZ) Heidelberg, Faculty of Medicine and University Hospital Carl Gustav Carus, Technische Universität Dresden, Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Dresden, Germany
- https://ror.org/00e7dfm13 https://ror.org/042aqky30 Medical Systems Biology, Medical Faculty Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
|
20
|
Zhang J, Durham J, Cong Q. Revolutionizing protein-protein interaction prediction with deep learning. Curr Opin Struct Biol 2024; 85:102775. [PMID: 38330793 DOI: 10.1016/j.sbi.2024.102775] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Revised: 12/31/2023] [Accepted: 01/05/2024] [Indexed: 02/10/2024]
Abstract
Protein-protein interactions (PPIs) are pivotal for driving diverse biological processes, and any disturbance in these interactions can lead to disease. Thus, the study of PPIs has been a central focus in biology. Recent developments in deep learning methods, coupled with the vast genomic sequence data, have significantly boosted the accuracy of predicting protein structures and modeling protein complexes, approaching levels comparable to experimental techniques. Herein, we review the latest advances in the computational methods for modeling 3D protein complexes and the prediction of protein interaction partners, emphasizing the application of deep learning methods deriving from coevolution analysis. The review also highlights biomedical applications of PPI prediction and outlines challenges in the field.
Affiliation(s)
- Jing Zhang
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, TX, USA; Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX, USA; Harold C. Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, USA. https://twitter.com/jzhang_genome
- Jesse Durham
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, TX, USA; Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX, USA; Harold C. Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Qian Cong
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, TX, USA; Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX, USA; Harold C. Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, USA.
|
21
|
Tang Y, Moretti R, Meiler J. Recent Advances in Automated Structure-Based De Novo Drug Design. J Chem Inf Model 2024; 64:1794-1805. [PMID: 38485516 DOI: 10.1021/acs.jcim.4c00247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/26/2024]
Abstract
As the number of determined and predicted protein structures and the size of druglike 'make-on-demand' libraries soar, the time-consuming nature of structure-based computer-aided drug design calls for innovative computational algorithms. De novo drug design introduces in silico heuristics to accelerate searching in the vast chemical space. This review focuses on recent advances in structure-based de novo drug design, ranging from conventional fragment-based methods, evolutionary algorithms, and Metropolis Monte Carlo methods to deep generative models. Due to the historical limitation of de novo drug design generating readily available drug-like molecules, we highlight the synthetic accessibility efforts in each category and the benchmarking strategies taken to validate the proposed framework.
Affiliation(s)
- Yidan Tang
- Department of Chemistry, Vanderbilt University, Nashville, Tennessee 37235, United States
- Rocco Moretti
- Department of Chemistry, Vanderbilt University, Nashville, Tennessee 37235, United States
- Center for Structural Biology, Vanderbilt University, Nashville, Tennessee 37240, United States
- Jens Meiler
- Department of Chemistry, Vanderbilt University, Nashville, Tennessee 37235, United States
- Center for Structural Biology, Vanderbilt University, Nashville, Tennessee 37240, United States
- Institute of Drug Discovery, Faculty of Medicine, University of Leipzig, 04103 Leipzig, Germany
|
22
|
Harihar B, Saravanan KM, Gromiha MM, Selvaraj S. Importance of Inter-residue Contacts for Understanding Protein Folding and Unfolding Rates, Remote Homology, and Drug Design. Mol Biotechnol 2024:10.1007/s12033-024-01119-4. [PMID: 38498284 DOI: 10.1007/s12033-024-01119-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2023] [Accepted: 02/10/2024] [Indexed: 03/20/2024]
Abstract
Inter-residue interactions in protein structures provide valuable insights into protein folding and stability. Understanding these interactions can be helpful in many crucial applications, including rational design of therapeutic small molecules and biologics, locating functional protein sites, and predicting protein-protein and protein-ligand interactions. The process of developing machine learning models incorporating inter-residue interactions has been improved recently. This review highlights the theoretical models incorporating inter-residue interactions in predicting folding and unfolding rates of proteins. Utilizing contact maps to depict inter-residue interactions aids researchers in developing computer models for detecting remote homologs and interface residues within protein-protein complexes which, in turn, enhances our knowledge of the relationship between sequence and structure of proteins. Further, the application of contact maps derived from inter-residue interactions is highlighted in the field of drug discovery. Overall, this review presents an extensive assessment of the significant models that use inter-residue interactions to investigate folding rates, unfolding rates, remote homology, and drug development, providing potential future advancements in constructing efficient computational models in structural biology.
Affiliation(s)
- Balasubramanian Harihar
- Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli, Tamil Nadu, 620024, India
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, 600036, India
- Konda Mani Saravanan
- Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli, Tamil Nadu, 620024, India
- Department of Biotechnology, Bharath Institute of Higher Education and Research, Chennai, Tamil Nadu, 600073, India
- Michael M Gromiha
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, 600036, India
- Samuel Selvaraj
- Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli, Tamil Nadu, 620024, India.
|
23
|
Wu X, Lin H, Bai R, Duan H. Deep learning for advancing peptide drug development: Tools and methods in structure prediction and design. Eur J Med Chem 2024; 268:116262. [PMID: 38387334 DOI: 10.1016/j.ejmech.2024.116262] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2024] [Revised: 02/06/2024] [Accepted: 02/17/2024] [Indexed: 02/24/2024]
Abstract
Peptides can bind challenging disease targets with high affinity and specificity, offering enormous opportunities for addressing unmet medical needs. However, peptides' unique features, including smaller size, increased structural flexibility, and limited data availability, pose additional challenges to the design process compared to proteins. This review explores the dynamic field of peptide therapeutics, leveraging deep learning to enhance structure prediction and design. Our exploration encompasses various facets of peptide research, ranging from dataset curation and handling to model development. As deep learning technologies become more refined, we channel our efforts into peptide structure prediction and design, aligning with the fundamental principles of structure-activity relationships in drug development. To guide researchers in harnessing the potential of deep learning to advance peptide drug development, our insights comprehensively explore current challenges and future directions of peptide therapeutics.
Affiliation(s)
- Xinyi Wu
- College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou, 310014, PR China
- Huitian Lin
- College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou, 310014, PR China
- Renren Bai
- School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, PR China.
- Hongliang Duan
- Faculty of Applied Sciences, Macao Polytechnic University, Macao, 999078, PR China.
|
24
|
Woods H, Leman JK, Meiler J. Modeling membrane geometries implicitly in Rosetta. Protein Sci 2024; 33:e4908. [PMID: 38358133 PMCID: PMC10868433 DOI: 10.1002/pro.4908] [Received: 08/25/2023] [Revised: 01/05/2024] [Accepted: 01/08/2024] [Indexed: 02/16/2024]
Abstract
Interactions between membrane proteins (MPs) and lipid bilayers are critical for many cellular functions. In the Rosetta molecular modeling suite, the implicit membrane energy function is based on a "slab" model, which represents the membrane as a flat bilayer. However, in nature membranes often have a curvature that is important for function and/or stability. Even more prevalent, in structural biology research MPs are reconstituted in model membrane systems such as micelles, bicelles, nanodiscs, or liposomes. Thus, we have modified the existing membrane energy potentials within the RosettaMP framework to allow users to model MPs in different membrane geometries. We show that these modifications can be utilized in core applications within Rosetta such as structure refinement, protein-protein docking, and protein design. For MP structures found in curved membranes, refining these structures in curved, implicit membranes produces higher quality models with structures closer to experimentally determined structures. For MP systems embedded in multiple membranes, representing both membranes results in more favorable scores compared to only representing one of the membranes. Modeling MPs in geometries mimicking the membrane model system used in structure determination can improve model quality and model discrimination.
Affiliation(s)
- Hope Woods
- Center of Structural Biology, Vanderbilt University, Nashville, Tennessee, USA
- Chemical and Physical Biology Program, Vanderbilt University, Nashville, Tennessee, USA
- Jens Meiler
- Center of Structural Biology, Vanderbilt University, Nashville, Tennessee, USA
- Department of Chemistry, Vanderbilt University, Nashville, Tennessee, USA
- Institute for Drug Discovery, Leipzig University Medical School, Leipzig, Germany
25
Qayyum MZ, Imashimizu M, Leanca M, Vishwakarma RK, Riaz-Bradley A, Yuzenkova Y, Murakami KS. Structure and function of the Si3 insertion integrated into the trigger loop/helix of cyanobacterial RNA polymerase. Proc Natl Acad Sci U S A 2024; 121:e2311480121. [PMID: 38354263 PMCID: PMC10895346 DOI: 10.1073/pnas.2311480121] [Received: 07/06/2023] [Accepted: 01/17/2024] [Indexed: 02/16/2024] Open
Abstract
Cyanobacteria and evolutionarily related chloroplasts of algae and plants possess unique RNA polymerases (RNAPs) with characteristics that distinguish them from canonical bacterial RNAPs. The largest subunit of cyanobacterial RNAP (cyRNAP) is divided into two polypeptides, β'1 and β'2, and contains the largest known lineage-specific insertion domain, Si3, located in the middle of the trigger loop and spanning approximately half of the β'2 subunit. In this study, we present the X-ray crystal structure of Si3 and the cryo-EM structures of the cyRNAP transcription elongation complex plus the NusG factor with and without incoming nucleoside triphosphate (iNTP) bound at the active site. Si3 has a well-ordered and elongated shape that exceeds the length of the main body of cyRNAP, fits into cavities of cyRNAP in the absence of iNTP bound at the active site and shields the binding site of secondary channel-binding proteins such as Gre and DksA. A small transition from the trigger loop to the trigger helix upon iNTP binding results in a large swing motion of Si3; however, this transition does not affect the catalytic activity of cyRNAP due to its minimal contact with cyRNAP, NusG, or DNA. This study provides a structural framework for understanding the evolutionary significance of these features unique to cyRNAP and chloroplast RNAP and may provide insights into the molecular mechanism of transcription in specific environment of photosynthetic organisms and organelle.
Affiliation(s)
- M. Zuhaib Qayyum
- Department of Biochemistry and Molecular Biology, The Center for RNA Molecular Biology, The Center for Structural Biology, The Pennsylvania State University, University Park, PA 16802
- Masahiko Imashimizu
- Department of Biochemistry and Molecular Biology, The Center for RNA Molecular Biology, The Center for Structural Biology, The Pennsylvania State University, University Park, PA 16802
- Cellular and Molecular Biotechnology Research Institute, National Institute of Advanced Industrial Science and Technology, Tsukuba 305-8565, Japan
- Miron Leanca
- The Centre for Bacterial Cell Biology, Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne NE2 4HH, United Kingdom
- Rishi K. Vishwakarma
- Department of Biochemistry and Molecular Biology, The Center for RNA Molecular Biology, The Center for Structural Biology, The Pennsylvania State University, University Park, PA 16802
- Amber Riaz-Bradley
- The Centre for Bacterial Cell Biology, Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne NE2 4HH, United Kingdom
- Yulia Yuzenkova
- The Centre for Bacterial Cell Biology, Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne NE2 4HH, United Kingdom
- Katsuhiko S. Murakami
- Department of Biochemistry and Molecular Biology, The Center for RNA Molecular Biology, The Center for Structural Biology, The Pennsylvania State University, University Park, PA 16802
26
Corum MR, Venkannagari H, Hryc CF, Baker ML. Predictive modeling and cryo-EM: A synergistic approach to modeling macromolecular structure. Biophys J 2024; 123:435-450. [PMID: 38268190 PMCID: PMC10912932 DOI: 10.1016/j.bpj.2024.01.021] [Received: 10/19/2023] [Revised: 01/09/2024] [Accepted: 01/18/2024] [Indexed: 01/26/2024] Open
Abstract
Over the last 15 years, structural biology has seen unprecedented development and improvement in two areas: electron cryo-microscopy (cryo-EM) and predictive modeling. Once relegated to low resolutions, single-particle cryo-EM is now capable of achieving near-atomic resolutions of a wide variety of macromolecular complexes. Ushered in by AlphaFold, machine learning has powered the current generation of predictive modeling tools, which can accurately and reliably predict models for proteins and some complexes directly from the sequence alone. Although they offer new opportunities individually, there is an inherent synergy between these techniques, allowing for the construction of large, complex macromolecular models. Here, we give a brief overview of these approaches in addition to illustrating works that combine these techniques for model building. These examples provide insight into model building, assessment, and limitations when integrating predictive modeling with cryo-EM density maps. Together, these approaches offer the potential to greatly accelerate the generation of macromolecular structural insights, particularly when coupled with experimental data.
Affiliation(s)
- Michael R Corum
- Department of Biochemistry and Molecular Biology, McGovern Medical School at the University of Texas Health Science Center, Houston, Texas
- Harikanth Venkannagari
- Department of Biochemistry and Molecular Biology, McGovern Medical School at the University of Texas Health Science Center, Houston, Texas
- Corey F Hryc
- Department of Biochemistry and Molecular Biology, McGovern Medical School at the University of Texas Health Science Center, Houston, Texas
- Matthew L Baker
- Department of Biochemistry and Molecular Biology, McGovern Medical School at the University of Texas Health Science Center, Houston, Texas.
27
Bekar-Cesaretli AA, Khan O, Nguyen T, Kozakov D, Joseph-Mccarthy D, Vajda S. Conservation of Hot Spots and Ligand Binding Sites in Protein Models by AlphaFold2. J Chem Inf Model 2024; 64:960-973. [PMID: 38253327 PMCID: PMC10922769 DOI: 10.1021/acs.jcim.3c01761] [Indexed: 01/24/2024]
Abstract
The neural network-based program AlphaFold2 (AF2) provides high accuracy structure prediction for a large fraction of globular proteins. An important question is whether these models are accurate enough for reliably docking small ligands. Several recent papers and the results of CASP15 reveal that local conformational errors reduce the success rates of direct ligand docking. Here, we focus on the ability of the models to conserve the location of binding hot spots, regions on the protein surface that significantly contribute to the binding free energy of the protein-ligand interaction. Clusters of hot spots predict the location and even the druggability of binding sites, and hence are important for computational drug discovery. The hot spots are determined by protein mapping that is based on the distribution of small fragment-sized probes on the protein surface and is less sensitive to local conformation than docking. Mapping models taken from the AlphaFold Protein Structure Database show that identifying binding sites is more reliable than docking, but the success rates are still 5% to 10% lower than based on mapping X-ray structures. The drop in accuracy is particularly large for models of multidomain proteins. However, both the model binding sites and the mapping results can be substantially improved by generating AF2 models for the ligand binding domains of interest rather than the entire proteins and even more if using forced sampling with multiple initial seeds. The mapping of such models tends to reach the accuracy of results obtained by mapping the X-ray structures.
Affiliation(s)
- Omeir Khan
- Department of Chemistry, Boston University, Boston, Massachusetts 02215, US
- Thu Nguyen
- Department of Computer Science, Stony Brook University, Stony Brook, NY 11794, US
- Dima Kozakov
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY 11794, US
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY 11794, US
- Diane Joseph-Mccarthy
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts 02215
- Sandor Vajda
- Department of Chemistry, Boston University, Boston, Massachusetts 02215, US
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts 02215
28
Okabe T, Aoi R, Yokota A, Tamiya-Ishitsuka H, Jiang Y, Sasaki A, Tsuneda S, Noda N. Arg-73 of the RNA endonuclease MazF in Salmonella enterica subsp. arizonae contributes to guanine and uracil recognition in the cleavage sequence. J Biol Chem 2024; 300:105636. [PMID: 38199572 PMCID: PMC10864209 DOI: 10.1016/j.jbc.2024.105636] [Received: 10/26/2023] [Revised: 12/14/2023] [Accepted: 12/29/2023] [Indexed: 01/12/2024] Open
Abstract
The sequence-specific endoribonuclease MazF is widely conserved among prokaryotes. Approximately 20 different MazF cleavage sequences have been discovered, varying from three to seven nucleotides in length. Although MazFs from various prokaryotes were found, the cleavage sequences of most MazFs are unknown. Here, we characterized the conserved MazF of Salmonella enterica subsp. arizonae (MazF-SEA). Using massive parallel sequencing and fluorometric assays, we revealed that MazF-SEA preferentially cleaves the sequences U∧ACG and U∧ACU (∧ represents cleavage sites). In addition, we predicted the 3D structure of MazF-SEA using AlphaFold2 and aligned it with the crystal structure of RNA-bound Bacillus subtilis MazF to evaluate RNA interactions. We found Arg-73 of MazF-SEA interacts with RNAs containing G and U at the third position from the cleavage sites (U∧ACG and U∧ACU). We then obtained the mutated MazF-SEA R73L protein to evaluate the significance of Arg-73 interaction with RNAs containing G and U at this position. We also used fluorometric and kinetic assays and showed the enzymatic activity of MazF-SEA R73L for the sequence UACG and UACU was significantly decreased. These results suggest Arg-73 is essential for recognizing G and U at the third position from the cleavage sites. This is the first study to our knowledge to identify a single residue responsible for RNA recognition by MazF. Owing to its high specificity and ribosome-independence, MazF is useful for RNA cleavage in vitro. These results will likely contribute to increasing the diversity of MazF specificity and to furthering the application of MazF in RNA engineering.
Affiliation(s)
- Takuma Okabe
- Department of Life Science and Medical Bioscience, Waseda University, Tokyo, Japan; Biomedical Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Ibaraki, Japan
- Rie Aoi
- Department of Life Science and Medical Bioscience, Waseda University, Tokyo, Japan; Biomedical Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Ibaraki, Japan
- Akiko Yokota
- Biomedical Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Ibaraki, Japan
- Hiroko Tamiya-Ishitsuka
- Biomedical Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Ibaraki, Japan
- Yunong Jiang
- Biomedical Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Ibaraki, Japan; Graduate School of Comprehensive Human Sciences, University of Tsukuba, Ibaraki, Japan
- Akira Sasaki
- Biomedical Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Ibaraki, Japan
- Satoshi Tsuneda
- Department of Life Science and Medical Bioscience, Waseda University, Tokyo, Japan.
- Naohiro Noda
- Department of Life Science and Medical Bioscience, Waseda University, Tokyo, Japan; Biomedical Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Ibaraki, Japan; School of Integrative and Global Majors, University of Tsukuba, Ibaraki, Japan.
29
Zheng W, Wuyun Q, Li Y, Zhang C, Freddolino PL, Zhang Y. Improving deep learning protein monomer and complex structure prediction using DeepMSA2 with huge metagenomics data. Nat Methods 2024; 21:279-289. [PMID: 38167654 PMCID: PMC10864179 DOI: 10.1038/s41592-023-02130-4] [Received: 03/04/2023] [Accepted: 11/13/2023] [Indexed: 01/05/2024]
Abstract
Leveraging iterative alignment search through genomic and metagenome sequence databases, we report the DeepMSA2 pipeline for uniform protein single- and multichain multiple-sequence alignment (MSA) construction. Large-scale benchmarks show that DeepMSA2 MSAs can remarkably increase the accuracy of protein tertiary and quaternary structure predictions compared with current state-of-the-art methods. An integrated pipeline with DeepMSA2 participated in the most recent CASP15 experiment and created complex structural models with considerably higher quality than the AlphaFold2-Multimer server (v.2.2.0). Detailed data analyses show that the major advantage of DeepMSA2 lies in its balanced alignment search and effective model selection, and in the power of integrating huge metagenomics databases. These results demonstrate a new avenue to improve deep learning protein structure prediction through advanced MSA construction and provide additional evidence that optimization of input information to deep learning-based structure prediction methods must be considered with as much care as the design of the predictor itself.
Affiliation(s)
- Wei Zheng
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Qiqige Wuyun
- Department of Computer Science and Engineering, Michigan State University, East Lansing, MI, USA
- Yang Li
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Cancer Science Institute of Singapore, National University of Singapore, Singapore, Singapore
- Chengxin Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- P Lydia Freddolino
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA.
- Department of Biological Chemistry, University of Michigan, Ann Arbor, MI, USA.
- Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA.
- Cancer Science Institute of Singapore, National University of Singapore, Singapore, Singapore.
- Department of Biological Chemistry, University of Michigan, Ann Arbor, MI, USA.
- Department of Computer Science, School of Computing, National University of Singapore, Singapore, Singapore.
- Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore.
30
Ieremie I, Ewing RM, Niranjan M. Protein language models meet reduced amino acid alphabets. Bioinformatics 2024; 40:btae061. [PMID: 38310333 PMCID: PMC10872054 DOI: 10.1093/bioinformatics/btae061] [Received: 09/04/2023] [Revised: 12/14/2023] [Accepted: 01/30/2024] [Indexed: 02/05/2024] Open
Abstract
MOTIVATION: Protein language models (PLMs), which borrowed ideas for modelling and inference from natural language processing, have demonstrated the ability to extract meaningful representations in an unsupervised way. This led to significant performance improvement in several downstream tasks. Clustering amino acids based on their physical-chemical properties to achieve reduced alphabets has been of interest in past research, but their application to PLMs or folding models is unexplored. RESULTS: Here, we investigate the efficacy of PLMs trained on reduced amino acid alphabets in capturing evolutionary information, and we explore how the loss of protein sequence information impacts learned representations and downstream task performance. Our empirical work shows that PLMs trained on the full alphabet and a large number of sequences capture fine details that are lost in alphabet reduction methods. We further show the ability of a structure prediction model (ESMFold) to fold CASP14 protein sequences translated using a reduced alphabet. For 10 proteins out of the 50 targets, reduced alphabets improve structural predictions with LDDT-Cα differences of up to 19%. AVAILABILITY AND IMPLEMENTATION: Trained models and code are available at github.com/Ieremie/reduced-alph-PLM.
Affiliation(s)
- Ioan Ieremie
- Vision, Learning & Control Group, University of Southampton, Southampton SO17 1BJ, United Kingdom
- Rob M Ewing
- Biological Sciences, University of Southampton, Southampton SO17 1BJ, United Kingdom
- Mahesan Niranjan
- Vision, Learning & Control Group, University of Southampton, Southampton SO17 1BJ, United Kingdom
31
Schaeffer RD, Zhang J, Medvedev KE, Kinch LN, Cong Q, Grishin NV. ECOD domain classification of 48 whole proteomes from AlphaFold Structure Database using DPAM2. PLoS Comput Biol 2024; 20:e1011586. [PMID: 38416793 PMCID: PMC10927120 DOI: 10.1371/journal.pcbi.1011586] [Received: 10/10/2023] [Revised: 03/11/2024] [Accepted: 02/20/2024] [Indexed: 03/01/2024] Open
Abstract
Protein structure prediction has now been deployed widely across several different large protein sets. Large-scale domain annotation of these predictions can aid in the development of biological insights. Using our Evolutionary Classification of Protein Domains (ECOD) from experimental structures as a basis for classification, we describe the detection and cataloging of domains from 48 whole proteomes deposited in the AlphaFold Database. On average, we can provide positive classification (either of domains or other identifiable non-domain regions) for 90% of residues in all proteomes. We classified 746,349 domains from 536,808 proteins comprised of over 226,424,000 amino acid residues. We examine the varying populations of homologous groups in both eukaryotes and bacteria. In addition to containing a higher fraction of disordered regions and unassigned domains, eukaryotes show a higher proportion of repeated proteins, both globular and small repeats. We enumerate those highly populated domains that are shared in both eukaryotes and bacteria, such as the Rossmann domains, TIM barrels, and P-loop domains. Additionally, we compare the sampling of homologous groups from this whole proteome set against our stable ECOD reference and discuss groups that have been enriched by structure predictions. Finally, we discuss the implication of these results for protein target selection for future classification strategies for very large protein sets.
Affiliation(s)
- R. Dustin Schaeffer
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
- Jing Zhang
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
- Kirill E. Medvedev
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
- Lisa N. Kinch
- Department of Molecular Biology, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
- Qian Cong
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
- Nick V. Grishin
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
- Department of Biochemistry, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
32
Zhang S, Li J, Chen SJ. Machine learning in RNA structure prediction: Advances and challenges. Biophys J 2024:S0006-3495(24)00067-5. [PMID: 38297836 DOI: 10.1016/j.bpj.2024.01.026] [Received: 11/29/2023] [Revised: 01/08/2024] [Accepted: 01/24/2024] [Indexed: 02/02/2024] Open
Abstract
RNA molecules play a crucial role in various biological processes, with their functionality closely tied to their structures. The remarkable advancements in machine learning techniques for protein structure prediction have shown promise in the field of RNA structure prediction. In this perspective, we discuss the advances and challenges encountered in constructing machine learning-based models for RNA structure prediction. We explore topics including model building strategies, specific challenges involved in predicting RNA secondary (2D) and tertiary (3D) structures, and approaches to these challenges. In addition, we highlight the advantages and challenges of constructing RNA language models. Given the rapid advances of machine learning techniques, we anticipate that machine learning-based models will serve as important tools for predicting RNA structures, thereby enriching our understanding of RNA structures and their corresponding functions.
Affiliation(s)
- Sicheng Zhang
- Department of Physics and Institute of Data Science and Informatics, University of Missouri, Columbia, Missouri
- Jun Li
- Department of Physics and Institute of Data Science and Informatics, University of Missouri, Columbia, Missouri
- Shi-Jie Chen
- Department of Physics and Institute of Data Science and Informatics, University of Missouri, Columbia, Missouri; Department of Biochemistry, University of Missouri, Columbia, Missouri.
33
Ngo K, Yarov-Yarovoy V, Clancy CE, Vorobyov I. Harnessing AlphaFold to reveal state secrets: Prediction of hERG closed and inactivated states. bioRxiv 2024:2024.01.27.577468. [PMID: 38352360 PMCID: PMC10862728 DOI: 10.1101/2024.01.27.577468] [Indexed: 02/20/2024]
Abstract
To design safe, selective, and effective new therapies, there must be a deep understanding of the structure and function of the drug target. One of the most difficult problems to solve has been resolution of discrete conformational states of transmembrane ion channel proteins. An example is KV11.1 (hERG), comprising the primary cardiac repolarizing current, IKr. hERG is a notorious drug anti-target against which all promising drugs are screened to determine potential for arrhythmia. Drug interactions with the hERG inactivated state are linked to elevated arrhythmia risk, and drugs may become trapped during channel closure. However, the structural details of multiple conformational states have remained elusive. Here, we guided AlphaFold2 to predict plausible hERG inactivated and closed conformations, obtaining results consistent with myriad available experimental data. Drug docking simulations demonstrated hERG state-specific drug interactions aligning well with experimental results, revealing that most drugs bind more effectively in the inactivated state and are trapped in the closed state. Molecular dynamics simulations demonstrated ion conduction that aligned with earlier studies. Finally, we identified key molecular determinants of state transitions by analyzing interaction networks across closed, open, and inactivated states in agreement with earlier mutagenesis studies. Here, we demonstrate a readily generalizable application of AlphaFold2 as a novel method to predict discrete protein conformations and novel linkages from structure to function.
Affiliation(s)
- Khoa Ngo
- Biophysics Graduate Group, University of California, Davis, CA
- Department of Physiology and Membrane Biology, University of California, Davis, CA
- Center for Precision Medicine and Data Science, University of California, Davis, CA
- Vladimir Yarov-Yarovoy
- Department of Physiology and Membrane Biology, University of California, Davis, CA
- Department of Anesthesiology and Pain Medicine, University of California, Davis, CA
- Colleen E. Clancy
- Department of Physiology and Membrane Biology, University of California, Davis, CA
- Department of Pharmacology, University of California, Davis, CA
- Center for Precision Medicine and Data Science, University of California, Davis, CA
- Igor Vorobyov
- Department of Physiology and Membrane Biology, University of California, Davis, CA
- Department of Pharmacology, University of California, Davis, CA
34
Amaya-Rodriguez CA, Carvajal-Zamorano K, Bustos D, Alegría-Arcos M, Castillo K. A journey from molecule to physiology and in silico tools for drug discovery targeting the transient receptor potential vanilloid type 1 (TRPV1) channel. Front Pharmacol 2024; 14:1251061. [PMID: 38328578 PMCID: PMC10847257 DOI: 10.3389/fphar.2023.1251061] [Received: 06/30/2023] [Accepted: 12/14/2023] [Indexed: 02/09/2024] Open
Abstract
The heat and capsaicin receptor TRPV1 channel is widely expressed in nerve terminals of dorsal root ganglia (DRGs) and trigeminal ganglia innervating the body and face, respectively, as well as in other tissues and organs including the central nervous system. The TRPV1 channel is a versatile receptor that detects harmful heat, pain, and various internal and external ligands. Hence, it operates as a polymodal sensory channel. Many pathological conditions including neuroinflammation, cancer, psychiatric disorders, and pathological pain, are linked to the abnormal functioning of the TRPV1 in peripheral tissues. Intense biomedical research is underway to discover compounds that can modulate the channel and provide pain relief. The molecular mechanisms underlying temperature sensing remain largely unknown, although they are closely linked to pain transduction. Prolonged exposure to capsaicin generates analgesia, hence numerous capsaicin analogs have been developed to discover efficient analgesics for pain relief. The emergence of in silico tools offered significant techniques for molecular modeling and machine learning algorithms to identify druggable sites in the channel and for repositioning of current drugs aimed at TRPV1. Here we recapitulate the physiological and pathophysiological functions of the TRPV1 channel, including structural models obtained through cryo-EM, pharmacological compounds tested on TRPV1, and the in silico tools for drug discovery and repositioning.
Affiliation(s)
- Cesar A. Amaya-Rodriguez
- Centro Interdisciplinario de Neurociencia de Valparaíso, Facultad de Ciencias, Universidad de Valparaíso, Valparaíso, Chile
- Departamento de Fisiología y Comportamiento Animal, Facultad de Ciencias Naturales, Exactas y Tecnología, Universidad de Panamá, Ciudad de Panamá, Panamá
- Karina Carvajal-Zamorano
- Centro Interdisciplinario de Neurociencia de Valparaíso, Facultad de Ciencias, Universidad de Valparaíso, Valparaíso, Chile
- Daniel Bustos
- Centro de Investigación de Estudios Avanzados del Maule (CIEAM), Vicerrectoría de Investigación y Postgrado Universidad Católica del Maule, Talca, Chile
- Laboratorio de Bioinformática y Química Computacional, Departamento de Medicina Traslacional, Facultad de Medicina, Universidad Católica del Maule, Talca, Chile
- Melissa Alegría-Arcos
- Núcleo de Investigación en Data Science, Facultad de Ingeniería y Negocios, Universidad de las Américas, Santiago, Chile
- Karen Castillo
- Centro Interdisciplinario de Neurociencia de Valparaíso, Facultad de Ciencias, Universidad de Valparaíso, Valparaíso, Chile
- Centro de Investigación de Estudios Avanzados del Maule (CIEAM), Vicerrectoría de Investigación y Postgrado Universidad Católica del Maule, Talca, Chile
35
Qayyum MZ, Imashimizu M, Leanca M, Vishwakarma RK, Riaz-Bradley A, Yuzenkova Y, Murakami KS. Structure and function of the Si3 insertion integrated into the trigger loop/helix of cyanobacterial RNA polymerase. bioRxiv 2024:2024.01.11.575193. [PMID: 38260627 PMCID: PMC10802570 DOI: 10.1101/2024.01.11.575193] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]
Abstract
Cyanobacteria and evolutionarily related chloroplasts of algae and plants possess unique RNA polymerases (RNAPs) with characteristics that distinguish them from canonical bacterial RNAPs. The largest subunit of cyanobacterial RNAP (cyRNAP) is divided into two polypeptides, β'1 and β'2, and contains the largest known lineage-specific insertion domain, Si3, which is located in the middle of the trigger loop and spans approximately half of the β'2 subunit. In this study, we present the X-ray crystal structure of Si3 and the cryo-EM structures of the cyRNAP transcription elongation complex plus the NusG factor with and without incoming nucleoside triphosphate (iNTP) bound at the active site. Si3 has a well-ordered and elongated shape that exceeds the length of the main body of cyRNAP, fits into cavities of cyRNAP and shields the binding site of secondary channel-binding proteins such as Gre and DksA. A small transition from the trigger loop to the trigger helix upon iNTP binding at the active site results in a large swing motion of Si3; however, this transition does not affect the catalytic activity of cyRNAP due to its minimal contact with cyRNAP, NusG or DNA. This study provides a structural framework for understanding the evolutionary significance of these features unique to cyRNAP and chloroplast RNAP and may provide insights into the molecular mechanism of transcription in the specific environment of photosynthetic organisms.
Affiliation(s)
- M. Zuhaib Qayyum
  - Department of Biochemistry and Molecular Biology, The Center for RNA Molecular Biology, The Center for Structural Biology, The Pennsylvania State University, University Park, PA 16802, USA
  - Current address: Protein Technologies Center, Inspiration4 Advanced Research Center, Department of Structural Biology, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Masahiko Imashimizu
  - Department of Biochemistry and Molecular Biology, The Center for RNA Molecular Biology, The Center for Structural Biology, The Pennsylvania State University, University Park, PA 16802, USA
  - Cellular and Molecular Biotechnology Research Institute, National Institute of Advanced Industrial Science and Technology, Tsukuba, 305-8565 Japan
- Miron Leanca
  - The Centre for Bacterial Cell Biology, Newcastle University, UK
- Rishi K. Vishwakarma
  - Department of Biochemistry and Molecular Biology, The Center for RNA Molecular Biology, The Center for Structural Biology, The Pennsylvania State University, University Park, PA 16802, USA
- Yulia Yuzenkova
  - The Centre for Bacterial Cell Biology, Newcastle University, UK
- Katsuhiko S. Murakami
  - Department of Biochemistry and Molecular Biology, The Center for RNA Molecular Biology, The Center for Structural Biology, The Pennsylvania State University, University Park, PA 16802, USA
|
36
|
Ohno S, Manabe N, Yamaguchi Y. Prediction of protein structure and AI. J Hum Genet 2024:10.1038/s10038-023-01215-4. [PMID: 38177398 DOI: 10.1038/s10038-023-01215-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Accepted: 12/10/2023] [Indexed: 01/06/2024]
Abstract
AlphaFold, an artificial intelligence (AI)-based tool for predicting the 3D structure of proteins, is now widely recognized for its high accuracy and versatility in the folding of human proteins. AlphaFold is useful for understanding structure-function relationships from protein 3D structure models and can serve as a template or a reference for experimental structural analysis including X-ray crystallography, NMR and cryo-EM analysis. Its use is expanding among researchers, not only in structural biology but also in other research fields. Researchers are currently exploring the full potential of AlphaFold-generated protein models. Predicting disease severity caused by missense mutations is one such application. This article provides an overview of the 3D structural modeling of AlphaFold based on deep learning techniques and highlights the challenges in predicting the pathogenicity of missense mutations.
Affiliation(s)
- Shiho Ohno
  - Division of Structural Glycobiology, Institute of Molecular Biomembrane and Glycobiology, Tohoku Medical and Pharmaceutical University, 4-4-1 Komatsushima, Aoba-ku, Sendai, Miyagi, 981-8558, Japan
- Noriyoshi Manabe
  - Division of Structural Glycobiology, Institute of Molecular Biomembrane and Glycobiology, Tohoku Medical and Pharmaceutical University, 4-4-1 Komatsushima, Aoba-ku, Sendai, Miyagi, 981-8558, Japan
- Yoshiki Yamaguchi
  - Division of Structural Glycobiology, Institute of Molecular Biomembrane and Glycobiology, Tohoku Medical and Pharmaceutical University, 4-4-1 Komatsushima, Aoba-ku, Sendai, Miyagi, 981-8558, Japan
|
37
|
Hussain A, Brooks III CL. Guiding discovery of protein sequence-structure-function modeling. Bioinformatics 2024; 40:btae002. [PMID: 38195719 PMCID: PMC10789314 DOI: 10.1093/bioinformatics/btae002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2023] [Revised: 12/05/2023] [Accepted: 01/08/2024] [Indexed: 01/11/2024] Open
Abstract
MOTIVATION Protein engineering techniques are key in designing novel catalysts for a wide range of reactions. Although approaches vary in their exploration of the sequence-structure-function paradigm, they are often hampered by the labor-intensive steps of protein expression and screening. In this work, we describe the development and testing of a high-throughput in silico sequence-structure-function pipeline using AlphaFold2 and fast Fourier transform docking that is benchmarked with enantioselectivity and reactivity predictions for an ancestral sequence library of fungal flavin-dependent monooxygenases. RESULTS The predicted enantioselectivities and reactivities correlate well with previously described screens of an experimentally available subset of these proteins and capture known changes in enantioselectivity across the phylogenetic tree representing ancestral proteins from this family. With this pipeline established as our functional screen, we apply ensemble decision tree models and explainable AI techniques to build sequence-function models and extract critical residues within the binding site and the second-sphere residues around this site. We demonstrate that the top-identified key residues in the control of enantioselectivity and reactivity correspond to experimentally verified residues. The in silico sequence-to-function pipeline serves as an accelerated framework to inform protein engineering efforts from vast informative sequence landscapes contained in protein families, ancestral resurrects, and directed evolution campaigns. AVAILABILITY Jupyter notebooks detailing the sequence-structure-function pipeline are available at https://github.com/BrooksResearchGroup-UM/seq_struct_func.
Affiliation(s)
- Azam Hussain
  - Department of Macromolecular Science and Engineering Program, University of Michigan, Ann Arbor, MI 48109-1055, United States
- Charles L Brooks III
  - Department of Chemistry, University of Michigan, Ann Arbor, MI 48109-1055, United States
|
38
|
Cappelli L, Cinelli P, Perrotta A, Veggi D, Audagnotto M, Tuscano G, Pansegrau W, Bartolini E, Rinaudo D, Cozzi R. Computational structure-based approach to study chimeric antigens using a new protein scaffold displaying foreign epitopes. FASEB J 2024; 38:e23326. [PMID: 38019196 DOI: 10.1096/fj.202202130r] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2022] [Revised: 10/24/2023] [Accepted: 11/08/2023] [Indexed: 11/30/2023]
Abstract
The identification and recombinant production of functional antigens and/or epitopes of pathogens represent a crucial step for the development of an effective protein-based vaccine. Many vaccine targets are outer membrane proteins anchored into the lipidic bilayer through an extended hydrophobic portion, making their recombinant production challenging. Moreover, only the extracellular loops, and not the hydrophobic regions, are naturally exposed to the immune system. In this work, Domain 3 (D3) from the Group B Streptococcus (GBS) pilus 2a backbone protein has been identified and engineered to be used as a scaffold for the display of extracellular loops of two Neisseria gonorrhoeae membrane proteins (PorB.1b and OpaB). A computational structure-based approach has been applied to the design of both the scaffold and the model antigens. Once the best engineerable site in D3 had been identified, several different chimeric D3 proteins displaying PorB.1b and OpaB extracellular loops were produced as soluble proteins. Each molecule has been characterized in terms of solubility, stability, and ability to correctly display the foreign epitope. This antigen dissection strategy allowed the identification of the most immunogenic extracellular loops of both the PorB.1b and OpaB gonococcal antigens. The crystal structure of chimeric D3 displaying the PorB.1b immunodominant loop has been obtained, confirming that the engineering did not alter the predicted native structure of this epitope. Taken together, the reported data suggest that D3 is a novel protein scaffold for epitope insertion and display, and a valid alternative to the production of whole membrane protein antigens. Finally, this work describes a generalized computational structure-based approach for the identification, design, and dissection of epitopes in target antigens through chimeric proteins.
Affiliation(s)
- Luigia Cappelli
  - Dipartimento di Farmacia e Biotecnologie - FaBiT, University of Bologna, Bologna, Italy
  - GSK, Siena, Italy
- Paolo Cinelli
  - Dipartimento di Farmacia e Biotecnologie - FaBiT, University of Bologna, Bologna, Italy
  - GSK, Siena, Italy
- Andrea Perrotta
  - GSK, Siena, Italy
  - Dipartimento di Scienze della Vita, University of Siena, Siena, Italy
|
39
|
Yin R, Pierce BG. Evaluation of AlphaFold antibody-antigen modeling with implications for improving predictive accuracy. Protein Sci 2024; 33:e4865. [PMID: 38073135 PMCID: PMC10751731 DOI: 10.1002/pro.4865] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/28/2023] [Revised: 12/01/2023] [Accepted: 12/07/2023] [Indexed: 12/26/2023]
Abstract
High resolution antibody-antigen structures provide critical insights into immune recognition and can inform therapeutic design. The challenges of experimental structural determination and the diversity of the immune repertoire underscore the necessity of accurate computational tools for modeling antibody-antigen complexes. Initial benchmarking showed that despite overall success in modeling protein-protein complexes, AlphaFold and AlphaFold-Multimer have limited success in modeling antibody-antigen interactions. In this study, we performed a thorough analysis of AlphaFold's antibody-antigen modeling performance on 427 nonredundant antibody-antigen complex structures, identifying useful confidence metrics for predicting model quality, and features of complexes associated with improved modeling success. Notably, we found that the latest version of AlphaFold improves near-native modeling success to over 30%, versus approximately 20% for a previous version, while increased AlphaFold sampling gives approximately 50% success. With this improved success, AlphaFold can generate accurate antibody-antigen models in many cases, while additional training or other optimization may further improve performance.
Affiliation(s)
- Rui Yin
  - University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland, USA
  - Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland, USA
- Brian G. Pierce
  - University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland, USA
  - Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland, USA
|
40
|
Song J, Kurgan L. Availability of web servers significantly boosts citations rates of bioinformatics methods for protein function and disorder prediction. Bioinformatics Advances 2023; 3:vbad184. [PMID: 38146538 PMCID: PMC10749743 DOI: 10.1093/bioadv/vbad184] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Revised: 12/08/2023] [Accepted: 12/15/2023] [Indexed: 12/27/2023]
Abstract
Motivation Development of bioinformatics methods is a long, complex and resource-hungry process. Hundreds of these tools were released. While some methods are highly cited and used, many suffer relatively low citation rates. We empirically analyze a large collection of recently released methods in three diverse protein function and disorder prediction areas to identify key factors that contribute to increased citations. Results We show that provision of a working web server significantly boosts citation rates. On average, methods with working web servers generate three times as many citations compared to tools that are available as only source code, have no code and no server, or are no longer available. This observation holds consistently across different research areas and publication years. We also find that differences in predictive performance are unlikely to impact citation rates. Overall, our empirical results suggest that a relatively low-cost investment into the provision and long-term support of web servers would substantially increase the impact of bioinformatics tools.
Affiliation(s)
- Jiangning Song
  - Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Clayton, VIC 3800, Australia
  - Monash Data Futures Institute, Monash University, Clayton, VIC 3800, Australia
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, United States
|
41
|
Kinch LN, Schaeffer RD, Zhang J, Cong Q, Orth K, Grishin N. Insights into virulence: structure classification of the Vibrio parahaemolyticus RIMD mobilome. mSystems 2023; 8:e0079623. [PMID: 38014954 PMCID: PMC10734457 DOI: 10.1128/msystems.00796-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2023] [Accepted: 10/17/2023] [Indexed: 11/29/2023] Open
Abstract
IMPORTANCE The pandemic Vpar strain RIMD causes seafood-borne illness worldwide. Previous comparative genomic studies have revealed pathogenicity islands in RIMD that contribute to the success of the strain in infection. However, not all virulence determinants have been identified, and many of the proteins encoded in known pathogenicity islands are of unknown function. Based on the ECOD database, we used evolution-based classification of structure models for the RIMD proteome to improve our functional understanding of virulence determinants acquired by the pandemic strain. We further identify and classify previously unknown mobile protein domains as well as fast-evolving residue positions in structure models that contribute to virulence and adaptation with respect to a pre-pandemic strain. Our work highlights key contributions of phage in mediating seafood-borne illness, suggesting this strain balances its avoidance of phage predators with its successful colonization of human hosts.
Affiliation(s)
- Lisa N. Kinch
  - Department of Molecular Biology, University of Texas Southwestern Medical Center, Dallas, Texas, USA
  - Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- R. Dustin Schaeffer
  - Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Jing Zhang
  - Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, USA
  - Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Qian Cong
  - Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, USA
  - Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, Texas, USA
  - Harold C. Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Kim Orth
  - Department of Molecular Biology, University of Texas Southwestern Medical Center, Dallas, Texas, USA
  - Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas, USA
  - Department of Biochemistry, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Nick Grishin
  - Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, USA
  - Department of Biochemistry, University of Texas Southwestern Medical Center, Dallas, Texas, USA
|
42
|
Mahony J, Goulet A, van Sinderen D, Cambillau C. Partial Atomic Model of the Tailed Lactococcal Phage TP901-1 as Predicted by AlphaFold2: Revelations and Limitations. Viruses 2023; 15:2440. [PMID: 38140681 PMCID: PMC10747895 DOI: 10.3390/v15122440] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2023] [Revised: 12/08/2023] [Accepted: 12/12/2023] [Indexed: 12/24/2023] Open
Abstract
Bacteria are engaged in a constant battle against preying viruses, called bacteriophages (or phages). These remarkable nano-machines pack and store their genomes in a capsid and inject it into the cytoplasm of their bacterial prey following specific adhesion to the host cell surface. Tailed phages possessing dsDNA genomes are the most abundant phages in the bacterial virosphere, particularly those with long, non-contractile tails. All tailed phages possess a nano-device at their tail tip that specifically recognizes and adheres to a suitable host cell surface receptor, being proteinaceous and/or saccharidic. Adhesion devices of tailed phages infecting Gram-positive bacteria are highly diverse and, for the majority, remain poorly understood. Their long, flexible, multi-domain-encompassing tail limits experimental approaches to determine their complete structure. We have previously shown that the recently developed protein structure prediction program AlphaFold2 can overcome this limitation by predicting the structures of phage adhesion devices with confidence. Here, we extend this approach and employ AlphaFold2 to determine the structure of a complete phage, the lactococcal P335 phage TP901-1. Herein we report the structures of its capsid and neck, its extended tail, and the complete adhesion device, the baseplate, which was previously partially determined using X-ray crystallography.
Affiliation(s)
- Jennifer Mahony
  - School of Microbiology & APC Microbiome Ireland, University College Cork, T12 K8AF Cork, Ireland
- Adeline Goulet
  - Laboratoire d’Ingénierie des Systèmes Macromoléculaires (LISM), Institut de Microbiologie, Bioénergies et Biotechnologie (IMM), Aix-Marseille Université—CNRS, UMR 7255, 13009 Marseille, France
- Douwe van Sinderen
  - School of Microbiology & APC Microbiome Ireland, University College Cork, T12 K8AF Cork, Ireland
- Christian Cambillau
  - School of Microbiology & APC Microbiome Ireland, University College Cork, T12 K8AF Cork, Ireland
  - Laboratoire d’Ingénierie des Systèmes Macromoléculaires (LISM), Institut de Microbiologie, Bioénergies et Biotechnologie (IMM), Aix-Marseille Université—CNRS, UMR 7255, 13009 Marseille, France
|
43
|
Debrine AM, Karplus PA, Rockey DD. A structural foundation for studying chlamydial polymorphic membrane proteins. Microbiol Spectr 2023; 11:e0324223. [PMID: 37882824 PMCID: PMC10715098 DOI: 10.1128/spectrum.03242-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Accepted: 09/18/2023] [Indexed: 10/27/2023] Open
Abstract
IMPORTANCE Infections by bacteria in the genus Chlamydia cause a range of widespread and potentially debilitating conditions in humans and other animals. We analyzed predicted structures of a family of proteins that are potential vaccine targets found in all Chlamydia spp. Our findings deepen the understanding of protein structure, provide a descriptive framework for discussion of the protein structure, and outline regions of the proteins that may be key targets in host-microbe interactions and anti-chlamydial immunity.
Affiliation(s)
- Abigail M. Debrine
  - Department of Biomedical Sciences, Oregon State University, Corvallis, Oregon, USA
  - Department of Biochemistry and Biophysics, Oregon State University, Corvallis, Oregon, USA
- P. Andrew Karplus
  - Department of Biochemistry and Biophysics, Oregon State University, Corvallis, Oregon, USA
- Daniel D. Rockey
  - Department of Biomedical Sciences, Oregon State University, Corvallis, Oregon, USA
|
44
|
Lensink MF, Brysbaert G, Raouraoua N, Bates PA, Giulini M, Honorato RV, van Noort C, Teixeira JMC, Bonvin AMJJ, Kong R, Shi H, Lu X, Chang S, Liu J, Guo Z, Chen X, Morehead A, Roy RS, Wu T, Giri N, Quadir F, Chen C, Cheng J, Del Carpio CA, Ichiishi E, Rodriguez‐Lumbreras LA, Fernandez‐Recio J, Harmalkar A, Chu L, Canner S, Smanta R, Gray JJ, Li H, Lin P, He J, Tao H, Huang S, Roel‐Touris J, Jimenez‐Garcia B, Christoffer CW, Jain AJ, Kagaya Y, Kannan H, Nakamura T, Terashi G, Verburgt JC, Zhang Y, Zhang Z, Fujuta H, Sekijima M, Kihara D, Khan O, Kotelnikov S, Ghani U, Padhorny D, Beglov D, Vajda S, Kozakov D, Negi SS, Ricciardelli T, Barradas‐Bautista D, Cao Z, Chawla M, Cavallo L, Oliva R, Yin R, Cheung M, Guest JD, Lee J, Pierce BG, Shor B, Cohen T, Halfon M, Schneidman‐Duhovny D, Zhu S, Yin R, Sun Y, Shen Y, Maszota‐Zieleniak M, Bojarski KK, Lubecka EA, Marcisz M, Danielsson A, Dziadek L, Gaardlos M, Gieldon A, Liwo A, Samsonov SA, Slusarz R, Zieba K, Sieradzan AK, Czaplewski C, Kobayashi S, Miyakawa Y, Kiyota Y, Takeda‐Shitaka M, Olechnovic K, Valancauskas L, Dapkunas J, Venclovas C, Wallner B, Yang L, Hou C, He X, Guo S, Jiang S, Ma X, Duan R, Qui L, Xu X, Zou X, Velankar S, Wodak SJ. Impact of AlphaFold on structure prediction of protein complexes: The CASP15-CAPRI experiment. Proteins 2023; 91:1658-1683. [PMID: 37905971 PMCID: PMC10841881 DOI: 10.1002/prot.26609] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/07/2023] [Revised: 09/22/2023] [Accepted: 09/28/2023] [Indexed: 11/02/2023]
Abstract
We present the results for CAPRI Round 54, the 5th joint CASP-CAPRI protein assembly prediction challenge. The Round offered 37 targets, including 14 homodimers, 3 homotrimers, 13 heterodimers including 3 antibody-antigen complexes, and 7 large assemblies. On average ~70 CASP and CAPRI predictor groups, including more than 20 automatic servers, submitted models for each target. A total of 21 941 models submitted by these groups and by 15 CAPRI scorer groups were evaluated using the CAPRI model quality measures and the DockQ score consolidating these measures. The prediction performance was quantified by a weighted score based on the number of models of acceptable quality or higher submitted by each group among their five best models. Results show substantial progress achieved across a significant fraction of the 60+ participating groups. High-quality models were produced for about 40% of the targets compared to 8% two years earlier. This remarkable improvement is due to the wide use of the AlphaFold2 and AlphaFold2-Multimer software and the confidence metrics they provide. Notably, expanded sampling of candidate solutions by manipulating these deep learning inference engines, enriching multiple sequence alignments, or integrating advanced modeling tools enabled top-performing groups to exceed the performance of a standard AlphaFold2-Multimer version used as a yardstick. This notwithstanding, performance remained poor for complexes with antibodies and nanobodies, where evolutionary relationships between the binding partners are lacking, and for complexes featuring conformational flexibility, clearly indicating that the prediction of protein complexes remains a challenging problem.
Affiliation(s)
- Marc F. Lensink
  - Univ. Lille, CNRS, UMR8576 – UGSF – Unité de Glycobiologie Structurale et Fonctionnelle, Lille, France
- Guillaume Brysbaert
  - Univ. Lille, CNRS, UMR8576 – UGSF – Unité de Glycobiologie Structurale et Fonctionnelle, Lille, France
- Nessim Raouraoua
  - Univ. Lille, CNRS, UMR8576 – UGSF – Unité de Glycobiologie Structurale et Fonctionnelle, Lille, France
- Paul A. Bates
  - Biomolecular Modeling Laboratory, The Francis Crick Institute, London, UK
- Marco Giulini
  - Bijvoet Center for Biomolecular Research, Faculty of Science – Chemistry, Utrecht University, Utrecht, The Netherlands
- Rodrigo V. Honorato
  - Bijvoet Center for Biomolecular Research, Faculty of Science – Chemistry, Utrecht University, Utrecht, The Netherlands
- Charlotte van Noort
  - Bijvoet Center for Biomolecular Research, Faculty of Science – Chemistry, Utrecht University, Utrecht, The Netherlands
- Joao M. C. Teixeira
  - Bijvoet Center for Biomolecular Research, Faculty of Science – Chemistry, Utrecht University, Utrecht, The Netherlands
- Alexandre M. J. J. Bonvin
  - Bijvoet Center for Biomolecular Research, Faculty of Science – Chemistry, Utrecht University, Utrecht, The Netherlands
- Ren Kong
  - Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
- Hang Shi
  - Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
- Xufeng Lu
  - Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
- Shan Chang
  - Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
- Jian Liu
  - Dept. of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Zhiye Guo
  - Dept. of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Xiao Chen
  - Dept. of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Alex Morehead
  - Dept. of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Raj S. Roy
  - Dept. of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Tianqi Wu
  - Dept. of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Nabin Giri
  - Dept. of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Farhan Quadir
  - Dept. of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Chen Chen
  - Dept. of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Jianlin Cheng
  - Dept. of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Eichiro Ichiishi
  - International University of Health and Welfare (IUHV Hospital), Nasushiobara‐City, Japan
- Luis A. Rodriguez‐Lumbreras
  - Instituto de Ciencias de la Vida y del Vino (ICVV), CSIC ‐ Universidad de La Rioja ‐ Gobierno de La Rioja, Logrono, Spain
  - Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Juan Fernandez‐Recio
  - Instituto de Ciencias de la Vida y del Vino (ICVV), CSIC ‐ Universidad de La Rioja ‐ Gobierno de La Rioja, Logrono, Spain
  - Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Ameya Harmalkar
  - Dept. of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland, USA
- Lee‐Shin Chu
  - Dept. of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland, USA
- Sam Canner
  - Dept. of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland, USA
- Rituparna Smanta
  - Dept. of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland, USA
- Jeffrey J. Gray
  - Dept. of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland, USA
  - Program in Molecular Biophysics, Johns Hopkins University, Baltimore, Maryland, USA
- Hao Li
  - School of Physics, Huazhong University of Science and Technology, Wuhan, China
- Peicong Lin
  - School of Physics, Huazhong University of Science and Technology, Wuhan, China
- Jiahua He
  - School of Physics, Huazhong University of Science and Technology, Wuhan, China
- Huanyu Tao
  - School of Physics, Huazhong University of Science and Technology, Wuhan, China
- Sheng‐You Huang
  - School of Physics, Huazhong University of Science and Technology, Wuhan, China
- Jorge Roel‐Touris
  - Protein Design and Modeling Lab, Dept. of Structural Biology, Molecular Biology Institute of Barcelona (IBMB‐CSIC), Barcelona, Spain
- Anika J. Jain
  - Dept. of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
- Yuki Kagaya
  - Dept. of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
- Harini Kannan
  - Dept. of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
  - Dept. of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
- Tsukasa Nakamura
  - Dept. of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
- Genki Terashi
  - Dept. of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
- Jacob C. Verburgt
  - Dept. of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
- Yuanyuan Zhang
  - Dept. of Computer Science, Purdue University, West Lafayette, Indiana, USA
- Zicong Zhang
  - Dept. of Computer Science, Purdue University, West Lafayette, Indiana, USA
- Hayato Fujuta
  - Dept. of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
- Daisuke Kihara
  - Dept. of Computer Science, Purdue University, West Lafayette, Indiana, USA
  - Dept. of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
- Surendra S. Negi
- Sealy Center for Structural Biology and Molecular Biophysics, University of Texas Medical Branch, Galveston, Texas, USA
- Zhen Cao
- King Abdullah University of Science and Technology (KAUST), Saudi Arabia
- Mohit Chawla
- King Abdullah University of Science and Technology (KAUST), Saudi Arabia
- Luigi Cavallo
- King Abdullah University of Science and Technology (KAUST), Saudi Arabia
- Department of Chemistry and Biology, University of Salerno, Fisciano, Italy
- Rui Yin
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland, USA
- Dept. of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland, USA
- Melyssa Cheung
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland, USA
- Dept. of Chemistry and Biochemistry, University of Maryland, College Park, Maryland, USA
- Johnathan D. Guest
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland, USA
- Dept. of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland, USA
- Jessica Lee
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland, USA
- Dept. of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland, USA
- Brian G. Pierce
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland, USA
- Dept. of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland, USA
- Ben Shor
- School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
- Tomer Cohen
- School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
- Matan Halfon
- School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
- Shaowen Zhu
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas, USA
- Rujie Yin
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas, USA
- Yuanfei Sun
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas, USA
- Yang Shen
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas, USA
- Department of Computer Science and Engineering, Texas A&M University, College Station, Texas, USA
- Institute of Biosciences and Technology and Department of Translational Medical Sciences, Texas A&M University, Houston, Texas, USA
- Yuta Miyakawa
- School of Pharmacy, Kitasato University, Minato‐ku, Tokyo, Japan
- Yasuomi Kiyota
- School of Pharmacy, Kitasato University, Minato‐ku, Tokyo, Japan
- Kliment Olechnovic
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Lukas Valancauskas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Justas Dapkunas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Ceslovas Venclovas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Bjorn Wallner
- Bioinformatics Division, Department of Physics, Chemistry, and Biology, Linköping University, Linköping, Sweden
- Lin Yang
- National Key Laboratory of Science and Technology on Advanced Composites in Special Environments, Center for Composite Materials and Structures, Harbin Institute of Technology, Harbin, China
- School of Aerospace, Mechanical and Mechatronic Engineering, The University of Sydney, New South Wales, Australia
- Chengyu Hou
- School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin, China
- Xiaodong He
- National Key Laboratory of Science and Technology on Advanced Composites in Special Environments, Center for Composite Materials and Structures, Harbin Institute of Technology, Harbin, China
- Shenzhen STRONG Advanced Materials Research Institute Co., Ltd, Shenzhen, People's Republic of China
- Shuai Guo
- National Key Laboratory of Science and Technology on Advanced Composites in Special Environments, Center for Composite Materials and Structures, Harbin Institute of Technology, Harbin, China
- Shenda Jiang
- National Key Laboratory of Science and Technology on Advanced Composites in Special Environments, Center for Composite Materials and Structures, Harbin Institute of Technology, Harbin, China
- Xiaoliang Ma
- National Key Laboratory of Science and Technology on Advanced Composites in Special Environments, Center for Composite Materials and Structures, Harbin Institute of Technology, Harbin, China
- Rui Duan
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri, USA
- Liming Qui
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri, USA
- Xianjin Xu
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri, USA
- Xiaoqin Zou
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri, USA
- Dept. of Physics and Astronomy, University of Missouri, Columbia, Missouri, USA
- Dept. of Biochemistry, University of Missouri, Columbia, Missouri, USA
- Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri, USA
- Sameer Velankar
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL‐EBI), Hinxton, Cambridge, UK
45
Tame JRH. Using symmetry to drive new protein assemblies. Nat Chem 2023; 15:1653-1654. [PMID: 38036648 DOI: 10.1038/s41557-023-01369-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/02/2023]
Affiliation(s)
- Jeremy R H Tame
- Graduate School of Medical Life Science, Yokohama City University, Tsurumi, Yokohama, Japan.
46
Lv Q, Zhou F, Liu X, Zhi L. Artificial intelligence in small molecule drug discovery from 2018 to 2023: Does it really work? Bioorg Chem 2023; 141:106894. [PMID: 37776682 DOI: 10.1016/j.bioorg.2023.106894] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/15/2023] [Revised: 09/24/2023] [Accepted: 09/25/2023] [Indexed: 10/02/2023]
Abstract
Utilizing artificial intelligence (AI) in drug design represents an advanced approach for identifying targets and developing new drugs. Integrating AI techniques significantly reduces the workload involved in drug development and enhances the efficiency of early-stage drug discovery. This review aims to present a comprehensive overview of the utilization of AI methods in the field of small-molecule drug design, with a specific focus on four key areas: protein structure prediction, molecular virtual screening, molecular design, and absorption, distribution, metabolism, excretion, and toxicity (ADMET) prediction. Additionally, the role and limitations of AI in drug development are explored, and the impact of AI on decision-making processes is studied. It is important to note that while AI can bring numerous benefits to the early stage of drug development, the direction and quality of decision-making should still be emphasized, as AI should be considered a tool rather than a decisive factor.
Affiliation(s)
- Qi Lv
- School of Pharmacy, Inflammation and Immune Mediated Diseases Laboratory of Anhui Province, Hefei 230032, PR China
- Feilong Zhou
- School of Pharmacy, Inflammation and Immune Mediated Diseases Laboratory of Anhui Province, Hefei 230032, PR China
- Xinhua Liu
- School of Pharmacy, Inflammation and Immune Mediated Diseases Laboratory of Anhui Province, Hefei 230032, PR China
- Liping Zhi
- School of Health Management, Anhui Medical University, Hefei 230032, PR China
47
Malhotra N, Khatri S, Kumar A, Arun A, Daripa P, Fatihi S, Venkadesan S, Jain N, Thukral L. AI-based AlphaFold2 significantly expands the structural space of the autophagy pathway. Autophagy 2023; 19:3201-3220. [PMID: 37516933 PMCID: PMC10621275 DOI: 10.1080/15548627.2023.2238578] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2022] [Revised: 07/08/2023] [Accepted: 07/14/2023] [Indexed: 07/31/2023] Open
Abstract
ABBREVIATIONS AF2: AlphaFold2; AF2-Mult: AlphaFold2 multimer; ATG: autophagy-related; CTD: C-terminal domain; ECTD: extreme C-terminal domain; FR: flexible region; MD: molecular dynamics; NTD: N-terminal domain; pLDDT: predicted local distance difference test; UBL: ubiquitin-like.
Affiliation(s)
- Nidhi Malhotra
- Computational Structural Biology Lab, CSIR-Institute of Genomics and Integrative Biology, New Delhi, India
- Shantanu Khatri
- Computational Structural Biology Lab, CSIR-Institute of Genomics and Integrative Biology, New Delhi, India
- Academy of Scientific and Innovative Research (AcSir), Ghaziabad, India
- Ajit Kumar
- Computational Structural Biology Lab, CSIR-Institute of Genomics and Integrative Biology, New Delhi, India
- Academy of Scientific and Innovative Research (AcSir), Ghaziabad, India
- Akanksha Arun
- Computational Structural Biology Lab, CSIR-Institute of Genomics and Integrative Biology, New Delhi, India
- Academy of Scientific and Innovative Research (AcSir), Ghaziabad, India
- Purba Daripa
- Computational Structural Biology Lab, CSIR-Institute of Genomics and Integrative Biology, New Delhi, India
- Saman Fatihi
- Computational Structural Biology Lab, CSIR-Institute of Genomics and Integrative Biology, New Delhi, India
- Academy of Scientific and Innovative Research (AcSir), Ghaziabad, India
- Niyati Jain
- Computational Structural Biology Lab, CSIR-Institute of Genomics and Integrative Biology, New Delhi, India
- Lipi Thukral
- Computational Structural Biology Lab, CSIR-Institute of Genomics and Integrative Biology, New Delhi, India
- Academy of Scientific and Innovative Research (AcSir), Ghaziabad, India
48
Oda T. Improving protein structure prediction with extended sequence similarity searches and deep-learning-based refinement in CASP15. Proteins 2023; 91:1712-1723. [PMID: 37485822 DOI: 10.1002/prot.26551] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/14/2023] [Revised: 06/23/2023] [Accepted: 06/28/2023] [Indexed: 07/25/2023]
Abstract
The human predictor team PEZYFoldings got first place with the assessor's formulae (3rd place with Global Distance Test Total Score [GDT-TS]) in the single-domain category and 10th place in the multimer category in Critical Assessment of Structure Prediction 15. In this paper, I describe the exact method used by PEZYFoldings in the competition. As AlphaFold2 and AlphaFold-Multimer, developed by DeepMind, were state-of-the-art structure prediction tools, it was assumed that enhancing the input and output of the tools was an effective strategy to obtain the highest accuracy for structure prediction. Therefore, I used additional tools and databases to collect evolutionarily related sequences and introduced a deep-learning-based model in the refinement step. In addition to these modifications, manual interventions were performed to address various tasks. Detailed analyses were performed after the competition to identify the main contributors to performance. Comparing the number of evolutionarily related sequences I used with those of the other teams that provided AlphaFold2's baseline predictions revealed that an extensive sequence similarity search was one of the main contributors. Nonetheless, there were specific targets for which I could not identify any evolutionarily related sequences, resulting in my inability to construct accurate structures for these targets. Notably, I noticed that I had gained large Z-scores with the subunits of H1137, for which I performed manual domain parsing considering the interfaces between the subunits. This finding implies that the manual intervention contributed to my performance. The influence of the refinement model on the accuracy of structure prediction was minimal. I could have predicted structures with a similar level of accuracy without employing the refinement model. However, from the perspective of accuracy self-estimate, many structures demonstrated improvement after refinement. This improvement likely had a substantial influence on improving my position in the assessor's formulae rankings. These results highlight the opportunities for improvement in (1) multimer prediction, (2) building of larger and more diverse databases, and (3) developing tools to predict structures from primary sequences alone. In addition, transferring the manual intervention process to automation is a future concern.
49
Kryshtafovych A, Rigden DJ. To split or not to split: CASP15 targets and their processing into tertiary structure evaluation units. Proteins 2023; 91:1558-1570. [PMID: 37254889 PMCID: PMC10687315 DOI: 10.1002/prot.26533] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2023] [Revised: 05/02/2023] [Accepted: 05/18/2023] [Indexed: 06/01/2023]
Abstract
Processing of CASP15 targets into evaluation units (EUs) and assigning them to evolutionary-based prediction classes is presented in this study. The targets were first split into structural domains based on compactness and similarity to other proteins. Models were then evaluated against these domains and their combinations. The domains were joined into larger EUs if predictors' performance on the combined units was similar to that on individual domains. Alternatively, if most predictors performed better on the individual domains, then they were retained as EUs. As a result, 112 evaluation units were created from 77 tertiary structure prediction targets. The EUs were assigned to four prediction classes roughly corresponding to target difficulty categories in previous CASPs: TBM (template-based modeling, easy or hard), FM (free modeling), and the TBM/FM overlap category. More than a third of CASP15 EUs were attributed to the historically most challenging FM class, where homology or structural analogy to proteins of known fold cannot be detected.
Affiliation(s)
- Daniel J. Rigden
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, England
50
Kryshtafovych A, Schwede T, Topf M, Fidelis K, Moult J. Critical assessment of methods of protein structure prediction (CASP)-Round XV. Proteins 2023; 91:1539-1549. [PMID: 37920879 PMCID: PMC10843301 DOI: 10.1002/prot.26617] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 10/05/2023] [Accepted: 10/06/2023] [Indexed: 11/04/2023]
Abstract
Computing protein structure from amino acid sequence information has been a long-standing grand challenge. Critical Assessment of Structure Prediction (CASP) conducts community experiments aimed at advancing solutions to this and related problems. Experiments are conducted every 2 years. The 2020 experiment (CASP14) saw major progress, with the second generation of deep learning methods delivering accuracy comparable with experiment for many single proteins. There is an expectation that these methods will have much wider application in computational structural biology. Here we summarize results from the most recent experiment, CASP15, in 2022, with an emphasis on new deep learning-driven progress. Other papers in this special issue of Proteins provide more detailed analysis. For single protein structures, the AlphaFold2 deep learning method is still superior to other approaches, but there are two points of note. First, although AlphaFold2 was the core of all the most successful methods, there was a wide variety of implementation and combination with other methods. Second, using the standard AlphaFold2 protocol and default parameters only produces the highest quality result for about two thirds of the targets, and more extensive sampling is required for the others. The major advance in this CASP is the enormous increase in the accuracy of computed protein complexes, achieved by the use of deep learning methods, although overall these do not fully match the performance for single proteins. Here too, AlphaFold2-based methods perform best, and again more extensive sampling than the defaults is often required. Also of note are the encouraging early results on the use of deep learning to compute ensembles of macromolecular structures. Critically for the usability of computed structures, for both single proteins and protein complexes, deep learning derived estimates of both local and global accuracy are of high quality; however, the estimates in interface regions are slightly less reliable. CASP15 also included computation of RNA structures for the first time. Here, the classical approaches produced better agreement with experiment than the new deep learning ones, and accuracy is limited. Also, for the first time, CASP included the computation of protein-ligand complexes, an area of special interest for drug design. Here too, classical methods were still superior to deep learning ones. Many new approaches were discussed at the CASP conference, and it is clear methods will continue to advance.
Affiliation(s)
- Torsten Schwede
- University of Basel, Biozentrum & SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Maya Topf
- Centre for Structural Systems Biology, Leibniz-Institut für Experimentelle Virologie and Universitätsklinikum Hamburg-Eppendorf (UKE), Hamburg, Germany
- John Moult
- Institute for Bioscience and Biotechnology Research, Rockville, MD, USA, and Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD, USA