1
Li Y, Duan Z, Li Z, Xue W. Data and AI-driven synthetic binding protein discovery. Trends Pharmacol Sci 2025; 46:132-144. [PMID: 39755458 DOI: 10.1016/j.tips.2024.12.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2024] [Revised: 12/02/2024] [Accepted: 12/06/2024] [Indexed: 01/06/2025]
Abstract
Synthetic binding proteins (SBPs) are a class of protein binders that are artificially created and do not exist naturally. Their broad applications in tackling challenges of research, diagnostics, and therapeutics have garnered significant interest. Traditional protein engineering is pivotal to the discovery of SBPs. Recently, this discovery has been significantly accelerated by computational approaches, such as molecular modeling and artificial intelligence (AI). Furthermore, while numerous bioinformatics databases offer a wealth of resources that fuel SBP discovery, the potential of these data has not yet been fully exploited. In this review, we present a comprehensive overview of the SBP data ecosystem and methodologies in SBP discovery, highlighting the critical role of high-quality data and AI technologies in accelerating the discovery of innovative SBPs with promising applications in pharmacological sciences.
Affiliation(s)
- Yanlin Li
  - School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
- Zixin Duan
  - School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
- Zhenwen Li
  - School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
- Weiwei Xue
  - School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
  - Western (Chongqing) Collaborative Innovation Center for Intelligent Diagnostics and Digital Medicine, Chongqing National Biomedicine Industry Park, Chongqing 401329, China
2
Liu Z, Zhang C, Zhang Q, Zhang Y, Yu DJ. TM-search: An Efficient and Effective Tool for Protein Structure Database Search. J Chem Inf Model 2024; 64:1043-1049. [PMID: 38270339 DOI: 10.1021/acs.jcim.3c01455] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2024]
Abstract
The quickly increasing size of the Protein Data Bank is challenging biologists to develop a more scalable protein structure alignment tool for fast structure database search. Although many protein structure search algorithms and programs have been designed and implemented for this purpose, most require a large amount of computational time. We propose a novel protein structure search approach, TM-search, which is based on the pairwise structure alignment program TM-align and a new iterative clustering algorithm. Benchmark tests demonstrate that TM-search is 27 times faster than a TM-align full database search while still being able to identify ∼90% of all high TM-score hits, which is 2-10 times more than other existing programs such as Foldseek, Dali, and PSI-BLAST.
Affiliation(s)
- Zi Liu
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, 200 Xiaolingwei, Nanjing 210094, China
  - Computer Department, Jingdezhen Ceramic University, Jingdezhen 333403, China
- Chengxin Zhang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw, Ann Arbor, Michigan 48109-2218, United States
- Qidi Zhang
  - Computer Department, Jingdezhen Ceramic University, Jingdezhen 333403, China
- Yang Zhang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw, Ann Arbor, Michigan 48109-2218, United States
- Dong-Jun Yu
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, 200 Xiaolingwei, Nanjing 210094, China
3
Schierholz L, Brown CR, Helena-Bueno K, Uversky VN, Hirt RP, Barandun J, Melnikov SV. A Conserved Ribosomal Protein Has Entirely Dissimilar Structures in Different Organisms. Mol Biol Evol 2024; 41:msad254. [PMID: 37987564 PMCID: PMC10764239 DOI: 10.1093/molbev/msad254] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2023] [Revised: 10/23/2023] [Accepted: 11/16/2023] [Indexed: 11/22/2023] Open
Abstract
Ribosomes from different species can markedly differ in their composition by including dozens of ribosomal proteins that are unique to specific lineages but absent in others. However, it remains unknown how ribosomes acquire new proteins throughout evolution. Here, to help answer this question, we describe the evolution of the ribosomal protein msL1/msL2 that was recently found in ribosomes from the parasitic microorganism clade, microsporidia. We show that this protein has a conserved location in the ribosome but entirely dissimilar structures in different organisms: in each of the analyzed species, msL1/msL2 exhibits an altered secondary structure, an inverted orientation of the N-termini and C-termini on the ribosomal binding surface, and a completely transformed 3D fold. We then show that this fold switching is likely caused by changes in the ribosomal msL1/msL2-binding site, specifically, by variations in rRNA. These observations allow us to infer an evolutionary scenario in which a small, positively charged, de novo-born unfolded protein was first captured by rRNA to become part of the ribosome and subsequently underwent complete fold switching to optimize its binding to its evolving ribosomal binding site. Overall, our work provides a striking example of how a protein can switch its fold in the context of a complex biological assembly, while retaining its specificity for its molecular partner. This finding will help us better understand the origin and evolution of new protein components of complex molecular assemblies, thereby enhancing our ability to engineer biological molecules, identify protein homologs, and peer into the history of life on Earth.
Affiliation(s)
- Léon Schierholz
  - Department of Molecular Biology, Laboratory for Molecular Infection Medicine Sweden, Umeå Centre for Microbial Research, Science for Life Laboratory, Umeå University, Umeå 901 87, Sweden
- Charlotte R Brown
  - Biosciences Institute, Newcastle University School of Medicine, Newcastle upon Tyne NE2 4HH, UK
- Karla Helena-Bueno
  - Biosciences Institute, Newcastle University School of Medicine, Newcastle upon Tyne NE2 4HH, UK
- Vladimir N Uversky
  - Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA
- Robert P Hirt
  - Biosciences Institute, Newcastle University School of Medicine, Newcastle upon Tyne NE2 4HH, UK
- Jonas Barandun
  - Department of Molecular Biology, Laboratory for Molecular Infection Medicine Sweden, Umeå Centre for Microbial Research, Science for Life Laboratory, Umeå University, Umeå 901 87, Sweden
- Sergey V Melnikov
  - Biosciences Institute, Newcastle University School of Medicine, Newcastle upon Tyne NE2 4HH, UK
4
Gollapalli P, Rudrappa S, Kumar V, Santosh Kumar HS. Domain Architecture Based Methods for Comparative Functional Genomics Toward Therapeutic Drug Target Discovery. J Mol Evol 2023; 91:598-615. [PMID: 37626222 DOI: 10.1007/s00239-023-10129-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2022] [Accepted: 08/06/2023] [Indexed: 08/27/2023]
Abstract
Novel functions arise during evolution when genes duplicate, mutate, recombine, fuse, or undergo fission to produce new genes, or when genes form de novo. Researchers have tried to quantify the causes of these molecular diversification processes to understand how such genes increase molecular complexity over time, for instance in protein domain organization. In contrast to global sequence similarity, protein domain architectures can capture key structural and functional characteristics, making them better proxies for describing functional equivalence. In prokaryotes and eukaryotes, domain designs have been shown to be retained over significant evolutionary distances. Protein domain architectures are now being utilized to categorize and distinguish evolutionarily related proteins and to find homologs among species that are evolutionarily distant from one another. Additionally, structural information stored in domain structures has accelerated homology identification and sequence search methods. Tools for functional protein annotation have been developed to identify protein domain content, domain order, domain recurrence, and domain position, as all of these contribute to the accuracy of protein function prediction. In this review, an attempt is made to summarise facts and speculations regarding the use of protein domain architecture and modularity to identify possible therapeutic targets among cellular activities based on an understanding of their linked biological processes.
Affiliation(s)
- Pavan Gollapalli
  - Center for Bioinformatics and Biostatistics, Nitte (Deemed to be University), Mangalore, Karnataka, 575018, India
- Sushmitha Rudrappa
  - Department of Biotechnology and Bioinformatics, Jnana Sahyadri Campus, Kuvempu University, Shankaraghatta, Shivamogga, Karnataka, 577451, India
- Vadlapudi Kumar
  - Department of Biochemistry, Davangere University, Shivagangothri, Davangere, Karnataka, 577007, India
- Hulikal Shivashankara Santosh Kumar
  - Department of Biotechnology and Bioinformatics, Jnana Sahyadri Campus, Kuvempu University, Shankaraghatta, Shivamogga, Karnataka, 577451, India
5
Kakoulidis P, Vlachos IS, Thanos D, Blatch GL, Emiris IZ, Anastasiadou E. Identifying and profiling structural similarities between Spike of SARS-CoV-2 and other viral or host proteins with Machaon. Commun Biol 2023; 6:752. [PMID: 37468602 PMCID: PMC10356814 DOI: 10.1038/s42003-023-05076-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2022] [Accepted: 06/26/2023] [Indexed: 07/21/2023] Open
Abstract
Using protein structure to predict function, interactions, and evolutionary history is still an open challenge, with existing approaches relying extensively on protein homology and families. Here, we present Machaon, a data-driven method combining orientation-invariant metrics on phi-psi angles, inter-residue contacts and surface complexity. It can be readily applied to whole structures or segments, such as domains and binding sites. Machaon was applied to SARS-CoV-2 Spike monomers of native, Delta and Omicron variants and identified correlations with a wide range of viral proteins from close to distant taxonomy ranks, as well as host proteins, such as the ACE2 receptor. Machaon's meta-analysis of the results highlights structural, chemical and transcriptional similarities between the Spike monomer and human proteins, indicating a multi-level viral mimicry. This extended analysis also revealed relationships of the Spike protein with biological processes such as ubiquitination and angiogenesis and highlighted different patterns in virus attachment among the studied variants. Available at: https://machaonweb.com
Affiliation(s)
- Panos Kakoulidis
  - Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, Ilisia, 157 84, Athens, Greece
  - Biomedical Research Foundation of the Academy of Athens, 4 Soranou Ephessiou St., 115 27, Athens, Greece
- Ioannis S Vlachos
  - Broad Institute of MIT and Harvard, Merkin Building, 415 Main St., Cambridge, MA, 02142, USA
  - Cancer Research Institute, Beth Israel Deaconess Medical Center, 330 Brookline Avenue, Boston, MA, 02215, USA
  - Department of Pathology, Beth Israel Deaconess Medical Center, 330 Brookline Avenue, Boston, MA, 02215, USA
  - Harvard Medical School, 25 Shattuck Street, Boston, MA, 02115, USA
  - Spatial Technologies Unit, Harvard Medical School Initiative for RNA Medicine, Dana Building, Beth Israel Deaconess Medical Center, 330 Brookline Avenue, Boston, MA, 02215, USA
- Dimitris Thanos
  - Biomedical Research Foundation of the Academy of Athens, 4 Soranou Ephessiou St., 115 27, Athens, Greece
- Gregory L Blatch
  - Biomedical Biotechnology Research Unit, Department of Biochemistry and Microbiology, Rhodes University, PO Box 94, Makhanda (Grahamstown) 6140, Eastern Cape, South Africa
  - Biomedical and Drug Discovery Research Group, Faculty of Health Sciences, Higher Colleges of Technology, PO 25026, Sharjah, UAE
  - Institute for Health and Sport, Victoria University, Melbourne, PO Box 14428, VIC 8001, Melbourne, Australia
  - The Vice Chancellery, The University of Notre Dame Australia, PO Box 1225, WA 6959, Fremantle, Australia
- Ioannis Z Emiris
  - Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, Ilisia, 157 84, Athens, Greece
  - ATHENA Research and Innovation Center, Artemidos 6 & Epidavrou 15125, Marousi, Greece
- Ema Anastasiadou
  - Biomedical Research Foundation of the Academy of Athens, 4 Soranou Ephessiou St., 115 27, Athens, Greece
6
Wannitikul P, Wattana-Amorn P, Sathitnaitham S, Sakulkoo J, Suttangkakul A, Wonnapinij P, Bassel GW, Simister R, Gomez LD, Vuttipongchaikij S. Disruption of a DUF247 Containing Protein Alters Cell Wall Polysaccharides and Reduces Growth in Arabidopsis. Plants (Basel, Switzerland) 2023; 12:1977. [PMID: 37653894 PMCID: PMC10221614 DOI: 10.3390/plants12101977] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/13/2023] [Revised: 05/09/2023] [Accepted: 05/10/2023] [Indexed: 09/02/2023]
Abstract
Plant cell wall biosynthesis is a complex process that requires proteins and enzymes from glycan synthesis to wall assembly. We show that disruption of At3g50120 (DUF247-1), a member of the DUF247 multigene family containing 28 genes in Arabidopsis, results in alterations to the structure and composition of cell wall polysaccharides and reduced growth and plant size. An ELISA using cell wall antibodies shows that the mutants also exhibit ~50% reductions in xyloglucan (XyG), glucuronoxylan (GX) and heteromannan (HM) epitopes in the NaOH fraction and ~50% increases in homogalacturonan (HG) epitopes in the CDTA fraction. Furthermore, the polymer sizes of XyGs and GXs are reduced with concomitant increases in short-chain polymers, while those of HGs and mHGs are slightly increased. Complementation using 35S:DUF247-1 partially recovers the XyG and HG content, but not those of GX and HM, suggesting that DUF247-1 is more closely associated with XyGs and HGs. DUF247-1 is expressed throughout Arabidopsis, particularly in vascular and developing tissues, and its disruption affects the expression of other gene members, indicating a regulatory control role within the gene family. Our results demonstrate that DUF247-1 is required for normal cell wall composition and structure and Arabidopsis growth.
Affiliation(s)
- Pitchaporn Wannitikul
  - Department of Genetics, Faculty of Science, Kasetsart University, 50 Ngam Wong Wan Road, Chattuchak, Bangkok 10900, Thailand
- Pakorn Wattana-Amorn
  - Special Research Unit for Advanced Magnetic Resonance and Center of Excellence for Innovation in Chemistry, Department of Chemistry, Faculty of Science, Kasetsart University, Ngam Wong Wan Road, Chattuchak, Bangkok 10900, Thailand
- Sukhita Sathitnaitham
  - Department of Genetics, Faculty of Science, Kasetsart University, 50 Ngam Wong Wan Road, Chattuchak, Bangkok 10900, Thailand
- Jenjira Sakulkoo
  - Department of Genetics, Faculty of Science, Kasetsart University, 50 Ngam Wong Wan Road, Chattuchak, Bangkok 10900, Thailand
- Anongpat Suttangkakul
  - Department of Genetics, Faculty of Science, Kasetsart University, 50 Ngam Wong Wan Road, Chattuchak, Bangkok 10900, Thailand
  - Center of Advanced studies for Tropical Natural Resources, Kasetsart University, Ngam Wong Wan Road, Chattuchak, Bangkok 10900, Thailand
- Passorn Wonnapinij
  - Department of Genetics, Faculty of Science, Kasetsart University, 50 Ngam Wong Wan Road, Chattuchak, Bangkok 10900, Thailand
  - Center of Advanced studies for Tropical Natural Resources, Kasetsart University, Ngam Wong Wan Road, Chattuchak, Bangkok 10900, Thailand
  - Omics Center for Agriculture, Bioresources, Food and Health, Kasetsart University (OmiKU), Bangkok 10900, Thailand
- George W. Bassel
  - School of Life Sciences, The University of Warwick, Coventry CV4 7AL, UK
- Rachael Simister
  - CNAP, Department of Biology, University of York, Heslington, York YO10 5DD, UK
- Leonardo D. Gomez
  - CNAP, Department of Biology, University of York, Heslington, York YO10 5DD, UK
- Supachai Vuttipongchaikij
  - Department of Genetics, Faculty of Science, Kasetsart University, 50 Ngam Wong Wan Road, Chattuchak, Bangkok 10900, Thailand
  - Center of Advanced studies for Tropical Natural Resources, Kasetsart University, Ngam Wong Wan Road, Chattuchak, Bangkok 10900, Thailand
  - Omics Center for Agriculture, Bioresources, Food and Health, Kasetsart University (OmiKU), Bangkok 10900, Thailand
7
Nawaz MS, Fournier-Viger P, He Y, Zhang Q. PSAC-PDB: Analysis and classification of protein structures. Comput Biol Med 2023; 158:106814. [PMID: 36989742 DOI: 10.1016/j.compbiomed.2023.106814] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 11/08/2022] [Revised: 03/09/2023] [Accepted: 03/20/2023] [Indexed: 03/29/2023]
Abstract
This paper presents a novel framework, called PSAC-PDB, for analyzing and classifying protein structures from the Protein Data Bank (PDB). PSAC-PDB first finds, analyzes, and identifies protein structures in PDB that are similar to a protein structure of interest using a protein structure comparison tool. Second, the amino acid (AA) sequences of the identified protein structures (obtained from PDB), their aligned amino acids (AAA) and aligned secondary structure elements (ASSE) (obtained by structural alignment), and frequent AA (FAA) patterns (discovered by sequential pattern mining) are used for the reliable detection/classification of protein structures. Eleven classifiers are used and their performance is compared using six evaluation metrics. Results show that three classifiers perform well overall, and that FAA patterns can be used to efficiently classify protein structures in place of providing the whole AA sequences, AAA or ASSE. Furthermore, better classification results are obtained using AAA of protein structures rather than AA sequences. PSAC-PDB also performed better than state-of-the-art approaches for SARS-CoV-2 genome sequence classification.
8
Aderinwale T, Bharadwaj V, Christoffer C, Terashi G, Zhang Z, Jahandideh R, Kagaya Y, Kihara D. Real-time structure search and structure classification for AlphaFold protein models. Commun Biol 2022; 5:316. [PMID: 35383281 PMCID: PMC8983703 DOI: 10.1038/s42003-022-03261-8] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 11/09/2021] [Accepted: 03/11/2022] [Indexed: 11/17/2022] Open
Abstract
Last year saw a breakthrough in protein structure prediction, where the AlphaFold2 method showed a substantial improvement in the modeling accuracy. Following the software release of AlphaFold2, predicted structures by AlphaFold2 for proteins in 21 species were made publicly available via the AlphaFold Database. Here, to facilitate structural analysis and application of AlphaFold2 models, we provide the infrastructure, 3D-AF-Surfer, which allows real-time structure-based search for the AlphaFold2 models. In 3D-AF-Surfer, structures are represented with 3D Zernike descriptors (3DZD), which is a rotationally invariant, mathematical representation of 3D shapes. We developed a neural network that takes 3DZDs of proteins as input and retrieves proteins of the same fold more accurately than direct comparison of 3DZDs. Using 3D-AF-Surfer, we report structure classifications of AlphaFold2 models and discuss the correlation between confidence levels of AlphaFold2 models and intrinsic disordered regions.
Affiliation(s)
- Tunde Aderinwale
  - Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Vijay Bharadwaj
  - Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Charles Christoffer
  - Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Genki Terashi
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
- Zicong Zhang
  - Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Yuki Kagaya
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
- Daisuke Kihara
  - Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
9
Yadav NS, Kumar P, Singh I. Structural and functional analysis of protein. Bioinformatics 2022. [DOI: 10.1016/b978-0-323-89775-4.00026-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022] Open
10
In silico Characterization of Biofilm-Associated Protein (Bap) Identified in a Multi-drug Resistant Acinetobacter baumannii Clinical Isolate. Journal of Medical Microbiology and Infectious Diseases 2021. [DOI: 10.52547/jommid.9.4.210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022] Open
11
Beaudoin CA, Jamasb AR, Alsulami AF, Copoiu L, van Tonder AJ, Hala S, Bannerman BP, Thomas SE, Vedithi SC, Torres PH, Blundell TL. Predicted structural mimicry of spike receptor-binding motifs from highly pathogenic human coronaviruses. Comput Struct Biotechnol J 2021; 19:3938-3953. [PMID: 34234921 PMCID: PMC8249111 DOI: 10.1016/j.csbj.2021.06.041] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 05/19/2021] [Revised: 06/26/2021] [Accepted: 06/27/2021] [Indexed: 12/19/2022] Open
Abstract
Potential coronavirus spike protein mimicry revealed by structural comparison. Human and non-human protein potential interactions with virus identified. Predicted structural mimicry corroborated by protein–protein docking. Epitope-based alignments may help guide vaccine efforts.
Viruses often encode proteins that mimic host proteins in order to facilitate infection. Little work has been done to understand the potential mimicry of the SARS-CoV-2, SARS-CoV, and MERS-CoV spike proteins, particularly the receptor-binding motifs, which could be important in determining tropism and druggability of the virus. Peptide and epitope motifs have been detected on coronavirus spike proteins using sequence homology approaches; however, comparing the three-dimensional shape of the protein has been shown as more informative in predicting mimicry than sequence-based comparisons. Here, we use structural bioinformatics software to characterize potential mimicry of the three coronavirus spike protein receptor-binding motifs. We utilize sequence-independent alignment tools to compare structurally known protein models with the receptor-binding motifs and verify potential mimicked interactions with protein docking simulations. Both human and non-human proteins were returned for all three receptor-binding motifs. For example, all three were similar to several proteins containing EGF-like domains: some of which are endogenous to humans, such as thrombomodulin, and others exogenous, such as Plasmodium falciparum MSP-1. Similarity to human proteins may reveal which pathways the spike protein is co-opting, while analogous non-human proteins may indicate shared host interaction partners and overlapping antibody cross-reactivity. These findings can help guide experimental efforts to further understand potential interactions between human and coronavirus proteins.
Affiliation(s)
- Christopher A. Beaudoin
  - Department of Biochemistry, Sanger Building, University of Cambridge, Tennis Court Rd, Cambridge CB2 1GA, United Kingdom
  - Corresponding authors.
- Arian R. Jamasb
  - Department of Biochemistry, Sanger Building, University of Cambridge, Tennis Court Rd, Cambridge CB2 1GA, United Kingdom
  - Department of Computer Science & Technology, University of Cambridge, JJ Thomson Ave, Cambridge CB3 0FD, United Kingdom
- Ali F. Alsulami
  - Department of Biochemistry, Sanger Building, University of Cambridge, Tennis Court Rd, Cambridge CB2 1GA, United Kingdom
- Liviu Copoiu
  - Department of Biochemistry, Sanger Building, University of Cambridge, Tennis Court Rd, Cambridge CB2 1GA, United Kingdom
- Andries J. van Tonder
  - Department of Veterinary Medicine, University of Cambridge, Madingley Rd, Cambridge CB3 0ES, United Kingdom
- Sharif Hala
  - King Abdullah International Medical Research Centre – Ministry of National Guard Health Affairs, Jeddah, Saudi Arabia
  - King Saud bin Abdulaziz University for Health Sciences, Jeddah, Saudi Arabia
- Bridget P. Bannerman
  - Department of Biochemistry, Sanger Building, University of Cambridge, Tennis Court Rd, Cambridge CB2 1GA, United Kingdom
- Sherine E. Thomas
  - Department of Biochemistry, Sanger Building, University of Cambridge, Tennis Court Rd, Cambridge CB2 1GA, United Kingdom
- Sundeep Chaitanya Vedithi
  - Department of Biochemistry, Sanger Building, University of Cambridge, Tennis Court Rd, Cambridge CB2 1GA, United Kingdom
- Pedro H.M. Torres
  - Laboratório de Modelagem e Dinâmica Molecular, Instituto de Biofísica Carlos Chagas Filho, Universidade Federal do Rio de Janeiro, Rio de Janeiro, RJ, Brazil
- Tom L. Blundell
  - Department of Biochemistry, Sanger Building, University of Cambridge, Tennis Court Rd, Cambridge CB2 1GA, United Kingdom
  - Corresponding authors.
12
Queirós P, Delogu F, Hickl O, May P, Wilmes P. Mantis: flexible and consensus-driven genome annotation. Gigascience 2021; 10:giab042. [PMID: 34076241 PMCID: PMC8170692 DOI: 10.1093/gigascience/giab042] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 11/04/2020] [Revised: 03/22/2021] [Accepted: 05/14/2021] [Indexed: 12/22/2022] Open
Abstract
BACKGROUND: The rapid development of the (meta-)omics fields has produced an unprecedented amount of high-resolution and high-fidelity data. Through the use of these datasets we can infer the role of previously functionally unannotated proteins from single organisms and consortia. In this context, protein function annotation can be described as the identification of regions of interest (i.e., domains) in protein sequences and the assignment of biological functions. Despite the existence of numerous tools, challenges remain in terms of speed, flexibility, and reproducibility. In the big data era, it is also increasingly important to cease limiting our findings to a single reference, coalescing knowledge from different data sources, and thus overcoming some limitations in overly relying on computationally generated data from single sources.
RESULTS: We implemented a protein annotation tool, Mantis, which uses database identifiers intersection and text mining to integrate knowledge from multiple reference data sources into a single consensus-driven output. Mantis is flexible, allowing for the customization of reference data and execution parameters, and is reproducible across different research goals and user environments. We implemented a depth-first search algorithm for domain-specific annotation, which significantly improved annotation performance compared to sequence-wide annotation. The parallelized implementation of Mantis results in short runtimes while also outputting high coverage and high-quality protein function annotations.
CONCLUSIONS: Mantis is a protein function annotation tool that produces high-quality consensus-driven protein annotations. It is easy to set up, customize, and use, scaling from single genomes to large metagenomes. Mantis is available under the MIT license at https://github.com/PedroMTQ/mantis.
Affiliation(s)
- Pedro Queirós
  - Systems Ecology, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6 Avenue du Swing, 4367 Esch-sur-Alzette, Luxembourg
- Francesco Delogu
  - Systems Ecology, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6 Avenue du Swing, 4367 Esch-sur-Alzette, Luxembourg
- Oskar Hickl
  - Bioinformatics Core, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6 Avenue du Swing, 4367 Esch-sur-Alzette, Luxembourg
- Patrick May
  - Bioinformatics Core, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6 Avenue du Swing, 4367 Esch-sur-Alzette, Luxembourg
- Paul Wilmes
  - Systems Ecology, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6 Avenue du Swing, 4367 Esch-sur-Alzette, Luxembourg
13
Wegrzyn K, Zabrocka E, Bury K, Tomiczek B, Wieczor M, Czub J, Uciechowska U, Moreno-Del Alamo M, Walkow U, Grochowina I, Dutkiewicz R, Bujnicki JM, Giraldo R, Konieczny I. Defining a novel domain that provides an essential contribution to site-specific interaction of Rep protein with DNA. Nucleic Acids Res 2021; 49:3394-3408. [PMID: 33660784 PMCID: PMC8034659 DOI: 10.1093/nar/gkab113] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 12/21/2020] [Revised: 02/04/2021] [Accepted: 02/10/2021] [Indexed: 12/24/2022] Open
Abstract
An essential feature of replication initiation proteins is their ability to bind to DNA. In this work, we describe a new domain that contributes to a replication initiator sequence-specific interaction with DNA. Applying biochemical assays and structure prediction methods coupled with DNA–protein crosslinking, mass spectrometry, and construction and analysis of mutant proteins, we identified that the replication initiator of the broad host range plasmid RK2, in addition to two winged helix domains, contains a third DNA-binding domain. The phylogenetic analysis revealed that the composition of this unique domain is typical within the described TrfA-like protein family. Both in vitro and in vivo experiments involving the constructed TrfA mutant proteins showed that the newly identified domain is essential for the formation of the protein complex with DNA, contributes to the avidity for interaction with DNA, and the replication activity of the initiator. The analysis of mutant proteins, each containing a single substitution, showed that each of the three domains composing TrfA is essential for the formation of the protein complex with DNA. Furthermore, the new domain, along with the winged helix domains, contributes to the sequence specificity of replication initiator interaction within the plasmid replication origin.
Affiliation(s)
- Katarzyna Wegrzyn
- Intercollegiate Faculty of Biotechnology of University of Gdansk and Medical University of Gdansk, University of Gdansk, Abrahama 58, 80-307 Gdansk, Poland
| | - Elzbieta Zabrocka
- Intercollegiate Faculty of Biotechnology of University of Gdansk and Medical University of Gdansk, University of Gdansk, Abrahama 58, 80-307 Gdansk, Poland
| | - Katarzyna Bury
- Intercollegiate Faculty of Biotechnology of University of Gdansk and Medical University of Gdansk, University of Gdansk, Abrahama 58, 80-307 Gdansk, Poland
| | - Bartlomiej Tomiczek
- Intercollegiate Faculty of Biotechnology of University of Gdansk and Medical University of Gdansk, University of Gdansk, Abrahama 58, 80-307 Gdansk, Poland
| | - Milosz Wieczor
- Department of Physical Chemistry, Gdańsk University of Technology, Narutowicza 11/12, 80-233 Gdańsk, Poland
| | - Jacek Czub
- Department of Physical Chemistry, Gdańsk University of Technology, Narutowicza 11/12, 80-233 Gdańsk, Poland
| | - Urszula Uciechowska
- Intercollegiate Faculty of Biotechnology of University of Gdansk and Medical University of Gdansk, University of Gdansk, Abrahama 58, 80-307 Gdansk, Poland
| | - María Moreno-Del Alamo
- Department of Cellular and Molecular Biology, Centro de Investigaciones Biológicas - CSIC, E28040 Madrid, Spain
| | - Urszula Walkow
- Intercollegiate Faculty of Biotechnology of University of Gdansk and Medical University of Gdansk, University of Gdansk, Abrahama 58, 80-307 Gdansk, Poland
| | - Igor Grochowina
- Intercollegiate Faculty of Biotechnology of University of Gdansk and Medical University of Gdansk, University of Gdansk, Abrahama 58, 80-307 Gdansk, Poland
| | - Rafal Dutkiewicz
- Intercollegiate Faculty of Biotechnology of University of Gdansk and Medical University of Gdansk, University of Gdansk, Abrahama 58, 80-307 Gdansk, Poland
| | - Janusz M Bujnicki
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Księcia Trojdena 4, 02-109 Warsaw, Poland; Institute of Molecular Biology and Biotechnology, Adam Mickiewicz University, Umultowska 89, 61-614 Poznan, Poland
| | - Rafael Giraldo
- Department of Cellular and Molecular Biology, Centro de Investigaciones Biológicas - CSIC, E28040 Madrid, Spain
| | - Igor Konieczny
- Intercollegiate Faculty of Biotechnology of University of Gdansk and Medical University of Gdansk, University of Gdansk, Abrahama 58, 80-307 Gdansk, Poland
|
14
|
Scarborough AM, Flaherty JN, Hunter OV, Liu K, Kumar A, Xing C, Tu BP, Conrad NK. SAM homeostasis is regulated by CFIm-mediated splicing of MAT2A. eLife 2021; 10:e64930. [PMID: 33949310 PMCID: PMC8139829 DOI: 10.7554/elife.64930] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 11/16/2020] [Accepted: 05/03/2021] [Indexed: 12/14/2022]
Abstract
S-adenosylmethionine (SAM) is the methyl donor for nearly all cellular methylation events. Cells regulate intracellular SAM levels through intron detention of MAT2A, the only SAM synthetase expressed in most cells. The N6-adenosine methyltransferase METTL16 promotes splicing of the MAT2A detained intron by an unknown mechanism. Using an unbiased CRISPR knock-out screen, we identified CFIm25 (NUDT21) as a regulator of MAT2A intron detention and intracellular SAM levels. CFIm25 is a component of the cleavage factor Im (CFIm) complex that regulates poly(A) site selection, but we show it promotes MAT2A splicing independent of poly(A) site selection. CFIm25-mediated MAT2A splicing induction requires the RS domains of its binding partners, CFIm68 and CFIm59, as well as binding sites in the detained intron and 3′ UTR. These studies uncover mechanisms that regulate MAT2A intron detention and reveal a previously undescribed role for CFIm in splicing and SAM metabolism.
Affiliation(s)
- Anna M Scarborough
- Department of Microbiology, UT Southwestern Medical Center, Dallas, United States
| | - Juliana N Flaherty
- Department of Microbiology, UT Southwestern Medical Center, Dallas, United States
| | - Olga V Hunter
- Department of Microbiology, UT Southwestern Medical Center, Dallas, United States
| | - Kuanqing Liu
- Department of Biochemistry, UT Southwestern Medical Center, Dallas, United States
| | - Ashwani Kumar
- Eugene McDermott Center for Human Growth and Development, UT Southwestern Medical Center, Dallas, United States
| | - Chao Xing
- Eugene McDermott Center for Human Growth and Development, UT Southwestern Medical Center, Dallas, United States
- Department of Bioinformatics, UT Southwestern Medical Center, Dallas, United States
- Department of Population and Data Sciences, UT Southwestern Medical Center, Dallas, United States
| | - Benjamin P Tu
- Department of Biochemistry, UT Southwestern Medical Center, Dallas, United States
| | - Nicholas K Conrad
- Department of Microbiology, UT Southwestern Medical Center, Dallas, United States
|
15
|
Stereochemical Trajectories of a Two-Component Regulatory System PmrA/B in a Colistin-Resistant Acinetobacter baumannii Clinical Isolate. IRANIAN BIOMEDICAL JOURNAL 2021. [PMID: 33653023 PMCID: PMC8183390 DOI: 10.52547/ibj.25.3.193] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/24/2022]
Abstract
Background: There is limited information on the 3D prediction and modeling of the colistin resistance-associated proteins of the PmrA/B two-component system (TCS) in Acinetobacter baumannii. We aimed to evaluate the stereochemical structure and domain characterization of PmrA/B in an A. baumannii isolate resistant to high-level colistin, using bioinformatics tools. Methods: The species of the isolate and its susceptibility to colistin were confirmed by PCR-sequencing and MIC assay, respectively. For 3D prediction of the PmrA/B, we used 16 template models with the highest quality (e-value < 1 × 10⁻⁵⁰). Results: Prediction of the PmrA structure revealed a monomeric non-redundant protein consisting of 28 α-helices and 22 β-sheets. The PmrA DNA-binding motif displayed three antiparallel α-helices, followed by three β-sheets, and was bound to the major groove of DNA by intermolecular van der Waals bonds through the amino acids Lys, Asp, His, and Arg. Superimposition of the deduced PmrA 3D structure with the closely related PmrA protein model (GenBank no. WP_071210493.1) revealed no conformational distortion resulting from the Glu→Lys substitution at position 218. Similarly, the PmrB protein structure displayed 24 α-helices and 13 β-sheets. In our case, His251 acted as a phosphate receptor in the HisKA domain. The amino acid substitutions were mainly observed at the putative N-terminus region of the protein. Furthermore, two substitutions (Lys21→Ser and Ser28→Arg) in the transmembrane domain were detected. Conclusion: The DNA-binding motif of PmrA is highly conserved, though the N-terminal fragment of PmrB showed a high rate of base substitutions. This research provides valuable insights into the mechanism of colistin resistance in A. baumannii.
|
16
|
Alsadat Mahmoudian R, Lotfi Gharaie M, Abbaszadegan R, Forghanifard MM, Abbaszadegan MR. Interaction between LINC-ROR and Stemness State in Gastric Cancer Cells with Helicobacter pylori Infection. IRANIAN BIOMEDICAL JOURNAL 2021; 25:157-168. [PMID: 33745265 PMCID: PMC8183384 DOI: 10.29252/ibj.25.3.157] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/04/2020] [Accepted: 09/13/2020] [Indexed: 02/05/2023]
Abstract
BACKGROUND Large intergenic non-coding RNA regulator of reprogramming (LINC-ROR), a cancer-related long non-coding RNA, has vital roles in stem cell survival, pluripotency, differentiation, and self-renewal in human embryonic stem cells. However, the cancer-related molecular mechanisms, functional roles, and clinical value of LINC-ROR in gastric cancer (GC) remain unclear. In this study, we aimed to investigate the probable interplay between LINC-ROR and the SALL4 stemness regulator and their role in the development of the disease. METHODS The mRNA expression profile of LINC-ROR and SALL4 was assessed in tumoral and adjacent non-cancerous tissues of GC patients, using quantitative real-time PCR. RESULTS Significant LINC-ROR underexpression and SALL4 overexpression were observed in 55.81% and 75.58% (p < 0.0001) of samples, respectively. The expression levels of LINC-ROR and SALL4 were significantly correlated with each other (p = 0.044). There was an association between the underexpression of LINC-ROR and sex, stage of tumor progression, tumor type, and location of tumor (p < 0.05), and between Helicobacter pylori infection and SALL4 expression (p = 0.036). There were also significant correlations between concomitant mRNA expression of SALL4 and LINC-ROR in distal noncardiac tumors, tumors positive for H. pylori infection, tumors with invasion into the muscle layer of the stomach, and grade II tumors (p < 0.05). CONCLUSION The clinical results of the SALL4-LINC-ROR association suggest a probable functional interaction between these markers in tumor maintenance and aggressiveness. Our study can help to understand one of the mechanisms involved in the progression of gastric cancer through the function of these regulators.
Affiliation(s)
| | - Maryam Lotfi Gharaie
- Immunology Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Division of Physiology, Department of Basic Science, Faculty of Veterinary Medicine, Ferdowsi University of Mashhad, Mashhad, Iran
| | - Roya Abbaszadegan
- Immunology Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
|
17
|
Dutta A, Batish M, Parashar V. Structural basis of KdpD histidine kinase binding to the second messenger c-di-AMP. J Biol Chem 2021; 296:100771. [PMID: 33989637 PMCID: PMC8214093 DOI: 10.1016/j.jbc.2021.100771] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 01/22/2021] [Revised: 05/03/2021] [Accepted: 05/07/2021] [Indexed: 11/17/2022]
Abstract
The KdpDE two-component system regulates potassium homeostasis and virulence in various bacterial species. The KdpD histidine kinases (HK) of this system contain a universal stress protein (USP) domain which binds to the second messenger cyclic-di-adenosine monophosphate (c-di-AMP) for regulating transcriptional output from this two-component system in Firmicutes such as Staphylococcus aureus. However, the structural basis of c-di-AMP specificity within the KdpD-USP domain is not well understood. Here, we resolved a 2.3 Å crystal structure of the S. aureus KdpD-USP domain (USPSa) complexed with c-di-AMP. Binding affinity analyses of USPSa mutants targeting the observed USPSa:c-di-AMP structural interface enabled the identification of the sequence residues that are required for c-di-AMP specificity. Based on the conservation of these residues in other Firmicutes, we identified the binding motif, (A/G/C)XSXSX2N(Y/F), which allowed us to predict c-di-AMP binding in other KdpD HKs. Furthermore, we found that the USPSa domain contains structural features distinct from the canonical standalone USPs that bind ATP as a preferred ligand. These features include inward-facing conformations of its β1-α1 and β4-α4 loops, a short α2 helix, the absence of a triphosphate-binding Walker A motif, and a unique dual phospho-ligand binding mode. It is therefore likely that USPSa-like domains in KdpD HKs represent a novel subfamily of the USPs.
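As a rough illustration of how a motif such as (A/G/C)XSXSX2N(Y/F) could be used to screen other sequences, the sketch below translates it into a regular expression. It assumes that "X" stands for any single residue and that "X2" stands for two consecutive residues; the function name and the example sequence are hypothetical and are not taken from the study.

```python
import re
from typing import List, Tuple

# Assumption: "X" = any one residue, "X2" = any two consecutive residues.
# (A/G/C) X S X S X2 N (Y/F)  ->  [AGC] . S . S .. N [YF]
KDPD_USP_MOTIF = re.compile(r"[AGC][A-Z]S[A-Z]S[A-Z]{2}N[YF]")


def find_motif(sequence: str) -> List[Tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping motif hit."""
    return [(m.start(), m.group()) for m in KDPD_USP_MOTIF.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical sequence made up for illustration; not a real KdpD sequence.
    example = "MKLAVSTSDENYQQ"
    print(find_motif(example))
```

A sequence hit of this kind would only flag a candidate; the abstract bases its prediction on residue conservation together with structural and binding-affinity data.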
Affiliation(s)
- Anirudha Dutta
- Department of Medical and Molecular Sciences, University of Delaware, Newark, Delaware, USA
| | - Mona Batish
- Department of Medical and Molecular Sciences, University of Delaware, Newark, Delaware, USA
| | - Vijay Parashar
- Department of Medical and Molecular Sciences, University of Delaware, Newark, Delaware, USA.
|