1. Monzon AM, Arrías PN, Elofsson A, Mier P, Andrade-Navarro MA, Bevilacqua M, Clementel D, Bateman A, Hirsh L, Fornasari MS, Parisi G, Piovesan D, Kajava AV, Tosatto SCE. A STRP-ed definition of Structured Tandem Repeats in Proteins. J Struct Biol 2023; 215:108023. [PMID: 37652396 DOI: 10.1016/j.jsb.2023.108023] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/29/2023] [Revised: 07/31/2023] [Accepted: 08/28/2023] [Indexed: 09/02/2023]
Abstract
Tandem Repeat Proteins (TRPs) are a class of proteins with repetitive amino acid sequences that have been studied extensively for over two decades. Different features at the level of sequence, structure, function and evolution have been attributed to them by various authors. And yet many of their salient features appear only when looking at specific subclasses of protein tandem repeats. Here, we attempt to rationalize the existing knowledge on TRPs by pointing out several dichotomies. The emerging picture is more nuanced than generally assumed and allows us to draw some boundaries of what is not a "proper" TRP. We conclude with an operational definition of a specific subset, which we have denominated STRPs (Structural Tandem Repeat Proteins), which separates a subclass of tandem repeats with distinctive features from several other less well-defined types of repeats. We believe that this definition will help researchers in the field to better characterize the biological meaning of this large yet largely understudied group of proteins.
Affiliation(s)
- Alexander Miguel Monzon
  - Dept. of Information Engineering, University of Padova, via Giovanni Gradenigo 6/B, 35131 Padova, Italy
- Paula Nazarena Arrías
  - Dept. of Biomedical Sciences, University of Padova, via U. Bassi 58/b, 35121 Padova, Italy
- Arne Elofsson
  - Dept. of Biochemistry and Biophysics and Science for Life Laboratory, Stockholm University, Tomtebodavägen 23, 171 21 Solna, Sweden
- Pablo Mier
  - Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University of Mainz, Hanns-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
- Miguel A Andrade-Navarro
  - Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University of Mainz, Hanns-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
- Martina Bevilacqua
  - Dept. of Biomedical Sciences, University of Padova, via U. Bassi 58/b, 35121 Padova, Italy
- Damiano Clementel
  - Dept. of Biomedical Sciences, University of Padova, via U. Bassi 58/b, 35121 Padova, Italy
- Alex Bateman
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Layla Hirsh
  - Dept. of Engineering, Faculty of Science and Engineering, Pontifical Catholic University of Peru, Av. Universitaria 1801 San Miguel, Lima 32, Lima, Peru
- Maria Silvina Fornasari
  - Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, CONICET, Bernal, Buenos Aires, Argentina
- Gustavo Parisi
  - Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, CONICET, Bernal, Buenos Aires, Argentina
- Damiano Piovesan
  - Dept. of Biomedical Sciences, University of Padova, via U. Bassi 58/b, 35121 Padova, Italy
- Andrey V Kajava
  - Centre de Recherche en Biologie cellulaire de Montpellier (CRBM), UMR 5237 CNRS, Université Montpellier, 1919 Route de Mende, Cedex 5, 34293 Montpellier, France
- Silvio C E Tosatto
  - Dept. of Biomedical Sciences, University of Padova, via U. Bassi 58/b, 35121 Padova, Italy
2. López-Luis MA, Soriano-Pérez EE, Parada-Fabián JC, Torres J, Maldonado-Rodríguez R, Méndez-Tenorio A. A Proposal for a Consolidated Structural Model of the CagY Protein of Helicobacter pylori. Int J Mol Sci 2023; 24:16781. [PMID: 38069104 PMCID: PMC10706595 DOI: 10.3390/ijms242316781] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Revised: 11/17/2023] [Accepted: 11/22/2023] [Indexed: 12/18/2023] Open
Abstract
CagY is the largest and most complex protein from Helicobacter pylori's (Hp) type IV secretion system (T4SS), playing a critical role in the modulation of gastric inflammation and risk for gastric cancer. CagY spans from the inner to the outer membrane, forming a channel through which Hp molecules are injected into human gastric cells. Yet, a three-dimensional structure has been reported for only short segments of the protein. This intricate protein was modeled using different approaches, including homology modeling, ab initio, and deep learning techniques. The challengingly long middle repeat region (MRR) was modeled using deep learning and optimized using equilibrium molecular dynamics. The previously modeled segments were assembled into a 1595 aa chain and a 14-chain CagY multimer structure was assembled by structural alignment. The final structure correlated with published structures and allowed us to show how the multimer may form the T4SS channel through which CagA and other molecules are translocated to gastric cells. The model confirmed that MRR, the most polymorphic and complex region of CagY, presents numerous cysteine residues forming disulfide bonds that stabilize the protein and suggest this domain may function as a contractile region playing an essential role in the modulating activity of CagY on tissue inflammation.
Affiliation(s)
- Mario Angel López-Luis
  - Laboratorio de Biotecnología y Bioinformática Genómica, Departamento de Bioquímica, Escuela Nacional de Ciencias Biológicas, Instituto Politécnico Nacional, Campus Lázaro Cárdenas, Mexico City 11340, Mexico; (M.A.L.-L.); (E.E.S.-P.); (J.C.P.-F.); (R.M.-R.)
- Eva Elda Soriano-Pérez
  - Laboratorio de Biotecnología y Bioinformática Genómica, Departamento de Bioquímica, Escuela Nacional de Ciencias Biológicas, Instituto Politécnico Nacional, Campus Lázaro Cárdenas, Mexico City 11340, Mexico; (M.A.L.-L.); (E.E.S.-P.); (J.C.P.-F.); (R.M.-R.)
- José Carlos Parada-Fabián
  - Laboratorio de Biotecnología y Bioinformática Genómica, Departamento de Bioquímica, Escuela Nacional de Ciencias Biológicas, Instituto Politécnico Nacional, Campus Lázaro Cárdenas, Mexico City 11340, Mexico; (M.A.L.-L.); (E.E.S.-P.); (J.C.P.-F.); (R.M.-R.)
- Javier Torres
  - Unidad de Investigación en Enfermedades Infecciosas, UMAE Pediatría, Instituto Mexicano del Seguro Social, Mexico City 06720, Mexico;
- Rogelio Maldonado-Rodríguez
  - Laboratorio de Biotecnología y Bioinformática Genómica, Departamento de Bioquímica, Escuela Nacional de Ciencias Biológicas, Instituto Politécnico Nacional, Campus Lázaro Cárdenas, Mexico City 11340, Mexico; (M.A.L.-L.); (E.E.S.-P.); (J.C.P.-F.); (R.M.-R.)
- Alfonso Méndez-Tenorio
  - Laboratorio de Biotecnología y Bioinformática Genómica, Departamento de Bioquímica, Escuela Nacional de Ciencias Biológicas, Instituto Politécnico Nacional, Campus Lázaro Cárdenas, Mexico City 11340, Mexico; (M.A.L.-L.); (E.E.S.-P.); (J.C.P.-F.); (R.M.-R.)
3. Mesdaghi S, Price RM, Madine J, Rigden DJ. Deep Learning-based structure modelling illuminates structure and function in uncharted regions of β-solenoid fold space. J Struct Biol 2023; 215:108010. [PMID: 37544372 DOI: 10.1016/j.jsb.2023.108010] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/26/2023] [Revised: 07/19/2023] [Accepted: 08/03/2023] [Indexed: 08/08/2023]
Abstract
Repeat proteins are common in all domains of life and exhibit a wide range of functions. One class of repeat protein contains solenoid folds where the repeating unit consists of β-strands separated by tight turns. β-solenoids have distinguishing structural features such as handedness, twist, oligomerisation state, coil shape and size which give rise to their diversity. Characterised β-solenoid repeat proteins are known to form regions in bacterial and viral virulence factors, antifreeze proteins and functional amyloids. For many of these proteins, the experimental structure has not been solved, as they are difficult to crystallise or model. Here we use various deep learning-based structure-modelling methods to discover novel predicted β-solenoids, perform structural database searches to mine further structural neighbours and relate their predicted structure to possible functions. We find both eukaryotic and prokaryotic adhesins, confirming a known functional linkage between adhesin function and the β-solenoid fold. We further identify exceptionally long, flat β-solenoid folds as possible structures of mucin tandem repeat regions and unprecedentedly small β-solenoid structures. Additionally, we characterise a novel β-solenoid coil shape, the FapC Greek key β-solenoid as well as plausible complexes between it and other proteins involved in Pseudomonas functional amyloid fibres.
Affiliation(s)
- Shahram Mesdaghi
  - The University of Liverpool, Institute of Systems, Molecular & Integrative Biology, Biosciences Building, Crown Street, Liverpool L69 7ZB, United Kingdom; Computational Biology Facility, MerseyBio, University of Liverpool, Crown Street, Liverpool L69 7ZB, United Kingdom
- Rebecca M Price
  - The University of Liverpool, Institute of Systems, Molecular & Integrative Biology, Biosciences Building, Crown Street, Liverpool L69 7ZB, United Kingdom
- Jillian Madine
  - The University of Liverpool, Institute of Systems, Molecular & Integrative Biology, Biosciences Building, Crown Street, Liverpool L69 7ZB, United Kingdom
- Daniel J Rigden
  - The University of Liverpool, Institute of Systems, Molecular & Integrative Biology, Biosciences Building, Crown Street, Liverpool L69 7ZB, United Kingdom
4. Xiong W, Cai J, Li R, Wen C, Tan H. Rare Variant Analysis and Molecular Dynamics Simulation in Alzheimer’s Disease Identifies Exonic Variants in FLG. Genes (Basel) 2022; 13. [PMID: 35627223 PMCID: PMC9141140 DOI: 10.3390/genes13050838] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2022] [Revised: 05/03/2022] [Accepted: 05/04/2022] [Indexed: 02/05/2023] Open
Abstract
Background: Although an increasing number of common variants contributing to Alzheimer’s disease (AD) are uncovered by genome-wide association studies, they can only explain less than half of the heritability of AD. Rare variant association studies (RVAS) have become an increasingly important approach to explaining the risk or trait variability of AD. Method: To investigate the potential rare variants that cause AD, we screened 70,209 rare variants from two cohorts (175 AD and 214 cognitively normal participants) from the Alzheimer’s Disease Neuroimaging Initiative database. MIRAGE, a novel RVAS method, was performed on 232 non-synonymous variants selected by ANNOVAR annotation. Molecular docking and molecular dynamics (MD) simulation were adopted to verify the interaction between the chosen functional variants and BACE1. Results: MIRAGE analysis revealed significant associations between AD and six potential pathogenic genes, including PREX2, FLG, DHX16, NID2, ZnF585B and ZnF875. Only interactions between FLG (including wild type and rs3120654(SER742TYR)) and BACE1 were verified by molecular docking and MD simulation. The interaction of FLG(SER742TYR) with BACE1 was greater than that of wild-type FLG with BACE1. Conclusions: According to the literature search, bioinformatics analysis, and molecular docking and MD simulation, we find non-synonymous rare variants in six genes, especially FLG(rs3120654), that may play key roles in AD.
Affiliation(s)
- Weixue Xiong
  - Department of Preventive Medicine, Shantou University Medical College, Shantou 515000, China
- Jiahui Cai
  - Department of Preventive Medicine, Shantou University Medical College, Shantou 515000, China
- Ruijia Li
  - Department of Statistics and Finance, School of Management, University of Science and Technology of China, Hefei 230000, China
- Canhong Wen
  - Department of Statistics and Finance, School of Management, University of Science and Technology of China, Hefei 230000, China
- Haizhu Tan
  - Department of Preventive Medicine, Shantou University Medical College, Shantou 515000, China
5. Patiño-Galindo JÁ, Filip I, Chowdhury R, Maranas CD, Sorger PK, AlQuraishi M, Rabadan R. Recombination and lineage-specific mutations linked to the emergence of SARS-CoV-2. Genome Med 2021; 13:124. [PMID: 34362430 PMCID: PMC8343217 DOI: 10.1186/s13073-021-00943-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 04/06/2021] [Accepted: 07/24/2021] [Indexed: 01/06/2023] Open
Abstract
BACKGROUND The emergence of SARS-CoV-2 underscores the need to better understand the evolutionary processes that drive the emergence and adaptation of zoonotic viruses in humans. In the betacoronavirus genus, which also includes SARS-CoV and MERS-CoV, recombination frequently encompasses the receptor binding domain (RBD) of the Spike protein, which is responsible for viral binding to host cell receptors. In this work, we reconstruct the evolutionary events that have accompanied the emergence of SARS-CoV-2, with a special emphasis on the RBD and its adaptation for binding to its receptor, human ACE2. METHODS By means of phylogenetic and recombination analyses, we found evidence of a recombination event in the RBD involving ancestral lineages to both SARS-CoV and SARS-CoV-2. We then assessed the effect of this recombination at protein level by reconstructing the RBD of the closest ancestors to SARS-CoV-2, SARS-CoV, and other Sarbecoviruses, including the most recent common ancestor of the recombining clade. The resulting information was used to measure and compare, in silico, their ACE2-binding affinities using the physics-based trRosetta algorithm. RESULTS We show that, through an ancestral recombination event, SARS-CoV and SARS-CoV-2 share an RBD sequence that includes two insertions (positions 432-436 and 460-472), as well as the variants 427N and 436Y. Both 427N and 436Y belong to a helix that interacts directly with the human ACE2 (hACE2) receptor. Reconstruction of ancestral states, combined with protein-binding affinity analyses, suggests that the recombination event involving ancestral strains of SARS-CoV and SARS-CoV-2 led to an increased affinity for hACE2 binding and that alleles 427N and 436Y significantly enhanced affinity as well. CONCLUSIONS We report an ancestral recombination event affecting the RBD of both SARS-CoV and SARS-CoV-2 that was associated with an increased binding affinity to hACE2. Structural modeling indicates that ancestors of SARS-CoV-2 may have acquired the ability to infect humans decades ago. The binding affinity with the human receptor would have been subsequently boosted in SARS-CoV and SARS-CoV-2 through further mutations in RBD.
Affiliation(s)
- Juan Ángel Patiño-Galindo
  - Program for Mathematical Genomics, Columbia University, New York, NY, USA
  - Departments of Systems Biology and Biomedical Informatics, Columbia University, New York, NY, USA
- Ioan Filip
  - Program for Mathematical Genomics, Columbia University, New York, NY, USA
  - Departments of Systems Biology and Biomedical Informatics, Columbia University, New York, NY, USA
- Ratul Chowdhury
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
  - Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA, USA
- Costas D Maranas
  - Department of Chemical Engineering, The Pennsylvania State University, University Park, PA, USA
- Peter K Sorger
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
  - Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA, USA
- Mohammed AlQuraishi
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
  - Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA, USA
- Raul Rabadan
  - Program for Mathematical Genomics, Columbia University, New York, NY, USA
  - Departments of Systems Biology and Biomedical Informatics, Columbia University, New York, NY, USA
6.
Abstract
All currently known architectures of outer-membrane beta barrels (OMBBs) have only one barrel. While the vast majority function as oligomers, with barrels from different chains packing against each other in the membrane, it was assumed that these multiple chains are needed to form multibarrel structures. And yet, here we show that multibarrel chains exist. Using state-of-the-art sequence and structure analysis tools, we report the discovery of more than 30 multibarrel architectures from gram-negative bacteria. The discovery of these architectures reveals another interesting chapter in OMBB evolution and has implications for protein engineering. The evolutionary advantages of multibarrels are yet to be discovered. Outer-membrane beta barrels (OMBBs) are found in the outer membrane of gram-negative bacteria and eukaryotic organelles. OMBBs fold as antiparallel β-sheets that close onto themselves, forming pores that traverse the membrane. Currently known structures include only one barrel, of 8 to 36 strands, per chain. The lack of multi-OMBB chains is surprising, as most OMBBs form oligomers, and some function only in this state. Using a combination of sensitive sequence comparison methods and coevolutionary analysis tools, we identify many proteins combining multiple beta barrels within a single chain; combinations that include eight-stranded barrels prevail. These multibarrels seem to be the result of independent, lineage-specific fusion and amplification events. The absence of multibarrels that are universally conserved in bacteria with an outer membrane, coupled with their frequent de novo genesis, suggests that their functions are not essential but rather beneficial in specific environments. Adjacent barrels of complementary function within the same chain may allow for functions beyond those of the individual barrels.
7. Delucchi M, Näf P, Bliven S, Anisimova M. TRAL 2.0: Tandem Repeat Detection With Circular Profile Hidden Markov Models and Evolutionary Aligner. Frontiers in Bioinformatics 2021; 1:691865. [PMID: 36303789 PMCID: PMC9581039 DOI: 10.3389/fbinf.2021.691865] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 04/07/2021] [Accepted: 06/11/2021] [Indexed: 11/13/2022] Open
Abstract
The Tandem Repeat Annotation Library (TRAL) focuses on analyzing tandem repeat units in genomic sequences. TRAL can integrate and harmonize tandem repeat annotations from a large number of external tools, and provides a statistical model for evaluating and filtering the detected repeats. TRAL version 2.0 includes new features such as a module for identifying repeats from circular profile hidden Markov models, a new repeat alignment method based on the progressive Poisson Indel Process, an improved installation procedure and a Docker container. TRAL is an open-source Python 3 library and is available, together with documentation and tutorials, via vital-it.ch/software/tral.
Affiliation(s)
- Matteo Delucchi
  - Institute of Applied Simulations, School of Life Sciences und Facility Management, Zurich University of Applied Sciences, Wädenswil, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Paulina Näf
  - Institute of Applied Simulations, School of Life Sciences und Facility Management, Zurich University of Applied Sciences, Wädenswil, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Spencer Bliven
  - Institute of Applied Simulations, School of Life Sciences und Facility Management, Zurich University of Applied Sciences, Wädenswil, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - Laboratory for Scientific Computing and Modelling, Paul Scherrer Institute, Villigen PSI, Villigen, Switzerland
- Maria Anisimova
  - Institute of Applied Simulations, School of Life Sciences und Facility Management, Zurich University of Applied Sciences, Wädenswil, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - *Correspondence: Maria Anisimova,