1
|
Romei M, Carpentier M, Chomilier J, Lecointre G. Origins and Functional Significance of Eukaryotic Protein Folds. J Mol Evol 2023; 91:854-864. [PMID: 38060007 DOI: 10.1007/s00239-023-10136-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2023] [Accepted: 10/03/2023] [Indexed: 12/08/2023]
Abstract
Folds are the architecture and topology of a protein domain. Fold categories are very few compared with the astronomical number of sequences. Eukaryotes have more protein folds than Archaea and Bacteria. These folds are of two types: those shared with Archaea and/or Bacteria, and those specific to eukaryotic clades. The first type is inherited from the first endosymbiosis and confirms the mixed origin of eukaryotes. In a dataset of 1073 folds whose presence or absence has been documented among 210 species equally distributed across the three super-kingdoms, we have identified 28 eukaryotic folds unambiguously inherited from Bacteria and 40 eukaryotic folds unambiguously inherited from Archaea. Compared with previous studies, the proportion of informational functions is higher than expected for folds originating from Bacteria and as high as expected for folds inherited from Archaea. The second type of fold is specifically eukaryotic and is associated with an increase in new folds within eukaryotes, distributed in particular clades. Reconstructed ancestral states, coupled with dating of each node on the tree of life, provided fold appearance rates. The rate is on average twice as high within Eukaryota as within Bacteria or Archaea. The highest rates are found at the origins of eukaryotes, holozoans, metazoans, metazoans stricto sensu, and vertebrates: the roots of these clades correspond to bursts of fold evolution. We could correlate the functions of some of the fold synapomorphies within eukaryotes with significant evolutionary events. Among them, we find evidence for the rise of multicellularity, the adaptive immune system, or virus folds which could be linked to an ecological shift made by tetrapods.
Affiliation(s)
- Martin Romei
- Institut Systématique Evolution Biodiversité (ISYEB UMR 7205), Sorbonne Université, MNHN, CNRS, EPHE, UA, Paris, France
- IMPMC (UMR 7590), BiBiP, Sorbonne Université, CNRS, MNHN, Paris, France
| | - Mathilde Carpentier
- Institut Systématique Evolution Biodiversité (ISYEB UMR 7205), Sorbonne Université, MNHN, CNRS, EPHE, UA, Paris, France.
| | - Jacques Chomilier
- IMPMC (UMR 7590), BiBiP, Sorbonne Université, CNRS, MNHN, Paris, France
| | - Guillaume Lecointre
- Institut Systématique Evolution Biodiversité (ISYEB UMR 7205), Sorbonne Université, MNHN, CNRS, EPHE, UA, Paris, France
| |
|
2
|
Carpentier M, Chomilier J. Analyses of Mutation Displacements from Homology Models. Methods Mol Biol 2023; 2627:195-210. [PMID: 36959449 DOI: 10.1007/978-1-0716-2974-1_11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/25/2023]
Abstract
Evaluating the structural perturbations introduced by a single amino acid mutation is the main issue in protein structural biology. We present here some recent advances in methods that allow the distortion to be split between the actual substitution effect and the contribution of the local flexibility of the position where the mutation occurs. The main drawback of this approach is the need for many structures, each carrying a single mutation. To bypass this difficulty, we propose to use molecular modeling tools: several software packages can build a model from a template, given the sequence. As a proof of concept, we rely on a gold standard, human lysozyme. The wild-type and three mutant structures are available in the PDB. Two of these mutations result in amyloid fibril formation, and the last one is neutral. In conclusion, irrespective of the algorithm used for modeling, side chain conformations at the site of mutation are reliable, although long-range effects are out of reach of these tools.
Affiliation(s)
- Mathilde Carpentier
- Institut Systématique Evolution Biodiversité (ISYEB), Sorbonne Université, MNHN, CNRS, EPHE, Paris, France.
| | - Jacques Chomilier
- Sorbonne Université, BiBiP, IMPMC, UMR 7590, CNRS, MNHN, Paris, France
| |
|
3
|
Romei M, Sapriel G, Imbert P, Jamay T, Chomilier J, Lecointre G, Carpentier M. Protein folds as synapomorphies of the tree of life. Evolution 2022; 76:1706-1719. [PMID: 35765784 PMCID: PMC9541633 DOI: 10.1111/evo.14550] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/06/2021] [Revised: 05/17/2022] [Accepted: 05/31/2022] [Indexed: 01/22/2023]
Abstract
Several studies showed that the distribution of folds (topology of protein secondary structures) in proteomes may be a global proxy to build phylogeny. If so, some folds should be synapomorphies (derived characters exclusively shared among taxa). However, previous studies used methods that did not allow synapomorphy identification, which requires congruence analysis of folds as individual characters. Here, we map SCOP folds onto a sample of 210 species across the tree of life (TOL). Congruence is assessed using the retention index of each fold for the TOL, and principal component analysis for deeper branches. Using a bicluster mapping approach, we define synapomorphic blocks of folds (SBF) sharing similar presence/absence patterns. Among the 1232 folds, 20% are universally present in our TOL, whereas 54% are reliable synapomorphies. These results are similar to those obtained with the CATH and ECOD databases. Eukaryotes are characterized by a large number of them, and several SBFs clearly support nested eukaryotic clades (divergence times from 1100 to 380 mya). Although clearly separated, the three superkingdoms reveal a strong mosaic pattern. This pattern is consistent with the dual origin of eukaryotes and witnesses secondary endosymbiosis in their photosynthetic clades. Our study unveils direct analysis of fold synapomorphies as key characters to unravel the evolutionary history of species.
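The retention index mentioned in this abstract can be illustrated on a toy case. The following is a minimal sketch, not the authors' pipeline: it assumes a binary presence/absence character, a rooted binary tree written as nested tuples, and Fitch parsimony to count state changes. The tree, the taxon names and the function names are hypothetical. The index follows the standard definition RI = (g - s) / (g - m), where s is the observed number of steps, m the minimum possible and g the maximum possible.

```python
def fitch_steps(tree, states):
    """Return (state_set, steps) for a rooted binary tree of nested tuples.

    Leaves are taxon names (str); `states` maps each name to 0 or 1.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_steps = fitch_steps(tree[0], states)
    right_set, right_steps = fitch_steps(tree[1], states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change needed.
        return common, left_steps + right_steps
    # Children disagree: one additional change on this part of the tree.
    return left_set | right_set, left_steps + right_steps + 1


def retention_index(tree, states):
    """Retention index of one binary character; None if uninformative."""
    values = list(states.values())
    ones = sum(values)
    zeros = len(values) - ones
    g = min(ones, zeros)      # maximum steps for a binary character
    m = 1 if g > 0 else 0     # minimum steps (number of states minus one)
    if g == m:
        return None           # constant or singleton character
    s = fitch_steps(tree, states)[1]
    return (g - s) / (g - m)


# Hypothetical six-taxon tree and two hypothetical fold distributions.
toy_tree = ((("A", "B"), ("C", "D")), ("E", "F"))
clean_fold = {"A": 1, "B": 1, "C": 0, "D": 0, "E": 0, "F": 0}
scattered_fold = {"A": 1, "B": 0, "C": 1, "D": 0, "E": 0, "F": 0}
```

In this toy setting, a fold present only in the sister taxa A and B is gained once and retained (index 1), whereas a fold present in A and C requires two independent changes (index 0), which is the kind of contrast used to separate reliable synapomorphies from homoplastic characters.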
Affiliation(s)
- Martin Romei
- Institut Systématique Evolution Biodiversité (ISYEB UMR 7205), Sorbonne Université, MNHN, CNRS, EPHE, UA, Paris, France
- IMPMC (UMR 7590), BiBiP, Sorbonne Université, CNRS, MNHN, Paris, France
| | - Guillaume Sapriel
- Institut Systématique Evolution Biodiversité (ISYEB UMR 7205), Sorbonne Université, MNHN, CNRS, EPHE, UA, Paris, France
- UFR des sciences de la santé, Université Versailles‐St‐Quentin, Versailles, France
| | - Pierre Imbert
- Institut Systématique Evolution Biodiversité (ISYEB UMR 7205), Sorbonne Université, MNHN, CNRS, EPHE, UA, Paris, France
| | - Théo Jamay
- Institut Systématique Evolution Biodiversité (ISYEB UMR 7205), Sorbonne Université, MNHN, CNRS, EPHE, UA, Paris, France
| | | | - Guillaume Lecointre
- Institut Systématique Evolution Biodiversité (ISYEB UMR 7205), Sorbonne Université, MNHN, CNRS, EPHE, UA, Paris, France
| | - Mathilde Carpentier
- Institut Systématique Evolution Biodiversité (ISYEB UMR 7205), Sorbonne Université, MNHN, CNRS, EPHE, UA, Paris, France
| |
|
4
|
Carpentier M, Chomilier J. Analyses of displacements resulting from a point mutation in proteins. J Struct Biol 2020; 211:107543. [PMID: 32522553 DOI: 10.1016/j.jsb.2020.107543] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/13/2020] [Revised: 04/28/2020] [Accepted: 05/31/2020] [Indexed: 11/19/2022]
Abstract
The effects of a single residue substitution on the protein backbone are frequently quite small, and there are many other potential sources of structural variation in proteins. We present here a methodology that considers the different sources of distortion in order to isolate the actual effect of the mutation. To validate our methodology, we consider a well-studied family with many single mutants: human lysozyme. Most of the perturbations are expected to occur at the exact location of the mutation, but in many cases the effects propagate at long range. We show that the distances between the mutated residue and the 5% most disturbed residues decrease exponentially. One third of the affected residues are in direct contact with the mutated position; the remaining two thirds are potential allosteric effects. We confirm the reliability of the residues identified as significantly perturbed by comparing our results with experimental studies. With the present method, we confirm all the previously identified perturbations. This study shows that mutations have a long-range impact on the protein backbone that can be detected, although the displacement of the affected atoms is small.
Affiliation(s)
- Mathilde Carpentier
- Institut Systématique Evolution Biodiversité (ISYEB), Sorbonne Université, MNHN, CNRS, EPHE, 57 rue Cuvier, CP 50, 75005 Paris, France.
| | - Jacques Chomilier
- Sorbonne Université, BiBiP IMPMC UMR 7590, CNRS, MNHN, Paris, France.
| |
|
5
|
Carpentier M, Chomilier J. Protein multiple alignments: sequence-based versus structure-based programs. Bioinformatics 2020; 35:3970-3980. [PMID: 30942864 DOI: 10.1093/bioinformatics/btz236] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 10/04/2018] [Revised: 03/05/2019] [Accepted: 04/02/2019] [Indexed: 11/14/2022]
Abstract
MOTIVATION Multiple sequence alignment programs have proved to be very useful and have already been evaluated in the literature, unlike alignment programs based on structure or on both sequence and structure. In the present article we evaluate the added value provided by considering structures. RESULTS We compared the multiple alignments produced by 25 programs, based on sequence, structure or both, with reference alignments deposited in five databases (BALIBASE 2 and 3, HOMSTRAD, OXBENCH and SISYPHUS). On the whole, the structure-based methods compute more reliable alignments than the sequence-based ones, and even than the sequence+structure-based programs, whatever the database. Two programs lead, MAMMOTH and MATRAS; nevertheless, MUSTANG, MATT, 3DCOMB, TCOFFEE+TM_ALIGN and TCOFFEE+SAP perform better for some alignments. The advantage of structure-based methods increases at low levels of sequence identity, and for residues in regular secondary structures or buried residues. Concerning gap management, sequence-based programs introduce fewer gaps than structure-based programs. Concerning the databases, the alignments of the manually built databases are more challenging for the programs. AVAILABILITY AND IMPLEMENTATION All data and results presented in this study are available at: http://wwwabi.snv.jussieu.fr/people/mathilde/download/AliMulComp/. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
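Comparing a computed multiple alignment with a reference alignment is commonly done with a sum-of-pairs score: the fraction of residue pairs aligned in the reference that are also aligned in the computed alignment. The abstract does not state which score the authors used, so the sketch below is only an assumed illustration of the general idea; the function names and the toy alignments are hypothetical.

```python
def aligned_pairs(alignment):
    """Collect residue pairs placed in the same column.

    `alignment` is a list of equal-length gapped strings ('-' for gaps).
    Each residue is identified by (sequence index, residue index).
    """
    pairs = set()
    counters = [0] * len(alignment)
    for col in range(len(alignment[0])):
        residues = []
        for seq_index, row in enumerate(alignment):
            if row[col] != "-":
                residues.append((seq_index, counters[seq_index]))
                counters[seq_index] += 1
        for a in range(len(residues)):
            for b in range(a + 1, len(residues)):
                pairs.add((residues[a], residues[b]))
    return pairs


def sum_of_pairs_score(test_alignment, reference_alignment):
    """Fraction of reference residue pairs recovered by the test alignment."""
    reference = aligned_pairs(reference_alignment)
    if not reference:
        return 0.0
    return len(aligned_pairs(test_alignment) & reference) / len(reference)


# Hypothetical two-sequence example: the test alignment misplaces one gap.
reference_example = ["AC-G", "ACTG"]
test_example = ["ACG-", "ACTG"]
score_example = sum_of_pairs_score(test_example, reference_example)
```

Here the misplaced gap breaks one of the three reference pairs, so two thirds of the reference pairs are recovered. A score of this kind can be computed per database and per identity range to compare sequence-based and structure-based programs.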
Affiliation(s)
- Mathilde Carpentier
- Institut Systématique Evolution Biodiversité (ISYEB), Sorbonne Université, MNHN, CNRS, EPHE, Paris, France
| | - Jacques Chomilier
- Sorbonne Université, MNHN, CNRS, IRD, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie (IMPMC), BiBiP, Paris, France
| |
|
6
|
Stratmann D, Pathmanathan JS, Postic G, Rey J, Chomilier J. TEF 2.0: a graph-based method for decomposing protein structures into closed loops. J Biomol Struct Dyn 2018; 37:4140-4150. [PMID: 30585105 DOI: 10.1080/07391102.2018.1546230] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
Affiliation(s)
- Dirk Stratmann
- Sorbonne Université, UMR 7590 CNRS, MNHN, IRD, Institut de Minéralogie de Physique des Matériaux et de Cosmochimie (IMPMC), Paris, France
| | - Jananan S Pathmanathan
- Sorbonne Université, UMR 7590 CNRS, MNHN, IRD, Institut de Minéralogie de Physique des Matériaux et de Cosmochimie (IMPMC), Paris, France
| | - Guillaume Postic
- Sorbonne Université, UMR 7590 CNRS, MNHN, IRD, Institut de Minéralogie de Physique des Matériaux et de Cosmochimie (IMPMC), Paris, France
| | - Julien Rey
- Sorbonne Paris Cité, Université Paris Diderot, Paris, France
| | - Jacques Chomilier
- Sorbonne Université, UMR 7590 CNRS, MNHN, IRD, Institut de Minéralogie de Physique des Matériaux et de Cosmochimie (IMPMC), Paris, France
| |
|
7
|
Jusot M, Stratmann D, Vaisset M, Chomilier J, Cortés J. Exhaustive Exploration of the Conformational Landscape of Small Cyclic Peptides Using a Robotics Approach. J Chem Inf Model 2018; 58:2355-2368. [DOI: 10.1021/acs.jcim.8b00375] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 12/31/2022]
Affiliation(s)
- Maud Jusot
- Sorbonne Université, MNHN, CNRS, IRD, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, 75005 Paris, France
- LAAS-CNRS, Université de Toulouse, CNRS, 31400 Toulouse, France
| | - Dirk Stratmann
- Sorbonne Université, MNHN, CNRS, IRD, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, 75005 Paris, France
| | - Marc Vaisset
- LAAS-CNRS, Université de Toulouse, CNRS, 31400 Toulouse, France
| | - Jacques Chomilier
- Sorbonne Université, MNHN, CNRS, IRD, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, 75005 Paris, France
| | - Juan Cortés
- LAAS-CNRS, Université de Toulouse, CNRS, 31400 Toulouse, France
| |
|
8
|
Rodriguez PM, Stratmann D, Duprat E, Papandreou N, Acuna R, Lacroix Z, Chomilier J. Correlating topology and thermodynamics to predict protein structure sensitivity to point mutations. Bio-Algorithms and Med-Systems 2018. [DOI: 10.1515/bams-2018-0026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
The relation between the distribution of hydrophobic amino acids along protein chains and their structure is far from being completely understood. No reliable method allows ab initio prediction of the folded structure from this distribution of physicochemical properties, even when they are highly degenerate, considering only two classes: hydrophobic and polar. Establishment of long-range hydrophobic three-dimensional (3D) contacts is essential for the formation of the nucleus, a key process in the early steps of protein folding. Thus, a large number of 3D simulation studies have been developed to address this issue. They are nowadays evaluated in a specific chapter of the molecular modeling competition, Critical Assessment of Protein Structure Prediction. We present here a simulation of the early steps of the folding process for 850 proteins, performed in a discrete 3D space, which results in peaks in the predicted distribution of intra-chain noncovalent contacts. The residues located at these peak positions tend to be buried in the core of the protein and are expected to correspond to critical positions in the sequence, important both for folding and structural (or similarly, energetic in the thermodynamic hypothesis) stability. The degree of stabilization or destabilization due to a point mutation at the critical positions involved in numerous contacts is estimated from the calculated folding free energy difference between mutated and native structures. The results show that these critical positions are not tolerant to mutation. This simulation of the noncovalent contacts only needs a sequence as input, and this paper proposes a validation of the method by comparison with the stability predictions of well-established programs.
|
9
|
Postic G, Hamelryck T, Chomilier J, Stratmann D. MyPMFs: a simple tool for creating statistical potentials to assess protein structural models. Biochimie 2018; 151:37-41. [PMID: 29857183 DOI: 10.1016/j.biochi.2018.05.013] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 01/28/2018] [Accepted: 05/25/2018] [Indexed: 01/18/2023]
Abstract
Evaluating the model quality of protein structures that evolve in environments with particular physicochemical properties requires scoring functions that are adapted to their specific residue compositions and/or structural characteristics. Thus, computational methods developed for structures from the cytosol cannot work properly on membrane or secreted proteins. Here, we present MyPMFs, an easy-to-use tool that allows users to train statistical potentials of mean force (PMFs) on the protein structures of their choice, with all parameters being adjustable. We demonstrate its use by creating an accurate statistical potential for transmembrane protein domains. We also show its usefulness to study the influence of the physical environment on residue interactions within protein structures. Our open-source software is freely available for download at https://github.com/bibip-impmc/mypmfs.
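A statistical potential of mean force is classically obtained by inverse Boltzmann statistics: E(r) = -kT ln(p_obs(r) / p_ref(r)), where p_obs is the observed distance distribution for a residue pair type and p_ref a reference distribution. The sketch below illustrates only that general formula. It is not the MyPMFs implementation or its interface; the function names, the binning, the pseudocount and the toy distances are all assumptions made for illustration.

```python
import math


def histogram(distances, edges):
    """Count distances falling in each half-open bin [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for d in distances:
        for i in range(len(counts)):
            if edges[i] <= d < edges[i + 1]:
                counts[i] += 1
                break
    return counts


def statistical_potential(observed, reference, edges, kT=1.0, pseudocount=1.0):
    """Inverse Boltzmann energy per distance bin (in units of kT by default).

    A pseudocount is added to every bin so that empty bins do not
    produce a logarithm of zero; this smoothing choice is an assumption.
    """
    obs = histogram(observed, edges)
    ref = histogram(reference, edges)
    n_bins = len(obs)
    obs_total = sum(obs) + pseudocount * n_bins
    ref_total = sum(ref) + pseudocount * n_bins
    energies = []
    for o, r in zip(obs, ref):
        p_obs = (o + pseudocount) / obs_total
        p_ref = (r + pseudocount) / ref_total
        energies.append(-kT * math.log(p_obs / p_ref))
    return energies


# Hypothetical distances (in angstroms) for one residue pair type.
bin_edges = [0.0, 4.0, 8.0, 12.0]
observed_distances = [2.0, 3.0, 3.0, 5.0, 10.0]
reference_distances = [2.0, 5.0, 6.0, 9.0, 10.0, 11.0]
pmf_example = statistical_potential(observed_distances, reference_distances, bin_edges)
```

With these toy values, short distances are over-represented relative to the reference and receive a negative (favourable) energy, while longer distances receive positive energies. Training on a chosen set of structures, for example transmembrane domains as in the abstract, amounts to changing where the observed distances come from.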
Affiliation(s)
- Guillaume Postic
- Sorbonne Université, UMR 7590 CNRS, MNHN, IRD, Institut de Minéralogie de Physique des Matériaux et de Cosmochimie (IMPMC), Paris, France.
| | - Thomas Hamelryck
- Bioinformatics Centre, Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen, Denmark; Image Section, Department of Computer Science, University of Copenhagen, Copenhagen, Denmark
| | - Jacques Chomilier
- Sorbonne Université, UMR 7590 CNRS, MNHN, IRD, Institut de Minéralogie de Physique des Matériaux et de Cosmochimie (IMPMC), Paris, France
| | - Dirk Stratmann
- Sorbonne Université, UMR 7590 CNRS, MNHN, IRD, Institut de Minéralogie de Physique des Matériaux et de Cosmochimie (IMPMC), Paris, France
| |
|
10
|
Shanthirabalan S, Chomilier J, Carpentier M. Structural effects of point mutations in proteins. Proteins 2018; 86:853-867. [DOI: 10.1002/prot.25499] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 07/28/2017] [Revised: 03/19/2018] [Accepted: 03/20/2018] [Indexed: 12/21/2022]
Affiliation(s)
- Suvethigaa Shanthirabalan
- Institut Systématique Evolution Biodiversité (ISYEB), Sorbonne Université, MNHN, CNRS, EPHE; Paris France
| | | | - Mathilde Carpentier
- Institut Systématique Evolution Biodiversité (ISYEB), Sorbonne Université, MNHN, CNRS, EPHE; Paris France
- Sorbonne Université, CNRS, MNHN, IRD, IMPMC, BiBiP; Paris France
| |
|
11
|
Hernández-Torres J, Rodríguez-Buitrago JA, Chomilier J. Hydrophobic Cluster Analysis (HCA) revela la existencia de isoformas de baja identidad de Lacasas (EC 1.10.3.2) en 7 Phyla de bacteria [Hydrophobic Cluster Analysis (HCA) reveals the existence of low-identity isoforms of laccases (EC 1.10.3.2) in 7 phyla of Bacteria]. Actual Biol 2017. [DOI: 10.17533/udea.acbi.329378] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/05/2023]
Abstract
The Hydrophobic Cluster Analysis (HCA) methodology was implemented to assess the presence and distribution of laccases (p-diphenol:oxygen oxidoreductases) in Bacteria. Owing to its high resolution in low-identity settings (< 30%), structural analysis by HCA was able to validate orphan sequences deposited in the Genbank and Swissprot databases. Consensus sequences (LOGO) are proposed for the copper centres, useful for identifying new laccases. As in Coprinus cinereus, evidence of an intragenic duplication event was found in the CotA laccase of Bacillus subtilis [amino acids 1 to 178 (domain A) and 303 to 503 (50 residues of domain B and the whole of domain C)]. The identity between the duplicates was calculated at 9%. These results lead to the conclusion that the laccases of Bacteria and Eucarya had a common ancestor.
|
12
|
Abstract
Summary Scientific legacy workflows are often developed over many years, poorly documented and implemented with scripting languages. In the context of our cross-disciplinary projects we face the problem of maintaining such scientific workflows. This paper presents the Workflow Instrumentation for Structure Extraction (WISE) method used to process several ad-hoc legacy workflows written in Python and automatically produce their workflow structural skeleton. Unlike many existing methods, WISE does not assume input workflows to be preprocessed in a known workflow formalism. It is also able to identify and analyze calls to external tools. We present the method and report its results on several scientific workflows.
Affiliation(s)
- Ruben Acuña
- Scientific Data Management Laboratory, Arizona State University, Tempe, AZ 85287, http://bioinformatics.engineering.asu.edu/, United States of America
| | - Jacques Chomilier
- Institut de Minéralogie, de Physique des Milieux Condensés et de Cosmochimie (IMPMC), Centre National de la Recherche Scientifique (CNRS), Institut de Recherche pour le Développement (IRD), Muséum National d’Histoire Naturelle (MNHN), Université Pierre et Marie Curie, Sorbonne Universités, 4 Place Jussieu, Paris, France
- Ressource Parisienne de Bioinformatique Structurale (RPBS), Université Paris Diderot, 35 Rue Hélène Brion, Paris, France
| | - Zoé Lacroix
- Scientific Data Management Laboratory, Arizona State University, Tempe, AZ 85287, http://bioinformatics.engineering.asu.edu/ France
- Institut de Minéralogie, de Physique des Milieux Condensés et de Cosmochimie (IMPMC), Centre National de la Recherche Scientifique (CNRS), Institut de Recherche pour le Développement (IRD), Muséum National d’Histoire Naturelle (MNHN), Université Pierre et Marie Curie, Sorbonne Universités, 4 Place Jussieu, Paris, France
| |
|
13
|
Salinas Castellanos LC, Chomilier J, Hernández-Torres J. Recombination of chl-fus gene (Plastid Origin) downstream of hop: a locus of chromosomal instability. BMC Genomics 2015; 16:573. [PMID: 26238241 PMCID: PMC4522979 DOI: 10.1186/s12864-015-1780-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2014] [Accepted: 07/14/2015] [Indexed: 11/26/2022]
Abstract
Background The co-chaperone Hop [heat shock protein (HSP) organizing protein] has been shown to act as an adaptor for protein folding and maturation, in concert with Hsp70 and Hsp90. The hop gene is of eukaryotic origin. Likewise, the chloroplast elongation factor G (cEF-G) catalyzes the translocation step in chloroplast protein synthesis. The chl-fus gene, which encodes the cEF-G protein, is of plastid origin. Both proteins, Hop and cEF-G, are derived from domain duplications. It was demonstrated that the nuclear chl-fus gene is located in opposite orientation to a hop gene in Glycine max. We explored 53 available plant genomes, from Chlorophyta to higher plants, to determine whether the chl-fus gene was transferred directly downstream of the primordial hop in the proto-eukaryote host cell. Since both genes came from exon/module duplication events, we wanted to explore the involvement of introns in the early origin and the ensuing evolutionary changes in gene structure. Results We reconstructed the evolutionary history of the two convergent plant genes on the basis of their gene structure, microsynteny and microcolinearity, from 53 plant nuclear genomes. Despite a high degree (72%) of microcolinearity among vascular plants, our results demonstrate that their adjacency was a product of chromosomal rearrangements. Based on predicted exon-intron structures, we inferred the molecular events giving rise to the current form of the genes. We therefore propose a simple model of exon/module shuffling by intronic recombinations in which phase-0 introns were essential for domain duplication, and a phase-1 intron for transit peptide recruiting. Finally, we demonstrate a natural susceptibility of the intergenic region to recombine or delete, seriously threatening the future integrity of the chl-fus gene.
Conclusions Our results are consistent with the interpretation that the chl-fus gene was transferred from the chloroplast to a chromosome different from that of hop in the primitive photosynthetic eukaryote and that, much later, before the appearance of angiosperms, it was recombined downstream of hop. Exon/module shuffling mediated by symmetric intron phases (i.e., phase-0 introns) was essential for gene evolution. The intergenic region is prone to recombine, risking the integrity of both genes. Electronic supplementary material The online version of this article (doi:10.1186/s12864-015-1780-1) contains supplementary material, which is available to authorized users.
Affiliation(s)
| | - Jacques Chomilier
- IMPMC, UPMC, CNRS UMR 7590, MNHN, IRD, Paris, France and RPBS, Paris, France.
| | - Jorge Hernández-Torres
- Laboratorio de Biología Molecular, Escuela de Biología, Universidad Industrial de Santander, Apartado Aéreo 678, Bucaramanga, Colombia.
| |
|
14
|
Banach M, Prudhomme N, Carpentier M, Duprat E, Papandreou N, Kalinowska B, Chomilier J, Roterman I. Contribution to the prediction of the fold code: application to immunoglobulin and flavodoxin cases. PLoS One 2015; 10:e0125098. [PMID: 25915049 PMCID: PMC4411048 DOI: 10.1371/journal.pone.0125098] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 12/16/2014] [Accepted: 03/20/2015] [Indexed: 12/19/2022]
Abstract
Background Formation of the folding nucleus of globular proteins starts with the mutual interaction of a group of hydrophobic amino acids whose close contacts allow subsequent formation and stability of the 3D structure. These early steps can be predicted by simulating the folding process with a Monte Carlo (MC) coarse-grained model in a discrete space. We previously defined MIRs (Most Interacting Residues) as the set of residues presenting a large number of non-covalent neighbour interactions during such a simulation. MIRs are good candidates to define the minimal number of residues giving rise to a given fold instead of another one, although their proportion is rather high, typically 15-20% of the sequence. Having in mind experiments with two sequences of very high sequence identity (up to 90%) but different folds, we combined the MIR method, which takes the sequence as its single input, with the “fuzzy oil drop” (FOD) model, which requires a 3D structure, in order to estimate the residues coding for the fold. FOD assumes that a globular protein follows an idealised 3D Gaussian distribution of hydrophobicity density, with the maximum in the centre and minima at the surface of the “drop”. If the actual local density of hydrophobicity around a given amino acid is as high as the ideal one, then this amino acid is assigned to the core of the globular protein, and it is assumed to follow the FOD model. One therefore obtains a distribution of the amino acids of a protein according to their agreement or disagreement with the FOD model. Results We compared and combined MIR and FOD methods to define the minimal nucleus, or keystone, of two populated folds: immunoglobulin-like (Ig) and flavodoxins (Flav). The combination of these two approaches defines some positions both predicted as a MIR and assigned as accordant with the FOD model. It is shown here that for these two folds, the intersection of the predicted sets of residues significantly differs from random selection.
It reduces the number of selected residues by each individual method and allows a reasonable agreement with experimentally determined key residues coding for the particular fold. In addition, the intersection of the two methods significantly increases the specificity of the prediction, providing a robust set of residues that constitute the folding nucleus.
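The idealised hydrophobicity profile described for the FOD model can be sketched as a 3D Gaussian evaluated at each residue position and normalised over the chain. This is a minimal illustration under stated assumptions, not the published FOD implementation: the Gaussian is centred on the geometric centre of the supplied coordinates, each sigma is taken as one third of the largest extent along that axis, and no reorientation of the molecule is performed. The function name and the toy coordinates are hypothetical.

```python
import math


def idealised_hydrophobicity(coords):
    """Normalised 3D Gaussian value at each residue position.

    `coords` is a list of (x, y, z) tuples, one per residue
    (for instance side-chain centroids).
    """
    n = len(coords)
    centre = [sum(c[k] for c in coords) / n for k in range(3)]
    sigmas = []
    for k in range(3):
        extent = max(abs(c[k] - centre[k]) for c in coords)
        # Assumed convention: sigma is one third of the largest extent.
        sigmas.append(max(extent / 3.0, 1e-6))
    raw = []
    for c in coords:
        exponent = sum(
            (c[k] - centre[k]) ** 2 / (2.0 * sigmas[k] ** 2) for k in range(3)
        )
        raw.append(math.exp(-exponent))
    total = sum(raw)
    # Normalise so the profile sums to 1 and can be compared with an
    # equally normalised observed hydrophobicity profile.
    return [value / total for value in raw]


# Hypothetical coordinates: one central residue and six peripheral ones.
toy_coords = [
    (0.0, 0.0, 0.0),
    (3.0, 0.0, 0.0), (-3.0, 0.0, 0.0),
    (0.0, 3.0, 0.0), (0.0, -3.0, 0.0),
    (0.0, 0.0, 3.0), (0.0, 0.0, -3.0),
]
ideal_profile = idealised_hydrophobicity(toy_coords)
```

Residues whose observed local hydrophobicity matches this idealised profile would be considered accordant with the model, and the positions that are both accordant and predicted as MIRs form the intersection discussed in the abstract.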
Affiliation(s)
- Mateusz Banach
- Department of Bioinformatics and Telemedicine, Medical College, Jagiellonian University, Krakow, Poland
| | - Nicolas Prudhomme
- Protein Structure Prediction group, IMPMC, UPMC & CNRS, Paris, France
| | - Mathilde Carpentier
- Protein Structure Prediction group, IMPMC, UPMC & CNRS, Paris, France
- RPBS, 35 rue Hélène Brion, 75013, Paris, France
| | - Elodie Duprat
- Protein Structure Prediction group, IMPMC, UPMC & CNRS, Paris, France
- RPBS, 35 rue Hélène Brion, 75013, Paris, France
| | - Nikolaos Papandreou
- Genetics Department, Agricultural University of Athens, Iera Odos 75, Athens, Greece
| | - Barbara Kalinowska
- Department of Bioinformatics and Telemedicine, Medical College, Jagiellonian University, Krakow, Poland
| | - Jacques Chomilier
- Protein Structure Prediction group, IMPMC, UPMC & CNRS, Paris, France
- RPBS, 35 rue Hélène Brion, 75013, Paris, France
- * E-mail: (JC); (IR)
| | - Irena Roterman
- Department of Bioinformatics and Telemedicine, Medical College, Jagiellonian University, Krakow, Poland
- * E-mail: (JC); (IR)
| |
|
15
|
Chomilier J. Some insights into the transition state ensemble of the folding of globular proteins. Bio-Algorithms and Med-Systems 2014. [DOI: 10.1515/bams-2014-0020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
|
16
|
Abstract
This work analyzes proteins which contain an immunoglobulin fold, focusing on their hydrophobic core structure. The "fuzzy oil drop" model was used to measure the regularity of hydrophobicity distribution in globular domains belonging to proteins which exhibit the above-mentioned fold. Light-chain IgG domains are found to frequently contain regular hydrophobic cores, unlike the corresponding heavy-chain domains. Enzymes and DNA binding proteins present in the data-set are found to exhibit poor accordance with the hydrophobic core model.
Affiliation(s)
- M Banach
- Department of Bioinformatics and Telemedicine, Collegium Medicum, Jagiellonian University, Krakow, Poland
| | | | | | | |
|
17
|
Piwowar M, Banach M, Konieczny L, Chomilier J, Roterman I. Structural role of exons in hemoglobin. Bio-Algorithms and Med-Systems 2013. [DOI: 10.1515/bams-2013-0009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
|
18
|
Lonquety M, Chomilier J, Papandreou N, Lacroix Z. Prediction of Stability upon Point Mutation in the Context of the Folding Nucleus. OMICS: A Journal of Integrative Biology 2010; 14:151-6. [DOI: 10.1089/omi.2009.0022] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/12/2022]
Affiliation(s)
- Mathieu Lonquety
- Scientific Data Management Laboratory, Arizona State University, Tempe, Arizona
- Protein Structure Prediction, IMPMC, Université Pierre et Marie Curie, UMR 7590 CNRS, Paris, France
- Jacques Chomilier
- Protein Structure Prediction, IMPMC, Université Pierre et Marie Curie, UMR 7590 CNRS, Paris, France
- Zoé Lacroix
- Scientific Data Management Laboratory, Arizona State University, Tempe, Arizona
- Pharmaceutical Genomics Division, Translational Genomics Research Institute, Scottsdale, Arizona
|
19
|
Thireou T, Atlamazoglou V, Papandreou N, Lonquety M, Chomilier J, Eliopoulos E. Quantitative Prediction of Critical Amino Acid Positions for Protein Folding. Protein Pept Lett 2009; 16:1342-9. [DOI: 10.2174/092986609789353673] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/02/2009] [Accepted: 04/13/2009] [Indexed: 11/22/2022]
|
20
|
Prudhomme N, Chomilier J. Prediction of the protein folding core: application to the immunoglobulin fold. Biochimie 2009; 91:1465-74. [PMID: 19665046 DOI: 10.1016/j.biochi.2009.07.016] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 04/01/2009] [Accepted: 07/30/2009] [Indexed: 11/27/2022]
Abstract
We propose an algorithm for predicting the residues that are important for the formation of the structure of globular proteins. It relies on a simulation that detects the amino acids presenting a maximum number of neighbours during the early steps of the folding process. They have been called MIR (Most Interacting Residues). Independently, describing protein structures as fragments with closed ends, the tightened end fragments (TEF), reveals a correlation between these extremities and the core of the globules. These fragments are of rather constant length, typically between 20 and 25 amino acids, and we have previously shown that their extremities are preferentially occupied by MIR. Introducing rules derived from this fragment analysis of tertiary structures smooths the distribution of MIR, for a better match between TEF ends and MIR. In order to assess this prediction of the folding core, a large family of structures has been used, with sequences as different as possible. A dataset of 56 immunoglobulin structures of various functions but common fold has been used in this study. This fold was chosen because it is one of the most populated, with a large amount of data available on its nucleus. In the immunoglobulin domain, "functional and structural load is clearly separated: loops are responsible for binding and recognition while interactions between several residues of the buried core provide stability and fast folding"[1]. We then determined the positions likely to be of high importance for the folding process and compared them to published data from High Throw Out Order (HTOO), Conservatism of Conservatism (CoC) or Phi value experiments. The result is a reasonable agreement between the positions that we predict and the experimental data. Besides, our prediction goes beyond the simple use of a null solvent accessibility of amino acids as a criterion to predict the core. We find the same prediction quality on the flavodoxin-like superfamily.
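The idea of selecting residues with a maximum number of neighbours in early folding snapshots can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the simulation output is a list of snapshots, each a list of integer lattice coordinates, and the names `mean_neighbour_counts`, `most_interacting`, `cutoff`, `min_separation` and `threshold` are hypothetical.

```python
from itertools import combinations

def mean_neighbour_counts(snapshots, cutoff=1.0, min_separation=2):
    """Average number of non-bonded spatial neighbours per residue.

    snapshots: list of conformations, each a list of (x, y, z) coordinates.
    cutoff and min_separation are illustrative values, not from the paper.
    """
    n = len(snapshots[0])
    totals = [0] * n
    for coords in snapshots:
        for i, j in combinations(range(n), 2):
            if j - i < min_separation:
                continue  # skip residues adjacent along the chain
            dist2 = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
            if dist2 <= cutoff ** 2:
                totals[i] += 1
                totals[j] += 1
    return [t / len(snapshots) for t in totals]

def most_interacting(snapshots, threshold, **kwargs):
    """Indices of residues whose mean neighbour count reaches `threshold`."""
    counts = mean_neighbour_counts(snapshots, **kwargs)
    return [i for i, c in enumerate(counts) if c >= threshold]
```

For a six-residue hairpin on a square lattice, the residues facing each other across the turn are the ones flagged; the smoothing rules derived from the fragment analysis are not represented in this sketch.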
Affiliation(s)
- Nicolas Prudhomme
- Protein Structure Prediction, IMPMC, CNRS UMR 7590, Paris 6 University, 75015 Paris, France
|
21
|
Hernández Torres J, Papandreou N, Chomilier J. Sequence analyses reveal that a TPR-DP module, surrounded by recombinable flanking introns, could be at the origin of eukaryotic Hop and Hip TPR-DP domains and prokaryotic GerD proteins. Cell Stress Chaperones 2009; 14:281-9. [PMID: 18987995 PMCID: PMC2728264 DOI: 10.1007/s12192-008-0083-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 05/02/2008] [Accepted: 09/15/2008] [Indexed: 11/27/2022] Open
Abstract
The co-chaperone Hop [heat shock protein (HSP) organising protein] is known to bind both Hsp70 and Hsp90. Hop comprises three repeats of a tetratricopeptide repeat (TPR) domain, each consisting of three TPR motifs. The first and last TPR domains are followed by a domain containing several dipeptide (DP) repeats called the DP domain. These analyses suggest that the hop genes result from successive recombination events of an ancestral TPR-DP module. From a hydrophobic cluster analysis of homologous Hop protein sequences derived from gene families, we can postulate that shifts in the open reading frames are at the origin of the present sequences. Moreover, these shifts can be related to the presence or absence of biological function. We propose to extend the family of Hop co-chaperones into the kingdom of bacteria, as several structurally related genes have been identified by hydrophobic cluster analysis. We also provide evidence of common structural characteristics between hop and hip genes, suggesting a shared precursor of ancestral TPR-DP domains.
Affiliation(s)
- Jorge Hernández Torres
- Laboratorio de Biología Molecular, Escuela de Biología, Universidad Industrial de Santander, Apartado Aéreo 678, Bucaramanga, Colombia.
|
22
|
Abstract
SPROUTS (Structural Prediction for pRotein fOlding UTility System) is a new database that provides access to various structural data sets and integrated functionalities not yet available to the community. The originality of the SPROUTS database is the ability to gain access to a variety of structural analyses at one place and with a strong interaction between them. SPROUTS currently combines data pertaining to 429 structures that capture representative folds and results related to the prediction of critical residues expected to belong to the folding nucleus: the MIR (Most Interacting Residues), the description of the structures in terms of modular fragments: the TEF (Tightened End Fragments), and the calculation at each position of the free energy change gradient upon mutation by one of the 19 amino acids. All database results can be displayed and downloaded in textual files and Excel spreadsheets and visualized on the protein structure. SPROUTS is a unique resource to access as well as visualize state-of-the-art characteristics of protein folding and analyse the effect of point mutations on protein structure. It is available at http://bioinformatics.eas.asu.edu/sprouts.html.
Affiliation(s)
- Mathieu Lonquety
- Scientific Data Management Laboratory, Arizona State University, Tempe AZ 85282-5706, USA
|
23
|
Hernández Torres J, Maldonado MAA, Chomilier J. Tandem duplications of a degenerated GTP-binding domain at the origin of GTPase receptors Toc159 and thylakoidal SRP. Biochem Biophys Res Commun 2007; 364:325-31. [PMID: 17950698 DOI: 10.1016/j.bbrc.2007.10.006] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Received: 09/18/2007] [Accepted: 10/03/2007] [Indexed: 11/28/2022]
Abstract
The evolutionary origin of some nuclear encoded proteins that translocate proteins across the chloroplast envelope remains unknown. Therefore, sequences of GTPase proteins constituting the Arabidopsis thaliana translocon at the outer membrane of chloroplast (atToc) complexes were analyzed by means of hydrophobic cluster analysis (HCA). In particular, atToc159 and related proteins (atToc132, atToc120, and atToc90) do not have proven homologues of prokaryotic or eukaryotic ancestry. We established that the three domains commonly referred to as A, G, and M originate from the GTPase G domain, tandemly repeated, and probably evolving toward an unstructured conformation in the case of the A domain. This study yielded a putative common ancestor for these proteins, and a new domain definition has been proposed, in particular the splitting of A into three domains (A1, A2, and A3). The family of Toc159, previously containing A. thaliana and Pisum sativum, has been extended to Medicago truncatula and Populus trichocarpa and it has been revised for Oryza sativa. They have also been compared to GTPase subunits involved in the cpSRP system. A distant homology has been revealed among Toc and cpSRP GTP-hydrolyzing proteins of A. thaliana, and repetitions of a GTPase domain were also found in cpSRP protein receptors, by means of HCA.
Affiliation(s)
- Jorge Hernández Torres
- Laboratorio de Biología Molecular, CINBIN, Universidad Industrial de Santander, Bucaramanga, Apartado Aéreo 678, Colombia.
|
24
|
Leite TB, Gomes D, Miteva M, Chomilier J, Villoutreix B, Tufféry P. Frog: a FRee Online druG 3D conformation generator. Nucleic Acids Res 2007; 35:W568-72. [PMID: 17485475 PMCID: PMC1933180 DOI: 10.1093/nar/gkm289] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.8] [Received: 01/31/2007] [Revised: 04/06/2007] [Accepted: 04/12/2007] [Indexed: 01/21/2023] Open
Abstract
In silico screening methods based on the 3D structures of the ligands or of the proteins have become an essential tool to facilitate the drug discovery process. To achieve such process, the 3D structures of the small chemical compounds have to be generated. In addition, for ligand-based screening computations or hierarchical structure-based screening projects involving a rigid-body docking step, it is necessary to generate multi-conformer 3D models for each input ligand to increase the efficiency of the search. However, most academic or commercial compound collections are delivered in 1D SMILES (simplified molecular input line entry system) format or in 2D SDF (structure data file), highlighting the need for free 1D/2D to 3D structure generators. Frog is an on-line service aimed at generating 3D conformations for drug-like compounds starting from their 1D or 2D descriptions. Given the atomic constitution of the molecules and connectivity information, Frog can identify the different unambiguous isomers corresponding to each compound, and generate single or multiple low-to-medium energy 3D conformations, using an assembly process that does not presently consider ring flexibility. Tests show that Frog is able to generate bioactive conformations close to those observed in crystallographic complexes. Frog can be accessed at http://bioserv.rpbs.jussieu.fr/Frog.html.
Affiliation(s)
- T. Bohme Leite, D. Gomes, M.A. Miteva, J. Chomilier, B.O. Villoutreix, P. Tufféry
- Equipe de Bioinformatique Génomique et Moléculaire, INSERM UMR 726, Université Paris 7, case 7113, 2, place Jussieu, 75251 Paris cedex 05; Equipe de Bioinformatique Structurale et Drug Design, INSERM U648, Université Paris 5, 45 rue des Sts Peres, 75006 Paris; and Département de Biologie Structurale, CNRS UMR 7590, Universités Paris 6 et 7, Université Pierre et Marie CURIE, 4 place Jussieu, case postale 115, 75252 Paris cedex 05, France
|
25
|
Stevanin G, Santorelli FM, Azzedine H, Coutinho P, Chomilier J, Denora PS, Martin E, Ouvrard-Hernandez AM, Tessa A, Bouslam N, Lossos A, Charles P, Loureiro JL, Elleuch N, Confavreux C, Cruz VT, Ruberg M, Leguern E, Grid D, Tazir M, Fontaine B, Filla A, Bertini E, Durr A, Brice A. Mutations in SPG11, encoding spatacsin, are a major cause of spastic paraplegia with thin corpus callosum. Nat Genet 2007; 39:366-72. [PMID: 17322883 DOI: 10.1038/ng1980] [Citation(s) in RCA: 232] [Impact Index Per Article: 13.6] [Received: 07/26/2006] [Accepted: 01/18/2007] [Indexed: 11/08/2022]
Abstract
Autosomal recessive hereditary spastic paraplegia (ARHSP) with thin corpus callosum (TCC) is a common and clinically distinct form of familial spastic paraplegia that is linked to the SPG11 locus on chromosome 15 in most affected families. We analyzed 12 ARHSP-TCC families, refined the SPG11 candidate interval and identified ten mutations in a previously unidentified gene expressed ubiquitously in the nervous system but most prominently in the cerebellum, cerebral cortex, hippocampus and pineal gland. The mutations were either nonsense or insertions and deletions leading to a frameshift, suggesting a loss-of-function mechanism. The identification of the function of the gene will provide insight into the mechanisms leading to the degeneration of the corticospinal tract and other brain structures in this frequent form of ARHSP.
Affiliation(s)
- Giovanni Stevanin
- INSERM, UMR679, Federal Institute for Neuroscience Research, Pitié-Salpêtrière Hospital, Paris, France.
|
26
|
Alland C, Moreews F, Boens D, Carpentier M, Chiusa S, Lonquety M, Renault N, Wong Y, Cantalloube H, Chomilier J, Hochez J, Pothier J, Villoutreix BO, Zagury JF, Tufféry P. RPBS: a web resource for structural bioinformatics. Nucleic Acids Res 2005; 33:W44-9. [PMID: 15980507 PMCID: PMC1160237 DOI: 10.1093/nar/gki477] [Citation(s) in RCA: 67] [Impact Index Per Article: 3.5] [Indexed: 11/22/2022] Open
Abstract
RPBS (Ressource Parisienne en Bioinformatique Structurale) is a resource dedicated primarily to structural bioinformatics. It is the result of a joint effort by several teams to set up an interface that offers original and powerful methods in the field. As an illustration, we focus here on three such methods uniquely available at RPBS: AUTOMAT for sequence databank scanning, YAKUSA for structure databank scanning and WLOOP for homology loop modelling. The RPBS server can be accessed at and the specific services at .
Affiliation(s)
- D. Boens
- Department of Structural Biology, IMPMC, CNRS UMR 7590, Paris, France
- S. Chiusa
- Department of Structural Biology, IMPMC, CNRS UMR 7590, Paris, France
- M. Lonquety
- Department of Structural Biology, IMPMC, CNRS UMR 7590, Paris, France
- N. Renault
- Department of Structural Biology, IMPMC, CNRS UMR 7590, Paris, France
- H. Cantalloube
- Chaire de Bioinformatique, Conservatoire National des Arts et Métiers, Paris, France
- J. Chomilier
- Department of Structural Biology, IMPMC, CNRS UMR 7590, Paris, France
- J.-F. Zagury
- Chaire de Bioinformatique, Conservatoire National des Arts et Métiers, Paris, France
- P. Tufféry
- To whom correspondence should be addressed. Tel: +33 1 44 27 77 33; Fax: +33 1 43 26 38 30
|
27
|
Bertin-Maghit SM, Capini CJ, Bessis N, Chomilier J, Muller S, Abbas A, Autin L, Spadoni JL, Rappaport J, Therwath A, Boissier MC, Zagury JF. Improvement of collagen-induced arthritis by active immunization against murine IL-1β peptides designed by molecular modelling. Vaccine 2005; 23:4228-35. [PMID: 16005738 DOI: 10.1016/j.vaccine.2005.03.030] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.4] [Received: 10/07/2004] [Revised: 03/02/2005] [Accepted: 03/31/2005] [Indexed: 10/25/2022]
Abstract
Interleukin-1beta (IL-1beta) is a crucial cytokine in inflammation processes and has been implicated in the pathogenesis of several chronic inflammatory diseases. Strategies designed to block IL-1beta by passive administration of inhibitors (mAbs, IL-1 receptor antagonist) have previously demonstrated efficacy in rheumatoid arthritis (RA). Using molecular modelling, we have defined three murine IL-1beta peptide regions characterized by their close proximity to the receptor. Synthetic peptides corresponding to these regions, in cyclic and linear form, were delivered as immunogens in Swiss mice, resulting in significant levels of autoantibodies directed against the native murine IL-1beta cytokine as determined by ELISA and by an assay for neutralization of IL-1beta biological activity. More importantly, one of the cyclic peptides showed a protective effect against inflammation and articular destruction in DBA/1 mouse collagen-induced arthritis, a model of RA. The high rate of success observed for active immunization against cytokine peptides in vivo suggests that the in silico approach to autoantigen design may be a promising avenue for the development of anti-cytokine immunotherapeutics.
|
28
|
Abstract
Database scanning programs such as BLAST and FASTA are used nowadays by most biologists for the post-genomic processing of DNA or protein sequence information (in particular to retrieve the structure/function of uncharacterized proteins). Unfortunately, their results can be polluted by identical alignments (called redundancies) coming from the same protein or DNA sequences present in different entries of the database. This makes the efficient use of the listed alignments difficult. Pretreatment of databases has been proposed to suppress strictly identical entries. However, many identical alignments still remain, since redundancies may occur locally for entries corresponding to various fragments of the same sequence, or for entries corresponding to very homologous sequences that differ by only a few residues, such as ortholog proteins. In the present work, we show that redundant alignments can indeed be numerous even when working with a pretreated non-redundant data bank, going as high as 60% of the output results depending on the query and the bank. Therefore the accuracy and the efficiency of the post-genomic work will be greatly increased if these redundancies are removed. To solve this so far unaddressed problem, we have developed an algorithm that allows for the efficient and safe suppression of all the redundancies with no loss of information. This algorithm is based on various filtering steps that we describe here in the context of the Automat similarity search program, and such an algorithm should also be added to the other similarity search programs (BLAST, FASTA, etc.).
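One simple way to picture redundancy suppression "with no loss of information" is to group identical alignments and keep every source entry identifier alongside a single representative. The sketch below is an assumption-laden illustration, not the filtering steps of Automat: the `Alignment` fields and the `collapse_redundant` function are hypothetical, and two alignments are treated as identical only when they cover the same query range with the same aligned subject segment.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Alignment:
    entry_id: str         # database entry the hit came from
    query_start: int      # first aligned query position
    query_end: int        # last aligned query position
    subject_segment: str  # aligned subject residues, gaps included

def collapse_redundant(alignments):
    """Group identical alignments, keeping the first one as representative.

    Returns a list of (representative, entry_ids) pairs in order of first
    appearance, so no entry identifier is lost when duplicates are merged.
    """
    groups = {}
    for aln in alignments:
        key = (aln.query_start, aln.query_end, aln.subject_segment)
        if key not in groups:
            groups[key] = (aln, [])
        groups[key][1].append(aln.entry_id)
    return list(groups.values())
```

Alignments from different fragments of the same sequence would collapse into one group here only if their aligned segments are strictly identical; handling near-identical orthologs would need a looser key than this example uses.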
Affiliation(s)
- Hubert Cantalloube
- Groupe Bioinformatique, Génomique et Traitement des Pathologies du Système Immunitaire, INSERM EMI0355, 15 rue de l'Ecole de Médecine, 75006 Paris, France
|
29
|
Abstract
The description of globular protein structures as an ensemble of contiguous 'closed loops' or 'tightened end fragments' reveals fold elements crucial for the formation of stable structures and for navigating the very process of protein folding. These are the ends of the loops, which are spatially close to each other but are situated apart in the polypeptide chain by 25-30 residues. They also correlate with the locations of highly conserved hydrophobic residues (referred to as topohydrophobic), in a structural alignment of the members of a protein family. This study analysed these positions in 111 representatives of different protein folds, and then carried out dynamic Monte Carlo simulations of the first steps of the folding process, aimed at predicting the origins of the assembling folds. The simulations demonstrated that there is an obvious trend for certain sets of residues, named 'mostly interacting residues', to be buried at the early stages of the folding process. Location of these residues at the loop ends and correlation with topohydrophobic positions are demonstrated, thereby giving a route to simulations of the protein folding process.
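The geometric criterion described above, loop ends that are close in space but 25-30 residues apart in the chain, can be sketched as a scan over Cα coordinates. This is a minimal illustration under stated assumptions: the function name `closed_loop_candidates` and the 10 Å distance cutoff are hypothetical choices for the example, since the abstract gives only the sequence separation.

```python
import math

def closed_loop_candidates(ca_coords, min_len=25, max_len=30, max_dist=10.0):
    """Return (i, j, distance) for residue pairs whose sequence separation
    lies in [min_len, max_len] and whose C-alpha atoms are within max_dist.

    ca_coords: list of (x, y, z) C-alpha positions, one per residue.
    max_dist is an illustrative cutoff, not a value taken from the study.
    """
    hits = []
    n = len(ca_coords)
    for i in range(n):
        last = min(i + max_len, n - 1)
        for j in range(i + min_len, last + 1):
            d = math.dist(ca_coords[i], ca_coords[j])
            if d <= max_dist:
                hits.append((i, j, d))
    return hits
```

A fully extended chain yields no candidates, whereas a chain wound into a ring of about 28 residues per turn yields many, which is the behaviour one would expect from this criterion.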
|
30
|
Capini CJ, Bertin-Maghit SM, Bessis N, Haumont PM, Bernier EM, Muel EG, Laborie MA, Autin L, Paturance S, Chomilier J, Boissier MC, Briand JP, Muller S, Cavaillon JM, Therwath A, Zagury JF. Active immunization against murine TNFalpha peptides in mice: generation of endogenous antibodies cross-reacting with the native cytokine and in vivo protection. Vaccine 2004; 22:3144-53. [PMID: 15297067 DOI: 10.1016/j.vaccine.2004.01.064] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.2] [Received: 07/31/2003] [Revised: 12/03/2003] [Accepted: 01/26/2004] [Indexed: 10/26/2022]
Abstract
New lines of treatment targeting cytokines have been successfully developed recently and are now widely used in therapy. They are based on passive administration of cytokine inhibitors, either soluble receptors or mAbs, and the major example is TNFalpha in rheumatoid arthritis (RA). For a few years, our group has been developing a novel alternative approach targeting cytokines by using active immunization against biologically inactive but immunogenic cytokine derivatives. In the present work, we present a new aspect of this research, based on immunization against specific cytokine peptides chosen by molecular modelling. We could elicit a significant humoral response against four TNFalpha peptides by active immunization, and show that the Abs generated cross-reacted with the native cytokine with good titers as determined by ELISA. Interestingly, during coimmunization experiments with pairs of peptides, one showed a clear immunodominant effect over the other. Overall, we could not show the neutralization of TNFalpha biological activity in vitro by the immunized sera, but it seems that this is not a prerequisite for observing clinical efficacy. Indeed, using the LPS/galactosamine-induced shock, we could demonstrate that one of the four peptides tested conferred a clinical protection. These results validate the feasibility and efficacy of active immunization against cytokine peptides, and confirm that active immunization against cytokines could represent in the future an alternative to passive immunization in many diseases.
Affiliation(s)
- C J Capini
- Centre de recherche des Cordeliers, NEOVACS, Paris, France
|
31
|
Abstract
The folding process of a set of 42 proteins, representative of the various folds, has been simulated by means of a Monte Carlo method on a discrete lattice, using two different potentials of mean force. Multiple compact fragments of contiguous residues are formed in the simulation, stable in composition, but not in geometry. Over time, the number of fragments decreases until one final compact globular state is reached. We focused on the early steps of the folding in order to identify the maximum number of fragments, provided they are sufficiently stable in sequence. A correlation has been established between these proto fragments and regular secondary-structure elements, whatever their nature, alpha helices or beta strands. Quantitatively, this is revealed by an overall mean one-residue quality factor of nearly 60%, which is better for proteins mainly composed of alpha helices. The correspondence between the number of fragments and the number of secondary-structure elements is 77%, and the regions separating successive fragments are mainly located in loops. Besides, hydrophobic clusters deduced from HCA correspond to fragments with an equivalent accuracy. These results suggest that folding pathways do not contain structurally static intermediates. However, from the beginning of folding, most residues that will later form one given secondary structure are kept close in space by being involved in the same fragment. This aggregation may be a way to accelerate the formation of the native state and reinforces the key role played by hydrophobic residues in the formation of the fragments, thus in the folding process itself.
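The abstract does not state which acceptance rule the lattice Monte Carlo simulation uses. As a generic illustration only, the sketch below assumes the standard Metropolis criterion, where `delta_e` would be the change in potential of mean force caused by a trial move; the function name and parameters are hypothetical.

```python
import math
import random

def metropolis_accept(delta_e, temperature, rng=random):
    """Metropolis rule: always accept moves that lower the energy, and accept
    uphill moves with probability exp(-delta_e / temperature).

    delta_e and temperature must be expressed in the same energy units.
    """
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)
```

In a lattice simulation, such a test would be applied after each proposed local move of the chain, so that compact, low-energy fragments form and persist more often than extended ones.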
Affiliation(s)
- Jacques Chomilier
- Equipe 'Systèmes moléculaires et Biologie structurale', LMCP, université Paris-6, CNRS UMR 7590, case 115, 75252 Paris 05, France.
|
32
|
Gregoire S, Logre C, Metharom P, Loing E, Chomilier J, Rosset MB, Aucouturier P, Carnaud C. Identification of two immunogenic domains of the prion protein-PrP-which activate class II-restricted T cells and elicit antibody responses against the native molecule. J Leukoc Biol 2004; 76:125-34. [PMID: 15075357 DOI: 10.1189/jlb.1203656] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.3] [Indexed: 11/24/2022] Open
Abstract
Recent reports suggest that immunity against the prion protein (PrP) retards the progression of transmissible spongiform encephalopathies in infected mice. A major obstacle to the development of vaccines comes from the fact that PrP is poorly immunogenic, as it is seen as self by the host immune system. Additional questions concern the immune mechanisms involved in protection and the risk of eliciting adverse reactions in the central nervous system of treated patients. Peptide-based vaccines offer an attractive strategy to overcome these difficulties. We have undertaken the identification of the immunogenic regions of PrP, which trigger helper T cells (Th) associated with antibody production. Our results identify two main regions, one between the structured and flexible portion of PrP (98-127) and a second between alpha 1 and alpha 2 helix (143-187). Peptides (30-mer) corresponding to these regions elicit class II-restricted Th cells and antibody production against native PrP and could therefore be of potential interest for a peptide-based vaccination.
Affiliation(s)
- Sylvie Gregoire
- INSERM E209, Hôpital Saint-Antoine, Bâtiment Kourilsky, 184 rue du faubourg Saint-Antoine, 75571 Paris Cedex 12, France.
|
33
|
Znamenskiy D, Le Tuan K, Mornon JP, Chomilier J. A new protein folding algorithm based on hydrophobic compactness: Rigid Unconnected Secondary Structure Iterative Assembly (RUSSIA). II: Applications. Protein Eng Des Sel 2003; 16:937-48. [PMID: 14983073 DOI: 10.1093/protein/gzg141] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/15/2022] Open
Abstract
The RUSSIA procedure (Rigid Unconnected Secondary Structure Iterative Assembly) produces structural models of cores of small- and medium-sized proteins. Loops are omitted from this treatment and regular secondary structures are reduced to points, the centers of their hydrophobic faces. This methodology relies on the maximum compactness of the hydrophobic residues, as described in detail in Part I. Starting data are the sequence and the predicted limits and natures of regular secondary structures (alpha or beta). Helices are treated as rigid cylinders, whereas beta-strands are collectively taken into account within beta-sheets modeled by helicoid surfaces. Strands are allowed to shift along their mean axis to allow some flexibility and the alpha-helices can be placed on either side of beta-sheets. Numerous initial conformations are produced by discrete rotations of the helices and sheets around the direction going from the center of their hydrophobic face to the global center of the protein. Selection of proposed models is based upon a criterion that minimizes the distances separating hydrophobic residues belonging to different regular secondary structures. The procedure is rapid and appears to be robust relative to the quality of starting data (nature and length of regular secondary structures). This dependence of the quality of the model on secondary structure prediction, and in particular on the beta-sheet topology, is one of the limits of the present algorithm. We present here some results for a set of 12 proteins (alpha, beta and alpha/beta classes) of lengths 40-166 amino acids. The r.m.s. deviations for core models with respect to the native proteins are in the range 1.4-3.7 Å.
Affiliation(s)
- Denis Znamenskiy
- Systèmes Moléculaires et Biologie Structurale, LMCP, Universités Paris 6 et Paris 7, CNRS UMR 7590, Case 115, 75252 Paris cedex 05, France
|
34
|
Znamenskiy D, Chomilier J, Le Tuan K, Mornon JP. A new protein folding algorithm based on hydrophobic compactness: Rigid Unconnected Secondary Structure Iterative Assembly (RUSSIA). I: Methodology. Protein Eng Des Sel 2003; 16:925-35. [PMID: 14983072 DOI: 10.1093/protein/gzg140] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/12/2022] Open
Abstract
We present an algorithm that is able to propose compact models of protein 3D structures, starting only from the prediction of the nature and length of regular secondary structures. Helices are modeled by cylinders and sheets by helicoid surfaces, all strands of a sheet being considered as a single block. This means that the relative topology of the strands inside one sheet is a prerequisite. Loops are only considered as constraints, given by the maximal distance between their Calpha extremities according to their sequence length. Unconnected regular secondary structures are reduced to a single point, the center of their hydrophobic faces. These centers are then repeatedly moved in order to obtain a compact hydrophobic core. To prevent secondary structures from interpenetrating, a repulsive term is introduced in the function whose minimization leads to the compact structure. This RUSSIA (Rigid Unconnected Secondary Structure Iterative Assembly) algorithm has the advantage of relying on a small number of variables and therefore many initial conformations can be tested. Flexibility is produced in the following way: helices or sheets are allowed to rotate around the direction leading to the center of the model; residues in a sheet can slide along the main direction of the strand where they are embedded. RUSSIA is fast and simple and, on a test set, it produces several good neighboring models with an r.m.s. deviation from the native structures in the range 1.4-3.7 Å. These models can be further treated by statistical potentials used in threading approaches in order to detect the best candidate. The limits of the present method are the following: small proteins with few secondary structures are excluded; multi-domain proteins must be split into several compact globular domains from their sequences; sheets of more than five strands and completely buried helices are not treated.
In this first paper the algorithm is developed and in Part II, which follows, some applications are presented and the program is evaluated.
Affiliation(s)
- Denis Znamenskiy
- Systèmes Moléculaires et Biologie Structurale, LMCP, Universités Paris 6 et Paris 7, CNRS UMR 7590, Case 115, 75252 Paris cedex 05, France
|
35
|
Angelov B, Sadoc JF, Jullien R, Soyer A, Mornon JP, Chomilier J. Nonatomic solvent-driven Voronoi tessellation of proteins: an open tool to analyze protein folds. Proteins 2002; 49:446-56. [PMID: 12402355 DOI: 10.1002/prot.10220] [Citation(s) in RCA: 38] [Impact Index Per Article: 1.7] [Indexed: 11/11/2022]
Abstract
A three-dimensional Voronoi tessellation of folded proteins is used to analyze geometrical and topological properties of a set of proteins. To each amino acid is associated a central point surrounded by a Voronoi cell. Voronoi cells describe the packing of the amino acids. Special attention is given to reproduction of the protein surface. Once the Voronoi cells are built, a lot of tools from geometrical analysis can be applied to investigate the protein structure; volume of cells, number of faces per cell, and number of sides per face are the usual signatures of the protein structure. A distinct difference between faces related to primary, secondary, and tertiary structures has been observed. Faces threaded by the main-chain have on average more than six edges, whereas those related to helical packing of the amino acid chain have less than five edges. The faces on the protein surface have on average five edges within 1% error. The average number of faces on the protein surface for a given type of amino acid brings a new point of view in the characterization of the exposition to the solvent and the classification of amino acid as hydrophilic or hydrophobic. It may be a convenient tool for model validation.
Affiliation(s)
- Borislav Angelov
- Laboratoire de Physique des Solides, Université Paris 11, Orsay, France
|
36
|
Lamarine M, Mornon JP, Berezovsky N, Chomilier J. Distribution of tightened end fragments of globular proteins statistically matches that of topohydrophobic positions: towards an efficient punctuation of protein folding? Cell Mol Life Sci 2001; 58:492-8. [PMID: 11315195 DOI: 10.1007/pl00000873] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.3] [Indexed: 10/25/2022]
Abstract
Using a set of 372 proteins representative of a variety of 56 distinct globular folds, a statistical correlation was observed between two recently revealed features of protein structures: tightened end fragments or 'closed loops', i.e. sequence fragments that are able in three-dimensional (3D) space to nearly close their ends (a current parameter of polymer physics), and 'topohydrophobic positions', i.e. positions always occupied in 3D space by strong hydrophobic amino acids for all members of a fold family. Indeed, in sequence space, the distribution of preferred lengths for tightened end fragments and that for topohydrophobic separation match. In addition to this statistically significant similarity, the extremities of these 'closed loops' may be preferentially occupied by topohydrophobic positions, as observed on a random sample of various folds. This observation may be of special interest for sequence comparison of distantly related proteins. It is also important for the ab initio prediction of protein folds, considering the remarkable topological properties of topohydrophobic positions and their paramount importance within folding nuclei. Consequently, topohydrophobic positions locking the 'closed loops' belong to the deep cores of protein domains and might have a key role in the folding process.
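The notion of a tightened end fragment ("closed loop") can be sketched as a scan over Calpha coordinates for sequence fragments whose two ends come close in 3D space. The abstract gives no numerical thresholds, so the fragment length window and the end-to-end cutoff below are hypothetical parameters chosen only to make the example concrete.

```python
import math


def closed_loops(ca_coords, min_len=20, max_len=30, max_end_dist=10.0):
    """Return (start, end) index pairs of fragments whose ends are close in space.

    ca_coords: list of (x, y, z) Calpha positions along the chain.
    min_len, max_len, max_end_dist: hypothetical parameters, not values
    taken from the paper.
    """
    hits = []
    n = len(ca_coords)
    for start in range(n):
        for length in range(min_len, max_len + 1):
            end = start + length - 1
            if end >= n:
                break  # longer fragments would also run past the chain end
            if math.dist(ca_coords[start], ca_coords[end]) <= max_end_dist:
                hits.append((start, end))
    return hits
```

The distribution of `end - start + 1` over the returned pairs is the kind of length statistic that the abstract compares with the separation between topohydrophobic positions.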
Affiliation(s)
- M Lamarine
- Equipe Systèmes Moléculaires et Biologie Structurale, LMCP, Université Paris 6, CNRS UMR 7590, France
|
37
|
Eyries M, Michaud A, Deinum J, Agrapart M, Chomilier J, Kramers C, Soubrier F. Increased shedding of angiotensin-converting enzyme by a mutation identified in the stalk region. J Biol Chem 2001; 276:5525-32. [PMID: 11076943 DOI: 10.1074/jbc.m007706200] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.5] [Indexed: 11/06/2022] Open
Abstract
Angiotensin-converting enzyme (ACE), an enzyme that plays a major role in vasoactive peptide metabolism, is a type 1 ectoprotein, which is released from the plasma membrane by a proteolytic cleavage occurring in the stalk sequence adjacent to the membrane anchor. In this study, we have discovered the molecular mechanism underlying the marked increase of plasma ACE levels observed in three unrelated individuals. We have identified a Pro(1199) --> Leu mutation in the juxtamembrane stalk region. In vitro analysis revealed that the shedding of [Leu(1199)]ACE was enhanced compared with wild-type ACE. The solubilization process of [Leu(1199)]ACE was stimulated by phorbol esters and inhibited by compound 3, an inhibitor of ACE-secretase. The results of Western blot analysis were consistent with a cleavage at the major described site (Arg(1203)/Ser(1204)). Two-dimensional structural analysis of ACE showed that the mutated residue was critical for the positioning of a specific loop containing the cleavage site. We therefore propose that a local conformational modification caused by the Pro(1199) --> Leu mutation leads to more accessibility at the stalk region for ACE secretase and is responsible for the enhancement of the cleavage-secretion process. Our results show that different molecular mechanisms are responsible for the common genetic variation of plasma ACE and for its more rare familial elevation.
Affiliation(s)
- M Eyries
- Institut National de la Santé et de la Recherche Médicale Unit 525, Faculté de médecine Pitié-Salpétrière, 91 Boulevard de l'Hôpital, 75013 Paris, France
|
38
|
Bouchard P, Chomilier J, Ravet V, Mornon JP, Viguès B. Molecular characterization of the major membrane skeletal protein in the ciliate Tetrahymena pyriformis suggests n-plication of an early evolutionary intermediate filament protein subdomain. J Cell Sci 2001; 114:101-110. [PMID: 11112694 DOI: 10.1242/jcs.114.1.101] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.7] [Indexed: 11/20/2022] Open
Abstract
Epiplasmin C is the major protein component of the membrane skeleton in the ciliate Tetrahymena pyriformis. Cloning and analysis of the gene encoding epiplasmin C showed this protein to be a previously unrecognized protein. In particular, epiplasmin C was shown to lack the canonical features of already known epiplasmic proteins in ciliates and flagellates. By means of hydrophobic cluster analysis (HCA), it has been shown that epiplasmin C is constituted of a repeat of 25 domains of 40 residues each. These domains are related and can be grouped in two families called type I and type II. Connections between type I and type II domains follow rules that can be evidenced in the sequence itself, thus reinforcing the validity of the splitting of the domains. Using these repeated domains as queries, significant structural similarities were demonstrated with an extra six heptads shared by nuclear lamins and invertebrate cytoplasmic intermediate filament proteins and deleted in the cytoplasmic intermediate filament protein lineage at the protostome-deuterostome branching in the eukaryotic phylogenetic tree.
Affiliation(s)
- P Bouchard
- Laboratoire de Biologie des Protistes CNRS UMR 6023, Université Blaise Pascal 63177 Aubière cedex, France
|
39
|
Soyer A, Chomilier J, Mornon JP, Jullien R, Sadoc JF. Voronoï tessellation reveals the condensed matter character of folded proteins. Phys Rev Lett 2000; 85:3532-3535. [PMID: 11030939 DOI: 10.1103/physrevlett.85.3532] [Citation(s) in RCA: 41] [Impact Index Per Article: 1.7] [Received: 02/18/2000] [Indexed: 05/23/2023]
Abstract
The packing geometry of amino acids in folded proteins is analyzed via a modified Voronoï tessellation method which distinguishes bulk and surface. From a statistical analysis of the Voronoï cells over 40 representative proteins, it appears that the packings are on average similar to random packings of hard spheres encountered in condensed matter physics, with a quite strong fivefold local symmetry. Moreover, the statistics permit one to establish a classification of amino acids in terms of increasing propensity to be buried, in agreement with what is known from chemical considerations.
Affiliation(s)
- A Soyer
- Laboratoire de Minéralogie Cristallographie, Universités Paris 6 et 7, 4 place Jussieu, 75252 Paris, France
|
40
|
Abstract
We present a topological description of a beta-sheet in terms of a piece of helical surface. It requires only two easy-to-handle parameters: the twist, i.e. the turn of the helical surface per residue, and the coiling, which is a curvature along the strands or in the direction perpendicular to the strands of the sheet. This method applies fairly well to three- and four-strand sheets, which form too limited a structure to build a barrel. From an analysis of beta-sheets derived from a structural database, we show that this picture can even be reduced to the use of one main value, the twist angle. The dependence of beta-sheet twisting on the number of strands in a sheet, and also on the length and direction of strands, has been demonstrated. The applications of such a description may include the rapid modeling of 3D structures.
Affiliation(s)
- D Znamenskiy
- Equipe Systèmes Moléculaires et Biologie Structurale, LMCP, CNRS UMR7590, Universités Paris 6 et Paris 7, Case 115, 75252 Paris cedex 05, France
|
41
|
Wojcik J, Mornon JP, Chomilier J. New efficient statistical sequence-dependent structure prediction of short to medium-sized protein loops based on an exhaustive loop classification. J Mol Biol 1999; 289:1469-90. [PMID: 10373380 DOI: 10.1006/jmbi.1999.2826] [Citation(s) in RCA: 71] [Impact Index Per Article: 2.8] [Indexed: 11/22/2022]
Abstract
A bank of 13,563 loops from three to eight amino acid residues long, representing motifs between two consecutive regular secondary structures, has been derived from protein structures presenting less than 95% sequence identity. Statistical analyses of occurrences of conformations and residues revealed length-dependent over-representations of particular amino acids (glycine, proline, asparagine, serine, and aspartate) and conformations (alphaL, epsilon, betaP regions of the Ramachandran plot). A position-dependent distribution of these occurrences was observed for N and C-terminal residues, which are correlated to the nature of the flanking regions. Loops of the same length were clustered into statistically meaningful families on the basis of their backbone structures when placed in a common reference frame, independent of the flanks. These clusters present significantly different distributions of sequence, conformations, and endpoint residue Calpha distances. On the basis of the sequence-structure correlation of this clustering, an automatic loop modeling algorithm was developed. Based on the knowledge of its sequence and of its flank backbone structures each query loop is assigned to a family and target loop supports are selected in this family. The support backbones of these target loops are then adjusted on flanking structures by partial exploration of the conformational space. Loop closure is performed by energy minimization for each support and the final model is chosen among connected supports based upon energy criteria. The quality of the prediction is evaluated by the root-mean-square deviation (rmsd) between the final model and the native loops when the whole bank is re-attributed on itself with a Jackknife test. This average rmsd ranges from 1.1 Å for three-residue loops to 3.8 Å for eight-residue loops. A few poorly predicted loops are inescapable, considering the high level of diversity in loops and the lack of environment data.
To overcome such modeling problems, a statistical reliability score was assigned for each prediction. This score is correlated to the quality of the prediction, in terms of rmsd, and thus improves the selection accuracy of the model. The algorithm efficiency was compared to CASP3 target loop predictions. Moreover, when tested on a test loop bank, this algorithm was shown to be robust when the loops are not precisely delimited, therefore proving to be a useful tool in practice for protein modeling.
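The rmsd used above to score a predicted loop against its native counterpart can be written down in a few lines. This is a minimal sketch that assumes both loops are already expressed in the same reference frame (for instance, the frame of the flanks) and contain the same atoms in the same order; no superposition step is performed, and the function name is illustrative.

```python
import math


def rmsd(model, native):
    """Root-mean-square deviation between two equally sized coordinate lists.

    model, native: lists of (x, y, z) tuples, atom i of one list matching
    atom i of the other. Both are assumed to share a reference frame.
    """
    if not model or len(model) != len(native):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(model, native))
    return math.sqrt(squared / len(model))
```

Averaging this value over all loops of a given length yields a per-length figure of the kind quoted in the abstract.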
Affiliation(s)
- J Wojcik
- Systèmes Moléculaires et Biologie Structurale Laboratoire de Minéralogie-Cristallographie (LMCP), Universités Paris VI et Paris VII, Cedex 05, Paris, CNRS UMR7590, France
|
42
|
Wojcik J, Girault JA, Labesse G, Chomilier J, Mornon JP, Callebaut I. Sequence analysis identifies a ras-associating (RA)-like domain in the N-termini of band 4.1/JEF domains and in the Grb7/10/14 adapter family. Biochem Biophys Res Commun 1999; 259:113-20. [PMID: 10334925 DOI: 10.1006/bbrc.1999.0727] [Citation(s) in RCA: 39] [Impact Index Per Article: 1.6] [Indexed: 01/02/2023]
Abstract
RA (RalGEF/AF6 or Ras-associating) domains are found in a wide variety of proteins, several of which are known to be Ras-GTP effectors. The three dimensional structure of the RA domain has been experimentally determined in Ral-guanine nucleotide exchange factor (Ral-GEF) and found to be similar to that of the Ras-binding domain of c-Raf1, in spite of a very low level of sequence identity. Using various approaches of sequence analysis, including automatic procedures such as BLAST2, profilescan, and hidden Markov models (HMM), as well as the bidimensional hydrophobic cluster analysis (HCA), here we found that a region with a similar structure is also present at the N-terminus of the band 4.1/JEF domain of KIAA0316 (a human cDNA open reading frame) and H09G03.2 (a related protein sequence predicted from C. elegans genome cloning), as well as in a particular class of adapter proteins including Grb7, Grb10, Grb14, MIG-10, and PRP48. Although the structural conservation of this motif does not necessarily imply a conservation of its ability to bind small GTPases of the Ras superfamily, several proteins with a band 4.1/JEF domain and adapters of the Grb7 group have close functional relationships with such small GTPases. Thus, our finding raises the intriguing possibility of a direct interaction between members of these two groups of proteins and Ras-like GTP-binding proteins.
Affiliation(s)
- J Wojcik
- Systèmes moléculaires & Biologie structurale, LMCP, CNRS UMR 7590, Universités Paris 6 et Paris 7, case 115, 4 place Jussieu, Paris Cedex 05, 75252, France
|
43
|
Abstract
Monte-Carlo simulations of folding of the human protein FKBP are presented. The protein is confined in a simple cubic lattice and only nearest-neighbour interactions are considered. The evolution of protein structure, energy and diameter is followed over time. Starting from different extended conformations, compact globular forms with a hydrophobic core are reached above a critical temperature Tc, while below Tc the protein 'freezes' into high-energy, non-compact states. In the temperature range of folding, all the recorded intermediate states belong to two structural groups, where the process spends most of its time, separated by relatively fast transitions. During folding, the protein is successively composed of three and two compact fragments, whose separation occurs at loop positions. From comparisons performed on a domain of the family sharing 24% identity with FKBP, it appears that the number of fragments, and therefore their location, are sequence dependent.
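The two ingredients named in this abstract, a nearest-neighbour energy on a simple cubic lattice and a Monte-Carlo acceptance rule, can be sketched as follows. The energy model here (a fixed contribution `eps` for each non-bonded contact between hydrophobic residues) and the Metropolis rule are assumptions for illustration; the abstract does not specify the interaction parameters or the move set that was actually used.

```python
import math
import random

# The six nearest-neighbour directions of a simple cubic lattice.
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def contact_energy(chain, hydrophobic, eps=-1.0):
    """Energy from non-bonded nearest-neighbour contacts between hydrophobic residues.

    chain: list of distinct lattice sites (x, y, z), one per residue, in sequence order.
    hydrophobic: list of booleans, True where the residue is hydrophobic.
    eps: hypothetical energy per contact.
    """
    index = {site: i for i, site in enumerate(chain)}
    energy = 0.0
    for i, (x, y, z) in enumerate(chain):
        if not hydrophobic[i]:
            continue
        for dx, dy, dz in NEIGHBOURS:
            j = index.get((x + dx, y + dy, z + dz))
            # j > i + 1 skips chain neighbours and counts each pair once.
            if j is not None and j > i + 1 and hydrophobic[j]:
                energy += eps
    return energy


def metropolis_accept(delta_e, temperature, rng=random):
    """Metropolis rule: always accept downhill moves, uphill with Boltzmann probability."""
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)
```

In a full simulation, a trial move would change a few lattice sites, `delta_e` would be the difference of `contact_energy` after and before the move, and the move would be kept only if `metropolis_accept` returns True.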
Affiliation(s)
- N Papandreou
- Laboratoire de physique des solides, université Pierre-et-Marie-Curie, CNRS ERS, Paris, France
|
44
|
Esposito N, Wojcik J, Chomilier J, Martini JF, Kelly PA, Finidori J, Postel-Vinay MC. The D152H mutation found in growth hormone insensitivity syndrome impairs expression and function of human growth hormone receptor but is silent in rat receptor. J Mol Endocrinol 1998; 21:61-72. [PMID: 9723864 DOI: 10.1677/jme.0.0210061] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.3] [Indexed: 02/08/2023]
Abstract
In two patients with growth hormone (GH) insensitivity syndrome (Laron syndrome), in whom the GH receptor is able to bind the hormone, the D152H mutation was identified, and lack of dimerization was proposed to explain GH resistance in these patients. To examine further the consequences of the substitution of conserved aspartate 152 on the function of the GH receptor (GHR), we reproduced the mutation in vitro on the full length GH receptor cDNA from man and rat. Effects of the mutation on expression and activity of the GHR were analyzed in 293 cells transfected with wild-type and mutant GHR cDNAs. Mutant human receptor protein was expressed at a lower level than wild-type receptor and its activity was reduced: GH-dependent signal transducer and activator of transcription 5 (Stat5)-mediated transactivation of a reporter gene was lower in 293 cells transfected with mutant GHR cDNA than in transfected cells expressing a comparable level of wild-type GHR. The membrane-bound form of the mutant and of the wild-type human GHR were able to homodimerize, as suggested by the size of the complexes detected in cross-linking experiments with 125I-human (h) GH, and also by the activity in the functional test. With the soluble GHR resulting from proteolysis of the wild-type membrane form, no dimeric complexes could be detected. However, when a soluble receptor lacking the transmembrane and cytoplasmic domains of the receptor was expressed, wild-type and not mutant GH binding protein (GHBP) was able to form dimers in the presence of hGH. The amino acid substitution has no effect on either expression or function of the rat receptor. Structural modeling of D152H soluble human and rat GHR (GHBP) supports the species-specific functional consequences of the mutation. 
Evaluation of the functional importance of the mutation strongly suggests that impairment in expression and activity of the mutant receptor, rather than complete lack of dimerization, explains the GH resistance of the patients.
Affiliation(s)
- N Esposito
- Unité 344, Endocrinologie Moléculaire, Institut National de la Santé et de la Recherche Médicale, Faculté de Médecine Necker Enfants Malades, Université Paris VI, CNRS URA09, France
|
45
|
Déret S, Chomilier J, Huang DB, Preud'homme JL, Stevens FJ, Aucouturier P. Molecular modeling of immunoglobulin light chains implicates hydrophobic residues in non-amyloid light chain deposition disease. Protein Eng 1997; 10:1191-7. [PMID: 9488143 DOI: 10.1093/protein/10.10.1191] [Citation(s) in RCA: 44] [Impact Index Per Article: 1.6] [Indexed: 02/06/2023]
Abstract
Light chain deposition disease is a severe complication of certain immunoproliferative disorders, due to the secretion of a monoclonal light chain which precipitates close to basement membranes of several tissues. A kappa isotype restriction and an unusual frequency of a variable region subgroup (VkappaIV) suggest that precise structural features govern the propensity of pathogenic light chains to precipitate in extracellular spaces. We studied primary structures of light chains from six patients with light chain deposition disease in comparison with light chains from other pathological conditions. Sequence alignment revealed the presence of certain amino acids only in light chain deposition disease, in particular non-polar replacing hydrophilic residues. To determine the role of these residues, structures of the variable domain from four kappa chains belonging to VkappaI and VkappaIV subgroups responsible for deposition disease were modeled using known immunoglobulins as templates. The most evident structural features shared by all pathogenic light chains were hydrophobic residues exposed to the solvent in complementarity determining regions 1 or 3. In contrast to immunoglobulin light chain-related amyloidosis, where deposition of organized material might be due to electrostatic interactions between light chain dimers, hydrophobic interactions could enhance amorphous precipitation in non-amyloid light chain deposition disease.
Affiliation(s)
- S Déret
- Laboratoire d'Immunologie et Immunopathologie, CNRS URA 1172, Poitiers, France
|
46
|
Callebaut I, Labesse G, Durand P, Poupon A, Canard L, Chomilier J, Henrissat B, Mornon JP. Deciphering protein sequence information through hydrophobic cluster analysis (HCA): current status and perspectives. Cell Mol Life Sci 1997; 53:621-45. [PMID: 9351466 DOI: 10.1007/s000180050082] [Citation(s) in RCA: 372] [Impact Index Per Article: 13.8] [Indexed: 02/05/2023]
Abstract
Ten years after the idea of hydrophobic cluster analysis (HCA) was conceived and first published, theoretical and practical experience has shown this unconventional method of protein sequence analysis to be particularly efficient and sensitive, especially with families of sequences sharing low levels of sequence identity. This extreme sensitivity has made it possible to predict the functions of genes whose sequence similarities are hardly if at all detectable by current one-dimensional (1D) methods alone, and offers a new way to explore the enormous amount of data generated by genome sequencing. HCA also provides original tools to understand fundamental features of protein stability and folding. Since the last review of HCA published in 1990 [1], significant improvements have been made and several new facets have been addressed. Here we wish to update and summarize this information.
Affiliation(s)
- I Callebaut
- Systèmes Moléculaires et Biologie Structurale, LMCP, CNRS URA 09, UP6/UP7, Paris, France.
|
47
|
Abstract
A bank of loops from three to eight amino acid residues long has been constituted. On the basis of statistical analysis of occurrences of conformations and residues, loops could be divided into two parts: the side residues directly bonded to the secondary structure flanking element, and the inner part. The conformations of the side residues are correlated to the nature of their neighboring flanks, while the inner residues adopt conformations uncorrelated from one residue to the next; thus they are unrelated to the flanks. Two zones in the Ramachandran plot are important: alpha L and beta P. In particular, the high occurrence of alpha L, mainly occupied by glycine residues, is necessary to induce flexibility and thus allow loops to comply with the geometrical constraints of the flanks. An algorithm of clustering has been used to aggregate loops of the same length within families of similar 3D structures. At each position in each cluster, sequence and conformational signatures have been deduced if the occurrence of a residue (or a conformation) is higher than an equiprobable distribution over all clusters. The result is that some positions favor particular amino acids and conformations, which are typical of a cluster although not unique. This is an indication of a relation between structure and sequence in loops. A taxonomy is proposed that classifies the various clusters. It relies on two terms: the mean distance between the first and last C alpha in one cluster and, perpendicular to this line, the distance to the center of gravity of the cluster. It is noteworthy that the differently populated clusters represented in such 2D plots can be separated. Thus, although the conformations of loops in globular proteins could cover a continuum, it has been possible to cluster them into a limited number of well populated families and superfamilies. This basic feature of protein architecture could be further exploited to better predict their geometry.
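The two terms of the proposed taxonomy can be computed for a single loop as sketched below: the distance between the first and last C alpha, and the perpendicular distance from the line joining them to the center of gravity of the loop. Treating the center of gravity as the unweighted mean of the C alpha positions is an assumption, as is the function name; cluster-level values would then be means over the member loops.

```python
import math


def loop_descriptors(ca_coords):
    """Return (end_to_end, height) for one loop.

    end_to_end: distance between the first and last C alpha.
    height: perpendicular distance from the first-to-last line to the
            unweighted centroid of all C alpha positions (an assumption).
    """
    first, last = ca_coords[0], ca_coords[-1]
    n = len(ca_coords)
    centroid = tuple(sum(c[k] for c in ca_coords) / n for k in range(3))
    end_to_end = math.dist(first, last)
    if end_to_end == 0.0:
        raise ValueError("first and last C alpha coincide; the line is undefined")
    axis = tuple(l - f for f, l in zip(first, last))
    rel = tuple(c - f for f, c in zip(first, centroid))
    # |axis x rel| / |axis| is the point-to-line distance.
    cross = (
        axis[1] * rel[2] - axis[2] * rel[1],
        axis[2] * rel[0] - axis[0] * rel[2],
        axis[0] * rel[1] - axis[1] * rel[0],
    )
    height = math.hypot(*cross) / end_to_end
    return end_to_end, height
```

Plotting `height` against `end_to_end` for many loops gives a 2D map of the kind in which the abstract reports that clusters can be separated.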
Affiliation(s)
- J M Kwasigroch
- Systèmes Moléculaires et Biologie Structurale, Laboratoire de Minéralogie Cristallographie, Universités Paris, France
|
48
|
Déret S, Maissiat C, Aucouturier P, Chomilier J. SUBIM: a program for analysing the Kabat database and determining the variability subgroup of a new immunoglobulin sequence. Comput Appl Biosci 1995; 11:435-9. [PMID: 8521053 DOI: 10.1093/bioinformatics/11.4.435] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Indexed: 01/31/2023]
Abstract
Although various programs are available to extract all the information included in protein sequence databases, none is dedicated to immunoglobulins. For this purpose, we designed a program, SUBIM, which is adapted to the Kabat database specialized in immunoglobulin sequences. Besides all the possibilities of any database searching program, SUBIM analyses new sequences of variable regions and determines the variability subgroup they belong to. It also numbers the new sequence according to the system established by Kabat and co-workers for an easier comparison with the other immunoglobulins, thus realizing an automatic alignment with other members of a given type of immunoglobulin chain. This program is largely machine independent and requires very little memory, and should help biochemists concerned with new immunoglobulin sequences.
Affiliation(s)
- S Déret
- Laboratoire d'Immunologie et Immunopathologie, CNRS URA 1172, Poitiers, France
|
49
|
Cantalloube H, Labesse G, Chomilier J, Nahum C, Cho YY, Chams V, Achour A, Lachgar A, Mbika JP, Issing W. Automat and BLAST: comparison of two protein sequence similarity search programs. Comput Appl Biosci 1995; 11:261-72. [PMID: 7583694 DOI: 10.1093/bioinformatics/11.3.261] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.1] [Indexed: 01/26/2023]
Abstract
Since the early 1980s, protein/DNA sequence similarity search has become of major importance to biologists, and the need for fast and efficient tools grows with the size of databanks. Two programs use the strategy of deterministic finite-state automata to accomplish these searches. One of these is BLAST, which is now widely used, and the other is Automat, which has just been published. The differences and similarities in their basic principles, their use and their performance are analysed in this paper in order to allow optimal use of these important software tools.
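To make the idea of a finite-state automaton search concrete, here is a minimal sketch of a generic multi-word matcher in the Aho-Corasick style. It is a textbook technique swapped in for illustration and is not claimed to reproduce the internals of either BLAST or Automat; the function names and the example words are hypothetical.

```python
from collections import deque


def build_automaton(words):
    """Build goto, failure and output tables for a set of words."""
    goto, fail, out = [{}], [0], [[]]
    for word in words:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(word)
    # Breadth-first pass to fill failure links and inherited outputs.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_hits(text, automaton):
    """Scan text once and return (start_position, word) for every match."""
    goto, fail, out = automaton
    state, hits = 0, []
    for pos, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for word in out[state]:
            hits.append((pos - len(word) + 1, word))
    return hits
```

The point of the construction is that the sequence is read once, one character at a time, regardless of how many words are being searched for.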
Affiliation(s)
- H Cantalloube
- Laboratoire de Physiologie Cellulaire, Université P. et M. Curie, Paris, France
|
50
|
Labesse G, Vidal-Cros A, Chomilier J, Gaudry M, Mornon JP. Structural comparisons lead to the definition of a new superfamily of NAD(P)(H)-accepting oxidoreductases: the single-domain reductases/epimerases/dehydrogenases (the 'RED' family). Biochem J 1994; 304 (Pt 1):95-9. [PMID: 7998963 PMCID: PMC1137457 DOI: 10.1042/bj3040095] [Citation(s) in RCA: 63] [Impact Index Per Article: 2.1] [Indexed: 01/28/2023]
Abstract
Using both primary- and tertiary-structure comparisons, we have established new structural similarities shared by reductases, epimerases and dehydrogenases not previously known to be related. Despite the low sequence identity (down to 10%), short consensus segments are identified. We show that the sequence, the active site and the supersecondary structure are well conserved in these proteins. New homologues (the protochlorophyllide reductases) are detected, and we define a new superfamily composed of single-domain dinucleotide-binding enzymes. Rules for the cofactor-binding specificity are deduced from our sequence alignment. The involvement of some amino acids in catalysis is discussed. Comparison with two-domain dehydrogenases allows us to distinguish two general mechanisms of divergent evolution.
Affiliation(s)
- G Labesse
- Laboratoire de Minéralogie-Cristallographie, Universités PVI et PVII, C.N.R.S. U.R.A. 09, Paris, France
|