1. Bruley A, Bitard-Feildel T, Callebaut I, Duprat E. A sequence-based foldability score combined with AlphaFold2 predictions to disentangle the protein order/disorder continuum. Proteins 2023; 91:466-484. [PMID: 36306150 DOI: 10.1002/prot.26441] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 08/01/2022] [Revised: 10/14/2022] [Accepted: 10/18/2022] [Indexed: 11/11/2022]
Abstract
Order and disorder govern protein functions, but disorder is highly diverse, ranging from regions that are, and stay, fully disordered to regions showing conditional order. This diversity is still difficult to decipher, even though it is encoded in the amino acid sequences. Here, we developed an analytic Python package, named pyHCA, to estimate the foldability of a protein segment from its amino acid sequence alone, based on a measure of its density in regular secondary structures associated with hydrophobic clusters, as defined by the hydrophobic cluster analysis (HCA) approach. The tool was designed by optimizing the separation between foldable segments from databases of disorder (DisProt) and order (SCOPe [soluble domains] and OPM [transmembrane domains]). It makes it possible to specify the ratio between order, embodied by regular secondary structures (either participating in the hydrophobic core of well-folded 3D structures or conditionally formed in intrinsically disordered regions), and disorder. We illustrated the relevance of pyHCA with several examples and applied it to the sequences of the proteomes of 21 species ranging from prokaryotes and archaea to unicellular and multicellular eukaryotes, for which structure models are provided in the AlphaFold protein structure database. Cases of low-confidence scores related to disorder were distinguished from those of sequences that we identified as foldable but that are still excluded from accurate modeling by AlphaFold2 due to a lack of sequence homologs or to compositional biases. Overall, our approach is complementary to AlphaFold2, providing guides to map structural innovations through evolutionary processes, at proteome and gene scales.
Affiliation(s)
- Apolline Bruley
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, Paris, France
- Tristan Bitard-Feildel
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, Paris, France
- Isabelle Callebaut
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, Paris, France
- Elodie Duprat
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, Paris, France
2. Lamiable A, Bitard-Feildel T, Rebehmed J, Quintus F, Schoentgen F, Mornon JP, Callebaut I. A topology-based investigation of protein interaction sites using Hydrophobic Cluster Analysis. Biochimie 2019; 167:68-80. [PMID: 31525399 DOI: 10.1016/j.biochi.2019.09.009] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 04/08/2019] [Accepted: 09/11/2019] [Indexed: 01/20/2023]
Abstract
Hydrophobic clusters, as defined by Hydrophobic Cluster Analysis (HCA), are conditioned binary patterns, made of hydrophobic and non-hydrophobic positions, whose limits fit well those of regular secondary structures. They have proved useful for predicting secondary structures in proteins from a single amino acid sequence alone and have made it possible to assess, in a comprehensive way, the leading role of binary patterns in secondary structure preference towards a particular state. Here, we considered the available experimental 3D structures of protein globular domains to enlarge our previously reported hydrophobic cluster database (HCDB), almost doubling the number of hydrophobic cluster species (each species being defined by a unique binary pattern) that represent the most frequent structural bricks encountered within protein globular domains. We then used this updated HCDB to show that the hydrophobic amino acids of discordant clusters, i.e. those less abundant clusters for which the observed secondary structure disagrees with the binary pattern preference of the species to which they belong, are more exposed to solvent and more involved in protein interfaces than the hydrophobic amino acids of concordant clusters. As amino acid composition differs between concordant and discordant clusters, considering binary patterns may be used to gain novel insights into key features of protein globular domain cores and surfaces. It can also provide useful information on possible conformational plasticity, including disorder-to-order transitions.
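To make the idea of a binary pattern concrete, the following is a minimal Python sketch, not the HCDB or pyHCA implementation. It assumes the seven strong hydrophobic residues commonly used in HCA (V, I, L, F, M, Y, W) and a simplified separation rule in which two hydrophobic positions fall into different clusters when at least four non-hydrophobic positions lie between them. The function names and the `min_gap` parameter are hypothetical, and refinements such as treating proline as a cluster breaker are deliberately left out.

```python
# Assumption: the strong hydrophobic alphabet commonly used in HCA.
HYDROPHOBIC = set("VILFMYW")


def binary_pattern(sequence):
    """Encode each residue as '1' (hydrophobic) or '0' (any other residue)."""
    return "".join("1" if aa in HYDROPHOBIC else "0" for aa in sequence.upper())


def split_clusters(sequence, min_gap=4):
    """Return simplified clusters as (start_index, binary_pattern) tuples.

    Each cluster starts and ends on a hydrophobic position. A new cluster
    begins when at least `min_gap` non-hydrophobic positions separate two
    consecutive hydrophobic positions (simplifying assumption; proline
    breakers are ignored here).
    """
    pattern = binary_pattern(sequence)
    clusters = []
    start = None      # index of the first '1' of the current cluster
    last_one = None   # index of the most recent '1' seen
    for i, bit in enumerate(pattern):
        if bit != "1":
            continue
        if start is None:
            start = i
        elif i - last_one - 1 >= min_gap:
            # Gap is long enough: close the current cluster, open a new one.
            clusters.append((start, pattern[start:last_one + 1]))
            start = i
        last_one = i
    if start is not None:
        clusters.append((start, pattern[start:last_one + 1]))
    return clusters


if __name__ == "__main__":
    print(binary_pattern("AVLKE"))
    print(split_clusters("VAALKKKKKIL"))
```

In this sketch, each distinct pattern string returned by `split_clusters` plays the role of what the abstract calls a cluster species, since a species is defined by a unique binary pattern.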
Affiliation(s)
- Alexis Lamiable
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, 75005, Paris, France
- Tristan Bitard-Feildel
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, 75005, Paris, France
- Joseph Rebehmed
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, 75005, Paris, France
  - Lebanese American University, Department of Computer Science and Mathematics, Beirut, Lebanon
- Flavien Quintus
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, 75005, Paris, France
- Françoise Schoentgen
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, 75005, Paris, France
- Jean-Paul Mornon
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, 75005, Paris, France
- Isabelle Callebaut
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, 75005, Paris, France
3. Sequence characteristics responsible for protein‐protein interactions in the intrinsically disordered regions of caseins, amelogenins, and small heat‐shock proteins. Biopolymers 2019; 110:e23319. [DOI: 10.1002/bip.23319] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 03/31/2019] [Revised: 06/11/2019] [Accepted: 06/19/2019] [Indexed: 01/01/2023]
4. Bitard‐Feildel T, Lamiable A, Mornon J, Callebaut I. Order in Disorder as Observed by the "Hydrophobic Cluster Analysis" of Protein Sequences. Proteomics 2018; 18:e1800054. [PMID: 30299594 PMCID: PMC7168002 DOI: 10.1002/pmic.201800054] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 05/31/2018] [Revised: 08/29/2018] [Indexed: 12/17/2022]
Abstract
Hydrophobic cluster analysis (HCA) is an original approach for protein sequence analysis, which provides access to the foldable repertoire of the protein universe, including as yet unannotated protein segments ("dark proteome"). Foldable segments correspond to ordered regions, as well as to intrinsically disordered regions (IDRs) undergoing disorder-to-order transitions. This review illustrates how HCA can be used to give insight into this last category of foldable segments, with examples matching known 3D structures. After reviewing the HCA principles, examples of short foldable segments are given, which often contain short linear motifs, typically matching hydrophobic clusters. These segments become ordered upon contact with partners, with secondary structure preferences generally corresponding to those observed in the 3D structures within the complexes. Such small foldable segments are sometimes larger than the segments of known 3D structures, including flanking hydrophobic clusters that may be critical for interaction specificity or regulation, as well as intervening sequences allowing fuzziness. Cases of larger conditionally disordered domains are also presented, with lower density in hydrophobic clusters than well-folded globular domains or with exposed hydrophobic patches, which are stabilized by interaction with partners.
Affiliation(s)
- Tristan Bitard‐Feildel
  - Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie (IMPMC), Institut de recherche pour le développement (IRD), UMR CNRS 7590, Muséum National d'Histoire Naturelle, Sorbonne Université, 75005 Paris, France
  - Laboratoire de Biologie Computationnelle et Quantitative (LCQB), Institute of Biology Paris‐Seine (IBPS), Centre national de la recherche scientifique (CNRS), Sorbonne Université, 75005 Paris, France
- Alexis Lamiable
  - Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie (IMPMC), Institut de recherche pour le développement (IRD), UMR CNRS 7590, Muséum National d'Histoire Naturelle, Sorbonne Université, 75005 Paris, France
- Jean‐Paul Mornon
  - Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie (IMPMC), Institut de recherche pour le développement (IRD), UMR CNRS 7590, Muséum National d'Histoire Naturelle, Sorbonne Université, 75005 Paris, France
- Isabelle Callebaut
  - Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie (IMPMC), Institut de recherche pour le développement (IRD), UMR CNRS 7590, Muséum National d'Histoire Naturelle, Sorbonne Université, 75005 Paris, France
5
|
|
6. Hoffmann B, Elbahnsi A, Lehn P, Décout JL, Pietrucci F, Mornon JP, Callebaut I. Combining theoretical and experimental data to decipher CFTR 3D structures and functions. Cell Mol Life Sci 2018; 75:3829-3855. [PMID: 29779042 PMCID: PMC11105360 DOI: 10.1007/s00018-018-2835-7] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 10/31/2017] [Revised: 05/04/2018] [Accepted: 05/07/2018] [Indexed: 12/15/2022]
Abstract
Cryo-electron microscopy (cryo-EM) has recently provided invaluable experimental data about the full-length cystic fibrosis transmembrane conductance regulator (CFTR) 3D structure. However, this experimental information deals with inactive states of the channel, either in an apo, quiescent conformation, in which nucleotide-binding domains (NBDs) are widely separated or in an ATP-bound, yet closed conformation. Here, we show that 3D structure models of the open and closed forms of the channel, now further supported by metadynamics simulations and by comparison with the cryo-EM data, could be used to gain some insights into critical features of the conformational transition toward active CFTR forms. These critical elements lie within membrane-spanning domains but also within NBD1 and the N-terminal extension, in which conformational plasticity is predicted to occur to help the interaction with filamin, one of the CFTR cellular partners.
Affiliation(s)
- Brice Hoffmann
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, IRD, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, 75005, Paris, France
  - Iktos, Paris, France
- Ahmad Elbahnsi
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, IRD, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, 75005, Paris, France
- Pierre Lehn
  - INSERM U1078, SFR ScInBioS, Université de Bretagne Occidentale, Brest, France
- Fabio Pietrucci
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, IRD, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, 75005, Paris, France
- Jean-Paul Mornon
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, IRD, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, 75005, Paris, France
- Isabelle Callebaut
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, IRD, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, 75005, Paris, France
7. The Impact of tagSNPs in CXCL16 Gene on the Risk of Myocardial Infarction in a Chinese Han Population. Disease Markers 2017; 2017:9463272. [PMID: 28286356 PMCID: PMC5329692 DOI: 10.1155/2017/9463272] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 09/13/2016] [Revised: 12/25/2016] [Accepted: 01/22/2017] [Indexed: 02/06/2023]
Abstract
CXCL16 has been demonstrated to be involved in the development of atherosclerosis and myocardial infarction (MI). Nonetheless, the role of CXCL16 polymorphisms in MI pathogenesis is far from being elucidated. We genotyped four tagSNPs in the CXCL16 gene (rs2304973, rs1050998, rs3744700, and rs8123) in 275 MI patients and 670 control subjects to probe the impact of CXCL16 polymorphisms on individual susceptibility to MI. Multivariate logistic regression analysis showed that the C allele (OR = 1.31, 95% CI = 1.03–1.66, P = 0.029) and the CC genotype (OR = 1.84, 95% CI = 1.11–3.06, P = 0.018) of rs1050998 were associated with increased MI risk, and that the C allele of rs8123 (OR = 0.77, 95% CI = 0.60–0.98, P = 0.036) was associated with decreased MI risk, while the other two tagSNPs had no significant effect. Consistently, the haplotype rs2304973T-rs1050998C-rs3744700G-rs8123A, containing the C allele of rs1050998 and the A allele of rs8123, was associated with elevated MI risk (OR = 1.41, 95% CI = 1.02–1.96, P = 0.037). Further stratified analysis revealed a more apparent association with MI risk among younger subjects (≤60 years old). Taken together, our results provide the first evidence that CXCL16 polymorphisms significantly affect MI risk in Chinese subjects.
8. Kavianpour H, Vasighi M. Structural classification of proteins using texture descriptors extracted from the cellular automata image. Amino Acids 2016; 49:261-271. [DOI: 10.1007/s00726-016-2354-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 08/17/2016] [Accepted: 10/18/2016] [Indexed: 12/12/2022]