1
Mier P, Andrade-Navarro MA, Morett E. Homorepeat variability within the human population. NAR Genom Bioinform 2024; 6:lqae053. [PMID: 38774515 PMCID: PMC11106027 DOI: 10.1093/nargab/lqae053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2024] [Revised: 04/12/2024] [Accepted: 05/08/2024] [Indexed: 05/24/2024] Open
Abstract
Genetic variation within populations plays a crucial role in driving evolution. Unlike the average protein sequence, the evolution of homorepeats can be influenced by DNA replication slippage, when DNA polymerases either add or skip repeats of nucleotides. While there are some diseases known to be caused by abnormal changes in the length of amino acid homorepeats, naturally occurring variations in homorepeat length remain relatively unexplored. In our study, we examined the variation in amino acid homorepeat length of human individuals by analyzing 125 748 exomes, as well as 15 708 whole genomes. Our analyses revealed significant variability in homorepeat length across the human population, indicating that these motifs are prone to mutations at higher rates than non-repeat sequences. We focused our study on glutamine homorepeats, also known as polyQ sequences, and found that shorter polyQ sequences tend to exhibit greater length variation, while longer ones primarily undergo deletions. Notably, polyQ sequences that are more conserved across primates tend to show less variation within the human population, indicating stronger selective pressure to maintain their length. Overall, our results demonstrate that there is large natural variation in the length of homorepeats within the human population, with no apparent impact on observable traits.
Affiliation(s)
- Pablo Mier
  - Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University Mainz, Hanns-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
- Miguel A Andrade-Navarro
  - Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University Mainz, Hanns-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
- Enrique Morett
  - Departamento de Ingeniería Celular y Biocatálisis, Instituto de Biotecnología, Universidad Nacional Autónoma de México (UNAM), Av. Universidad 2001, Cuernavaca, Morelos 62210, Mexico
2
Antón R, Treviño MÁ, Pantoja-Uceda D, Félix S, Babu M, Cabrita EJ, Zweckstetter M, Tinnefeld P, Vera AM, Oroz J. Alternative low-populated conformations prompt phase transitions in polyalanine repeat expansions. Nat Commun 2024; 15:1925. [PMID: 38431667 PMCID: PMC10908835 DOI: 10.1038/s41467-024-46236-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2023] [Accepted: 02/19/2024] [Indexed: 03/05/2024] Open
Abstract
Abnormal trinucleotide repeat expansions alter protein conformation causing malfunction and contribute to a significant number of incurable human diseases. Scarce structural insights available on disease-related homorepeat expansions hinder the design of effective therapeutics. Here, we present the dynamic structure of human PHOX2B C-terminal fragment, which contains the longest polyalanine segment known in mammals. The major α-helical conformation of the polyalanine tract is solely extended by polyalanine expansions in PHOX2B, which are responsible for most congenital central hypoventilation syndrome cases. However, polyalanine expansions in PHOX2B additionally promote nascent homorepeat conformations that trigger length-dependent phase transitions into solid condensates that capture wild-type PHOX2B. Remarkably, HSP70 and HSP90 chaperones specifically seize PHOX2B alternative conformations preventing phase transitions. The precise observation of emerging polymorphs in expanded PHOX2B postulates unbalanced phase transitions as distinct pathophysiological mechanisms in homorepeat expansion diseases, paving the way towards the search of therapeutics modulating biomolecular condensates in central hypoventilation syndrome.
Affiliation(s)
- Rosa Antón
  - Instituto de Química Física Blas Cabrera (IQF), CSIC, E-28006, Madrid, Spain
- Miguel Á Treviño
  - Instituto de Química Física Blas Cabrera (IQF), CSIC, E-28006, Madrid, Spain
- David Pantoja-Uceda
  - Instituto de Química Física Blas Cabrera (IQF), CSIC, E-28006, Madrid, Spain
- Sara Félix
  - Associate Laboratory i4HB - Institute for Health and Bioeconomy, NOVA School of Science and Technology, Universidade NOVA de Lisboa, 2819-516, Caparica, Portugal
  - UCIBIO, Department of Chemistry, NOVA School of Science and Technology, Universidade NOVA de Lisboa, 2819-516, Caparica, Portugal
- María Babu
  - German Center for Neurodegenerative Diseases (DZNE), 37075, Göttingen, Germany
- Eurico J Cabrita
  - Associate Laboratory i4HB - Institute for Health and Bioeconomy, NOVA School of Science and Technology, Universidade NOVA de Lisboa, 2819-516, Caparica, Portugal
  - UCIBIO, Department of Chemistry, NOVA School of Science and Technology, Universidade NOVA de Lisboa, 2819-516, Caparica, Portugal
- Markus Zweckstetter
  - German Center for Neurodegenerative Diseases (DZNE), 37075, Göttingen, Germany
  - Department for NMR-based Structural Biology, Max Planck Institute for Multidisciplinary Sciences, 37077, Göttingen, Germany
- Philip Tinnefeld
  - Department of Chemistry and Center for NanoScience, Ludwig-Maximilians-Universität München, München, 81377, Germany
- Andrés M Vera
  - Department of Chemistry and Center for NanoScience, Ludwig-Maximilians-Universität München, München, 81377, Germany
- Javier Oroz
  - Instituto de Química Física Blas Cabrera (IQF), CSIC, E-28006, Madrid, Spain.
3
Luo Z, Wang R, Sun Y, Liu J, Chen Z, Zhang YJ. Interpretable feature extraction and dimensionality reduction in ESM2 for protein localization prediction. Brief Bioinform 2024; 25:bbad534. [PMID: 38279650 PMCID: PMC10818170 DOI: 10.1093/bib/bbad534] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Revised: 11/19/2023] [Accepted: 12/15/2024] [Indexed: 01/28/2024] Open
Abstract
As the application of large language models (LLMs) has broadened into the realm of biological predictions, leveraging their capacity for self-supervised learning to create feature representations of amino acid sequences, these models have set a new benchmark in tackling downstream challenges, such as subcellular localization. However, previous studies have primarily focused on either the structural design of models or differing strategies for fine-tuning, largely overlooking investigations into the nature of the features derived from LLMs. In this research, we propose different ESM2 representation extraction strategies, considering both the character type and position within the ESM2 input sequence. Using model dimensionality reduction, predictive analysis and interpretability techniques, we have illuminated potential associations between diverse feature types and specific subcellular localizations. In particular, predictions of Mitochondrion and Golgi apparatus localization favor segment features closer to the N-terminus, and phosphorylation site-based features could mirror phosphorylation properties. We also evaluate the prediction performance and interpretability robustness of Random Forest and Deep Neural Networks with varied feature inputs. This work offers novel insights into maximizing LLMs' utility, understanding their mechanisms, and extracting biological domain knowledge. Furthermore, we have made the code, feature extraction API, and all relevant materials available at https://github.com/yujuan-zhang/feature-representation-for-LLMs.
Affiliation(s)
- Zeyu Luo
  - Chongqing Key Laboratory of Vector Insects, Chongqing Key Laboratory of Animal Biology, College of Life Science, Chongqing Normal University, Chongqing 401331, China
- Rui Wang
  - Chongqing Key Laboratory of Vector Insects, Chongqing Key Laboratory of Animal Biology, College of Life Science, Chongqing Normal University, Chongqing 401331, China
- Yawen Sun
  - Chongqing Key Laboratory of Vector Insects, Chongqing Key Laboratory of Animal Biology, College of Life Science, Chongqing Normal University, Chongqing 401331, China
- Junhao Liu
  - Chongqing Key Laboratory of Vector Insects, Chongqing Key Laboratory of Animal Biology, College of Life Science, Chongqing Normal University, Chongqing 401331, China
- Zongqing Chen
  - School of Mathematical Sciences, Chongqing Normal University, Chongqing 400047, China
- Yu-Juan Zhang
  - Chongqing Key Laboratory of Vector Insects, Chongqing Key Laboratory of Animal Biology, College of Life Science, Chongqing Normal University, Chongqing 401331, China
4
Monzon AM, Arrías PN, Elofsson A, Mier P, Andrade-Navarro MA, Bevilacqua M, Clementel D, Bateman A, Hirsh L, Fornasari MS, Parisi G, Piovesan D, Kajava AV, Tosatto SCE. A STRP-ed definition of Structured Tandem Repeats in Proteins. J Struct Biol 2023; 215:108023. [PMID: 37652396 DOI: 10.1016/j.jsb.2023.108023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2023] [Revised: 07/31/2023] [Accepted: 08/28/2023] [Indexed: 09/02/2023]
Abstract
Tandem Repeat Proteins (TRPs) are a class of proteins with repetitive amino acid sequences that have been studied extensively for over two decades. Different features at the level of sequence, structure, function and evolution have been attributed to them by various authors. And yet many of their salient features appear only when looking at specific subclasses of protein tandem repeats. Here, we attempt to rationalize the existing knowledge on TRPs by pointing out several dichotomies. The emerging picture is more nuanced than generally assumed and allows us to draw some boundaries of what is not a "proper" TRP. We conclude with an operational definition of a specific subset, which we have denominated STRPs (Structural Tandem Repeat Proteins), which separates a subclass of tandem repeats with distinctive features from several other less well-defined types of repeats. We believe that this definition will help researchers in the field to better characterize the biological meaning of this large yet largely understudied group of proteins.
Affiliation(s)
- Alexander Miguel Monzon
  - Dept. of Information Engineering, University of Padova, via Giovanni Gradenigo 6/B, 35131 Padova, Italy
- Paula Nazarena Arrías
  - Dept. of Biomedical Sciences, University of Padova, via U. Bassi 58/b, 35121 Padova, Italy
- Arne Elofsson
  - Dept. of Biochemistry and Biophysics and Science for Life Laboratory, Stockholm University, Tomtebodavägen 23, 171 21 Solna, Sweden
- Pablo Mier
  - Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University of Mainz, Hanns-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
- Miguel A Andrade-Navarro
  - Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University of Mainz, Hanns-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
- Martina Bevilacqua
  - Dept. of Biomedical Sciences, University of Padova, via U. Bassi 58/b, 35121 Padova, Italy
- Damiano Clementel
  - Dept. of Biomedical Sciences, University of Padova, via U. Bassi 58/b, 35121 Padova, Italy
- Alex Bateman
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Layla Hirsh
  - Dept. of Engineering, Faculty of Science and Engineering, Pontifical Catholic University of Peru, Av. Universitaria 1801 San Miguel, Lima 32, Lima, Peru
- Maria Silvina Fornasari
  - Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, CONICET, Bernal, Buenos Aires, Argentina
- Gustavo Parisi
  - Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, CONICET, Bernal, Buenos Aires, Argentina
- Damiano Piovesan
  - Dept. of Biomedical Sciences, University of Padova, via U. Bassi 58/b, 35121 Padova, Italy
- Andrey V Kajava
  - Centre de Recherche en Biologie cellulaire de Montpellier (CRBM), UMR 5237 CNRS, Université Montpellier, 1919 Route de Mende, Cedex 5, 34293 Montpellier, France
- Silvio C E Tosatto
  - Dept. of Biomedical Sciences, University of Padova, via U. Bassi 58/b, 35121 Padova, Italy.
5
Elena-Real CA, Mier P, Sibille N, Andrade-Navarro MA, Bernadó P. Structure-function relationships in protein homorepeats. Curr Opin Struct Biol 2023; 83:102726. [PMID: 37924569 DOI: 10.1016/j.sbi.2023.102726] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2023] [Revised: 10/06/2023] [Accepted: 10/09/2023] [Indexed: 11/06/2023]
Abstract
Homorepeats (or polyX), protein segments containing repetitions of the same amino acid, are abundant in proteomes from all kingdoms of life and are involved in crucial biological functions as well as several neurodegenerative and developmental diseases. Mainly inserted in disordered segments of proteins, the structure/function relationships of homorepeats remain largely unexplored. In this review, we summarize present knowledge for the most abundant homorepeats, highlighting the role of the inherent structure and the conformational influence exerted by their flanking regions. Recent experimental and computational methods enable residue-specific investigations of these regions and promise novel structural and dynamic information for this elusive group of proteins. This information should increase our knowledge about the structural bases of phenomena such as liquid-liquid phase separation and trinucleotide repeat disorders.
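The definition of a homorepeat given in this abstract (a run of the same amino acid) can be illustrated with a minimal sketch. The function name `find_homorepeats` and the `min_length` threshold of 4 are assumptions made for illustration; the abstract does not state a minimum length, and published studies use different cut-offs.

```python
import re


def find_homorepeats(sequence: str, min_length: int = 4):
    """Return (residue, start_index, run_length) for each pure homorepeat.

    min_length is a hypothetical threshold, not a value taken from the review.
    """
    if min_length < 1:
        raise ValueError("min_length must be at least 1")
    # One captured residue followed by at least (min_length - 1) copies of it.
    pattern = re.compile(r"([A-Z])\1{%d,}" % (min_length - 1))
    return [(m.group(1), m.start(), len(m.group(0)))
            for m in pattern.finditer(sequence)]


if __name__ == "__main__":
    # Toy sequence containing a polyQ run and a polyA run.
    print(find_homorepeats("MAQQQQQQLLSAAAAK"))
```

This sketch only detects perfect runs; tolerating interruptions inside a repeat would need a windowed approach instead of a regular expression.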
Affiliation(s)
- Carlos A Elena-Real
  - Centre de Biologie Structurale (CBS), Université de Montpellier, INSERM, CNRS. 29 rue de Navacelles, 34090 Montpellier, France. https://twitter.com/carloselenareal
- Pablo Mier
  - Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University Mainz. Hanns-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
- Nathalie Sibille
  - Centre de Biologie Structurale (CBS), Université de Montpellier, INSERM, CNRS. 29 rue de Navacelles, 34090 Montpellier, France
- Miguel A Andrade-Navarro
  - Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University Mainz. Hanns-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
- Pau Bernadó
  - Centre de Biologie Structurale (CBS), Université de Montpellier, INSERM, CNRS. 29 rue de Navacelles, 34090 Montpellier, France.
6
Yang J, Cheng WX, Wu G, Sheng S, Zhang P. Prediction of folding patterns for intrinsic disordered protein. Sci Rep 2023; 13:20343. [PMID: 37990040 PMCID: PMC10663623 DOI: 10.1038/s41598-023-45969-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Accepted: 10/26/2023] [Indexed: 11/23/2023] Open
Abstract
The conformational flexibility of natural proteins makes the relationship between structure and function both complex and difficult to understand. Prediction of intrinsically disordered proteins focuses primarily on identifying regions with structural flexibility that are involved in relevant biological functions and various diseases. The order of amino acids in a protein sequence determines its possible conformations, folding flexibility and biological function. Although many methods provide information on intrinsically disordered proteins (IDPs), their results are mainly limited to locating the disordered regions, without knowledge of the possible folding conformations. Here, the protein folding fingerprint approach uses the protein folding variation matrix (PFVM) to reveal all possible folding patterns of an intrinsically disordered protein along its sequence. The PFVM jointly shows the disordered regions, the degree of disorder and the folding patterns of an intrinsically disordered protein. The PFVM not only provides rich information on IDPs but may also advance the study of the protein folding problem.
Affiliation(s)
- Jiaan Yang
  - Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, Guangdong, China.
  - Micro Biotech, Ltd., Shanghai, 200123, China.
- Wen-Xiang Cheng
  - Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, Guangdong, China
- Gang Wu
  - School of Basic Medicine, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, China
- Sitong Sheng
  - HYK High-throughput Biotechnology Institute, Shenzhen, 518057, Guangdong, China
- Peng Zhang
  - Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, Guangdong, China
7
Mier P, Andrade-Navarro MA. The nucleotide landscape of polyXY regions. Comput Struct Biotechnol J 2023; 21:5408-5412. [PMID: 38022702 PMCID: PMC10652141 DOI: 10.1016/j.csbj.2023.10.054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2023] [Revised: 10/30/2023] [Accepted: 10/30/2023] [Indexed: 12/01/2023] Open
Abstract
PolyXY regions are compositionally biased regions composed of two different amino acids. They are classified according to the arrangement of the two amino acid types 'X' and 'Y' into direpeats (composed of alternating amino acids, e.g. 'XYXYXY'), joined (composed of two consecutive stretches of each amino acid, e.g. 'XXXYYY') and shuffled (other arrangements, e.g., 'XYXXYY'). They have been characterized at the amino acid level in all domains of life, and are described as often found within intrinsically disordered regions. Since DNA replication slippage has been proposed as a driver of repeat variation, and given that some polyXY have a repetitive nature, we hypothesized that characterizing the nucleotide coding of various types of polyXY could give hints about their origin and evolution. To test this, we obtained all polyXY regions in the human transcriptome, categorized them, and studied their coding nucleotide sequences. We observed that polyXY exacerbates the codon biases, and that the similarity between the X and Y codons is higher than in the background proteome. Our results support a general mechanism of emergence and evolution of polyXY from single-codon polyX. PolyXY are revealed as hotspots for replication slippage, particularly those composed of repeats: joined and direpeat polyXY. Inter-conversion to shuffled polyXY disrupts nucleotide repeats and restricts further evolution by replication slippage, a mechanism that we previously observed in polyX. Our results shed light on polyXY composition and should simplify the determination of their functions.
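The three arrangement classes named in this abstract (direpeat, joined, shuffled) can be told apart by counting how often the residue changes along the region. The following is a minimal sketch under that assumption; the function name `classify_polyxy` is hypothetical, and the authors' actual pipeline (for example window sizes or minimum lengths) is not described here and may differ.

```python
def classify_polyxy(region: str) -> str:
    """Classify a two-residue region as 'direpeat', 'joined' or 'shuffled'.

    Illustrative only: assumes the region is already delimited and consists
    of exactly two amino acid types.
    """
    if len(set(region)) != 2:
        raise ValueError("expected a region made of exactly two amino acid types")
    # Count positions where the residue differs from the previous one.
    changes = sum(1 for a, b in zip(region, region[1:]) if a != b)
    if changes == 1:
        # A single switch: one stretch of X followed by one stretch of Y.
        return "joined"
    if changes == len(region) - 1:
        # Every neighbouring pair differs: strict alternation.
        return "direpeat"
    return "shuffled"


if __name__ == "__main__":
    # The three example patterns quoted in the abstract.
    for example in ("XYXYXY", "XXXYYY", "XYXXYY"):
        print(example, classify_polyxy(example))
```

A two-residue region such as 'XY' satisfies both the joined and the direpeat condition; this sketch reports it as joined, which is an arbitrary choice.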
Affiliation(s)
- Pablo Mier
  - Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University Mainz, Hanns-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
- Miguel A. Andrade-Navarro
  - Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University Mainz, Hanns-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
8
Vaglietti S, Villeri V, Dell’Oca M, Marchetti C, Cesano F, Rizzo F, Miller D, LaPierre L, Pelassa I, Monje FJ, Colnaghi L, Ghirardi M, Fiumara F. PolyQ length-based molecular encoding of vocalization frequency in FOXP2. iScience 2023; 26:108036. [PMID: 37860754 PMCID: PMC10582585 DOI: 10.1016/j.isci.2023.108036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2023] [Revised: 07/18/2023] [Accepted: 09/21/2023] [Indexed: 10/21/2023] Open
Abstract
The transcription factor FOXP2, a regulator of vocalization- and speech/language-related phenotypes, contains two long polyQ repeats (Q1 and Q2) displaying marked, still enigmatic length variation across mammals. We found that the Q1/Q2 length ratio quantitatively encodes vocalization frequency ranges, from the infrasonic to the ultrasonic, displaying striking convergent evolution patterns. Thus, species emitting ultrasonic vocalizations converge with bats in having a low ratio, whereas species vocalizing in the low-frequency/infrasonic range converge with elephants and whales, which have higher ratios. Similar, taxon-specific patterns were observed for the FOXP2-related protein FOXP1. At the molecular level, we observed that the FOXP2 polyQ tracts form coiled coils, assembling into condensates and fibrils, and drive liquid-liquid phase separation (LLPS). By integrating evolutionary and molecular analyses, we found that polyQ length variation related to vocalization frequency impacts FOXP2 structure, LLPS, and transcriptional activity, thus defining a novel form of polyQ length-based molecular encoding of vocalization frequency.
Affiliation(s)
- Serena Vaglietti
  - Rita Levi Montalcini Department of Neuroscience, University of Turin, 10125 Turin, Italy
- Veronica Villeri
  - Rita Levi Montalcini Department of Neuroscience, University of Turin, 10125 Turin, Italy
- Marco Dell’Oca
  - Rita Levi Montalcini Department of Neuroscience, University of Turin, 10125 Turin, Italy
- Chiara Marchetti
  - Rita Levi Montalcini Department of Neuroscience, University of Turin, 10125 Turin, Italy
- Federico Cesano
  - Department of Chemistry, University of Turin, 10125 Turin, Italy
- Francesca Rizzo
  - Jockey Club College of Veterinary Medicine and Life Sciences, City University of Hong Kong, Kowloon Tong, Hong Kong SAR 518057, China
- Dave Miller
  - Cascades Pika Watch, Oregon Zoo, Portland, OR 97221, USA
- Louis LaPierre
  - Department of Natural Science, Lower Columbia College, Longview, WA 98632, USA
- Ilaria Pelassa
  - Rita Levi Montalcini Department of Neuroscience, University of Turin, 10125 Turin, Italy
- Francisco J. Monje
  - Department of Neurophysiology and Neuropharmacology, Medical University of Vienna, 1090 Vienna, Austria
- Luca Colnaghi
  - Division of Neuroscience, IRCCS San Raffaele Scientific Institute, 20132 Milan, Italy
  - School of Medicine, Vita-Salute San Raffaele University, 20132 Milan, Italy
- Mirella Ghirardi
  - Rita Levi Montalcini Department of Neuroscience, University of Turin, 10125 Turin, Italy
- Ferdinando Fiumara
  - Rita Levi Montalcini Department of Neuroscience, University of Turin, 10125 Turin, Italy
9
Manso JA, Carabias A, Sárkány Z, de Pereda JM, Pereira PJB, Macedo-Ribeiro S. Pathogen-specific structural features of Candida albicans Ras1 activation complex: uncovering new antifungal drug targets. mBio 2023; 14:e0063823. [PMID: 37526476 PMCID: PMC10470544 DOI: 10.1128/mbio.00638-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2023] [Accepted: 06/16/2023] [Indexed: 08/02/2023] Open
Abstract
An important feature associated with Candida albicans pathogenicity is its ability to switch between yeast and hyphal forms, a process in which CaRas1 plays a key role. CaRas1 is activated by the guanine nucleotide exchange factor (GEF) CaCdc25, triggering hyphal growth-related signaling pathways through its conserved GTP-binding (G)-domain. An important function in hyphal growth has also been proposed for the long hypervariable region downstream the G-domain, whose unusual content of polyglutamine stretches and Q/N repeats make CaRas1 unique within Ras proteins. Despite its biological importance, both the structure of CaRas1 and the molecular basis of its activation by CaCdc25 remain unexplored. Here, we show that CaRas1 has an elongated shape and limited conformational flexibility and that its hypervariable region contains helical structural elements, likely forming an intramolecular coiled-coil. Functional assays disclosed that CaRas1-activation by CaCdc25 is highly efficient, with activities up to 2,000-fold higher than reported for human GEFs. The crystal structure of the CaCdc25 catalytic region revealed an active conformation for the α-helical hairpin, critical for CaRas1-activation, unveiling a specific region exclusive to CTG-clade species. Structural studies on CaRas1/CaCdc25 complexes also revealed an interaction surface clearly distinct from that of homologous human complexes. Furthermore, we identified an inhibitory synthetic peptide, prompting the proposal of a key regulatory mechanism for CaCdc25. To our knowledge, this is the first report of specific inhibition of the CaRas1-activation via targeting its GEF. This, together with their unique pathogen-structural features, disclose a set of novel strategies to specifically block this important virulence-related mechanism. IMPORTANCE Candida albicans is the main causative agent of candidiasis, the commonest fungal infection in humans. The eukaryotic nature of C. albicans and the rapid emergence of antifungal resistance raise the challenge of identifying novel drug targets to battle this prevalent and life-threatening disease. CaRas1 and CaCdc25 are key players in the activation of signaling pathways triggering multiple virulence traits, including the yeast-to-hypha interconversion. The structural similarity of the conserved G-domain of CaRas1 to those of human homologs and the lack of structural information on CaCdc25 has impeded progress in targeting these proteins. The unique structural and functional features for CaRas1 and CaCdc25 presented here, together with the identification of a synthetic peptide capable of specifically inhibiting the GEF activity of CaCdc25, open new possibilities to uncover new antifungal drug targets against C. albicans virulence.
Affiliation(s)
- José A. Manso
  - IBMC–Instituto de Biologia Molecular e Celular, Universidade do Porto, Porto, Portugal
  - i3S–Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Porto, Portugal
- Arturo Carabias
  - Instituto de Biología Molecular y Celular del Cáncer, Consejo Superior de Investigaciones Científicas-University of Salamanca, Salamanca, Spain
- Zsuzsa Sárkány
  - IBMC–Instituto de Biologia Molecular e Celular, Universidade do Porto, Porto, Portugal
  - i3S–Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Porto, Portugal
- José M. de Pereda
  - Instituto de Biología Molecular y Celular del Cáncer, Consejo Superior de Investigaciones Científicas-University of Salamanca, Salamanca, Spain
- Pedro José Barbosa Pereira
  - IBMC–Instituto de Biologia Molecular e Celular, Universidade do Porto, Porto, Portugal
  - i3S–Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Porto, Portugal
- Sandra Macedo-Ribeiro
  - IBMC–Instituto de Biologia Molecular e Celular, Universidade do Porto, Porto, Portugal
  - i3S–Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Porto, Portugal
10
Singh AK, Amar I, Ramadasan H, Kappagantula KS, Chavali S. Proteins with amino acid repeats constitute a rapidly evolvable and human-specific essentialome. Cell Rep 2023; 42:112811. [PMID: 37453061 DOI: 10.1016/j.celrep.2023.112811] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2023] [Revised: 05/30/2023] [Accepted: 06/29/2023] [Indexed: 07/18/2023] Open
Abstract
Protein products of essential genes, indispensable for organismal survival, are highly conserved and bring about fundamental functions. Interestingly, proteins that contain amino acid homorepeats that tend to evolve rapidly are enriched in eukaryotic essentialomes. Why are proteins with hypermutable homorepeats enriched in conserved and functionally vital essential proteins? We solve this functional versus evolutionary paradox by demonstrating that human essential proteins with homorepeats bring about crosstalk across biological processes through high interactability and have distinct regulatory functions affecting expansive global regulation. Importantly, essential proteins with homorepeats rapidly diverge with the amino acid substitutions frequently affecting functional sites, likely facilitating rapid adaptability. Strikingly, essential proteins with homorepeats influence human-specific embryonic and brain development, implying that the presence of homorepeats could contribute to the emergence of human-specific processes. Thus, we propose that homorepeat-containing essential proteins affecting species-specific traits can be potential intervention targets across pathologies, including cancers and neurological disorders.
Affiliation(s)
- Anjali K Singh
  - Department of Biology, Indian Institute of Science Education and Research (IISER) Tirupati, Tirupati 517507, Andhra Pradesh, India
- Ishita Amar
  - Department of Biology, Indian Institute of Science Education and Research (IISER) Tirupati, Tirupati 517507, Andhra Pradesh, India
- Harikrishnan Ramadasan
  - Department of Biology, Indian Institute of Science Education and Research (IISER) Tirupati, Tirupati 517507, Andhra Pradesh, India
- Keertana S Kappagantula
  - Department of Biology, Indian Institute of Science Education and Research (IISER) Tirupati, Tirupati 517507, Andhra Pradesh, India
- Sreenivas Chavali
  - Department of Biology, Indian Institute of Science Education and Research (IISER) Tirupati, Tirupati 517507, Andhra Pradesh, India.
11
Yao SY, Wang JF, Xu Z, Meng Y, Xue Y, Yang F, Yao WB, Gao XD, Chen S. A peptide rich in glycine-serine-alanine repeats ameliorates Alzheimer-type neurodegeneration. Br J Pharmacol 2023; 180:1878-1896. [PMID: 36727262 DOI: 10.1111/bph.16048] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/11/2021] [Revised: 12/04/2022] [Accepted: 01/23/2023] [Indexed: 02/03/2023] Open
Abstract
BACKGROUND AND PURPOSE Repeated amino acid sequences in proteins are widely found, and the glycine-serine-alanine repeat is an element with a general propensity to form β-sheet aggregates as found in key pathological factors, in several neurodegenerative diseases. Such properties of this repeat may guide development of disease-modifying therapies for neurodegenerative disease. However, details of its role and underlying mechanism(s) remain largely unknown. EXPERIMENTAL APPROACH Actions of specific glycine-serine-alanine repeat peptides (SNPs), especially SNP-9, on Alzheimer's disease (AD)-like abnormalities were evaluated in transgenic mice and Caenorhabditis elegans, and in rat and cell models. Entry of SNPs into the brain, SNP activity in neuronal cells and peptide entry into cells were analysed in vivo and in vitro. Cell-free systems and the yeast two-hybrid system were also used to explore possible targets of SNP-9, and interactions of potential targets with SNP-9 were confirmed in cell-based systems. KEY RESULTS We first identified SNP-9 as a potent neuroprotective peptide with the activity to decrease oligomeric amyloid β (Aβ) via co-assembling with the toxic Aβ oligomer to form hetero-oligomers. Also, calcyclin-binding protein was found to act as a SNP-9-binding protein, by screening of a human brain cDNA library. Such binding showed that SNP-9 could regulate the abnormal hyperphosphorylation of tau via calcyclin-binding protein. CONCLUSION AND IMPLICATIONS Our study provides a foundation for development of SNPs, especially SNP-9, as potential therapeutic interventions for AD. We propose SNP-9 as a potential therapeutic agent for the treatment of AD.
Affiliation(s)
- Si-Yuan Yao
- Jiangsu Key Laboratory of Druggability of Biopharmaceuticals, State Key Laboratory of Natural Medicines, School of Life Science and Technology, China Pharmaceutical University, Nanjing, China
- Jia-Fan Wang
- Jiangsu Key Laboratory of Druggability of Biopharmaceuticals, State Key Laboratory of Natural Medicines, School of Life Science and Technology, China Pharmaceutical University, Nanjing, China
- Zheng Xu
- Jiangsu Key Laboratory of Druggability of Biopharmaceuticals, State Key Laboratory of Natural Medicines, School of Life Science and Technology, China Pharmaceutical University, Nanjing, China
- Yue Meng
- Jiangsu Key Laboratory of Druggability of Biopharmaceuticals, State Key Laboratory of Natural Medicines, School of Life Science and Technology, China Pharmaceutical University, Nanjing, China
- Yue Xue
- Jiangsu Key Laboratory of Druggability of Biopharmaceuticals, State Key Laboratory of Natural Medicines, School of Life Science and Technology, China Pharmaceutical University, Nanjing, China
- Fan Yang
- Jiangsu Key Laboratory of Druggability of Biopharmaceuticals, State Key Laboratory of Natural Medicines, School of Life Science and Technology, China Pharmaceutical University, Nanjing, China
- Wen-Bing Yao
- Jiangsu Key Laboratory of Druggability of Biopharmaceuticals, State Key Laboratory of Natural Medicines, School of Life Science and Technology, China Pharmaceutical University, Nanjing, China
- Xiang-Dong Gao
- Jiangsu Key Laboratory of Druggability of Biopharmaceuticals, State Key Laboratory of Natural Medicines, School of Life Science and Technology, China Pharmaceutical University, Nanjing, China
- Song Chen
- Jiangsu Key Laboratory of Druggability of Biopharmaceuticals, State Key Laboratory of Natural Medicines, School of Life Science and Technology, China Pharmaceutical University, Nanjing, China
|
12
|
Sousa A, Rocha S, Vieira J, Reboiro-Jato M, López-Fernández H, Vieira CP. On the identification of potential novel therapeutic targets for spinocerebellar ataxia type 1 (SCA1) neurodegenerative disease using EvoPPI3. J Integr Bioinform 2023; 20:jib-2022-0056. [PMID: 36848492 PMCID: PMC10561075 DOI: 10.1515/jib-2022-0056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2022] [Accepted: 11/26/2022] [Indexed: 03/01/2023] Open
Abstract
EvoPPI (http://evoppi.i3s.up.pt), a meta-database for protein-protein interactions (PPI), has been upgraded (EvoPPI3) to accept new types of data, namely, PPI from patients, cell lines, and animal models, as well as data from gene modifier experiments, for nine neurodegenerative polyglutamine (polyQ) diseases caused by an abnormal expansion of the polyQ tract. The integration of the different types of data allows users to easily compare them, as here shown for Ataxin-1, the polyQ protein involved in spinocerebellar ataxia type 1 (SCA1) disease. Using all available datasets and the data here obtained for Drosophila melanogaster wt and exp Ataxin-1 mutants (also available at EvoPPI3), we show that, in humans, the Ataxin-1 network is much larger than previously thought (380 interactors), with at least 909 interactors. The functional profiling of the newly identified interactors is similar to the ones already reported in the main PPI databases. 16 out of 909 interactors are putative novel SCA1 therapeutic targets, and all but one are already being studied in the context of this disease. The 16 proteins are mainly involved in binding and catalytic activity (mainly kinase activity), functional features already thought to be important in the SCA1 disease.
Affiliation(s)
- André Sousa
- Instituto de Investigação e Inovação em Saúde (I3S), Universidade do Porto, Rua Alfredo Allen, 208, 4200-135 Porto, Portugal
- Sara Rocha
- Instituto de Investigação e Inovação em Saúde (I3S), Universidade do Porto, Rua Alfredo Allen, 208, 4200-135 Porto, Portugal
- Jorge Vieira
- Instituto de Investigação e Inovação em Saúde (I3S), Universidade do Porto, Rua Alfredo Allen, 208, 4200-135 Porto, Portugal
- Instituto de Biologia Molecular e Celular (IBMC), Rua Alfredo Allen, 208, 4200-135 Porto, Portugal
- Miguel Reboiro-Jato
- Department of Computer Science, CINBIO, Universidade de Vigo, ESEI – Escuela Superior de Ingeniería Informática, 32004 Ourense, Spain
- SING Research Group, Galicia Sur Health Research Institute (IIS Galicia Sur), SERGAS-UVIGO, 36213 Vigo, Spain
- Hugo López-Fernández
- Department of Computer Science, CINBIO, Universidade de Vigo, ESEI – Escuela Superior de Ingeniería Informática, 32004 Ourense, Spain
- SING Research Group, Galicia Sur Health Research Institute (IIS Galicia Sur), SERGAS-UVIGO, 36213 Vigo, Spain
- Cristina P. Vieira
- Instituto de Investigação e Inovação em Saúde (I3S), Universidade do Porto, Rua Alfredo Allen, 208, 4200-135 Porto, Portugal
- Instituto de Biologia Molecular e Celular (IBMC), Rua Alfredo Allen, 208, 4200-135 Porto, Portugal
|
13
|
Barbosa Pereira PJ, Manso JA, Macedo-Ribeiro S. The structural plasticity of polyglutamine repeats. Curr Opin Struct Biol 2023; 80:102607. [PMID: 37178477 DOI: 10.1016/j.sbi.2023.102607] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2023] [Revised: 04/11/2023] [Accepted: 04/12/2023] [Indexed: 05/15/2023]
Abstract
From yeast to humans, polyglutamine (polyQ) repeat tracts are found frequently in the proteome and are particularly prominent in the activation domains of transcription factors. PolyQ is a polymorphic motif that modulates functional protein-protein interactions and aberrant self-assembly. Expansion of the polyQ repeated sequences beyond critical physiological repeat length thresholds triggers self-assembly and is linked to severe pathological implications. This review provides an overview of the current knowledge on the structures of polyQ tracts in the soluble and aggregated states and discusses the influence of neighboring regions on polyQ secondary structure, aggregation, and fibril morphologies. The influence of the genetic context of the polyQ-encoding trinucleotides is briefly discussed as a challenge for future endeavors in this field.
Affiliation(s)
- Pedro José Barbosa Pereira
- IBMC - Instituto de Biologia Molecular e Celular, Universidade do Porto, 4200-135, Porto, Portugal; Instituto de Investigação e Inovação em Saúde, Universidade do Porto, 4200-135, Porto, Portugal
- José A Manso
- IBMC - Instituto de Biologia Molecular e Celular, Universidade do Porto, 4200-135, Porto, Portugal; Instituto de Investigação e Inovação em Saúde, Universidade do Porto, 4200-135, Porto, Portugal
- Sandra Macedo-Ribeiro
- IBMC - Instituto de Biologia Molecular e Celular, Universidade do Porto, 4200-135, Porto, Portugal; Instituto de Investigação e Inovação em Saúde, Universidade do Porto, 4200-135, Porto, Portugal
|
14
|
Petrzilek J, Pasulka J, Malik R, Horvat F, Kataruka S, Fulka H, Svoboda P. De novo emergence, existence, and demise of a protein-coding gene in murids. BMC Biol 2022; 20:272. [PMID: 36482406 PMCID: PMC9733328 DOI: 10.1186/s12915-022-01470-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2022] [Accepted: 11/15/2022] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND: Genes, principal units of genetic information, vary in complexity and evolutionary history. Less-complex genes (e.g., long non-coding RNA (lncRNA) expressing genes) readily emerge de novo from non-genic sequences and have high evolutionary turnover. Genesis of a gene may be facilitated by adoption of functional genic sequences from retrotransposon insertions. However, protein-coding sequences in extant genomes rarely lack any connection to an ancestral protein-coding sequence. RESULTS: We describe remarkable evolution of the murine gene D6Ertd527e and its orthologs in the rodent Muroidea superfamily. The D6Ertd527e emerged in a common ancestor of mice and hamsters most likely as a lncRNA-expressing gene. A major contributing factor was a long terminal repeat (LTR) retrotransposon insertion carrying an oocyte-specific promoter and a 5' terminal exon of the gene. The gene survived as an oocyte-specific lncRNA in several extant rodents while in some others the gene or its expression were lost. In the ancestral lineage of Mus musculus, the gene acquired protein-coding capacity where the bulk of the coding sequence formed through CAG (AGC) trinucleotide repeat expansion and duplications. These events generated a cytoplasmic serine-rich maternal protein. Knock-out of D6Ertd527e in mice has a small but detectable effect on fertility and the maternal transcriptome. CONCLUSIONS: While this evolving gene is not showing a clear function in laboratory mice, its documented evolutionary history in Muroidea during the last ~40 million years provides a textbook example of how several common mutation events can support de novo gene formation, evolution of protein-coding capacity, as well as a gene's demise.
Affiliation(s)
- Jan Petrzilek
- Institute of Molecular Genetics of the Czech Academy of Sciences, Videnska 1083, 142 20 Prague 4, Czech Republic; Present address: Vienna BioCenter PhD Program, Doctoral School of the University of Vienna and Medical University of Vienna, Vienna, Austria
- Josef Pasulka
- Institute of Molecular Genetics of the Czech Academy of Sciences, Videnska 1083, 142 20 Prague 4, Czech Republic
- Radek Malik
- Institute of Molecular Genetics of the Czech Academy of Sciences, Videnska 1083, 142 20 Prague 4, Czech Republic
- Filip Horvat
- Institute of Molecular Genetics of the Czech Academy of Sciences, Videnska 1083, 142 20 Prague 4, Czech Republic; Bioinformatics Group, Division of Biology, Faculty of Science, University of Zagreb, Horvatovac 102a, 10000 Zagreb, Croatia
- Shubhangini Kataruka
- Institute of Molecular Genetics of the Czech Academy of Sciences, Videnska 1083, 142 20 Prague 4, Czech Republic; Present address: Department of Genetics, Yale School of Medicine, New Haven, CT 06510 USA
- Helena Fulka
- Institute of Molecular Genetics of the Czech Academy of Sciences, Videnska 1083, 142 20 Prague 4, Czech Republic; Current address: Institute of Experimental Medicine of the Czech Academy of Sciences, Videnska 1083, 142 20 Prague 4, Czech Republic
- Petr Svoboda
- Institute of Molecular Genetics of the Czech Academy of Sciences, Videnska 1083, 142 20 Prague 4, Czech Republic
|
15
|
Wu C, Guo D. Computational Docking Reveals Co-Evolution of C4 Carbon Delivery Enzymes in Diverse Plants. Int J Mol Sci 2022; 23:ijms232012688. [PMID: 36293547 PMCID: PMC9604239 DOI: 10.3390/ijms232012688] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2022] [Revised: 10/14/2022] [Accepted: 10/19/2022] [Indexed: 11/16/2022] Open
Abstract
Proteins are modular functionalities regulating multiple cellular activities in prokaryotes and eukaryotes. As a consequence of higher plants adapting to arid and thermal conditions, C4 photosynthesis is the carbon fixation process involving multi-enzymes working in a coordinated fashion. However, how these enzymes interact with each other and whether they co-evolve in parallel to maintain interactions in different plants remain elusive to date. Here, we report our findings on the global protein co-evolution relationship and local dynamics of co-varying site shifts in key C4 photosynthetic enzymes. We found that in most of the selected key C4 photosynthetic enzymes, global pairwise co-evolution events exist to form functional couplings. Besides, protein-protein interactions between these enzymes may suggest their unknown functionalities in the carbon delivery process. For PEPC and PPCK regulation pairs, pocket formation at the interactive interface is not necessary for their function. This feature is distinct from another well-known regulation pair in C4 photosynthesis, namely, PPDK and PPDK-RP, where the pockets are necessary. Our findings facilitate the discovery of novel protein regulation types and contribute to expanding our knowledge about C4 photosynthesis.
|
16
|
Jarnot P, Ziemska-Legiecka J, Grynberg M, Gruca A. Insights from analyses of low complexity regions with canonical methods for protein sequence comparison. Brief Bioinform 2022; 23:bbac299. [PMID: 35914952 PMCID: PMC9487646 DOI: 10.1093/bib/bbac299] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2022] [Revised: 06/29/2022] [Accepted: 07/01/2022] [Indexed: 11/28/2022] Open
Abstract
Low complexity regions are fragments of protein sequences composed of only a few types of amino acids. These regions frequently occur in proteins and can play an important role in their functions. However, scientists are mainly focused on regions characterized by high diversity of amino acid composition. Similarity between regions of protein sequences frequently reflects functional similarity between them. In this article, we discuss strengths and weaknesses of the similarity analysis of low complexity regions using BLAST, HHblits and CD-HIT. These methods are considered to be the gold standard in protein similarity analysis and were designed for comparison of high complexity regions. However, we lack specialized methods that could be used to compare the similarity of low complexity regions. Therefore, we investigated the existing methods in order to understand how they can be applied to compare such regions. Our results are supported by an exploratory study and a discussion of the amino acid composition and biological roles of selected examples. We show that existing methods need improvements to efficiently search for similar low complexity regions. We suggest features that have to be re-designed specifically for comparing low complexity regions: scoring matrix, multiple sequence alignment, e-value, local alignment and clustering based on a set of representative sequences. Results of this analysis can either be used to improve existing methods or to create new methods for the similarity analysis of low complexity regions.
Affiliation(s)
- Patryk Jarnot
- Department of Computer Networks and Systems, Silesian University of Technology, Akademicka 2A, 44-100, Gliwice, Poland
- Joanna Ziemska-Legiecka
- Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Pawinskiego 5A, 02-106, Warsaw, Poland
- Marcin Grynberg
- Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Pawinskiego 5A, 02-106, Warsaw, Poland
- Aleksandra Gruca
- Department of Computer Networks and Systems, Silesian University of Technology, Akademicka 2A, 44-100, Gliwice, Poland
|
17
|
Mier P, Elena-Real CA, Cortés J, Bernadó P, Andrade-Navarro MA. The sequence context in poly-alanine regions: structure, function and conservation. Bioinformatics 2022; 38:4851-4858. [PMID: 36106994 PMCID: PMC9620824 DOI: 10.1093/bioinformatics/btac610] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2022] [Revised: 07/07/2022] [Accepted: 09/05/2022] [Indexed: 11/24/2022] Open
Abstract
Motivation: Poly-alanine (polyA) regions are protein stretches mostly composed of alanines. Despite their abundance in eukaryotic proteomes and their association with nine inherited human diseases, the structural and functional roles exerted by polyA stretches remain poorly understood. In this work we study how the amino acid context in which polyA regions are settled in proteins influences their structure and function. Results: We identified glycine and proline as the most abundant amino acids within polyA and in the flanking regions of polyA tracts, in human proteins as well as in 17 additional eukaryotic species. Our analyses indicate that the non-structuring nature of these two amino acids influences the α-helical conformations predicted for polyA, suggesting a relevant role in reducing the inherent aggregation propensity of long polyA. Then, we show how polyA position in protein N-termini relates with their function as transit peptides. PolyA placed just after the initial methionine is often predicted as part of mitochondrial transit peptides, whereas when placed in downstream positions, polyA are part of signal peptides. A few examples from known structures suggest that short polyA can emerge by alanine substitutions in α-helices; but evolution by insertion is observed for longer polyA. Our results showcase the importance of studying the sequence context of homorepeats as a mechanism to shape their structure–function relationships. Availability and implementation: The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Pablo Mier
- Faculty of Biology, Institute of Organismic and Molecular Evolution, Johannes Gutenberg University Mainz, 55128 Mainz, Germany
- Carlos A Elena-Real
- Centre de Biologie Structurale (CBS), Université de Montpellier, INSERM, CNRS, 34090 Montpellier, France
- Juan Cortés
- LAAS-CNRS, Université de Toulouse, CNRS, Toulouse, France
- Pau Bernadó
- Centre de Biologie Structurale (CBS), Université de Montpellier, INSERM, CNRS, 34090 Montpellier, France
- Miguel A Andrade-Navarro
- Faculty of Biology, Institute of Organismic and Molecular Evolution, Johannes Gutenberg University Mainz, 55128 Mainz, Germany
|
18
|
Xu J, Zhao X, Zhao X, Wang Z, Tang Q, Xu H, Liu Y. Memristors with Biomaterials for Biorealistic Neuromorphic Applications. Small Science 2022. [DOI: 10.1002/smsc.202200028] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/12/2022] Open
Affiliation(s)
- Jiaqi Xu
- Key Laboratory of UV Light-Emitting Materials and Technology of Ministry of Education, Northeast Normal University, Changchun 130024, China
- Xiaoning Zhao
- Key Laboratory of UV Light-Emitting Materials and Technology of Ministry of Education, Northeast Normal University, Changchun 130024, China
- Xiaoli Zhao
- Key Laboratory of UV Light-Emitting Materials and Technology of Ministry of Education, Northeast Normal University, Changchun 130024, China
- Zhongqiang Wang
- Key Laboratory of UV Light-Emitting Materials and Technology of Ministry of Education, Northeast Normal University, Changchun 130024, China
- Qingxin Tang
- Key Laboratory of UV Light-Emitting Materials and Technology of Ministry of Education, Northeast Normal University, Changchun 130024, China
- Haiyang Xu
- Key Laboratory of UV Light-Emitting Materials and Technology of Ministry of Education, Northeast Normal University, Changchun 130024, China
- Yichun Liu
- Key Laboratory of UV Light-Emitting Materials and Technology of Ministry of Education, Northeast Normal University, Changchun 130024, China
|
19
|
Clore GM. NMR spectroscopy, excited states and relevance to problems in cell biology - transient pre-nucleation tetramerization of huntingtin and insights into Huntington's disease. J Cell Sci 2022; 135:jcs258695. [PMID: 35703323 PMCID: PMC9270955 DOI: 10.1242/jcs.258695] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/20/2022] Open
Abstract
Solution nuclear magnetic resonance (NMR) spectroscopy is a powerful technique for analyzing three-dimensional structure and dynamics of macromolecules at atomic resolution. Recent advances have exploited the unique properties of NMR in exchanging systems to detect, characterize and visualize excited sparsely populated states of biological macromolecules and their complexes, which are only transient. These states are invisible to conventional biophysical techniques, and play a key role in many processes, including molecular recognition, protein folding, enzyme catalysis, assembly and fibril formation. All the NMR techniques make use of exchange between sparsely populated NMR-invisible and highly populated NMR-visible states to transfer a magnetization property from the invisible state to the visible one where it can be easily detected and quantified. There are three classes of NMR experiments that rely on differences in distance, chemical shift or transverse relaxation (molecular mass) between the NMR-visible and -invisible species. Here, I illustrate the application of these methods to unravel the complex mechanism of sub-millisecond pre-nucleation oligomerization of the N-terminal region of huntingtin, encoded by exon-1 of the huntingtin gene, where CAG expansion leads to Huntington's disease, a fatal autosomal-dominant neurodegenerative condition. I also discuss how inhibition of tetramerization blocks the much slower (by many orders of magnitude) process of fibril formation.
Affiliation(s)
- G. Marius Clore
- Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD 20892-0520, USA
|
20
|
Mier P, Andrade-Navarro MA. PolyX2: Fast Detection of Homorepeats in Large Protein Datasets. Genes (Basel) 2022; 13:genes13050758. [PMID: 35627143 PMCID: PMC9141109 DOI: 10.3390/genes13050758] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2022] [Revised: 04/22/2022] [Accepted: 04/22/2022] [Indexed: 12/03/2022] Open
Abstract
Homorepeat sequences, consecutive runs of identical amino acids, are prevalent in eukaryotic proteins. It has become necessary to annotate and evaluate this feature in entire proteomes. The definition of what constitutes a homorepeat is not fixed, and different research approaches may require different definitions; therefore, flexible approaches to analyze homorepeats in complete proteomes are needed. Here, we present polyX2, a fast, simple but tunable script to scan protein datasets for all possible homorepeats. The user can modify the length of the window to scan, the minimum number of identical residues that must be found in the window, and the types of homorepeats to be found.
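The three tunable parameters named in this abstract (window length, minimum number of identical residues in the window, and residue types to search) can be illustrated with a minimal Python sketch. This is not the polyX2 script itself: the function names, the default values (window of 10, at least 8 identical residues), and the choice to merge overlapping windows and trim non-matching ends are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def _trim(seq: str, aa: str, start: int, end: int) -> Tuple[str, int, int]:
    """Shrink the half-open region [start, end) so it begins and ends with `aa`."""
    while seq[start] != aa:
        start += 1
    while seq[end - 1] != aa:
        end -= 1
    return (aa, start + 1, end)  # report 1-based, inclusive coordinates


def find_homorepeats(
    seq: str,
    window: int = 10,          # assumed default, not taken from the abstract
    min_identical: int = 8,    # assumed default, not taken from the abstract
    residues: Optional[str] = None,
) -> List[Tuple[str, int, int]]:
    """Return (residue, start, end) for regions where some window of length
    `window` contains at least `min_identical` copies of one residue."""
    seq = seq.upper()
    targets = sorted(set(residues.upper()) if residues else set(seq))
    hits: List[Tuple[str, int, int]] = []
    for aa in targets:
        region = None  # [start, end) of the merged qualifying windows
        for i in range(max(len(seq) - window + 1, 0)):
            if seq[i:i + window].count(aa) >= min_identical:
                if region is not None and i <= region[1]:
                    region[1] = i + window      # overlaps: extend the region
                else:
                    if region is not None:
                        hits.append(_trim(seq, aa, region[0], region[1]))
                    region = [i, i + window]    # start a new region
        if region is not None:
            hits.append(_trim(seq, aa, region[0], region[1]))
    return sorted(hits, key=lambda h: (h[1], h[0]))


if __name__ == "__main__":
    protein = "MKT" + "Q" * 12 + "AVL"
    print(find_homorepeats(protein))
```

Loosening `min_identical` relative to `window` admits interrupted tracts, while setting the two equal restricts the search to perfect runs; this is the kind of flexibility in the homorepeat definition that the abstract describes.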
|
21
|
Hu Y, Wang K, Ye C. "Four-in-One" Nanozyme and Natural Enzyme Symbiotic System of Cu2-xSe-GOx for Cervical Cancer Therapy. Chemistry 2021; 28:e202102885. [PMID: 34773414 DOI: 10.1002/chem.202102885] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/08/2021] [Indexed: 12/19/2022]
Abstract
Cervical cancer, as a common malignant tumor of the reproductive system, seriously threatens women's life and health, and is difficult to cure with traditional treatments such as surgery, chemotherapy and radiotherapy. Fortunately, tumor microenvironment (TME)-activated catalytic therapy with high efficiency and reduced off-target toxicity has emerged as a novel treatment model. Herein, we designed a "four-in-one" nanozyme and natural enzyme symbiotic system of Cu2-xSe-GOx for TME-triggered cascaded catalytic enhanced cancer treatment. In response to the unique TME, Cu2-xSe with catalase activity could effectively catalyze over-expressed H2O2 in cancer cells into O2. Subsequently, the glucose oxidase (GOx) could deplete intracellular glucose with the assistance of O2; this not only achieves starvation therapy, but also regenerates H2O2 to boost the generation of highly cytotoxic ·OH due to the peroxidase activity of Cu2-xSe. Moreover, although the free-radical scavenger glutathione (GSH) is overexpressed in tumor cells, Cu2-xSe with glutathione oxidase activity could effectively consume GSH for enhanced ROS production. Thus, the "four-in-one" nanozyme@natural enzyme symbiotic system of Cu2-xSe-GOx could induce significant ROS accumulation at the tumor regions, thus providing a potential approach for the treatment of cervical cancer.
Affiliation(s)
- Yubo Hu
- Department of Anesthesiology, China-Japan Union Hospital of Jilin University, Changchun, Jilin, 130000, P. R. China
- Ke Wang
- Department of Gynaecology and Obstetrics, China-Japan Union Hospital of Jilin University, Changchun, Jilin, 130000, P. R. China
- Cong Ye
- Department of Gynaecology and Obstetrics, China-Japan Union Hospital of Jilin University, Changchun, Jilin, 130000, P. R. China
|
22
|
Vaglietti S, Fiumara F. PolyQ length co-evolution in neural proteins. NAR Genom Bioinform 2021; 3:lqab032. [PMID: 34017944 PMCID: PMC8121095 DOI: 10.1093/nargab/lqab032] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/30/2020] [Revised: 02/10/2021] [Accepted: 03/31/2021] [Indexed: 12/29/2022] Open
Abstract
Intermolecular co-evolution optimizes physiological performance in functionally related proteins, ultimately increasing molecular co-adaptation and evolutionary fitness. Polyglutamine (polyQ) repeats, which are over-represented in nervous system-related proteins, are increasingly recognized as length-dependent regulators of protein function and interactions, and their length variation contributes to intraspecific phenotypic variability and interspecific divergence. However, it is unclear whether polyQ repeat lengths evolve independently in each protein or rather co-evolve across functionally related protein pairs and networks, as in an integrated regulatory system. To address this issue, we investigated here the length evolution and co-evolution of polyQ repeats in clusters of functionally related and physically interacting neural proteins in Primates. We observed function-/disease-related polyQ repeat enrichment and evolutionary hypervariability in specific neural protein clusters, particularly in the neurocognitive and neuropsychiatric domains. Notably, these analyses detected extensive patterns of intermolecular polyQ length co-evolution in pairs and clusters of functionally related, physically interacting proteins. Moreover, they revealed both direct and inverse polyQ length co-variation in protein pairs, together with complex patterns of coordinated repeat variation in entire polyQ protein sets. These findings uncover a whole system of co-evolving polyQ repeats in neural proteins with direct implications for understanding polyQ-dependent phenotypic variability, neurocognitive evolution and neuropsychiatric disease pathogenesis.
Affiliation(s)
- Serena Vaglietti
- Rita Levi Montalcini Department of Neuroscience, University of Torino, Torino 10125, Italy
- Ferdinando Fiumara
- Rita Levi Montalcini Department of Neuroscience, University of Torino, Torino 10125, Italy
- National Institute of Neuroscience (INN), University of Torino, Torino 10125, Italy
|
23
|
Ca2+-regulated mitochondrial carriers of ATP-Mg2+/Pi: Evolutionary insights in protozoans. Biochim Biophys Acta Mol Cell Res 2021; 1868:119038. [PMID: 33839167 DOI: 10.1016/j.bbamcr.2021.119038] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/02/2021] [Revised: 03/30/2021] [Accepted: 03/31/2021] [Indexed: 11/23/2022]
Abstract
In addition to its uptake across the Ca2+ uniporter, intracellular calcium signals can stimulate mitochondrial metabolism by activating metabolite exchangers of the inner mitochondrial membrane belonging to the mitochondrial carrier family (SLC25). Among these Ca2+-regulated mitochondrial carriers (CaMCs) are the reversible ATP-Mg2+/Pi transporters, or SCaMCs, which are required for maintaining optimal adenine nucleotide (AdN) levels in the mitochondrial matrix and represent an alternative transporter to the ADP/ATP translocases (AAC). This CaMC has a distinctive Calmodulin-like (CaM-like) domain fused to the carrier domain that makes its transport activity strictly dependent on cytosolic Ca2+ signals. Here we investigate its origin by analysing its distribution and features in unicellular eukaryotes. Unexpectedly, we find two types of ATP-Mg2+/Pi carriers, the canonical ones and shortened variants lacking the CaM-like domain. Phylogenetic analysis shows that both SCaMC variants have a common origin, unrelated to AACs, suggesting in turn that recurrent losses of the regulatory module have occurred in the different phyla. They are excluding variants that show a more limited distribution and less conservation than AACs. Interestingly, these truncated variants of SCaMC are found almost exclusively in parasitic protists, such as apicomplexans, kinetoplastids or animal-pathogenic oomycetes, and in green algae, suggesting that its loss could be related to certain life-styles. In addition, we find an intricate structural diversity in these variants that may be associated with their pathogenicity. The consequences on SCaMC functions of these new SCaMC-b variants are discussed.
|
24
|
Milorey B, Schweitzer-Stenner R, Andrews B, Schwalbe H, Urbanc B. Short peptides as predictors for the structure of polyarginine sequences in disordered proteins. Biophys J 2021; 120:662-676. [PMID: 33453267 PMCID: PMC7896027 DOI: 10.1016/j.bpj.2020.12.026] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/16/2020] [Revised: 12/08/2020] [Accepted: 12/30/2020] [Indexed: 12/12/2022] Open
Abstract
Intrinsically disordered proteins and intrinsically disordered regions are frequently enriched in charged amino acids. Intrinsically disordered regions are regularly involved in important biological processes in which one or more charged residues is the driving force behind a protein-biomolecule interaction. Several lines of experimental and computational evidence suggest that polypeptides and proteins that carry high net charges have a high preference for extended conformations with average end-to-end distances exceeding expectations for self-avoiding random coils. Here, we show that charged arginine residues even in short glycine-capped model peptides (GRRG and GRRRG) significantly affect the conformational propensities of each other when compared with the intrinsic propensities of a mostly unperturbed arginine in the tripeptide GRG. A conformational analysis based on experimentally determined J-coupling constants from heteronuclear NMR spectroscopy and amide I' band profiles from vibrational spectroscopy reveals that nearest-neighbor interactions stabilize extended β-strand conformations at the expense of polyproline II and turn conformations. The results from molecular dynamics simulations with a CHARMM36m force field and TIP3P water reproduce our results only to a limited extent. The use of the Ramachandran distribution of the central residue of GRRRG in a calculation of end-to-end distances of polyarginines of different length yielded the expected power law behavior. The scaling coefficient of 0.66 suggests that such peptides would be more extended than predicted by a self-avoiding random walk. Our findings thus support in principle theoretical predictions.
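The power-law statement in this abstract can be made concrete with a small Python sketch. Only the exponent 0.66 comes from the abstract; the comparison value of about 0.588 is the commonly quoted exponent for a three-dimensional self-avoiding walk and is an assumption brought in here, as is the function name. Because the abstract gives no prefactor, the sketch reports only a ratio of end-to-end distances, assuming equal prefactors for both scaling laws.

```python
# Scaling law: end-to-end distance R(N) ~ N**nu for a chain of N residues.
NU_POLYARG = 0.66  # scaling coefficient reported in the abstract
NU_SAW = 0.588     # approximate 3D self-avoiding-walk exponent (assumption)


def relative_extension(n_residues: int,
                       nu: float = NU_POLYARG,
                       nu_ref: float = NU_SAW) -> float:
    """Ratio R_nu(N) / R_ref(N), assuming both laws share the same prefactor."""
    if n_residues < 1:
        raise ValueError("n_residues must be a positive integer")
    return float(n_residues) ** (nu - nu_ref)


if __name__ == "__main__":
    for n in (5, 10, 20, 50):
        print(n, round(relative_extension(n), 3))
```

A ratio above 1 for every chain longer than one residue corresponds to the abstract's claim that such peptides would be more extended than a self-avoiding random walk predicts.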
Affiliation(s)
- Bridget Milorey
- Department of Chemistry, Drexel University, Philadelphia, Pennsylvania
- Brian Andrews
- Department of Physics, Drexel University, Philadelphia, Pennsylvania
- Harald Schwalbe
- Institut für Organische Chemie und Chemische Biologie, Johann Wolfgang Goethe Universität, Frankfurt, Germany
- Brigita Urbanc
- Department of Physics, Drexel University, Philadelphia, Pennsylvania
|
25
|
Persi E, Wolf YI, Horn D, Ruppin E, Demichelis F, Gatenby RA, Gillies RJ, Koonin EV. Mutation-selection balance and compensatory mechanisms in tumour evolution. Nat Rev Genet 2020; 22:251-262. [PMID: 33257848 DOI: 10.1038/s41576-020-00299-4] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Accepted: 10/16/2020] [Indexed: 12/11/2022]
Abstract
Intratumour heterogeneity and phenotypic plasticity, sustained by a range of somatic aberrations, as well as epigenetic and metabolic adaptations, are the principal mechanisms that enable cancers to resist treatment and survive under environmental stress. A comprehensive picture of the interplay between different somatic aberrations, from point mutations to whole-genome duplications, in tumour initiation and progression is lacking. We posit that different genomic aberrations generally exhibit a temporal order, shaped by a balance between the levels of mutations and selective pressures. Repeat instability emerges first, followed by larger aberrations, with compensatory effects leading to robust tumour fitness maintained throughout the tumour progression. A better understanding of the interplay between genetic aberrations, the microenvironment, and epigenetic and metabolic cellular states is essential for early detection and prevention of cancer as well as development of efficient therapeutic strategies.
Affiliation(s)
- Erez Persi
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Yuri I Wolf
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- David Horn
- School of Physics and Astronomy, Raymond & Beverly Sackler Faculty of Exact Sciences, Tel-Aviv University, Tel-Aviv, Israel
- Eytan Ruppin
- Cancer Data Science Lab, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Francesca Demichelis
- Department for Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy; Caryl and Israel Englander Institute for Precision Medicine, New York Presbyterian Hospital, Weill Cornell Medicine, New York, NY, USA
- Robert A Gatenby
- Integrated Mathematical Oncology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
- Robert J Gillies
- Department of Cancer Physiology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
- Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
|
26
|
Morató A, Elena-Real CA, Popovic M, Fournet A, Zhang K, Allemand F, Sibille N, Urbanek A, Bernadó P. Robust Cell-Free Expression of Sub-Pathological and Pathological Huntingtin Exon-1 for NMR Studies. General Approaches for the Isotopic Labeling of Low-Complexity Proteins. Biomolecules 2020; 10:E1458. [PMID: 33086646 PMCID: PMC7603387 DOI: 10.3390/biom10101458] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 09/16/2020] [Revised: 10/07/2020] [Accepted: 10/16/2020] [Indexed: 12/23/2022] Open
Abstract
The high-resolution structural study of huntingtin exon-1 (HttEx1) has long been hampered by its intrinsic properties. In addition to being prone to aggregate, HttEx1 contains low-complexity regions (LCRs) and is intrinsically disordered, ruling out several standard structural biology approaches. Here, we use a cell-free (CF) protein expression system to robustly and rapidly synthesize (sub-) pathological HttEx1. The open nature of the CF reaction allows the application of different isotopic labeling schemes, making HttEx1 amenable for nuclear magnetic resonance studies. While uniform and selective labeling facilitate the sequential assignment of HttEx1, combining CF expression with nonsense suppression allows the site-specific incorporation of a single labeled residue, making possible the detailed investigation of the LCRs. To optimize CF suppression yields, we analyze the expression and suppression kinetics, revealing that high concentrations of loaded suppressor tRNA have a negative impact on the final reaction yield. The optimized CF protein expression and suppression system is very versatile and well suited to produce challenging proteins with LCRs in order to enable the characterization of their structure and dynamics.
Affiliation(s)
- Annika Urbanek
- Centre de Biochimie Structurale (CBS), INSERM, CNRS and Université de Montpellier. 29 rue de Navacelles, 34090 Montpellier, France; (A.M.); (C.A.E.-R.); (M.P.); (A.F.); (K.Z.); (F.A.); (N.S.)
- Pau Bernadó
- Centre de Biochimie Structurale (CBS), INSERM, CNRS and Université de Montpellier. 29 rue de Navacelles, 34090 Montpellier, France; (A.M.); (C.A.E.-R.); (M.P.); (A.F.); (K.Z.); (F.A.); (N.S.)
|