1
Hoffmann C, Ruff KM, Edu IA, Shinn MK, Tromm JV, King MR, Pant A, Ausserwöger H, Morgan JR, Knowles TPJ, Pappu RV, Milovanovic D. Synapsin Condensation is Governed by Sequence-Encoded Molecular Grammars. J Mol Biol 2025; 437:168987. [PMID: 39947282 PMCID: PMC11903162 DOI: 10.1016/j.jmb.2025.168987] [Received: 10/07/2024] [Revised: 01/04/2025] [Accepted: 02/06/2025] [Indexed: 02/19/2025]
Abstract
Multiple biomolecular condensates coexist at the pre- and post-synapse to enable vesicle dynamics and controlled neurotransmitter release in the brain. In pre-synapses, intrinsically disordered regions (IDRs) of synaptic proteins are drivers of condensation that enable clustering of synaptic vesicles (SVs). Using computational analysis, we show that the IDRs of SV proteins feature evolutionarily conserved non-random compositional biases and sequence patterns. Synapsin-1 is essential for condensation of SVs, and its C-terminal IDR has been shown to be a key driver of condensation. Focusing on this IDR, we dissected the contributions of two conserved features, namely the segregation of polar and proline residues along the linear sequence, and the compositional preference for arginine over lysine. Scrambling the blocks of polar and proline residues weakens the driving forces for forming micron-scale condensates. However, the extent of clustering in subsaturated solutions remains equivalent to that of the wild-type synapsin-1. In contrast, substituting arginine with lysine significantly weakens both the driving forces for condensation and the extent of clustering in subsaturated solutions. Co-expression of the scrambled variant of synapsin-1 with synaptophysin results in a gain-of-function phenotype in cells, whereas arginine-to-lysine substitutions eliminate condensation in cells. We report an emergent consequence of synapsin-1 condensation, which is the generation of interphase pH gradients that are realized via differential partitioning of protons between coexisting phases. This pH gradient is likely to be directly relevant for vesicular ATPase functions and the loading of neurotransmitters. Our studies highlight how conserved IDR grammars serve as drivers of synapsin-1 condensation.
Affiliation(s)
- Christian Hoffmann
  - Laboratory of Molecular Neuroscience Berlin, German Center for Neurodegenerative Diseases (DZNE), 10117 Berlin, Germany
- Kiersten M Ruff
  - Department of Biomedical Engineering and Center for Biomolecular Condensates, James McKelvey School of Engineering, Washington University in St. Louis, St. Louis, MO, USA
- Irina A Edu
  - Centre for Misfolding Diseases, Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Min Kyung Shinn
  - Department of Biomedical Engineering and Center for Biomolecular Condensates, James McKelvey School of Engineering, Washington University in St. Louis, St. Louis, MO, USA
- Johannes V Tromm
  - Laboratory of Molecular Neuroscience Berlin, German Center for Neurodegenerative Diseases (DZNE), 10117 Berlin, Germany
- Matthew R King
  - Department of Biomedical Engineering and Center for Biomolecular Condensates, James McKelvey School of Engineering, Washington University in St. Louis, St. Louis, MO, USA
- Avnika Pant
  - Department of Biomedical Engineering and Center for Biomolecular Condensates, James McKelvey School of Engineering, Washington University in St. Louis, St. Louis, MO, USA
- Hannes Ausserwöger
  - Centre for Misfolding Diseases, Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Jennifer R Morgan
  - Eugene Bell Center for Regenerative Biology and Tissue Engineering, Marine Biological Laboratory, Woods Hole, MA 02543, USA
- Tuomas P J Knowles
  - Centre for Misfolding Diseases, Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom; Cavendish Laboratory, Department of Physics, University of Cambridge, JJ Thomson Road, Cambridge CB3 0HE, United Kingdom
- Rohit V Pappu
  - Department of Biomedical Engineering and Center for Biomolecular Condensates, James McKelvey School of Engineering, Washington University in St. Louis, St. Louis, MO, USA.
- Dragomir Milovanovic
  - Laboratory of Molecular Neuroscience Berlin, German Center for Neurodegenerative Diseases (DZNE), 10117 Berlin, Germany; German Center for Neurodegenerative Diseases (DZNE), 53127 Bonn, Germany; Einstein Center for Neuroscience, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität Berlin, and Berlin Institute of Health, 10117 Berlin, Germany; Whitman Center, Marine Biological Laboratory, 02543 Woods Hole, MA, USA.
2
Ditlev JA, Forman-Kay JD. Beyond peptide targeting sequences: machine learning of cellular condensate localization. Cell Res 2025:10.1038/s41422-025-01115-6. [PMID: 40223018 DOI: 10.1038/s41422-025-01115-6] [Indexed: 04/15/2025] Open
Affiliation(s)
- Jonathon A Ditlev
  - Molecular Medicine Program, Hospital for Sick Children, Toronto, ON, Canada.
  - Department of Biochemistry, University of Toronto, Toronto, ON, Canada.
  - Cell and Systems Biology Program, Hospital for Sick Children, Toronto, ON, Canada.
- Julie D Forman-Kay
  - Molecular Medicine Program, Hospital for Sick Children, Toronto, ON, Canada.
  - Department of Biochemistry, University of Toronto, Toronto, ON, Canada.
3
von Bülow S, Tesei G, Lindorff-Larsen K. Machine learning methods to study sequence-ensemble-function relationships in disordered proteins. Curr Opin Struct Biol 2025; 92:103028. [PMID: 40081192 DOI: 10.1016/j.sbi.2025.103028] [Received: 10/21/2024] [Revised: 02/13/2025] [Accepted: 02/14/2025] [Indexed: 03/15/2025]
Abstract
Recent years have seen tremendous developments in the use of machine learning models to link amino-acid sequence, structure, and function of folded proteins. These methods are, however, rarely applicable to the wide range of proteins and sequences that comprise intrinsically disordered regions. We here review developments in the study of sequence-ensemble-function relationships of disordered proteins that exploit or are used to train machine learning models. These include methods for generating conformational ensembles and designing new sequences, and for linking sequences to biophysical properties and biological functions. We highlight how these developments are built on a tight integration between experiment, theory and simulations, and account for evolutionary constraints, which operate on sequences of disordered regions differently than on those of folded domains.
Affiliation(s)
- Sören von Bülow
  - Structural Biology and NMR Laboratory & the Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, 2200, Copenhagen, Denmark
- Giulio Tesei
  - Structural Biology and NMR Laboratory & the Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, 2200, Copenhagen, Denmark
- Kresten Lindorff-Larsen
  - Structural Biology and NMR Laboratory & the Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, 2200, Copenhagen, Denmark.
4
Vashishtha S, Sabari BR. Disordered Regions of Condensate-promoting Proteins Have Distinct Molecular Signatures Associated with Cellular Function. J Mol Biol 2025; 437:168953. [PMID: 39826710 DOI: 10.1016/j.jmb.2025.168953] [Received: 08/14/2024] [Revised: 12/23/2024] [Accepted: 01/10/2025] [Indexed: 01/22/2025]
Abstract
Disordered regions of proteins play crucial roles in cellular functions through diverse mechanisms. Some disordered regions function by promoting the formation of biomolecular condensates through dynamic multivalent interactions. While many have assumed that interactions among these condensate-promoting disordered regions are non-specific, recent studies have shown that distinct sequence compositions and patterning lead to specific condensate compositions associated with cellular function. Despite in-depth characterization of several key examples, the full chemical diversity of condensate-promoting disordered regions has not been surveyed. Here, we define a list of disordered regions of condensate-promoting proteins to survey the relationship between sequence and function. We find that these disordered regions show amino acid biases associated with different cellular functions. These amino acid biases are evolutionarily conserved in the absence of positional sequence conservation. Overall, our analysis highlights the relationship between sequence features and function for condensate-promoting disordered regions. This analysis suggests that molecular signatures encoded within disordered regions could impart functional specificity.
Affiliation(s)
- Shubham Vashishtha
  - Laboratory of Nuclear Organization, Cecil H. and Ida Green Center for Reproductive Biology Sciences, Division of Basic Research, Department of Obstetrics and Gynecology, Department of Molecular Biology, Hamon Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Benjamin R Sabari
  - Laboratory of Nuclear Organization, Cecil H. and Ida Green Center for Reproductive Biology Sciences, Division of Basic Research, Department of Obstetrics and Gynecology, Department of Molecular Biology, Hamon Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA.
5
Jemth P. Protein binding and folding through an evolutionary lens. Curr Opin Struct Biol 2025; 90:102980. [PMID: 39817990 DOI: 10.1016/j.sbi.2024.102980] [Received: 09/17/2024] [Revised: 12/18/2024] [Accepted: 12/19/2024] [Indexed: 01/18/2025]
Abstract
Protein-protein associations are often mediated by an intrinsically disordered protein region interacting with a folded domain in a coupled binding and folding reaction. Classic physical organic chemistry approaches together with structural biology have shed light on mechanistic aspects of such reactions. Further insight into general principles may be obtained by interpreting the results through an evolutionary lens. This review attempts to provide an overview of how the analysis of binding and folding reactions can benefit from an evolutionary approach, and is aimed at protein scientists without a background in evolution. Evolution constantly reshapes existing proteins by sampling more or less fit variants. Most new variants are weeded out as generations and new species come and go over hundreds to hundreds of millions of years. The huge ongoing genome sequencing efforts have provided us with a snapshot of existing adapted fit-for-purpose protein homologs in thousands of different organisms. Comparison of present-day orthologs and paralogs highlights general principles of the evolution of coupled binding and folding reactions and demonstrates a great potential for evolution to operate on disordered regions and modulate affinity and specificity of the interactions.
Affiliation(s)
- Per Jemth
  - Department of Medical Biochemistry and Microbiology, Uppsala University, BMC, Box 582, SE-75123 Uppsala, Sweden.
6
LeBlanc C, Stefani J, Soriano M, Lam A, Zintel MA, Kotha SR, Chase E, Pimentel-Solorio G, Vunnum A, Flug K, Fultineer A, Hummel N, Staller MV. Conservation of function without conservation of amino acid sequence in intrinsically disordered transcriptional activation domains. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.12.03.626510. [PMID: 39677729 PMCID: PMC11642888 DOI: 10.1101/2024.12.03.626510] [Indexed: 12/17/2024]
Abstract
Protein function is canonically believed to be more conserved than amino acid sequence, but this idea is only well supported in folded domains, where highly diverged sequences can fold into equivalent 3D structures. In contrast, intrinsically disordered protein regions (IDRs) do not fold into a stable 3D structure; thus, it remains unknown when and how function is conserved for IDRs that experience rapid amino acid sequence divergence. As a model system for studying the evolution of IDRs, we examined transcriptional activation domains, the regions of transcription factors that bind to coactivator complexes. We systematically identified activation domains on 502 orthologs of the transcriptional activator Gcn4 spanning 600 MY of fungal evolution. We find that the central activation domain shows strong conservation of function without conservation of sequence. This conservation of function without conservation of sequence is facilitated by evolutionary turnover (gain and loss) of key acidic and aromatic residues, the positions most important for function. This high sequence flexibility of functional orthologs mirrors the physical flexibility of the activation domain coactivator interaction interface, suggesting that physical flexibility enables evolutionary plasticity. We propose that turnover of short functional elements, sometimes individual amino acids, is a general mechanism for conservation of function without conservation of sequence during IDR evolution.
Affiliation(s)
- Claire LeBlanc
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, 94720
  - Center for Computational Biology, University of California Berkeley, Berkeley, 94720
- Jordan Stefani
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, 94720
  - Center for Computational Biology, University of California Berkeley, Berkeley, 94720
- Melvin Soriano
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, 94720
  - Center for Computational Biology, University of California Berkeley, Berkeley, 94720
- Angelica Lam
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, 94720
  - Center for Computational Biology, University of California Berkeley, Berkeley, 94720
- Marissa A. Zintel
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, 94720
- Sanjana R. Kotha
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, 94720
  - Center for Computational Biology, University of California Berkeley, Berkeley, 94720
- Emily Chase
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, 94720
  - Center for Computational Biology, University of California Berkeley, Berkeley, 94720
- Giovani Pimentel-Solorio
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, 94720
  - Center for Computational Biology, University of California Berkeley, Berkeley, 94720
- Aditya Vunnum
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, 94720
- Katherine Flug
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, 94720
- Aaron Fultineer
  - Department of Physics, University of California Berkeley, Berkeley, 94720
- Niklas Hummel
  - Department of Biology, Technische Universität Darmstadt, Darmstadt, Germany
- Max V. Staller
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, 94720
  - Center for Computational Biology, University of California Berkeley, Berkeley, 94720
  - Chan Zuckerberg Biohub–San Francisco, San Francisco, CA 94158
7
King MR, Ruff KM, Pappu RV. Emergent microenvironments of nucleoli. Nucleus 2024; 15:2319957. [PMID: 38443761 PMCID: PMC10936679 DOI: 10.1080/19491034.2024.2319957] [Received: 09/04/2023] [Accepted: 02/13/2024] [Indexed: 03/07/2024] Open
Abstract
In higher eukaryotes, the nucleolus harbors at least three sub-phases that facilitate multiple functionalities including ribosome biogenesis. The three prominent coexisting sub-phases are the fibrillar center (FC), the dense fibrillar component (DFC), and the granular component (GC). Here, we review recent efforts in profiling sub-phase compositions that shed light on the types of physicochemical properties that emerge from compositional biases and territorial organization of specific types of macromolecules. We highlight roles played by molecular grammars, which refer to protein sequence features including the substrate binding domains, the sequence features of intrinsically disordered regions, and the multivalence of these distinct types of domains/regions. We introduce the concept of a barcode of emergent physicochemical properties of nucleoli. Although our knowledge of the full barcode remains incomplete, we hope that the concept prompts investigations into undiscovered emergent properties and engenders an appreciation for how and why unique microenvironments control biochemical reactions.
Affiliation(s)
- Matthew R. King
  - Department of Biomedical Engineering and Center for Biomolecular Condensates, Washington University in St. Louis, Campus, MO, USA
- Kiersten M. Ruff
  - Department of Biomedical Engineering and Center for Biomolecular Condensates, Washington University in St. Louis, Campus, MO, USA
- Rohit V. Pappu
  - Department of Biomedical Engineering and Center for Biomolecular Condensates, Washington University in St. Louis, Campus, MO, USA
8
Sun Y, Hsieh T, Lin C, Shao W, Lin Y, Huang J. A Few Charged Residues in Galectin-3's Folded and Disordered Regions Regulate Phase Separation. ADVANCED SCIENCE (WEINHEIM, BADEN-WURTTEMBERG, GERMANY) 2024; 11:e2402570. [PMID: 39248370 PMCID: PMC11538691 DOI: 10.1002/advs.202402570] [Received: 03/11/2024] [Revised: 07/25/2024] [Indexed: 09/10/2024]
Abstract
Proteins with intrinsically disordered regions (IDRs) often undergo phase separation to control their functions spatiotemporally. Changing the pH alters the protonation levels of charged sidechains, which in turn affects the attractive or repulsive force for phase separation. In a cell, the rupture of membrane-bound compartments, such as lysosomes, creates an abrupt change in pH. However, how proteins' phase separation reacts to different pH environments remains largely unexplored. Here, using extensive mutagenesis, NMR spectroscopy, and biophysical techniques, it is shown that the assembly of galectin-3, a widely studied lysosomal damage marker, is driven by cation-π interactions between positively charged residues in its folded domain and aromatic residues in the IDR, in addition to π-π interactions between IDRs. It is also found that the sole two negatively charged residues in its IDR sense pH changes for tuning the condensation tendency. Also, these two residues may prevent this prion-like IDR domain from forming rapid and extensive aggregates. These results demonstrate how cation-π, π-π, and electrostatic interactions can regulate protein condensation between disordered and structured domains and highlight the importance of sparse negatively charged residues in prion-like IDRs.
Affiliation(s)
- Yung‐Chen Sun
  - Institute of Biochemistry and Molecular Biology, National Yang Ming Chiao Tung University, No. 155, Sec. 2, Linong St., Taipei 112304, Taiwan
  - Taiwan International Graduate Program in Molecular Medicine, National Yang Ming Chiao Tung University and Academia Sinica, Taipei, Taiwan
- Tsung‐Lun Hsieh
  - Institute of Biochemistry and Molecular Biology, National Yang Ming Chiao Tung University, No. 155, Sec. 2, Linong St., Taipei 112304, Taiwan
- Chia‐I Lin
  - Institute of Biochemistry and Molecular Biology, National Yang Ming Chiao Tung University, No. 155, Sec. 2, Linong St., Taipei 112304, Taiwan
- Wan‐Yu Shao
  - Department of Life Sciences and Institute of Genome Sciences, National Yang Ming Chiao Tung University, No. 155, Sec. 2, Linong St., Taipei 112304, Taiwan
- Yu‐Hao Lin
  - Institute of Biochemistry and Molecular Biology, National Yang Ming Chiao Tung University, No. 155, Sec. 2, Linong St., Taipei 112304, Taiwan
  - Taiwan International Graduate Program in Molecular Medicine, National Yang Ming Chiao Tung University and Academia Sinica, Taipei, Taiwan
- Jie‐rong Huang
  - Institute of Biochemistry and Molecular Biology, National Yang Ming Chiao Tung University, No. 155, Sec. 2, Linong St., Taipei 112304, Taiwan
  - Department of Life Sciences and Institute of Genome Sciences, National Yang Ming Chiao Tung University, No. 155, Sec. 2, Linong St., Taipei 112304, Taiwan
  - Institute of Biomedical Informatics, National Yang Ming Chiao Tung University, No. 155, Sec. 2, Linong St., Taipei 112304, Taiwan
9
Dominique C, Maiga NK, Méndez-Godoy A, Pillet B, Hamze H, Léger-Silvestre I, Henry Y, Marchand V, Gomes Neto V, Dez C, Motorin Y, Kressler D, Gadal O, Henras AK, Albert B. The dual life of disordered lysine-rich domains of snoRNPs in rRNA modification and nucleolar compaction. Nat Commun 2024; 15:9415. [PMID: 39482307 PMCID: PMC11528048 DOI: 10.1038/s41467-024-53805-1] [Received: 01/12/2024] [Accepted: 10/22/2024] [Indexed: 11/03/2024] Open
Abstract
Intrinsically disordered regions (IDRs) are highly enriched in the nucleolar proteome but their physiological role in ribosome assembly remains poorly understood. Our study reveals the functional plasticity of the extremely abundant lysine-rich IDRs of small nucleolar ribonucleoprotein particles (snoRNPs) from protists to mammalian cells. We show in Saccharomyces cerevisiae that the electrostatic properties of this lysine-rich IDR, the KKE/D domain, promote snoRNP accumulation in the vicinity of nascent rRNAs, facilitating their modification. Under stress conditions reducing the rate of ribosome assembly, they are essential for nucleolar compaction and sequestration of key early-acting ribosome biogenesis factors, including RNA polymerase I, owing to their self-interaction capacity in a latent, non-rRNA-associated state. We propose that such functional plasticity of these lysine-rich IDRs may represent an ancestral eukaryotic regulatory mechanism, explaining how nucleolar morphology is continuously adapted to rRNA production levels.
Affiliation(s)
- Carine Dominique
  - Molecular, Cellular and Developmental (MCD) Unit, Centre for Integrative Biology (CBI), CNRS, University of Toulouse, UPS, Toulouse, France
- Nana Kadidia Maiga
  - Molecular, Cellular and Developmental (MCD) Unit, Centre for Integrative Biology (CBI), CNRS, University of Toulouse, UPS, Toulouse, France
- Benjamin Pillet
  - Department of Biology, University of Fribourg, Fribourg, Switzerland
- Hussein Hamze
  - Molecular, Cellular and Developmental (MCD) Unit, Centre for Integrative Biology (CBI), CNRS, University of Toulouse, UPS, Toulouse, France
- Isabelle Léger-Silvestre
  - Molecular, Cellular and Developmental (MCD) Unit, Centre for Integrative Biology (CBI), CNRS, University of Toulouse, UPS, Toulouse, France
- Yves Henry
  - Molecular, Cellular and Developmental (MCD) Unit, Centre for Integrative Biology (CBI), CNRS, University of Toulouse, UPS, Toulouse, France
- Virginie Marchand
  - CNRS-Université de Lorraine, UAR2008 IBSLor/UMR7365 IMoPA, Nancy, France
- Valdir Gomes Neto
  - Department of Biochemistry, Institute of Chemistry, University of São Paulo, São Paulo, Brazil
- Christophe Dez
  - Molecular, Cellular and Developmental (MCD) Unit, Centre for Integrative Biology (CBI), CNRS, University of Toulouse, UPS, Toulouse, France
- Yuri Motorin
  - CNRS-Université de Lorraine, UAR2008 IBSLor/UMR7365 IMoPA, Nancy, France
- Dieter Kressler
  - Department of Biology, University of Fribourg, Fribourg, Switzerland.
- Olivier Gadal
  - Molecular, Cellular and Developmental (MCD) Unit, Centre for Integrative Biology (CBI), CNRS, University of Toulouse, UPS, Toulouse, France.
- Anthony K Henras
  - Molecular, Cellular and Developmental (MCD) Unit, Centre for Integrative Biology (CBI), CNRS, University of Toulouse, UPS, Toulouse, France.
- Benjamin Albert
  - Molecular, Cellular and Developmental (MCD) Unit, Centre for Integrative Biology (CBI), CNRS, University of Toulouse, UPS, Toulouse, France.
10
Hoffmann C, Ruff KM, Edu I, Kyung Shinn M, Tromm J, King M, Pant A, Ausserwoeger H, Morgan J, Knowles T, Pappu RV, Milovanovic D. Synapsin condensation is governed by sequence-encoded molecular grammars. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.08.03.606464. [PMID: 39131319 PMCID: PMC11312526 DOI: 10.1101/2024.08.03.606464] [Indexed: 08/13/2024]
Abstract
Multiple biomolecular condensates coexist at the pre- and post-synapse to enable vesicle dynamics and controlled neurotransmitter release in the brain. In pre-synapses, intrinsically disordered regions (IDRs) of synaptic proteins are drivers of condensation that enable clustering of synaptic vesicles (SVs). Using computational analysis, we show that the IDRs of SV proteins feature evolutionarily conserved non-random compositional biases and sequence patterns. Synapsin-1 is essential for condensation of SVs, and its C-terminal IDR has been shown to be a key driver of condensation. Focusing on this IDR, we dissected the contributions of two conserved features, namely the segregation of polar and proline residues along the linear sequence, and the compositional preference for arginine over lysine. Scrambling the blocks of polar and proline residues weakens the driving forces for forming micron-scale condensates. However, the extent of clustering in subsaturated solutions remains equivalent to that of the wild-type synapsin-1. In contrast, substituting arginine with lysine significantly weakens both the driving forces for condensation and the extent of clustering in subsaturated solutions. Co-expression of the scrambled variant of synapsin-1 with synaptophysin results in a gain-of-function phenotype in cells, whereas arginine-to-lysine substitutions eliminate condensation. We report an emergent consequence of synapsin-1 condensation, which is the generation of interphase pH gradients realized via differential partitioning of protons between coexisting phases. This pH gradient is likely to be directly relevant for vesicular ATPase functions and the loading of neurotransmitters. Our study highlights how conserved IDR grammars serve as drivers of synapsin-1 condensation.
11
Chow CFW, Ghosh S, Hadarovich A, Toth-Petroczy A. SHARK enables sensitive detection of evolutionary homologs and functional analogs in unalignable and disordered sequences. Proc Natl Acad Sci U S A 2024; 121:e2401622121. [PMID: 39383002 PMCID: PMC11494347 DOI: 10.1073/pnas.2401622121] [Received: 01/24/2024] [Accepted: 08/30/2024] [Indexed: 10/11/2024] Open
Abstract
Intrinsically disordered regions (IDRs) are structurally flexible protein segments with regulatory functions in multiple contexts, such as in the assembly of biomolecular condensates. Since IDRs undergo more rapid evolution than ordered regions, identifying homology of such poorly conserved regions remains challenging for state-of-the-art alignment-based methods that rely on position-specific conservation of residues. Thus, systematic functional annotation and evolutionary analysis of IDRs have been limited, despite them comprising ~21% of proteins. To accurately assess homology between unalignable sequences, we developed an alignment-free sequence comparison algorithm, SHARK (Similarity/Homology Assessment by Relating K-mers). We trained SHARK-dive, a machine learning homology classifier, which achieved superior performance to standard alignment-based approaches in assessing evolutionary homology in unalignable sequences. Furthermore, it correctly identified dissimilar but functionally analogous IDRs in IDR-replacement experiments reported in the literature, whereas alignment-based tools were incapable of detecting such functional relationships. SHARK-dive not only predicts functionally similar IDRs at a proteome-wide scale but also identifies cryptic sequence properties and motifs that drive remote homology and analogy, thereby providing interpretable and experimentally verifiable hypotheses of the sequence determinants that underlie such relationships. SHARK-dive acts as an alternative to alignment to facilitate systematic analysis and functional annotation of the unalignable protein universe.
Affiliation(s)
- Chi Fung Willis Chow
  - Max Planck Institute of Molecular Cell Biology and Genetics, Dresden 01307, Germany
  - Center for Systems Biology Dresden, Dresden 01307, Germany
  - Cluster of Excellence Physics of Life, Technische Universität Dresden, Dresden 01062, Germany
- Soumyadeep Ghosh
  - Max Planck Institute of Molecular Cell Biology and Genetics, Dresden 01307, Germany
  - Center for Systems Biology Dresden, Dresden 01307, Germany
- Anna Hadarovich
  - Max Planck Institute of Molecular Cell Biology and Genetics, Dresden 01307, Germany
  - Center for Systems Biology Dresden, Dresden 01307, Germany
- Agnes Toth-Petroczy
  - Max Planck Institute of Molecular Cell Biology and Genetics, Dresden 01307, Germany
  - Center for Systems Biology Dresden, Dresden 01307, Germany
  - Cluster of Excellence Physics of Life, Technische Universität Dresden, Dresden 01062, Germany
12
Pal T, Wessén J, Das S, Chan HS. Differential Effects of Sequence-Local versus Nonlocal Charge Patterns on Phase Separation and Conformational Dimensions of Polyampholytes as Model Intrinsically Disordered Proteins. J Phys Chem Lett 2024; 15:8248-8256. [PMID: 39105804 DOI: 10.1021/acs.jpclett.4c01973] [Indexed: 08/07/2024]
Abstract
Conformational properties of intrinsically disordered proteins (IDPs) are governed by a sequence-ensemble relationship. To differentiate the impact of sequence-local versus sequence-nonlocal features of an IDP's charge pattern on its conformational dimensions and its phase-separation propensity, the charge "blockiness" κ and the nonlocality-weighted sequence charge decoration (SCD) parameters are compared for their correlations with isolated-chain radii of gyration (Rgs) and upper critical solution temperatures (UCSTs) of polyampholytes modeled by random phase approximation, field-theoretic simulation, and coarse-grained molecular dynamics. SCD is superior to κ in predicting Rg because SCD accounts for effects of contact order, i.e., nonlocality, on dimensions of isolated chains. In contrast, κ and SCD are comparably good, though nonideal, predictors of UCST because frequencies of interchain contacts in the multiple-chain condensed phase are less sensitive to sequence positions than frequencies of intrachain contacts of an isolated chain, as reflected by κ correlating better with condensed-phase interaction energy than SCD.
Affiliation(s)
- Tanmoy Pal
  - Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada
- Jonas Wessén
  - Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada
- Suman Das
  - Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada
  - Department of Chemistry, Gandhi Institute of Technology and Management, Visakhapatnam, Andhra Pradesh 530045, India
- Hue Sun Chan
  - Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada
13
Middendorf L, Ravi Iyengar B, Eicholt LA. Sequence, Structure, and Functional Space of Drosophila De Novo Proteins. Genome Biol Evol 2024; 16:evae176. [PMID: 39212966 PMCID: PMC11363682 DOI: 10.1093/gbe/evae176] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/29/2024] [Indexed: 09/04/2024]
Abstract
During de novo emergence, new protein-coding genes emerge from previously nongenic sequences. The de novo proteins they encode are dissimilar in composition and predicted biochemical properties to conserved proteins. However, functional de novo proteins do exist. Both identification of functional de novo proteins and their structural characterization are experimentally laborious. To identify functional and structured de novo proteins in silico, we applied recently developed machine-learning-based tools and found that most de novo proteins are indeed different from conserved proteins in both their structure and sequence. However, some de novo proteins are predicted to adopt known protein folds, participate in cellular reactions, and form biomolecular condensates. Apart from broadening our understanding of de novo protein evolution, our study also provides a large set of testable hypotheses for focused experimental studies on structure and function of de novo proteins in Drosophila.
Affiliation(s)
- Lasse Middendorf
  - Institute for Evolution and Biodiversity, University of Muenster, Huefferstrasse 1, 48149 Muenster, Germany
- Bharat Ravi Iyengar
  - Institute for Evolution and Biodiversity, University of Muenster, Huefferstrasse 1, 48149 Muenster, Germany
- Lars A Eicholt
  - Institute for Evolution and Biodiversity, University of Muenster, Huefferstrasse 1, 48149 Muenster, Germany
14
Halpin JC, Keating AE. PairK: Pairwise k-mer alignment for quantifying protein motif conservation in disordered regions. bioRxiv 2024:2024.07.23.604860. [PMID: 39091826 PMCID: PMC11291154 DOI: 10.1101/2024.07.23.604860] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/04/2024]
Abstract
Protein-protein interactions are often mediated by a modular peptide recognition domain binding to a short linear motif (SLiM) in the disordered region of another protein. The ability to predict domain-SLiM interactions would allow researchers to map protein interaction networks, predict the effects of perturbations to those networks, and develop biologically meaningful hypotheses. Unfortunately, sequence database searches for SLiMs generally yield mostly biologically irrelevant motif matches or false positives. To improve the prediction of novel SLiM interactions, researchers employ filters to discriminate between biologically relevant and improbable motif matches. One promising criterion for identifying biologically relevant SLiMs is the sequence conservation of the motif, exploiting the fact that functional motifs are more likely to be conserved than spurious motif matches. However, the difficulty of aligning disordered regions has significantly hampered the utility of this approach. We present PairK (pairwise k-mer alignment), an MSA-free method to quantify motif conservation in disordered regions. PairK outperforms both standard MSA-based conservation scores and a modern LLM-based conservation score predictor on the task of identifying biologically important motif instances. PairK can quantify conservation over wider phylogenetic distances than MSAs, indicating that SLiMs may be more conserved than is implied by MSA-based metrics. PairK is available as open-source code at https://github.com/jacksonh1/pairk.
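The abstract does not spell out how k-mers are scored, so the following Python sketch only illustrates the general idea of gapless, alignment-free k-mer matching: slide a query k-mer along each ortholog's disordered region, keep the best-scoring window, and average the normalized best scores. Identity scoring, the function names, and the example sequences are assumptions for illustration; this is not the API of the pairk package linked above.

```python
from typing import Optional, Sequence, Tuple


def best_kmer_match(query_kmer: str, ortholog: str) -> Optional[Tuple[int, int, str]]:
    """Return (score, position, kmer) of the best gapless window in `ortholog`.

    Score is the number of identical positions (an assumed, simplistic scoring).
    Returns None when the ortholog is shorter than the query k-mer.
    """
    k = len(query_kmer)
    if k == 0 or len(ortholog) < k:
        return None
    best: Optional[Tuple[int, int, str]] = None
    for pos in range(len(ortholog) - k + 1):
        window = ortholog[pos:pos + k]
        score = sum(a == b for a, b in zip(query_kmer, window))
        if best is None or score > best[0]:  # ties keep the earliest window
            best = (score, pos, window)
    return best


def kmer_conservation(query_kmer: str, orthologs: Sequence[str]) -> float:
    """Mean of (best score / k) over orthologs that are long enough to score."""
    k = len(query_kmer)
    matches = [best_kmer_match(query_kmer, seq) for seq in orthologs]
    scores = [m[0] / k for m in matches if m is not None]
    if not scores:
        raise ValueError("no ortholog is long enough to score")
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical motif and ortholog fragments, for illustration only.
    print(best_kmer_match("PPLP", "GGPALPGG"))
    print(kmer_conservation("PPLP", ["AAAPPLPGG", "GGPALPGG"]))
```

Because each ortholog is searched independently, no multiple sequence alignment is needed, which is the property the abstract emphasizes; a realistic implementation would likely replace identity counts with a substitution-matrix or embedding-based score.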
Affiliation(s)
- Jackson C. Halpin
  - MIT Department of Biology, 77 Massachusetts Ave., Cambridge, MA 02139
- Amy E. Keating
  - MIT Department of Biology, 77 Massachusetts Ave., Cambridge, MA 02139
  - MIT Department of Biological Engineering, 77 Massachusetts Ave., Cambridge, MA 02139
  - Koch Institute for Integrative Cancer Research, 77 Massachusetts Ave., Cambridge, MA 02139
15
Pastic A, Nosella ML, Kochhar A, Liu ZH, Forman-Kay JD, D'Amours D. Chromosome compaction is triggered by an autonomous DNA-binding module within condensin. Cell Rep 2024; 43:114419. [PMID: 38985672 DOI: 10.1016/j.celrep.2024.114419] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2023] [Revised: 04/16/2024] [Accepted: 06/14/2024] [Indexed: 07/12/2024]
Abstract
The compaction of chromatin into mitotic chromosomes is essential for faithful transmission of the genome during cell division. In eukaryotes, chromosome morphogenesis is regulated by the condensin complex, though the exact mechanism used to target condensin to chromatin and initiate condensation is not understood. Here, we reveal that condensin contains an intrinsically disordered region (IDR) that modulates its association with chromatin in early mitosis and exhibits phase separation. We describe DNA-binding motifs within the IDR that, upon deletion, inflict striking defects in chromosome condensation and segregation, ill-timed condensin turnover on chromatin, and cell death. Importantly, we demonstrate that the condensin IDR can impart cell cycle regulatory functions when transferred to other subunits within the complex, indicating its autonomous nature. Collectively, our study unveils the molecular basis for the initiation of chromosome condensation in early mitosis and how this process ultimately promotes genomic stability and faultless cell division.
Affiliation(s)
- Alyssa Pastic
  - Ottawa Institute of Systems Biology, Department of Cellular and Molecular Medicine, University of Ottawa, Ottawa, ON K1H 8M5, Canada
- Michael L Nosella
  - Molecular Medicine Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada; Department of Biochemistry, University of Toronto, Toronto, ON M5S 1A8, Canada
- Annahat Kochhar
  - Ottawa Institute of Systems Biology, Department of Cellular and Molecular Medicine, University of Ottawa, Ottawa, ON K1H 8M5, Canada
- Zi Hao Liu
  - Molecular Medicine Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada; Department of Biochemistry, University of Toronto, Toronto, ON M5S 1A8, Canada
- Julie D Forman-Kay
  - Molecular Medicine Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada; Department of Biochemistry, University of Toronto, Toronto, ON M5S 1A8, Canada
- Damien D'Amours
  - Ottawa Institute of Systems Biology, Department of Cellular and Molecular Medicine, University of Ottawa, Ottawa, ON K1H 8M5, Canada
16
Xiao YX, Lee SY, Aguilera-Uribe M, Samson R, Au A, Khanna Y, Liu Z, Cheng R, Aulakh K, Wei J, Farias AG, Reilly T, Birkadze S, Habsid A, Brown KR, Chan K, Mero P, Huang JQ, Billmann M, Rahman M, Myers C, Andrews BJ, Youn JY, Yip CM, Rotin D, Derry WB, Forman-Kay JD, Moses AM, Pritišanac I, Gingras AC, Moffat J. The TSC22D, WNK, and NRBP gene families exhibit functional buffering and evolved with Metazoa for cell volume regulation. Cell Rep 2024; 43:114417. [PMID: 38980795 DOI: 10.1016/j.celrep.2024.114417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2024] [Revised: 05/08/2024] [Accepted: 06/13/2024] [Indexed: 07/11/2024]
Abstract
The ability to sense and respond to osmotic fluctuations is critical for the maintenance of cellular integrity. We used gene co-essentiality analysis to identify an unappreciated relationship between TSC22D2, WNK1, and NRBP1 in regulating cell volume homeostasis. All of these genes have paralogs and are functionally buffered for osmo-sensing and cell volume control. Within seconds of hyperosmotic stress, TSC22D, WNK, and NRBP family members physically associate into biomolecular condensates, a process that is dependent on intrinsically disordered regions (IDRs). A close examination of these protein families across metazoans revealed that TSC22D genes evolved alongside a domain in NRBPs that specifically binds to TSC22D proteins, which we have termed NbrT (NRBP binding region with TSC22D), and this co-evolution is accompanied by rapid IDR length expansion in WNK-family kinases. Our study reveals that TSC22D, WNK, and NRBP genes evolved in metazoans to co-regulate rapid cell volume changes in response to osmolarity.
Affiliation(s)
- Yu-Xi Xiao
  - Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Seon Yong Lee
  - Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
- Magali Aguilera-Uribe
  - Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Reuben Samson
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada; The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Sinai Health, Toronto, ON, Canada
- Aaron Au
  - Institute for Biomedical Engineering, University of Toronto, Toronto, ON, Canada; Department of Cell and Systems Biology, University of Toronto, Toronto, ON, Canada; Donnelly Centre, University of Toronto, Toronto, ON, Canada
- Yukti Khanna
  - Otto-Loewi Research Center, Division of Medicinal Chemistry, Medical University of Graz, Neue Stiftingtalstraße 6, 8010, Graz, Austria
- Zetao Liu
  - Program in Cell Biology, The Hospital for Sick Children, Toronto, ON, Canada; Department of Biochemistry, University of Toronto, Toronto, ON, Canada
- Ran Cheng
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada; Program in Developmental and Stem Cell Biology, The Hospital for Sick Children, Toronto, ON, Canada
- Kamaldeep Aulakh
  - Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
- Jiarun Wei
  - Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
- Adrian Granda Farias
  - Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Taylor Reilly
  - Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Saba Birkadze
  - Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Andrea Habsid
  - Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
- Kevin R Brown
  - Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
- Katherine Chan
  - Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
- Patricia Mero
  - Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
- Jie Qi Huang
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada; Program in Molecular Medicine, The Hospital for Sick Children, Toronto, ON, Canada
- Maximilian Billmann
  - Institute of Human Genetics, School of Medicine and University Hospital Bonn, University of Bonn, 53127 Bonn, Germany
- Mahfuzur Rahman
  - Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN, USA
- Chad Myers
  - Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN, USA
- Brenda J Andrews
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada; Donnelly Centre, University of Toronto, Toronto, ON, Canada
- Ji-Young Youn
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada; Program in Molecular Medicine, The Hospital for Sick Children, Toronto, ON, Canada
- Christopher M Yip
  - Institute for Biomedical Engineering, University of Toronto, Toronto, ON, Canada; Donnelly Centre, University of Toronto, Toronto, ON, Canada
- Daniela Rotin
  - Program in Cell Biology, The Hospital for Sick Children, Toronto, ON, Canada; Department of Biochemistry, University of Toronto, Toronto, ON, Canada
- W Brent Derry
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada; Program in Developmental and Stem Cell Biology, The Hospital for Sick Children, Toronto, ON, Canada
- Julie D Forman-Kay
  - Department of Biochemistry, University of Toronto, Toronto, ON, Canada; Program in Molecular Medicine, The Hospital for Sick Children, Toronto, ON, Canada
- Alan M Moses
  - Department of Cell and Systems Biology, University of Toronto, Toronto, ON, Canada
- Iva Pritišanac
  - Otto-Loewi Research Center, Division of Medicinal Chemistry, Medical University of Graz, Neue Stiftingtalstraße 6, 8010, Graz, Austria
- Anne-Claude Gingras
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada; The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Sinai Health, Toronto, ON, Canada
- Jason Moffat
  - Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada; Institute for Biomedical Engineering, University of Toronto, Toronto, ON, Canada
17
Nguyen A, Zhao H, Myagmarsuren D, Srinivasan S, Wu D, Chen J, Piszczek G, Schuck P. Modulation of biophysical properties of nucleocapsid protein in the mutant spectrum of SARS-CoV-2. eLife 2024; 13:RP94836. [PMID: 38941236 PMCID: PMC11213569 DOI: 10.7554/elife.94836] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/30/2024]
Abstract
Genetic diversity is a hallmark of RNA viruses and the basis for their evolutionary success. Taking advantage of the uniquely large genomic database of SARS-CoV-2, we examine the impact of mutations across the spectrum of viable amino acid sequences on the biophysical phenotypes of the highly expressed and multifunctional nucleocapsid protein. We find variation in the physicochemical parameters of its extended intrinsically disordered regions (IDRs) sufficient to allow local plasticity, but also observe functional constraints that similarly occur in related coronaviruses. In biophysical experiments with several N-protein species carrying mutations associated with major variants, we find that point mutations in the IDRs can have nonlocal impact and modulate thermodynamic stability, secondary structure, protein oligomeric state, particle formation, and liquid-liquid phase separation. In the Omicron variant, distant mutations in different IDRs have compensatory effects in shifting a delicate balance of interactions controlling protein assembly properties, and include the creation of a new protein-protein interaction interface in the N-terminal IDR through the defining P13L mutation. A picture emerges where genetic diversity is accompanied by significant variation in biophysical characteristics of functional N-protein species, in particular in the IDRs.
Affiliation(s)
- Ai Nguyen
  - Laboratory of Dynamics of Macromolecular Assembly, National Institute of Biomedical Imaging and Bioengineering, National Institutes of Health, Bethesda, United States
- Huaying Zhao
  - Laboratory of Dynamics of Macromolecular Assembly, National Institute of Biomedical Imaging and Bioengineering, National Institutes of Health, Bethesda, United States
- Dulguun Myagmarsuren
  - Laboratory of Dynamics of Macromolecular Assembly, National Institute of Biomedical Imaging and Bioengineering, National Institutes of Health, Bethesda, United States
- Sanjana Srinivasan
  - Laboratory of Dynamics of Macromolecular Assembly, National Institute of Biomedical Imaging and Bioengineering, National Institutes of Health, Bethesda, United States
- Di Wu
  - Biophysics Core Facility, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, United States
- Jiji Chen
  - Advanced Imaging and Microscopy Resource, National Institute of Biomedical Imaging and Bioengineering, National Institutes of Health, Bethesda, United States
- Grzegorz Piszczek
  - Biophysics Core Facility, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, United States
- Peter Schuck
  - Laboratory of Dynamics of Macromolecular Assembly, National Institute of Biomedical Imaging and Bioengineering, National Institutes of Health, Bethesda, United States
18
McShea H, Weibel C, Wehbi S, Goodman P, James JE, Wheeler AL, Masel J. The effectiveness of selection in a species affects the direction of amino acid frequency evolution. bioRxiv 2024:2023.02.01.526552. [PMID: 38948853 PMCID: PMC11212923 DOI: 10.1101/2023.02.01.526552] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/02/2024]
Abstract
Nearly neutral theory predicts that species with higher effective population size (N_e) are better able to purge slightly deleterious mutations. We compare evolution in high-N_e vs. low-N_e vertebrates to reveal which amino acid frequencies are subject to subtle selective preferences. We take three complementary approaches, two measuring flux and one measuring outcomes. First, we fit non-stationary substitution models of amino acid flux using maximum likelihood, comparing the high-N_e clade of rodents and lagomorphs to its low-N_e sister clade of primates and colugos. Second, we compare evolutionary outcomes across a wider range of vertebrates, via correlations between amino acid frequencies and N_e. Third, we dissect the details of flux in human, chimpanzee, mouse, and rat, as scored by parsimony - this also enables comparison to a historical paper. All three methods agree on which amino acids are preferred under more effective selection. Preferred amino acids tend to be smaller, less costly to synthesize, and to promote intrinsic structural disorder. Parsimony-induced bias in the historical study produces an apparent reduction in structural disorder, perhaps driven by slightly deleterious substitutions. Within highly exchangeable pairs of amino acids, arginine is strongly preferred over lysine, and valine over isoleucine, consistent with more effective selection preferring a marginally larger free energy of folding. These two preferences match differences between thermophiles and mesophilic relatives. These results reveal the biophysical consequences of mutation-selection-drift balance, and demonstrate the utility of nearly neutral theory for understanding protein evolution.
Affiliation(s)
- Hanon McShea
  - Department of Earth System Science, Stanford University
- Catherine Weibel
  - Department of Ecology & Evolutionary Biology, University of Arizona
  - Department of Applied Physics, Stanford University
- Sawsan Wehbi
  - Graduate Interdisciplinary Program in Genetics, University of Arizona
- Jennifer E James
  - Department of Ecology & Evolutionary Biology, University of Arizona
  - Department of Ecology and Genetics, Uppsala University
- Andrew L Wheeler
  - Graduate Interdisciplinary Program in Genetics, University of Arizona
- Joanna Masel
  - Department of Ecology & Evolutionary Biology, University of Arizona
19
Ginell GM, Emenecker RJ, Lotthammer JM, Usher ET, Holehouse AS. Direct prediction of intermolecular interactions driven by disordered regions. bioRxiv 2024:2024.06.03.597104. [PMID: 38895487 PMCID: PMC11185574 DOI: 10.1101/2024.06.03.597104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/21/2024]
Abstract
Intrinsically disordered regions (IDRs) are critical for a wide variety of cellular functions, many of which involve interactions with partner proteins. Molecular recognition is typically considered through the lens of sequence-specific binding events. However, a growing body of work has shown that IDRs often interact with partners in a manner that does not depend on the precise order of the amino acids and is instead driven by complementary chemical interactions leading to disordered bound-state complexes. Despite this emerging paradigm, we lack tools to describe, quantify, predict, and interpret these types of structurally heterogeneous interactions from the underlying amino acid sequences. Here, we repurpose the chemical physics developed originally for molecular simulations to develop an approach for predicting intermolecular interactions between IDRs and partner proteins. Our approach enables the direct prediction of phase diagrams, the identification of chemically-specific interaction hotspots on IDRs, and a route to develop and test mechanistic hypotheses regarding IDR function in the context of molecular recognition. We use our approach to examine a range of systems and questions to highlight its versatility and applicability.
Affiliation(s)
- Garrett M. Ginell
  - Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO
  - Center for Biomolecular Condensates (CBC), Washington University in St. Louis, St. Louis, MO
- Ryan J. Emenecker
  - Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO
  - Center for Biomolecular Condensates (CBC), Washington University in St. Louis, St. Louis, MO
- Jeffrey M. Lotthammer
  - Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO
  - Center for Biomolecular Condensates (CBC), Washington University in St. Louis, St. Louis, MO
- Emery T. Usher
  - Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO
  - Center for Biomolecular Condensates (CBC), Washington University in St. Louis, St. Louis, MO
- Alex S. Holehouse
  - Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO
  - Center for Biomolecular Condensates (CBC), Washington University in St. Louis, St. Louis, MO
20
Alston JJ, Soranno A, Holehouse AS. Conserved molecular recognition by an intrinsically disordered region in the absence of sequence conservation. Research Square 2024:rs.3.rs-4477977. [PMID: 38883712 PMCID: PMC11177979 DOI: 10.21203/rs.3.rs-4477977/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/18/2024]
Abstract
Intrinsically disordered regions (IDRs) are critical for cellular function yet often appear to lack sequence conservation when assessed by multiple sequence alignments. This raises the question of if and how function can be encoded and preserved in these regions despite massive sequence variation. To address this question, we have applied coarse-grained molecular dynamics simulations to investigate non-specific RNA binding of coronavirus nucleocapsid proteins. Coronavirus nucleocapsid proteins consist of multiple interspersed disordered and folded domains that bind RNA. Here, we focus on the first two domains of coronavirus nucleocapsid proteins: the disordered N-terminal domain (NTD) and the folded RNA binding domain (RBD). While the NTD is highly variable across evolution, the RBD is structurally conserved. This combination makes the NTD-RBD a convenient model system for exploring the interplay between an IDR adjacent to a folded domain and how changes in IDR sequence can influence molecular recognition of a partner. Our results reveal a surprising degree of sequence-specificity encoded by both the composition and the precise order of the amino acids in the NTD. The presence of an NTD can - depending on the sequence - either suppress or enhance RNA binding. Despite this sensitivity, large-scale variation in NTD sequences is possible while certain sequence features are retained. Consequently, a conformationally-conserved dynamic and disordered RNA:protein complex is found across nucleocapsid protein orthologs despite large-scale changes in both NTD sequence and RBD surface chemistry. Taken together, these insights shed light on the ability of disordered regions to preserve functional characteristics despite their sequence variability.
Affiliation(s)
- Jhullian J. Alston
  - Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO 63110, USA
  - Center for Biomolecular Condensates, Washington University in St. Louis, St. Louis, MO, USA
  - Present address: Program in Cellular and Molecular Medicine (PCMM), Boston Children’s Hospital, Boston, MA, USA
- Andrea Soranno
  - Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO 63110, USA
  - Center for Biomolecular Condensates, Washington University in St. Louis, St. Louis, MO, USA
- Alex S. Holehouse
  - Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO 63110, USA
  - Center for Biomolecular Condensates, Washington University in St. Louis, St. Louis, MO, USA
21
Kilgore HR, Chinn I, Mikhael PG, Mitnikov I, Van Dongen C, Zylberberg G, Afeyan L, Banani S, Wilson-Hawken S, Lee TI, Barzilay R, Young RA. Protein codes promote selective subcellular compartmentalization. bioRxiv 2024:2024.04.15.589616. [PMID: 38659952 PMCID: PMC11042338 DOI: 10.1101/2024.04.15.589616] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/26/2024]
Abstract
Cells have evolved mechanisms to distribute ~10 billion protein molecules to subcellular compartments where diverse proteins involved in shared functions must efficiently assemble. Here, we demonstrate that proteins with shared functions share amino acid sequence codes that guide them to compartment destinations. A protein language model, ProtGPS, was developed that predicts with high performance the compartment localization of human proteins excluded from the training set. ProtGPS successfully guided generation of novel protein sequences that selectively assemble in targeted subcellular compartments. ProtGPS also identified pathological mutations that change this code and lead to altered subcellular localization of proteins. Our results indicate that protein sequences contain not only a folding code, but also a previously unrecognized code governing their distribution in specific cellular compartments.
Affiliation(s)
- Henry R. Kilgore
  - Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
- Itamar Chinn
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Peter G. Mikhael
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Ilan Mitnikov
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Guy Zylberberg
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Lena Afeyan
  - Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Salman Banani
  - Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
  - Department of Pathology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA
- Susana Wilson-Hawken
  - Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
  - Program of Computational & Systems Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Tong Ihn Lee
  - Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
- Regina Barzilay
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Richard A. Young
  - Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
22
Liang Q, Peng N, Xie Y, Kumar N, Gao W, Miao Y. MolPhase, an advanced prediction algorithm for protein phase separation. EMBO J 2024; 43:1898-1918. [PMID: 38565952 PMCID: PMC11065880 DOI: 10.1038/s44318-024-00090-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Revised: 02/27/2024] [Accepted: 03/14/2024] [Indexed: 04/04/2024]
Abstract
We introduce MolPhase, an advanced algorithm for predicting protein phase separation (PS) behavior that improves accuracy and reliability by utilizing diverse physicochemical features and extensive experimental datasets. MolPhase applies a user-friendly interface to compare distinct biophysical features side-by-side along protein sequences. By additional comparison with structural predictions, MolPhase enables efficient predictions of new phase-separating proteins and guides hypothesis generation and experimental design. Key contributing factors underlying MolPhase include electrostatic pi-interactions, disorder, and prion-like domains. As an example, MolPhase finds that phytobacterial type III effectors (T3Es) are highly prone to homotypic PS, which was experimentally validated in vitro biochemically and in vivo in plants, mimicking their injection and accumulation in the host during microbial infection. The physicochemical characteristics of T3Es dictate their patterns of association for multivalent interactions, influencing the material properties of phase-separating droplets based on the surrounding microenvironment in vivo or in vitro. Robust integration of MolPhase's effective prediction and experimental validation exhibits the potential to evaluate and explore how biomolecule PS functions in biological systems.
Affiliation(s)
- Qiyu Liang
  - School of Physical and Mathematical Sciences, Nanyang Technological University, 637371, Singapore, Singapore
  - School of Biological Sciences, Nanyang Technological University, 637551, Singapore, Singapore
- Nana Peng
  - School of Biological Sciences, Nanyang Technological University, 637551, Singapore, Singapore
- Yi Xie
  - School of Biological Sciences, Nanyang Technological University, 637551, Singapore, Singapore
- Nivedita Kumar
  - School of Biological Sciences, Nanyang Technological University, 637551, Singapore, Singapore
- Weibo Gao
  - School of Physical and Mathematical Sciences, Nanyang Technological University, 637371, Singapore, Singapore
- Yansong Miao
  - School of Biological Sciences, Nanyang Technological University, 637551, Singapore, Singapore
  - Institute for Digital Molecular Analytics and Science, Nanyang Technological University, 636921, Singapore, Singapore
|
23
|
King MR, Ruff KM, Lin AZ, Pant A, Farag M, Lalmansingh JM, Wu T, Fossat MJ, Ouyang W, Lew MD, Lundberg E, Vahey MD, Pappu RV. Macromolecular condensation organizes nucleolar sub-phases to set up a pH gradient. Cell 2024; 187:1889-1906.e24. [PMID: 38503281 PMCID: PMC11938373 DOI: 10.1016/j.cell.2024.02.029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Revised: 01/02/2024] [Accepted: 02/22/2024] [Indexed: 03/21/2024]
Abstract
Nucleoli are multicomponent condensates defined by coexisting sub-phases. We identified distinct intrinsically disordered regions (IDRs), including acidic (D/E) tracts and K-blocks interspersed by E-rich regions, as defining features of nucleolar proteins. We show that the localization preferences of nucleolar proteins are determined by their IDRs and the types of RNA or DNA binding domains they encompass. In vitro reconstitutions and studies in cells showed how condensation, which combines binding and complex coacervation of nucleolar components, contributes to nucleolar organization. D/E tracts of nucleolar proteins contribute to lowering the pH of co-condensates formed with nucleolar RNAs in vitro. In cells, this sets up a pH gradient between nucleoli and the nucleoplasm. By contrast, juxta-nucleolar bodies, which have different macromolecular compositions, featuring protein IDRs with very different charge profiles, have pH values that are equivalent to or higher than the nucleoplasm. Our findings show that distinct compositional specificities generate distinct physicochemical properties for condensates.
Affiliation(s)
- Matthew R King
- Department of Biomedical Engineering, James McKelvey School of Engineering, Washington University in St. Louis, St. Louis, MO, USA; Center for Biomolecular Condensates, James McKelvey School of Engineering, Washington University in St. Louis, St. Louis, MO, USA
| | - Kiersten M Ruff
- Department of Biomedical Engineering, James McKelvey School of Engineering, Washington University in St. Louis, St. Louis, MO, USA; Center for Biomolecular Condensates, James McKelvey School of Engineering, Washington University in St. Louis, St. Louis, MO, USA
| | - Andrew Z Lin
- Department of Biomedical Engineering, James McKelvey School of Engineering, Washington University in St. Louis, St. Louis, MO, USA; Center for Biomolecular Condensates, James McKelvey School of Engineering, Washington University in St. Louis, St. Louis, MO, USA
| | - Avnika Pant
- Department of Biomedical Engineering, James McKelvey School of Engineering, Washington University in St. Louis, St. Louis, MO, USA; Center for Biomolecular Condensates, James McKelvey School of Engineering, Washington University in St. Louis, St. Louis, MO, USA
| | - Mina Farag
- Department of Biomedical Engineering, James McKelvey School of Engineering, Washington University in St. Louis, St. Louis, MO, USA; Center for Biomolecular Condensates, James McKelvey School of Engineering, Washington University in St. Louis, St. Louis, MO, USA
| | - Jared M Lalmansingh
- Department of Biomedical Engineering, James McKelvey School of Engineering, Washington University in St. Louis, St. Louis, MO, USA; Center for Biomolecular Condensates, James McKelvey School of Engineering, Washington University in St. Louis, St. Louis, MO, USA
| | - Tingting Wu
- Center for Biomolecular Condensates, James McKelvey School of Engineering, Washington University in St. Louis, St. Louis, MO, USA; Department of Electrical and Systems Engineering, James F. McKelvey School of Engineering, Washington University in St. Louis, St. Louis, MO 63130, USA
| | - Martin J Fossat
- Department of Biomedical Engineering, James McKelvey School of Engineering, Washington University in St. Louis, St. Louis, MO, USA; Center for Biomolecular Condensates, James McKelvey School of Engineering, Washington University in St. Louis, St. Louis, MO, USA
| | - Wei Ouyang
- Department of Bioengineering, Schools of Engineering and Medicine, Stanford University, Stanford, CA, USA; Department of Pathology, School of Medicine, Stanford University, Stanford, CA, USA; Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH-Royal Institute of Technology, Stockholm, Sweden
| | - Matthew D Lew
- Center for Biomolecular Condensates, James McKelvey School of Engineering, Washington University in St. Louis, St. Louis, MO, USA; Department of Electrical and Systems Engineering, James F. McKelvey School of Engineering, Washington University in St. Louis, St. Louis, MO 63130, USA
| | - Emma Lundberg
- Department of Bioengineering, Schools of Engineering and Medicine, Stanford University, Stanford, CA, USA; Department of Pathology, School of Medicine, Stanford University, Stanford, CA, USA; Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH-Royal Institute of Technology, Stockholm, Sweden
| | - Michael D Vahey
- Department of Biomedical Engineering, James McKelvey School of Engineering, Washington University in St. Louis, St. Louis, MO, USA; Center for Biomolecular Condensates, James McKelvey School of Engineering, Washington University in St. Louis, St. Louis, MO, USA
| | - Rohit V Pappu
- Department of Biomedical Engineering, James McKelvey School of Engineering, Washington University in St. Louis, St. Louis, MO, USA; Center for Biomolecular Condensates, James McKelvey School of Engineering, Washington University in St. Louis, St. Louis, MO, USA.
|
24
|
Lemke EA, Babu MM, Kriwacki RW, Mittag T, Pappu RV, Wright PE, Forman-Kay JD. Intrinsic disorder: A term to define the specific physicochemical characteristic of protein conformational heterogeneity. Mol Cell 2024; 84:1188-1190. [PMID: 38579677 DOI: 10.1016/j.molcel.2024.02.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2024] [Revised: 02/19/2024] [Accepted: 02/23/2024] [Indexed: 04/07/2024]
Abstract
In his commentary in this issue of Molecular Cell,1 Struhl reasons that the term "intrinsically disordered regions" represents a vague and confusing concept for protein function. However, the term "intrinsically disordered" highlights the important physicochemical characteristic of conformational heterogeneity. Thus, "intrinsically disordered" is the counterpart to the term "folded," with neither term having specific functional implications.
Affiliation(s)
- Edward A Lemke
- Biocenter, Johannes Gutenberg University, Hanns-Dieter-Hüsch Weg 17, 55128 Mainz, Germany; Institute for Molecular Biology, Ackermannweg 4, 55128 Mainz, Germany.
| | - M Madan Babu
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA; Center of Excellence for Data Driven Discovery, Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA.
| | - Richard W Kriwacki
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA; Department of Microbiology, Immunology and Biochemistry, University of Tennessee Health Sciences Center, Memphis, TN, USA.
| | - Tanja Mittag
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA.
| | - Rohit V Pappu
- Department of Biomedical Engineering and Center for Biomolecular Condensates, Washington University in St. Louis, St. Louis, MO 63130, USA.
| | - Peter E Wright
- Department of Integrative Structural and Computational Biology and Skaggs Institute of Chemical Biology, The Scripps Research Institute, La Jolla, CA 92037, USA.
| | - Julie D Forman-Kay
- Molecular Medicine Program, Hospital for Sick Children, Toronto ON M5G 0A4, Canada; Department of Biochemistry, University of Toronto, Toronto ON M5S 1A8, Canada.
|
25
|
Singleton MD, Eisen MB. Evolutionary analyses of intrinsically disordered regions reveal widespread signals of conservation. PLoS Comput Biol 2024; 20:e1012028. [PMID: 38662765 PMCID: PMC11075841 DOI: 10.1371/journal.pcbi.1012028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Revised: 05/07/2024] [Accepted: 03/28/2024] [Indexed: 05/08/2024] Open
Abstract
Intrinsically disordered regions (IDRs) are segments of proteins without stable three-dimensional structures. As this flexibility allows them to interact with diverse binding partners, IDRs play key roles in cell signaling and gene expression. Despite the prevalence and importance of IDRs in eukaryotic proteomes and various biological processes, associating them with specific molecular functions remains a significant challenge due to their high rates of sequence evolution. However, by comparing the observed values of various IDR-associated properties against those generated under a simulated model of evolution, a recent study found most IDRs across the entire yeast proteome contain conserved features. Furthermore, it showed clusters of IDRs with common "evolutionary signatures," i.e. patterns of conserved features, were associated with specific biological functions. To determine if similar patterns of conservation are found in the IDRs of other systems, in this work we applied a series of phylogenetic models to over 7,500 orthologous IDRs identified in the Drosophila genome to dissect the forces driving their evolution. By comparing models of unconstrained and constrained continuous trait evolution using the Brownian motion and Ornstein-Uhlenbeck models, respectively, we identified signals of widespread constraint, indicating conservation of distributed features is a mechanism of IDR evolution common to multiple biological systems. In contrast to the previous study in yeast, however, we observed limited evidence of IDR clusters with specific biological functions, which suggests a more complex relationship between evolutionary constraints and function in the IDRs of multicellular organisms.
Affiliation(s)
- Marc D. Singleton
- Howard Hughes Medical Institute, UC Berkeley, Berkeley, California, United States of America
| | - Michael B. Eisen
- Howard Hughes Medical Institute, UC Berkeley, Berkeley, California, United States of America
- Department of Molecular and Cell Biology, UC Berkeley, Berkeley, California, United States of America
|
26
|
Yang KK, Fusi N, Lu AX. Convolutions are competitive with transformers for protein sequence pretraining. Cell Syst 2024; 15:286-294.e2. [PMID: 38428432 DOI: 10.1016/j.cels.2024.01.008] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 02/22/2023] [Revised: 11/08/2023] [Accepted: 01/24/2024] [Indexed: 03/03/2024]
Abstract
Pretrained protein sequence language models have been shown to improve the performance of many prediction tasks and are now routinely integrated into bioinformatics tools. However, these models largely rely on the transformer architecture, which scales quadratically with sequence length in both run-time and memory. Therefore, state-of-the-art models have limitations on sequence length. To address this limitation, we investigated whether convolutional neural network (CNN) architectures, which scale linearly with sequence length, could be as effective as transformers in protein language models. With masked language model pretraining, CNNs are competitive with, and occasionally superior to, transformers across downstream applications while maintaining strong performance on sequences longer than those allowed in the current state-of-the-art transformer models. Our work suggests that computational efficiency can be improved without sacrificing performance, simply by using a CNN architecture instead of a transformer, and emphasizes the importance of disentangling pretraining task and model architecture. A record of this paper's transparent peer review process is included in the supplemental information.
Affiliation(s)
- Kevin K Yang
- Microsoft Research New England, Cambridge, MA 02139, USA.
| | - Nicolo Fusi
- Microsoft Research New England, Cambridge, MA 02139, USA
| | - Alex X Lu
- Microsoft Research New England, Cambridge, MA 02139, USA
|
27
|
Dragwidge JM, Wang Y, Brocard L, De Meyer A, Hudeček R, Eeckhout D, Grones P, Buridan M, Chambaud C, Pejchar P, Potocký M, Winkler J, Vandorpe M, Serre N, Fendrych M, Bernard A, De Jaeger G, Pleskot R, Fang X, Van Damme D. Biomolecular condensation orchestrates clathrin-mediated endocytosis in plants. Nat Cell Biol 2024; 26:438-449. [PMID: 38347182 PMCID: PMC7615741 DOI: 10.1038/s41556-024-01354-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2023] [Accepted: 01/10/2024] [Indexed: 02/16/2024]
Abstract
Clathrin-mediated endocytosis is an essential cellular internalization pathway involving the dynamic assembly of clathrin and accessory proteins to form membrane-bound vesicles. The evolutionarily ancient TSET-TPLATE complex (TPC) plays an essential, but ill-defined role in endocytosis in plants. Here we show that two highly disordered TPC subunits, AtEH1 and AtEH2, function as scaffolds to drive biomolecular condensation of the complex. These condensates specifically nucleate on the plasma membrane through interactions with anionic phospholipids, and facilitate the dynamic recruitment and assembly of clathrin, as well as early- and late-stage endocytic accessory proteins. Importantly, condensation promotes ordered clathrin assemblies. TPC-driven biomolecular condensation thereby facilitates dynamic protein assemblies throughout clathrin-mediated endocytosis. Furthermore, we show that a disordered region of AtEH1 controls the material properties of endocytic condensates in vivo. Alteration of these material properties disturbs the recruitment of accessory proteins, influences endocytosis dynamics and impairs plant responsiveness. Our findings reveal how collective interactions shape endocytosis.
Affiliation(s)
- Jonathan Michael Dragwidge
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium.
- VIB Center for Plant Systems Biology, Ghent, Belgium.
| | - Yanning Wang
- Center for Plant Biology, School of Life Sciences, Tsinghua University, Beijing, China
| | - Lysiane Brocard
- Bordeaux Imaging Center, INSERM, CNRS, Université de Bordeaux, Bordeaux, France
| | - Andreas De Meyer
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- VIB Center for Plant Systems Biology, Ghent, Belgium
| | - Roman Hudeček
- Institute of Experimental Botany of the Czech Academy of Sciences, Prague, Czech Republic
| | - Dominique Eeckhout
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- VIB Center for Plant Systems Biology, Ghent, Belgium
| | - Peter Grones
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- VIB Center for Plant Systems Biology, Ghent, Belgium
| | - Matthieu Buridan
- Bordeaux Imaging Center, INSERM, CNRS, Université de Bordeaux, Bordeaux, France
| | - Clément Chambaud
- Laboratoire de Biogenèse Membranaire, CNRS, Université de Bordeaux, Bordeaux, France
| | - Přemysl Pejchar
- Institute of Experimental Botany of the Czech Academy of Sciences, Prague, Czech Republic
| | - Martin Potocký
- Institute of Experimental Botany of the Czech Academy of Sciences, Prague, Czech Republic
| | - Joanna Winkler
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- VIB Center for Plant Systems Biology, Ghent, Belgium
| | - Michaël Vandorpe
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- VIB Center for Plant Systems Biology, Ghent, Belgium
| | - Nelson Serre
- Department of Experimental Plant Biology, Faculty of Sciences, Charles University, Prague, Czech Republic
| | - Matyáš Fendrych
- Department of Experimental Plant Biology, Faculty of Sciences, Charles University, Prague, Czech Republic
| | - Amelie Bernard
- Laboratoire de Biogenèse Membranaire, CNRS, Université de Bordeaux, Bordeaux, France
| | - Geert De Jaeger
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- VIB Center for Plant Systems Biology, Ghent, Belgium
| | - Roman Pleskot
- Institute of Experimental Botany of the Czech Academy of Sciences, Prague, Czech Republic
| | - Xiaofeng Fang
- Center for Plant Biology, School of Life Sciences, Tsinghua University, Beijing, China
| | - Daniël Van Damme
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium.
- VIB Center for Plant Systems Biology, Ghent, Belgium.
|
28
|
Holehouse AS, Kragelund BB. The molecular basis for cellular function of intrinsically disordered protein regions. Nat Rev Mol Cell Biol 2024; 25:187-211. [PMID: 37957331 PMCID: PMC11459374 DOI: 10.1038/s41580-023-00673-0] [Citation(s) in RCA: 159] [Impact Index Per Article: 159.0] [Accepted: 09/26/2023] [Indexed: 11/15/2023]
Abstract
Intrinsically disordered protein regions exist in a collection of dynamic interconverting conformations that lack a stable 3D structure. These regions are structurally heterogeneous, ubiquitous and found across all kingdoms of life. Despite the absence of a defined 3D structure, disordered regions are essential for cellular processes ranging from transcriptional control and cell signalling to subcellular organization. Through their conformational malleability and adaptability, disordered regions extend the repertoire of macromolecular interactions and are readily tunable by their structural and chemical context, making them ideal responders to regulatory cues. Recent work has led to major advances in understanding the link between protein sequence and conformational behaviour in disordered regions, yet the link between sequence and molecular function is less well defined. Here we consider the biochemical and biophysical foundations that underlie how and why disordered regions can engage in productive cellular functions, provide examples of emerging concepts and discuss how protein disorder contributes to intracellular information processing and regulation of cellular function.
Affiliation(s)
- Alex S Holehouse
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St Louis, MO, USA.
- Center for Biomolecular Condensates, Washington University in St Louis, St Louis, MO, USA.
| | - Birthe B Kragelund
- REPIN, Structural Biology and NMR Laboratory, Department of Biology, University of Copenhagen, Copenhagen, Denmark.
|
29
|
Garg A, González-Foutel NS, Gielnik MB, Kjaergaard M. Design of functional intrinsically disordered proteins. Protein Eng Des Sel 2024; 37:gzae004. [PMID: 38431892 DOI: 10.1093/protein/gzae004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2023] [Revised: 12/22/2023] [Indexed: 03/05/2024] Open
Abstract
Many proteins do not fold into a fixed three-dimensional structure, but rather function in a highly disordered state. These intrinsically disordered proteins pose a unique challenge to protein engineering and design: How can proteins be designed de novo if not by tailoring their structure? Here, we will review the nascent field of design of intrinsically disordered proteins with focus on applications in biotechnology and medicine. The design goals should not necessarily be the same as for de novo design of folded proteins as disordered proteins have unique functional strengths and limitations. We focus on functions where intrinsically disordered proteins are uniquely suited including disordered linkers, desiccation chaperones, sensors of the chemical environment, delivery of pharmaceuticals, and constituents of biomolecular condensates. Design of functional intrinsically disordered proteins relies on a combination of computational tools and heuristics gleaned from sequence-function studies. There are few cases where intrinsically disordered proteins have made it into industrial applications. However, we argue that disordered proteins can perform many roles currently performed by organic polymers, and that these proteins might be more designable due to their modularity.
Affiliation(s)
- Ankush Garg
- Department of Molecular Biology and Genetics, Aarhus University, 8000 Aarhus, Denmark
| | | | - Maciej B Gielnik
- Department of Molecular Biology and Genetics, Aarhus University, 8000 Aarhus, Denmark
| | - Magnus Kjaergaard
- Department of Molecular Biology and Genetics, Aarhus University, 8000 Aarhus, Denmark
- Interdisciplinary Nanoscience Center (iNANO), Aarhus University, 8000 Aarhus, Denmark
|
30
|
Taneja I, Lasker K. Machine-learning-based methods to generate conformational ensembles of disordered proteins. Biophys J 2024; 123:101-113. [PMID: 38053335 PMCID: PMC10808026 DOI: 10.1016/j.bpj.2023.12.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Revised: 10/24/2023] [Accepted: 12/01/2023] [Indexed: 12/07/2023] Open
Abstract
Intrinsically disordered proteins are characterized by a conformational ensemble. While computational approaches such as molecular dynamics simulations have been used to generate such ensembles, their computational costs can be prohibitive. An alternative approach is to learn from data and train machine-learning models to generate conformational ensembles of disordered proteins. This has been a relatively unexplored approach, and in this work we demonstrate a proof-of-principle approach to do so. Specifically, we devised a two-stage computational pipeline: in the first stage, we employed supervised machine-learning models to predict ensemble-derived two-dimensional (2D) properties of a sequence, given the conformational ensemble of a closely related sequence. In the second stage, we used denoising diffusion models to generate three-dimensional (3D) coarse-grained conformational ensembles, given the two-dimensional predictions outputted by the first stage. We trained our models on a data set of coarse-grained molecular dynamics simulations of thousands of rationally designed synthetic sequences. The accuracy of our 2D and 3D predictions was validated across multiple metrics, and our work demonstrates the applicability of machine-learning techniques to predicting higher-dimensional properties of disordered proteins.
Affiliation(s)
- Ishan Taneja
- Department of Integrative Structural and Computational Biology, Scripps Research, La Jolla, California
| | - Keren Lasker
- Department of Integrative Structural and Computational Biology, Scripps Research, La Jolla, California.
|
31
|
Pérez-Jover I, Rochon K, Hu D, Mahajan M, Madan Mohan P, Santos-Pérez I, Ormaetxea Gisasola J, Martinez Galvez JM, Agirre J, Qi X, Mears JA, Shnyrova AV, Ramachandran R. Allosteric control of dynamin-related protein 1 through a disordered C-terminal Short Linear Motif. Nat Commun 2024; 15:52. [PMID: 38168038 PMCID: PMC10761769 DOI: 10.1038/s41467-023-44413-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2023] [Accepted: 12/07/2023] [Indexed: 01/05/2024] Open
Abstract
The mechanochemical GTPase dynamin-related protein 1 (Drp1) catalyzes mitochondrial and peroxisomal fission, but the regulatory mechanisms remain ambiguous. Here we find that a conserved, intrinsically disordered, six-residue Short Linear Motif at the extreme Drp1 C-terminus, named CT-SLiM, constitutes a critical allosteric site that controls Drp1 structure and function in vitro and in vivo. Extension of the CT-SLiM by non-native residues, or its interaction with the protein partner GIPC-1, constrains Drp1 subunit conformational dynamics, alters self-assembly properties, and limits cooperative GTP hydrolysis, surprisingly leading to the fission of model membranes in vitro. In vivo, the involvement of the native CT-SLiM is critical for productive mitochondrial and peroxisomal fission, as both deletion and non-native extension of the CT-SLiM severely impair their progression. Thus, contrary to prevailing models, Drp1-catalyzed membrane fission relies on allosteric communication mediated by the CT-SLiM, deceleration of GTPase activity, and coupled changes in subunit architecture and assembly-disassembly dynamics.
Affiliation(s)
- Isabel Pérez-Jover
- Department of Biochemistry and Molecular Biology, University of the Basque Country, 48940, Leioa, Spain
- Instituto Biofisika, CSIC, UPV/EHU, 48940, Leioa, Spain
| | - Kristy Rochon
- Department of Pharmacology, Case Western Reserve University School of Medicine, Cleveland, OH, 44106, USA
| | - Di Hu
- Department of Physiology and Biophysics, Case Western Reserve University School of Medicine, Cleveland, OH, 44106, USA
| | - Mukesh Mahajan
- Department of Physiology and Biophysics, Case Western Reserve University School of Medicine, Cleveland, OH, 44106, USA
| | - Pooja Madan Mohan
- Department of Physiology and Biophysics, Case Western Reserve University School of Medicine, Cleveland, OH, 44106, USA
| | - Isaac Santos-Pérez
- Electron Microscopy and Crystallography Center for Cooperative Research in Biosciences (CIC bioGUNE), Bizkaia Science and Technology, Park Bld 800, 48160-Derio, Bizkaia, Spain
| | - Julene Ormaetxea Gisasola
- Department of Biochemistry and Molecular Biology, University of the Basque Country, 48940, Leioa, Spain
- Instituto Biofisika, CSIC, UPV/EHU, 48940, Leioa, Spain
| | - Juan Manuel Martinez Galvez
- Department of Biochemistry and Molecular Biology, University of the Basque Country, 48940, Leioa, Spain
- Instituto Biofisika, CSIC, UPV/EHU, 48940, Leioa, Spain
| | - Jon Agirre
- York Structural Biology Laboratory, Department of Chemistry, University of York, Heslington, YO10 5DD, York, UK
| | - Xin Qi
- Department of Physiology and Biophysics, Case Western Reserve University School of Medicine, Cleveland, OH, 44106, USA
- Center for Mitochondrial Diseases, Case Western Reserve University School of Medicine, Cleveland, OH, 44106, USA
| | - Jason A Mears
- Department of Pharmacology, Case Western Reserve University School of Medicine, Cleveland, OH, 44106, USA
- Center for Mitochondrial Diseases, Case Western Reserve University School of Medicine, Cleveland, OH, 44106, USA
- Cleveland Center for Membrane and Structural Biology, Case Western Reserve University School of Medicine, Cleveland, OH, 44106, USA
| | - Anna V Shnyrova
- Department of Biochemistry and Molecular Biology, University of the Basque Country, 48940, Leioa, Spain.
- Instituto Biofisika, CSIC, UPV/EHU, 48940, Leioa, Spain.
| | - Rajesh Ramachandran
- Department of Physiology and Biophysics, Case Western Reserve University School of Medicine, Cleveland, OH, 44106, USA.
- Cleveland Center for Membrane and Structural Biology, Case Western Reserve University School of Medicine, Cleveland, OH, 44106, USA.
|
32
|
Schuck P, Zhao H. Diversity of short linear interaction motifs in SARS-CoV-2 nucleocapsid protein. mBio 2023; 14:e0238823. [PMID: 38018991 PMCID: PMC10746173 DOI: 10.1128/mbio.02388-23] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/01/2023] [Accepted: 10/16/2023] [Indexed: 11/30/2023] Open
Abstract
IMPORTANCE Short linear motifs (SLiMs) are 3-10 amino acid long binding motifs in intrinsically disordered protein regions (IDRs) that serve as ubiquitous protein-protein interaction modules in eukaryotic cells. Through molecular mimicry, viruses hijack these sequence motifs to control host cellular processes. It is thought that the small size of SLiMs and the high mutation frequencies of viral IDRs allow rapid host adaptation. However, a salient characteristic of RNA viruses, due to high replication errors, is their obligate existence as mutant swarms. Taking advantage of the uniquely large genomic database of SARS-CoV-2, here, we analyze the role of sequence diversity in the presentation of SLiMs, focusing on the highly abundant, multi-functional nucleocapsid protein. We find that motif mimicry is a highly dynamic process that produces an abundance of motifs transiently present in subsets of mutant species. This diversity allows the virus to efficiently explore eukaryotic motifs and evolve the host-virus interface.
Affiliation(s)
- Peter Schuck
- Laboratory of Dynamics of Macromolecular Assembly, National Institute of Biomedical Imaging and Bioengineering, National Institutes of Health, Bethesda, Maryland, USA
| | - Huaying Zhao
- Laboratory of Dynamics of Macromolecular Assembly, National Institute of Biomedical Imaging and Bioengineering, National Institutes of Health, Bethesda, Maryland, USA
|
33
|
Moses D, Ginell GM, Holehouse AS, Sukenik S. Intrinsically disordered regions are poised to act as sensors of cellular chemistry. Trends Biochem Sci 2023; 48:1019-1034. [PMID: 37657994 PMCID: PMC10840941 DOI: 10.1016/j.tibs.2023.08.001] [Citation(s) in RCA: 33] [Impact Index Per Article: 16.5] [Received: 04/15/2023] [Revised: 07/31/2023] [Accepted: 08/01/2023] [Indexed: 09/03/2023]
Abstract
Intrinsically disordered proteins and protein regions (IDRs) are abundant in eukaryotic proteomes and play a wide variety of essential roles. Instead of folding into a stable structure, IDRs exist in an ensemble of interconverting conformations whose structure is biased by sequence-dependent interactions. The absence of a stable 3D structure, combined with high solvent accessibility, means that IDR conformational biases are inherently sensitive to changes in their environment. Here, we argue that IDRs are ideally poised to act as sensors and actuators of cellular physicochemistry. We review the physical principles that underlie IDR sensitivity, the molecular mechanisms that translate this sensitivity to function, and recent studies where environmental sensing by IDRs may play a key role in their downstream function.
Affiliation(s)
- David Moses
- Department of Chemistry and Biochemistry, University of California, Merced, CA, USA
- Garrett M Ginell
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO, USA; Center for Biomolecular Condensates (CBC), Washington University in St. Louis, St. Louis, MO, USA
- Alex S Holehouse
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO, USA; Center for Biomolecular Condensates (CBC), Washington University in St. Louis, St. Louis, MO, USA
- Shahar Sukenik
- Department of Chemistry and Biochemistry, University of California, Merced, CA, USA; Quantitative Systems Biology Program, University of California, Merced, CA, USA
|
34
|
Alderson TR, Pritišanac I, Kolarić Đ, Moses AM, Forman-Kay JD. Systematic identification of conditionally folded intrinsically disordered regions by AlphaFold2. Proc Natl Acad Sci U S A 2023; 120:e2304302120. [PMID: 37878721 PMCID: PMC10622901 DOI: 10.1073/pnas.2304302120] [Citation(s) in RCA: 59] [Impact Index Per Article: 29.5] [Received: 03/22/2023] [Accepted: 08/30/2023] [Indexed: 10/27/2023] Open
Abstract
The AlphaFold Protein Structure Database contains predicted structures for millions of proteins. For the majority of human proteins that contain intrinsically disordered regions (IDRs), which do not adopt a stable structure, it is generally assumed that these regions have low AlphaFold2 confidence scores that reflect low-confidence structural predictions. Here, we show that AlphaFold2 assigns confident structures to nearly 15% of human IDRs. By comparison to experimental NMR data for a subset of IDRs that are known to conditionally fold (i.e., upon binding or under other specific conditions), we find that AlphaFold2 often predicts the structure of the conditionally folded state. Based on databases of IDRs that are known to conditionally fold, we estimate that AlphaFold2 can identify conditionally folding IDRs at a precision as high as 88% at a 10% false positive rate, which is remarkable considering that conditionally folded IDR structures were minimally represented in its training data. We find that human disease mutations are nearly fivefold enriched in conditionally folded IDRs over IDRs in general and that up to 80% of IDRs in prokaryotes are predicted to conditionally fold, compared to less than 20% of eukaryotic IDRs. These results indicate that a large majority of IDRs in the proteomes of human and other eukaryotes function in the absence of conditional folding, but the regions that do acquire folds are more sensitive to mutations. We emphasize that the AlphaFold2 predictions do not reveal functionally relevant structural plasticity within IDRs and cannot offer realistic ensemble representations of conditionally folded IDRs.
Affiliation(s)
- T. Reid Alderson
- Department of Biochemistry, University of Toronto, Toronto, ON M5S 1A8, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
- Iva Pritišanac
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON M5S 35G, Canada
- Molecular Medicine Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
- Department of Molecular Biology and Biochemistry, Gottfried Schatz Research Center for Cell Signaling, Metabolism and Aging, Medical University of Graz, Graz 8010, Austria
- Đesika Kolarić
- Department of Molecular Biology and Biochemistry, Gottfried Schatz Research Center for Cell Signaling, Metabolism and Aging, Medical University of Graz, Graz 8010, Austria
- Alan M. Moses
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON M5S 35G, Canada
- Julie D. Forman-Kay
- Department of Biochemistry, University of Toronto, Toronto, ON M5S 1A8, Canada
- Molecular Medicine Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
|
35
|
Patil A, Strom AR, Paulo JA, Collings CK, Ruff KM, Shinn MK, Sankar A, Cervantes KS, Wauer T, St Laurent JD, Xu G, Becker LA, Gygi SP, Pappu RV, Brangwynne CP, Kadoch C. A disordered region controls cBAF activity via condensation and partner recruitment. Cell 2023; 186:4936-4955.e26. [PMID: 37788668 PMCID: PMC10792396 DOI: 10.1016/j.cell.2023.08.032] [Citation(s) in RCA: 61] [Impact Index Per Article: 30.5] [Received: 10/06/2022] [Revised: 07/16/2023] [Accepted: 08/24/2023] [Indexed: 10/05/2023]
Abstract
Intrinsically disordered regions (IDRs) represent a large percentage of overall nuclear protein content. The prevailing dogma is that IDRs engage in non-specific interactions because they are poorly constrained by evolutionary selection. Here, we demonstrate that condensate formation and heterotypic interactions are distinct and separable features of an IDR within the ARID1A/B subunits of the mSWI/SNF chromatin remodeler, cBAF, and establish distinct "sequence grammars" underlying each contribution. Condensation is driven by uniformly distributed tyrosine residues, and partner interactions are mediated by non-random blocks rich in alanine, glycine, and glutamine residues. These features concentrate a specific cBAF protein-protein interaction network and are essential for chromatin localization and activity. Importantly, human disease-associated perturbations in ARID1B IDR sequence grammars disrupt cBAF function in cells. Together, these data identify IDR contributions to chromatin remodeling and explain how phase separation provides a mechanism through which both genomic localization and functional partner recruitment are achieved.
Affiliation(s)
- Ajinkya Patil
- Department of Pediatric Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA 02115, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Program in Virology, Harvard Medical School, Boston, MA 02115, USA
- Amy R Strom
- Department of Chemical and Biological Engineering, Princeton University, Princeton, NJ 08544, USA
- Joao A Paulo
- Department of Cell Biology, Harvard Medical School, Boston, MA 02115, USA
- Clayton K Collings
- Department of Pediatric Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA 02115, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Kiersten M Ruff
- Department of Biomedical Engineering and Center for Biomolecular Condensates, Washington University in St. Louis, St. Louis, MO 63130, USA
- Min Kyung Shinn
- Department of Biomedical Engineering and Center for Biomolecular Condensates, Washington University in St. Louis, St. Louis, MO 63130, USA
- Akshay Sankar
- Department of Pediatric Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA 02115, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Kasey S Cervantes
- Department of Pediatric Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA 02115, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Tobias Wauer
- Department of Pediatric Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA 02115, USA
- Jessica D St Laurent
- Department of Pediatric Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA 02115, USA; Department of Obstetrics and Gynecology, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
- Grace Xu
- Department of Pediatric Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA 02115, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Lindsay A Becker
- Department of Chemical and Biological Engineering, Princeton University, Princeton, NJ 08544, USA
- Steven P Gygi
- Department of Cell Biology, Harvard Medical School, Boston, MA 02115, USA
- Rohit V Pappu
- Department of Biomedical Engineering and Center for Biomolecular Condensates, Washington University in St. Louis, St. Louis, MO 63130, USA
- Clifford P Brangwynne
- Department of Chemical and Biological Engineering, Princeton University, Princeton, NJ 08544, USA; Howard Hughes Medical Institute, Chevy Chase, MD 21044, USA; Omenn-Darling Bioengineering Institute, Princeton University, Princeton, NJ 08544, USA
- Cigall Kadoch
- Department of Pediatric Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA 02115, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Howard Hughes Medical Institute, Chevy Chase, MD 21044, USA
|
36
|
Pesce F, Bremer A, Tesei G, Hopkins JB, Grace CR, Mittag T, Lindorff-Larsen K. Design of intrinsically disordered protein variants with diverse structural properties. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.10.22.563461. [PMID: 37961110 PMCID: PMC10634714 DOI: 10.1101/2023.10.22.563461] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/15/2023]
Abstract
Intrinsically disordered proteins (IDPs) perform a wide range of functions in biology, suggesting that the ability to design IDPs could help expand the repertoire of proteins with novel functions. Designing IDPs with specific structural or functional properties has, however, been difficult, in part because determining accurate conformational ensembles of IDPs generally requires a combination of computational modelling and experiments. Motivated by recent advancements in efficient physics-based models for simulations of IDPs, we have developed a general algorithm for designing IDPs with specific structural properties. We demonstrate the power of the algorithm by generating variants of naturally occurring IDPs with different levels of compaction and that vary more than 100 fold in their propensity to undergo phase separation, even while keeping a fixed amino acid composition. We experimentally tested designs of variants of the low-complexity domain of hnRNPA1 and find high accuracy in our computational predictions, both in terms of single-chain compaction and propensity to undergo phase separation. We analyze the sequence features that determine changes in compaction and propensity to phase separate and find an overall good agreement with previous findings for naturally occurring sequences. Our general, physics-based method enables the design of disordered sequences with specified conformational properties. Our algorithm thus expands the toolbox for protein design to include also the most flexible proteins and will enable the design of proteins whose functions exploit the many properties afforded by protein disorder.
Affiliation(s)
- Francesco Pesce
- Structural Biology and NMR Laboratory, The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Anne Bremer
- Department of Structural Biology, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Giulio Tesei
- Structural Biology and NMR Laboratory, The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Jesse B. Hopkins
- BioCAT, Department of Physics, Illinois Institute of Technology, Chicago, IL, USA
- Christy R. Grace
- Department of Structural Biology, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Tanja Mittag
- Department of Structural Biology, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Kresten Lindorff-Larsen
- Structural Biology and NMR Laboratory, The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
|
37
|
Oh C, Buckley PM, Choi J, Hierro A, DiMaio D. Sequence-independent activity of a predicted long disordered segment of the human papillomavirus type 16 L2 capsid protein during virus entry. Proc Natl Acad Sci U S A 2023; 120:e2307721120. [PMID: 37819982 PMCID: PMC10589650 DOI: 10.1073/pnas.2307721120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2023] [Accepted: 08/28/2023] [Indexed: 10/13/2023] Open
Abstract
The activity of proteins is thought to be invariably determined by their amino acid sequence or composition, but we show that a long segment of a viral protein can support infection independent of its sequence or composition. During virus entry, the papillomavirus L2 capsid protein protrudes through the endosome membrane into the cytoplasm to bind cellular factors such as retromer required for intracellular virus trafficking. Here, we show that an ~110 amino acid segment of L2 is predicted to be disordered and that large deletions in this segment abolish infectivity of HPV16 pseudoviruses by inhibiting cytoplasmic protrusion of L2, association with retromer, and proper virus trafficking. The activity of these mutants can be restored by insertion of protein segments with diverse sequences, compositions, and chemical properties, including scrambled amino acid sequences, a tandem array of a short sequence, and the intrinsically disordered region of an unrelated cellular protein. The infectivity of mutants with small in-frame deletions in this segment directly correlates with the size of the segment. These results indicate that the length of the disordered segment, not its sequence or composition, determines its activity during HPV16 pseudovirus infection. We propose that a minimal length of L2 is required for it to protrude far enough into the cytoplasm to bind cytoplasmic trafficking factors, but the sequence of this segment is largely irrelevant. Thus, protein segments can carry out complex biological functions such as Human papillomavirus pseudovirus infection in a sequence-independent manner. This finding has important implications for protein function and evolution.
Affiliation(s)
- Changin Oh
- Department of Genetics, Yale School of Medicine, New Haven, CT 06520-8005
- Patrick M. Buckley
- Department of Microbial Pathogenesis, Yale School of Medicine, New Haven, CT 06536-0812
- Jeongjoon Choi
- Department of Genetics, Yale School of Medicine, New Haven, CT 06520-8005
- Aitor Hierro
- Center for Cooperative Research in Biosciences, Bilbao, Derio 48160, Spain
- Basque Foundation for Science, Bilbao 48009, Spain
- Daniel DiMaio
- Department of Genetics, Yale School of Medicine, New Haven, CT 06520-8005
- Department of Therapeutic Radiology, Yale School of Medicine, New Haven, CT 06520-8040
- Department of Molecular Biophysics & Biochemistry, Yale University, New Haven, CT 06520-8024
- Yale Cancer Center, New Haven, CT 06520-8028
|
38
|
Alston JJ, Soranno A, Holehouse AS. Conserved molecular recognition by an intrinsically disordered region in the absence of sequence conservation. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.08.06.552128. [PMID: 37609146 PMCID: PMC10441348 DOI: 10.1101/2023.08.06.552128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/24/2023]
Abstract
Intrinsically disordered regions (IDRs) are critical for cellular function, yet often appear to lack sequence conservation when assessed by multiple sequence alignments. This raises the question of if and how function can be encoded and preserved in these regions despite massive sequence variation. To address this question, we have applied coarse-grained molecular dynamics simulations to investigate non-specific RNA binding of coronavirus nucleocapsid proteins. Coronavirus nucleocapsid proteins consist of multiple interspersed disordered and folded domains that bind RNA. We focussed here on the first two domains of coronavirus nucleocapsid proteins, the disordered N-terminal domain (NTD) followed by the folded RNA binding domain (RBD). While the NTD is highly variable across evolution, the RBD is structurally conserved. This combination makes the NTD-RBD a convenient model system to explore the interplay between an IDR adjacent to a folded domain, and how changes in IDR sequence can influence molecular recognition of a partner. Our results reveal a surprising degree of sequence-specificity encoded by both the composition and the precise order of the amino acids in the NTD. The presence of an NTD can - depending on the sequence - either suppress or enhance RNA binding. Despite this sensitivity, large-scale variation in NTD sequences is possible while certain sequence features are retained. Consequently, a conformationally-conserved fuzzy RNA:protein complex is found across nucleocapsid protein orthologs, despite large-scale changes in both NTD sequence and RBD surface chemistry. Taken together, these insights shed light on the ability of disordered regions to preserve functional characteristics despite their sequence variability.
|
39
|
Ahmed R, Forman-Kay JD. Aberrant phase separation: linking IDR mutations to disease. Cell Res 2023; 33:583-584. [PMID: 37016021 PMCID: PMC10397348 DOI: 10.1038/s41422-023-00804-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/06/2023] Open
Affiliation(s)
- Rashik Ahmed
- Program in Molecular Medicine, The Hospital for Sick Children, Toronto, ON, Canada
- Department of Biochemistry, University of Toronto, Toronto, ON, Canada
- Julie D Forman-Kay
- Program in Molecular Medicine, The Hospital for Sick Children, Toronto, ON, Canada
- Department of Biochemistry, University of Toronto, Toronto, ON, Canada
|
40
|
Ginell GM, Flynn AJ, Holehouse AS. SHEPHARD: a modular and extensible software architecture for analyzing and annotating large protein datasets. Bioinformatics 2023; 39:btad488. [PMID: 37540173 PMCID: PMC10423030 DOI: 10.1093/bioinformatics/btad488] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/23/2022] [Revised: 07/02/2023] [Accepted: 08/03/2023] [Indexed: 08/05/2023] Open
Abstract
MOTIVATION The emergence of high-throughput experiments and high-resolution computational predictions has led to an explosion in the quality and volume of protein sequence annotations at proteomic scales. Unfortunately, sanity checking, integrating, and analyzing complex sequence annotations remains logistically challenging and introduces a major barrier to entry for even superficial integrative bioinformatics. RESULTS To address this technical burden, we have developed SHEPHARD, a Python framework that trivializes large-scale integrative protein bioinformatics. SHEPHARD combines an object-oriented hierarchical data structure with database-like features, enabling programmatic annotation, integration, and analysis of complex datatypes. Importantly, SHEPHARD is easy to use and enables a Pythonic interrogation of large-scale protein datasets with millions of unique annotations. We use SHEPHARD to examine three orthogonal proteome-wide questions relating protein sequence to molecular function, illustrating its ability to uncover novel biology. AVAILABILITY AND IMPLEMENTATION We provided SHEPHARD as both a stand-alone software package (https://github.com/holehouse-lab/shephard), and as a Google Colab notebook with a collection of precomputed proteome-wide annotations (https://github.com/holehouse-lab/shephard-colab).
Affiliation(s)
- Garrett M Ginell
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, 660 South Euclid Avenue, Saint Louis, MO 63110, United States
- Center for Biomolecular Condensates, Washington University in St. Louis, 1 Brookings Drive, Saint Louis, MO 63130, United States
- Aidan J Flynn
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, 660 South Euclid Avenue, Saint Louis, MO 63110, United States
- Center for Biomolecular Condensates, Washington University in St. Louis, 1 Brookings Drive, Saint Louis, MO 63130, United States
- Alex S Holehouse
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, 660 South Euclid Avenue, Saint Louis, MO 63110, United States
- Center for Biomolecular Condensates, Washington University in St. Louis, 1 Brookings Drive, Saint Louis, MO 63130, United States
|
41
|
Abstract
Multivalent proteins and nucleic acids, collectively referred to as multivalent associative biomacromolecules, provide the driving forces for the formation and compositional regulation of biomolecular condensates. Here, we review the key concepts of phase transitions of aqueous solutions of associative biomacromolecules, specifically proteins that include folded domains and intrinsically disordered regions. The phase transitions of these systems come under the rubric of coupled associative and segregative transitions. The concepts underlying these processes are presented, and their relevance to biomolecular condensates is discussed.
Affiliation(s)
- Rohit V. Pappu
- Department of Biomedical Engineering, Center for Biomolecular Condensates (CBC), Washington University in St. Louis, St. Louis, MO 63130, USA
- Samuel R. Cohen
- Department of Biomedical Engineering, Center for Biomolecular Condensates (CBC), Washington University in St. Louis, St. Louis, MO 63130, USA
- Center of Regenerative Medicine, Washington University in St. Louis, St. Louis, MO 63130, USA
- Furqan Dar
- Department of Biomedical Engineering, Center for Biomolecular Condensates (CBC), Washington University in St. Louis, St. Louis, MO 63130, USA
- Mina Farag
- Department of Biomedical Engineering, Center for Biomolecular Condensates (CBC), Washington University in St. Louis, St. Louis, MO 63130, USA
- Mrityunjoy Kar
- Max Planck Institute of Cell Biology and Genetics, 01307 Dresden, Germany
|
42
|
Pérez-Jover I, Rochon K, Hu D, Mohan PM, Santos-Perez I, Gisasola JO, Galvez JMM, Agirre J, Qi X, Mears JA, Shnyrova AV, Ramachandran R. Allosteric control of dynamin-related protein 1-catalyzed mitochondrial fission through a conserved disordered C-terminal Short Linear Motif. RESEARCH SQUARE 2023:rs.3.rs-3161608. [PMID: 37503116 PMCID: PMC10371074 DOI: 10.21203/rs.3.rs-3161608/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023]
Abstract
The mechanochemical GTPase dynamin-related protein 1 (Drp1) catalyzes mitochondrial fission, but the regulatory mechanisms remain ambiguous. Here we found that a conserved, intrinsically disordered, six-residue Short Linear Motif at the extreme Drp1 C-terminus, named CT-SLiM, constitutes a critical allosteric site that controls Drp1 structure and function in vitro and in vivo. Extension of the CT-SLiM by non-native residues, or its interaction with the protein partner GIPC-1, constrains Drp1 subunit conformational dynamics, alters self-assembly properties, and limits cooperative GTP hydrolysis, leading to the fission of model membranes in vitro. In vivo, the availability of the native CT-SLiM is a requirement for productive mitochondrial fission, as both non-native extension and deletion of the CT-SLiM severely impair its progression. Thus, contrary to prevailing models, Drp1-catalyzed mitochondrial fission relies on allosteric communication mediated by the CT-SLiM, deceleration of GTPase activity, and coupled changes in subunit architecture and assembly-disassembly dynamics.
Affiliation(s)
- Isabel Pérez-Jover
- Department of Biochemistry and Molecular Biology, University of the Basque Country, 48940 Leioa, Spain
- Instituto Biofisika, University of the Basque Country, 48940 Leioa, Spain
- Kristy Rochon
- Department of Pharmacology, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
- Di Hu
- Department of Physiology and Biophysics, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
- Pooja Madan Mohan
- Department of Physiology and Biophysics, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
- Isaac Santos-Perez
- Electron Microscopy and Crystallography Center for Cooperative Research in Biosciences (CIC bioGUNE), Bizkaia Science and Technology Park Bld 800, 48160-Derio, Bizkaia, Spain
- Julene Ormaetxea Gisasola
- Department of Biochemistry and Molecular Biology, University of the Basque Country, 48940 Leioa, Spain
- Instituto Biofisika, University of the Basque Country, 48940 Leioa, Spain
- Juan Manuel Martinez Galvez
- Department of Biochemistry and Molecular Biology, University of the Basque Country, 48940 Leioa, Spain
- Instituto Biofisika, University of the Basque Country, 48940 Leioa, Spain
- Jon Agirre
- York Structural Biology Laboratory, Department of Chemistry, University of York, Heslington, YO10 5DD, York, UK
- Xin Qi
- Department of Physiology and Biophysics, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
- Center for Mitochondrial Diseases, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
- Jason A Mears
- Department of Pharmacology, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
- Center for Mitochondrial Diseases, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
- Cleveland Center for Membrane and Structural Biology, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
- Anna V Shnyrova
- Department of Biochemistry and Molecular Biology, University of the Basque Country, 48940 Leioa, Spain
- Instituto Biofisika, University of the Basque Country, 48940 Leioa, Spain
- Rajesh Ramachandran
- Department of Physiology and Biophysics, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
- Cleveland Center for Membrane and Structural Biology, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
|
43
|
Jonas F, Carmi M, Krupkin B, Steinberger J, Brodsky S, Jana T, Barkai N. The molecular grammar of protein disorder guiding genome-binding locations. Nucleic Acids Res 2023; 51:4831-4844. [PMID: 36938874 PMCID: PMC10250222 DOI: 10.1093/nar/gkad184] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 09/15/2022] [Revised: 01/25/2023] [Accepted: 03/15/2023] [Indexed: 03/21/2023] Open
Abstract
Intrinsically disordered regions (IDRs) direct transcription factors (TFs) towards selected genomic occurrences of their binding motif, as exemplified by budding yeast's Msn2. However, the sequence basis of IDR-directed TF binding selectivity remains unknown. To reveal this sequence grammar, we analyze the genomic localizations of >100 designed IDR mutants, each carrying up to 122 mutations within this 567-AA region. Our data point to multivalent interactions, carried by hydrophobic (mostly aliphatic) residues dispersed within a disordered environment and independent of linear sequence motifs, as the key determinants of Msn2 genomic localization. The implications of our results for the mechanistic basis of IDR-based TF binding preferences are discussed.
Affiliation(s)
- Felix Jonas
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Miri Carmi
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Beniamin Krupkin
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Joseph Steinberger
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Sagie Brodsky
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Tamar Jana
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Naama Barkai
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
|
44
|
Parra M, Libkind D, Hittinger CT, Álvarez L, Bellora N. Assembly and comparative genome analysis of a Patagonian Aureobasidium pullulans isolate reveals unexpected intraspecific variation. Yeast 2023; 40:197-213. [PMID: 37114349 DOI: 10.1002/yea.3853] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/09/2022] [Revised: 03/27/2023] [Accepted: 04/14/2023] [Indexed: 04/29/2023] Open
Abstract
Aureobasidium pullulans is a yeast-like fungus with remarkable phenotypic plasticity widely studied for its importance for the pharmaceutical and food industries. So far, genomic studies with strains from all over the world suggest they constitute a genetically unstructured population, with no association by habitat. However, the mechanisms by which this genome supports so many phenotypic permutations are still poorly understood. Recent works have shown the importance of sequencing yeast genomes from extreme environments to increase the repertoire of phenotypic diversity of unconventional yeasts. In this study, we present the genomic draft of A. pullulans strain from a Patagonian yeast diversity hotspot, re-evaluate its taxonomic classification based on taxogenomic approaches, and annotate its genome with high-depth transcriptomic data. Our analysis suggests this isolate could be considered a novel variant at an early stage of the speciation process. The discovery of divergent strains in a genomically homogeneous group, such as A. pullulans, can be valuable in understanding the evolution of the species. The identification and characterization of new variants will not only allow finding unique traits of biotechnological importance, but also optimize the choice of strains whose phenotypes will be characterized, providing new elements to explore questions about plasticity and adaptation.
Affiliation(s)
- Micaela Parra
- Laboratorio de Genómica Computacional, Instituto de Tecnologías Nucleares para la Salud (INTECNUS), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), San Carlos de Bariloche, Argentina
- Diego Libkind
- Centro de Referencia en Levaduras y Tecnología Cervecera (CRELTEC), Instituto Andino Patagónico de Tecnologías Biológicas y Geoambientales (IPATEC), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Universidad Nacional del Comahue, San Carlos de Bariloche, Argentina
- Chris Todd Hittinger
- Laboratory of Genetics, Center for Genomic Science Innovation, DOE Great Lakes Bioenergy Research Center, Wisconsin Energy Institute, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Lucía Álvarez
- Centro de Referencia en Levaduras y Tecnología Cervecera (CRELTEC), Instituto Andino Patagónico de Tecnologías Biológicas y Geoambientales (IPATEC), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Universidad Nacional del Comahue, San Carlos de Bariloche, Argentina
- Nicolás Bellora
- Laboratorio de Genómica Computacional, Instituto de Tecnologías Nucleares para la Salud (INTECNUS), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), San Carlos de Bariloche, Argentina
| |
Collapse
|
45
|
Oh C, Buckley PM, Choi J, Hierro A, DiMaio D. Sequence independent activity of a predicted long disordered segment of the human papillomavirus L2 capsid protein during virus entry. bioRxiv 2023:2023.03.21.533711. [PMID: 36993745 PMCID: PMC10055320 DOI: 10.1101/2023.03.21.533711] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/31/2023]
Abstract
The papillomavirus L2 capsid protein protrudes through the endosome membrane into the cytoplasm during virus entry to bind cellular factors required for intracellular virus trafficking. Cytoplasmic protrusion of HPV16 L2, virus trafficking, and infectivity are inhibited by large deletions in an ∼110 amino acid segment of L2 that is predicted to be disordered. The activity of these mutants can be restored by inserting protein segments with diverse compositions and chemical properties into this region, including scrambled sequences, a tandem array of a short sequence, and the intrinsically disordered region of a cellular protein. The infectivity of mutants with small in-frame insertions and deletions in this segment directly correlates with the size of the segment. These results indicate that the length of the disordered segment, not its sequence or its composition, determines its activity during virus entry. Sequence independent but length dependent activity has important implications for protein function and evolution.
|
46
|
Gurley NJ, Szymanski RA, Dowen RH, Butcher TA, Ishiyama N, Peifer M. Exploring the evolution and function of Canoe’s intrinsically disordered region in linking cell-cell junctions to the cytoskeleton during embryonic morphogenesis. bioRxiv 2023:2023.03.06.531372. [PMID: 36945496 PMCID: PMC10028902 DOI: 10.1101/2023.03.06.531372] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/09/2023]
Abstract
One central question for cell and developmental biologists is how epithelial cells can change shape and move during embryonic development without tearing tissues apart. This requires robust yet dynamic connections of cells to one another, via the cell-cell adherens junction, and of junctions to the actin and myosin cytoskeleton, which generates force. The last decade revealed that these connections involve a multivalent network of proteins, rather than a simple linear pathway. We focus on Drosophila Canoe, homolog of mammalian Afadin, as a model for defining the underlying mechanisms. Canoe and Afadin are complex, multidomain proteins that share multiple domains with defined and undefined binding partners. Both also share a long carboxy-terminal intrinsically disordered region (IDR), whose function is less well defined. IDRs are found in many proteins assembled into large multiprotein complexes. We have combined bioinformatic analysis and the use of a series of canoe mutants with early stop codons to explore the evolution and function of the IDR. Our bioinformatic analysis reveals that the IDRs of Canoe and Afadin differ dramatically in sequence and sequence properties. When we looked over shorter evolutionary time scales, we identified multiple conserved motifs. Some of these are predicted by AlphaFold to be alpha-helical, and two correspond to known protein interaction sites for alpha-catenin and F-actin. We next identified the lesions in a series of eighteen canoe mutants, which have early stop codons across the entire protein coding sequence. Analysis of their phenotypes is consistent with the idea that the IDR, including its C-terminal conserved motifs, is important for protein function. These data provide the foundation for further analysis of IDR function.
Affiliation(s)
- Noah J. Gurley
  - Department of Biology, University of North Carolina at Chapel Hill, CB#3280, Chapel Hill, NC 27599-3280, USA
- Rachel A Szymanski
  - Department of Biology, University of North Carolina at Chapel Hill, CB#3280, Chapel Hill, NC 27599-3280, USA
- Robert H Dowen
  - Department of Biology, University of North Carolina at Chapel Hill, CB#3280, Chapel Hill, NC 27599-3280, USA
  - Curriculum in Genetics and Molecular Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
  - Integrative Program for Biological and Genome Sciences, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
  - Department of Cell Biology and Physiology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- T. Amber Butcher
  - Department of Biology, University of North Carolina at Chapel Hill, CB#3280, Chapel Hill, NC 27599-3280, USA
- Noboru Ishiyama
  - Launchpad Therapeutics, Inc., One Main Street, Cambridge, MA 02142
- Mark Peifer
  - Department of Biology, University of North Carolina at Chapel Hill, CB#3280, Chapel Hill, NC 27599-3280, USA
  - Curriculum in Genetics and Molecular Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
|
47
|
Millar SR, Huang JQ, Schreiber KJ, Tsai YC, Won J, Zhang J, Moses AM, Youn JY. A New Phase of Networking: The Molecular Composition and Regulatory Dynamics of Mammalian Stress Granules. Chem Rev 2023. [PMID: 36662637 PMCID: PMC10375481 DOI: 10.1021/acs.chemrev.2c00608] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Indexed: 01/21/2023]
Abstract
Stress granules (SGs) are cytosolic biomolecular condensates that form in response to cellular stress. Weak, multivalent interactions between their protein and RNA constituents drive their rapid, dynamic assembly through phase separation coupled to percolation. Though a consensus model of SG function has yet to be determined, their perceived implication in cytoprotective processes (e.g., antiviral responses and inhibition of apoptosis) and possible role in the pathogenesis of various neurodegenerative diseases (e.g., amyotrophic lateral sclerosis and frontotemporal dementia) have drawn great interest. Consequently, new studies using numerous cell biological, genetic, and proteomic methods have been performed to unravel the mechanisms underlying SG formation, organization, and function and, with them, a more clearly defined SG proteome. Here, we provide a consensus SG proteome through literature curation and an update of the user-friendly database RNAgranuleDB to version 2.0 (http://rnagranuledb.lunenfeld.ca/). With this updated SG proteome, we use next-generation phase separation prediction tools to assess the predisposition of SG proteins for phase separation and aggregation. Next, we analyze the primary sequence features of intrinsically disordered regions (IDRs) within SG-resident proteins. Finally, we review the protein- and RNA-level determinants, including post-translational modifications (PTMs), that regulate SG composition and assembly/disassembly dynamics.
Affiliation(s)
- Sean R Millar
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 1A8, Canada
- Jie Qi Huang
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 1A8, Canada
- Karl J Schreiber
  - Program in Molecular Medicine, The Hospital for Sick Children, Toronto, Ontario M5G 0A4, Canada
- Yi-Cheng Tsai
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 1A8, Canada
- Jiyun Won
  - Department of Cell & Systems Biology, University of Toronto, Toronto, Ontario M5S 3B2, Canada
- Jianping Zhang
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, Ontario M5G 1X5, Canada
- Alan M Moses
  - Department of Cell & Systems Biology, University of Toronto, Toronto, Ontario M5S 3B2, Canada
  - Department of Computer Science, University of Toronto, Toronto, Ontario M5T 3A1, Canada
  - The Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, Ontario M5S 3B2, Canada
- Ji-Young Youn
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 1A8, Canada
  - Program in Molecular Medicine, The Hospital for Sick Children, Toronto, Ontario M5G 0A4, Canada
|
48
|
Cascarina SM, Ross ED. Expansion and functional analysis of the SR-related protein family across the domains of life. RNA 2022; 28:1298-1314. [PMID: 35863866 PMCID: PMC9479744 DOI: 10.1261/rna.079170.122] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 03/22/2022] [Accepted: 06/29/2022] [Indexed: 06/15/2023]
Abstract
Serine/arginine-rich (SR) proteins comprise a family of proteins that is predominantly found in eukaryotes and plays a prominent role in RNA splicing. A characteristic feature of SR proteins is the presence of an S/R-rich low-complexity domain (RS domain), often in conjunction with spatially distinct RNA recognition motifs (RRMs). To date, 52 human proteins have been classified as SR or SR-related proteins. Here, using an unbiased series of composition criteria together with enrichment for known RNA binding activity, we identified >100 putative SR-related proteins in the human proteome. This method recovers known SR and SR-related proteins with high sensitivity (∼94%), yet identifies a number of additional proteins with many of the hallmark features of true SR-related proteins. Newly identified SR-related proteins display slightly different amino acid compositions yet similar levels of post-translational modification, suggesting that these new SR-related candidates are regulated in vivo and functionally important. Furthermore, candidate SR-related proteins with known RNA-binding activity (but not currently recognized as SR-related proteins) are nevertheless strongly associated with a variety of functions related to mRNA splicing and nuclear speckles. Finally, we applied our SR search method to all available reference proteomes, and provide maps of RS domains and Pfam annotations for all putative SR-related proteins as a resource. Together, these results expand the set of SR-related proteins in humans, and identify the most common functions associated with SR-related proteins across all domains of life.
Affiliation(s)
- Sean M Cascarina
  - Department of Biochemistry and Molecular Biology, Colorado State University, Fort Collins, Colorado 80523, USA
- Eric D Ross
  - Department of Biochemistry and Molecular Biology, Colorado State University, Fort Collins, Colorado 80523, USA
|
49
|
Kokot T, Köhn M. Emerging insights into serine/threonine-specific phosphoprotein phosphatase function and selectivity. J Cell Sci 2022; 135:277104. [DOI: 10.1242/jcs.259618] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/05/2023]
Abstract
Protein phosphorylation on serine and threonine residues is a widely distributed post-translational modification on proteins that acts to regulate their function. Phosphoprotein phosphatases (PPPs) contribute significantly to a plethora of cellular functions through the accurate dephosphorylation of phosphorylated residues. Most PPPs accomplish their purpose through the formation of complex holoenzymes composed of a catalytic subunit with various regulatory subunits. PPP holoenzymes then bind and dephosphorylate substrates in a highly specific manner. Despite the high prevalence of PPPs and their important role for cellular function, their mechanisms of action in the cell are still not well understood. Nevertheless, substantial experimental advancements in (phospho-)proteomics, structural and computational biology have contributed significantly to a better understanding of PPP biology in recent years. This Review focuses on recent approaches and provides an overview of substantial new insights into the complex mechanism of PPP holoenzyme regulation and substrate selectivity.
Affiliation(s)
- Thomas Kokot
  - Signalling Research Centres BIOSS and CIBSS, University of Freiburg, Freiburg 79104, Germany
  - University of Freiburg, Faculty of Biology, Freiburg 79104, Germany
- Maja Köhn
  - Signalling Research Centres BIOSS and CIBSS, University of Freiburg, Freiburg 79104, Germany
  - University of Freiburg, Faculty of Biology, Freiburg 79104, Germany
|
50
|
Theillet FX, Luchinat E. In-cell NMR: Why and how? Prog Nucl Magn Reson Spectrosc 2022; 132-133:1-112. [PMID: 36496255 DOI: 10.1016/j.pnmrs.2022.04.002] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 09/15/2021] [Revised: 04/19/2022] [Accepted: 04/27/2022] [Indexed: 06/17/2023]
Abstract
NMR spectroscopy has been applied to the analysis of cells and tissues since its beginnings, as early as 1950. We have attempted to gather here in a didactic fashion the broad diversity of data and ideas that emerged from NMR investigations on living cells. Covering a large proportion of the periodic table, NMR spectroscopy permits scrutiny of a great variety of atomic nuclei in all living organisms non-invasively. It has thus provided quantitative information on cellular atoms and their chemical environment, dynamics, or interactions. We will show that NMR studies have generated valuable knowledge on a vast array of cellular molecules and events, from water, salts, metabolites, cell walls, proteins, nucleic acids, drugs and drug targets, to pH, redox equilibria and chemical reactions. The characterization of such a multitude of objects at the atomic scale has thus shaped our mental representation of cellular life at multiple levels, together with major techniques like mass spectrometry or microscopy. NMR studies on cells have accompanied the development of MRI and metabolomics, and various subfields have flourished, coined with appealing names: fluxomics, foodomics, MRI and MRS (i.e. imaging and localized spectroscopy of living tissues, respectively), whole-cell NMR, on-cell ligand-based NMR, systems NMR, cellular structural biology, in-cell NMR… All these have not grown separately, but rather by reinforcing each other like a braided trunk. Hence, we try here to provide an analytical account of a large ensemble of intricately linked approaches, whose integration has been and will be key to their success. We present extensive overviews, firstly on the various types of information provided by NMR in a cellular environment (the "why", oriented towards a broad readership), and secondly on the employed NMR techniques and setups (the "how", where we discuss the past, current and future methods).
Each subsection is constructed as a historical anthology, showing how the intrinsic properties of NMR spectroscopy and its developments structured the accessible knowledge on cellular phenomena. Using this systematic approach, we sought i) to make this review accessible to the broadest audience and ii) to highlight some early techniques that may find renewed interest. Finally, we present a brief discussion on what may be potential and desirable developments in the context of integrative studies in biology.
Affiliation(s)
- Francois-Xavier Theillet
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Enrico Luchinat
  - Dipartimento di Scienze e Tecnologie Agro-Alimentari, Alma Mater Studiorum - Università di Bologna, Piazza Goidanich 60, 47521 Cesena, Italy
  - CERM - Magnetic Resonance Center, and Neurofarba Department, Università degli Studi di Firenze, 50019 Sesto Fiorentino, Italy
|