1. Vyas P, Santra K, Preeyanka N, Gupta A, Weil-Ktorza O, Zhu Q, Metanis N, Fransson J, Longo LM, Naaman R. Role of Electron Spin, Chirality, and Charge Dynamics in Promoting the Persistence of Nascent Nucleic Acid-Peptide Complexes. J Phys Chem B 2025; 129:3978-3987. [PMID: 40231896 PMCID: PMC12035798 DOI: 10.1021/acs.jpcb.5c01150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2025] [Revised: 03/26/2025] [Accepted: 04/03/2025] [Indexed: 04/16/2025]
Abstract
Primitive nucleic acids and peptides likely collaborated in early biochemistry. What forces drove their interactions and how did these forces shape the properties of primitive complexes? We investigated how two model primordial polypeptides associate with DNA. When peptides were coupled to a ferromagnetic substrate, DNA binding depended on the substrate's magnetic moment orientation. Reversing the magnetic field nearly abolished binding despite complementary charges. Inverting the peptide chirality or just the cysteine residue reversed this effect. These results are attributed to the chiral-induced spin selectivity (CISS) effect, where molecular chirality and electron spin alter a protein's electric polarizability. The presence of CISS in simple protein-DNA complexes suggests that it played a significant role in ancient biomolecular interactions. A major consequence of CISS is enhancement of the kinetic stability of protein-nucleic acid complexes. These findings reveal how chirality and spin influence bioassociation, offering insights into primitive biochemical evolution and shaping contemporary protein functions.
Affiliation(s)
- Pratik Vyas
  - Department of Chemical and Biological Physics, Weizmann Institute of Science, Rehovot 76100, Israel
- Kakali Santra
  - Department of Chemical and Biological Physics, Weizmann Institute of Science, Rehovot 76100, Israel
- Naupada Preeyanka
  - Department of Chemical and Biological Physics, Weizmann Institute of Science, Rehovot 76100, Israel
- Anu Gupta
  - Department of Chemical and Biological Physics, Weizmann Institute of Science, Rehovot 76100, Israel
- Orit Weil-Ktorza
  - Institute of Chemistry, The Hebrew University of Jerusalem, Jerusalem 9190401, Israel
- Qirong Zhu
  - Department of Chemical and Biological Physics, Weizmann Institute of Science, Rehovot 76100, Israel
- Norman Metanis
  - Institute of Chemistry, The Hebrew University of Jerusalem, Jerusalem 9190401, Israel
- Jonas Fransson
  - Department of Physics and Astronomy, Uppsala University, Uppsala 752 36, Sweden
- Liam M. Longo
  - Earth-Life Science Institute, Institute of Science Tokyo, Tokyo 152-8550, Japan
  - Blue Marble Space Institute of Science, Seattle, Washington 98104, United States
- Ron Naaman
  - Department of Chemical and Biological Physics, Weizmann Institute of Science, Rehovot 76100, Israel
2. Ezerzer Y, Frenkel-Pinter M, Kolodny R, Ben-Tal N. A building blocks perspective on protein emergence and evolution. Curr Opin Struct Biol 2025; 91:102996. [PMID: 39919321 DOI: 10.1016/j.sbi.2025.102996] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2024] [Revised: 01/04/2025] [Accepted: 01/15/2025] [Indexed: 02/09/2025]
Abstract
Recent findings increasingly suggest the emergence of proteins by mix and match of short peptides, or 'building blocks'. What are these building blocks, and how did they evolve into contemporary proteins? We review two complementary approaches to tackling these questions. First, a bottom-up approach that involves identifying putative components of primordial peptides, and the synthetic routes through which these peptides may have emerged. Second, searches in protein space to reveal building blocks that make up the contemporary protein repertoire; proteins that are not closely related to one another may nevertheless have certain parts in common, suggesting common ancestry. Identifying such shared building blocks, and characterizing their functions, can shed light on the ancient molecules from which proteins emerged, and hint at the mechanisms that govern their evolution. A key challenge lies in merging these two approaches to create a cohesive narrative of how proteins emerged and continue to evolve.
Affiliation(s)
- Yishi Ezerzer
  - Institute of Chemistry, The Hebrew University of Jerusalem, 9190401, Israel
- Moran Frenkel-Pinter
  - Institute of Chemistry, The Hebrew University of Jerusalem, 9190401, Israel; The Center for Nanoscience and Nanotechnology, The Hebrew University of Jerusalem, 9190401, Israel
- Rachel Kolodny
  - Department of Computer Science, University of Haifa, Haifa, Israel
- Nir Ben-Tal
  - School of Neurobiology, Biochemistry and Biophysics, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
3. Demkiv AO, Toledo-Patiño S, Medina-Carmona E, Berg A, Pinto GP, Parracino A, Sanchez-Ruiz JM, Hengge AC, Laurino P, Longo LM, Kamerlin SCL. Redefining the Limits of Functional Continuity in the Early Evolution of P-Loop NTPases. Mol Biol Evol 2025; 42:msaf055. [PMID: 40070202 PMCID: PMC11959459 DOI: 10.1093/molbev/msaf055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2024] [Revised: 01/21/2025] [Accepted: 02/14/2025] [Indexed: 04/02/2025] Open
Abstract
At the heart of many nucleoside triphosphatases is a conserved phosphate-binding sequence motif. A current model of early enzyme evolution proposes that this six to eight residue motif could have sparked the emergence of the very first nucleoside triphosphatases-a striking example of evolutionary continuity from simple beginnings, if true. To test this provocative model, seven disembodied Walker A-derived peptides were extensively computationally characterized. Although dynamic flickers of nest-like conformations were observed, significant structural similarity between the situated peptide and its disembodied counterpart was not detected. Simulations suggest that phosphate binding is nonspecific, with a preference for GTP over orthophosphate. Control peptides with the same amino acid composition but different sequences and situated conformations behaved similarly to the Walker A peptides, revealing no indication that the Walker A sequence is privileged as a disembodied peptide. We conclude that the evolutionary history of the P-loop NTPase family is unlikely to have started with a disembodied Walker A peptide in an aqueous environment. The limits of evolutionary continuity for this protein family must be reconsidered. Finally, we argue that motifs such as the Walker A motif may represent incomplete or fragmentary molecular fossils-the true nature of which has been eroded by time.
Affiliation(s)
- Andrey O Demkiv
  - Department of Chemistry - BMC, Uppsala University, Uppsala S-751 23, Sweden
- Saacnicteh Toledo-Patiño
  - Protein Engineering and Evolution Unit, Okinawa Institute of Science and Technology, Graduate University (OIST), Okinawa 904-0495, Japan
  - Molecular Bioengineering Group, Okinawa Institute of Science and Technology, Graduate University (OIST), Okinawa 904-0495, Japan
- Encarnación Medina-Carmona
  - Departamento de Química Física, Facultad de Ciencias, Unidad de Excelencia de Química aplicada a Biomedicina y Medioambiente (UEQ), Universidad de Granada, Granada 18071, Spain
- Andrej Berg
  - Department of Chemistry - BMC, Uppsala University, Uppsala S-751 23, Sweden
- Gaspar P Pinto
  - Department of Chemistry - BMC, Uppsala University, Uppsala S-751 23, Sweden
- Jose M Sanchez-Ruiz
  - Departamento de Química Física, Facultad de Ciencias, Unidad de Excelencia de Química aplicada a Biomedicina y Medioambiente (UEQ), Universidad de Granada, Granada 18071, Spain
- Alvan C Hengge
  - Department of Chemistry and Biochemistry, Utah State University, Logan, UT 84322-0300, USA
- Paola Laurino
  - Protein Engineering and Evolution Unit, Okinawa Institute of Science and Technology, Graduate University (OIST), Okinawa 904-0495, Japan
  - Institute for Protein Research, Osaka University, Suita, Japan
- Liam M Longo
  - Blue Marble Space Institute of Science, Seattle, WA 98104, USA
  - Earth-Life Science Institute, Institute of Science Tokyo, Tokyo 152-8550, Japan
- Shina Caroline Lynn Kamerlin
  - School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, GA 30332, USA
  - Department of Chemistry, Lund University, Box 124, Lund 22100, Sweden
4. Mulkidjanian AY, Dibrova DV, Bychkov AY. Origin of the RNA World in Cold Hadean Geothermal Fields Enriched in Zinc and Potassium: Abiogenesis as a Positive Fallout from the Moon-Forming Impact? Life (Basel) 2025; 15:399. [PMID: 40141744 PMCID: PMC11943819 DOI: 10.3390/life15030399] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2024] [Revised: 02/06/2025] [Accepted: 02/14/2025] [Indexed: 03/28/2025] Open
Abstract
The ubiquitous, evolutionarily oldest RNAs and proteins exclusively use rather rare zinc as transition metal cofactor and potassium as alkali metal cofactor, which implies their abundance in the habitats of the first organisms. Intriguingly, lunar rocks contain a hundred times less zinc and ten times less potassium than the Earth's crust; the Moon is also depleted in other moderately volatile elements (MVEs). Current theories of impact formation of the Moon attribute this depletion to the MVEs still being in a gaseous state when the hot post-impact disk contracted and separated from the nascent Moon. The MVEs then fell out onto juvenile Earth's protocrust; zinc, as the most volatile metal, precipitated last, just after potassium. According to our calculations, the top layer of the protocrust must have contained up to 10^19 kg of metallic zinc, a powerful reductant. The venting of hot geothermal fluids through this MVE-fallout layer, rich in metallic zinc and radioactive potassium, both capable of reducing carbon dioxide and dinitrogen, must have yielded a plethora of organic molecules released with the geothermal vapor. In the pools of vapor condensate, the RNA-like molecules may have emerged through a pre-Darwinian selection for low-volatile, associative, mineral-affine, radiation-resistant, nitrogen-rich, and polymerizable molecules.
Affiliation(s)
- Armen Y. Mulkidjanian
  - Department of Physics, Osnabrueck University, D-49069 Osnabrueck, Germany
  - Center of Cellular Nanoanalytics, Osnabrueck University, D-49069 Osnabrueck, Germany
  - School of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 119992 Moscow, Russia
  - Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, 119992 Moscow, Russia
- Daria V. Dibrova
  - School of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 119992 Moscow, Russia
  - Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, 119992 Moscow, Russia
- Andrey Y. Bychkov
  - School of Geology, Lomonosov Moscow State University, 119992 Moscow, Russia
5. Subramanian AM, Martinez ZA, Lourenço AL, Liu S, Thomson M. Unexplored regions of the protein sequence-structure map revealed at scale by a library of foldtuned language models. bioRxiv 2025:2023.12.22.573145. [PMID: 38187750 PMCID: PMC10769378 DOI: 10.1101/2023.12.22.573145] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 01/09/2024]
Abstract
The combinatorial scale of amino-acid sequence-space has traditionally precluded substantive study of the full protein sequence-structure map. It remains unknown, for instance, how much of the vast uncharted landscape of far-from-natural sequences encodes the familiar ensemble of natural folds in a fashion consistent with the laws of biophysics but seemingly untouched by evolution on Earth. The scale of sequence perturbations required to access these spaces exceeds the reach of even gold-standard experimental approaches such as directed evolution. We surpass this limitation guided by the innate capacity of protein language models (pLMs) to explore sequences outside their natural training data through generation and self-feedback. We recast pLMs as probes that explore into regions of protein "deep space" that possess little-to-no detectable homology to natural examples, while enforcing core structural constraints, in a novel sequence design approach that we term "foldtuning." We build a library of foldtuned pLMs for >700 natural folds in the SCOP database, covering numerous high-priority targets for synthetic biology, including GPCRs and small GTPases, composable cell-surface-receptor and DNA-binding domains, and small signaling/regulatory domains. Candidate proteins generated by foldtuned pLMs reflect distinctive new "rules of language" for sequence innovation beyond detectable homology to any known protein and sample subtle structural alterations in a manner reminiscent of natural structural evolution and diversification. Experimental validation of two markedly different fold targets, the tyrosine-kinase- and small-GTPase-regulating SH3 domain and the bacterial RNase inhibitor barstar, demonstrates that foldtuning proposes protein variants that express and fold stably in vitro and function in vivo.
Foldtuning reveals protein sequence-structure information at scale outside of the context of evolution and promises to push forward the redesign and reconstitution of novel-to-nature synthetic biological systems for applications in health and catalysis.
Affiliation(s)
- Arjuna M. Subramanian
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA
- Zachary A. Martinez
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA
- Alec L. Lourenço
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA
- Shichen Liu
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA
- Matt Thomson
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA
6. Timsit Y, Sergeant-Perthuis G, Bennequin D. The role of ribosomal protein networks in ribosome dynamics. Nucleic Acids Res 2025; 53:gkae1308. [PMID: 39788545 PMCID: PMC11711686 DOI: 10.1093/nar/gkae1308] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Revised: 12/12/2024] [Accepted: 01/02/2025] [Indexed: 01/12/2025] Open
Abstract
Accurate protein synthesis requires ribosomes to integrate signals from distant functional sites and execute complex dynamics. Despite advances in understanding ribosome structure and function, two key questions remain: how information is transmitted between these distant sites, and how ribosomal movements are synchronized? We recently highlighted the existence of ribosomal protein networks, likely evolved to participate in ribosome signaling. Here, we investigate the relationship between ribosomal protein networks and ribosome dynamics. Our findings show that major motion centers in the bacterial ribosome interact specifically with r-proteins, and that ribosomal RNA exhibits high mobility around each r-protein. This suggests that periodic electrostatic changes in the context of negatively charged residues (Glu and Asp) induce RNA-protein 'distance-approach' cycles, controlling key ribosomal movements during translocation. These charged residues play a critical role in modulating electrostatic repulsion between RNA and proteins, thus coordinating ribosomal dynamics. We propose that r-protein networks synchronize ribosomal dynamics through an 'electrostatic domino' effect, extending the concept of allostery to the regulation of movements within supramolecular assemblies.
Affiliation(s)
- Youri Timsit
  - Aix Marseille Univ, Université de Toulon, CNRS, IRD, MIO UM110, 163 avenue de Luminy 13288 Marseille, France
  - Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara GOSEE, 3 Rue Michel-Ange, 75016 Paris, France
- Grégoire Sergeant-Perthuis
  - Laboratory of Computational and Quantitative Biology (LCQB), Sorbonne Université, 4 Place Jussieu, 75005 Paris, France
- Daniel Bennequin
  - Institut de Mathématiques de Jussieu - Paris Rive Gauche (IMJ-PRG), UMR 7586, CNRS, Université Paris Diderot, 8 Place Aurélie Nemours, 75013 Paris, France
7. Timsit Y. The Expanding Universe of Extensions and Tails: Ribosomal Proteins and Histones in RNA and DNA Complex Signaling and Dynamics. Genes (Basel) 2025; 16:45. [PMID: 39858592 PMCID: PMC11764897 DOI: 10.3390/genes16010045] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2024] [Revised: 12/26/2024] [Accepted: 12/27/2024] [Indexed: 01/27/2025] Open
Abstract
This short review bridges two biological fields: ribosomes and nucleosomes-two nucleoprotein assemblies that, along with many viruses, share proteins featuring long filamentous segments at their N- or C-termini. A central hypothesis is that these extensions and tails perform analogous functions in both systems. The evolution of these structures appears closely tied to the emergence of regulatory networks and signaling pathways, facilitating increasingly complex roles for ribosomes and nucleosomes alike. This review begins by summarizing the structures and functions of ribosomes and nucleosomes, followed by a detailed comparison highlighting their similarities and differences, particularly in light of recent findings on the roles of ribosomal proteins in signaling and ribosome dynamics. The analysis seeks to uncover whether these systems operate based on shared principles and mechanisms. The nucleosome-ribosome analogy may offer valuable insights into unresolved questions in both fields. For instance, new structural insights from ribosomes might shed light on potential motifs formed by histone tails. From an evolutionary perspective, this study revisits the origins of signaling and regulation in ancient nucleoprotein assemblies, suggesting that tails and extensions may represent remnants of the earliest network systems governing signaling and dynamic control.
Affiliation(s)
- Youri Timsit
  - Aix Marseille Université, Université de Toulon, CNRS, IRD, MIO UM110, 13288 Marseille, France
  - Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara GOSEE, Rue Michel-Ange, 75016 Paris, France
8. Jagadeesh J, Vembar SS. Evolution of sequence, structural and functional diversity of the ubiquitous DNA/RNA-binding Alba domain. Sci Rep 2024; 14:30363. [PMID: 39638848 PMCID: PMC11621453 DOI: 10.1038/s41598-024-79937-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2024] [Accepted: 11/13/2024] [Indexed: 12/07/2024] Open
Abstract
The DNA/RNA-binding Alba domain is prevalent across all kingdoms of life. First discovered in archaea, this protein domain has evolved from RNA- to DNA-binding, with a concomitant expansion in the range of cellular processes that it regulates. Despite its widespread presence, the full extent of its sequence, structural, and functional diversity remains unexplored. In this study, we employed iterative searches in PSI-BLAST to identify 15,161 unique Alba domain-containing proteins from the NCBI non-redundant protein database. Sequence similarity network (SSN) analysis clustered them into 13 distinct subgroups, including the archaeal Alba and eukaryotic Rpp20/Pop7 and Rpp25/Pop6 groups, as well as novel fungal and Plasmodium-specific Albas. Sequence and structural conservation analysis of the subgroups indicated high preservation of the dimer interface, with Alba domains from unicellular eukaryotes notably exhibiting structural deviations towards their C-terminal end. Finally, phylogenetic analysis, while supporting SSN clustering, revealed the evolutionary branchpoint at which the eukaryotic Rpp20- and Rpp25-like clades emerged from archaeal Albas, and the subsequent taxonomic lineage-based divergence within each clade. Taken together, this comprehensive analysis enhances our understanding of the evolutionary history of Alba domain-containing proteins across diverse organisms.
Affiliation(s)
- Jaiganesh Jagadeesh
  - Institute of Bioinformatics and Applied Biotechnology, Bengaluru, Karnataka, India
9. Caetano-Anollés G, Mughal F, Aziz MF, Caetano-Anollés K. Tracing the birth and intrinsic disorder of loops and domains in protein evolution. Biophys Rev 2024; 16:723-735. [PMID: 39830125 PMCID: PMC11735766 DOI: 10.1007/s12551-024-01251-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/11/2024] [Accepted: 10/29/2024] [Indexed: 01/22/2025] Open
Abstract
Protein loops and structural domains are building blocks of molecular structure. They hold evolutionary memory and are largely responsible for the many functions and processes that drive the living world. Here, we briefly review two decades of phylogenomic data-driven research focusing on the emergence and evolution of these elemental architects of protein structure. Phylogenetic trees of domains reconstructed from the proteomes of organisms belonging to all three superkingdoms and viruses were used to build chronological timelines describing the origin of each domain and its embedded loops at different levels of structural abstraction. These timelines consistently recovered six distinct evolutionary phases and a most parsimonious evolutionary progression of cellular life. The timelines also traced the birth of domain structures from loops, which allowed their growth to be modeled ab initio with AlphaFold2. Accretion decreased the disorder of the growing molecules, suggesting disorder is molecular size-dependent. A phylogenomic survey of disorder revealed that loops and domains evolved differently. Loops were highly disordered, disorder increased early in evolution, and ordered and moderately disordered structures were derived. Gradual replacement of loops with α-helix and β-strand bracing structures over time paved the way for the dominance of more disordered loop types. In contrast, ancient domains were ordered, with disorder evolving as a benefit acquired later in evolution. These evolutionary patterns explain inverse correlations between disorder and sequence length of loops and domains. Our findings provide a deep evolutionary view of the link between structure, disorder, flexibility, and function.
Affiliation(s)
- Gustavo Caetano-Anollés
  - Evolutionary Bioinformatics Laboratory, Department of Crop Sciences and Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
- Fizza Mughal
  - Evolutionary Bioinformatics Laboratory, Department of Crop Sciences and Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
- M. Fayez Aziz
  - Evolutionary Bioinformatics Laboratory, Department of Crop Sciences and Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
- Kelsey Caetano-Anollés
  - Evolutionary Bioinformatics Laboratory, Department of Crop Sciences and Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
  - Callout Biotech, Albuquerque, NM 87112 USA
10. Bagrova O, Lapshina K, Sidorova A, Shpigun D, Lutsenko A, Belova E. Secondary structure analysis of proteins within the same topology group. Biochem Biophys Res Commun 2024; 734:150613. [PMID: 39222577 DOI: 10.1016/j.bbrc.2024.150613] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2024] [Revised: 08/13/2024] [Accepted: 08/27/2024] [Indexed: 09/04/2024]
Abstract
The native conformation of a protein plays a decisive role in ensuring its functionality. It is established that the spatial structure of proteins may exhibit a greater degree of conservation than the corresponding amino acid sequences. This study aims to clarify structural distinctions between homologous and non-homologous proteins with identical topology. The analysis focuses on secondary structures with special emphasis on their fraction, distribution along the polypeptide chain, and chirality. Three different groups of proteins with identical topology were considered according to the CATH database: a homologous group of Globins, a group of Phycocyanins, which is often considered as a potential relative of globins, and a diverse assembly of other globin-like proteins. Some structural patterns in the distribution of secondary structure have been identified within Globins. A similar profile was observed in Phycocyanins, in contrast to the third group. In addition, a distinguishable structural motif, including structures such as 3₁₀-helix and irregular structure, has been found in both Globins and Phycocyanins, which can be proposed as an evolutionary imprint.
Affiliation(s)
- Olga Bagrova
  - Department of Biophysics, Faculty of Physics, Lomonosov Moscow State University, Moscow, 119991, Russia
- Ksenia Lapshina
  - Department of Biophysics, Faculty of Physics, Lomonosov Moscow State University, Moscow, 119991, Russia
- Alla Sidorova
  - Department of Biophysics, Faculty of Physics, Lomonosov Moscow State University, Moscow, 119991, Russia
- Denis Shpigun
  - Department of Biophysics, Faculty of Physics, Lomonosov Moscow State University, Moscow, 119991, Russia
- Aleksey Lutsenko
  - Department of Biophysics, Faculty of Physics, Lomonosov Moscow State University, Moscow, 119991, Russia
- Ekaterina Belova
  - Department of Biophysics, Faculty of Physics, Lomonosov Moscow State University, Moscow, 119991, Russia
11. Zhang Z, Wayment-Steele HK, Brixi G, Wang H, Kern D, Ovchinnikov S. Protein language models learn evolutionary statistics of interacting sequence motifs. Proc Natl Acad Sci U S A 2024; 121:e2406285121. [PMID: 39467119 PMCID: PMC11551344 DOI: 10.1073/pnas.2406285121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2024] [Accepted: 09/03/2024] [Indexed: 10/30/2024] Open
Abstract
Protein language models (pLMs) have emerged as potent tools for predicting and designing protein structure and function, and the degree to which these models fundamentally understand the inherent biophysics of protein structure stands as an open question. Motivated by a finding that pLM-based structure predictors erroneously predict nonphysical structures for protein isoforms, we investigated the nature of sequence context needed for contact predictions in the pLM Evolutionary Scale Modeling (ESM-2). We demonstrate by use of a "categorical Jacobian" calculation that ESM-2 stores statistics of coevolving residues, analogously to simpler modeling approaches like Markov Random Fields and Multivariate Gaussian models. We further investigated how ESM-2 "stores" information needed to predict contacts by comparing sequence masking strategies, and found that providing local windows of sequence information allowed ESM-2 to best recover predicted contacts. This suggests that pLMs predict contacts by storing motifs of pairwise contacts. Our investigation highlights the limitations of current pLMs and underscores the importance of understanding the underlying mechanisms of these models.
Affiliation(s)
- Zhidian Zhang
  - Harvard University, Cambridge, MA 02138
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139
  - Institute of Bioengineering, School of Life Sciences, Ecole polytechnique fédérale de Lausanne, Lausanne, VD 1015, Switzerland
- Hannah K. Wayment-Steele
  - HHMI, Brandeis University, Waltham, MA 02453
  - Department of Biochemistry, Brandeis University, Waltham, MA 02453
- Garyk Brixi
  - Harvard College, Harvard University, Cambridge, MA 02138
- Dorothee Kern
  - HHMI, Brandeis University, Waltham, MA 02453
  - Department of Biochemistry, Brandeis University, Waltham, MA 02453
- Sergey Ovchinnikov
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139
  - John Harvard Distinguished Science Fellowship, Harvard University, Cambridge, MA 02138
12. Koga N, Tatsumi-Koga R. Inventing Novel Protein Folds. J Mol Biol 2024; 436:168791. [PMID: 39260686 DOI: 10.1016/j.jmb.2024.168791] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2024] [Revised: 09/04/2024] [Accepted: 09/05/2024] [Indexed: 09/13/2024]
Abstract
The vastness of the unexplored protein fold universe remains a significant question. Through systematic de novo design of proteins with novel αβ-folds, we demonstrated that nature has only explored a tiny portion of the possible folds. Numerous possible protein folds are still untouched by nature. This review outlines this study and discusses the prospects for design of functional proteins with novel folds.
Affiliation(s)
- Nobuyasu Koga
  - Laboratory for Protein Design, Institute for Protein Research (IPR), Osaka University, Suita, Osaka 565-0871, Japan; Protein Design Group, Exploratory Research Center on Life and Living Systems (ExCELLS), National Institutes of Natural Sciences, Okazaki, Aichi 444-8585, Japan
- Rie Tatsumi-Koga
  - Laboratory for Protein Design, Institute for Protein Research (IPR), Osaka University, Suita, Osaka 565-0871, Japan
13. Min X, Liao Y, Chen X, Yang Q, Ying J, Zou J, Yang C, Zhang J, Ge S, Xia N. PB-GPT: An innovative GPT-based model for protein backbone generation. Structure 2024; 32:1820-1833.e5. [PMID: 39173620 DOI: 10.1016/j.str.2024.07.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2024] [Revised: 06/02/2024] [Accepted: 07/28/2024] [Indexed: 08/24/2024]
Abstract
With advanced computational methods, it is now feasible to modify or design proteins for specific functions, a process with significant implications for disease treatment and other medical applications. Protein structures and functions are intrinsically linked to their backbones, making the design of these backbones a pivotal aspect of protein engineering. In this study, we focus on the task of unconditionally generating protein backbones. By means of codebook quantization and compression dictionaries, we convert protein backbone structures into a distinctive coded language and propose a GPT-based protein backbone generation model, PB-GPT. To validate the generalization performance of the model, we trained and evaluated the model on both public datasets and small protein datasets. The results demonstrate that our model has the capability to unconditionally generate elaborate, highly realistic protein backbones with structural patterns resembling those of natural proteins, thus showcasing the significant potential of large language models in protein structure design.
Affiliation(s)
- Xiaoping Min
- School of Informatics, Xiamen University, No. 422 Siming South Rd, Xiamen 361005, China; National Institute of Diagnostics and Vaccine Development in Infectious Diseases, Xiamen University, State Key, No. 422 Siming South Rd, Xiamen 361005, China; State Key Laboratory of Vaccines for Infectious Diseases, Xiang An Biomedicine Laboratory, Xiamen University, No. 422 Siming South Rd, Xiamen 361005, China
- Yiyang Liao
- School of Informatics, Xiamen University, No. 422 Siming South Rd, Xiamen 361005, China; National Institute of Diagnostics and Vaccine Development in Infectious Diseases, Xiamen University, State Key, No. 422 Siming South Rd, Xiamen 361005, China; State Key Laboratory of Vaccines for Infectious Diseases, Xiang An Biomedicine Laboratory, Xiamen University, No. 422 Siming South Rd, Xiamen 361005, China
- Xiao Chen
- School of Informatics, Xiamen University, No. 422 Siming South Rd, Xiamen 361005, China
- Qianli Yang
- Institute of Artificial Intelligence, Xiamen University, No. 422 Siming South Rd, Xiamen 361005, China
- Junjie Ying
- Institute of Artificial Intelligence, Xiamen University, No. 422 Siming South Rd, Xiamen 361005, China
- Jiajun Zou
- School of Informatics, Xiamen University, No. 422 Siming South Rd, Xiamen 361005, China
- Chongzhou Yang
- National Institute of Diagnostics and Vaccine Development in Infectious Diseases, Xiamen University, State Key, No. 422 Siming South Rd, Xiamen 361005, China; Institute of Artificial Intelligence, Xiamen University, No. 422 Siming South Rd, Xiamen 361005, China
- Jun Zhang
- National Institute of Diagnostics and Vaccine Development in Infectious Diseases, Xiamen University, State Key, No. 422 Siming South Rd, Xiamen 361005, China; School of Public Health, Xiamen University, No. 422 Siming South Rd, Xiamen 361005, China; State Key Laboratory of Vaccines for Infectious Diseases, Xiang An Biomedicine Laboratory, Xiamen University, No. 422 Siming South Rd, Xiamen 361005, China
- Shengxiang Ge
- National Institute of Diagnostics and Vaccine Development in Infectious Diseases, Xiamen University, State Key, No. 422 Siming South Rd, Xiamen 361005, China; School of Public Health, Xiamen University, No. 422 Siming South Rd, Xiamen 361005, China; State Key Laboratory of Vaccines for Infectious Diseases, Xiang An Biomedicine Laboratory, Xiamen University, No. 422 Siming South Rd, Xiamen 361005, China.
- Ningshao Xia
- National Institute of Diagnostics and Vaccine Development in Infectious Diseases, Xiamen University, State Key, No. 422 Siming South Rd, Xiamen 361005, China; School of Public Health, Xiamen University, No. 422 Siming South Rd, Xiamen 361005, China; State Key Laboratory of Vaccines for Infectious Diseases, Xiang An Biomedicine Laboratory, Xiamen University, No. 422 Siming South Rd, Xiamen 361005, China.
14
Caetano-Anollés K, Aziz MF, Mughal F, Caetano-Anollés G. On Protein Loops, Prior Molecular States and Common Ancestors of Life. J Mol Evol 2024; 92:624-646. [PMID: 38652291 PMCID: PMC11458777 DOI: 10.1007/s00239-024-10167-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2024] [Accepted: 03/22/2024] [Indexed: 04/25/2024]
Abstract
The principle of continuity demands the existence of prior molecular states and common ancestors responsible for extant macromolecular structure. Here, we focus on the emergence and evolution of loop prototypes - the elemental architects of protein domain structure. Phylogenomic reconstruction spanning superkingdoms and viruses generated an evolutionary chronology of prototypes with six distinct evolutionary phases defining a most parsimonious evolutionary progression of cellular life. Each phase was marked by strategic prototype accumulation shaping the structures and functions of common ancestors. The last universal common ancestor (LUCA) of cells and viruses and the last universal cellular ancestor (LUCellA) defined stem lines that were structurally and functionally complex. The evolutionary saga highlighted transformative forces. LUCA lacked biosynthetic ribosomal machinery, while the pivotal LUCellA lacked essential DNA biosynthesis and modern transcription. Early proteins therefore relied on RNA for genetic information storage but appeared initially decoupled from it, hinting at transformative shifts of genetic processing. Urancestral loop types suggest advanced folding designs were present at an early evolutionary stage. An exploration of loop geometric properties revealed gradual replacement of prototypes with α-helix and β-strand bracing structures over time, paving the way for the dominance of other loop types. AlphaFold2-generated atomic models of prototype accretion described patterns of fold emergence. Our findings favor a 'processual' model of evolving stem lines aligned with Woese's vision of a communal world. This model prompts discussing the 'problem of ancestors' and the challenges that lie ahead for research in taxonomy, evolution and complexity.
Affiliation(s)
- Kelsey Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences and Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- Callout Biotech, Albuquerque, NM, 87112, USA
- M Fayez Aziz
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences and Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- Fizza Mughal
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences and Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences and Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA.
15
Draizen EJ, Veretnik S, Mura C, Bourne PE. Deep generative models of protein structure uncover distant relationships across a continuous fold space. Nat Commun 2024; 15:8094. [PMID: 39294145 PMCID: PMC11410806 DOI: 10.1038/s41467-024-52020-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2023] [Accepted: 08/23/2024] [Indexed: 09/20/2024] Open
Abstract
Our views of fold space implicitly rest upon many assumptions that impact how we analyze, interpret and understand protein structure, function and evolution. For instance, is there an optimal granularity in viewing protein structural similarities (e.g., architecture, topology or some other level)? Similarly, the discrete/continuous dichotomy of fold space is central, but remains unresolved. Discrete views of fold space bin similar folds into distinct, non-overlapping groups; unfortunately, such binning can miss remote relationships. While hierarchical systems like CATH are indispensable resources, less heuristic and more conceptually flexible approaches could enable more nuanced explorations of fold space. Building upon an Urfold model of protein structure, here we present a deep generative modeling framework, termed DeepUrfold, for analyzing protein relationships at scale. DeepUrfold's learned embeddings occupy high-dimensional latent spaces that can be distilled for a given protein in terms of an amalgamated representation uniting sequence, structure and biophysical properties. This approach is structure-guided, versus being purely structure-based, and DeepUrfold learns representations that, in a sense, define superfamilies. Deploying DeepUrfold with CATH reveals evolutionarily-remote relationships that evade existing methodologies, and suggests a mostly-continuous view of fold space, a view that extends beyond simple geometric similarity, towards the realm of integrated sequence ↔ structure ↔ function properties.
Affiliation(s)
- Eli J Draizen
- School of Data Science, University of Virginia, Charlottesville, VA, USA.
- Department of Biomedical Engineering, University of Virginia, Charlottesville, VA, USA.
- Stella Veretnik
- School of Data Science, University of Virginia, Charlottesville, VA, USA
- Cameron Mura
- School of Data Science, University of Virginia, Charlottesville, VA, USA.
- Department of Biomedical Engineering, University of Virginia, Charlottesville, VA, USA.
- Philip E Bourne
- School of Data Science, University of Virginia, Charlottesville, VA, USA
- Department of Biomedical Engineering, University of Virginia, Charlottesville, VA, USA
16
Kutlu Y, Axel G, Kolodny R, Ben-Tal N, Haliloglu T. Reused Protein Segments Linked to Functional Dynamics. Mol Biol Evol 2024; 41:msae184. [PMID: 39226145 PMCID: PMC11412252 DOI: 10.1093/molbev/msae184] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2024] [Revised: 08/10/2024] [Accepted: 08/26/2024] [Indexed: 09/05/2024] Open
Abstract
Protein space is characterized by extensive recurrence, or "reuse," of parts, suggesting that new proteins and domains can evolve by mixing-and-matching of existing segments. From an evolutionary perspective, for a given combination to persist, the protein segments should presumably not only match geometrically but also dynamically communicate with each other to allow concerted motions that are key to function. Evidence from protein space supports the premise that domains indeed combine in this manner; we explore whether a similar phenomenon can be observed at the sub-domain level. To this end, we use Gaussian Network Models (GNMs) to calculate the so-called soft modes, or low-frequency modes of motion for a dataset of 150 protein domains. Modes of motion can be used to decompose a domain into segments of consecutive amino acids that we call "dynamic elements", each of which belongs to one of two parts that move in opposite senses. We find that, in many cases, the dynamic elements, detected based on GNM analysis, correspond to established "themes": Sub-domain-level segments that have been shown to recur in protein space, and which were detected in previous research using sequence similarity alone (i.e. completely independently of the GNM analysis). This statistically significant correlation hints at the importance of dynamics in evolution. Overall, the results are consistent with an evolutionary scenario where proteins have emerged from themes that need to match each other both geometrically and dynamically, e.g. to facilitate allosteric regulation.
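The decomposition described in this abstract can be illustrated with a minimal sketch of a Gaussian Network Model. It is not the authors' pipeline: the 7.0 Å contact cutoff, the use of only the single slowest mode, and the function names `kirchhoff`, `slowest_mode`, and `dynamic_elements` are assumptions made for illustration. The sketch builds a contact (Kirchhoff) matrix from Cα-like coordinates, takes the lowest-frequency non-trivial eigenvector, and groups consecutive residues that share a sign into segments.

```python
import numpy as np


def kirchhoff(coords, cutoff=7.0):
    """Build the GNM Kirchhoff (connectivity) matrix.

    coords: (N, 3) array-like of residue positions, e.g. C-alpha atoms.
    cutoff: contact distance in angstroms (an assumed, commonly used value).
    """
    coords = np.asarray(coords, dtype=float)
    # Pairwise distances between all residues.
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    # Two distinct residues are in contact when they lie within the cutoff.
    contact = (dist <= cutoff) & ~np.eye(len(coords), dtype=bool)
    gamma = -contact.astype(float)
    # Diagonal entries hold the number of contacts of each residue.
    np.fill_diagonal(gamma, contact.sum(axis=1))
    return gamma


def slowest_mode(gamma):
    """Return the softest internal mode of a connected network.

    eigh sorts eigenvalues in ascending order; index 0 is the trivial
    zero mode, so index 1 is the lowest-frequency internal mode.
    """
    _, vecs = np.linalg.eigh(gamma)
    return vecs[:, 1]


def dynamic_elements(mode):
    """Split residues into runs of consecutive residues with the same sign.

    Returns a list of (first_index, last_index, sign_label) tuples; the two
    sign labels mark the two parts that move in opposite senses.
    """
    signs = np.asarray(mode) >= 0
    segments = []
    start = 0
    for i in range(1, len(signs) + 1):
        if i == len(signs) or signs[i] != signs[start]:
            segments.append((start, i - 1, "+" if signs[start] else "-"))
            start = i
    return segments


if __name__ == "__main__":
    # Toy input: a straight 20-residue chain with 3.8 angstrom spacing,
    # so only sequence neighbours are in contact.
    chain = [(3.8 * i, 0.0, 0.0) for i in range(20)]
    for first, last, sign in dynamic_elements(slowest_mode(kirchhoff(chain))):
        print(f"residues {first}-{last}: part {sign}")
```

On the toy chain the slowest mode splits the residues into two halves, 0-9 and 10-19; which half is labelled "+" is arbitrary because an eigenvector's overall sign is not defined. For real structures, a dedicated library and several soft modes would normally be used.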
Affiliation(s)
- Yiğit Kutlu
- Department of Chemical Engineering and Polymer Research Center, Bogazici University, Istanbul, Turkey
- Gabriel Axel
- School of Neurobiology, Biochemistry & Biophysics, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Rachel Kolodny
- Department of Computer Science, University of Haifa, Haifa, Israel
- Nir Ben-Tal
- School of Neurobiology, Biochemistry & Biophysics, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Turkan Haliloglu
- Department of Chemical Engineering and Polymer Research Center, Bogazici University, Istanbul, Turkey
17
Rossetto D, Cvjetan N, Walde P, Mansy SS. Protocellular Heme and Iron-Sulfur Clusters. Acc Chem Res 2024; 57:2293-2302. [PMID: 39099316 PMCID: PMC11339926 DOI: 10.1021/acs.accounts.4c00254] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2024] [Revised: 07/10/2024] [Accepted: 07/25/2024] [Indexed: 08/06/2024]
Abstract
Central to the quest of understanding the emergence of life is to uncover the role of metals, particularly iron, in shaping prebiotic chemistry. Iron, as the most abundant of the accessible transition metals on the prebiotic Earth, played a pivotal role in early biochemical processes and continues to be indispensable to modern biology. Here, we discuss our recent contributions to probing the plausibility of prebiotic complexes with iron, including heme and iron-sulfur clusters, in mediating chemistry beneficial to a protocell. Laboratory experiments and spectroscopic findings suggest plausible pathways, often facilitated by UV light, for the synthesis of heme and iron-sulfur clusters. Once formed, heme displays catalytic, peroxidase-like activity when complexed with amphiphiles. This activity could have been beneficial in two ways. First, heme could have catalytically removed a molecule (H2O2) that could have had degradative effects on a protocell. Second, heme could have helped in the synthesis of the building blocks of life by coupling the reduction of H2O2 with the oxidation of organic substrates. The necessity of amphiphiles to avoid the formation of inactive complexes of heme is telling, as the modern-day electron transport chain possesses heme embedded within a lipid membrane. Conversely, prebiotic iron-sulfur peptides have yet to be reported to partition into lipid membranes, nor have simple iron-sulfur peptides been found to be capable of participating in the synthesis of organic molecules. Instead, iron-sulfur peptides span a wide range of reduction potentials complementary to the reduction potentials of hemes. The reduction potential of iron-sulfur peptides can be tuned by the type of iron-sulfur cluster formed, e.g., [2Fe-2S] versus [4Fe-4S], or by the substitution of ligands to the metal center. 
Since iron-sulfur clusters easily form upon stochastic encounters between iron ions, hydrosulfide, and small organic molecules possessing a thiolate, including peptides, the likelihood of soluble iron-sulfur clusters seems to be high. What remains challenging to determine is if iron-sulfur peptides participated in early prebiotic chemistry or were recruited later when protocellular membranes evolved that were compatible with the exploitation of electron transfer for the storage of energy as a proton gradient. This problem mirrors in some ways the difficulty in deciphering the origins of metabolism as a whole. Chemistry that resembles some facets of extant metabolism must have transpired on the prebiotic Earth, but there are few clues as to how and when such chemistry was harnessed to support a (proto)cell. Ultimately, unraveling the roles of hemes and iron-sulfur clusters in prebiotic chemistry promises to deepen our understanding of the origins of life on Earth and aids the search for life elsewhere in the universe.
Affiliation(s)
- Daniele Rossetto
- Department of Chemistry, University of Alberta, 11227 Saskatchewan Drive, Edmonton, Alberta T6G 2G2, Canada
- D-CIBIO, University of Trento, via Sommarive 9, Trento 38123, Italy
- Nemanja Cvjetan
- Department of Chemistry, University of Alberta, 11227 Saskatchewan Drive, Edmonton, Alberta T6G 2G2, Canada
- Department of Materials, ETH Zürich, Leopold-Ruzicka-Weg 4, Zürich 8093, Switzerland
- Peter Walde
- Department of Materials, ETH Zürich, Leopold-Ruzicka-Weg 4, Zürich 8093, Switzerland
- Sheref S. Mansy
- Department of Chemistry, University of Alberta, 11227 Saskatchewan Drive, Edmonton, Alberta T6G 2G2, Canada
- D-CIBIO, University of Trento, via Sommarive 9, Trento 38123, Italy
18
Tanoz I, Timsit Y. Protein Fold Usages in Ribosomes: Another Glance to the Past. Int J Mol Sci 2024; 25:8806. [PMID: 39201491 PMCID: PMC11354259 DOI: 10.3390/ijms25168806] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2024] [Revised: 08/07/2024] [Accepted: 08/08/2024] [Indexed: 09/02/2024] Open
Abstract
The analysis of protein fold usage, similar to codon usage, offers profound insights into the evolution of biological systems and the origins of modern proteomes. While previous studies have examined fold distribution in modern genomes, our study focuses on the comparative distribution and usage of protein folds in ribosomes across bacteria, archaea, and eukaryotes. We identify the prevalence of certain 'super-ribosome folds,' such as the OB fold in bacteria and the SH3 domain in archaea and eukaryotes. The observed protein fold distribution in the ribosomes announces the future power-law distribution where only a few folds are highly prevalent, and most are rare. Additionally, we highlight the presence of three copies of proto-Rossmann folds in ribosomes across all kingdoms, showing its ancient and fundamental role in ribosomal structure and function. Our study also explores early mechanisms of molecular convergence, where different protein folds bind equivalent ribosomal RNA structures in ribosomes across different kingdoms. This comparative analysis enhances our understanding of ribosomal evolution, particularly the distinct evolutionary paths of the large and small subunits, and underscores the complex interplay between RNA and protein components in the transition from the RNA world to modern cellular life. Transcending the concept of folds also makes it possible to group a large number of ribosomal proteins into five categories of urfolds or metafolds, which could attest to their ancestral character and common origins. This work also demonstrates that the gradual acquisition of extensions by simple but ordered folds constitutes an inexorable evolutionary mechanism. This observation supports the idea that simple but structured ribosomal proteins preceded the development of their disordered extensions.
Affiliation(s)
- Inzhu Tanoz
- Aix-Marseille Université, Université de Toulon, IRD, CNRS, Mediterranean Institute of Oceanography (MIO), UM 110, 13288 Marseille, France
- Youri Timsit
- Aix-Marseille Université, Université de Toulon, IRD, CNRS, Mediterranean Institute of Oceanography (MIO), UM 110, 13288 Marseille, France
- Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara GOSEE, 3 Rue Michel-Ange, 75016 Paris, France
19
Hlouchová K. Peptides En Route from Prebiotic to Biotic Catalysis. Acc Chem Res 2024; 57:2027-2037. [PMID: 39016062 PMCID: PMC11308367 DOI: 10.1021/acs.accounts.4c00137] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Revised: 05/24/2024] [Accepted: 07/03/2024] [Indexed: 07/18/2024]
Abstract
In the quest to understand prebiotic catalysis, different molecular entities, mainly minerals, metal ions, organic cofactors, and ribozymes, have been implied as key players. Of these, inorganic and organic cofactors have gained attention for their ability to catalyze a wide array of reactions central to modern metabolism and frequently participate in these reactions within modern enzymes. Nevertheless, bridging the gap between prebiotic and modern metabolism remains a fundamental question in the origins of life. In this Account, peptides are investigated as a potential bridge linking prebiotic catalysis by minerals/cofactors to enzymes that dominate modern life's chemical reactions. Before ribosomal synthesis emerged, peptides of random sequences were plausible on early Earth. This was made possible by different sources of amino acid delivery and synthesis, as well as their condensation under a variety of conditions. Early peptides and proteins probably exhibited distinct compositions, enriched in small aliphatic and acidic residues. An increase in abundance of amino acids with larger side chains and canonical basic groups was most likely dependent on the emergence of their more challenging (bio)synthesis. Pressing questions thus arise: how did this composition influence the early peptide properties, and to what extent could they contribute to early metabolism? Recent research from our group and colleagues shows that highly acidic peptides/proteins comprising only the presumably "early" amino acids are in fact competent at secondary structure formation and even possess adaptive folding characteristics such as spontaneous refoldability and chaperone independence to achieve soluble structures. Moreover, we showed that highly acidic proteins of presumably "early" composition can still bind RNA by utilizing metal ions as cofactors to bridge carboxylate and phosphoester functional groups. 
And finally, ancient organic cofactors were shown to be capable of binding to sequences from amino acids considered prebiotically plausible, supporting their folding properties and providing functional groups, which would nominate them as catalytic hubs of great prebiotic relevance. These findings underscore the biochemical plausibility of an early peptide/protein world devoid of more complex amino acids yet collaborating with other catalytic species. Drawing from the mechanistic properties of protein-cofactor catalysis, it is speculated here that the early peptide/protein-cofactor ensemble could facilitate a similar range of chemical reactions, albeit with lower catalytic rates. This hypothesis invites a systematic experimental test. Nonetheless, this Account does not exclude other scenarios of prebiotic-to-biotic catalysis or prioritize any specific pathways of prebiotic syntheses. The objective is to examine peptide availability, composition, and functional potential among the various factors involved in the emergence of early life.
Affiliation(s)
- Klára Hlouchová
- Department of Cell Biology, Faculty of Science, Charles University, Prague 12800, Czech Republic
- Institute of Organic Chemistry and Biochemistry, Czech Academy of Sciences, Prague 16610, Czech Republic
20
Middendorf L, Ravi Iyengar B, Eicholt LA. Sequence, Structure, and Functional Space of Drosophila De Novo Proteins. Genome Biol Evol 2024; 16:evae176. [PMID: 39212966 PMCID: PMC11363682 DOI: 10.1093/gbe/evae176] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/29/2024] [Indexed: 09/04/2024] Open
Abstract
During de novo emergence, new protein coding genes emerge from previously nongenic sequences. The de novo proteins they encode are dissimilar in composition and predicted biochemical properties to conserved proteins. However, functional de novo proteins indeed exist. Both identification of functional de novo proteins and their structural characterization are experimentally laborious. To identify functional and structured de novo proteins in silico, we applied recently developed machine learning based tools and found that most de novo proteins are indeed different from conserved proteins both in their structure and sequence. However, some de novo proteins are predicted to adopt known protein folds, participate in cellular reactions, and to form biomolecular condensates. Apart from broadening our understanding of de novo protein evolution, our study also provides a large set of testable hypotheses for focused experimental studies on structure and function of de novo proteins in Drosophila.
Affiliation(s)
- Lasse Middendorf
- Institute for Evolution and Biodiversity, University of Muenster, Huefferstrasse 1, 48149 Muenster, Germany
- Bharat Ravi Iyengar
- Institute for Evolution and Biodiversity, University of Muenster, Huefferstrasse 1, 48149 Muenster, Germany
- Lars A Eicholt
- Institute for Evolution and Biodiversity, University of Muenster, Huefferstrasse 1, 48149 Muenster, Germany
21
Kocher CD, Dill KA. The prebiotic emergence of biological evolution. R Soc Open Sci 2024; 11:240431. [PMID: 39050718 PMCID: PMC11265915 DOI: 10.1098/rsos.240431] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2024] [Accepted: 05/10/2024] [Indexed: 07/27/2024]
Abstract
The origin of life must have been preceded by Darwin-like evolutionary dynamics that could propagate it. How did that adaptive dynamics arise? And from what prebiotic molecules? Using evolutionary invasion analysis, we develop a universal framework for describing any origin story for evolutionary dynamics. We find that cooperative autocatalysts, i.e. autocatalysts whose per-unit reproductive rate grows as their population increases, have the special property of being able to cross a barrier that separates their initial degradation-dominated state from a growth-dominated state with evolutionary dynamics. For some model parameters, this leap to persistent propagation is likely, not rare. We apply this analysis to the Foldcat Mechanism, wherein peptides fold and help catalyse the elongation of each other. Foldcats are found to have cooperative autocatalysis and be capable of emergent evolutionary dynamics.
Affiliation(s)
- Charles D. Kocher
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY 11794, USA
- Department of Physics and Astronomy, Stony Brook University, Stony Brook, NY 11794, USA
- Ken A. Dill
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY 11794, USA
- Department of Physics and Astronomy, Stony Brook University, Stony Brook, NY 11794, USA
- Department of Chemistry, Stony Brook University, Stony Brook, NY 11794, USA
22
Caetano-Anollés G. Are Viruses Taxonomic Units? A Protein Domain and Loop-Centric Phylogenomic Assessment. Viruses 2024; 16:1061. [PMID: 39066224 PMCID: PMC11281659 DOI: 10.3390/v16071061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2024] [Revised: 06/26/2024] [Accepted: 06/27/2024] [Indexed: 07/28/2024] Open
Abstract
Virus taxonomy uses a Linnaean-like subsumption hierarchy to classify viruses into taxonomic units at species and higher rank levels. Virus species are considered monophyletic groups of mobile genetic elements (MGEs) often delimited by the phylogenetic analysis of aligned genomic or metagenomic sequences. Taxonomic units are assumed to be independent organizational, functional and evolutionary units that follow a 'natural history' rationale. Here, I use phylogenomic and other arguments to show that viruses are not self-standing genetically-driven systems acting as evolutionary units. Instead, they are crucial components of holobionts, which are units of biological organization that dynamically integrate the genetics, epigenetic, physiological and functional properties of their co-evolving members. Remarkably, phylogenomic analyses show that viruses share protein domains and loops with cells throughout history via massive processes of reticulate evolution, helping spread evolutionary innovations across a wider taxonomic spectrum. Thus, viruses are not merely MGEs or microbes. Instead, their genomes and proteomes conduct cellularly integrated processes akin to those cataloged by the GO Consortium. This prompts the generation of compositional hierarchies that replace the 'is-a-kind-of' by a 'is-a-part-of' logic to better describe the mereology of integrated cellular and viral makeup. My analysis demands a new paradigm that integrates virus taxonomy into a modern evolutionarily centered taxonomy of organisms.
Affiliation(s)
- Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, C. R. Woese Institute for Genomic Biology, University of Illinois, Urbana, IL 61801, USA
23
Toledo-Patiño S, Goetz SK, Shanmugaratnam S, Höcker B, Farías-Rico JA. Molecular handcraft of a well-folded protein chimera. FEBS Lett 2024; 598:1375-1386. [PMID: 38508768 DOI: 10.1002/1873-3468.14856] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Revised: 02/11/2024] [Accepted: 02/12/2024] [Indexed: 03/22/2024]
Abstract
Modular assembly is a compelling pathway to create new proteins, a concept supported by protein engineering and millennia of evolution. Natural evolution provided a repository of building blocks, known as domains, which trace back to even shorter segments that underwent numerous 'copy-paste' processes culminating in the scaffolds we see today. Utilizing the subdomain-database Fuzzle, we constructed a fold-chimera by integrating a flavodoxin-like fragment into a periplasmic binding protein. This chimera is well-folded and a crystal structure reveals stable interfaces between the fragments. These findings demonstrate the adaptability of α/β-proteins and offer a stepping stone for optimization. By emphasizing the practicality of fragment databases, our work pioneers new pathways in protein engineering. Ultimately, the results substantiate the conjecture that periplasmic binding proteins originated from a flavodoxin-like ancestor.
Affiliation(s)
- Saacnicteh Toledo-Patiño
- Max Planck Institute for Developmental Biology, Tübingen, Germany
- Okinawa Institute of Science and Technology Graduate University, Japan
- Sooruban Shanmugaratnam
- Max Planck Institute for Developmental Biology, Tübingen, Germany
- Department of Biochemistry, University of Bayreuth, Germany
- Birte Höcker
- Max Planck Institute for Developmental Biology, Tübingen, Germany
- Department of Biochemistry, University of Bayreuth, Germany
- José Arcadio Farías-Rico
- Max Planck Institute for Developmental Biology, Tübingen, Germany
- Synthetic Biology Program, Center for Genome Sciences, National Autonomous University of Mexico, Cuernavaca, Mexico
24
Gaschignard G, Millet M, Bruley A, Benzerara K, Dezi M, Skouri-Panet F, Duprat E, Callebaut I. AlphaFold2-guided description of CoBaHMA, a novel family of bacterial domains within the heavy-metal-associated superfamily. Proteins 2024; 92:776-794. [PMID: 38258321 DOI: 10.1002/prot.26668] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Revised: 12/22/2023] [Accepted: 01/01/2024] [Indexed: 01/24/2024]
Abstract
Three-dimensional (3D) structure information, now available at the proteome scale, may facilitate the detection of remote evolutionary relationships in protein superfamilies. Here, we illustrate this with the identification of a novel family of protein domains related to the ferredoxin-like superfold, by combining (i) transitive sequence similarity searches, (ii) clustering approaches, and (iii) the use of AlphaFold2 3D structure models. Domains of this family were initially identified in relation with the intracellular biomineralization of calcium carbonates by Cyanobacteria. They are part of the large heavy-metal-associated (HMA) superfamily, departing from the latter by specific sequence and structural features. In particular, most of them share conserved basic amino acids (hence their name CoBaHMA for Conserved Basic residues HMA), forming a positively charged surface, which is likely to interact with anionic partners. CoBaHMA domains are found in diverse modular organizations in bacteria, existing in the form of monodomain proteins or as part of larger proteins, some of which are membrane proteins involved in transport or lipid metabolism. This suggests that the CoBaHMA domains may exert a regulatory function, involving interactions with anionic lipids. This hypothesis might have a particular resonance in the context of the compartmentalization observed for cyanobacterial intracellular calcium carbonates.
Affiliation(s)
- Geoffroy Gaschignard
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, Paris, France
- Maxime Millet
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, Paris, France
- Apolline Bruley
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, Paris, France
- Karim Benzerara
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, Paris, France
- Manuela Dezi
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, Paris, France
- Feriel Skouri-Panet
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, Paris, France
- Elodie Duprat
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, Paris, France
- Isabelle Callebaut
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, Paris, France
|
25
|
Zheng Z, Goncearenco A, Berezovsky IN. Back in time to the Gly-rich prototype of the phosphate binding elementary function. Curr Res Struct Biol 2024; 7:100142. [PMID: 38655428 PMCID: PMC11035071 DOI: 10.1016/j.crstbi.2024.100142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2023] [Revised: 03/31/2024] [Accepted: 04/03/2024] [Indexed: 04/26/2024] Open
Abstract
Binding of nucleotides and their derivatives is one of the most ancient elementary functions dating back to the Origin of Life. We review here the works considering one of the key elements in binding of (di)nucleotide-containing ligands - phosphate binding. We start from a brief discussion of major participants, conditions, and events in prebiotic evolution that resulted in the Origin of Life. Tracing back to the basic functions, including metal and phosphate binding, and, potentially, formation of primitive protein-protein interactions, we focus here on the phosphate binding. Critically assessing works on the structural, functional, and evolutionary aspects of phosphate binding, we perform a simple computational experiment reconstructing its most ancient and generic sequence prototype. The profiles of the phosphate binding signatures have been derived in form of position-specific scoring matrices (PSSMs), their peculiarities depending on the type of the ligands have been analyzed, and evolutionary connections between them have been delineated. Then, the apparent prototype that gave rise to all relevant phosphate-binding signatures had also been reconstructed. We show that two major signatures of the phosphate binding that discriminate between the binding of dinucleotide- and nucleotide-containing ligands are GxGxxG and GxxGxG, respectively. It appears that the signature archetypal for dinucleotide-containing ligands is more generic, and it can frequently bind phosphate groups in nucleotide-containing ligands as well. The reconstructed prototype's key signature GxGGxG underlies the role of glycine residues in providing flexibility and interactions necessary for binding the phosphate groups. The prototype also contains other ancient amino acids, valine, and alanine, showing versatility towards evolutionary design and functional diversification.
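The glycine-rich signatures named in this abstract can be read as simple sequence patterns in which "x" stands for any residue. The following is a minimal sketch of such a scan, not the authors' PSSM-based method: the dictionary labels, the function names `signature_to_regex` and `scan`, and the toy sequences are all assumptions made for illustration.

```python
import re

# Signatures quoted in the abstract; "x" means any residue.
# The labels are illustrative names, not terms from the paper.
SIGNATURES = {
    "dinucleotide (GxGxxG)": "GxGxxG",
    "nucleotide (GxxGxG)": "GxxGxG",
    "prototype (GxGGxG)": "GxGGxG",
}


def signature_to_regex(signature):
    """Turn a signature such as 'GxGxxG' into a compiled regular expression."""
    pattern = "".join("." if ch == "x" else ch for ch in signature)
    # A lookahead lets overlapping occurrences be reported as separate hits.
    return re.compile(f"(?=({pattern}))")


def scan(sequence):
    """Return {label: [(start, matched_text), ...]} with 0-based start positions."""
    hits = {}
    for label, signature in SIGNATURES.items():
        regex = signature_to_regex(signature)
        hits[label] = [(m.start(), m.group(1)) for m in regex.finditer(sequence)]
    return hits


if __name__ == "__main__":
    # Toy sequences, not taken from any real protein.
    for seq in ("MKGAGVVGTT", "GAGGSG"):
        print(seq, scan(seq))
```

A plain pattern match only flags candidate sites; a position-specific scoring matrix, as used in the reviewed work, would additionally weight the residues at the "x" positions.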
Affiliation(s)
- Zejun Zheng
  - Bioinformatics Institute, Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, 138671, Singapore
- Igor N. Berezovsky
  - Bioinformatics Institute, Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, 138671, Singapore
  - Department of Biological Sciences (DBS), National University of Singapore (NUS), 8 Medical Drive, 117579, Singapore
|
26
|
Cuevas-Zuviría B, Garcia AK, Rivier AJ, Rucker HR, Carruthers BM, Kaçar B. Emergence of an Orphan Nitrogenase Protein Following Atmospheric Oxygenation. Mol Biol Evol 2024; 41:msae067. [PMID: 38526235 PMCID: PMC11018506 DOI: 10.1093/molbev/msae067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Revised: 03/06/2024] [Accepted: 03/19/2024] [Indexed: 03/26/2024] Open
Abstract
Molecular innovations within key metabolisms can have profound impacts on element cycling and ecological distribution. Yet, much of the molecular foundations of early evolved enzymes and metabolisms are unknown. Here, we bring one such mystery to relief by probing the birth and evolution of the G-subunit protein, an integral component of certain members of the nitrogenase family, the only enzymes capable of biological nitrogen fixation. The G-subunit is a Paleoproterozoic-age orphan protein that appears more than 1 billion years after the origin of nitrogenases. We show that the G-subunit arose with novel nitrogenase metal dependence and the ecological expansion of nitrogen-fixing microbes following the transition in environmental metal availabilities and atmospheric oxygenation that began ∼2.5 billion years ago. We identify molecular features that suggest early G-subunit proteins mediated cofactor or protein interactions required for novel metal dependency, priming ancient nitrogenases and their hosts to exploit these newly diversified geochemical environments. We further examined the degree of functional specialization in G-subunit evolution with extant and ancestral homologs using laboratory reconstruction experiments. Our results indicate that permanent recruitment of the orphan protein depended on the prior establishment of conserved molecular features and showcase how contingent evolutionary novelties might shape ecologically important microbial innovations.
Affiliation(s)
- Amanda K Garcia
  - Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, USA
- Alex J Rivier
  - Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, USA
- Holly R Rucker
  - Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, USA
- Brooke M Carruthers
  - Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, USA
- Betül Kaçar
  - Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, USA
|
27
|
Mustieles-del-Ser P, Ruano-Gallego D, Parro V. Immunoanalytical Detection of Conserved Peptides: Refining the Universe of Biomarker Targets in Planetary Exploration. Anal Chem 2024; 96:4764-4773. [PMID: 38484023 PMCID: PMC10975014 DOI: 10.1021/acs.analchem.3c04165] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Revised: 01/14/2024] [Accepted: 02/19/2024] [Indexed: 03/27/2024]
Abstract
Ancient peptides are remnants of early biochemistry that continue to play pivotal roles in current proteins. They are simple molecules yet complex enough to exhibit independent functions, being products of an evolved biochemistry at the interface of life and nonlife. Their adsorption to minerals may contribute to their stabilization and preservation over time. To investigate the feasibility of conserved peptide sequences and structures as target biomarkers for the search for life on Mars or other planetary bodies, we conducted a bioinformatics selection of well-conserved ancient peptides and produced polyclonal antibodies for their detection using fluorescence microarray immunoassays. Additionally, we explored how adsorbing peptides to Mars-representative minerals to form organomineral complexes could affect their immunological detection. The results demonstrated that the selected peptides exhibited autonomous folding, with some of them regaining their structure, even after denaturation. Furthermore, their cognate antibodies detected their conformational features regardless of amino acid sequences, thereby broadening the spectrum of target peptide sequences. While certain antibodies displayed unspecific binding to bare minerals, we validated that peptide-mineral complexes can be detected using sandwich immunoassays, as confirmed through desorption and competitive assays. Consequently, we conclude that the diversity of peptide sequences and structures suitable for use as target biomarkers in astrobiology can be constrained to a few well conserved sets, and they can be detected even if they are adsorbed in organomineral complexes.
Affiliation(s)
- Pedro Mustieles-del-Ser
  - Centro de Astrobiología (CAB) INTA-CSIC, Torrejón de Ardoz 28850, Spain
  - Departments of Physics and Mathematics, and Automatics, Universidad de Alcalá (UAH), Alcalá de Henares 28805, Spain
- Víctor Parro
  - Centro de Astrobiología (CAB) INTA-CSIC, Torrejón de Ardoz 28850, Spain
|
28
|
Ye W, Krishna Behra PR, Dyrhage K, Seeger C, Joiner JD, Karlsson E, Andersson E, Chi CN, Andersson SGE, Jemth P. Folded Alpha Helical Putative New Proteins from Apilactobacillus kunkeei. J Mol Biol 2024; 436:168490. [PMID: 38355092 DOI: 10.1016/j.jmb.2024.168490] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Revised: 02/07/2024] [Accepted: 02/08/2024] [Indexed: 02/16/2024]
Abstract
The emergence of new proteins is a central question in biology. Most tertiary protein folds known to date appear to have an ancient origin, but it is clear from bioinformatic analyses that new proteins continuously emerge in all organismal groups. However, there is a paucity of experimental data on new proteins regarding their structure and biophysical properties. We performed a detailed phylogenetic analysis and identified 48 putative open reading frames in the honeybee-associated bacterium Apilactobacillus kunkeei for which no or few homologs could be identified in closely-related species, suggesting that they could be relatively new on an evolutionary time scale and represent recently evolved proteins. Using circular dichroism-, fluorescence- and nuclear magnetic resonance (NMR) spectroscopy we investigated six of these proteins and show that they are not intrinsically disordered, but populate alpha-helical dominated folded states with relatively low thermodynamic stability (0-3 kcal/mol). The NMR and biophysical data demonstrate that small new proteins readily adopt simple folded conformations suggesting that more complex tertiary structures can be continuously re-invented during evolution by fusion of such simple secondary structure elements. These findings have implications for the general view on protein evolution, where de novo emergence of folded proteins may be a common event.
Affiliation(s)
- Weihua Ye
  - Department of Medical Biochemistry and Microbiology, Uppsala University, BMC Box 582, 75123 Uppsala, Sweden
- Phani Rama Krishna Behra
  - Department of Molecular Evolution, Cell and Molecular Biology, Biomedical Centre, Science for Life Laboratory, Uppsala University, 75236 Uppsala, Sweden
- Karl Dyrhage
  - Department of Molecular Evolution, Cell and Molecular Biology, Biomedical Centre, Science for Life Laboratory, Uppsala University, 75236 Uppsala, Sweden
- Christian Seeger
  - Department of Molecular Evolution, Cell and Molecular Biology, Biomedical Centre, Science for Life Laboratory, Uppsala University, 75236 Uppsala, Sweden
- Joe D Joiner
  - Department of Medical Biochemistry and Microbiology, Uppsala University, BMC Box 582, 75123 Uppsala, Sweden
- Elin Karlsson
  - Department of Medical Biochemistry and Microbiology, Uppsala University, BMC Box 582, 75123 Uppsala, Sweden
- Eva Andersson
  - Department of Medical Biochemistry and Microbiology, Uppsala University, BMC Box 582, 75123 Uppsala, Sweden
- Celestine N Chi
  - Department of Medical Biochemistry and Microbiology, Uppsala University, BMC Box 582, 75123 Uppsala, Sweden
- Siv G E Andersson
  - Department of Molecular Evolution, Cell and Molecular Biology, Biomedical Centre, Science for Life Laboratory, Uppsala University, 75236 Uppsala, Sweden
- Per Jemth
  - Department of Medical Biochemistry and Microbiology, Uppsala University, BMC Box 582, 75123 Uppsala, Sweden
|
29
|
McGuinness KN, Fehon N, Feehan R, Miller M, Mutter AC, Rybak LA, Nam J, AbuSalim JE, Atkinson JT, Heidari H, Losada N, Kim JD, Koder RL, Lu Y, Silberg JJ, Slusky JSG, Falkowski PG, Nanda V. The energetics and evolution of oxidoreductases in deep time. Proteins 2024; 92:52-59. [PMID: 37596815 DOI: 10.1002/prot.26563] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Accepted: 07/06/2023] [Indexed: 08/20/2023]
Abstract
The core metabolic reactions of life drive electrons through a class of redox protein enzymes, the oxidoreductases. The energetics of electron flow is determined by the redox potentials of organic and inorganic cofactors as tuned by the protein environment. Understanding how protein structure affects oxidation-reduction energetics is crucial for studying metabolism, creating bioelectronic systems, and tracing the history of biological energy utilization on Earth. We constructed ProtReDox (https://protein-redox-potential.web.app), a manually curated database of experimentally determined redox potentials. With over 500 measurements, we can begin to identify how proteins modulate oxidation-reduction energetics across the tree of life. By mapping redox potentials onto networks of oxidoreductase fold evolution, we can infer the evolution of electron transfer energetics over deep time. ProtReDox is designed to include user-contributed submissions with the intention of making it a valuable resource for researchers in this field.
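The link between redox potentials and the energetics of electron flow mentioned above follows the standard relation ΔG = -nFΔE. The sketch below illustrates that relation only. It does not query ProtReDox, and the function name and the example potentials (common textbook values for NAD+/NADH and O2/H2O at pH 7, not figures from this abstract) are assumptions.

```python
FARADAY = 96485.33212  # Faraday constant in coulombs per mole of electrons


def delta_g_kj_per_mol(e_donor_volts, e_acceptor_volts, n_electrons=1):
    """Free energy change (kJ/mol) for moving n electrons from donor to acceptor.

    Implements dG = -n * F * dE, where dE = E(acceptor) - E(donor).
    A negative result means electron transfer in that direction is favorable.
    """
    delta_e = e_acceptor_volts - e_donor_volts
    return -n_electrons * FARADAY * delta_e / 1000.0


if __name__ == "__main__":
    # Textbook midpoint potentials, used here purely as an illustration.
    nadh = -0.32    # NAD+/NADH, volts
    oxygen = 0.82   # O2/H2O, volts
    print(f"{delta_g_kj_per_mol(nadh, oxygen, n_electrons=2):.1f} kJ/mol")
```

Because the protein environment shifts cofactor potentials, the same cofactor pair can yield different values of ΔG in different enzymes, which is the kind of variation a curated table of measured potentials makes comparable.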
Affiliation(s)
- Kenneth N McGuinness
  - Department of Natural Sciences, Caldwell University, Caldwell, New Jersey, USA
  - Center for Advanced Biotechnology and Medicine, Rutgers University, Piscataway, New Jersey, USA
- Nolan Fehon
  - Environmental Biophysics and Molecular Ecology Program, Department of Marine and Coastal Sciences, Rutgers University, New Brunswick, New Jersey, USA
- Ryan Feehan
  - Computational Biology Program, The University of Kansas, Lawrence, Kansas, USA
- Michelle Miller
  - Environmental Biophysics and Molecular Ecology Program, Department of Marine and Coastal Sciences, Rutgers University, New Brunswick, New Jersey, USA
- Andrew C Mutter
  - Department of Physics, The City College of New York, New York, New York, USA
- Laryssa A Rybak
  - Department of Physics, The City College of New York, New York, New York, USA
- Justin Nam
  - Center for Advanced Biotechnology and Medicine, Rutgers University, Piscataway, New Jersey, USA
- Jenna E AbuSalim
  - Center for Advanced Biotechnology and Medicine, Rutgers University, Piscataway, New Jersey, USA
- Joshua T Atkinson
  - Department of Chemical and Biomolecular Engineering, Rice University, Houston, Texas, USA
- Hirbod Heidari
  - Department of Chemistry, University of Texas at Austin, Austin, Texas, USA
- Natalie Losada
  - Center for Advanced Biotechnology and Medicine, Rutgers University, Piscataway, New Jersey, USA
- J Dongun Kim
  - Environmental Biophysics and Molecular Ecology Program, Department of Marine and Coastal Sciences, Rutgers University, New Brunswick, New Jersey, USA
- Ronald L Koder
  - Department of Physics, The City College of New York, New York, New York, USA
- Yi Lu
  - Department of Chemistry, University of Texas at Austin, Austin, Texas, USA
- Jonathan J Silberg
  - Department of Chemical and Biomolecular Engineering, Rice University, Houston, Texas, USA
- Joanna S G Slusky
  - Computational Biology Program, The University of Kansas, Lawrence, Kansas, USA
  - Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas, USA
- Paul G Falkowski
  - Environmental Biophysics and Molecular Ecology Program, Department of Marine and Coastal Sciences, Rutgers University, New Brunswick, New Jersey, USA
  - Department of Earth and Planetary Sciences, Rutgers University, New Brunswick, New Jersey, USA
- Vikas Nanda
  - Center for Advanced Biotechnology and Medicine, Rutgers University, Piscataway, New Jersey, USA
  - Department of Biochemistry and Molecular Biology, Robert Wood Johnson Medical School, Rutgers University, Piscataway, New Jersey, USA
|
30
|
Wright Z, Seymour M, Paszczak K, Truttmann T, Senn K, Stilp S, Jansen N, Gosz M, Goeden L, Anantharaman V, Aravind L, Waters LS. The small protein MntS evolved from a signal peptide and acquired a novel function regulating manganese homeostasis in Escherichia coli. Mol Microbiol 2024; 121:152-166. [PMID: 38104967 PMCID: PMC10842292 DOI: 10.1111/mmi.15206] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Revised: 11/17/2023] [Accepted: 11/24/2023] [Indexed: 12/19/2023]
Abstract
Small proteins (<50 amino acids) are emerging as ubiquitous and important regulators in organisms ranging from bacteria to humans, where they commonly bind to and regulate larger proteins during stress responses. However, fundamental aspects of small proteins, such as their molecular mechanism of action, downregulation after they are no longer needed, and their evolutionary provenance, are poorly understood. Here, we show that the MntS small protein involved in manganese (Mn) homeostasis binds and inhibits the MntP Mn transporter. Mn is crucial for bacterial survival in stressful environments but is toxic in excess. Thus, Mn transport is tightly controlled at multiple levels to maintain optimal Mn levels. The small protein MntS adds a new level of regulation for Mn transporters, beyond the known transcriptional and post-transcriptional control. We also found that MntS binds to itself in the presence of Mn, providing a possible mechanism of downregulating MntS activity to terminate its inhibition of MntP Mn export. MntS is homologous to the signal peptide of SitA, the periplasmic metal-binding subunit of a Mn importer. Remarkably, the homologous signal peptide regions can substitute for MntS, demonstrating a functional relationship between MntS and these signal peptides. Conserved gene neighborhoods support that MntS evolved from the signal peptide of an ancestral SitA protein, acquiring a life of its own with a distinct function in Mn homeostasis.
Affiliation(s)
- Zachary Wright
  - Department of Chemistry, 800 Algoma Blvd, University of Wisconsin, Oshkosh, WI 54901, USA
- Mackenzie Seymour
  - Department of Chemistry, 800 Algoma Blvd, University of Wisconsin, Oshkosh, WI 54901, USA
- Kalista Paszczak
  - Department of Chemistry, 800 Algoma Blvd, University of Wisconsin, Oshkosh, WI 54901, USA
- Taylor Truttmann
  - Department of Chemistry, 800 Algoma Blvd, University of Wisconsin, Oshkosh, WI 54901, USA
- Katherine Senn
  - Department of Chemistry, 800 Algoma Blvd, University of Wisconsin, Oshkosh, WI 54901, USA
- Samuel Stilp
  - Department of Chemistry, 800 Algoma Blvd, University of Wisconsin, Oshkosh, WI 54901, USA
- Nickolas Jansen
  - Department of Chemistry, 800 Algoma Blvd, University of Wisconsin, Oshkosh, WI 54901, USA
- Magdalyn Gosz
  - Department of Chemistry, 800 Algoma Blvd, University of Wisconsin, Oshkosh, WI 54901, USA
- Lindsay Goeden
  - Department of Chemistry, 800 Algoma Blvd, University of Wisconsin, Oshkosh, WI 54901, USA
- Vivek Anantharaman
  - National Center for Biotechnology Information, National Library of Medicine, 8600 Rockville Pike, Bethesda, MD 20894, USA
- L. Aravind
  - National Center for Biotechnology Information, National Library of Medicine, 8600 Rockville Pike, Bethesda, MD 20894, USA
- Lauren S. Waters
  - Department of Chemistry, 800 Algoma Blvd, University of Wisconsin, Oshkosh, WI 54901, USA
|
31
|
Dhar R, Bowman AM, Hatungimana B, Sg Slusky J. Evolutionary Engineering a Larger Porin Using a Loop-to-Hairpin Mechanism. J Mol Biol 2023; 435:168292. [PMID: 37769963 PMCID: PMC11215794 DOI: 10.1016/j.jmb.2023.168292] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Revised: 09/20/2023] [Accepted: 09/21/2023] [Indexed: 10/03/2023]
Abstract
In protein evolution, diversification is generally driven by genetic duplication. The hallmarks of this mechanism are visible in the repeating topology of various proteins. In outer membrane β-barrels, duplication is visible with β-hairpins as the repeating unit of the barrel. In contrast to the overall use of duplication in diversification, a computational study hypothesized evolutionary mechanisms other than hairpin duplications leading to increases in the number of strands in outer membrane β-barrels. Specifically, the topology of some 16- and 18-stranded β-barrels appears to have evolved through a loop to β-hairpin transition. Here we test this novel evolutionary mechanism by creating a chimeric protein from an 18-stranded β-barrel and an evolutionarily related 16-stranded β-barrel. The chimeric combination of the two was created by replacing loop L3 of the 16-stranded barrel with the sequentially matched transmembrane β-hairpin region of the 18-stranded barrel. We find the resulting chimeric protein is stable and has characteristics of increased strand number. This study provides the first experimental evidence supporting the evolution through a loop to β-hairpin transition.
Affiliation(s)
- Rik Dhar
  - Department of Molecular Biosciences, The University of Kansas, Lawrence, KS 66045, USA. https://twitter.com/Rik_Skywalker
- Alexander M Bowman
  - Department of Molecular Biosciences, The University of Kansas, Lawrence, KS 66045, USA
- Brunojoel Hatungimana
  - Department of Molecular Biosciences, The University of Kansas, Lawrence, KS 66045, USA
- Joanna Sg Slusky
  - Department of Molecular Biosciences, The University of Kansas, Lawrence, KS 66045, USA
  - Computational Biology Program, The University of Kansas, Lawrence, KS 66047, USA
|
32
|
Michel F, Romero‐Romero S, Höcker B. Retracing the evolution of a modern periplasmic binding protein. Protein Sci 2023; 32:e4793. [PMID: 37788980 PMCID: PMC10601554 DOI: 10.1002/pro.4793] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Revised: 09/20/2023] [Accepted: 09/22/2023] [Indexed: 10/05/2023]
Abstract
Investigating the evolution of structural features in modern multidomain proteins helps to understand their immense diversity and functional versatility. The class of periplasmic binding proteins (PBPs) offers an opportunity to interrogate one of the main processes driving diversification: the duplication and fusion of protein sequences to generate new architectures. The symmetry of their two-lobed topology, their mechanism of binding, and the organization of their operon structure led to the hypothesis that PBPs arose through a duplication and fusion event of a single common ancestor. To investigate this claim, we set out to reverse the evolutionary process and recreate the structural equivalent of a single-lobed progenitor using ribose-binding protein (RBP) as our model. We found that this modern PBP can be deconstructed into its lobes, producing two proteins that represent possible progenitor halves. The isolated halves of RBP are well folded and monomeric proteins, albeit with a lower thermostability, and do not retain the original binding function. However, the two entities readily form a heterodimer in vitro and in-cell. The x-ray structure of the heterodimer closely resembles the parental protein. Moreover, the binding function is fully regained upon formation of the heterodimer with a ligand affinity similar to that observed in the modern RBP. This highlights how a duplication event could have given rise to a stable and functional PBP-like fold and provides insights into how more complex functional structures can evolve from simpler molecular components.
Affiliation(s)
- Florian Michel
  - Department of Biochemistry, University of Bayreuth, Bayreuth, Germany
- Birte Höcker
  - Department of Biochemistry, University of Bayreuth, Bayreuth, Germany
|
33
|
Kaminski K, Ludwiczak J, Pawlicki K, Alva V, Dunin-Horkawicz S. pLM-BLAST: distant homology detection based on direct comparison of sequence representations from protein language models. Bioinformatics 2023; 39:btad579. [PMID: 37725369 PMCID: PMC10576641 DOI: 10.1093/bioinformatics/btad579] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 11/27/2022] [Revised: 07/09/2023] [Accepted: 09/15/2023] [Indexed: 09/21/2023] Open
Abstract
MOTIVATION: The detection of homology through sequence comparison is a typical first step in the study of protein function and evolution. In this work, we explore the applicability of protein language models to this task.
RESULTS: We introduce pLM-BLAST, a tool inspired by BLAST, that detects distant homology by comparing single-sequence representations (embeddings) derived from a protein language model, ProtT5. Our benchmarks reveal that pLM-BLAST maintains a level of accuracy on par with HHsearch for both highly similar sequences (with >50% identity) and markedly divergent sequences (with <30% identity), while being significantly faster. Additionally, pLM-BLAST stands out among other embedding-based tools due to its ability to compute local alignments. We show that these local alignments, produced by pLM-BLAST, often connect highly divergent proteins, thereby highlighting its potential to uncover previously undiscovered homologous relationships and improve protein annotation.
AVAILABILITY AND IMPLEMENTATION: pLM-BLAST is accessible via the MPI Bioinformatics Toolkit as a web server for searching precomputed databases (https://toolkit.tuebingen.mpg.de/tools/plmblast). It is also available as a standalone tool for building custom databases and performing batch searches (https://github.com/labstructbioinf/pLM-BLAST).
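The general idea of aligning two proteins by comparing per-residue embeddings, rather than residue identities, can be sketched with a cosine-similarity matrix fed into a Smith-Waterman-style recurrence. This is an illustration under stated assumptions and not the pLM-BLAST implementation: the function names, the `baseline` and `gap` parameters, and the two-dimensional toy vectors are hypothetical, and real ProtT5 embeddings have far more dimensions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def local_alignment_score(emb_a, emb_b, gap=0.5, baseline=0.5):
    """Best local alignment score between two lists of per-residue embeddings.

    Each residue pair is scored as cosine similarity minus `baseline`, so that
    weakly similar pairs are penalized; `gap` is a linear gap penalty.
    Both parameters are illustrative choices.
    """
    rows, cols = len(emb_a), len(emb_b)
    table = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    best = 0.0
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            pair = cosine(emb_a[i - 1], emb_b[j - 1]) - baseline
            table[i][j] = max(
                0.0,                         # start a new local alignment
                table[i - 1][j - 1] + pair,  # extend by aligning the two residues
                table[i - 1][j] - gap,       # gap in the second sequence
                table[i][j - 1] - gap,       # gap in the first sequence
            )
            best = max(best, table[i][j])
    return best


if __name__ == "__main__":
    # Toy 2-D "embeddings": three residues versus two residues.
    protein_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    protein_b = [[0.0, 1.0], [1.0, 1.0]]
    print(local_alignment_score(protein_a, protein_b))
```

Resetting scores to zero is what makes the alignment local: a well-matched stretch can score highly even when the flanking regions are unrelated.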
Affiliation(s)
- Kamil Kaminski
  - Institute of Evolutionary Biology, Faculty of Biology, Biological and Chemical Research Centre, University of Warsaw, Warsaw 02-089, Poland
  - Laboratory of Structural Bioinformatics, Centre of New Technologies, University of Warsaw, Warsaw 02-097, Poland
- Jan Ludwiczak
  - Institute of Evolutionary Biology, Faculty of Biology, Biological and Chemical Research Centre, University of Warsaw, Warsaw 02-089, Poland
- Kamil Pawlicki
  - Institute of Evolutionary Biology, Faculty of Biology, Biological and Chemical Research Centre, University of Warsaw, Warsaw 02-089, Poland
- Vikram Alva
  - Department of Protein Evolution, Max Planck Institute for Biology Tübingen, Tübingen 72076, Germany
- Stanislaw Dunin-Horkawicz
  - Institute of Evolutionary Biology, Faculty of Biology, Biological and Chemical Research Centre, University of Warsaw, Warsaw 02-089, Poland
  - Department of Protein Evolution, Max Planck Institute for Biology Tübingen, Tübingen 72076, Germany
|
34
|
Dhar R, Bowman AM, Hatungimana B, Slusky JS. Evolutionary engineering a larger porin using a loop-to-hairpin mechanism. bioRxiv 2023:2023.06.14.544993. [PMID: 37398247 PMCID: PMC10312768 DOI: 10.1101/2023.06.14.544993] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]
Abstract
In protein evolution, diversification is generally driven by genetic duplication. The hallmarks of this mechanism are visible in the repeating topology of various proteins. In outer membrane β-barrels, duplication is visible with β-hairpins as the repeating unit of the barrel. In contrast to the overall use of duplication in diversification, a computational study hypothesized evolutionary mechanisms other than hairpin duplications leading to increases in the number of strands in outer membrane β-barrels. Specifically, the topology of some 16- and 18-stranded β-barrels appears to have evolved through a loop to β-hairpin transition. Here we test this novel evolutionary mechanism by creating a chimeric protein from an 18-stranded β-barrel and an evolutionarily related 16-stranded β-barrel. The chimeric combination of the two was created by replacing loop L3 of the 16-stranded barrel with the sequentially matched transmembrane β-hairpin region of the 18-stranded barrel. We find the resulting chimeric protein is stable and has characteristics of increased strand number. This study provides the first experimental evidence supporting the evolution through a loop to β-hairpin transition.
Highlights
- We find evidence supporting a novel diversification mechanism in membrane β-barrels
- The mechanism is the conversion of an extracellular loop to transmembrane β-hairpin
- A chimeric protein modeling this mechanism folds stably in the membrane
- The chimera has more β-structure and a larger pore, consistent with a loop-to-hairpin transition
Affiliation(s)
- Rik Dhar
  - Department of Molecular Biosciences, The University of Kansas, Lawrence KS 66045
- Alexander M Bowman
  - Department of Molecular Biosciences, The University of Kansas, Lawrence KS 66045
- Joanna Sg Slusky
  - Department of Molecular Biosciences, The University of Kansas, Lawrence KS 66045
  - Computational Biology Program, The University of Kansas, Lawrence KS 66047
|
35
|
Aziz MF, Mughal F, Caetano-Anollés G. Tracing the birth of structural domains from loops during protein evolution. Sci Rep 2023; 13:14688. [PMID: 37673948 PMCID: PMC10482863 DOI: 10.1038/s41598-023-41556-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/25/2022] [Accepted: 08/28/2023] [Indexed: 09/08/2023] Open
Abstract
The structures and functions of proteins are embedded into the loop scaffolds of structural domains. Their origin and evolution remain mysterious. Here, we use a novel graph-theoretical approach to describe how modular and non-modular loop prototypes combine to form folded structures in protein domain evolution. Phylogenomic data-driven chronologies reoriented a bipartite network of loops and domains (and its projections) into 'waterfalls' depicting an evolving 'elementary functionome' (EF). Two primordial waves of functional innovation involving founder 'P-loop' and 'winged-helix' domains were accompanied by an ongoing emergence and reuse of structural and functional novelty. Metabolic pathways expanded before translation functionalities. A dual hourglass recruitment pattern transferred scale-free properties from loop to domain components of the EF network in generative cycles of hierarchical modularity. Modeling the evolutionary emergence of the oldest P-loop and winged-helix domains with AlphaFold2 uncovered rapid convergence towards folded structure, suggesting that a folding vocabulary exists in loops for protein fold repurposing and design.
Affiliation(s)
- M Fayez Aziz
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, IL, 61801, USA
| | - Fizza Mughal
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, IL, 61801, USA
| | - Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, IL, 61801, USA
- C.R. Woese Institute for Genomic Biology, University of Illinois, Urbana, IL, 61801, USA
| |
|
36
|
Porter LL. Fluid protein fold space and its implications. Bioessays 2023; 45:e2300057. [PMID: 37431685 PMCID: PMC10529699 DOI: 10.1002/bies.202300057] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 03/30/2023] [Revised: 06/21/2023] [Accepted: 06/23/2023] [Indexed: 07/12/2023]
Abstract
Fold-switching proteins, which remodel their secondary and tertiary structures in response to cellular stimuli, suggest a new view of protein fold space. For decades, experimental evidence has indicated that protein fold space is discrete: dissimilar folds are encoded by dissimilar amino acid sequences. Challenging this assumption, fold-switching proteins interconnect discrete groups of dissimilar protein folds, making protein fold space fluid. Three recent observations support the concept of fluid fold space: (1) some amino acid sequences interconvert between folds with distinct secondary structures, (2) some naturally occurring sequences have switched folds by stepwise mutation, and (3) fold switching is evolutionarily selected and likely confers advantage. These observations indicate that minor amino acid sequence modifications can transform protein structure and function. Consequently, proteomic structural and functional diversity may be expanded by alternative splicing, single-nucleotide polymorphisms, post-translational modifications, and modified translation rates.
Affiliation(s)
- Lauren L. Porter
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD
- National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD
| |
|
37
|
Wright Z, Seymour M, Paszczak K, Truttmann T, Senn K, Stilp S, Jansen N, Gosz M, Goeden L, Anantharaman V, Aravind L, Waters LS. The small protein MntS evolved from a signal peptide and acquired a novel function regulating manganese homeostasis in Escherichia coli. bioRxiv 2023:2023.06.02.543501. [PMID: 37398132 PMCID: PMC10312517 DOI: 10.1101/2023.06.02.543501] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]
Abstract
Small proteins (< 50 amino acids) are emerging as ubiquitous and important regulators in organisms ranging from bacteria to humans, where they commonly bind to and regulate larger proteins during stress responses. However, fundamental aspects of small proteins, such as their molecular mechanism of action, downregulation after they are no longer needed, and their evolutionary provenance are poorly understood. Here we show that the MntS small protein involved in manganese (Mn) homeostasis binds and inhibits the MntP Mn transporter. Mn is crucial for bacterial survival in stressful environments, but is toxic in excess. Thus, Mn transport is tightly controlled at multiple levels to maintain optimal Mn levels. The small protein MntS adds a new level of regulation for Mn transporters, beyond the known transcriptional and post-transcriptional control. We also found that MntS binds to itself in the presence of Mn, providing a possible mechanism of downregulating MntS activity to terminate its inhibition of MntP Mn export. MntS is homologous to the signal peptide of SitA, the periplasmic metal-binding subunit of a Mn importer. Remarkably, the homologous signal peptide regions can substitute for MntS, demonstrating a functional relationship between MntS and these signal peptides. Conserved gene-neighborhoods support that MntS evolved from an ancestral SitA, acquiring a life of its own with a distinct function in Mn homeostasis.
Significance
This study demonstrates that the MntS small protein binds and inhibits the MntP Mn exporter, adding another layer to the complex regulation of Mn homeostasis. MntS also interacts with itself in cells with Mn, which could prevent it from regulating MntP. We propose that MntS and other small proteins might sense environmental signals and shut off their own regulation via binding to ligands (e.g., metals) or other proteins. We also provide evidence that MntS evolved from the signal peptide region of the Mn importer, SitA. Homologous SitA signal peptides can recapitulate MntS activities, showing that they have a second function beyond protein secretion. Overall, we establish that small proteins can emerge and develop novel functionalities from gene remnants.
|
38
|
Chakravarty D, Sreenivasan S, Swint-Kruse L, Porter LL. Identification of a covert evolutionary pathway between two protein folds. Nat Commun 2023; 14:3177. [PMID: 37264049 DOI: 10.1038/s41467-023-38519-0] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 12/07/2022] [Accepted: 05/03/2023] [Indexed: 06/03/2023] Open
Abstract
Although homologous protein sequences are expected to adopt similar structures, some amino acid substitutions can interconvert α-helices and β-sheets. Such fold switching may have occurred over evolutionary history, but supporting evidence has been limited by the: (1) abundance and diversity of sequenced genes, (2) quantity of experimentally determined protein structures, and (3) assumptions underlying the statistical methods used to infer homology. Here, we overcome these barriers by applying multiple statistical methods to a family of ~600,000 bacterial response regulator proteins. We find that their homologous DNA-binding subunits assume divergent structures: helix-turn-helix versus α-helix + β-sheet (winged helix). Phylogenetic analyses, ancestral sequence reconstruction, and AlphaFold2 models indicate that amino acid substitutions facilitated a switch from helix-turn-helix into winged helix. This structural transformation likely expanded DNA-binding specificity. Our approach uncovers an evolutionary pathway between two protein folds and provides a methodology to identify secondary structure switching in other protein families.
Affiliation(s)
- Devlina Chakravarty
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
| | - Shwetha Sreenivasan
- Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, KS, 66160, USA
| | - Liskin Swint-Kruse
- Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, KS, 66160, USA
| | - Lauren L Porter
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Biochemistry and Biophysics Center, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, 20892, USA
| |
|
39
|
Abstract
The mechanism and the evolution of DNA replication and transcription, the key elements of the central dogma of biology, are fundamentally well explained by the physicochemical complementarity between strands of nucleic acids. However, the determinants that have shaped the third part of the dogma, namely the process of biological translation and the universal genetic code, remain unclear. We review and seek parallels between different proposals that view the evolution of translation through the prism of weak, noncovalent interactions between biological macromolecules. In particular, we focus on a recent proposal that there exists a hitherto unrecognized complementarity at the heart of biology, that between messenger RNA coding regions and the proteins that they encode, especially if the two are unstructured. Reflecting the idea that the genetic code evolved from intrinsic binding propensities between nucleotides and amino acids, this proposal promises to forge a link between the distant past and the present of biological systems.
Affiliation(s)
- Bojan Zagrovic
- Department of Structural and Computational Biology, Max Perutz Labs & University of Vienna, Vienna, Austria
| | - Marlene Adlhart
- Department of Structural and Computational Biology, Max Perutz Labs & University of Vienna, Vienna, Austria
| | - Thomas H Kapral
- Department of Structural and Computational Biology, Max Perutz Labs & University of Vienna, Vienna, Austria
- Vienna BioCenter PhD Program, Doctoral School of the University of Vienna and Medical University of Vienna, Vienna, Austria
| |
|
40
|
Moreaud L, Viollet S, Urvoas A, Valerio-Lepiniec M, Mesneau A, Li de la Sierra-Gallay I, Miller J, Ouldali M, Marcelot C, Balor S, Soldan V, Meriadec C, Artzner F, Dujardin E, Minard P. Design, synthesis, and characterization of protein origami based on self-assembly of a brick and staple artificial protein pair. Proc Natl Acad Sci U S A 2023; 120:e2218428120. [PMID: 36893280 PMCID: PMC10089216 DOI: 10.1073/pnas.2218428120] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/28/2022] [Accepted: 02/03/2023] [Indexed: 03/11/2023] Open
Abstract
A versatile strategy to create an inducible protein assembly with predefined geometry is demonstrated. The assembly is triggered by a binding protein that staples two identical protein bricks together in a predictable spatial conformation. The brick and staple proteins are designed for mutual directional affinity and engineered by directed evolution from a synthetic modular repeat protein library. As a proof of concept, this article reports on the spontaneous, extremely fast and quantitative self-assembly of two designed alpha-repeat (αRep) brick and staple proteins into macroscopic tubular superhelices at room temperature. Small-angle X-ray scattering (SAXS) and transmission electron microscopy (TEM with staining agent and cryoTEM) elucidate the resulting superhelical arrangement that precisely matches the a priori intended 3D assembly. The highly ordered, macroscopic biomolecular construction sustains temperatures as high as 75 °C thanks to the robust αRep building blocks. Since the α-helices of the brick and staple proteins are highly programmable, their design allows encoding the geometry and chemical surfaces of the final supramolecular protein architecture. This work opens routes toward the design and fabrication of multiscale protein origami with arbitrarily programmed shapes and chemical functions.
Affiliation(s)
- Laureen Moreaud
- Centre d’Elaboration des Matériaux et d’Etudes Structurales, CNRS UPR8011, F-31055 Toulouse, France
| | - Sébastien Viollet
- CEA, CNRS, Institute for Integrative Biology of the Cell, Université Paris-Saclay, 91198 Gif-sur-Yvette, France
| | - Agathe Urvoas
- CEA, CNRS, Institute for Integrative Biology of the Cell, Université Paris-Saclay, 91198 Gif-sur-Yvette, France
| | - Marie Valerio-Lepiniec
- CEA, CNRS, Institute for Integrative Biology of the Cell, Université Paris-Saclay, 91198 Gif-sur-Yvette, France
| | - Agnès Mesneau
- CEA, CNRS, Institute for Integrative Biology of the Cell, Université Paris-Saclay, 91198 Gif-sur-Yvette, France
| | - Inès Li de la Sierra-Gallay
- CEA, CNRS, Institute for Integrative Biology of the Cell, Université Paris-Saclay, 91198 Gif-sur-Yvette, France
| | - Jessalyn Miller
- CEA, CNRS, Institute for Integrative Biology of the Cell, Université Paris-Saclay, 91198 Gif-sur-Yvette, France
- Department of Chemistry, Emory University, Atlanta, GA 30322
| | - Malika Ouldali
- CEA, CNRS, Institute for Integrative Biology of the Cell, Université Paris-Saclay, 91198 Gif-sur-Yvette, France
| | - Cécile Marcelot
- Centre d’Elaboration des Matériaux et d’Etudes Structurales, CNRS UPR8011, F-31055 Toulouse, France
| | - Stéphanie Balor
- Microscopie Electronique Intégrative Toulouse, Centre de Biologie Intégrative, Université de Toulouse, CNRS, 31062, Toulouse, France
| | - Vanessa Soldan
- Microscopie Electronique Intégrative Toulouse, Centre de Biologie Intégrative, Université de Toulouse, CNRS, 31062, Toulouse, France
| | - Cristelle Meriadec
- Institut de Physique de Rennes, CNRS, UMR6251, Université de Rennes 1, F-35042 Rennes, France
| | - Franck Artzner
- Institut de Physique de Rennes, CNRS, UMR6251, Université de Rennes 1, F-35042 Rennes, France
| | - Erik Dujardin
- Centre d’Elaboration des Matériaux et d’Etudes Structurales, CNRS UPR8011, F-31055 Toulouse, France
- Laboratoire Interdisciplinaire Carnot de Bourgogne, CNRS, UMR6303, Université de Bourgogne Franche-Comté, 21000 Dijon, France
| | - Philippe Minard
- CEA, CNRS, Institute for Integrative Biology of the Cell, Université Paris-Saclay, 91198 Gif-sur-Yvette, France
| |
|
41
|
Benton R, Himmel NJ. Structural screens identify candidate human homologs of insect chemoreceptors and cryptic Drosophila gustatory receptor-like proteins. eLife 2023; 12:85537. [PMID: 36803935 PMCID: PMC9998090 DOI: 10.7554/elife.85537] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 12/15/2022] [Accepted: 02/16/2023] [Indexed: 02/22/2023] Open
Abstract
Insect odorant receptors and gustatory receptors define a superfamily of seven transmembrane domain ion channels (referred to here as 7TMICs), with homologs identified across Animalia except Chordata. Previously, we used sequence-based screening methods to reveal conservation of this family in unicellular eukaryotes and plants (DUF3537 proteins) (Benton et al., 2020). Here, we combine three-dimensional structure-based screening, ab initio protein folding predictions, phylogenetics, and expression analyses to characterize additional candidate homologs with tertiary but little or no primary structural similarity to known 7TMICs, including proteins in disease-causing Trypanosoma. Unexpectedly, we identify structural similarity between 7TMICs and PHTF proteins, a deeply conserved family of unknown function, whose human orthologs display enriched expression in testis, cerebellum, and muscle. We also discover divergent groups of 7TMICs in insects, which we term the gustatory receptor-like (Grl) proteins. Several Drosophila melanogaster Grls display selective expression in subsets of taste neurons, suggesting that they are previously unrecognized insect chemoreceptors. Although we cannot exclude the possibility of remarkable structural convergence, our findings support the origin of 7TMICs in a eukaryotic common ancestor, counter previous assumptions of complete loss of 7TMICs in Chordata, and highlight the extreme evolvability of this protein fold, which likely underlies its functional diversification in different cellular contexts.
Affiliation(s)
- Richard Benton
- Center for Integrative Genomics, Faculty of Biology and Medicine, University of Lausanne, Lausanne, Switzerland
| | - Nathaniel J Himmel
- Center for Integrative Genomics, Faculty of Biology and Medicine, University of Lausanne, Lausanne, Switzerland
| |
|
42
|
Sykes J, Holland BR, Charleston MA. A review of visualisations of protein fold networks and their relationship with sequence and function. Biol Rev Camb Philos Soc 2023; 98:243-262. [PMID: 36210328 PMCID: PMC10092621 DOI: 10.1111/brv.12905] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/30/2021] [Revised: 09/08/2022] [Accepted: 09/09/2022] [Indexed: 01/12/2023]
Abstract
Proteins form arguably the most significant link between genotype and phenotype. Understanding the relationship between protein sequence and structure, and applying this knowledge to predict function, is difficult. One way to investigate these relationships is by considering the space of protein folds and how one might move from fold to fold through similarity, or potential evolutionary relationships. The many individual characterisations of fold space presented in the literature can tell us a lot about how well the current Protein Data Bank represents protein fold space, how convergence and divergence may affect protein evolution, how proteins affect the whole of which they are part, and how proteins themselves function. A synthesis of these different approaches and viewpoints seems the most likely way to further our knowledge of protein structure evolution and thus, facilitate improved protein structure design and prediction.
Affiliation(s)
- Janan Sykes
- School of Natural Sciences, University of Tasmania, Private Bag 37, Hobart, Tasmania, 7001, Australia
| | - Barbara R Holland
- School of Natural Sciences, University of Tasmania, Private Bag 37, Hobart, Tasmania, 7001, Australia
| | - Michael A Charleston
- School of Natural Sciences, University of Tasmania, Private Bag 37, Hobart, Tasmania, 7001, Australia
| |
|
43
|
Evolutionary Conserved Short Linear Motifs Provide Insights into the Cellular Response to Stress. Antioxidants (Basel) 2022; 12:antiox12010096. [PMID: 36670957 PMCID: PMC9854524 DOI: 10.3390/antiox12010096] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2022] [Revised: 11/22/2022] [Accepted: 12/22/2022] [Indexed: 01/03/2023] Open
Abstract
Short linear motifs (SLiMs) are evolutionarily conserved functional modules of proteins composed of 3 to 10 residues and involved in multiple cellular functions. Here, we performed a search for SLiMs that exhibit sequence similarity to two segments of alpha-fetoprotein (AFP), a major mammalian embryonic and cancer-associated protein. Biological activities of the peptides, LDSYQCT (AFP14-20) and EMTPVNPGV (GIP-9), have been previously confirmed under in vitro and in vivo conditions. In our study, we retrieved a vast array of proteins that contain SLiMs of interest from both prokaryotic and eukaryotic species, including viruses, bacteria, archaea, invertebrates, and vertebrates. Comprehensive Gene Ontology enrichment analysis showed that proteins from multiple functional classes, including enzymes, transcription factors, as well as those involved in signaling, cell cycle, and quality control, and ribosomal proteins were implicated in cellular adaptation to environmental stress conditions. These include response to oxidative and metabolic stress, hypoxia, DNA and RNA damage, protein degradation, as well as antimicrobial, antiviral, and immune response. Thus, our data enabled insights into the common functions of SLiMs that are evolutionarily conserved across all taxonomic categories. These SLiMs can serve as important players in cellular adaptation to stress, which is crucial for cell functioning.
|
44
|
Abstract
Mechanisms of emergence and divergence of protein folds pose central questions in biological sciences. Incremental mutation and stepwise adaptation explain relationships between topologically similar protein folds. However, the universe of folds is diverse and riotous, suggesting more potent and creative forces are at play. Sequence and structure similarity are observed between distinct folds, indicating that proteins with distinct folds may share common ancestry. We found evidence of common ancestry between three distinct β-barrel folds: Src kinase family homology (SH3), oligonucleotide/oligosaccharide-binding (OB), and cradle loop barrel (CLB). The data suggest a mechanism of fold evolution that interconverts SH3, OB, and CLB. This mechanism, which we call creative destruction, can be generalized to explain many examples of fold evolution including circular permutation. In creative destruction, an open reading frame duplicates or otherwise merges with another to produce a fused polypeptide. A merger forces two ancestral domains into a new sequence and spatial context. The fused polypeptide can explore folding landscapes that are inaccessible to either of the independent ancestral domains. However, the folding landscapes of the fused polypeptide are not fully independent of those of the ancestral domains. Creative destruction is thus partially conservative; a daughter fold inherits some motifs from ancestral folds. After merger and refolding, adaptive processes such as mutation and loss of extraneous segments optimize the new daughter fold. This model has application in disease states characterized by genetic instability. Fused proteins observed in cancer cells are likely to experience remodeled folding landscapes and realize altered folds, conferring new or altered functions.
|
45
|
Yu H, Kalutantirige FC, Yao L, Schroeder CM, Chen Q, Moore JS. Self-Assembly of Repetitive Segment and Random Segment Polymer Architectures. ACS Macro Lett 2022; 11:1366-1372. [PMID: 36413761 DOI: 10.1021/acsmacrolett.2c00495] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
Abstract
Recent advances in chemical synthesis have created new methodologies for synthesizing sequence-controlled synthetic polymers, but rational design of monomer sequence for desired properties remains challenging. In this work, we synthesize periodic polymers with repetitive segments using a sequence-controlled ring-opening metathesis polymerization (ROMP) method, which draws inspiration from proteins containing repetitive sequence motifs. The repetitive segment architecture is shown to dramatically affect the self-assembly behavior of these materials. Our results show that polymers with identical repetitive sequences assemble into uniform spherical nanoparticles after thermal annealing, whereas copolymers with random placement of segments with different sequences exhibit disordered assemblies without a well-defined morphology. Overall, these results bring a new understanding to the role of periodic repetitive sequences in polymer assembly.
Affiliation(s)
- Hao Yu
- Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
| | - Falon C Kalutantirige
- Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
| | - Lehan Yao
- Department of Materials Science and Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
| | - Charles M Schroeder
- Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
- Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
- Department of Materials Science and Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
- Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
| | - Qian Chen
- Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
- Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
- Department of Materials Science and Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
- Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
| | - Jeffrey S Moore
- Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
- Department of Materials Science and Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
- Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
| |
|
46
|
Verma A, Åberg-Zingmark E, Sparrman T, Mushtaq AU, Rogne P, Grundström C, Berntsson R, Sauer UH, Backman L, Nam K, Sauer-Eriksson E, Wolf-Watz M. Insights into the evolution of enzymatic specificity and catalysis: From Asgard archaea to human adenylate kinases. Sci Adv 2022; 8:eabm4089. [PMID: 36332013 PMCID: PMC9635829 DOI: 10.1126/sciadv.abm4089] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 05/10/2022] [Accepted: 09/15/2022] [Indexed: 06/16/2023]
Abstract
Enzymatic catalysis is critically dependent on selectivity, active site architecture, and dynamics. To contribute insights into the interplay of these properties, we established an approach with NMR, crystallography, and MD simulations focused on the ubiquitous phosphotransferase adenylate kinase (AK) isolated from Odinarchaeota (OdinAK). Odinarchaeota belongs to the Asgard archaeal phylum that is believed to be the closest known ancestor to eukaryotes. We show that OdinAK is a hyperthermophilic trimer that, contrary to other AK family members, can use all NTPs for its phosphorylation reaction. Crystallographic structures of OdinAK-NTP complexes revealed a universal NTP-binding motif, while 19F NMR experiments uncovered a conserved and rate-limiting dynamic signature. As a consequence of trimerization, the active site of OdinAK was found to be lacking a critical catalytic residue and is therefore considered to be "atypical." On the basis of discovered relationships with human monomeric homologs, our findings are discussed in terms of evolution of enzymatic substrate specificity and cold adaptation.
Affiliation(s)
- Apoorv Verma
- Department of Chemistry, Umeå University, 901 87 Umeå, Sweden
| | | | - Tobias Sparrman
- Department of Chemistry, Umeå University, 901 87 Umeå, Sweden
| | | | - Per Rogne
- Department of Chemistry, Umeå University, 901 87 Umeå, Sweden
| | | | - Ronnie Berntsson
- Department of Medical Biochemistry and Biophysics, Umeå University, 901 87 Umeå, Sweden
- Wallenberg Centre for Molecular Medicine, Umeå University, 901 87 Umeå, Sweden
| | - Uwe H. Sauer
- Department of Chemistry, Umeå University, 901 87 Umeå, Sweden
| | - Lars Backman
- Department of Chemistry, Umeå University, 901 87 Umeå, Sweden
| | - Kwangho Nam
- Department of Chemistry and Biochemistry, University of Texas at Arlington, Arlington, TX 76019, USA
| | | | | |
|
47
|
Tong CL, Kanwar N, Morrone DJ, Seelig B. Nature-inspired engineering of an artificial ligase enzyme by domain fusion. Nucleic Acids Res 2022; 50:11175-11185. [PMID: 36243966 PMCID: PMC9638898 DOI: 10.1093/nar/gkac858] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/14/2022] [Revised: 08/30/2022] [Accepted: 09/26/2022] [Indexed: 11/20/2022] Open
Abstract
The function of most proteins is accomplished through the interplay of two or more protein domains and fine-tuned by natural evolution. In contrast, artificial enzymes have often been engineered from a single domain scaffold and frequently have lower catalytic activity than natural enzymes. We previously generated an artificial enzyme that catalyzed an RNA ligation by >2 million-fold but was likely limited in its activity by low substrate affinity. Inspired by nature's concept of domain fusion, we fused the artificial enzyme to a series of protein domains known to bind nucleic acids with the goal of improving its catalytic activity. The effect of the fused domains on catalytic activity varied greatly, yielding severalfold increases but also reductions caused by domains that previously enhanced nucleic acid binding in other protein engineering projects. The combination of the two better performing binding domains improved the activity of the parental ligase by more than an order of magnitude. These results demonstrate for the first time that nature's successful evolutionary mechanism of domain fusion can also improve an unevolved primordial-like protein whose structure and function had just been created in the test tube. The generation of multi-domain proteins might therefore be an ancient evolutionary process.
Affiliation(s)
- Cher Ling Tong
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN 55455, USA
- BioTechnology Institute, University of Minnesota, St. Paul, MN 55108, USA
| | - Nisha Kanwar
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN 55455, USA
- BioTechnology Institute, University of Minnesota, St. Paul, MN 55108, USA
| | - Dana J Morrone
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN 55455, USA
- BioTechnology Institute, University of Minnesota, St. Paul, MN 55108, USA
| | - Burckhard Seelig
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN 55455, USA
- BioTechnology Institute, University of Minnesota, St. Paul, MN 55108, USA
| |
|
48
|
A Short Tale of the Origin of Proteins and Ribosome Evolution. Microorganisms 2022; 10:microorganisms10112115. [DOI: 10.3390/microorganisms10112115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2022] [Revised: 09/30/2022] [Accepted: 10/19/2022] [Indexed: 11/16/2022] Open
Abstract
Proteins are the workhorses of the cell and have been key players throughout the evolution of all organisms, from the origin of life to the present era. How might life have originated from the prebiotic chemistry of early Earth? This is one of the most intriguing unsolved questions in biology. Currently, however, it is generally accepted that amino acids, the building blocks of proteins, were abiotically available on primitive Earth, which would have made the formation of early peptides in a similar fashion possible. Peptides are likely to have coevolved with ancestral forms of RNA. The ribosome is the most evident product of this coevolution process, a sophisticated nanomachine that performs the synthesis of proteins codified in genomes. In this general review, we explore the evolution of proteins from their peptide origins to their folding and regulation based on the example of superoxide dismutase (SOD1), a key enzyme in oxygen metabolism on modern Earth.
|
49
|
Marquet C, Heinzinger M, Olenyi T, Dallago C, Erckert K, Bernhofer M, Nechaev D, Rost B. Embeddings from protein language models predict conservation and variant effects. Hum Genet 2022; 141:1629-1647. [PMID: 34967936 PMCID: PMC8716573 DOI: 10.1007/s00439-021-02411-y] [Citation(s) in RCA: 63] [Impact Index Per Article: 21.0] [Received: 06/01/2021] [Accepted: 12/06/2021] [Indexed: 12/13/2022]
Abstract
The emergence of SARS-CoV-2 variants underscored the demand for tools that can interpret the effect of single amino acid variants (SAVs) on protein function. While Deep Mutational Scanning (DMS) sets continue to expand our understanding of the mutational landscape of single proteins, the results continue to challenge analyses. Protein Language Models (pLMs) use the latest deep learning (DL) algorithms to leverage growing databases of protein sequences. These methods learn to predict missing or masked amino acids from the context of entire sequence regions. Here, we used pLM representations (embeddings) to predict sequence conservation and SAV effects without multiple sequence alignments (MSAs). Embeddings alone predicted residue conservation almost as accurately from single sequences as ConSeq using MSAs (two-state Matthews Correlation Coefficient (MCC) of 0.596 ± 0.006 for ProtT5 embeddings vs. 0.608 ± 0.006 for ConSeq). Inputting the conservation prediction along with BLOSUM62 substitution scores and pLM mask reconstruction probabilities into a simplistic logistic regression (LR) ensemble for Variant Effect Score Prediction without Alignments (VESPA) predicted SAV effect magnitude without any optimization on DMS data. Comparing predictions for a standard set of 39 DMS experiments to other methods (incl. ESM-1v, DeepSequence, and GEMME) showed our approach to be competitive with the state-of-the-art (SOTA) methods using MSA input. No method outperformed all others, either consistently or statistically significantly, regardless of the performance measure applied (Spearman and Pearson correlation). Finally, we investigated binary effect predictions on DMS experiments for four human proteins. Overall, embedding-based methods have become competitive with methods relying on MSAs for SAV effect prediction at a fraction of the cost in computing/energy.
Our method predicted SAV effects for the entire human proteome (~20k proteins) within 40 min on one Nvidia Quadro RTX 8000. All methods and data sets are freely available for local and online execution through bioembeddings.com, https://github.com/Rostlab/VESPA, and PredictProtein.
Affiliation(s)
- Céline Marquet
- Department of Informatics, Bioinformatics and Computational Biology - i12, TUM-Technical University of Munich, Boltzmannstr. 3, Garching, 85748, Munich, Germany
- TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748, Garching, Germany
| | - Michael Heinzinger
- Department of Informatics, Bioinformatics and Computational Biology - i12, TUM-Technical University of Munich, Boltzmannstr. 3, Garching, 85748, Munich, Germany
- TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748, Garching, Germany
| | - Tobias Olenyi
- Department of Informatics, Bioinformatics and Computational Biology - i12, TUM-Technical University of Munich, Boltzmannstr. 3, Garching, 85748, Munich, Germany
- TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748, Garching, Germany
| | - Christian Dallago
- Department of Informatics, Bioinformatics and Computational Biology - i12, TUM-Technical University of Munich, Boltzmannstr. 3, Garching, 85748, Munich, Germany
- TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748, Garching, Germany
| | - Kyra Erckert
- Department of Informatics, Bioinformatics and Computational Biology - i12, TUM-Technical University of Munich, Boltzmannstr. 3, Garching, 85748, Munich, Germany
- TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748, Garching, Germany
| | - Michael Bernhofer
- Department of Informatics, Bioinformatics and Computational Biology - i12, TUM-Technical University of Munich, Boltzmannstr. 3, Garching, 85748, Munich, Germany
- TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748, Garching, Germany
| | - Dmitrii Nechaev
- Department of Informatics, Bioinformatics and Computational Biology - i12, TUM-Technical University of Munich, Boltzmannstr. 3, Garching, 85748, Munich, Germany
- TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748, Garching, Germany
| | - Burkhard Rost
- Department of Informatics, Bioinformatics and Computational Biology - i12, TUM-Technical University of Munich, Boltzmannstr. 3, Garching, 85748, Munich, Germany
- Institute for Advanced Study (TUM-IAS), Lichtenbergstr. 2a, Garching, 85748, Munich, Germany
- TUM School of Life Sciences Weihenstephan (TUM-WZW), Alte Akademie 8, Freising, Germany
| |
|
50
|
Kozlova MI, Shalaeva DN, Dibrova DV, Mulkidjanian AY. Common Patterns of Hydrolysis Initiation in P-loop Fold Nucleoside Triphosphatases. Biomolecules 2022; 12:1345. [PMID: 36291554 PMCID: PMC9599529 DOI: 10.3390/biom12101345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2022] [Revised: 08/20/2022] [Accepted: 09/14/2022] [Indexed: 11/24/2022] Open
Abstract
The P-loop fold nucleoside triphosphate (NTP) hydrolases (also known as Walker NTPases) function as ATPases, GTPases, and ATP synthases, are often of medical importance, and represent one of the largest and evolutionarily oldest families of enzymes. There is still no consensus on their catalytic mechanism. To clarify this, we performed the first comparative structural analysis of more than 3100 structures of P-loop NTPases that contain bound substrate Mg-NTPs or their analogues. We proceeded on the assumption that structural features common to these P-loop NTPases may be essential for catalysis. Our results are presented in two articles. Here, in the first, we consider the structural elements that stimulate hydrolysis. Upon interaction of P-loop NTPases with their cognate activating partners (RNA/DNA/protein domains), specific stimulatory moieties, usually Arg or Lys residues, are inserted into the catalytic site and initiate the cleavage of the gamma phosphate. By analyzing a plethora of structures, we found that the only shared feature was the mechanistic interaction of stimulators with the oxygen atoms of the gamma-phosphate group, capable of causing its rotation. One of the oxygen atoms of the gamma phosphate coordinates the cofactor Mg ion. The rotation must pull this oxygen atom away from the Mg ion. This rearrangement should affect the properties of the other Mg ligands and may initiate hydrolysis according to the mechanism elaborated in the second article.
Affiliation(s)
- Maria I. Kozlova
- School of Physics, Osnabrueck University, D-49069 Osnabrueck, Germany
| | - Daria N. Shalaeva
- School of Physics, Osnabrueck University, D-49069 Osnabrueck, Germany
| | - Daria V. Dibrova
- School of Physics, Osnabrueck University, D-49069 Osnabrueck, Germany
| | - Armen Y. Mulkidjanian
- School of Physics, Osnabrueck University, D-49069 Osnabrueck, Germany
- Center of Cellular Nanoanalytics, Osnabrueck University, D-49069 Osnabrueck, Germany
| |
|