1
Brown SM, Mayer-Bacon C, Freeland S. Xeno Amino Acids: A Look into Biochemistry as We Do Not Know It. Life (Basel) 2023; 13:2281. [PMID: 38137883 PMCID: PMC10744825 DOI: 10.3390/life13122281] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Revised: 11/18/2023] [Accepted: 11/20/2023] [Indexed: 12/24/2023] Open
Abstract
Would another origin of life resemble Earth's biochemical use of amino acids? Here, we review current knowledge at three levels: (1) Could other classes of chemical structure serve as building blocks for biopolymer structure and catalysis? Amino acids now seem both readily available to, and a plausible chemical attractor for, life as we do not know it. Amino acids thus remain important and tractable targets for astrobiological research. (2) If amino acids are used, would we expect the same L-alpha-structural subclass used by life? Despite numerous ideas, it is not clear why life favors L-enantiomers. It seems clearer, however, why life on Earth uses the shortest possible (alpha-) amino acid backbone, and why each carries only one side chain. However, assertions that other backbones are physicochemically impossible have relaxed into arguments that they are disadvantageous. (3) Would we expect a similar set of side chains to those within the genetic code? Many plausible alternatives exist. Furthermore, evidence exists for both evolutionary advantage and physicochemical constraint as explanatory factors for those encoded by life. Overall, as focus shifts from amino acids as a chemical class to specific side chains used by post-LUCA biology, the probable role of physicochemical constraint diminishes relative to that of biological evolution. Exciting opportunities now present themselves for laboratory work and computing to explore how changing the amino acid alphabet alters the universe of protein folds. Near-term milestones include: (a) expanding evidence about amino acids as attractors within chemical evolution; (b) extending characterization of other backbones relative to biological proteins; and (c) merging computing and laboratory explorations of structures and functions unlocked by xeno peptides.
2
Mayer-Bacon C, Agboha N, Muscalli M, Freeland S. Evolution as a Guide to Designing xeno Amino Acid Alphabets. Int J Mol Sci 2021; 22:2787. [PMID: 33801827 PMCID: PMC8000707 DOI: 10.3390/ijms22062787] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 02/12/2021] [Revised: 03/01/2021] [Accepted: 03/05/2021] [Indexed: 02/02/2023] Open
Abstract
Here, we summarize a line of remarkably simple, theoretical research to better understand the chemical logic by which life’s standard alphabet of 20 genetically encoded amino acids evolved. The connection to the theme of this Special Issue, “Protein Structure Analysis and Prediction with Statistical Scoring Functions”, emerges from the ways in which bioinformatics currently lacks empirical science when it comes to xenoproteins composed largely or entirely of amino acids from beyond the standard genetic code. Our intent is to present new perspectives on existing data from two different frontiers in order to suggest fresh ways in which their findings complement one another. These frontiers are origins/astrobiology research into the emergence of the standard amino acid alphabet, and empirical xenoprotein synthesis.
Affiliation(s)
- Christopher Mayer-Bacon
- Department of Biological Sciences, University of Maryland, Baltimore County, Baltimore, MD 21250, USA
- Neyiasuo Agboha
- Department of Biological Sciences, University of Maryland, Baltimore County, Baltimore, MD 21250, USA
- Mickey Muscalli
- Individualized Study Program, University of Maryland, Baltimore County, Baltimore, MD 21250, USA
- Stephen Freeland
- Department of Biological Sciences, University of Maryland, Baltimore County, Baltimore, MD 21250, USA
- Individualized Study Program, University of Maryland, Baltimore County, Baltimore, MD 21250, USA
3
Mayer-Bacon C, Freeland SJ. A broader context for understanding amino acid alphabet optimality. J Theor Biol 2021; 520:110661. [PMID: 33684404 DOI: 10.1016/j.jtbi.2021.110661] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 09/21/2020] [Revised: 02/23/2021] [Accepted: 02/25/2021] [Indexed: 12/21/2022]
Abstract
A series of prior publications has reported unusual properties of the set of genetically encoded amino acids shared by all known life. This work uses quantitative measures (descriptors) of size, charge and hydrophobicity to compare the distribution of the genetically encoded amino acids with random samples of plausible alternatives. Results show that the standard "alphabet" of amino acids established by the time of LUCA is distributed with unusual evenness over a broad range for the three, key physicochemical properties. However, different publications have used slightly different assumptions, including variations in the precise descriptors used, the set of plausible alternative molecules considered, and the format in which results have been presented. Here we consolidate these findings into a unified framework in order to clarify unusual features. We find that in general, the remarkable features of the full set of 20 genetically encoded amino acids are robust when compared with random samples drawn from a densely populated picture of plausible, alternative L-α-amino acids. In particular, the genetically encoded set is distributed across an exceptionally broad range of volumes, and distributed exceptionally evenly within a modest range of hydrophobicities. Surprisingly, range and evenness of charge (pKa) is exceptional only for the full amino acid structures, not for their sidechains - a result inconsistent with prior interpretations involving the role that amino acid sidechains play within protein sequences. In stark contrast, these remarkable features are far less clear when the prebiotically plausible subset of genetically encoded amino acids is compared with a much smaller pool of prebiotically plausible alternatives. By considering the nature of the "optimality theory" approach taken to derive these and prior insights, we suggest productive avenues for further research.
Affiliation(s)
- Christopher Mayer-Bacon
- Department of Biological Sciences, University of Maryland, Baltimore County, 1000 Hilltop Circle, Baltimore, MD 21250, USA
- Stephen J Freeland
- Department of Biological Sciences, University of Maryland, Baltimore County, 1000 Hilltop Circle, Baltimore, MD 21250, USA
4
Adaptive Properties of the Genetically Encoded Amino Acid Alphabet Are Inherited from Its Subsets. Sci Rep 2019; 9:12468. [PMID: 31462646 PMCID: PMC6713743 DOI: 10.1038/s41598-019-47574-x] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 01/08/2019] [Accepted: 07/08/2019] [Indexed: 01/11/2023] Open
Abstract
Life uses a common set of 20 coded amino acids (CAAs) to construct proteins. This set was likely canonicalized during early evolution; before this, smaller amino acid sets were gradually expanded as new synthetic, proofreading and coding mechanisms became biologically available. Many possible subsets of the modern CAAs or other presently uncoded amino acids could have comprised the earlier sets. We explore the hypothesis that the CAAs were selectively fixed due to their unique adaptive chemical properties, which facilitate folding, catalysis, and solubility of proteins, and gave adaptive value to organisms able to encode them. Specifically, we studied in silico hypothetical CAA sets of 3–19 amino acids comprised of 1913 structurally diverse α-amino acids, exploring the adaptive value of their combined physicochemical properties relative to those of the modern CAA set. We find that even hypothetical sets containing modern CAA members are especially adaptive; it is difficult to find sets even among a large choice of alternatives that cover the chemical property space more amply. These results suggest that each time a CAA was discovered and embedded during evolution, it provided an adaptive value unusual among many alternatives, and each selective step may have helped bootstrap the developing set to include still more CAAs.
5
Extraordinarily adaptive properties of the genetically encoded amino acids. Sci Rep 2015; 5:9414. [PMID: 25802223 PMCID: PMC4371090 DOI: 10.1038/srep09414] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.8] [Received: 10/29/2014] [Accepted: 02/12/2015] [Indexed: 02/02/2023] Open
Abstract
Using novel advances in computational chemistry, we demonstrate that the set of 20 genetically encoded amino acids, used nearly universally to construct all coded terrestrial proteins, has been highly influenced by natural selection. We defined an adaptive set of amino acids as one whose members thoroughly cover relevant physico-chemical properties, or “chemistry space.” Using this metric, we compared the encoded amino acid alphabet to random sets of amino acids. These random sets were drawn from a computationally generated compound library containing 1913 alternative amino acids that lie within the molecular weight range of the encoded amino acids. Sets that cover chemistry space better than the genetically encoded alphabet are extremely rare and energetically costly. Further analysis of more adaptive sets reveals common features and anomalies, and we explore their implications for synthetic biology. We present these computations as evidence that the set of 20 amino acids found within the standard genetic code is the result of considerable natural selection. The amino acids used for constructing coded proteins may represent a largely global optimum, such that any aqueous biochemistry would use a very similar set.
6
Meringer M, Cleaves HJ, Freeland SJ. Beyond terrestrial biology: charting the chemical universe of α-amino acid structures. J Chem Inf Model 2013; 53:2851-62. [PMID: 24152173 DOI: 10.1021/ci400209n] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.5] [Indexed: 01/23/2023]
Abstract
α-Amino acids are fundamental to biochemistry as the monomeric building blocks with which cells construct proteins according to genetic instructions. However, the 20 amino acids of the standard genetic code represent a tiny fraction of the number of α-amino acid chemical structures that could plausibly play such a role, both from the perspective of natural processes by which life emerged and evolved, and from the perspective of human-engineered genetically coded proteins. Until now, efforts to describe the structures comprising this broader set, or even estimate their number, have been hampered by the complex combinatorial properties of organic molecules. Here, we use computer software based on graph theory and constructive combinatorics in order to conduct an efficient and exhaustive search of the chemical structures implied by two careful and precise definitions of the α-amino acids relevant to coded biological proteins. Our results include two virtual libraries of α-amino acid structures corresponding to these different approaches, comprising 121 044 and 3 846 structures, respectively, and suggest a simple approach to exploring much larger, as yet uncomputed, libraries of interest.
Affiliation(s)
- Markus Meringer
- German Aerospace Center (DLR), Earth Observation Center (EOC), Münchner Straße 20, D-82234 Oberpfaffenhofen-Wessling, Germany
7
Philip GK, Freeland SJ. Did evolution select a nonrandom "alphabet" of amino acids? Astrobiology 2011; 11:235-240. [PMID: 21434765 DOI: 10.1089/ast.2010.0567] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.9] [Indexed: 05/30/2023]
Abstract
The last universal common ancestor of contemporary biology (LUCA) used a precise set of 20 amino acids as a standard alphabet with which to build genetically encoded protein polymers. Considerable evidence indicates that some of these amino acids were present through nonbiological syntheses prior to the origin of life, while the rest evolved as inventions of early metabolism. However, the same evidence indicates that many alternatives were also available, which highlights the question: what factors led biological evolution on our planet to define its standard alphabet? One possibility is that natural selection favored a set of amino acids that exhibits clear, nonrandom properties-a set of especially useful building blocks. However, previous analysis that tested whether the standard alphabet comprises amino acids with unusually high variance in size, charge, and hydrophobicity (properties that govern what protein structures and functions can be constructed) failed to clearly distinguish evolution's choice from a sample of randomly chosen alternatives. Here, we demonstrate unambiguous support for a refined hypothesis: that an optimal set of amino acids would spread evenly across a broad range of values for each fundamental property. Specifically, we show that the standard set of 20 amino acids represents the possible spectra of size, charge, and hydrophobicity more broadly and more evenly than can be explained by chance alone.
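The comparison described in this abstract (is a chosen set both broader and more evenly spread than random alternatives?) can be illustrated with a small Monte Carlo sketch. Everything specific here is an assumption made for illustration: the abstract does not state which statistics were used, so evenness is approximated as the variance of gaps between neighbouring sorted values, and the `pool` and `chosen` numbers are hypothetical placeholders rather than real amino acid descriptors.

```python
import random
import statistics


def value_range(values):
    """Breadth of a set along one property axis."""
    return max(values) - min(values)


def gap_variance(values):
    """Assumed evenness measure: variance of the gaps between neighbouring
    sorted values. Lower means more evenly spread."""
    ordered = sorted(values)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    return statistics.pvariance(gaps)


def fraction_as_extreme(pool, chosen, n_samples=10000, seed=0):
    """Fraction of random same-size samples from `pool` that are at least
    as broad AND at least as even as `chosen`."""
    rng = random.Random(seed)
    target_range = value_range(chosen)
    target_evenness = gap_variance(chosen)
    hits = 0
    for _ in range(n_samples):
        sample = rng.sample(pool, len(chosen))
        if (value_range(sample) >= target_range
                and gap_variance(sample) <= target_evenness):
            hits += 1
    return hits / n_samples


# Hypothetical one-dimensional property values (not real data).
pool = [float(v) for v in range(100)]
chosen = [0.0, 25.0, 50.0, 75.0, 99.0]

print(fraction_as_extreme(pool, chosen, n_samples=2000))
```

A small returned fraction means few random sets match the chosen set on both criteria at once. A study of the kind described would repeat this for each property (size, charge, hydrophobicity) with real descriptor values.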
Affiliation(s)
- Gayle K Philip
- NASA Astrobiology Institute, University of Hawaii, Honolulu, 96822, USA
8
Lu Y, Freeland SJ. A quantitative investigation of the chemical space surrounding amino acid alphabet formation. J Theor Biol 2007; 250:349-61. [PMID: 18005995 DOI: 10.1016/j.jtbi.2007.10.007] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Received: 05/24/2007] [Revised: 09/21/2007] [Accepted: 10/08/2007] [Indexed: 11/29/2022]
Abstract
To date, explanations for the origin and emergence of the alphabet of amino acids encoded by the standard genetic code have been largely qualitative and speculative. Here, with the help of computational chemistry, we present the first quantitative exploration of nature's "choices" set against various models for plausible alternatives. Specifically, we consider the chemical space defined by three fundamental biophysical properties (size, charge, and hydrophobicity) to ask whether the amino acids that entered the genetic code exhibit a higher diversity than random samples of similar size drawn from several different definitions of amino acid possibility space. We found that in terms of the properties studied, the full, standard set of 20 biologically encoded amino acids is indeed significantly more diverse than an equivalently sized group drawn at random from the set of plausible, prebiotic alternatives (using the Murchison meteorite as a model for pre-biotic plausibility). However, when the set of possible amino acids is enlarged to include those that are produced by standard biosynthetic pathways (reflecting the widespread idea that many members of the standard alphabet were recruited in this way), then the genetically encoded amino acids can no longer be distinguished as more diverse than a random sample. Finally, if we turn to consider the overlap between biologically encoded amino acids and those that are prebiotically plausible, then we find that the biologically encoded subset are no more diverse as a group than would be expected from a random sample, unless the definition of "random sample" is adjusted to reflect possible prebiotic abundance (again, using the contents of the Murchison meteorite as our estimator). 
This final result is contingent on the accuracy of our computational estimates for amino acid properties, and prebiotic abundances, and an exploration of the likely effect of errors in our estimation reveals that our results should be treated with caution. We thus present this work as a first step in quantifying and thus testing various origin-of-life hypotheses regarding the origin and evolution of life's amino acid alphabet, and advocate the progress that would add valuable information in the future.
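The idea of adjusting the "random sample" to reflect prebiotic abundance can be sketched as weighted sampling without replacement. This is only an illustration under stated assumptions: the abstract does not describe the sampling procedure, so a simple sequential weighted draw is used here, and the candidate labels and abundance values are hypothetical, not Murchison measurements.

```python
import random


def weighted_sample_without_replacement(items, weights, k, rng):
    """Draw k distinct items; each draw is proportional to the weights
    of the items still remaining (assumed procedure, for illustration)."""
    remaining = list(items)
    remaining_weights = list(weights)
    picked = []
    for _ in range(k):
        idx = rng.choices(range(len(remaining)),
                          weights=remaining_weights, k=1)[0]
        picked.append(remaining.pop(idx))
        remaining_weights.pop(idx)
    return picked


# Hypothetical candidate labels and relative abundances (not real data).
candidates = ["aa_1", "aa_2", "aa_3", "aa_4", "aa_5"]
abundance = [50.0, 30.0, 10.0, 5.0, 5.0]

rng = random.Random(1)
draws = [weighted_sample_without_replacement(candidates, abundance, 2, rng)
         for _ in range(2000)]

# Share of samples containing the most abundant candidate; uniform
# sampling of 2 from 5 would give 0.4.
share_with_most_abundant = sum("aa_1" in d for d in draws) / len(draws)
print(share_with_most_abundant)
```

Samples drawn this way over-represent abundant candidates, so any diversity statistic computed on them is compared against a baseline that already reflects availability rather than a uniform pool.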
Affiliation(s)
- Yi Lu
- Department of Biological Sciences, University of Maryland, Baltimore County, 1000 Hilltop Circle, Baltimore, MD 21250, USA
9
Lu Y, Bulka B, desJardins M, Freeland SJ. Amino acid quantitative structure property relationship database: a web-based platform for quantitative investigations of amino acids. Protein Eng Des Sel 2007; 20:347-51. [PMID: 17557765 DOI: 10.1093/protein/gzm027] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/13/2022] Open
Abstract
Here, we present the AA-QSPR Db (Amino Acid Quantitative Structure Property Relationship Database): a novel, freely available web-resource of data pertaining to amino acids, both engineered and naturally occurring. In addition to presenting fundamental molecular descriptors of size, charge and hydrophobicity, it also includes online visualization tools for users to perform instant, interactive analyses of amino acid sub-sets in which they are interested. The database has been designed with extensible markup language technology to provide a flexible structure, suitable for future development. In addition to providing easy access for queries by external computers, it also offers a user-friendly web-based interface that facilitates human interactions (submission, storage and retrieval of amino acid data) and an associated e-forum that encourages users to question and discuss current and future database contents.
Affiliation(s)
- Yi Lu
- Department of Biological Sciences, University of Maryland Baltimore County, Baltimore, MD 21250, USA