1
Abel DL. "Assembly Theory" in life-origin models: A critical review. Biosystems 2025; 247:105378. [PMID: 39710183 DOI: 10.1016/j.biosystems.2024.105378] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2024] [Revised: 11/30/2024] [Accepted: 12/04/2024] [Indexed: 12/24/2024]
Abstract
Any homeostatic protometabolism would have required orchestration of disparate biochemical pathways into integrated circuits. Extraordinarily specific molecular assemblies were also required at the right time and place. Assembly Theory (AT), conflated with its cousins (Complexity Theory, Chaos Theory, Quantum Mechanics, Irreversible Nonequilibrium Thermodynamics and Molecular Evolution theory), has great naturalistic appeal in the hope that together they provide the needed exquisite steering and controls. They collectively offer the best hope of circumventing the need for the active selection required to orchestrate bona fide formal organization (as opposed to the mere self-ordering of chaos theory) (Abel and Trevors, 2006b). This paper focuses specifically on AT's contribution to naturalistic life-origin models.
Affiliation(s)
- David Lynn Abel
- ProtoBioCybernetics & Protocellular Metabolomics, The Gene Emergence Project, The Origin of Life Science Foundation, Inc, USA.
2
Thorvaldsen S, Hössjer O. Use of directed quasi-metric distances for quantifying the information of gene families. Biosystems 2024; 243:105256. [PMID: 38871243 DOI: 10.1016/j.biosystems.2024.105256] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2024] [Revised: 06/06/2024] [Accepted: 06/11/2024] [Indexed: 06/15/2024]
Abstract
A major hindrance to analyzing information in genetic or protein sequence data has been the lack of a mathematical framework for doing so. In this paper, we present a multinomial probability space X as a general foundation for multicategory discrete data, where categories refer to variants/alleles of biosequences. The external information that is infused in order to generate a sample of such data is quantified as a distance on X between the prior distribution of data and the empirical distribution of the sample. A number of distances on X are treated. All of them have an information theoretic interpretation, reflecting the information that the sampling mechanism provides about which variants have a selective advantage and therefore appear more frequently compared to prior expectations. This includes distances on X based on mutual information, conditional mutual information, active information, and functional information. The functional information distance is singled out as particularly useful. It is simple and has intuitive interpretations in terms of 1) a rejection sampling mechanism, where functional entities are retained, whereas non-functional categories are censored, and 2) evolutionary waiting times. The functional information is also a quasi-metric on X, with information being measured in an asymmetric, mountainous landscape. This quasi-metric property is also retained for a robustified version of the functional information distance that allows for mutations in the sampling mechanism. The functional information quasi-metric has been applied successfully to bioinformatics data sets, for proteins and sequence alignment of protein families.
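The rejection-sampling and waiting-time interpretations mentioned in this abstract can be illustrated with a small sketch. It assumes the classical definition of functional information as the negative log of the prior probability of the functional categories; this is not the quasi-metric defined in the paper, and the function names and the base-2 logarithm are illustrative choices.

```python
import math


def functional_probability(prior, functional):
    """Prior probability mass of the categories labelled functional."""
    return sum(p for category, p in prior.items() if category in functional)


def functional_information(prior, functional):
    """Bits gained by keeping only functional categories (assumed definition)."""
    p = functional_probability(prior, functional)
    if p <= 0:
        raise ValueError("functional set has zero prior probability")
    return -math.log2(p)


def expected_waiting_time(prior, functional):
    """Mean number of independent prior draws until a functional one appears.

    Under rejection sampling the number of draws is geometric with success
    probability p, so its mean is 1 / p.
    """
    p = functional_probability(prior, functional)
    if p <= 0:
        raise ValueError("functional set has zero prior probability")
    return 1.0 / p
```

With a uniform prior over eight categories and a single functional category, the sketch gives 3 bits and an expected wait of 8 draws.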
Affiliation(s)
- Steinar Thorvaldsen
- Dept. of Education, Division of Science, UiT the Arctic University of Norway, Norway.
- Ola Hössjer
- Dept. of Mathematics, Stockholm University, Sweden.
3
Díaz-Pachón DA, Hössjer O. Assessing, Testing and Estimating the Amount of Fine-Tuning by Means of Active Information. Entropy 2022; 24:1323. [PMCID: PMC9601319 DOI: 10.3390/e24101323] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/23/2022] [Accepted: 09/19/2022] [Indexed: 06/29/2023]
Abstract
A general framework is introduced to estimate how much external information has been infused into a search algorithm, the so-called active information. This is rephrased as a test of fine-tuning, where tuning corresponds to the amount of pre-specified knowledge that the algorithm makes use of in order to reach a certain target. A function f quantifies specificity for each possible outcome x of a search, so that the target of the algorithm is a set of highly specified states, whereas fine-tuning occurs if it is much more likely for the algorithm to reach the target as intended than by chance. The distribution of a random outcome X of the algorithm involves a parameter θ that quantifies how much background information has been infused. A simple choice of this parameter is to use θf in order to exponentially tilt the distribution of the outcome of the search algorithm under the null distribution of no tuning, so that an exponential family of distributions is obtained. Such algorithms are obtained by iterating a Metropolis–Hastings type of Markov chain, which makes it possible to compute their active information under the equilibrium and non-equilibrium of the Markov chain, with or without stopping when the targeted set of fine-tuned states has been reached. Other choices of tuning parameters θ are discussed as well. Nonparametric and parametric estimators of active information and tests of fine-tuning are developed when repeated and independent outcomes of the algorithm are available. The theory is illustrated with examples from cosmology, student learning, reinforcement learning, a Moran type model of population genetics, and evolutionary programming.
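The exponential tilting described in this abstract can be sketched for a finite outcome space. The sketch assumes active information is the log ratio of the tilted to the null probability of the target set, and that the target is the set of outcomes whose specificity f reaches a threshold; the function names, the threshold parameter, and the base-2 logarithm are illustrative assumptions, not taken from the paper.

```python
import math


def tilt(p0, f, theta):
    """Exponentially tilt the null distribution p0 by theta * f(x)."""
    weights = [p * math.exp(theta * fx) for p, fx in zip(p0, f)]
    total = sum(weights)
    return [w / total for w in weights]


def active_information(p0, f, theta, threshold):
    """log2 of P_theta(target) / P_0(target), target = {x : f(x) >= threshold}."""
    p = tilt(p0, f, theta)
    in_target = [fx >= threshold for fx in f]
    p0_target = sum(q for q, t in zip(p0, in_target) if t)
    p_target = sum(q for q, t in zip(p, in_target) if t)
    if p0_target <= 0:
        raise ValueError("target has zero probability under the null")
    return math.log2(p_target / p0_target)
```

With theta = 0 the tilted distribution equals the null and the active information is zero; a positive theta shifts mass toward highly specified outcomes and makes it positive.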
Affiliation(s)
- Ola Hössjer
- Department of Mathematics, Stockholm University, 114 19 Stockholm, Sweden
4
Evidence of genomic information and structural restrictions of HIV-1 PR and RT gene regions from individuals experiencing antiretroviral virologic failure. Infection, Genetics and Evolution 2019; 78:104134. [PMID: 31837484 DOI: 10.1016/j.meegid.2019.104134] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2019] [Revised: 11/28/2019] [Accepted: 12/04/2019] [Indexed: 10/25/2022]
Abstract
OBJECTIVES: This study analyzed informational entropy metrics of the HIV-1 protease (PR) and reverse transcriptase (RT) genomic regions among patients experiencing antiretroviral virologic failure, according to the number of virologic failures or resistance mutations. METHODS: We used genomic sequences of HIV-1 PR and RT from a cohort of chronic patients followed up at São Paulo Hospital. RESULTS: Informational entropy increases proportionally with the number of antiretroviral virologic failures in PR and RT (p < .001). Affected regions of PR were related to catalytic and structural functions, such as Fulcrum (K20), Flap (M46) and Cantilever (A71). In RT, this occurred at Fingers (E44) and Palm (K219). Informational entropy also increases with the number of resistance mutations in PR and RT (p < .001). Higher PR entropy was proportional to the number of resistance mutations in Fulcrum (L10), Active site (L24), Flap (M46), Cantilever (L63) and near Interface (L90). In RT, the changes related to regions responsible for protein stability, such as Fingers (T39) and Palm (L100). CONCLUSIONS: Antiretroviral selective pressure affects HIV genomic informational entropy in the PR and RT regions, leading to the emergence of more unstable virions. Mapping the three-dimensional structure of these HIV-1 proteins is relevant to designing new antiretrovirals targeting resistant strains.
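A per-site entropy profile of the kind reported in this abstract can be sketched as follows. The sketch assumes that "informational entropy" means the Shannon entropy of each column of an aligned set of sequences; the abstract does not specify the estimator used in the study, and the function names and toy sequences are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of the residues observed at one aligned site."""
    counts = Counter(column)
    n = len(column)
    # p * log2(1 / p), summed over the observed residues
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def site_entropies(sequences):
    """Entropy of every column in a list of equal-length aligned sequences."""
    if len({len(s) for s in sequences}) > 1:
        raise ValueError("sequences must be aligned to the same length")
    return [column_entropy(col) for col in zip(*sequences)]
```

For the toy alignment `["AK", "AR"]` the first site is fully conserved (0 bits) and the second is split evenly between two residues (1 bit).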
5
Durston KK, Chiu DKY, Wong AKC, Li GCL. Statistical discovery of site inter-dependencies in sub-molecular hierarchical protein structuring. EURASIP Journal on Bioinformatics & Systems Biology 2012; 2012:8. [PMID: 22793672 PMCID: PMC3524763 DOI: 10.1186/1687-4153-2012-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 11/02/2011] [Accepted: 05/29/2012] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Much progress has been made in understanding the 3D structure of proteins using methods such as NMR and X-ray crystallography. The resulting 3D structures are extremely informative, but do not always reveal which sites and residues within the structure are of special importance. Recently, there have been indications that multiple-residue, sub-domain structural relationships within the larger 3D consensus structure of a protein can be inferred from the analysis of the multiple sequence alignment data of a protein family. These intra-dependent clusters of associated sites are used to indicate hierarchical inter-residue relationships within the 3D structure. To reveal the patterns of associations among individual amino acids or sub-domain components within the structure, we apply a k-modes attribute (aligned site) clustering algorithm to the ubiquitin and transthyretin families in order to discover associations among groups of sites within the multiple sequence alignment. We then observe what these associations imply within the 3D structure of these two protein families. RESULTS: The k-modes site clustering algorithm we developed maximizes the intra-group interdependencies based on a normalized mutual information measure. The clusters formed correspond to sub-structural components or binding and interface locations. Applying this data-directed method to the ubiquitin and transthyretin protein family multiple sequence alignments as a test bed, we located numerous interesting associations of interdependent sites. These clusters were then arranged into cluster tree diagrams which revealed four structural sub-domains within the single domain structure of ubiquitin and a single large sub-domain within transthyretin associated with the interface among transthyretin monomers. In addition, several clusters of mutually interdependent sites were discovered for each protein family, each of which appears to play an important role in the molecular structure and/or function. CONCLUSIONS: Our results demonstrate that the method we present here, using a k-modes site clustering algorithm based on interdependency evaluation among sites obtained from a sequence alignment of homologous proteins, can provide significant insights into the complex, hierarchical inter-residue structural relationships within the 3D structure of a protein family.
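The pairwise quantity that such a site clustering would maximize can be sketched as follows. The sketch assumes the normalization is mutual information divided by joint entropy, which is one common choice; the abstract does not state the exact formula, and the k-modes clustering step itself is not reproduced. Function names and the toy columns are hypothetical.

```python
import math
from collections import Counter


def entropy(symbols):
    """Shannon entropy (bits) of a sequence of hashable symbols."""
    counts = Counter(symbols)
    n = len(symbols)
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def normalized_mutual_information(col_x, col_y):
    """I(X;Y) / H(X,Y) for two aligned-site columns (assumed normalization)."""
    if len(col_x) != len(col_y):
        raise ValueError("columns must have the same number of sequences")
    h_xy = entropy(list(zip(col_x, col_y)))
    if h_xy == 0:
        return 0.0  # both sites fully conserved: no measurable dependence
    mutual_information = entropy(col_x) + entropy(col_y) - h_xy
    return mutual_information / h_xy
```

Two columns that always vary together, such as `"AABB"` and `"CCDD"`, score 1, while columns that vary independently, such as `"AABB"` and `"CDCD"`, score 0.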
Affiliation(s)
- Kirk K Durston
- School of Computer Science, University of Guelph, 50 Stone Road East, Guelph, ON, N1G 2W1, Canada
- David KY Chiu
- School of Computer Science, University of Guelph, 50 Stone Road East, Guelph, ON, N1G 2W1, Canada
- Andrew KC Wong
- Department of System Design Engineering, University of Waterloo, 200 University Ave. W, Waterloo, ON, N2L 3G1, Canada
- Gary CL Li
- Department of System Design Engineering, University of Waterloo, 200 University Ave. W, Waterloo, ON, N2L 3G1, Canada
6
D'Onofrio DJ, Abel DL, Johnson DE. Dichotomy in the definition of prescriptive information suggests both prescribed data and prescribed algorithms: biosemiotics applications in genomic systems. Theor Biol Med Model 2012; 9:8. [PMID: 22413926 PMCID: PMC3319427 DOI: 10.1186/1742-4682-9-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 12/14/2011] [Accepted: 03/14/2012] [Indexed: 11/26/2022] Open
Abstract
The fields of molecular biology and computer science have cooperated over recent years to create a synergy between the cybernetic and biosemiotic relationship found in cellular genomics to that of information and language found in computational systems. Biological information frequently manifests its "meaning" through instruction or actual production of formal bio-function. Such information is called Prescriptive Information (PI). PI programs organize and execute a prescribed set of choices. Closer examination of this term in cellular systems has led to a dichotomy in its definition suggesting both prescribed data and prescribed algorithms are constituents of PI. This paper looks at this dichotomy as expressed in both the genetic code and in the central dogma of protein synthesis. An example of a genetic algorithm is modeled after the ribosome, and an examination of the protein synthesis process is used to differentiate PI data from PI algorithms.
Affiliation(s)
- David J D'Onofrio
- Control Systems Modeling and Simulation, General Dynamics, Sterling Heights MI, USA.
7
Abstract
Is life physicochemically unique? No. Is life unique? Yes. Life manifests innumerable formalisms that cannot be generated or explained by physicodynamics alone. Life pursues thousands of biofunctional goals, not the least of which is staying alive. Neither physicodynamics nor evolution pursues goals. Life is largely directed by linear digital programming and by the Prescriptive Information (PI) instantiated particularly into physicodynamically indeterminate nucleotide sequencing. Epigenomic controls only compound the sophistication of these formalisms. Life employs representationalism through the use of symbol systems. Life manifests autonomy, homeostasis far from equilibrium in the harshest of environments, positive and negative feedback mechanisms, prevention and correction of its own errors, and organization of its components into Sustained Functional Systems (SFS). Chance and necessity (heat agitation and the cause-and-effect determinism of nature's orderliness) cannot spawn formalisms such as mathematics, language, symbol systems, coding, decoding, logic, organization (not to be confused with mere self-ordering), integration of circuits, computational success, and the pursuit of functionality. All of these characteristics of life are formal, not physical.
Affiliation(s)
- David L Abel
- Department of ProtoBioCybernetics and ProtoBioSemiotics, Origin of Life Science Foundation, Inc., 113-120 Hedgewood Drive, Greenbelt, MD 20770, USA.
8
Trevors JT, Saier MH. Thermodynamic perspectives on genetic instructions, the laws of biology and diseased states. C R Biol 2010; 334:1-5. [PMID: 21262480 DOI: 10.1016/j.crvi.2010.11.008] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 03/26/2010] [Revised: 11/28/2010] [Accepted: 11/29/2010] [Indexed: 10/18/2022]
Abstract
This article examines, from a broad perspective, entropy and some examples of its relationship to evolution, genetic instructions and how we view diseases. Living organisms are programmed by functional genetic instructions (FGI), through cellular communication pathways, to grow and reproduce by maintaining a variety of hemistable, ordered structures (low entropy). Living organisms are far from equilibrium with their surrounding environmental systems, which tend towards increasing disorder (increasing entropy). Organisms free themselves from high entropy (high disorder) to maintain their cellular structures for a period of time sufficient to allow reproduction and the resultant offspring to reach reproductive ages. This time interval varies for different species. Bacteria, for example, need no sexual parents; dividing cells are nearly identical to the previous generation of cells, and can begin a new cell cycle without delay under appropriate conditions. By contrast, human infants require years of care before they can reproduce. Living organisms maintain order in spite of their changing surrounding environment that decreases order according to the second law of thermodynamics. These events actually work together since living organisms create ordered biological structures by increasing local entropy. From a disease perspective, viruses and other disease agents interrupt the normal functioning of cells. The pressure for survival may result in mechanisms that allow organisms to resist attacks by viruses, other pathogens, destructive chemicals and physical agents such as radiation. However, when the attack is successful, the organism can be damaged until the cell, tissue, organ or entire organism is no longer functional and entropy increases.
Affiliation(s)
- Jack T Trevors
- School of Environmental Sciences, University of Guelph, N1G 2W1, Guelph, Ontario, Canada.
9
Biomolecular information gained through in vitro evolution. Biophys Rev 2010; 2:1-11. [DOI: 10.1007/s12551-009-0021-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 04/26/2009] [Accepted: 10/22/2009] [Indexed: 11/30/2022] Open
10
The capabilities of chaos and complexity. Int J Mol Sci 2009; 10:247-291. [PMID: 19333445 PMCID: PMC2662469 DOI: 10.3390/ijms10010247] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.2] [Received: 11/06/2008] [Revised: 12/27/2008] [Accepted: 01/04/2009] [Indexed: 11/17/2022] Open
Abstract
To what degree could chaos and complexity have organized a Peptide or RNA World of crude yet necessarily integrated protometabolism? How far could such protolife evolve in the absence of a heritable linear digital symbol system that could mutate, instruct, regulate, optimize and maintain metabolic homeostasis? To address these questions, chaos, complexity, self-ordered states, and organization must all be carefully defined and distinguished. In addition, their cause-and-effect relationships and mechanisms of action must be delineated. Are there any formal (non-physical, abstract, conceptual, algorithmic) components to chaos, complexity, self-ordering and organization, or are they entirely physicodynamic (physical, mass/energy interaction alone)? Chaos and complexity can produce some fascinating self-ordered phenomena. But can spontaneous chaos and complexity steer events and processes toward pragmatic benefit, select function over non-function, optimize algorithms, integrate circuits, produce computational halting, organize processes into formal systems, and control and regulate existing systems toward greater efficiency? The question is pursued of whether there might be some yet-to-be-discovered new law of biology that will elucidate the derivation of prescriptive information and control. “System” will be rigorously defined. Can a low-informational rapid succession of Prigogine’s dissipative structures self-order into bona fide organization?