1. Fimmel E, Strüngmann L. The spiderweb of error-detecting codes in the genetic information. Biosystems 2023; 233:105009. [PMID: 37640191 DOI: 10.1016/j.biosystems.2023.105009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2023] [Revised: 08/21/2023] [Accepted: 08/21/2023] [Indexed: 08/31/2023]
Abstract
Nature possesses inherent mechanisms for error detection and correction during the translation of genetic information, as demonstrated by the discovery of a self-complementary circular C3-code called X0 in various organisms such as bacteria, eukaryotes, plasmids, and viruses (Arquès and Michel, 1996; Michel, 2015, 2017). Since then, extensive research has focused on circular codes, which are believed to be remnants of ancient comma-free codes. These codes can be regarded as an additional genetic code specifically optimized for detecting and preserving the proper reading frame in protein-coding sequences. A study by Fimmel et al. in 2014 identified that a total of 216 maximal self-complementary C3-codes can be grouped into 27 equivalence classes with eight codes in each class. In this work, we study how the 27 equivalence classes are related to each other. While the codes in each equivalence class obtained by Fimmel et al. in 2014 are permutations of each other, i.e. one code can be obtained from the other by applying a permutation of the bases, it has not been clear how the equivalence classes are connected. We show that there is an ordering of the equivalence classes such that one gets from one class to the next one by substituting only one pair of codon/anticodon in the corresponding codes, i.e. the corresponding codes have a maximal intersection of 18 codons. To perform this analysis, we define two graphs, G216 and G27, whose vertices are, respectively, all 216 maximal self-complementary C3-codes and 27 equivalence classes. Several properties of the graphs are obtained. Most surprisingly, it turns out that G27 contains Hamiltonian paths of length 27. This fact ultimately leads to a representation of the set of all 216 maximal self-complementary C3-codes as a kind of spider web.
Finally, we define dinucleotide cuts of such codes by projecting each codon to its first two bases and show that the paths of lengths 27 in G216 can even be chosen so that all the codes contain a special subset of dinucleotides defined by Rumer's roots. These observations raise a lot of new questions about the biological function of such structures.
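The notions used in this abstract (self-complementarity, base permutations, an intersection of 18 codons, dinucleotide cuts) can be illustrated with a minimal Python sketch. The codon list for X0 is the set commonly quoted from Arquès and Michel (1996); all function names are assumptions made for this illustration, the complement map is used only as one example of a base permutation, and HYPOTHETICAL_NEIGHBOUR is an invented set that is not claimed to be circular or to belong to the 216 codes.

```python
# Watson-Crick complement of each base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# The 20 trinucleotides usually quoted for the code X0 (Arquès and Michel, 1996).
X0 = frozenset(
    "AAC AAT ACC ATC ATT CAG CTC CTG GAA GAC "
    "GAG GAT GCC GGC GGT GTA GTC GTT TAC TTC".split()
)

def anticodon(codon):
    """Reverse complement of a codon, e.g. AAC -> GTT."""
    return "".join(COMPLEMENT[base] for base in reversed(codon))

def is_self_complementary(code):
    """True if the code contains the anticodon of each of its codons."""
    return all(anticodon(codon) in code for codon in code)

def permute_bases(code, perm):
    """Apply a permutation of the four bases to every codon of the code."""
    return frozenset("".join(perm[base] for base in codon) for codon in code)

def are_adjacent(code_a, code_b):
    """True if two 20-codon codes share exactly 18 codons,
    i.e. they differ by a single codon/anticodon pair."""
    return len(code_a) == len(code_b) == 20 and len(code_a & code_b) == 18

def dinucleotide_cut(code):
    """Project every codon to its first two bases."""
    return frozenset(codon[:2] for codon in code)

# Hypothetical set, built only to show the intersection count:
# one codon/anticodon pair of X0 is swapped for another pair.
HYPOTHETICAL_NEIGHBOUR = (X0 - {"AAC", "GTT"}) | {"ACA", "TGT"}

print(is_self_complementary(X0))
print(is_self_complementary(permute_bases(X0, COMPLEMENT)))
print(are_adjacent(X0, HYPOTHETICAL_NEIGHBOUR))
print(sorted(dinucleotide_cut(X0)))
```

The sketch does not test circularity, and it does not reconstruct the graphs G216 or G27; it only makes the adjacency criterion (20-codon codes sharing 18 codons) concrete.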
Affiliation(s)
- Elena Fimmel: Institute of Mathematical Biology, Faculty for Computer Sciences, Mannheim University of Applied Sciences, 68163 Mannheim, Germany.
- Lutz Strüngmann: Institute of Mathematical Biology, Faculty for Computer Sciences, Mannheim University of Applied Sciences, 68163 Mannheim, Germany.
2. Fimmel E, Michel CJ, Strüngmann L. Circular mixed sets. Biosystems 2023; 229:104906. [PMID: 37196893 DOI: 10.1016/j.biosystems.2023.104906] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2023] [Accepted: 04/29/2023] [Indexed: 05/19/2023]
Abstract
In this article, we introduce the new mathematical concept of circular mixed sets of words over an arbitrary finite alphabet. These circular mixed sets may not be codes in the classical sense and hence allow a higher amount of information to be encoded. After describing their basic properties, we generalize a recent graph theoretical approach for circularity and apply it to distinguish codes from sets (i.e. non-codes). Moreover, several methods are given to construct circular mixed sets. Finally, this approach allows us to propose a new evolution model of the present genetic code that could have evolved from a dinucleotide world to a trinucleotide world via circular mixed sets of dinucleotides and trinucleotides.
Affiliation(s)
- Elena Fimmel: Institute of Mathematical Biology, Faculty for Computer Sciences, Mannheim University of Applied Sciences, 68163 Mannheim, Germany.
- Christian J Michel: Theoretical bioinformatics, ICube, University of Strasbourg, C.N.R.S., 300 Boulevard Sébastien Brant, 67400 Illkirch, France.
- Lutz Strüngmann: Institute of Mathematical Biology, Faculty for Computer Sciences, Mannheim University of Applied Sciences, 68163 Mannheim, Germany.
3. Borah C, Ali T. Genetic code noise immunity features: Degeneracy and frameshift correction. Gene Reports 2022. [DOI: 10.1016/j.genrep.2022.101707] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
4. Giannerini S, Gonzalez DL, Goracci G, Danielli A. A role for circular code properties in translation. Sci Rep 2021; 11:9218. [PMID: 33911089 PMCID: PMC8080828 DOI: 10.1038/s41598-021-87534-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/30/2020] [Accepted: 03/23/2021] [Indexed: 11/19/2022]
Abstract
Circular codes represent a form of coding allowing detection/correction of frame-shift errors. Building on recent theoretical advances on circular codes, we provide evidence that protein coding sequences exhibit in-frame circular code marks, that are absent in introns and are intimately linked to the keto-amino transformation of codon bases. These properties strongly correlate with translation speed, codon influence and protein synthesis levels. Strikingly, circular code marks are absent at the beginning of coding sequences, but stably occur 40 codons after the initiator codon, hinting at the translation elongation process. Finally, we use the lens of circular codes to show that codon influence on translation correlates with the strong-weak dichotomy of the first two bases of the codon. The results can lead to defining new universal tools for sequence indicators and sequence optimization for bioinformatics and biotechnological applications, and can shed light on the molecular mechanisms behind the decoding process.
Affiliation(s)
- Simone Giannerini: Department of Statistical Sciences, University of Bologna, Bologna, 40126, Italy.
- Diego Luis Gonzalez: Department of Statistical Sciences, University of Bologna, Bologna, 40126, Italy; Institute for Microelectronics and Microsystems - Bologna Unit, CNR, Bologna, 40129, Italy.
- Greta Goracci: Department of Statistical Sciences, University of Bologna, Bologna, 40126, Italy.
- Alberto Danielli: Department of Pharmacy and Biotechnology, University of Bologna, Bologna, 40126, Italy.
5.
Abstract
The origin of the modern genetic code and the mechanisms that have contributed to its present form raise many questions. The main goal of this work is to test two hypotheses concerning the development of the genetic code for their compatibility and complementarity and see if they could benefit from each other. On the one hand, Gonzalez, Giannerini and Rosa developed a theory, based on four-base codons, which they called tesserae. This theory can explain the degeneracy of the modern vertebrate mitochondrial code. On the other hand, in the 1990s, so-called circular codes were discovered in nature, which seem to ensure the maintenance of a correct reading frame during the translation process. It turns out that the two concepts not only do not contradict each other, but on the contrary complement and enrich each other.
6. Gonzalez DL, Giannerini S, Rosa R. On the origin of degeneracy in the genetic code. Interface Focus 2019; 9:20190038. [PMID: 31641429 PMCID: PMC6802134 DOI: 10.1098/rsfs.2019.0038] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 04/18/2019] [Accepted: 09/05/2019] [Indexed: 01/11/2023]
Abstract
The degeneracy of amino acid coding is one of the most crucial and enigmatic aspects of the genetic code. Different theories about the origin of the genetic code have been developed. However, to date, there is no comprehensive hypothesis on the mechanism that might have generated the degeneracy as we observe it. Here, we provide a new theory that explains the origin of the degeneracy based only on symmetry principles. The approach allows one to describe exactly the degeneracy of the early code (progenitor of the genetic code of LUCA, the last universal common ancestor) which is hypothesized to have the same degeneracy as the present vertebrate mitochondrial genetic code. The theory is based upon the tessera code, that fits as the progenitor of the early code. Moreover, we describe in detail the possible evolutionary transitions implied by our theory. The approach is supported by a unified mathematical framework that accounts for the degeneracy properties of both nuclear and mitochondrial genetic codes. Our work provides a new perspective to the understanding of the origin of the genetic code and the roles of symmetry principles in the organization of genetic information.
Affiliation(s)
- D L Gonzalez: CNR-IMM, UOS di Bologna, Via Gobetti 101, 40129 Bologna, Italy; Dipartimento di Scienze Statistiche, Università di Bologna, via delle Belle Arti 41, 40126 Bologna, Italy.
- S Giannerini: Dipartimento di Scienze Statistiche, Università di Bologna, via delle Belle Arti 41, 40126 Bologna, Italy.
- R Rosa: CNR-IMM, UOS di Bologna, Via Gobetti 101, 40129 Bologna, Italy.
7. Seligmann H, Warthi G. Chimeric Translation for Mitochondrial Peptides: Regular and Expanded Codons. Comput Struct Biotechnol J 2019; 17:1195-1202. [PMID: 31534643 PMCID: PMC6742854 DOI: 10.1016/j.csbj.2019.08.006] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 06/24/2019] [Revised: 08/19/2019] [Accepted: 08/21/2019] [Indexed: 02/07/2023]
Abstract
Frameshifting protein translation occasionally results from insertion of amino acids at isolated mono- or dinucleotide-expanded codons by tRNAs with expanded anticodons. Previous analyses of two different types of human mitochondrial MS proteomic data (Fisher and Waters technologies) detect peptides entirely corresponding to expanded codon translation. Here, these proteomic data are reanalyzed searching for peptides consisting of at least eight consecutive amino acids translated according to regular tricodons, and at least eight adjacent consecutive amino acids translated according to expanded codons. Both datasets include chimerically translated peptides (mono- and dinucleotide expansions, 42 and 37, respectively). The regular tricodon-encoded part of some chimeric peptides corresponds to standard human mitochondrial proteins (mono- and dinucleotide expansions, six (AT6, CytB, ND1, 2xND2, ND5) and one (ND1), respectively). Chimeric translation probably increases the diversity of mitogenome-encoded proteins, putatively producing functional proteins. These might result from translation by tRNAs with expanded anticodons, or from regular tricodon translation of RNAs where transcription/posttranscriptional edition systematically deleted mono- or dinucleotides after each trinucleotide. The pairwise matched combination of adjacent peptide parts translated from regular and expanded codons strengthens the hypothesis that translation of stretches of consecutive expanded codons occurs. Results indicate statistical translation producing distributions of alternative proteins. Genetic engineering should account for potential unexpected, unwanted secondary products.
Affiliation(s)
- Hervé Seligmann: The National Natural History Collections, The Hebrew University of Jerusalem, 91404 Jerusalem, Israel.
- Ganesh Warthi: Aix-Marseille University, IRD, VITROME, Institut Hospitalo-Universitaire Méditerranée-Infection, Marseille, France.
8. Fimmel E, Strüngmann L. Linear codes and the mitochondrial genetic code. Biosystems 2019; 184:103990. [PMID: 31326431 DOI: 10.1016/j.biosystems.2019.103990] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/09/2019] [Revised: 07/09/2019] [Accepted: 07/10/2019] [Indexed: 11/29/2022]
Abstract
The origin of the genetic code can certainly be regarded as one of the most challenging problems in the theory of molecular evolution. Thus the known variants of the genetic code and a possible common ancestry of them have been studied extensively in the literature. Gonzalez et al. (2012) developed the theory of a primeval mitochondrial genetic code composed of four-base codons. These were called tesserae and it was shown that the tessera code has some remarkable error detection capabilities. In our paper we will show that using classical coding theory we can construct the tessera code as a linear coding of the standard genetic code and at the same time it can be deduced from the code of all dinucleotides by Plotkin's construction. It shows that the tessera model of the mitochondrial code does not just have a biological explanation but also has a clear mathematical structure. This underlines the role that the tessera model might have played in evolution.
Affiliation(s)
- Elena Fimmel: Institute of Mathematical Biology, Faculty for Computer Sciences, and Competence Center for Algorithmic and Mathematical Methods in Biology, Biotechnology and Medicine, Mannheim University of Applied Sciences, 68163 Mannheim, Germany.
- Lutz Strüngmann: Institute of Mathematical Biology, Faculty for Computer Sciences, and Competence Center for Algorithmic and Mathematical Methods in Biology, Biotechnology and Medicine, Mannheim University of Applied Sciences, 68163 Mannheim, Germany.
9. Fimmel E, Michel CJ, Pirot F, Sereni JS, Strüngmann L. Mixed circular codes. Math Biosci 2019; 317:108231. [PMID: 31325443 DOI: 10.1016/j.mbs.2019.108231] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 09/07/2018] [Revised: 07/16/2019] [Accepted: 07/17/2019] [Indexed: 12/11/2022]
Abstract
By an extensive statistical analysis in genes of bacteria, archaea, eukaryotes, plasmids and viruses, a maximal C3-self-complementary trinucleotide circular code has been found to have the highest average occurrence in the reading frame of the ribosome during translation. Circular codes may play an important role in maintaining the correct reading frame. On the other hand, as several evolutionary theories propose primeval codes based on dinucleotides, trinucleotides and tetranucleotides, mixed circular codes were investigated. By using a graph-theoretical approach of circular codes recently developed, we study mixed circular codes, which are the union of a dinucleotide circular code, a trinucleotide circular code and a tetranucleotide circular code. Maximal mixed circular codes of (di,tri)-nucleotides, (tri,tetra)-nucleotides and (di,tri,tetra)-nucleotides are constructed, respectively. In particular, we show that any maximal dinucleotide circular code of size 6 can be embedded into a maximal mixed (di,tri)-nucleotide circular code such that its trinucleotide component is a maximal C3-comma-free code. The growth function of self-complementary mixed circular codes of dinucleotides and trinucleotides is given. Self-complementary mixed circular codes could have been involved in primitive genetic processes.
Affiliation(s)
- Elena Fimmel: Institute of Mathematical Biology, Faculty for Computer Sciences, Mannheim University of Applied Sciences, Mannheim 68163, Germany.
- Christian J Michel: Theoretical Bioinformatics, ICube, C.N.R.S., University of Strasbourg, 300 Boulevard Sébastien Brant, Illkirch 67400, France.
- François Pirot: Theoretical Bioinformatics, ICube, C.N.R.S., University of Strasbourg, 300 Boulevard Sébastien Brant, Illkirch 67400, France; LORIA (Orpailleur) and Dept. of Mathematics, University of Lorraine and Radboud University, Vandœuvre-lès-Nancy, France and Nijmegen, Netherlands.
- Jean-Sébastien Sereni: Theoretical Bioinformatics, ICube, C.N.R.S., University of Strasbourg, 300 Boulevard Sébastien Brant, Illkirch 67400, France.
- Lutz Strüngmann: Institute of Mathematical Biology, Faculty for Computer Sciences, Mannheim University of Applied Sciences, Mannheim 68163, Germany.
10. Mathematical fundamentals for the noise immunity of the genetic code. Biosystems 2018; 164:186-198. [DOI: 10.1016/j.biosystems.2017.09.007] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 06/13/2017] [Revised: 09/07/2017] [Accepted: 09/08/2017] [Indexed: 01/05/2023]
11. Bijective codon transformations show genetic code symmetries centered on cytosine's coding properties. Theory Biosci 2017; 137:17-31. [PMID: 29147851 DOI: 10.1007/s12064-017-0258-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 06/06/2017] [Accepted: 11/13/2017] [Indexed: 12/11/2022]
Abstract
Homology of some RNAs with template DNA requires systematic exchanges between nucleotides. Such exchanges produce 'swinger' RNA along 23 bijective transformations (nine symmetric, X ↔ Y; and 14 asymmetric, X → Y → Z → X, for example A ↔ C and A → C → G → A, respectively). Here, analyses compare amino acids coded by swinger-transformed codons to those coded by untransformed codons, defining coding invariance after transformations. Swinger transformations cluster according to coding invariance in four groups characterized by transformations into cytosine (C = C, T → C, A → C, and G → C). C's central mutational coding role shows that swinger transformations constrained genetic code genesis. Coding invariance post-transformations correlate positively/negatively with mitochondrial swinger transcription/lepidosaurian body temperature. Presumably, low/high temperatures stabilize/revert rare swinger polymerization modes, producing long swinger sequences/point mutations, respectively. Coding invariance after swinger transformations might compensate effects of swinger polymerizations in species with low body temperatures. Hypothetically, swinger transcription increased coding potential of RNA self-replicating protolife systems under heating/cooling cycles.
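The count of 23 transformations, split into nine symmetric and 14 asymmetric ones, can be reproduced by enumerating all permutations of the four nucleotides. The following Python sketch rests on one reading of the abstract, stated here as an assumption: a transformation is "symmetric" when applying it twice restores every nucleotide (an involution), and "asymmetric" otherwise. The function names are illustrative only.

```python
from itertools import permutations

BASES = "ACGT"

def classify_transformations():
    """Split the 23 non-identity permutations of the bases into
    involutions (assumed 'symmetric') and all others (assumed 'asymmetric')."""
    symmetric, asymmetric = [], []
    for image in permutations(BASES):
        mapping = dict(zip(BASES, image))
        if all(mapping[base] == base for base in BASES):
            continue  # skip the identity, which exchanges nothing
        is_involution = all(mapping[mapping[base]] == base for base in BASES)
        (symmetric if is_involution else asymmetric).append(mapping)
    return symmetric, asymmetric

def transform(sequence, mapping):
    """Apply a nucleotide exchange rule to a whole sequence."""
    return "".join(mapping[base] for base in sequence)

symmetric, asymmetric = classify_transformations()
print(len(symmetric), len(asymmetric))

# Example: the symmetric exchange A <-> C applied to a short sequence.
a_c_exchange = {"A": "C", "C": "A", "G": "G", "T": "T"}
print(transform("ACGT", a_c_exchange))
```

Under this reading the symmetric group also contains the three double exchanges (for example A ↔ C together with G ↔ T), and the asymmetric group contains both three-base and four-base cycles.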
12. Diletter circular codes over finite alphabets. Math Biosci 2017; 294:120-129. [PMID: 29024747 DOI: 10.1016/j.mbs.2017.10.001] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 05/27/2017] [Revised: 08/26/2017] [Accepted: 10/08/2017] [Indexed: 11/22/2022]
Abstract
The graph approach of circular codes recently developed (Fimmel et al., 2016) allows here a detailed study of diletter circular codes over finite alphabets. A new class of circular codes is identified, strong comma-free codes. New theorems are proved with the diletter circular codes of maximal length in relation to (i) a characterisation of their graphs as acyclic tournaments; (ii) their explicit description; and (iii) the non-existence of other maximal diletter circular codes. The maximal lengths of paths in the graphs of the comma-free and strong comma-free codes are determined. Furthermore, for the first time, diletter circular codes are enumerated over finite alphabets. Biological consequences of dinucleotide circular codes are analysed with respect to their embedding in the trinucleotide circular code X identified in genes and to the periodicity modulo 2 observed in introns. An evolutionary hypothesis of circular codes is also proposed according to their combinatorial properties.
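The graph approach mentioned in this abstract can be sketched in a few lines of Python. The sketch assumes the criterion usually attributed to Fimmel et al. (2016): each word contributes an edge from every proper prefix to the matching suffix, and a code is circular exactly when the resulting directed graph has no cycle. The function names and the example codes are illustrative assumptions, not taken from the paper.

```python
def code_graph(code):
    """Build the directed graph of a code: for every word and every split
    point, add an edge from the prefix to the corresponding suffix."""
    edges = {}
    for word in code:
        for i in range(1, len(word)):
            edges.setdefault(word[:i], set()).add(word[i:])
            edges.setdefault(word[i:], set())
    return edges

def is_acyclic(edges):
    """Depth-first search that reports False as soon as a back edge is found."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {vertex: WHITE for vertex in edges}

    def visit(vertex):
        color[vertex] = GREY
        for successor in edges[vertex]:
            if color[successor] == GREY:
                return False  # back edge: the graph contains a cycle
            if color[successor] == WHITE and not visit(successor):
                return False
        color[vertex] = BLACK
        return True

    return all(color[vertex] != WHITE or visit(vertex) for vertex in edges)

# Six dinucleotides whose graph is an acyclic tournament on {A, C, G, T}.
tournament_code = {"AC", "AG", "AT", "CG", "CT", "GT"}
print(is_acyclic(code_graph(tournament_code)))

# Adding CA closes the cycle A -> C -> A.
print(is_acyclic(code_graph(tournament_code | {"CA"})))
```

For dinucleotides the vertices are single letters, so a code of six words over four letters with an acyclic graph is a tournament, which matches characterisation (i) in the abstract.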
13. Hu Z, Petoukhov SV, Petukhova ES. I-Ching, dyadic groups of binary numbers and the geno-logic coding in living bodies. Progress in Biophysics and Molecular Biology 2017; 131:354-368. [PMID: 28935152 DOI: 10.1016/j.pbiomolbio.2017.08.018] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 05/04/2017] [Revised: 08/25/2017] [Accepted: 08/29/2017] [Indexed: 02/05/2023]
Abstract
The ancient Chinese book I-Ching was written a few thousand years ago. It introduces the system of symbols Yin and Yang (equivalents of 0 and 1). It had a powerful impact on the culture, medicine and science of ancient China and several other countries. From the modern standpoint, I-Ching declares the importance of dyadic groups of binary numbers for Nature. The system of I-Ching is represented by tables with dyadic groups of 4 bigrams, 8 trigrams and 64 hexagrams, which were declared to be fundamental archetypes of Nature. The ancient Chinese did not know about the genetic code of protein sequences of amino acids but this code is organized in accordance with the I-Ching: in particular, the genetic code is constructed on DNA molecules using 4 nitrogenous bases, 16 doublets, and 64 triplets. The article also describes the usage of dyadic groups as a foundation of the bio-mathematical doctrine of the geno-logic code, which exists in parallel with the known genetic code of amino acids but serves a different goal: to code the inherited algorithmic processes using the logical holography and the spectral logic of systems of genetic Boolean functions. Some relations of this doctrine with the I-Ching are discussed. In addition, the ratios of musical harmony that can be revealed in the parameters of DNA structure are also represented in the I-Ching book.
Affiliation(s)
- Zhengbing Hu: Central China Normal University, No. 152 Louyu Road, 430079, Wuhan, China.
- Sergey V Petoukhov: Mechanical Engineering Research Institute of Russian Academy of Sciences, Malyi Kharitonievsky Pereulok, 4, Moscow, 101990, Russia; Moscow State Conservatory by P.I. Tchaikovsky, Bolshaya Nikitskaya, 13/6, Moscow, 125009, Russia.
- Elena S Petukhova: Mechanical Engineering Research Institute of Russian Academy of Sciences, Malyi Kharitonievsky Pereulok, 4, Moscow, 101990, Russia.
14. Petoukhov SV. Genetic coding and united-hypercomplex systems in the models of algebraic biology. Biosystems 2017; 158:31-46. [DOI: 10.1016/j.biosystems.2017.05.002] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 03/28/2017] [Revised: 05/10/2017] [Accepted: 05/10/2017] [Indexed: 11/26/2022]
15. The Matrix Method of Representation, Analysis and Classification of Long Genetic Sequences. Information 2017. [DOI: 10.3390/info8010012] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Indexed: 11/16/2022]
16.
17. Seligmann H. Natural mitochondrial proteolysis confirms transcription systematically exchanging/deleting nucleotides, peptides coded by expanded codons. J Theor Biol 2016; 414:76-90. [PMID: 27899286 DOI: 10.1016/j.jtbi.2016.11.021] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 04/13/2016] [Revised: 11/11/2016] [Accepted: 11/22/2016] [Indexed: 12/19/2022]
Abstract
Protein sequences have higher linguistic complexities than human languages. This indicates undeciphered multilayered, overprinted information/genetic codes. Some superimposed genetic information is revealed by detections of transcripts systematically (a) exchanging nucleotides (nine symmetric, e.g. A<->C, fourteen asymmetric, e.g. A->C->G->A, swinger RNAs) translated according to tri-, tetra- and pentacodons, and (b) deleting mono-, dinucleotides after each trinucleotide (delRNAs). Here analyses of two independent proteomic datasets considering natural proteolysis confirm independently translation of these non-canonical RNAs, also along tetra- and pentacodons, increasing coverage of putative, cryptically encoded proteins. Analyses assuming endoproteinase GluC and elastase digestions (cleavages after residues D, E, and A, L, I, V, respectively) detect additional peptides colocalizing with detected non-canonical RNAs. Analyses detect fewer peptides matching GluC-, elastase- than trypsin-digestions: artificial trypsin-digestion outweighs natural proteolysis. Results suggest occurrences of complete proteins entirely matching non-canonical, superimposed encoding(s). Protein-coding after bijective transformations could explain genetic code symmetries, such as along Rumer's transformation.
Affiliation(s)
- Hervé Seligmann: Unité de Recherche sur les Maladies Infectieuses et Tropicales Émergentes, Faculté de Médecine, URMITE CNRS-IRD 198 UMER 6236, IHU (Institut Hospitalo-Universitaire), Aix-Marseille University, Marseille, France.
18. Unbiased Mitoproteome Analyses Confirm Non-canonical RNA, Expanded Codon Translations. Comput Struct Biotechnol J 2016; 14:391-403. [PMID: 27830053 PMCID: PMC5094600 DOI: 10.1016/j.csbj.2016.09.004] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 07/19/2016] [Revised: 09/28/2016] [Accepted: 09/29/2016] [Indexed: 01/14/2023]
Abstract
Proteomic MS/MS mass spectrometry detections are usually biased towards peptides cleaved by experimentally added digestion enzyme(s). Hence peptides resulting from spontaneous degradation and natural proteolysis usually remain undetected. Previous analyses of tryptic human proteome data (cleavage after K, R) detected non-canonical tryptic peptides translated according to tetra- and pentacodons (codons expanded by silent mono- and dinucleotides), and from transcripts systematically (a) deleting mono-, dinucleotides after trinucleotides (delRNAs), (b) exchanging nucleotides according to 23 bijective transformations. Nine symmetric and fourteen asymmetric nucleotide exchanges (X ↔ Y, e.g. A ↔ C; and X → Y → Z → X, e.g. A → C → G → A) produce swinger RNAs. Here unbiased reanalyses of these proteomic data detect preferentially non-canonical tryptic peptides despite assuming random cleavage. Unbiased analyses couldn't reconstruct experimental tryptic digestion if most detected non-canonical peptides were false positives. Detected non-tryptic non-canonical peptides map preferentially on corresponding, previously described non-canonical transcripts, as for tryptic non-canonical peptides. Hence unbiased analyses independently confirm previous trypsin-biased analyses that showed translations of del- and swinger RNA and expanded codons. Accounting for natural proteolysis completes trypsin-biased mitopeptidome analyses, independently confirms non-canonical transcriptions and translations.
19. Seligmann H. Natural chymotrypsin-like-cleaved human mitochondrial peptides confirm tetra-, pentacodon, non-canonical RNA translations. Biosystems 2016; 147:78-93. [PMID: 27477600 DOI: 10.1016/j.biosystems.2016.07.010] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 04/10/2016] [Revised: 07/15/2016] [Accepted: 07/26/2016] [Indexed: 12/22/2022]
Abstract
Mass spectra of human mitochondrial peptides match non-canonical transcripts systematically (a) deleting mono/dinucleotides after trinucleotides (delRNA), (b) exchanging nucleotides (swinger RNA), translated according to tri, (c) tetra- and pentacodons (codons expanded by a 4th (and 5th) silent nucleotide(s)). Swinger transcriptions are 23 bijective transformations, nine symmetric (X<->Y, e.g. A<->C) and fourteen asymmetric exchanges (X->Y->Z->X, e.g. A->C->G->A). Here, proteomic analyses assuming cleavage after W,Y, F (chymotrypsin-like, for trypsinized samples) detect fewer chymotrypsinized than trypsinized peptides. Detected non-canonical peptides map preferentially on detected non-canonical RNAs for chymotrypsinized peptides, as previously found for trypsinized peptides. This suggests residual natural chymotrypsin-like digestion detectable within experimentally trypsinized peptide data. Some trypsinized peptides are detected twice, by analyses assuming trypsin, and those assuming chymotrypsin cleavages. They have higher spectra counts than peptides detected only once, meaning that abundant peptides are more frequently detected, but detection certainties resemble those for peptides detected only once. Analyses assuming 'incorrect' digestions are inadequate negative controls for digestion enzymes naturally active in biological samples. Chymotrypsin-analyses confirm non-canonical transcriptions/translations independently of results obtained assuming trypsinization, increase non-canonical peptidome coverage, indicating mitogenome-encoding of yet undetected proteins.
Affiliation(s)
- Hervé Seligmann: Unité de Recherche sur les Maladies Infectieuses et Tropicales Émergentes, Faculté de Médecine, Université d'Aix-Marseille, URMITE CNRS-IRD 198 UMER 6236, Marseille, France.
20. Systematically frameshifting by deletion of every 4th or 4th and 5th nucleotides during mitochondrial transcription: RNA self-hybridization regulates delRNA expression. Biosystems 2016; 142-143:43-51. [PMID: 27018206 DOI: 10.1016/j.biosystems.2016.03.009] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 01/21/2016] [Revised: 03/11/2016] [Accepted: 03/23/2016] [Indexed: 02/05/2023]
Abstract
In mitochondria, secondary structures punctuate post-transcriptional RNA processing. Recently described transcripts, called delRNAs, match the human mitogenome after systematic deletions of every 4th, respectively every 4th and 5th, nucleotide. Here I explore predicted stem-loop hairpin formation by delRNAs, and their associations with delRNA transcription and detected peptides matching their translation. Despite missing 25, respectively 40% of the nucleotides in the original sequence, del-transformed sequences form significantly more secondary structures than corresponding randomly shuffled sequences, indicating biological function, independently of, and in combination with, previously detected delRNA and thereof translated peptides. Self-hybridization decreases delRNA abundances, indicating downregulation. Systematic deletions of the human mitogenome reveal new, unsuspected coding and structural information.
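The del-transformation described here is simple enough to sketch directly. The Python function below is an illustration written for this listing, not code from the paper: it keeps each trinucleotide and drops the one or two nucleotides that follow it, which removes 25% or 40% of a sequence whose length is a multiple of the period. The function name and the toy sequences are assumptions.

```python
def delete_after_trinucleotides(sequence, n_deleted):
    """Keep three nucleotides, then skip n_deleted nucleotides, repeatedly.

    n_deleted=1 drops every 4th nucleotide;
    n_deleted=2 drops every 4th and 5th nucleotide.
    """
    if n_deleted not in (1, 2):
        raise ValueError("n_deleted must be 1 or 2")
    period = 3 + n_deleted
    return "".join(sequence[i:i + 3] for i in range(0, len(sequence), period))

# Toy sequences, chosen only so that the lost fraction is easy to read off.
mono = delete_after_trinucleotides("ACGTACGTACGT", 1)
di = delete_after_trinucleotides("ACGTAACGTAACGTAACGTA", 2)
print(mono, 1 - len(mono) / 12)
print(di, 1 - len(di) / 20)
```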
21. Fimmel E, Strüngmann L. Maximal dinucleotide comma-free codes. J Theor Biol 2015; 389:206-13. [PMID: 26562635 DOI: 10.1016/j.jtbi.2015.10.022] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 05/22/2015] [Revised: 10/16/2015] [Accepted: 10/19/2015] [Indexed: 10/22/2022]
Abstract
The problem of retrieval and maintenance of the correct reading frame plays a significant role in RNA transcription. Circular codes, and especially comma-free codes, can help to understand the underlying mechanisms of error-detection in this process. In recent years much attention has been paid to the investigation of trinucleotide circular codes (see, for instance, Fimmel et al., 2014; Fimmel and Strüngmann, 2015a; Michel and Pirillo, 2012; Michel et al., 2012, 2008), while dinucleotide codes had been touched on only marginally, even though dinucleotides are associated to important biological functions. Recently, all maximal dinucleotide circular codes were classified (Fimmel et al., 2015; Michel and Pirillo, 2013). The present paper studies maximal dinucleotide comma-free codes and their close connection to maximal dinucleotide circular codes. We give a construction principle for such codes and provide a graphical representation that allows them to be visualized geometrically. Moreover, we compare the results for dinucleotide codes with the corresponding situation for trinucleotide maximal self-complementary C3-codes. Finally, the results obtained are discussed with respect to Crick's hypothesis about frame-shift-detecting codes without commas.
Affiliation(s)
- Elena Fimmel: Institute of Mathematical Biology, Faculty of Computer Sciences, Mannheim University of Applied Sciences, 68163 Mannheim, Germany.
- Lutz Strüngmann: Institute of Mathematical Biology, Faculty of Computer Sciences, Mannheim University of Applied Sciences, 68163 Mannheim, Germany.
22. Michel CJ, Pellegrini M, Pirillo G. Maximal dinucleotide and trinucleotide circular codes. J Theor Biol 2015; 389:40-6. [PMID: 26382231 DOI: 10.1016/j.jtbi.2015.08.029] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 03/02/2015] [Revised: 07/28/2015] [Accepted: 08/29/2015] [Indexed: 10/23/2022]
Abstract
We determine here the number and the list of maximal dinucleotide and trinucleotide circular codes. We prove that there is no maximal dinucleotide circular code having strictly fewer than 6 elements (maximum size of dinucleotide circular codes). On the other hand, a computer calculation shows that there are maximal trinucleotide circular codes with fewer than 20 elements (maximum size of trinucleotide circular codes). More precisely, there are maximal trinucleotide circular codes with 14, 15, 16, 17, 18 and 19 elements and no maximal trinucleotide circular code having fewer than 14 elements. We give the same information for the maximal self-complementary dinucleotide and trinucleotide circular codes. The amino acid distribution of maximal trinucleotide circular codes is also determined.
Affiliation(s)
- Christian J Michel: Theoretical Bioinformatics, ICube, University of Strasbourg, CNRS, 300 Boulevard Sébastien Brant, 67400 Illkirch, France.
- Marco Pellegrini: Dipartimento di Matematica e Informatica "U.Dini", viale Morgagni 67/A, 50134 Firenze, Italy.
- Giuseppe Pirillo: Consiglio Nazionale delle Ricerche, Istituto di Analisi dei Sistemi ed Informatica "Antonio Ruberti", Unità di Firenze, Dipartimento di Matematica e Informatica "U.Dini", viale Morgagni 67/A, 50134 Firenze, Italy; Université de Marne-la-Vallée, 5 boulevard Descartes, 77454 Marne-la-Vallée Cedex 2, France.