1
Sojitra D, Biswas Hathiwala M, Hathiwala G, Bishoyi AK. Significance of genetic code module structure in gene expression and GC content enhancement in RNA sequences. Biosystems 2024; 237:105135. [PMID: 38320621 DOI: 10.1016/j.biosystems.2024.105135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Revised: 01/29/2024] [Accepted: 01/30/2024] [Indexed: 02/08/2024]
Abstract
The existing algebraic models of the genetic code contribute to the understanding of the physicochemical characteristics of the amino acids. However, the process of translating a gene into a phenotype is highly complex, and biases in codon usage make gene expression more intricate still. This paper explores an algebraic structure called a module on the set of codons as well as on the set of RNA sequences. We study the potential implications of these structures for gene expression and for the GC content of an RNA sequence. The base order {C,U,G,A} appears to possess greater biological significance than many of the orders previously studied. We have developed a novel algorithm to generate RNA sequences with high GC content, aiming to enhance the thermostability of biomolecules. The insights gained from this investigation may have applications in biomolecular modeling and docking, protein engineering, drug development, and related fields.
Affiliation(s)
- Devangi Sojitra
- Department of Mathematics, Marwadi University, Rajkot, 360003, Gujarat, India.
- Gautam Hathiwala
- Department of Mathematics, Marwadi University, Rajkot, 360003, Gujarat, India.
- Ashok Kumar Bishoyi
- Department of Microbiology, Marwadi University, Rajkot, 360003, Gujarat, India.
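The abstract above mentions generating RNA sequences with higher GC content. The listing does not describe the authors' module-based algorithm, so the following is only a minimal sketch of a different, much simpler idea: swapping each codon for a synonymous codon of the standard genetic code that contains the most G and C bases. All function names here (`gc_content`, `translate`, `gc_rich_synonym`, `raise_gc`) are hypothetical and chosen for illustration.

```python
from itertools import product

# Standard genetic code, laid out in the usual U, C, A, G order ("*" = stop).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def gc_content(seq):
    """Fraction of bases in seq that are G or C (0.0 for an empty sequence)."""
    return sum(b in "GC" for b in seq) / len(seq) if seq else 0.0


def codons_of(rna):
    """Split rna into complete codons; a trailing partial codon is dropped."""
    return [rna[i:i + 3] for i in range(0, len(rna) - len(rna) % 3, 3)]


def translate(rna):
    """Translate an RNA string with the standard genetic code."""
    return "".join(CODON_TABLE[c] for c in codons_of(rna))


def gc_rich_synonym(codon):
    """Return a synonymous codon with the most G/C bases.

    Ties are broken in favour of the original codon, then table order.
    """
    aa = CODON_TABLE[codon]
    synonyms = [c for c, a in CODON_TABLE.items() if a == aa]
    return max(synonyms, key=lambda c: (sum(b in "GC" for b in c), c == codon))


def raise_gc(rna):
    """Greedy codon-by-codon recoding (an assumption, not the paper's method)."""
    return "".join(gc_rich_synonym(c) for c in codons_of(rna))


before = "AUGUUUAAAUAA"
after = raise_gc(before)
print(before, gc_content(before))
print(after, gc_content(after))
```

The recoded sequence encodes the same protein, because only synonymous substitutions are made; it ignores codon usage bias and RNA structure, both of which a practical design tool would need to consider.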
2
Borah C, Ali T. Genetic code noise immunity features: Degeneracy and frameshift correction. Gene Reports 2022. [DOI: 10.1016/j.genrep.2022.101707] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
3
Ali T, Borah C. Analysis of amino acids network based on mutation and base positions. Gene Reports 2021. [DOI: 10.1016/j.genrep.2021.101291] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
4
Bora PK, Hazarika P, Baruah AK. Distance based amino acids network analysis. Gene Reports 2020. [DOI: 10.1016/j.genrep.2020.100933] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
5
Fimmel E, Strüngmann L. Mathematical fundamentals for the noise immunity of the genetic code. Biosystems 2018; 164:186-98. [DOI: 10.1016/j.biosystems.2017.09.007] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 06/13/2017] [Revised: 09/07/2017] [Accepted: 09/08/2017] [Indexed: 01/05/2023]
6
Mabrouk MS, Naeem SM, Eldosoky MA. Different genomic signal processing methods for eukaryotic gene prediction: a systematic review. Biomed Eng Appl Basis Commun 2017. [DOI: 10.4015/s1016237217300012] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 11/07/2022]
Abstract
Bioinformatics has firmly established itself as a discipline in molecular biology and spans a wide range of subjects, from structural biology and genomics to gene expression studies. Bioinformatics is the application of computer technology to the management of biological information. Genomic signal processing (GSP) techniques have been applied widely in bioinformatics and will continue to play an essential part in the investigation of biomedical problems. GSP refers to the use of digital signal processing (DSP) methods to analyze genomic data (e.g. DNA sequences). Recently, applications of GSP in bioinformatics have received considerable attention, such as the identification of protein-coding regions in DNA, the identification of reading frames, cancer detection and others. Cancer, known medically as malignant neoplasm, is one of the most dangerous diseases the world faces and has raised the death rate in recent years, so detecting it at an early stage offers a promising way to identify and treat it. GSP can be used to detect cancerous cells, which are often caused by genetic abnormalities. This systematic review discusses some GSP applications in bioinformatics in general. The GSP techniques used for cancer detection in particular are presented, collecting recent results and the current state of the field as a basis for new research.
Affiliation(s)
- Mai S. Mabrouk
- Biomedical Engineering Department, Faculty of Engineering, Misr University for Science and Technology (MUST University), Cairo, Egypt
- Safaa M. Naeem
- Biomedical Engineering Department, Faculty of Engineering, Helwan University, Cairo, Egypt
- Mohamed A. Eldosoky
- Biomedical Engineering Department, Faculty of Engineering, Helwan University, Cairo, Egypt
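The abstract above lists the identification of protein-coding regions as a GSP application but gives no method. One widely used DSP approach, shown here as a hedged sketch rather than as anything taken from the review itself, maps a DNA string to four binary indicator sequences (one per base) and measures the spectral power at frequency N/3, where coding regions tend to show a peak. The function name `period3_power` and the normalization by sequence length are assumptions made for this illustration.

```python
import cmath


def period3_power(dna):
    """Summed power of the four base-indicator sequences at frequency N/3.

    For each base b, the indicator sequence is 1 where dna[i] == b and 0
    elsewhere. The DFT coefficient at k = N/3 reduces to a sum of cube roots
    of unity, exp(-2*pi*i*n/3), over the positions n where the base occurs.
    The result is divided by the sequence length (an assumed normalization).
    """
    n = len(dna)
    if n == 0:
        return 0.0
    roots = [cmath.exp(-2j * cmath.pi * k / 3) for k in range(3)]
    total = 0.0
    for base in "ACGT":
        coeff = sum(roots[i % 3] for i, b in enumerate(dna) if b == base)
        total += abs(coeff) ** 2
    return total / n


# A perfectly 3-periodic toy sequence versus a homopolymer.
print(period3_power("ATG" * 10))
print(period3_power("A" * 30))
```

In practice the measure is usually evaluated over a sliding window along a genome, and windows with a high value are flagged as candidate coding regions; window length and threshold are tuning choices not covered here.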
7
Sawamura J, Morishita S, Ishigooka J. A group matrix representation relevant to scales of measurement of clinical disease states via stratified vectors. Theor Biol Med Model 2016; 13:5. [PMID: 26856979 PMCID: PMC4746825 DOI: 10.1186/s12976-016-0031-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2015] [Accepted: 01/20/2016] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Previously, we applied basic group theory and related concepts to scales of measurement of clinical disease states and clinical findings (including laboratory data). To gain a more concrete comprehension, we here apply the concept of matrix representation, which was not explicitly exploited in our previous work. METHODS Starting with a set of orthonormal vectors, called the basis, an operator Rj (an N-tuple patient disease state at the j-th session) was expressed as a set of stratified vectors representing plural operations on individual components, so as to satisfy the group matrix representation. RESULTS The stratified vectors containing individual unit operations were combined into one-dimensional square matrices [Rj]s. The [Rj]s meet the matrix representation of a group (ring) as a K-algebra. Using the same-sized matrix of stratified vectors, we can also express changes in the plural set of [Rj]s. The method is demonstrated on simple examples. CONCLUSIONS Despite the incompleteness of our model, the group matrix representation of stratified vectors offers a formal mathematical approach to clinical medicine, aligning it with other branches of natural science.
Affiliation(s)
- Jitsuki Sawamura
- Department of Psychiatry, Tokyo Women's Medical University, Tokyo, Japan.
- Shigeru Morishita
- Depression Prevention Medical Center, Inariyama Takeda Hospital, Kyoto, Japan.
- Jun Ishigooka
- Department of Psychiatry, Tokyo Women's Medical University, Tokyo, Japan.
8
Abstract
The genetic code is the rule by which DNA stores the genetic information about the formation of protein molecules. In this paper, the genetic code is equipped with a partial ordering, from which a lattice structure is developed. The codon–anticodon interaction, hydrogen bond number and the chemical types of bases play an important role in the partial ordering. We have established some relations between the lattice structure of the genetic code and physicochemical properties of amino acids. Taking into consideration the evolutionary importance of base positions in codons, we have constructed a distance matrix for the amino acids. Further, with a real-life example, we have demonstrated the relationship between frequently occurring mutations and codon distances.
Affiliation(s)
- Nisha Gohain
- Department of Mathematics, Dibrugarh University, Dibrugarh, Assam, India
- Tazid Ali
- Department of Mathematics, Dibrugarh University, Dibrugarh, Assam, India
- Adil Akhtar
- Department of Mathematics, Dibrugarh University, Dibrugarh, Assam, India
9
Sawamura J, Morishita S, Ishigooka J. A symmetry model for genetic coding via a wallpaper group composed of the traditional four bases and an imaginary base E: towards category theory-like systematization of molecular/genetic biology. Theor Biol Med Model 2014; 11:18. [PMID: 24885369 PMCID: PMC4057574 DOI: 10.1186/1742-4682-11-18] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 09/25/2013] [Accepted: 02/09/2014] [Indexed: 01/20/2023] Open
Abstract
Background Previously, we suggested prototypal models that describe some clinical states based on group postulates. Here, we demonstrate a group/category theory-like model for molecular/genetic biology as an alternative application of our previous model. Specifically, we focus on deoxyribonucleic acid (DNA) base sequences. Results We construct a wallpaper pattern based on a five-letter cruciform motif with letters C, A, T, G, and E. Whereas the first four letters represent the standard DNA bases, the fifth is introduced for ease in formulating group operations that reproduce insertions and deletions of DNA base sequences. A basic group Z5 = {r, u, d, l, n} of operations is defined for the wallpaper pattern, with which a sequence of points can be generated corresponding to changes of a base in a DNA sequence by following the orbit of a point of the pattern under operations in group Z5. Other manipulations of DNA sequence can be treated using a vector-like notation ‘Dj’ corresponding to a DNA sequence but based on the five-letter base set; also, ‘Dj’s are expressed graphically. Insertions and deletions of a series of letters ‘E’ are admitted to assist in describing DNA recombination. Likewise, a vector-like notation Rj can be constructed for sequences of ribonucleic acid (RNA). The wallpaper group B = {Z5×∞, ●} (an ∞-fold Cartesian product of Z5) acts on Dj (or Rj) yielding changes to Dj (or Rj) denoted by ‘Dj◦B(j→k) = Dk’ (or ‘Rj◦B(j→k) = Rk’). Based on the operations of this group, two types of groups—a modulo 5 linear group and a rotational group over the Gaussian plane, acting on the five bases—are linked as parts of the wallpaper group for broader applications. As a result, changes, insertions/deletions and DNA (RNA) recombination (partial/total conversion) are described. As an exploratory study, a notation for the canonical “central dogma” via a category theory-like way is presented for future developments. 
Conclusions Despite the large incompleteness of our methodology, there is fertile ground to consider a symmetry model for genetic coding based on our specific wallpaper group. A more integrated formulation containing “central dogma” for future molecular/genetic biology remains to be explored.
Affiliation(s)
- Jitsuki Sawamura
- Department of Psychiatry, Tokyo Women's Medical University, Tokyo, Japan.
10
Pérez-Montoto LG, Dea-Ayuela MA, Prado-Prado FJ, Bolas-Fernández F, Ubeira FM, González-Díaz H. Study of peptide fingerprints of parasite proteins and drug-DNA interactions with Markov-Mean-Energy invariants of biopolymer molecular-dynamic lattice networks. Polymer 2009; 50:3857-3870. [PMID: 32287404 PMCID: PMC7111648 DOI: 10.1016/j.polymer.2009.05.055] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/10/2009] [Revised: 05/06/2009] [Accepted: 05/14/2009] [Indexed: 11/26/2022]
Abstract
Since the advent of Molecular Dynamics (MD) in biopolymer science with the study by Karplus et al. on protein dynamics, MD has become by far the most well-established computational technique for investigating the structure and function of biomolecules and their respective complexes and interactions. The analysis of MD trajectories (MDTs) remains, however, the greatest challenge and requires a great deal of insight, experience, and effort. Here, we introduce a new class of invariants for MDTs based on the spatial distribution of Mean-Energy values ξk(L) on a 2D Euclidean space representation of the MDTs. The procedure forces one MD trajectory to fold into a 2D Cartesian coordinate system using a step-by-step procedure driven by simple rules. The ξk(L) values are invariants of a Markov matrix (¹Π), which describes the probabilities of transition between two states in the new 2D space and is associated with a graph representation of MDTs similar to the lattice networks (LNs) of DNA and protein sequences. We also introduce a new algorithm to perform phylogenetic analysis of peptides based on MDTs instead of the sequence of the polypeptide. In a first experiment, we illustrate this algorithm for 35 peptides present in the Peptide Mass Fingerprint (PMF) of a new protein of Leishmania infantum studied in this work. We report, for the first time, 2D electrophoresis isolation, MALDI-TOF mass spectroscopy characterization, and MASCOT search results for this PMF. In a second experiment, we construct the LNs for 422 MDTs obtained in DNA-drug docking simulations of the interaction of 57 anticancer furocoumarins with a DNA oligonucleotide. We calculated the respective ξk(L) values for all these LNs and used them as inputs to train a new classifier with accuracy of 85.44% and 84.91% in training and validation, respectively. The new model can be used as a scoring function to guide DNA-drug docking studies in the design of new coumarins for PUVA therapy.
The new phylogenetic analysis algorithms encode information different from sequence similarity and may be used to analyze MDTs obtained in docking or modeling experiments for any class of biopolymers. The work opens new perspectives on the analysis and applications of MD in polymer sciences.
Affiliation(s)
- Lázaro Guillermo Pérez-Montoto
- Department of Microbiology and Parasitology, Faculty of Pharmacy, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain; Department of Organic Chemistry, Faculty of Pharmacy, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
- María Auxiliadora Dea-Ayuela
- Departamento de Atención Sanitaria, Salud Pública y Sanidad Animal, Facultad CC Experimentales y de La Salud, Universidad CEU Cardenal Herrera, 46113 Moncada (Valencia), Spain
- Francisco J. Prado-Prado
- Department of Microbiology and Parasitology, Faculty of Pharmacy, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain; Department of Organic Chemistry, Faculty of Pharmacy, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
- Florencio M. Ubeira
- Department of Microbiology and Parasitology, Faculty of Pharmacy, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
- Humberto González-Díaz
- Department of Microbiology and Parasitology, Faculty of Pharmacy, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain (corresponding author)
11
Sánchez R, Grau R. An algebraic hypothesis about the primeval genetic code architecture. Math Biosci 2009; 221:60-76. [PMID: 19607845 DOI: 10.1016/j.mbs.2009.07.001] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 06/23/2008] [Revised: 06/23/2009] [Accepted: 07/09/2009] [Indexed: 11/26/2022]
Abstract
A plausible architecture of an ancient genetic code is derived from an extended base triplet vector space over the Galois field of the extended base alphabet {D,A,C,G,U}, where symbol D represents one or more hypothetical bases with unspecific pairings. We hypothesized that the high degeneracy of a primeval genetic code with five bases and the gradual origin and improvement of a primeval DNA repair system could have made possible the transition from ancient to modern genetic codes. Our results suggest that the Watson-Crick base pairings G≡C and A=U and the non-specific base pairing of the hypothetical ancestral base D, used to define the sum and product operations, are sufficient to determine the coding constraints of the primeval and the modern genetic code, as well as the transition from the former to the latter. Geometrical and algebraic properties of this vector space reveal that the present codon assignment of the standard genetic code could be induced from a primeval codon assignment. Besides, the Fourier spectrum of the extended DNA genome sequences derived from the multiple sequence alignment suggests that the so-called period-3 property of present coding DNA sequences could also have existed in ancient coding DNA sequences. The phylogenetic analyses achieved with metrics defined in the N-dimensional vector space (B^3)^N of DNA sequences and with the new evolutionary model presented here also suggest that an ancient DNA coding sequence with five or more bases does not contradict the expected evolutionary history.
Affiliation(s)
- Robersy Sánchez
- Research Institute of Tropical Roots, Tuber Crops and Plantains (INIVIT), Biotechnology Group, Villa Clara, Cuba
12
González-Díaz H, Prado-Prado FJ. Unified QSAR and network-based computational chemistry approach to antimicrobials, part 1: Multispecies activity models for antifungals. J Comput Chem 2007; 29:656-67. [DOI: 10.1002/jcc.20826] [Citation(s) in RCA: 71] [Impact Index Per Article: 4.2] [Indexed: 01/19/2023]
13
Sánchez R, Grau R. A novel algebraic structure of the genetic code over the Galois field of four DNA bases. Acta Biotheor 2007; 54:27-42. [PMID: 16823609 DOI: 10.1007/s10441-006-6192-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 02/10/2005] [Accepted: 12/22/2005] [Indexed: 11/29/2022]
Abstract
A novel algebraic structure of the genetic code is proposed. Here, the principal partitions of the genetic code table were obtained as equivalence classes of quotient spaces of the genetic code vector space over the Galois field of the four DNA bases. The new algebraic structure shows strong connections among algebraic relationships, codon assignment and physicochemical properties of amino acids. Moreover, a distance function defined between the codon binary representations in the vector space was demonstrated to behave linearly with respect to physical variables such as the mean of amino acid interaction energies in proteins. It was also noticed that the distance between wild-type and mutant codons tends toward smaller values in mutational variants of four genes, i.e., human phenylalanine hydroxylase, human beta-globin, HIV-1 protease and HIV-1 reverse transcriptase. These results strongly suggest that deterministic rules must be involved in the origin of the genetic code.
Affiliation(s)
- Robersy Sánchez
- Research Institute of Tropical Roots, Tuber Crops and Banana (INIVIT), Biotechnology Group, Santo Domingo, Villa Clara, Cuba.