1
Quadrini M, Tesei L, Merelli E. Automatic generation of pseudoknotted RNAs taxonomy. BMC Bioinformatics 2023; 23:575. [PMID: 37322429 DOI: 10.1186/s12859-023-05362-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2022] [Accepted: 05/25/2023] [Indexed: 06/17/2023]
Abstract
BACKGROUND The ability to compare RNA secondary structures is important in understanding their biological function and for grouping similar organisms into families by looking at evolutionarily conserved sequences such as 16S rRNA. Most comparison methods and benchmarks in the literature focus on pseudoknot-free structures due to the difficulty of mapping pseudoknots in classical tree representations. Some approaches exist that make it possible to cluster pseudoknotted RNAs, but there is no general framework for evaluating their performance. RESULTS We introduce an evaluation framework based on a similarity/dissimilarity measure obtained by a comparison method and agglomerative clustering. Their combination automatically partitions a set of molecules into groups. To illustrate the framework we define and make available a benchmark of pseudoknotted (16S and 23S) and pseudoknot-free (5S) rRNA secondary structures belonging to Archaea, Bacteria and Eukaryota. We also consider five different comparison methods from the literature that are able to manage pseudoknots. For each method we cluster the molecules in the benchmark to obtain the taxa at the rank of phylum according to the European Nucleotide Archive curated taxonomy. We compute appropriate metrics for each method and we compare their suitability to reconstruct the taxa.
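The combination of a dissimilarity measure with agglomerative clustering described in this abstract can be sketched as follows. This is a minimal illustration only: the dissimilarity matrix `D` is made-up data, the function name `agglomerative_clusters` is hypothetical, and average linkage is an assumed choice, since the abstract does not say which linkage the authors use.

```python
def agglomerative_clusters(dissimilarity, n_clusters):
    """Merge the closest pair of clusters until n_clusters remain.

    dissimilarity: symmetric square matrix (list of lists) of pairwise values.
    Average linkage is an assumption made for this sketch.
    """
    clusters = [[i] for i in range(len(dissimilarity))]

    def average_linkage(a, b):
        # Mean dissimilarity over all cross-cluster pairs
        return sum(dissimilarity[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        pairs = ((p, q) for p in range(len(clusters))
                 for q in range(p + 1, len(clusters)))
        p, q = min(pairs, key=lambda pq: average_linkage(clusters[pq[0]],
                                                         clusters[pq[1]]))
        merged = clusters[p] + clusters[q]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (p, q)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Hypothetical dissimilarities between five structures (two obvious groups)
D = [
    [0.00, 0.10, 0.20, 0.90, 0.80],
    [0.10, 0.00, 0.15, 0.85, 0.90],
    [0.20, 0.15, 0.00, 0.80, 0.95],
    [0.90, 0.85, 0.80, 0.00, 0.10],
    [0.80, 0.90, 0.95, 0.10, 0.00],
]
groups = agglomerative_clusters(D, 2)
print(groups)
```

In an evaluation setting like the one described, the resulting groups would then be compared against a reference partition such as the curated phyla.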
Affiliation(s)
- Michela Quadrini
  - School of Sciences and Technology, University of Camerino, Via Madonna delle Carceri 7, 62032, Camerino, MC, Italy
- Luca Tesei
  - School of Sciences and Technology, University of Camerino, Via Madonna delle Carceri 7, 62032, Camerino, MC, Italy
- Emanuela Merelli
  - School of Sciences and Technology, University of Camerino, Via Madonna delle Carceri 7, 62032, Camerino, MC, Italy
2
Leeder WM, Geyer FK, Göringer HU. Fuzzy RNA recognition by the Trypanosoma brucei editosome. Nucleic Acids Res 2022; 50:5818-5833. [PMID: 35580050 PMCID: PMC9178004 DOI: 10.1093/nar/gkac357] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/18/2022] [Revised: 04/20/2022] [Accepted: 04/26/2022] [Indexed: 11/30/2022]
Abstract
The assembly of high molecular mass ribonucleoprotein complexes typically relies on the binary interaction of defined RNA sequences or precisely folded RNA motifs with dedicated RNA-binding domains on the protein side. Here we describe a new molecular recognition principle of RNA molecules by a high molecular mass protein complex. By chemically probing the solvent accessibility of mitochondrial pre-mRNAs when bound to the Trypanosoma brucei editosome, we identified multiple similar but non-identical RNA motifs as editosome contact sites. However, by treating the different motifs as mathematical graph objects we demonstrate that they fit a consensus 2D-graph consisting of 4 vertices (V) and 3 edges (E) with a Laplacian eigenvalue of 0.5477 (λ2). We establish that synthetic 4V(3E)-RNAs are sufficient to compete for the editosomal pre-mRNA binding site and that they inhibit RNA editing in vitro. Furthermore, we demonstrate that only two topological indices are necessary to predict the binding of any RNA motif to the editosome with a high level of confidence. Our analysis corroborates that the editosome has adapted to the structural multiplicity of the mitochondrial mRNA folding space by recognizing a fuzzy continuum of RNA folds that fit a consensus graph descriptor.
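The second-smallest Laplacian eigenvalue (λ2) mentioned in this abstract can be computed for any small graph as sketched below. The sketch uses the plain combinatorial Laplacian L = D − A on the two possible trees with 4 vertices and 3 edges; the function name and vertex numbering are illustrative assumptions. It does not attempt to reproduce the reported value of 0.5477, which may depend on graph conventions or weightings that the abstract does not state.

```python
import numpy as np


def laplacian_lambda2(n_vertices, edges):
    """Return the second-smallest eigenvalue of the combinatorial Laplacian."""
    laplacian = np.zeros((n_vertices, n_vertices))
    for u, v in edges:
        laplacian[u, u] += 1  # degree contribution
        laplacian[v, v] += 1
        laplacian[u, v] -= 1  # adjacency contribution
        laplacian[v, u] -= 1
    eigenvalues = np.linalg.eigvalsh(laplacian)  # sorted ascending
    return float(eigenvalues[1])


# The two trees on 4 vertices: a linear path and a three-armed star
path_lambda2 = laplacian_lambda2(4, [(0, 1), (1, 2), (2, 3)])
star_lambda2 = laplacian_lambda2(4, [(0, 1), (0, 2), (0, 3)])
print(path_lambda2, star_lambda2)
```

A more compact (star-like) topology gives a larger λ2 than an elongated (path-like) one, which is why the value is useful as a coarse shape descriptor.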
Affiliation(s)
- Felix Klaus Geyer
  - Molecular Genetics, Technical University Darmstadt, 64287 Darmstadt, Germany
3
Zakh R, Churkin A, Totzeck F, Parr M, Tuller T, Etzion O, Dahari H, Roggendorf M, Frishman D, Barash D. A Mathematical Analysis of HDV Genotypes: From Molecules to Cells. MATHEMATICS (BASEL, SWITZERLAND) 2021; 9:2063. [PMID: 34540628 PMCID: PMC8445514 DOI: 10.3390/math9172063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Hepatitis D virus (HDV) is classified according to eight genotypes. The various genotypes are included in the HDVdb database, where each HDV sequence is specified by its genotype. In this contribution, a mathematical analysis is performed on RNA sequences in HDVdb. The RNA folding predicted structures of the Genbank HDV genome sequences in HDVdb are classified according to their coarse-grain tree-graph representation. The analysis allows discarding in a simple and efficient way the vast majority of the sequences that exhibit a rod-like structure, which is important for the virus replication, to attempt to discover other biological functions by structure consideration. After the filtering, there remain only a small number of sequences that can be checked for their additional stem-loops besides the main one that is known to be responsible for virus replication. It is found that a few sequences contain an additional stem-loop that is responsible for RNA editing or other possible functions. These few sequences are grouped into two main classes, one that is well-known experimentally belonging to genotype 3 for patients from South America associated with RNA editing, and the other that is not known at present belonging to genotype 7 for patients from Cameroon. The possibility that another function besides virus replication reminiscent of the editing mechanism in HDV genotype 3 exists in HDV genotype 7 has not been explored before and is predicted by eigenvalue analysis. Finally, when comparing native and shuffled sequences, it is shown that HDV sequences belonging to all genotypes are accentuated in their mutational robustness and thermodynamic stability as compared to other viruses that were subjected to such an analysis.
Affiliation(s)
- Rami Zakh
  - Department of Computer Science, Ben-Gurion University, Beer-Sheva 8410501, Israel
- Alexander Churkin
  - Department of Software Engineering, Sami Shamoon College of Engineering, Beer-Sheva 8410501, Israel
- Franziska Totzeck
  - Department of Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universität München, Maximus-von-Imhof-Forum 3, 85354 Freising, Germany
- Marina Parr
  - Department of Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universität München, Maximus-von-Imhof-Forum 3, 85354 Freising, Germany
- Tamir Tuller
  - Department of Biomedical Engineering, Tel-Aviv University, Tel-Aviv 6997801, Israel
- Ohad Etzion
  - Soroka University Medical Center, Ben-Gurion University, Beer-Sheva 8410501, Israel
- Harel Dahari
  - Stritch School of Medicine, Loyola University Chicago, Maywood, IL 60153, USA
- Michael Roggendorf
  - Institute of Virology, Technische Universität München, 81675 Munich, Germany
- Dmitrij Frishman
  - Department of Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universität München, Maximus-von-Imhof-Forum 3, 85354 Freising, Germany
- Danny Barash
  - Department of Computer Science, Ben-Gurion University, Beer-Sheva 8410501, Israel
4
Abstract
Novel RNA motif design is of great practical importance for technology and medicine. Increasingly, computational design plays an important role in such efforts. Our coarse-grained RAG (RNA-As-Graphs) framework offers strategies for enumerating the universe of RNA 2D folds, selecting "RNA-like" candidates for design, and determining sequences that fold onto these candidates. In RAG, RNA secondary structures are represented as tree or dual graphs. Graphs with known RNA structures are called "existing", and the others are labeled "hypothetical". By using simplified features for RNA graphs, we have clustered the hypothetical graphs into "RNA-like" and "non-RNA-like" groups and proposed RNA-like graphs as candidates for design. Here, we propose a new way of designing graph features by using Fiedler vectors. The new features reflect graph shapes better, and they lead to a more clustered organization of existing graphs. We show significant increases in K-means clustering accuracy by using the new features (e.g., up to 95% and 98% accuracy for tree and dual graphs, respectively). In addition, we propose a scoring model for top graph candidate selection. This scoring model allows users to set a threshold for candidates, and it incorporates weighing of existing graphs based on their corresponding number of known RNAs. We include a list of top scored RNA-like candidates, which we hope will stimulate future novel RNA design.
Affiliation(s)
- Qiyao Zhu
  - Courant Institute of Mathematical Sciences, New York University, New York, New York 10012, United States
- Tamar Schlick
  - Courant Institute of Mathematical Sciences, New York University, New York, New York 10012, United States
  - Department of Chemistry, New York University, New York, New York 10003, United States
  - NYU-ECNU Center for Computational Chemistry, NYU Shanghai, Shanghai 200062, P. R. China
5
Kimchi O, Cragnolini T, Brenner MP, Colwell LJ. A Polymer Physics Framework for the Entropy of Arbitrary Pseudoknots. Biophys J 2019; 117:520-532. [PMID: 31353036 PMCID: PMC6697467 DOI: 10.1016/j.bpj.2019.06.037] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 11/08/2018] [Revised: 06/21/2019] [Accepted: 06/27/2019] [Indexed: 11/18/2022]
Abstract
The accurate prediction of RNA secondary structure from primary sequence has had enormous impact on research over the past 40 years. Although many algorithms are available to make these predictions, the inclusion of non-nested loops, termed pseudoknots, still poses challenges arising from two main factors: 1) no physical model exists to estimate the loop entropies of complex intramolecular pseudoknots, and 2) their NP-complete enumeration has impeded their study. Here, we address both challenges. First, we develop a polymer physics model that can address arbitrarily complex pseudoknots using only two parameters corresponding to concrete physical quantities, over an order of magnitude fewer than the sparsest state-of-the-art phenomenological methods. Second, by coupling this model to exhaustive enumeration of the set of possible structures, we compute the entire free energy landscape of secondary structures resulting from a primary RNA sequence. We demonstrate that for RNA structures of ∼80 nucleotides, with minimal heuristics, the complete enumeration of possible secondary structures can be accomplished quickly despite the NP-complete nature of the problem. We further show that despite our loop entropy model's parametric sparsity, it performs better than or on par with previously published methods in predicting both pseudoknotted and non-pseudoknotted structures on a benchmark data set of RNA structures of ≤80 nucleotides. We suggest ways in which the accuracy of the model can be further improved.
Affiliation(s)
- Ofer Kimchi
  - Harvard Graduate Program in Biophysics, Harvard University, Cambridge, Massachusetts
- Tristan Cragnolini
  - Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
- Michael P Brenner
  - School of Engineering and Applied Sciences, Cambridge, Massachusetts; Kavli Institute for Bionano Science and Technology, Harvard University, Cambridge, Massachusetts
- Lucy J Colwell
  - Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
6
Bayrak CS, Kim N, Schlick T. Using sequence signatures and kink-turn motifs in knowledge-based statistical potentials for RNA structure prediction. Nucleic Acids Res 2017; 45:5414-5422. [PMID: 28158755 PMCID: PMC5435971 DOI: 10.1093/nar/gkx045] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 11/15/2016] [Accepted: 01/22/2017] [Indexed: 12/15/2022]
Abstract
Kink turns are widely occurring motifs in RNA, located in internal loops and associated with many biological functions including translation, regulation and splicing. The associated sequence pattern, a 3-nt bulge and G-A, A-G base-pairs, generates an angle of ∼50° along the helical axis due to A-minor interactions. The conserved sequence and distinct secondary structures of kink-turns (k-turn) suggest computational folding rules to predict k-turn-like topologies from sequence. Here, we annotate observed k-turn motifs within a non-redundant RNA dataset based on sequence signatures and geometrical features, analyze bending and torsion angles, and determine distinct knowledge-based potentials with and without k-turn motifs. We apply these scoring potentials to our RAGTOP (RNA-As-Graph-Topologies) graph sampling protocol to construct and sample coarse-grained graph representations of RNAs from a given secondary structure. We present graph-sampling results for 35 RNAs, including 12 k-turn and 23 non k-turn internal loops, and compare the results to solved structures and to RAGTOP results without special k-turn potentials. Significant improvements are observed with the updated scoring potentials compared to the k-turn-free potentials. Because k-turns represent a classic example of sequence/structure motif, our study suggests that other such motifs with sequence signatures and unique geometrical features can similarly be utilized for RNA structure prediction and design.
Affiliation(s)
- Cigdem Sevim Bayrak
  - Department of Chemistry and Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York, NY 10012, USA
- Namhee Kim
  - Department of Chemistry and Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York, NY 10012, USA
- Tamar Schlick
  - Department of Chemistry and Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York, NY 10012, USA
7
Accurate Classification of RNA Structures Using Topological Fingerprints. PLoS One 2016; 11:e0164726. [PMID: 27755571 PMCID: PMC5068708 DOI: 10.1371/journal.pone.0164726] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 04/01/2016] [Accepted: 09/29/2016] [Indexed: 12/26/2022]
Abstract
While RNAs are well known to possess complex structures, functionally similar RNAs often have little sequence similarity. While the exact size and spacing of base-paired regions vary, functionally similar RNAs have pronounced similarity in the arrangement, or topology, of base-paired stems. Furthermore, predicted RNA structures often lack pseudoknots (a crucial aspect of biological activity), and are only partially correct, or incomplete. A topological approach addresses all of these difficulties. In this work we describe each RNA structure as a graph that can be converted to a topological spectrum (RNA fingerprint). The set of subgraphs in an RNA structure, its RNA fingerprint, can be compared with the fingerprints of other RNA structures to identify and correctly classify functionally related RNAs. Topologically similar RNAs can be identified even when a large fraction, up to 30%, of the stems are omitted, indicating that highly accurate structures are not necessary. We investigate the performance of the RNA fingerprint approach on a set of eight highly curated RNA families, with diverse sizes and functions, containing pseudoknots, and with little sequence similarity, an especially difficult test set. In spite of the difficult test set, the RNA fingerprint approach is very successful (ROC AUC > 0.95). Because it includes pseudoknots, the RNA fingerprint approach covers a wider range of possible structures than methods based only on secondary structure, and its tolerance for incomplete structures suggests that it can be applied even to predicted structures. Source code is freely available at https://github.rcac.purdue.edu/mgribsko/XIOS_RNA_fingerprint.
8
Baba N, Elmetwaly S, Kim N, Schlick T. Predicting Large RNA-Like Topologies by a Knowledge-Based Clustering Approach. J Mol Biol 2015; 428:811-821. [PMID: 26478223 DOI: 10.1016/j.jmb.2015.10.009] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 09/11/2015] [Accepted: 10/06/2015] [Indexed: 11/19/2022]
Abstract
An analysis and expansion of our resource for classifying, predicting, and designing RNA structures, RAG (RNA-As-Graphs), is presented, with the goal of understanding features of RNA-like and non-RNA-like motifs and exploiting this information for RNA design. RAG was first reported in 2004 for cataloging RNA secondary structure motifs using graph representations. In 2011, the RAG resource was updated with the increased availability of RNA structures and was improved by utilities for analyzing RNA structures, including substructuring and search tools. We also classified RNA structures as graphs up to 10 vertices (~200 nucleotides) into three classes: existing, RNA-like, and non-RNA-like using clustering approaches. Here, we focus on the tree graphs and evaluate the RNAs newly found since 2011, which also support our refined predictions of RNA-like motifs. We expand the RAG resource for large tree graphs up to 13 vertices (~260 nucleotides), thereby cataloging more than 10 times as many secondary structures. We apply clustering algorithms based on features of RNA secondary structures translated from known tertiary structures to suggest which hypothetical large RNA motifs can be considered "RNA-like". The results by the PAM (Partitioning Around Medoids) approach, in particular, reveal good accuracy, with small error for the largest cases. The RAG update here up to 13 vertices offers a useful graph-based tool for exploring RNA motifs and suggesting large RNA motifs for design.
Affiliation(s)
- Naoto Baba
  - Department of Chemistry and Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York, NY 10012, USA; Department of Chemistry, Nagoya University, Furo-cho, Chikusa-ku, Nagoya, Aichi 464-8601, Japan
- Shereef Elmetwaly
  - Department of Chemistry and Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York, NY 10012, USA
- Namhee Kim
  - Department of Chemistry and Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York, NY 10012, USA
- Tamar Schlick
  - Department of Chemistry and Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York, NY 10012, USA; NYU-ECNU Center for Computational Chemistry at NYU Shanghai, 3663 Zhongshan Road North, Shanghai, 200062, China
9
Laing C, Jung S, Kim N, Elmetwaly S, Zahran M, Schlick T. Predicting helical topologies in RNA junctions as tree graphs. PLoS One 2013; 8:e71947. [PMID: 23991010 PMCID: PMC3753280 DOI: 10.1371/journal.pone.0071947] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Received: 05/01/2013] [Accepted: 07/05/2013] [Indexed: 01/11/2023]
Abstract
RNA molecules are important cellular components involved in many fundamental biological processes. Understanding the mechanisms behind their functions requires knowledge of their tertiary structures. Though computational RNA folding approaches exist, they often require manual manipulation and expert intuition; predicting global long-range tertiary contacts remains challenging. Here we develop a computational approach and associated program module (RNAJAG) to predict helical arrangements/topologies in RNA junctions. Our method has two components: junction topology prediction and graph modeling. First, junction topologies are determined by a data mining approach from a given secondary structure of the target RNAs; second, the predicted topology is used to construct a tree graph consistent with geometric preferences analyzed from solved RNAs. The predicted graphs, which model the helical arrangements of RNA junctions for a large set of 200 junctions using a cross validation procedure, yield fairly good representations compared to the helical configurations in native RNAs, and can be further used to develop all-atom models as we show for two examples. Because junctions are among the most complex structural elements in RNA, this work advances folding structure prediction methods of large RNAs. The RNAJAG module is available to academic users upon request.
Affiliation(s)
- Christian Laing
  - Department of Biology, Wilkes University, Wilkes-Barre, Pennsylvania, United States of America
  - Department of Mathematics and Computer Science, Wilkes University, Wilkes-Barre, Pennsylvania, United States of America
- Segun Jung
  - Department of Chemistry, New York University, New York, United States of America
- Namhee Kim
  - Department of Chemistry, New York University, New York, United States of America
- Shereef Elmetwaly
  - Department of Chemistry, New York University, New York, United States of America
- Mai Zahran
  - Department of Chemistry, New York University, New York, United States of America
- Tamar Schlick
  - Department of Chemistry, New York University, New York, United States of America
  - Courant Institute of Mathematical Sciences, New York University, New York, United States of America
10
Churkin A, Gabdank I, Barash D. On topological indices for small RNA graphs. Comput Biol Chem 2012; 41:35-40. [PMID: 23147564 DOI: 10.1016/j.compbiolchem.2012.10.004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 03/31/2012] [Revised: 10/11/2012] [Accepted: 10/12/2012] [Indexed: 11/29/2022]
Abstract
The secondary structure of RNAs can be represented by graphs at various resolutions. While it was shown that RNA secondary structures can be represented by coarse grain tree-graphs and that meaningful topological indices can be used to distinguish between various structures, small RNAs need to be represented by full graphs. No meaningful topological index has yet been suggested for the analysis of this type of RNA graph. Recalling that the second eigenvalue of the Laplacian matrix can be used to track topological changes in the case of coarse grain tree-graphs, it is plausible to assume that a topological index such as the Wiener index, which represents all Laplacian eigenvalues, may provide a similar guide for full graphs. However, the Wiener index was originally defined for acyclic graphs. Nevertheless, similarly to cyclic chemical graphs, small RNA graphs can be analyzed using elementary cuts, which enables the calculation of topological indices for small RNAs in an intuitive way. We show how to calculate a structural descriptor that is suitable for cyclic graphs, the Szeged index, for small RNA graphs by elementary cuts. We discuss potential uses of such a procedure that considers all eigenvalues of the associated Laplacian matrices to quantify the topology of small RNA graphs.
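The two indices named in this abstract can be illustrated with a short sketch. The Wiener index is the sum of shortest-path distances over all vertex pairs; the Szeged index sums, over every edge (u, v), the product of the number of vertices strictly closer to u and the number strictly closer to v. Note that the code below computes both directly from breadth-first-search distances rather than by the elementary-cut procedure the paper describes, and that the helper names and example graphs are assumptions made for illustration. It assumes a connected graph.

```python
from collections import deque
from itertools import combinations


def build_adjacency(edges):
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def all_distances(adjacency):
    """Shortest-path lengths between all vertex pairs via repeated BFS."""
    result = {}
    for source in adjacency:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        result[source] = dist
    return result


def wiener_index(edges):
    adjacency = build_adjacency(edges)
    dist = all_distances(adjacency)
    return sum(dist[u][v] for u, v in combinations(sorted(adjacency), 2))


def szeged_index(edges):
    adjacency = build_adjacency(edges)
    dist = all_distances(adjacency)
    total = 0
    for u, v in edges:
        closer_to_u = sum(1 for w in adjacency if dist[u][w] < dist[v][w])
        closer_to_v = sum(1 for w in adjacency if dist[v][w] < dist[u][w])
        total += closer_to_u * closer_to_v
    return total


path = [(0, 1), (1, 2), (2, 3)]           # acyclic: the two indices coincide
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]  # cyclic: the two indices differ
print(wiener_index(path), szeged_index(path))
print(wiener_index(cycle), szeged_index(cycle))
```

On trees the two indices agree, which is why the Szeged index is regarded as an extension of the Wiener index to graphs with cycles.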
Affiliation(s)
- Alexander Churkin
  - Department of Computer Science, Ben-Gurion University, 84105 Beer-Sheva, Israel
11
Koessler DR, Knisley DJ, Knisley J, Haynes T. A predictive model for secondary RNA structure using graph theory and a neural network. BMC Bioinformatics 2010; 11 Suppl 6:S21. [PMID: 20946605 PMCID: PMC3026369 DOI: 10.1186/1471-2105-11-s6-s21] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.8] [Indexed: 11/10/2022]
Abstract
Background Determining the secondary structure of RNA from the primary structure is a challenging computational problem. A number of algorithms have been developed to predict the secondary structure from the primary structure. It is agreed that there is still room for improvement in each of these approaches. In this work we build a predictive model for secondary RNA structure using a graph-theoretic tree representation of secondary RNA structure. We model the bonding of two RNA secondary structures to form a larger secondary structure with a graph operation we call merge. We consider all combinatorial possibilities using all possible tree inputs, both those that are RNA-like in structure and those that are not. The resulting data from each tree merge operation is represented by a vector. We use these vectors as input values for a neural network and train the network to recognize a tree as RNA-like or not, based on the merge data vector. The network estimates the probability of a tree being RNA-like. Results The network correctly assigned a high probability of RNA-likeness to trees previously identified as RNA-like and a low probability of RNA-likeness to those classified as not RNA-like. We then used the neural network to predict the RNA-likeness of the unclassified trees. Conclusions There are a number of secondary RNA structure prediction algorithms available online. These programs are based on finding the secondary structure with the lowest total free energy. In this work, we create a predictive tool for secondary RNA structures using graph-theoretic values as input for a neural network. The use of a graph operation to theoretically describe the bonding of secondary RNA is novel and is an entirely different approach to the prediction of secondary RNA structures. Our method correctly predicted trees to be RNA-like or not RNA-like for all known cases. In addition, our results convey a measure of likelihood that a tree is RNA-like or not RNA-like. 
Given that the majority of secondary RNA folding algorithms return more than one possible outcome, our method provides a means of determining the best or most likely structures among all of the possible outcomes.
Affiliation(s)
- Denise R Koessler
  - Department of Mathematics and Statistics, East Tennessee State University, Johnson City, TN 37614, USA
12
Laing C, Schlick T. Computational approaches to 3D modeling of RNA. JOURNAL OF PHYSICS. CONDENSED MATTER : AN INSTITUTE OF PHYSICS JOURNAL 2010; 22:283101. [PMID: 21399271 PMCID: PMC6286080 DOI: 10.1088/0953-8984/22/28/283101] [Citation(s) in RCA: 75] [Impact Index Per Article: 5.0] [Indexed: 05/30/2023]
Abstract
Many exciting discoveries have recently revealed the versatility of RNA and its importance in a variety of functions within the cell. Since the structural features of RNA are of major importance to their biological function, there is much interest in predicting RNA structure, either in free form or in interaction with various ligands, including proteins, metabolites and other molecules. In recent years, an increasing number of researchers have developed novel RNA algorithms for predicting RNA secondary and tertiary structures. In this review, we describe current experimental and computational advances and discuss recent ideas that are transforming the traditional view of RNA folding. To evaluate the performance of the most recent RNA 3D folding algorithms, we provide a comparative study in order to test the performance of available 3D structure prediction algorithms for an RNA data set of 43 structures of various lengths and motifs. We find that the algorithms vary widely in terms of prediction quality across different RNA lengths and topologies; most predictions have very large root mean square deviations from the experimental structure. We conclude by outlining some suggestions for future RNA folding research.
Affiliation(s)
- Christian Laing
  - Department of Chemistry and Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York, NY 10012, USA
13
Ortega-Broche SE, Marrero-Ponce Y, Díaz YE, Torrens F, Pérez-Giménez F. tomocomd-camps and protein bilinear indices - novel bio-macromolecular descriptors for protein research: I. Predicting protein stability effects of a complete set of alanine substitutions in the Arc repressor. FEBS J 2010; 277:3118-46. [DOI: 10.1111/j.1742-4658.2010.07711.x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
14
15
Nucleotide's bilinear indices: novel bio-macromolecular descriptors for bioinformatics studies of nucleic acids. I. Prediction of paromomycin's affinity constant with HIV-1 Psi-RNA packaging region. J Theor Biol 2009; 259:229-41. [PMID: 19272394 DOI: 10.1016/j.jtbi.2009.02.021] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 08/20/2008] [Revised: 02/24/2009] [Accepted: 02/25/2009] [Indexed: 02/03/2023]
Abstract
A new set of nucleotide-based bio-macromolecular descriptors is presented. This novel approach to bio-macromolecular design from a linear algebra point of view is relevant to quantitative structure-activity relationship (QSAR) studies of nucleic acids. These bio-macromolecular indices are based on the calculus of bilinear maps on $\mathbb{R}^n$ [$b_{mk}(\bar{x}_m, \bar{y}_m): \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$] in the canonical basis. Nucleic acid's bilinear indices are calculated from the kth power of the non-stochastic and stochastic nucleotide's graph-theoretic electronic-contact matrices, $M_m^k$ and ${}^{s}M_m^k$, respectively. That is to say, the kth non-stochastic and stochastic nucleic acid's bilinear indices are calculated using $M_m^k$ and ${}^{s}M_m^k$ as matrix operators of bilinear transformations. Moreover, biochemical information is codified by using different pair combinations of nucleotide-base properties as weightings (experimental molar absorption coefficient $\varepsilon_{260}$ at 260 nm and pH = 7.0, first ($\Delta E_1$) and second ($\Delta E_2$) single excitation energies in eV, and first ($f_1$) and second ($f_2$) oscillator strength values (of the first singlet excitation energies) of the nucleotide DNA-RNA bases). As an example of this approach, an interaction study of the antibiotic paromomycin with the packaging region of the HIV-1 Psi-RNA was performed, and several linear models were obtained to predict the interaction strength. The best linear model obtained by using non-stochastic bilinear indices explains about 91% of the variance of the experimental Log K ($R = 0.95$ and $s = 0.08 \times 10^{-4}\,\mathrm{M}^{-1}$), whereas the best stochastic bilinear indices-based equation accounts for 93% of the Log K variance ($R = 0.97$ and $s = 0.07 \times 10^{-4}\,\mathrm{M}^{-1}$). The leave-one-out (LOO) press statistics evidenced the high predictive ability of both models ($q^2 = 0.86$ and $s_{cv} = 0.09 \times 10^{-4}\,\mathrm{M}^{-1}$ for non-stochastic and $q^2 = 0.91$ and $s_{cv} = 0.08 \times 10^{-4}\,\mathrm{M}^{-1}$ for stochastic bilinear indices).
The nucleic acid's bilinear indices-based models compared favorably with other nucleic acid's indices-based approaches reported to date. These models also permit the interpretation of the driving forces of the interaction process. In this sense, the developed equations involve short-reaching ($k \le 3$), middle-reaching ($4 < k < 9$), and far-reaching ($k = 10$ or greater) nucleotide's bilinear indices. This suggests that electronic and topological nucleotide-backbone interactions control the stability profile of paromomycin-RNA complexes. Consequently, the present approach represents a novel and rather promising route for theoretical-biology studies.
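As a rough illustration of the idea in this abstract, a kth bilinear index can be read as the value of the bilinear form x^T M^k y, where M is a contact matrix and x, y are vectors of nucleotide-base property weights. The Python sketch below is only a toy under stated assumptions: the 3 x 3 matrix is a hypothetical chain adjacency matrix (not the electronic-contact matrix of the paper), the weight vectors are made-up numbers, and the "stochastic" variant is assumed here to be a simple row normalization.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(m, k):
    """Return the kth power of a square matrix (k = 0 gives the identity)."""
    n = len(m)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, m)
    return result


def row_stochastic(m):
    """Divide each row by its sum (assumed form of the stochastic matrix).

    Rows that sum to zero are left unchanged.
    """
    out = []
    for row in m:
        total = sum(row)
        out.append([v / total for v in row] if total else list(row))
    return out


def bilinear_index(x, m, y, k):
    """Evaluate the bilinear form x^T (m^k) y."""
    mk = mat_pow(m, k)
    n = len(m)
    return sum(x[i] * mk[i][j] * y[j] for i in range(n) for j in range(n))


# Hypothetical 3-nucleotide chain: neighbours along the backbone are in contact.
M = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
x = [1, 2, 3]  # made-up values of one base property
y = [1, 1, 1]  # made-up values of a second base property

non_stochastic = [bilinear_index(x, M, y, k) for k in range(3)]
stochastic = [bilinear_index(x, row_stochastic(M), y, k) for k in range(3)]
```

Descriptors of this kind, computed for several values of k and several property pairs, would then serve as regressors in a linear QSAR model.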
|
16
|
Bindewald E, Grunewald C, Boyle B, O’Connor M, Shapiro BA. Computational strategies for the automated design of RNA nanoscale structures from building blocks using NanoTiler. J Mol Graph Model 2008; 27:299-308. [PMID: 18838281 PMCID: PMC3744370 DOI: 10.1016/j.jmgm.2008.05.004] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.6] [Received: 04/07/2008] [Accepted: 05/19/2008] [Indexed: 01/24/2023]
Abstract
One approach to designing RNA nanoscale structures is to use known RNA structural motifs such as junctions, kissing loops or bulges and to construct a molecular model by connecting these building blocks with helical struts. We previously developed an algorithm for detecting internal loops, junctions and kissing loops in RNA structures. Here we present algorithms for automating or assisting many of the steps that are involved in creating RNA structures from building blocks: (1) assembling building blocks into nanostructures using either a combinatorial search or constraint satisfaction; (2) optimizing RNA 3D ring structures to improve ring closure; (3) sequence optimisation; (4) creating a unique non-degenerate RNA topology descriptor. This effectively creates a computational pipeline for generating molecular models of RNA nanostructures and more specifically RNA ring structures with optimized sequences from RNA building blocks. We show several examples of how the algorithms can be utilized to generate RNA tecto-shapes.
Affiliation(s)
- Eckart Bindewald
- Basic Research Program, SAIC-Frederick, Inc., NCI-Frederick, Frederick, MD 21702, USA
- Calvin Grunewald
- Center for Cancer Research Nanobiology Program, NCI-Frederick, Frederick, MD 21702, USA
- Brett Boyle
- Center for Cancer Research Nanobiology Program, NCI-Frederick, Frederick, MD 21702, USA
- Mary O’Connor
- Center for Cancer Research Nanobiology Program, NCI-Frederick, Frederick, MD 21702, USA
- Bruce A. Shapiro
- Center for Cancer Research Nanobiology Program, NCI-Frederick, Frederick, MD 21702, USA
|
17
|
Shu W, Bo X, Zheng Z, Wang S. A novel representation of RNA secondary structure based on element-contact graphs. BMC Bioinformatics 2008; 9:188. [PMID: 18402706 PMCID: PMC2373570 DOI: 10.1186/1471-2105-9-188] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 09/04/2007] [Accepted: 04/11/2008] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Depending on their specific structures, noncoding RNAs (ncRNAs) play important roles in many biological processes. Interest in developing new topological indices based on RNA graphs has been revived in recent years, as such indices can be used to compare, identify and classify RNAs. Although the topological indices presented before characterize the main topological features of RNA secondary structures, information on RNA structural details is ignored to some degree. Therefore, it is necessary to identify topological features with low degeneracy based on complete and fine-grained RNA graphical representations. RESULTS In this study, we present a complete and fine-grained scheme for RNA graph representation as a new basis for constructing RNA topological indices. We propose a combination of three vertex-weighted element-contact graphs (ECGs) to describe the RNA element details and their adjacency patterns in RNA secondary structure. Both the stem and loop topologies are encoded completely in the ECGs. The relationship among the three typical topological index families defined by their ECGs and RNA secondary structures was investigated using a dataset of 6,305 ncRNAs. The applicability of topological indices is illustrated by three application case studies. Based on the small dataset used, we find that the topological indices can distinguish true pre-miRNAs from pseudo pre-miRNAs with about 96% accuracy, and can cluster known types of ncRNAs with about 98% accuracy. CONCLUSION The results indicate that the topological indices can characterize the details of RNA structures and may have a potential role in identifying and classifying ncRNAs. Moreover, these indices may lead to a new approach for discovering novel ncRNAs. However, further research is needed to fully resolve the challenging problem of predicting and classifying noncoding RNAs.
Affiliation(s)
- Wenjie Shu
- Beijing Institute of Radiation Medicine, Beijing 100850, China.
|
18
|
Shu W, Bo X, Liu R, Zhao D, Zheng Z, Wang S. RDMAS: a web server for RNA deleterious mutation analysis. BMC Bioinformatics 2006; 7:404. [PMID: 16956394 PMCID: PMC1574353 DOI: 10.1186/1471-2105-7-404] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.5] [Received: 04/19/2006] [Accepted: 09/06/2006] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The diverse functions of ncRNAs critically depend on their structures. Mutations in ncRNAs that disrupt the structures of functional sites are expected to be deleterious. RNA deleterious mutations have attracted wide attention because some of them result in serious disease in cells, while others influence the fitness of microbes. RESULTS The RDMAS web server we describe here is an online tool for evaluating the structural deleteriousness of single-nucleotide mutations in RNA genes. Several structure comparison methods have been integrated; predicted sub-optimal structures can optionally be included to mitigate the uncertainty of secondary structure prediction. With a user-friendly interface, the web application is easy to use. Intuitive illustrations are provided along with the original computational results to facilitate quick analysis. CONCLUSION RDMAS can be used to explore the structural alterations that make mutations pathogenic, and to predict deleterious mutations, which may help to determine functionally critical regions. RDMAS is freely accessible at http://biosrv1.bmi.ac.cn/rdmas.
Affiliation(s)
- Wenjie Shu
- Beijing Institute of Radiation Medicine, Beijing 100850, China
- College of Electro-Mechanic and Automation, National University of Defense Technology, Changsha, Hunan 410073, China
- Xiaochen Bo
- Beijing Institute of Radiation Medicine, Beijing 100850, China
- Rujia Liu
- Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
- Dongsheng Zhao
- Beijing Institute of Health Administration and Medicine Information, Beijing 100850, China
- Zhiqiang Zheng
- College of Electro-Mechanic and Automation, National University of Defense Technology, Changsha, Hunan 410073, China
- Shengqi Wang
- Beijing Institute of Radiation Medicine, Beijing 100850, China
|
19
|
Leontis NB, Lescoute A, Westhof E. The building blocks and motifs of RNA architecture. Curr Opin Struct Biol 2006; 16:279-87. [PMID: 16713707 PMCID: PMC4857889 DOI: 10.1016/j.sbi.2006.05.009] [Citation(s) in RCA: 258] [Impact Index Per Article: 13.6] [Received: 01/23/2006] [Revised: 04/12/2006] [Accepted: 05/10/2006] [Indexed: 10/24/2022]
Abstract
RNA motifs can be defined broadly as recurrent structural elements containing multiple intramolecular RNA-RNA interactions, as observed in atomic-resolution RNA structures. They constitute the modular building blocks of RNA architecture, which is organized hierarchically. Recent work has focused on analyzing RNA backbone conformations to identify, define and search for new instances of recurrent motifs in X-ray structures. One current view asserts that recurrent RNA strand segments with characteristic backbone configurations qualify as independent motifs. Other considerations indicate that, to characterize modular motifs, one must take into account the larger structural context of such strand segments. This follows the biologically relevant motivation, which is to identify RNA structural characteristics that are subject to sequence constraints and that thus relate RNA architectures to sequences.
Affiliation(s)
- Neocles B Leontis
- Department of Chemistry and Center for Biomolecular Sciences, Bowling Green State University, Bowling Green, OH 43402, USA
|
20
|
Haynes T, Knisley D, Seier E, Zou Y. A quantitative analysis of secondary RNA structure using domination based parameters on trees. BMC Bioinformatics 2006; 7:108. [PMID: 16515683 PMCID: PMC1420337 DOI: 10.1186/1471-2105-7-108] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.1] [Received: 10/20/2005] [Accepted: 03/03/2006] [Indexed: 11/30/2022] Open
Abstract
Background It has become increasingly apparent that a comprehensive database of RNA motifs is essential in order to achieve new goals in genomic and proteomic research. Secondary RNA structures have frequently been represented by various modeling methods as graph-theoretic trees. Using graph theory as a modeling tool allows the vast resources of graphical invariants to be utilized to numerically identify secondary RNA motifs. The domination number of a graph is a graphical invariant that is sensitive to even a slight change in the structure of a tree. The invariants selected in this study are variations of the domination number of a graph. These graphical invariants are partitioned into two classes, and we define two parameters based on each of these classes. These parameters are calculated for all small order trees and a statistical analysis of the resulting data is conducted to determine if the values of these parameters can be utilized to identify which trees of orders seven and eight are RNA-like in structure. Results The statistical analysis shows that the domination based parameters correctly distinguish between the trees that represent native structures and those that are not likely candidates to represent RNA. Some of the trees previously identified as candidate structures are found to be "very" RNA like, while others are not, thereby refining the space of structures likely to be found as representing secondary RNA structure. Conclusion Search algorithms are available that mine nucleotide sequence databases. However, the number of motifs identified can be quite large, making a further search for similar motif computationally difficult. Much of the work in the bioinformatics arena is toward the development of better algorithms to address the computational problem. This work, on the other hand, uses mathematical descriptors to more clearly characterize the RNA motifs and thereby reduce the corresponding search space. 
These preliminary findings demonstrate that graph-theoretic quantifiers utilized in fields such as computer network design hold significant promise as an added tool for genomics and proteomics.
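To make the central invariant of this abstract concrete: the domination number of a graph is the size of a smallest vertex set such that every vertex is either in the set or adjacent to a member of it. The Python sketch below computes it by brute force, which is adequate for the small trees (orders seven and eight) discussed above. It covers only the plain domination number, not the variations or the two derived parameters used in the study, and the helper names and example trees are illustrative assumptions.

```python
from itertools import combinations


def tree_from_edges(edges):
    """Build an undirected adjacency map {vertex: set of neighbours}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def domination_number(adj):
    """Return the size of a smallest dominating set, by exhaustive search.

    Candidate sets are tried in order of increasing size, so the first
    set that dominates every vertex is a minimum one.
    """
    vertices = list(adj)
    for size in range(1, len(vertices) + 1):
        for candidate in combinations(vertices, size):
            dominated = set(candidate)
            for v in candidate:
                dominated.update(adj[v])
            if len(dominated) == len(vertices):
                return size
    return 0  # empty graph


# Two hypothetical trees of order seven: a path and a star.
path7 = tree_from_edges([(i, i + 1) for i in range(6)])
star7 = tree_from_edges([(0, i) for i in range(1, 7)])

gamma_path = domination_number(path7)
gamma_star = domination_number(star7)
```

The two trees have the same order but different domination numbers (3 for the path, 1 for the star), which illustrates how the invariant responds to the shape of a tree.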
Affiliation(s)
- Teresa Haynes
- Mathematics and Statistics Department, Box 70663, East Tennessee State University, Johnson City, TN, USA
- Debra Knisley
- Mathematics and Statistics Department, Box 70663, East Tennessee State University, Johnson City, TN, USA
- Edith Seier
- Mathematics and Statistics Department, Box 70663, East Tennessee State University, Johnson City, TN, USA
- Yue Zou
- Department of Biochemistry and Molecular Biology, Quillen College of Medicine, East Tennessee State University, Johnson City, TN, USA
|
21
|
Pasquali S, Gan HH, Schlick T. Modular RNA architecture revealed by computational analysis of existing pseudoknots and ribosomal RNAs. Nucleic Acids Res 2005; 33:1384-98. [PMID: 15745998 PMCID: PMC552955 DOI: 10.1093/nar/gki267] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.2] [Indexed: 11/21/2022] Open
Abstract
Modular architecture is a hallmark of RNA structures, implying structural, and possibly functional, similarity among existing RNAs. To systematically delineate the existence of smaller topologies within larger structures, we develop and apply an efficient RNA secondary structure comparison algorithm using a newly developed two-dimensional RNA graphical representation. Our survey of similarity among 14 pseudoknots and subtopologies within ribosomal RNAs (rRNAs) uncovers eight pairs of structurally related pseudoknots with non-random sequence matches and reveals modular units in rRNAs. Significantly, three structurally related pseudoknot pairs have functional similarities not previously known: one pair involves the 3′ end of brome mosaic virus genomic RNA (PKB134) and the alternative hammerhead ribozyme pseudoknot (PKB173), both of which are replicase templates for viral RNA replication; the second pair involves structural elements for translation initiation and ribosome recruitment found in the viral internal ribosome entry site (PKB223) and the V4 domain of 18S rRNA (PKB205); the third pair involves 18S rRNA (PKB205) and viral tRNA-like pseudoknot (PKB134), which probably recruits ribosomes via structural mimicry and base complementarity. Additionally, we quantify the modularity of 16S and 23S rRNAs by showing that RNA motifs can be constructed from at least 210 building blocks. Interestingly, we find that the 5S rRNA and two tree modules within 16S and 23S rRNAs have similar topologies and tertiary shapes. These modules can be applied to design novel RNA motifs via build-up-like procedures for constructing sequences and folds.
Affiliation(s)
- Hin Hark Gan
- Department of Chemistry, New York University, 251 Mercer Street, New York, NY 10021, USA
- Tamar Schlick
- Department of Chemistry, New York University, 251 Mercer Street, New York, NY 10021, USA
- Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York, NY 10021, USA
- To whom correspondence should be addressed: Tel: +1 212 998 3116; Fax: +1 212 995 4152;
|
22
|
Kim N, Shiffeldrim N, Gan HH, Schlick T. Candidates for novel RNA topologies. J Mol Biol 2004; 341:1129-44. [PMID: 15321711 DOI: 10.1016/j.jmb.2004.06.054] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.7] [Received: 04/20/2004] [Revised: 06/10/2004] [Accepted: 06/21/2004] [Indexed: 10/26/2022]
Abstract
Because the functional repertoire of RNA molecules, like proteins, is closely linked to the diversity of their shapes, uncovering RNA's structural repertoire is vital for identifying novel RNAs, especially in genomic sequences. To help expand the limited number of known RNA families, we use graphical representation and clustering analysis of RNA secondary structures to predict novel RNA topologies and their abundance as a function of size. Representing the essential topological properties of RNA secondary structures as graphs enables enumeration, generation, and prediction of novel RNA motifs. We apply a probabilistic graph-growing method to construct the RNA structure space encompassing the topologies of existing and hypothetical RNAs and cluster all RNA topologies into two groups using topological descriptors and a standard clustering algorithm. Significantly, we find that nearly all existing RNAs fall into one group, which we refer to as "RNA-like"; we consider the other group "non-RNA-like". Our method predicts many candidates for novel RNA secondary topologies, some of which are remarkably similar to existing structures; interestingly, the centroid of the RNA-like group is the tmRNA fold, a pseudoknot having both tRNA-like and mRNA-like functions. Additionally, our approach allows estimation of the relative abundance of pseudoknot and other (e.g. tree) motifs using the "edge-cut" property of RNA graphs. This analysis suggests that pseudoknots dominate the RNA structure universe, representing more than 90% when the sequence length exceeds 120 nt; the predicted trend for <100 nt agrees with data for existing RNAs. Together with our predictions for novel "RNA-like" topologies, our analysis can help direct the design of functional RNAs and identification of novel RNA folds in genomes through an efficient topology-directed search, which grows much more slowly in complexity with RNA size compared to the traditional sequence-based search.
Affiliation(s)
- Namhee Kim
- Department of Chemistry, New York University, 100 Washington Square East, Room 1001, New York, NY 10003, USA
|
23
|
Gan HH, Pasquali S, Schlick T. Exploring the repertoire of RNA secondary motifs using graph theory; implications for RNA design. Nucleic Acids Res 2003; 31:2926-43. [PMID: 12771219 PMCID: PMC156709 DOI: 10.1093/nar/gkg365] [Citation(s) in RCA: 90] [Impact Index Per Article: 4.1] [Indexed: 11/14/2022] Open
Abstract
Understanding the structural repertoire of RNA is crucial for RNA genomics research. Yet current methods for finding novel RNAs are limited to small or known RNA families. To expand known RNA structural motifs, we develop a two-dimensional graphical representation approach for describing and estimating the size of RNA's secondary structural repertoire, including naturally occurring and other possible RNA motifs. We employ tree graphs to describe RNA tree motifs and more general (dual) graphs to describe both RNA tree and pseudoknot motifs. Our estimates of RNA's structural space are vastly smaller than the nucleotide sequence space, suggesting a new avenue for finding novel RNAs. Specifically our survey shows that known RNA trees and pseudoknots represent only a small subset of all possible motifs, implying that some of the 'missing' motifs may represent novel RNAs. To help pinpoint RNA-like motifs, we show that the motifs of existing functional RNAs are clustered in a narrow range of topological characteristics. We also illustrate the applications of our approach to the design of novel RNAs and automated comparison of RNA structures; we report several occurrences of RNA motifs within larger RNAs. Thus, our graph theory approach to RNA structures has implications for RNA genomics, structure analysis and design.
Affiliation(s)
- Hin Hark Gan
- Department of Chemistry, New York University, 251 Mercer Street, New York, 10012 NY, USA
|
24
|
Billoud B, Guerrucci MA, Masselot M, Deutsch JS. Cirripede phylogeny using a novel approach: molecular morphometrics. Mol Biol Evol 2000; 17:1435-45. [PMID: 11018151 DOI: 10.1093/oxfordjournals.molbev.a026244] [Citation(s) in RCA: 64] [Impact Index Per Article: 2.6] [Indexed: 11/14/2022] Open
Abstract
We present a new method using nucleic acid secondary structure to assess phylogenetic relationships among species. In this method, which we term "molecular morphometrics," the measurable structural parameters of the molecules (geometrical features, bond energies, base composition, etc.) are used as specific characters to construct a phylogenetic tree. This method relies both on traditional morphological comparison and on molecular sequence comparison. Applied to the phylogenetic analysis of Cirripedia, molecular morphometrics supports the most recent morphological analyses arguing for the monophyly of Cirripedia sensu stricto (Thoracica + Rhizocephala + Acrothoracica). As a proof, a classical multiple alignment was also performed, either using or not using the structural information to realign the sequence segments considered in the molecular morphometrics analysis. These methods yielded the same tree topology as the direct use of structural characters as a phylogenetic signal. By taking into account the secondary structure of nucleic acids, the new method allows investigators to use the regions in which multiple alignments are barely reliable because of a large number of insertions and deletions. It thus appears to be complementary to classical primary sequence analysis in phylogenetic studies.
Affiliation(s)
- B Billoud
- Atelier de BioInformatique, Service Commun de Bio-Systématique, Université Pierre et Marie Curie, Paris, France.
|
25
|
Clark DE, Westhead DR. Evolutionary algorithms in computer-aided molecular design. J Comput Aided Mol Des 1996; 10:337-58. [PMID: 8877705 DOI: 10.1007/bf00124503] [Citation(s) in RCA: 75] [Impact Index Per Article: 2.6] [Indexed: 02/02/2023]
Abstract
In recent years, search and optimisation algorithms inspired by evolutionary processes have been applied with marked success to a wide variety of problems in diverse fields of study. In this review, we survey the growing application of these 'evolutionary algorithms' in one such area: computer-aided molecular design. In the course of the review, we seek to summarise the work to date and to indicate where evolutionary algorithms have met with success and where they have not fared so well. In addition to this, we also attempt to discern some future trends in both the basic research concerning these algorithms and their application to the elucidation, design and modelling of chemical and biochemical structures.
Affiliation(s)
- D E Clark
- Proteus Molecular Design Ltd., Macclesfield, U.K.
|