1
|
DeJaco RF, Roberts MJ, Romsos EL, Vallone PM, Kearsley AJ. Reducing Bias and Quantifying Uncertainty in Fluorescence Produced by PCR. Bull Math Biol 2023; 85:83. [PMID: 37574503 PMCID: PMC10423706 DOI: 10.1007/s11538-023-01182-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2022] [Accepted: 06/20/2023] [Indexed: 08/15/2023]
Abstract
We present a new approach for relating nucleic-acid content to fluorescence in a real-time Polymerase Chain Reaction (PCR) assay. By coupling a two-type branching process for PCR with a fluorescence analog of Beer's Law, the approach reduces bias and quantifies uncertainty in fluorescence. As the two-type branching process distinguishes between complementary strands of DNA, it allows for a stoichiometric description of reactions between fluorescent probes and DNA and can capture the initial conditions encountered in assays targeting RNA. Analysis of the expected copy-number identifies additional dynamics that occur at short times (or, equivalently, low cycle numbers), while investigation of the variance reveals the contributions from liquid volume transfer, imperfect amplification, and strand-specific amplification (i.e., if one strand is synthesized more efficiently than its complement). Linking the branching process to fluorescence by the Beer's Law analog allows for an a priori description of background fluorescence. It also enables uncertainty quantification (UQ) in fluorescence which, in turn, leads to analytical relationships between amplification efficiency (probability) and limit of detection. This work sets the stage for UQ-PCR, where both the input copy-number and its uncertainty are quantified from fluorescence kinetics.
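The role of strand type in the expected copy number can be illustrated with a small mean-value recursion. This is only a sketch under assumed rules, not the model from the paper: the function name, the per-strand probabilities `p_forward` and `p_reverse`, and the update rule itself are hypothetical choices made for illustration.

```python
def expected_strand_counts(forward0, reverse0, p_forward, p_reverse, cycles):
    """Expected numbers of forward and reverse strands after each cycle.

    Assumed rule (illustrative, not taken from the paper): in one cycle every
    reverse strand templates a new forward strand with probability p_forward,
    and every forward strand templates a new reverse strand with probability
    p_reverse. Existing strands persist.
    """
    forward, reverse = float(forward0), float(reverse0)
    history = [(forward, reverse)]
    for _ in range(cycles):
        # Both updates use the counts from the start of the cycle.
        forward, reverse = (forward + p_forward * reverse,
                            reverse + p_reverse * forward)
        history.append((forward, reverse))
    return history


if __name__ == "__main__":
    # Double-stranded start versus a single-stranded start.
    print(expected_strand_counts(1, 1, 0.9, 0.9, 4))
    print(expected_strand_counts(1, 0, 0.9, 0.9, 4))
```

With a single-stranded start (one plausible reading of the RNA-type initial condition), the first cycle can only produce the complement, so the total grows more slowly at low cycle numbers than the familiar geometric rate would suggest.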
Affiliation(s)
- Robert F. DeJaco
  - Applied and Computational Mathematics Division, National Institute of Standards and Technology, 100 Bureau Dr., MS 8910, Gaithersburg, MD 20899-8910 USA
  - Department of Chemistry and Biochemistry, University of Maryland, 8051 Regents Dr., College Park, MD 20742-4454 USA
- Matthew J. Roberts
  - Applied and Computational Mathematics Division, National Institute of Standards and Technology, 100 Bureau Dr., MS 8910, Gaithersburg, MD 20899-8910 USA
  - Cost Analysis and Research Division, Institute for Defense Analyses, 730 E. Glebe Rd., Alexandria, VA 22305-3086 USA
- Erica L. Romsos
  - Biomolecular Measurement Division, National Institute of Standards and Technology, 100 Bureau Dr., MS 8314, Gaithersburg, MD 20899-8314 USA
- Peter M. Vallone
  - Biomolecular Measurement Division, National Institute of Standards and Technology, 100 Bureau Dr., MS 8314, Gaithersburg, MD 20899-8314 USA
- Anthony J. Kearsley
  - Applied and Computational Mathematics Division, National Institute of Standards and Technology, 100 Bureau Dr., MS 8910, Gaithersburg, MD 20899-8910 USA
|
2
|
Nov Y. Learning Context-Dependent DNA Mutation Patterns in Error-Prone Polymerase Chain Reaction. Biochemistry 2023; 62:345-350. [PMID: 36153985 DOI: 10.1021/acs.biochem.2c00292] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/02/2023]
Abstract
We present a novel statistical learning method for studying context-dependent error rates in error-prone polymerase chain reaction (PCR) experiments. We demonstrate the method by applying it to error-prone PCR sequencing data and show how it may be exploited to improve the evolvability of genes in protein engineering.
Affiliation(s)
- Yuval Nov
  - Department of Statistics, University of Haifa, Haifa 3498838, Israel
|
3
|
Smart U, Budowle B, Ambers A, Soares Moura-Neto R, Silva R, Woerner AE. A novel phylogenetic approach for de novo discovery of putative nuclear mitochondrial (pNumt) haplotypes. Forensic Sci Int Genet 2019; 43:102146. [DOI: 10.1016/j.fsigen.2019.102146] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 06/07/2019] [Revised: 08/09/2019] [Accepted: 08/13/2019] [Indexed: 10/26/2022]
|
4
|
Cowell RG. Computation of marginal distributions of peak-heights in electropherograms for analysing single source and mixture STR DNA samples. Forensic Sci Int Genet 2018; 35:164-168. [DOI: 10.1016/j.fsigen.2018.04.007] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 02/02/2018] [Revised: 03/28/2018] [Accepted: 04/21/2018] [Indexed: 10/17/2022]
|
5
|
van Dijk T, Hwang S, Krug J, de Visser JAGM, Zwart MP. Mutation supply and the repeatability of selection for antibiotic resistance. Phys Biol 2017; 14:055005. [PMID: 28699625 DOI: 10.1088/1478-3975/aa7f36] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Indexed: 12/29/2022]
Abstract
Whether evolution can be predicted is a key question in evolutionary biology. Here we set out to better understand the repeatability of evolution, which is a necessary condition for predictability. We explored experimentally the effect of mutation supply and the strength of selective pressure on the repeatability of selection from standing genetic variation. Different sizes of mutant libraries of antibiotic resistance gene TEM-1 β-lactamase in Escherichia coli, generated by error-prone PCR, were subjected to different antibiotic concentrations. We determined whether populations went extinct or survived, and sequenced the TEM gene of the surviving populations. The distribution of mutations per allele in our mutant libraries followed a Poisson distribution. Extinction patterns could be explained by a simple stochastic model that assumed the sampling of beneficial mutations was key for survival. In most surviving populations, alleles containing at least one known large-effect beneficial mutation were present. These genotype data also support a model which only invokes sampling effects to describe the occurrence of alleles containing large-effect driver mutations. Hence, evolution is largely predictable given cursory knowledge of mutational fitness effects, the mutation rate and population size. There were no clear trends in the repeatability of selected mutants when we considered all mutations present. However, when only known large-effect mutations were considered, the outcome of selection is less repeatable for large libraries, in contrast to expectations. We show experimentally that alleles carrying multiple mutations selected from large libraries confer higher resistance levels relative to alleles with only a known large-effect mutation, suggesting that the scarcity of high-resistance alleles carrying multiple mutations may contribute to the decrease in repeatability at large library sizes.
Affiliation(s)
- Thomas van Dijk
  - Laboratory of Genetics, Wageningen University, Wageningen, Netherlands. These authors contributed equally
|
6
|
Shagin DA, Shagina IA, Zaretsky AR, Barsova EV, Kelmanson IV, Lukyanov S, Chudakov DM, Shugay M. A high-throughput assay for quantitative measurement of PCR errors. Sci Rep 2017; 7:2718. [PMID: 28578414 PMCID: PMC5457411 DOI: 10.1038/s41598-017-02727-8] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 01/24/2017] [Accepted: 04/18/2017] [Indexed: 01/01/2023]
Abstract
The accuracy with which DNA polymerase can replicate a template DNA sequence is an extremely important property that can vary by an order of magnitude from one enzyme to another. The rate of nucleotide misincorporation is shaped by multiple factors, including PCR conditions and proofreading capabilities, and proper assessment of polymerase error rate is essential for a wide range of sensitive PCR-based assays. In this paper, we describe a method for studying polymerase errors with exceptional resolution, which combines unique molecular identifier tagging and high-throughput sequencing. Our protocol is less laborious than commonly-used methods, and is also scalable, robust and accurate. In a series of nine PCR assays, we have measured a range of polymerase accuracies that is in line with previous observations. However, we were also able to comprehensively describe individual errors introduced by each polymerase after either 20 PCR cycles or a linear amplification, revealing specific substitution preferences and the diversity of PCR error frequency profiles. We also demonstrate that the detected high-frequency PCR errors are highly recurrent and that the position in the template sequence and polymerase-specific substitution preferences are among the major factors influencing the observed PCR error rate.
Affiliation(s)
- Dmitriy A Shagin
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, Moscow, Russia
  - Pirogov Russian National Research Medical University, Moscow, Russia
  - Evrogen JSC, Moscow, Russia
- Irina A Shagina
  - Pirogov Russian National Research Medical University, Moscow, Russia
  - Evrogen JSC, Moscow, Russia
- Andrew R Zaretsky
  - Pirogov Russian National Research Medical University, Moscow, Russia
  - Evrogen JSC, Moscow, Russia
- Ekaterina V Barsova
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, Moscow, Russia
  - Evrogen JSC, Moscow, Russia
- Ilya V Kelmanson
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, Moscow, Russia
  - Evrogen JSC, Moscow, Russia
- Sergey Lukyanov
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, Moscow, Russia
  - Pirogov Russian National Research Medical University, Moscow, Russia
- Dmitriy M Chudakov
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, Moscow, Russia
  - Pirogov Russian National Research Medical University, Moscow, Russia
  - Skolkovo Institute of Science and Technology, Moscow, Russia
  - Central European Institute of Technology, Masaryk University, Brno, Czech Republic
- Mikhail Shugay
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, Moscow, Russia
  - Pirogov Russian National Research Medical University, Moscow, Russia
  - Central European Institute of Technology, Masaryk University, Brno, Czech Republic
|
7
|
Abstract
Whole genome amplification is important for multipoint mapping by sperm or oocyte typing and genetic disease diagnosis. Polymerase chain reaction is not suitable for amplifying long DNA sequences. This paper uses the theory of branching processes to study a new technique for amplifying long DNA sequences, designated PEP (primer-extension-preamplification). A mathematical model for PEP is constructed and a closed formula for the expected target yield is obtained. A central limit theorem and a strong law of large numbers for the number of kth generation target sequences are proved.
|
8
|
Abstract
Sun and Waterman model DNA mutations during the PCR reaction by a non-canonical branching process. Mean-field approximated values fit the simulated values surprisingly well. We prove this as a theoretical result, for a wide range of the parameters. Thus, we bound explicitly the biases, in law and in the mean, that the mean-field approximation induces in the random number of mutations of a DNA molecule, as a function of the initial number of molecules, of the number of PCR cycles, of the efficiency rate and of the mutation rate. The range where we prove that the approximation is good contains the observed mutation rates in many actual PCR reactions.
|
10
|
Abstract
We study the harmonic moments of Galton-Watson processes that are possibly inhomogeneous and have positive values. Good estimates of these are needed to compute unbiased estimators for noncanonical branching Markov processes, which occur, for instance, in the modelling of the polymerase chain reaction. By convexity, the ratio of the harmonic mean to the mean is at most 1. We prove that, for every square-integrable branching mechanism, this ratio lies between 1-A/k and 1-A*/k for every initial population of size k>A. The positive constants A and A* are such that A≥A*, are explicit, and depend only on the generation-by-generation branching mechanisms. In particular, we do not use the distribution of the limit of the classical martingale associated with the Galton-Watson process. Thus, emphasis is put on nonasymptotic bounds and on the dependence of the harmonic mean upon the size of the initial population. In the Bernoulli case, which is relevant for the modelling of the polymerase chain reaction, we prove essentially optimal bounds that are valid for every initial population size k≥1. Finally, in the general case and for sufficiently large initial populations, similar techniques yield sharp estimates of the harmonic moments of higher degree.
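The claim that the harmonic-mean-to-mean ratio is at most 1, and approaches 1 as the initial population k grows, can be checked numerically in the Bernoulli case. The sketch below is a Monte Carlo illustration only; the function names, the efficiency value, and the number of runs are assumptions chosen for the example, and it does not reproduce the paper's explicit constants.

```python
import random


def simulate_pcr(k, p, cycles, rng):
    """Bernoulli branching: each molecule is duplicated with probability p per cycle."""
    n = k
    for _ in range(cycles):
        n += sum(1 for _ in range(n) if rng.random() < p)
    return n


def harmonic_to_mean_ratio(k, p, cycles, runs=2000, seed=0):
    """Estimate (harmonic mean) / (arithmetic mean) of the final population size."""
    rng = random.Random(seed)
    sizes = [simulate_pcr(k, p, cycles, rng) for _ in range(runs)]
    mean = sum(sizes) / runs
    harmonic = runs / sum(1.0 / s for s in sizes)
    return harmonic / mean


if __name__ == "__main__":
    for k in (1, 5, 50):
        print(k, harmonic_to_mean_ratio(k, p=0.8, cycles=5))
```

The estimated ratio stays below 1 for every k and moves toward 1 as k increases, which is the qualitative behaviour the bounds of the form 1 - constant/k describe.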
|
12
|
Lalam N, Jacob C, Jagers P. Modelling the PCR amplification process by a size-dependent branching process and estimation of the efficiency. Adv Appl Probab 2016. [DOI: 10.1239/aap/1086957587] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Indexed: 11/18/2022]
Abstract
We propose a stochastic model of the PCR amplification process: a size-dependent branching process that starts with a supercritical Bienaymé-Galton-Watson transient phase and then enters a near-critical, size-dependent saturation phase. This model allows us to estimate, with very good accuracy, the probability of replication of a DNA molecule at each cycle of a single PCR trajectory.
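The two phases described above can be pictured with a mean-field recursion in which the replication probability falls as the population grows. This is a rough sketch, not the authors' model: the efficiency curve `p0 * capacity / (capacity + N)`, the parameter names, and the numerical values are all hypothetical.

```python
def mean_trajectory(n0, p0, capacity, cycles):
    """Mean-field trajectory N_{n+1} = N_n * (1 + p(N_n)).

    p(N) = p0 * capacity / (capacity + N) is a hypothetical efficiency curve:
    close to p0 while N << capacity, decaying toward 0 as N grows.
    """
    sizes = [float(n0)]
    for _ in range(cycles):
        n = sizes[-1]
        efficiency = p0 * capacity / (capacity + n)
        sizes.append(n * (1.0 + efficiency))
    return sizes


def growth_ratios(sizes):
    """Per-cycle amplification factors N_{n+1} / N_n."""
    return [b / a for a, b in zip(sizes, sizes[1:])]


if __name__ == "__main__":
    ratios = growth_ratios(mean_trajectory(n0=10, p0=0.9, capacity=1e9, cycles=60))
    print(ratios[0], ratios[-1])
```

Early cycles amplify by nearly 1 + p0 (the supercritical phase), while late cycles amplify by a factor close to 1 (the near-critical saturation phase).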
|
13
|
Modelling the PCR amplification process by a size-dependent branching process and estimation of the efficiency. Adv Appl Probab 2016. [DOI: 10.1017/s0001867800013628] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/06/2022]
|
14
|
Ferla MP. Mutanalyst, an online tool for assessing the mutational spectrum of epPCR libraries with poor sampling. BMC Bioinformatics 2016; 17:152. [PMID: 27044645 PMCID: PMC4820924 DOI: 10.1186/s12859-016-0996-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 02/25/2016] [Accepted: 03/22/2016] [Indexed: 01/03/2023]
Abstract
Background: Assessing library diversity is an important control step in a directed evolution experiment. To do this, a limited number of colonies from a test library are sequenced and tested. In the case of an error-prone PCR library, the spectrum of the identified mutations (the proportions of mutations of a specific nucleobase to another) is calculated, enabling the user to make more informed predictions on library diversity and coverage. However, the calculations of the mutational spectrum are severely affected by the limited sample sizes. Results: Here an online program, called Mutanalyst, is presented, which not only automates the calculations but also estimates the errors involved. Specifically, the errors are calculated thanks to the complementarity of DNA, which means that a mutation has a complementary mutation on the other sequence. Additionally, it determines the mean number of mutations per sequence by fitting to a Poisson distribution, which is more robust than calculating the average given the small sample size. Conclusion: With these added measures to account for small sample size, the user can better assess whether the library is satisfactory or whether error-prone PCR conditions should be adjusted. The program is available at www.mutanalyst.com.
Affiliation(s)
- Matteo Paolo Ferla
  - Formerly Department of Biochemistry, University of Otago, Dunedin, New Zealand
  - Present address: Biosyntia, DTU Centre for Biosustainability, Hørsholm, Denmark
|
16
|
Alexander HK. Conditional Distributions and Waiting Times in Multitype Branching Processes. Adv Appl Probab 2016. [DOI: 10.1239/aap/1377868535] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 01/03/2023]
Abstract
In this paper we present novel results for discrete-time and Markovian continuous-time multitype branching processes. As a population develops, we are interested in the waiting time until a particular type of interest (such as an escape mutant) appears, and in how the distribution of individuals depends on whether this type has yet appeared. Specifically, both forward and backward equations for the distribution of type-specific population sizes over time, conditioned on the nonappearance of one or more particular types, are derived. In tandem, equations for the probability that one or more particular types have not yet appeared are also derived. Brief examples illustrate numerical methods and potential applications of these results in evolutionary biology and epidemiology.
|
17
|
Cowell RG, Graversen T, Lauritzen SL, Mortera J. Analysis of forensic DNA mixtures with artefacts. J R Stat Soc Ser C Appl Stat 2014. [DOI: 10.1111/rssc.12071] [Citation(s) in RCA: 79] [Impact Index Per Article: 7.2] [Indexed: 12/01/2022]
|
18
|
Probabilistic methods in directed evolution: library size, mutation rate, and diversity. Methods Mol Biol 2014; 1179:261-78. [PMID: 25055784 DOI: 10.1007/978-1-4939-1053-3_18] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 01/03/2023]
Abstract
Directed evolution has emerged as an important tool for engineering proteins with improved or novel properties. Because of their inherent reliance on randomness, directed evolution protocols are amenable to probabilistic modeling and analysis. This chapter summarizes and reviews in a nonmathematical way some of the probabilistic works related to directed evolution, with particular focus on three of the most widely used methods: saturation mutagenesis, error-prone PCR, and in vitro recombination. The ultimate aim is to provide the reader with practical information to guide the planning and design of directed evolution studies. Importantly, the applications and locations of freely available computational resources to assist with this process are described in detail.
|
19
|
Peischl S, Kirkpatrick M. Establishment of new mutations in changing environments. Genetics 2012; 191:895-906. [PMID: 22542964 PMCID: PMC3389982 DOI: 10.1534/genetics.112.140756] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.2] [Received: 03/28/2012] [Accepted: 04/20/2012] [Indexed: 11/18/2022]
Abstract
Understanding adaptation in changing environments is an important topic in evolutionary genetics, especially in the light of climatic and environmental change. In this work, we study one of the most fundamental aspects of the genetics of adaptation in changing environments: the establishment of new beneficial mutations. We use the framework of time-dependent branching processes to derive simple approximations for the establishment probability of new mutations assuming that temporal changes in the offspring distribution are small. This approach allows us to generalize Haldane's classic result for the fixation probability in a constant environment to arbitrary patterns of temporal change in selection coefficients. Under weak selection, the only aspect of temporal variation that enters the probability of establishment is a weighted average of selection coefficients. These weights quantify how much earlier generations contribute to determining the establishment probability compared to later generations. We apply our results to several biologically interesting cases such as selection coefficients that change in consistent, periodic, and random ways and to changing population sizes. Comparison with exact results shows that the approximation is very accurate.
Affiliation(s)
- Stephan Peischl
  - Section of Integrative Biology, University of Texas, Austin, TX 78712, USA
|
20
|
Schenk MF, Szendro IG, Krug J, de Visser JAGM. Quantifying the adaptive potential of an antibiotic resistance enzyme. PLoS Genet 2012; 8:e1002783. [PMID: 22761587 PMCID: PMC3386231 DOI: 10.1371/journal.pgen.1002783] [Citation(s) in RCA: 60] [Impact Index Per Article: 4.6] [Received: 01/27/2012] [Accepted: 05/09/2012] [Indexed: 12/30/2022]
Abstract
For a quantitative understanding of the process of adaptation, we need to understand its "raw material," that is, the frequency and fitness effects of beneficial mutations. At present, most empirical evidence suggests an exponential distribution of fitness effects of beneficial mutations, as predicted for Gumbel-domain distributions by extreme value theory. Here, we study the distribution of mutation effects on cefotaxime (Ctx) resistance and fitness of 48 unique beneficial mutations in the bacterial enzyme TEM-1 β-lactamase, which were obtained by screening the products of random mutagenesis for increased Ctx resistance. Our contributions are threefold. First, based on the frequency of unique mutations among more than 300 sequenced isolates and correcting for mutation bias, we conservatively estimate that the total number of first-step mutations that increase Ctx resistance in this enzyme is 87 [95% CI 75-189], or 3.4% of all 2,583 possible base-pair substitutions. Of the 48 mutations, 10 are synonymous and the majority of the 38 non-synonymous mutations occur in the pocket surrounding the catalytic site. Second, we estimate the effects of the mutations on Ctx resistance by determining survival at various Ctx concentrations, and we derive their fitness effects by modeling reproduction and survival as a branching process. Third, we find that the distribution of both measures follows a Fréchet-type distribution characterized by a broad tail of a few exceptionally fit mutants. Such distributions have fundamental evolutionary implications, including an increased predictability of evolution, and may provide a partial explanation for recent observations of striking parallel evolution of antibiotic resistance.
Affiliation(s)
- Martijn F. Schenk
  - Institute for Genetics, University of Cologne, Köln, Germany
  - Laboratory of Genetics, Wageningen University, Wageningen, The Netherlands
- Ivan G. Szendro
  - Institute for Theoretical Physics, University of Cologne, Köln, Germany
- Joachim Krug
  - Institute for Theoretical Physics, University of Cologne, Köln, Germany
  - Systems Biology of Ageing Cologne (Sybacol), University of Cologne, Köln, Germany
|
21
|
Ye X, Zhang C, Zhang YHP. Engineering a large protein by combined rational and random approaches: stabilizing the Clostridium thermocellum cellobiose phosphorylase. Mol Biosyst 2012; 8:1815-23. [DOI: 10.1039/c2mb05492b] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Indexed: 12/11/2022]
|
22
|
Cowell RG. Validation of an STR peak area model. Forensic Sci Int Genet 2009; 3:193-9. [PMID: 19414168 DOI: 10.1016/j.fsigen.2009.01.006] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 07/18/2008] [Revised: 01/06/2009] [Accepted: 01/07/2009] [Indexed: 11/24/2022]
Abstract
In analyzing a DNA mixture sample, the measured peak areas of alleles of STR markers amplified using the polymerase chain-reaction (PCR) technique provide valuable information concerning the relative amounts of DNA originating from each contributor to the mixture. This information can be exploited for the purpose of trying to predict the genetic profiles of those contributors whose genetic profiles are not known. The task is non-trivial, in part due to the need to take into account the stochastic nature of peak area values. Various methods have been proposed suggesting ways in which this may be done. One recent suggestion is a probabilistic expert system model that uses gamma distributions to model the size and stochastic variation in peak area values. In this paper we carry out a statistical analysis of the gamma distribution assumption, testing the assumption against synthetic peak area values computer generated using an independent model that simulates the PCR amplification process. Our analysis shows the gamma assumption works very well when allelic dropout is not present, but performs less and less well as dropout becomes more and more of an issue, such as occurs, for example, in Low Copy Template amplifications.
Affiliation(s)
- Robert G Cowell
  - Faculty of Actuarial Science and Insurance, Sir John Cass Business School, City University London, 106 Bunhill Row, London EC1Y 8TZ, UK
|
23
|
Bull JJ. The optimal burst of mutation to create a phenotype. J Theor Biol 2008; 254:667-73. [PMID: 18619470 DOI: 10.1016/j.jtbi.2008.06.006] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 03/16/2008] [Revised: 06/15/2008] [Accepted: 06/16/2008] [Indexed: 11/20/2022]
Abstract
Mutagenesis is commonly applied to genes and genomes to create novel variants with desired properties. This paper calculates the level of mutagenesis that maximizes the appearance of favorable mutants, assuming that the mutagenesis is applied in a single episode. The downside of mutagenesis is that a substantial fraction of mutations will destroy gene/genome function. The upside of mutagenesis is the production of beneficial mutations, but the desired phenotype may require that 1, 2 or more beneficial mutations be present simultaneously (the phenotype dimensionality). The optimum level of mutagenesis is sensitive to both properties. In the simplest model, the mutation optimum occurs when the number of lethal equivalents per genome equals the phenotype dimensionality, a result first derived by Mundry and Gierer [1958. Production of mutations in tobacco mosaic virus by chemical treatment of its nucleic acid in vitro. Z. Vererbungsl. 89 (4), 614-630]. This level of mutation is shown to be an upper bound for the optimum in various extensions of the model, and the recovery of mutants is also reasonably tolerant to deviations from the optimum.
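The stated optimum can be reproduced with a toy calculation. The sketch below assumes one simple form of the trade-off, which may differ from the paper's exact formulation: a genome carrying an average of L lethal equivalents survives with probability exp(-L), and each of the d required beneficial mutations is present with a probability proportional to L. The function names and the constant `beneficial_per_lethal` are hypothetical.

```python
import math


def yield_of_desired_mutants(lethal_equivalents, dimensionality,
                             beneficial_per_lethal=1e-3):
    """Relative yield of viable genomes carrying all required mutations.

    Assumed model: survival = exp(-L); each required beneficial mutation is
    present with probability beneficial_per_lethal * L (valid for small values).
    """
    survival = math.exp(-lethal_equivalents)
    gain = (beneficial_per_lethal * lethal_equivalents) ** dimensionality
    return survival * gain


def optimal_lethal_equivalents(dimensionality, grid_step=0.01, max_level=10.0):
    """Grid search for the mutagenesis level that maximizes the yield."""
    levels = [i * grid_step for i in range(1, int(max_level / grid_step) + 1)]
    return max(levels, key=lambda level: yield_of_desired_mutants(level, dimensionality))


if __name__ == "__main__":
    for d in (1, 2, 3):
        print(d, optimal_lethal_equivalents(d))
```

Under these assumptions the yield is proportional to L^d * exp(-L), whose derivative vanishes at L = d, so the grid search lands on a level equal to the phenotype dimensionality.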
Affiliation(s)
- J J Bull
  - Section of Integrative Biology, The Institute for Cellular and Molecular Biology, The University of Texas at Austin, 1 University Station, C0930, Austin, TX 78712, USA
|
24
|
Bloom JD, Lu Z, Chen D, Raval A, Venturelli OS, Arnold FH. Evolution favors protein mutational robustness in sufficiently large populations. BMC Biol 2007; 5:29. [PMID: 17640347 PMCID: PMC1995189 DOI: 10.1186/1741-7007-5-29] [Citation(s) in RCA: 72] [Impact Index Per Article: 4.0] [Received: 03/12/2007] [Accepted: 07/17/2007] [Indexed: 11/26/2022]
Abstract
Background: An important question is whether evolution favors properties such as mutational robustness or evolvability that do not directly benefit any individual but can influence the course of future evolution. Functionally similar proteins can differ substantially in their robustness to mutations and capacity to evolve new functions, but it has remained unclear whether any of these differences might be due to evolutionary selection for these properties. Results: Here, we use laboratory experiments to demonstrate that evolution favors protein mutational robustness if the evolving population is sufficiently large. We neutrally evolve cytochrome P450 proteins under identical selection pressures and mutation rates in populations of different sizes, and show that proteins from the larger and thus more polymorphic population tend towards higher mutational robustness. Proteins from the larger population also evolve greater stability, a biophysical property that is known to enhance both mutational robustness and evolvability. The excess mutational robustness and stability is well described by mathematical theory, and can be quantitatively related to the way that the proteins occupy their neutral network. Conclusion: Our work is the first experimental demonstration of the general tendency of evolution to favor mutational robustness and protein stability in highly polymorphic populations. We suggest that this phenomenon could contribute to the mutational robustness and evolvability of viruses and bacteria that exist in large populations.
Affiliation(s)
- Jesse D Bloom
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, USA
- Zhongyi Lu
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, USA
- David Chen
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, USA
- Alpan Raval
- Keck Graduate Institute of Applied Life Sciences and School of Mathematical Sciences, Claremont Graduate University, Claremont, CA 91711, USA
- Ophelia S Venturelli
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, USA
- Frances H Arnold
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, USA
|
25
|
Saha N, Watson LT, Kafadar K, Ramakrishnan N, Onufriev A, Mane S, Vasquez-Robinet C. Validation and estimation of parameters for a general probabilistic model of the PCR process. J Comput Biol 2007; 14:97-112. [PMID: 17381349 DOI: 10.1089/cmb.2006.0123] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/13/2022]
Abstract
Earlier work rigorously derived a general probabilistic model for the PCR process that includes as a special case the Velikanov-Kapral model where all nucleotide reaction rates are the same. In this model, the probability of binding of deoxy-nucleoside triphosphate (dNTP) molecules with template strands is derived from the microscopic chemical kinetics. A recursive solution for the probability function of binding of dNTPs is developed for a single cycle and is used to calculate expected yield for a multicycle PCR. The model is able to reproduce important features of the PCR amplification process quantitatively. With a set of favorable reaction conditions, the amplification of the target sequence is fast enough to rapidly outnumber all side products. Furthermore, the final yield of the target sequence in a multicycle PCR run always approaches an asymptotic limit that is less than one. The amplification process itself is highly sensitive to initial concentrations and the reaction rates of addition to the template strand of each type of dNTP in the solution. This paper extends the earlier Saha model with a physics-based model of the dependence of the reaction rates on temperature, and estimates parameters in this new model by nonlinear regression. The calibrated model is validated using RT-PCR data.
Affiliation(s)
- Nilanjan Saha
- Department of Computer Science, Virginia Polytechnic Institute and State University, Blacksburg, Virginia 24061, USA
|
26
|
Saha N, Watson LT, Kafadar K, Onufriev A, Ramakrishnan N, Vasquez-Robinet C, Watkinson J. A general probabilistic model of the PCR process. Conference Proceedings : ... Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual Conference 2007; 2004:2813-6. [PMID: 17270862 DOI: 10.1109/iembs.2004.1403803] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/13/2023]
Abstract
This work describes a general probabilistic model for the PCR process; this model includes as a special case the Velikanov-Kapral model where all nucleotide reaction rates are the same. In this model the probability of binding of deoxynucleoside triphosphate (dNTP) molecules with template strands is derived from the microscopic chemical kinetics. A recursive solution for the probability distribution of binding of dNTPs is developed for a single cycle and is used to calculate expected yield for a multicycle PCR. The model is able to reproduce important features of the PCR amplification process quantitatively. This model also suggests that the amplification process itself is highly sensitive to initial concentrations and the reaction rates of addition to the template strand of each type of dNTP in the solution.
Affiliation(s)
- Nilanjan Saha
- Dept. of Comput. Sci., Virginia Polytech. Inst. & State Univ., Blacksburg, VA, USA
|
27
|
Piau D. Harmonic continuous-time branching moments. Ann Appl Probab 2006. [DOI: 10.1214/105051606000000493] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
|
28
|
Estimation of the reaction efficiency in polymerase chain reaction. J Theor Biol 2006; 242:947-53. [PMID: 16843498 DOI: 10.1016/j.jtbi.2006.06.001] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Received: 03/01/2006] [Revised: 04/28/2006] [Accepted: 06/01/2006] [Indexed: 10/24/2022]
Abstract
Polymerase chain reaction (PCR) is widely used in molecular biology to increase the copy number of a specific DNA fragment. A succession of 20 replication cycles makes it possible to multiply the quantity of the fragment of interest by a factor of 1 million. The PCR technique has revolutionized genomics research. Several quantification methodologies are available to determine the DNA replication efficiency of the reaction, which is the probability of replication of a DNA molecule at a replication cycle. We elaborate a quantification procedure based on the exponential phase and the early saturation phase of PCR. The reaction efficiency is assumed to be constant in the exponential phase and decreasing in the saturation phase. We propose to model the PCR amplification process by a branching process that starts as a Galton-Watson branching process followed by a size-dependent process. Using this stochastic modelling and the conditional least-squares estimation method, we infer the reaction efficiency from a single PCR trajectory.
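As a rough illustration of the exponential-phase part of this idea, the sketch below simulates a Galton-Watson-type PCR trajectory with constant efficiency and recovers the efficiency by unweighted conditional least squares. The function names, seed, and parameter values are hypothetical, and the size-dependent saturation phase mentioned in the abstract is not modelled, so this should not be read as the authors' procedure.

```python
import random


def simulate_pcr(n0, efficiency, cycles, rng):
    """Simulate copy numbers when each molecule is copied with probability `efficiency` per cycle."""
    counts = [n0]
    for _ in range(cycles):
        current = counts[-1]
        # Each existing molecule independently yields one new copy with probability `efficiency`.
        new_copies = sum(1 for _ in range(current) if rng.random() < efficiency)
        counts.append(current + new_copies)
    return counts


def cls_efficiency(counts):
    """Unweighted conditional least-squares estimate of the efficiency.

    Minimising sum (N[k+1] - m * N[k])**2 over m gives
    m_hat = sum(N[k] * N[k+1]) / sum(N[k]**2); the efficiency is m_hat - 1.
    """
    numerator = sum(a * b for a, b in zip(counts[:-1], counts[1:]))
    denominator = sum(a * a for a in counts[:-1])
    return numerator / denominator - 1.0


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed, chosen arbitrarily for reproducibility
    trajectory = simulate_pcr(n0=100, efficiency=0.8, cycles=10, rng=rng)
    print(trajectory)
    print(round(cls_efficiency(trajectory), 3))
```

With perfect efficiency the count doubles every cycle, so 20 cycles give a factor of 2**20 = 1,048,576, which matches the "factor of 1 million" quoted in the abstract.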
|
29
|
Bloom JD, Labthavikul ST, Otey CR, Arnold FH. Protein stability promotes evolvability. Proc Natl Acad Sci U S A 2006; 103:5869-74. [PMID: 16581913 PMCID: PMC1458665 DOI: 10.1073/pnas.0510098103] [Citation(s) in RCA: 879] [Impact Index Per Article: 46.3] [Received: 11/21/2005] [Indexed: 11/18/2022]
Abstract
The biophysical properties that enable proteins to so readily evolve to perform diverse biochemical tasks are largely unknown. Here, we show that a protein's capacity to evolve is enhanced by the mutational robustness conferred by extra stability. We use simulations with model lattice proteins to demonstrate how extra stability increases evolvability by allowing a protein to accept a wider range of beneficial mutations while still folding to its native structure. We confirm this view experimentally by mutating marginally stable and thermostable variants of cytochrome P450 BM3. Mutants of the stabilized parent were more likely to exhibit new or improved functions. Only the stabilized P450 parent could tolerate the highly destabilizing mutations needed to confer novel activities such as hydroxylating the antiinflammatory drug naproxen. Our work establishes a crucial link between protein stability and evolution. We show that we can exploit this link to discover protein functions, and we suggest how natural evolution might do the same.
Affiliation(s)
- Christopher R. Otey
- Biochemistry and Molecular Biophysics Option, Mail Code 210-41, California Institute of Technology, Pasadena, CA 91125
|
30
|
Lee JY, Lim HW, Yoo SI, Zhang BT, Park TH. Simulation and real-time monitoring of polymerase chain reaction for its higher efficiency. Biochem Eng J 2006. [DOI: 10.1016/j.bej.2005.02.023] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Indexed: 11/29/2022]
|
31
|
Patrick WM, Firth AE. Strategies and computational tools for improving randomized protein libraries. Biomol Eng 2005; 22:105-12. [PMID: 16095966 DOI: 10.1016/j.bioeng.2005.06.001] [Citation(s) in RCA: 97] [Impact Index Per Article: 4.9] [Received: 05/02/2005] [Revised: 06/20/2005] [Accepted: 06/21/2005] [Indexed: 11/15/2022]
Abstract
In the last decade, directed evolution has become a routine approach for engineering proteins with novel or altered properties. Concurrently, a trend away from purely 'blind' randomization strategies and towards more 'semi-rational' approaches has also become apparent. In this review, we discuss ways in which structural information and predictive computational tools are playing an increasingly important role in guiding the design of randomized libraries: web servers such as ConSurf-HSSP and SCHEMA allow the prediction of sites to target for producing functional variants, while algorithms such as GLUE, PEDEL and DRIVeR are useful for estimating library completeness and diversity. In addition, we review recent methodological developments that facilitate the construction of unbiased libraries, which are inherently more diverse than biased libraries and therefore more likely to yield improved variants.
Affiliation(s)
- Wayne M Patrick
- Center for Fundamental and Applied Molecular Evolution, Emory University, 1510 Clifton Road, Atlanta GA 30322, USA.
|
32
|
Bosley AD, Ostermeier M. Mathematical expressions useful in the construction, description and evaluation of protein libraries. Biomol Eng 2005; 22:57-61. [PMID: 15857784 DOI: 10.1016/j.bioeng.2004.11.002] [Citation(s) in RCA: 90] [Impact Index Per Article: 4.5] [Received: 07/29/2004] [Revised: 10/31/2004] [Accepted: 11/01/2004] [Indexed: 11/24/2022]
Abstract
The creation of protein libraries by random mutagenesis and cassette mutagenesis has proven to be a successful method of protein engineering. Appropriate statistical analysis is important for the proper construction of these libraries and even more important for the interpretation of data from these libraries. We present simple mathematical expressions useful in the creation and evaluation of such libraries. These equations are useful in estimating the distribution of mutations, the degeneracy of the library and the frequency of a particular clone in the library. In addition, general equations addressing the probability that a particular clone is in a library, the probability that a library is complete, and the consequences of retransformation of the library on these probabilities are presented.
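To make the kind of quantity described here concrete, the following sketch computes textbook estimates under the simplifying assumption that all variants are equally likely and sampled independently. The function names are hypothetical, and the expressions are standard sampling approximations rather than formulas taken from the paper itself.

```python
import math


def prob_clone_present(num_variants, library_size):
    """P(a given variant appears at least once), assuming equiprobable variants."""
    return 1.0 - (1.0 - 1.0 / num_variants) ** library_size


def expected_distinct(num_variants, library_size):
    """Expected number of distinct variants represented in the library."""
    return num_variants * prob_clone_present(num_variants, library_size)


def prob_library_complete(num_variants, library_size):
    """Approximate P(every variant is present), treating variants as independent."""
    return prob_clone_present(num_variants, library_size) ** num_variants


def size_for_clone_probability(num_variants, target_prob):
    """Smallest library size with P(clone present) >= target_prob (needs num_variants > 1)."""
    return math.ceil(math.log(1.0 - target_prob) / math.log(1.0 - 1.0 / num_variants))


if __name__ == "__main__":
    variants = 1000
    for size in (1000, 3000, 10000):
        print(size, prob_clone_present(variants, size), prob_library_complete(variants, size))
    print(size_for_clone_probability(variants, 0.95))
```

Under these assumptions, sampling about three times as many clones as there are variants gives roughly a 95% chance that any particular variant is present, since 1 - exp(-3) is about 0.95; completeness of the whole library requires considerably more oversampling.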
Affiliation(s)
- Allen D Bosley
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, 3400 North Charles Street, Baltimore, MD 21218, USA
|
33
|
Drummond DA, Iverson BL, Georgiou G, Arnold FH. Why High-error-rate Random Mutagenesis Libraries are Enriched in Functional and Improved Proteins. J Mol Biol 2005; 350:806-16. [PMID: 15939434 DOI: 10.1016/j.jmb.2005.05.023] [Citation(s) in RCA: 106] [Impact Index Per Article: 5.3] [Received: 02/18/2005] [Revised: 05/06/2005] [Accepted: 05/10/2005] [Indexed: 10/25/2022]
Abstract
The fraction of proteins that retain wild-type function after mutation has long been observed to decline exponentially as the average number of mutations per gene increases. Recently, several groups have used error-prone polymerase chain reactions (PCR) to generate libraries with 15 to 30 mutations per gene, on average, and have reported that orders of magnitude more proteins retain function than would be expected from the low-mutation-rate trend. Proteins with improved or novel function were isolated disproportionately from these high-error-rate libraries, leading to claims that high mutation rates unlock regions of sequence space that are enriched in positively coupled mutations. Here, we show experimentally that error-prone PCR produces a broader non-Poisson distribution of mutations consistent with a detailed model of PCR. As error rates increase, this distribution leads directly to the observed excesses in functional clones. We then show that while very low mutation rates result in many functional sequences, only a small number are unique. By contrast, very high mutation rates produce mostly unique sequences, but few retain function. Thus an optimal mutation rate exists that balances uniqueness and retention of function. Overall, high-error-rate mutagenesis libraries are enriched in improved sequences because they contain more unique, functional clones. Our findings demonstrate how optimal error-prone PCR mutation rates may be calculated, and indicate that "optimal" rates depend on both the protein and the mutagenesis protocol.
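One simple way to see why PCR can broaden the mutation distribution beyond Poisson is sketched below. It assumes, as a simplification that may differ from the detailed model used in the paper, that a sequence sampled from the final pool was duplicated a Binomial(cycles, e/(1+e)) number of times for efficiency e, and that each duplication adds a Poisson number of mutations. All names and parameter values are hypothetical.

```python
import math


def mutation_pmf(cycles, efficiency, mut_per_duplication, max_mutations):
    """Mixture pmf of mutation counts: binomial number of duplications, Poisson mutations each."""
    # In a large pool, a fraction e / (1 + e) of molecules after each cycle are new copies.
    q = efficiency / (1.0 + efficiency)
    pmf = [0.0] * (max_mutations + 1)
    for d in range(cycles + 1):
        weight = math.comb(cycles, d) * q**d * (1.0 - q) ** (cycles - d)
        rate = d * mut_per_duplication
        for m in range(max_mutations + 1):
            pmf[m] += weight * math.exp(-rate) * rate**m / math.factorial(m)
    return pmf


def mean_and_variance(pmf):
    """Mean and variance of a pmf given as a list indexed by the count."""
    mean = sum(m * p for m, p in enumerate(pmf))
    variance = sum((m - mean) ** 2 * p for m, p in enumerate(pmf))
    return mean, variance


if __name__ == "__main__":
    pmf = mutation_pmf(cycles=30, efficiency=0.6, mut_per_duplication=0.5, max_mutations=80)
    mean, variance = mean_and_variance(pmf)
    # A Poisson distribution would have variance equal to its mean.
    print(mean, variance, variance / mean)
```

In this toy model the variance exceeds the mean by the variance of the duplication count scaled by the squared per-duplication mutation rate, so the overdispersion grows as the error rate increases.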
Affiliation(s)
- D Allan Drummond
- Program in Computation and Neural Systems, California Institute of Technology, Mail Code 210-41, Pasadena, CA 91125-4100, USA
|
34
|
Volles MJ, Lansbury PT. A computer program for the estimation of protein and nucleic acid sequence diversity in random point mutagenesis libraries. Nucleic Acids Res 2005; 33:3667-77. [PMID: 15990391 PMCID: PMC1166583 DOI: 10.1093/nar/gki669] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.2] [Indexed: 11/23/2022]
Abstract
A computer program for the generation and analysis of in silico random point mutagenesis libraries is described. The program operates by mutagenizing an input nucleic acid sequence according to mutation parameters specified by the user for each sequence position and type of point mutation. The program can mimic almost any type of random mutagenesis library, including those produced via error-prone PCR (ep-PCR), mutator Escherichia coli strains, chemical mutagenesis, and doped or random oligonucleotide synthesis. The program analyzes the generated nucleic acid sequences and/or the associated protein library to produce several estimates of library diversity (number of unique sequences, point mutations, and single point mutants) and the rate of saturation of these diversities during experimental screening or selection of clones. This information allows one to select the optimal screen size for a given mutagenesis library, necessary to efficiently obtain a certain coverage of the sequence-space. The program also reports the abundance of each specific protein mutation at each sequence position, which is useful as a measure of the level and type of mutation bias in the library. Alternatively, one can use the program to evaluate the relative merits of preexisting libraries, or to examine various hypothetical mutation schemes to determine the optimal method for creating a library that serves the screen/selection of interest. Simulated libraries of at least 10⁹ sequences are accessible by the numerical algorithm with currently available personal computers; an analytical algorithm is also available which can rapidly calculate a subset of the numerical statistics in libraries of arbitrarily large size. A multi-type double-strand stochastic model of ep-PCR is developed in an appendix to demonstrate the applicability of the algorithm to amplifying mutagenesis procedures. Estimators of DNA polymerase mutation-type-specific error rates are derived using the model. Analyses of an alpha-synuclein ep-PCR library and NNS synthetic oligonucleotide libraries are given as examples.
Affiliation(s)
- Michael J Volles
- Center for Neurologic Diseases, Brigham and Women's Hospital and Department of Neurology, Harvard Medical School 65 Landsdowne Street, Cambridge, MA 02139, USA.
|
35
|
Pritchard L, Corne D, Kell D, Rowland J, Winson M. A general model of error-prone PCR. J Theor Biol 2005; 234:497-509. [PMID: 15808871 DOI: 10.1016/j.jtbi.2004.12.005] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.2] [Received: 06/16/2004] [Revised: 11/24/2004] [Accepted: 12/07/2004] [Indexed: 11/28/2022]
Abstract
In this paper, we generalize a previously described model of the error-prone polymerase chain reaction (PCR) to conditions of arbitrarily variable amplification efficiency and initial population size. Generalisation of the model to these conditions improves the correspondence to observed and expected behaviours of PCR, and restricts the extent to which the model may explore sequence space for a prescribed set of parameters. Error-prone PCR in realistic reaction conditions is predicted to be less effective at generating grossly divergent sequences than the original model. The estimate of mutation rate per cycle by sampling sequences from an in vitro PCR experiment is correspondingly affected by the choice of model and parameters.
Affiliation(s)
- Leighton Pritchard
- Institute of Biological Sciences, University of Wales, Aberystwyth, Ceredigion, SY23 3DD, Wales, UK.
|
36
|
Piau D. Confidence intervals for nonhomogeneous branching processes and polymerase chain reactions. Ann Probab 2005. [DOI: 10.1214/009117904000000775] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Indexed: 11/19/2022]
|
37
|
Gill P, Curran J, Elliot K. A graphical simulation model of the entire DNA process associated with the analysis of short tandem repeat loci. Nucleic Acids Res 2005; 33:632-43. [PMID: 15681615 PMCID: PMC548350 DOI: 10.1093/nar/gki205] [Citation(s) in RCA: 101] [Impact Index Per Article: 5.1] [Indexed: 11/30/2022]
Abstract
The use of expert systems to interpret short tandem repeat DNA profiles in forensic, medical and ancient DNA applications is becoming increasingly prevalent as high-throughput analytical systems generate large amounts of data that are time-consuming to process. With special reference to low copy number (LCN) applications, we use a graphical model to simulate stochastic variation associated with the entire DNA process starting with extraction of sample, followed by the processing associated with the preparation of a PCR reaction mixture and PCR itself. Each part of the process is modelled with input efficiency parameters. Then, the key output parameters that define the characteristics of a DNA profile are derived, namely heterozygote balance (Hb) and the probability of allelic drop-out p(D). The model can be used to estimate the unknown efficiency parameters, such as π_extraction. ‘What-if’ scenarios can be used to improve and optimize the entire process, e.g. by increasing the aliquot forwarded to PCR, the improvement expected to a given DNA profile can be reliably predicted. We demonstrate that Hb and drop-out are mainly a function of stochastic effect of pre-PCR molecular selection. Whole genome amplification is unlikely to give any benefit over conventional PCR for LCN.
Affiliation(s)
- Peter Gill
- Forensic Science Service, Birmingham UK.
|
38
|
Bloom JD, Silberg JJ, Wilke CO, Drummond DA, Adami C, Arnold FH. Thermodynamic prediction of protein neutrality. Proc Natl Acad Sci U S A 2005; 102:606-11. [PMID: 15644440 PMCID: PMC545518 DOI: 10.1073/pnas.0406744102] [Citation(s) in RCA: 261] [Impact Index Per Article: 13.1] [Indexed: 11/18/2022]
Abstract
We present a simple theory that uses thermodynamic parameters to predict the probability that a protein retains the wild-type structure after one or more random amino acid substitutions. Our theory predicts that for large numbers of substitutions the probability that a protein retains its structure will decline exponentially with the number of substitutions, with the severity of this decline determined by properties of the structure. Our theory also predicts that a protein can gain extra robustness to the first few substitutions by increasing its thermodynamic stability. We validate our theory with simulations on lattice protein models and by showing that it quantitatively predicts previously published experimental measurements on subtilisin and our own measurements on variants of TEM1 beta-lactamase. Our work unifies observations about the clustering of functional proteins in sequence space, and provides a basis for interpreting the response of proteins to substitutions in protein engineering applications.
Affiliation(s)
- Jesse D Bloom
- Division of Chemistry and Chemical Engineering 210-41, California Institute of Technology, Pasadena, CA 91125, USA.
|
39
|
Abstract
The variability of the products of polymerase chain reactions, due to mutations and to incomplete replications, can have important clinical consequences. Sun (1995) and Weiss and von Haeseler (1995) modeled these errors by a branching process and introduced estimators of the mutation rate and of the efficiency of the reaction based, for example, on the empirical distribution of the mutations of a random sequence. This distribution involves a noncanonical branching Markov chain which, although easy to describe, is not analytically tractable except in the infinite-population limit. These authors for the infinite-target limit, and Wang et al. (2000) for finite targets, solved the infinite-population limit. In this paper, we provide bounds of the difference between the finite-target finite-population case and its finite-target infinite-population approximation. The bounds are explicit functions of the efficiency of the reaction, the mutation rate per site and per cycle, the size of the target, the number of cycles, and the size of the initial population. They concern every moment and, what might be more surprising, the histogram itself of the distributions. The bounds for the moments exhibit a phase transition at the value 1 - 1/N = 3/4 of the mutation rate per site and per cycle, where N = 4 is the number of letters in the encoding alphabet of DNA and RNA. Of course, in biological contexts, the mutation rates are much smaller than 3/4.
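The value 1 - 1/N = 3/4 is also the point at which a symmetric N-letter substitution model loses all memory of the original letter in a single replication. The sketch below illustrates that arithmetic under an assumed model in which a mutated site changes to one of the other N - 1 letters uniformly at random; the function name is hypothetical, and the abstract does not say whether this is the mechanism behind the phase transition in the paper's bounds.

```python
def match_probability(mutation_rate, replications, alphabet_size=4):
    """P(a site carries its original letter after a number of error-prone replications).

    Assumes each replication changes the site with probability `mutation_rate`,
    choosing uniformly among the other letters. The non-trivial eigenvalue of
    that transition matrix is 1 - mutation_rate * N / (N - 1).
    """
    contraction = 1.0 - mutation_rate * alphabet_size / (alphabet_size - 1)
    uniform = 1.0 / alphabet_size
    return uniform + (1.0 - uniform) * contraction**replications


if __name__ == "__main__":
    for rate in (0.001, 0.01, 0.5, 0.75):
        print(rate, [match_probability(rate, k) for k in (1, 5, 20)])
```

At a mutation rate of exactly 3/4 the contraction factor is zero, so the site is uniformly distributed over the four letters after one replication; realistic rates are far below this, as the abstract notes.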
Affiliation(s)
- Didier Piau
- Université Claude Bernard Lyon 1, Lyon Cedex, France.
|
40
|
|
41
|
Manrubia SC, Arribas M, Lázaro E. Supercritical branching processes and the role of fluctuations under exponential population growth. J Theor Biol 2003; 225:497-505. [PMID: 14615209 DOI: 10.1016/s0022-5193(03)00294-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Indexed: 11/28/2022]
Abstract
We study some exact properties of supercritical branching processes. A proper rescaling of the relevant variable allows us to determine the distribution of population sizes after a number of generations have elapsed. Both time-continuous and discrete processes are analysed and compared. The obtained results are of relevance for the growth of populations that are not resource limited (a typical situation in some biological processes that can be modelled by laboratory experiments). Large fluctuations inherent to the process play a main role when bottlenecks occur.
Affiliation(s)
- Susanna C Manrubia
- Centro de Astrobiología INTA-CSIC, Ctra. de Ajalvir km. 4, E-28850 Torrejón de Ardoz, Madrid, Spain.
|
42
|
Abstract
Even though the efficiency of the polymerase chain reaction (PCR) decreases, analyses are made in terms of Galton-Watson processes, or simple deterministic models with constant replication probability (efficiency). Recently, Schnell and Mendoza have suggested that the form of the efficiency can be derived from enzyme kinetics. This results in the sequence of molecule numbers forming a stochastic process with the properties of a branching process with population size dependence, which is supercritical, but has a mean reproduction number that approaches one. Such processes display ultimate linear growth, after an initial exponential phase, as is the case in PCR. It is also shown that the resulting stochastic process for a large Michaelis-Menten constant behaves like the deterministic sequence x(n) arising by iterations of the function f(x) = x + x/(1 + x).
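The two growth regimes mentioned in this abstract can be seen by iterating the stated function directly. The sketch below is only a numerical illustration of the deterministic sequence; the starting value and the number of steps are arbitrary choices, and the function names are hypothetical.

```python
def efficiency_map(x):
    """One iteration of f(x) = x + x / (1 + x)."""
    return x + x / (1.0 + x)


def iterate(x0, steps):
    """Return the sequence x(0), x(1), ..., x(steps) under repeated application of f."""
    xs = [x0]
    for _ in range(steps):
        xs.append(efficiency_map(xs[-1]))
    return xs


if __name__ == "__main__":
    xs = iterate(x0=0.001, steps=100)
    # Early on x is small, so f(x) is close to 2x (near-doubling, exponential phase).
    print([b / a for a, b in zip(xs[:5], xs[1:6])])
    # Later x is large, so f(x) is close to x + 1 (linear phase).
    print([b - a for a, b in zip(xs[-6:-1], xs[-5:])])
```

The ratio of successive terms is 1 + 1/(1 + x), which is near 2 for small x and approaches 1 as x grows, while the increment x/(1 + x) approaches 1 from below.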
Affiliation(s)
- Peter Jagers
- School of Mathematical Sciences, Chalmers University of Technology, Gothenburg, Sweden.
|
43
|
Urban C, Schweinberger A, Kundi M, Dorner F, Hämmerle T. Relationship between detection limit and bias of accuracy of quantification of RNA by RT-PCR. Mol Cell Probes 2003; 17:171-4. [PMID: 12944119 DOI: 10.1016/s0890-8508(03)00049-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/16/2022]
Abstract
Evidence is presented demonstrating that the distribution of data obtained applying a given RT-PCR method deviates from a normal distribution depending on the limit of detection. The effect of this is a bias towards higher values and concomitantly a systematic error in respect to the accuracy of the evaluation due to this deviation from normality. In addition, evidence is presented that an evaluation assuming a log-normal distribution is more appropriate.
Affiliation(s)
- Carsten Urban
- Baxter Vaccine AG, Biomedical Research Center, Uferstr. 15, 2304, Orth/Donau, Austria
|
44
|
Shinde D, Lai Y, Sun F, Arnheim N. Taq DNA polymerase slippage mutation rates measured by PCR and quasi-likelihood analysis: (CA/GT)n and (A/T)n microsatellites. Nucleic Acids Res 2003; 31:974-80. [PMID: 12560493 PMCID: PMC149199 DOI: 10.1093/nar/gkg178] [Citation(s) in RCA: 169] [Impact Index Per Article: 7.7] [Indexed: 11/14/2022]
Abstract
During microsatellite polymerase chain reaction (PCR), insertion-deletion mutations produce stutter products differing from the original template by multiples of the repeat unit length. We analyzed the PCR slippage products of (CA)n and (A)n tracts cloned in a pUC18 vector. Repeat numbers varied from two to 14 (CA)n and four to 12 (A)n. Data was generated on approximately 10 single molecules for each clone type using two rounds of nested PCR. The size and peak areas of the products were obtained by capillary electrophoresis. A quasi-likelihood approach to the analysis of the data estimated the mutation rate/repeat/PCR cycle. The rate for (CA)n tracts was 3.6 × 10⁻³ with contractions 14 times greater than expansions. For (A)n tracts the rate was 1.5 × 10⁻² and contractions outnumbered expansions by 5-fold. The threshold for detecting 'stutter' products was computed to be four repeats for (CA)n and eight repeats for (A)n or approximately 8 bp in both cases. A comparison was made between the computationally and experimentally derived threshold values. The threshold and expansion to contraction ratios are explained on the basis of the active site structure of Taq DNA polymerase and models of the energetics of slippage events, respectively.
Affiliation(s)
- Deepali Shinde
- Program in Molecular and Computational Biology and Department of Mathematics, University of Southern California, Los Angeles, CA 90089, USA
|
45
|
Ingr M, Konvalinka J. Theoretical description of the direct exponential amplification and sequencing (DEXAS) method. Biol Chem 2000; 381:439-45. [PMID: 10937875 DOI: 10.1515/bc.2000.057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
We present a theoretical description of the method of DNA sequencing with simultaneous exponential PCR amplification of the template (DEXAS). Based on the theory of probability, the formula determining the optimal ratio of concentrations of deoxy- and dideoxynucleotides in the reaction mixture is derived, as well as the length distribution of sequenced DNA fragments. The prediction of the number of mutations is given and the theoretically determined aspects of DEXAS are compared with the corresponding quantities of classical sequencing methods. Some other experimentally observed effects are also discussed.
Affiliation(s)
- M Ingr
- Institute of Organic Chemistry and Biochemistry, Academy of Sciences of the Czech Republic
|
46
|
Wang D, Zhao C, Cheng R, Sun F. Estimation of the mutation rate during error-prone polymerase chain reaction. J Comput Biol 2000; 7:143-58. [PMID: 10890392 DOI: 10.1089/10665270050081423] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.0] [Indexed: 11/13/2022]
Abstract
Error-prone polymerase chain reaction (PCR) is widely used to introduce point mutations during in vitro evolution experiments. Accurate estimation of the mutation rate during error-prone PCR is important in studying the diversity of error-prone PCR product. Although many methods for estimating the mutation rate during PCR are available, all the existing methods depend on the assumption that the mutation rate is low and mutations occur at different places whenever they occur. The available methods may not be applicable to estimate the mutation rate during error-prone PCR. We develop a mathematical model for error-prone PCR and present methods to estimate the mutation rate during error-prone PCR without assuming low mutation rate. We also develop a computer program to simulate error-prone PCR. Using the program, we compare the newly developed methods with two other methods. We show that when the mutation rate is relatively low (< 10⁻³ per base per PCR cycle), the newly developed methods give roughly the same results as previous methods. When the mutation rate is relatively high (> 5 × 10⁻³ per base per PCR cycle, the mutation rate for most error-prone PCR experiments), the previous methods underestimate the mutation rate and the newly developed methods approximate the true mutation rate.
Affiliation(s)
- D Wang
- Department of Mathematics, University of Southern California, Los Angeles 90089-1113, USA.
|
47
|
Abstract
A probabilistic approach to the kinetics of the polymerase chain reaction (PCR) is developed. The approach treats the primer extension step of PCR as a microscopic Markov process in which the molecules of deoxy-nucleoside triphosphate (dNTP) are bound to the 3' end of the primer strand one at a time. The binding probability rates are prescribed by combinatorial rules in accord with the microscopic chemical kinetics. As an example, a simple model based on this approach is proposed and analysed, and an exact solution for the probability distribution of lengths of synthesized DNA strands is found by analytical means. Using this solution, it is demonstrated that the model is able to reproduce the main features of PCR, such as extreme sensitivity to the variation of control parameters and the existence of an amplification plateau. A multidimensional optimization technique is used to find numerically the optimum values of control parameters which maximize the yield of the target sequence for a given PCR run while minimizing the overall run time.
Affiliation(s)
- M V Velikanov
- Department of Chemistry, University of Toronto, Toronto, Ontario, M5S 3H6, Canada
|
48
|
Stolovitzky G, Cecchi G. Efficiency of DNA replication in the polymerase chain reaction. Proc Natl Acad Sci U S A 1996; 93:12947-52. [PMID: 8917524 PMCID: PMC24026 DOI: 10.1073/pnas.93.23.12947] [Citation(s) in RCA: 72] [Impact Index Per Article: 2.5] [Indexed: 02/03/2023]
Abstract
A detailed quantitative kinetic model for the polymerase chain reaction (PCR) is developed, which allows us to predict the probability of replication of a DNA molecule in terms of the physical parameters involved in the system. The important issue of the determination of the number of PCR cycles during which this probability can be considered to be a constant is solved within the framework of the model. New phenomena of multimodality and scaling behavior in the distribution of the number of molecules after a given number of PCR cycles are presented. The relevance of the model for quantitative PCR is discussed, and a novel quantitative PCR technique is proposed.
Affiliation(s)
- G Stolovitzky
- Center for Studies in Physics and Biology, Rockefeller University, New York, NY 10021, USA
|