1
Cao S, Qiu Y, Kalin ML, Huang X. Integrative generalized master equation: A method to study long-timescale biomolecular dynamics via the integrals of memory kernels. J Chem Phys 2023; 159:134106. [PMID: 37787134 PMCID: PMC11005468 DOI: 10.1063/5.0167287] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 07/11/2023] [Accepted: 09/18/2023] [Indexed: 10/04/2023]
Abstract
The generalized master equation (GME) provides a powerful approach to study biomolecular dynamics via non-Markovian dynamic models built from molecular dynamics (MD) simulations. Previously, we implemented the GME as the quasi Markov State Model (qMSM), in which we explicitly calculate the memory kernel and propagate dynamics using a discretized GME. The qMSM can be constructed with much shorter MD trajectories than the MSM. However, since the qMSM needs to explicitly compute the time-dependent memory kernels, it is heavily affected by the numerical fluctuations of simulation data when applied to study biomolecular conformational changes. This can lead to numerical instability of the predicted long-time dynamics, greatly limiting the applicability of the qMSM to complicated biomolecules. We present a new method, the Integrative GME (IGME), in which we analytically solve the GME under the condition that the memory kernels have decayed to zero. Our IGME overcomes the challenges of the qMSM by using the time integrals of memory kernels, thereby avoiding the numerical instability caused by explicit computation of time-dependent memory kernels. Using our solutions of the GME, we have developed a new approach to compute long-time dynamics based on MD simulations in a numerically stable, accurate, and efficient way. To demonstrate its effectiveness, we have applied the IGME to three biomolecules: the alanine dipeptide, FIP35 WW-domain, and Taq RNA polymerase. In each system, the IGME achieves significantly smaller fluctuations for both memory kernels and long-time dynamics compared to the qMSM. We anticipate that the IGME can be widely applied to investigate biomolecular conformational changes.
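For orientation, the GME mentioned in this abstract is often written in the matrix form sketched below. This is a generic textbook-style form and not a transcription from the article: the symbols T(t) (transition probability matrix at lag time t), K(s) (memory kernel) and \tau_K (the time after which the kernel is taken to have decayed) are assumed notation, and the ordering of the matrix products depends on whether a row- or column-stochastic convention is used.

```latex
% Assumed notation (not taken from the abstract):
%   T(t)    transition probability matrix at lag time t
%   K(s)    memory kernel matrix
%   \tau_K  time beyond which K(s) is treated as zero
\frac{d\mathbf{T}(t)}{dt}
  = \mathbf{T}(t)\,\dot{\mathbf{T}}(0)
  - \int_{0}^{t} \mathbf{T}(t-s)\,\mathbf{K}(s)\,ds

% If K(s) \approx 0 for s > \tau_K, the upper limit can be truncated:
\frac{d\mathbf{T}(t)}{dt}
  \approx \mathbf{T}(t)\,\dot{\mathbf{T}}(0)
  - \int_{0}^{\tau_K} \mathbf{T}(t-s)\,\mathbf{K}(s)\,ds,
  \qquad t \ge \tau_K
```

The truncated form illustrates why the regime in which the memory kernels have decayed to zero is the natural place to look for a closed-form solution.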
Affiliation(s)
- Siqin Cao
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Yunrui Qiu
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Michael L. Kalin
  - Biophysics Graduate Program, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Xuhui Huang
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
2
Rojewski A, Schweiger M, Sgouralis I, Comstock M, Pressé S. An accurate probabilistic step finder for time-series analysis. bioRxiv 2023:2023.09.19.558535. [PMID: 37786687 PMCID: PMC10541599 DOI: 10.1101/2023.09.19.558535] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/04/2023]
Abstract
Noisy time-series data is commonly collected from sources including Förster Resonance Energy Transfer experiments, patch clamp and force spectroscopy setups, among many others. Two of the most common paradigms for the detection of discrete transitions in such time-series data are hidden Markov models (HMMs) and step-finding algorithms. HMMs, including their extensions to infinite state-spaces, inherently assume in analysis that holding times in the discrete states visited are geometrically (or, loosely speaking, exponentially) distributed. Thus the determination of step locations, especially in sparse and noisy data, is biased by HMMs toward identifying steps resulting in geometric holding times. In contrast, existing step-finding algorithms, while free of this restraint, often rely on ad hoc metrics to penalize steps recovered in time traces (by using various information criteria) and otherwise rely on approximate greedy algorithms to identify putative global optima. Here, instead, we devise a robust and general probabilistic (Bayesian) step-finding tool that neither relies on ad hoc metrics to penalize step numbers nor assumes geometric holding times in each state. As the number of steps in a time series is itself a priori unknown, we treat it within a Bayesian nonparametric (BNP) paradigm. We find that the method developed, Bayesian Nonparametric Step (BNP-Step), accurately determines the number and location of transitions between discrete states without any assumed kinetic model and learns the emission distribution characteristic of each state. In doing so, we verify that BNP-Step can analyze sparser data sets containing higher noise and more closely-spaced states than otherwise resolved by current state-of-the-art methods. What is more, BNP-Step rigorously propagates measurement uncertainty into uncertainty over state transition locations, numbers, and emission levels as characterized by the posterior. We demonstrate the performance of BNP-Step on both synthetic data as well as data drawn from force spectroscopy experiments.
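The statement that HMMs imply geometric holding times can be checked with a short simulation. The sketch below is illustrative only and is not part of BNP-Step: the function name `simulate_dwell_times`, the self-transition probability `p_stay`, and the symmetric two-state setting are all assumptions made for this example. For a chain that stays put with probability `p_stay` at every step, the dwell time k has probability p_stay^(k-1) (1 - p_stay), so the mean dwell is 1 / (1 - p_stay).

```python
import random


def simulate_dwell_times(p_stay, n_steps, seed=0):
    """Simulate a discrete-time Markov chain whose states all share the same
    self-transition probability p_stay, and return the completed dwell times
    (in time steps). Hypothetical helper written for illustration."""
    rng = random.Random(seed)
    dwells = []
    current = 1  # the chain has occupied its present state for one step
    for _ in range(n_steps):
        if rng.random() < p_stay:
            current += 1            # stay in the same state one more step
        else:
            dwells.append(current)  # leave: record the finished dwell
            current = 1             # start counting in the new state
    return dwells


p_stay = 0.9
dwells = simulate_dwell_times(p_stay, 200_000)

mean_dwell = sum(dwells) / len(dwells)
expected_mean = 1.0 / (1.0 - p_stay)  # mean of a geometric distribution
frac_one = sum(1 for d in dwells if d == 1) / len(dwells)

print(f"empirical mean dwell: {mean_dwell:.2f} (geometric mean: {expected_mean:.2f})")
print(f"fraction of one-step dwells: {frac_one:.3f} (geometric: {1.0 - p_stay:.3f})")
```

Because the shortest dwell is always the most probable one under this distribution, an HMM fitted to sparse data has a built-in preference for short holding times, which is the bias the abstract describes.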
Affiliation(s)
- Alex Rojewski
  - Department of Physics, Arizona State University, Tempe, Arizona
  - Center for Biological Physics, Arizona State University, Tempe, Arizona
- Maxwell Schweiger
  - Department of Physics, Arizona State University, Tempe, Arizona
  - Center for Biological Physics, Arizona State University, Tempe, Arizona
- Ioannis Sgouralis
  - Department of Mathematics, University of Tennessee, Knoxville, Knoxville, Tennessee
- Matthew Comstock
  - Department of Physics and Astronomy, Michigan State University, East Lansing, Michigan
- Steve Pressé
  - Department of Physics, Arizona State University, Tempe, Arizona
  - Center for Biological Physics, Arizona State University, Tempe, Arizona
  - School of Molecular Sciences, Arizona State University, Tempe, Arizona
3
Xu (徐伟青) LW, Jazani S, Kilic Z, Pressé S. Single-Molecule Reaction-Diffusion. bioRxiv 2023:2023.09.05.556378. [PMID: 37732202 PMCID: PMC10508780 DOI: 10.1101/2023.09.05.556378] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/22/2023]
Abstract
We propose to capture reaction-diffusion on a molecule-by-molecule basis from the fastest acquirable timescale, namely individual photon arrivals. We illustrate our method on intrinsically disordered human proteins, the linker histone H1.0 as well as its chaperone prothymosin α, as these diffuse through an illuminated confocal spot and interact forming larger ternary complexes on millisecond timescales. Most importantly, single-molecule reaction-diffusion, smRD, reveals single molecule properties without trapping or otherwise confining molecules to surfaces. We achieve smRD within a Bayesian paradigm and term our method Bayes-smRD. Bayes-smRD is further free of the average, bulk, results inherent to the analysis of long photon arrival traces by fluorescence correlation spectroscopy. In learning from thousands of photon arrivals continuous spatial positions and discrete conformational and photophysical state changes, Bayes-smRD estimates kinetic parameters on a molecule-by-molecule basis with two to three orders of magnitude less data than tools such as fluorescence correlation spectroscopy, thereby also dramatically reducing sample photodamage.
Affiliation(s)
- Lance W.Q. Xu (徐伟青)
  - Center for Biological Physics, Arizona State University, Tempe, AZ 85287, USA
  - Department of Physics, Arizona State University, Tempe, AZ 85287, USA
- Sina Jazani
  - Department of Biophysics and Biophysical Chemistry, Johns Hopkins Medicine, Baltimore, MD 21205, USA
- Zeliha Kilic
  - Department of Structural Biology, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Steve Pressé
  - Center for Biological Physics, Arizona State University, Tempe, AZ 85287, USA
  - Department of Physics, Arizona State University, Tempe, AZ 85287, USA
  - School of Molecular Science, Arizona State University, Tempe, AZ 85287, USA
4
Dominic AJ, Cao S, Montoya-Castillo A, Huang X. Memory Unlocks the Future of Biomolecular Dynamics: Transformative Tools to Uncover Physical Insights Accurately and Efficiently. J Am Chem Soc 2023; 145:9916-9927. [PMID: 37104720 DOI: 10.1021/jacs.3c01095] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 04/29/2023]
Abstract
Conformational changes underpin function and encode complex biomolecular mechanisms. Gaining atomic-level detail of how such changes occur has the potential to reveal these mechanisms and is of critical importance in identifying drug targets, facilitating rational drug design, and enabling bioengineering applications. While the past two decades have brought Markov state model techniques to the point where practitioners can regularly use them to glimpse the long-time dynamics of slow conformations in complex systems, many systems are still beyond their reach. In this Perspective, we discuss how including memory (i.e., non-Markovian effects) can reduce the computational cost to predict the long-time dynamics in these complex systems by orders of magnitude and with greater accuracy and resolution than state-of-the-art Markov state models. We illustrate how memory lies at the heart of successful and promising techniques, ranging from the Fokker-Planck and generalized Langevin equations to deep-learning recurrent neural networks and generalized master equations. We delineate how these techniques work, identify insights that they can offer in biomolecular systems, and discuss their advantages and disadvantages in practical settings. We show how generalized master equations can enable the investigation of, for example, the gate-opening process in RNA polymerase II and demonstrate how our recent advances tame the deleterious influence of statistical underconvergence of the molecular dynamics simulations used to parameterize these techniques. This represents a significant leap forward that will enable our memory-based techniques to interrogate systems that are currently beyond the reach of even the best Markov state models. We conclude by discussing some current challenges and future prospects for how exploiting memory will open the door to many exciting opportunities.
Affiliation(s)
- Anthony J Dominic
  - Department of Chemistry, University of Colorado Boulder, Boulder, Colorado 80309, USA
- Siqin Cao
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Xuhui Huang
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
5
Kilic Z, Schweiger M, Moyer C, Shepherd D, Pressé S. Gene expression model inference from snapshot RNA data using Bayesian non-parametrics. Nat Comput Sci 2023; 3:174-183. [PMID: 38125199 PMCID: PMC10732567 DOI: 10.1038/s43588-022-00392-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 06/16/2022] [Accepted: 12/15/2022] [Indexed: 12/23/2023]
Abstract
Gene expression models, which are key to understanding cellular regulatory response, underlie observations of single-cell transcriptional dynamics. Although RNA expression data encode information on gene expression models, existing computational frameworks do not perform simultaneous Bayesian inference of gene expression models and parameters from such data. Rather, gene expression models (composed of gene states, their connectivities and associated parameters) are currently deduced by pre-specifying gene state numbers and connectivity before learning associated rate parameters. Here we propose a method to learn full distributions over gene states, state connectivities and associated rate parameters, simultaneously and self-consistently from single-molecule RNA counts. We propagate noise from fluctuating RNA counts over models by treating models themselves as random variables. We achieve this within a Bayesian non-parametric paradigm. We demonstrate our method on the Escherichia coli lacZ pathway and the Saccharomyces cerevisiae STL1 pathway, and verify its robustness on synthetic data.
Affiliation(s)
- Zeliha Kilic
  - Department of Structural Biology, St. Jude Children’s Research Hospital, Memphis, TN, USA
  - These authors contributed equally: Zeliha Kilic, Max Schweiger
- Max Schweiger
  - Center for Biological Physics, ASU, Tempe, AZ, USA
  - Department of Physics, ASU, Tempe, AZ, USA
  - These authors contributed equally: Zeliha Kilic, Max Schweiger
- Camille Moyer
  - Center for Biological Physics, ASU, Tempe, AZ, USA
  - School of Mathematics and Statistical Sciences, ASU, Tempe, AZ, USA
- Douglas Shepherd
  - Center for Biological Physics, ASU, Tempe, AZ, USA
  - Department of Physics, ASU, Tempe, AZ, USA
- Steve Pressé
  - Center for Biological Physics, ASU, Tempe, AZ, USA
  - Department of Physics, ASU, Tempe, AZ, USA
  - School of Molecular Sciences, ASU, Tempe, AZ, USA
6
Bryan JS, Pressé S. Learning continuous potentials from smFRET. Biophys J 2023; 122:433-441. [PMID: 36463404 PMCID: PMC9892619 DOI: 10.1016/j.bpj.2022.11.2947] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2022] [Revised: 11/08/2022] [Accepted: 11/29/2022] [Indexed: 12/07/2022]
Abstract
Potential energy landscapes are useful models in describing events such as protein folding and binding. While single-molecule fluorescence resonance energy transfer (smFRET) experiments encode information on continuous potentials for the system probed, including rarely visited barriers between putative potential minima, this information is rarely decoded from the data. This is because existing analysis methods often model smFRET output assuming, from the outset, that the system probed evolves in a discretized state space to be analyzed within a hidden Markov model (HMM) paradigm. By contrast, here, we infer continuous potentials from smFRET data without discretely approximating the state space. We do so by operating within a Bayesian nonparametric paradigm by placing priors on the family of all possible potential curves. As our inference accounts for a number of required experimental features raising computational cost (such as incorporating discrete photon shot noise), the framework leverages a structured-kernel-interpolation Gaussian process prior to help curtail computational cost. We show that our structured-kernel-interpolation priors for potential energy reconstruction from smFRET analysis accurately infer the potential energy landscape from an smFRET binding experiment. We then illustrate advantages of structured-kernel-interpolation priors for potential energy reconstruction from smFRET over standard HMM approaches by providing information, such as barrier heights and friction coefficients, that is otherwise inaccessible to HMMs.
Affiliation(s)
- J Shepard Bryan
  - Center for Biological Physics, Arizona State University, Tempe, Arizona
  - Department of Physics, Arizona State University, Tempe, Arizona
- Steve Pressé
  - Center for Biological Physics, Arizona State University, Tempe, Arizona
  - Department of Physics, Arizona State University, Tempe, Arizona
  - School of Molecular Sciences, Arizona State University, Tempe, Arizona
7
Saurabh A, Fazel M, Safar M, Sgouralis I, Pressé S. Single-photon smFRET. I: Theory and conceptual basis. Biophys Rep (N Y) 2022; 3:100089. [PMID: 36582655 PMCID: PMC9793182 DOI: 10.1016/j.bpr.2022.100089] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/21/2022] [Accepted: 11/28/2022] [Indexed: 12/03/2022]
Abstract
We present a unified conceptual framework and the associated software package for single-molecule Förster resonance energy transfer (smFRET) analysis from single-photon arrivals leveraging Bayesian nonparametrics, BNP-FRET. This unified framework addresses the following key physical complexities of a single-photon smFRET experiment, including: 1) fluorophore photophysics; 2) continuous time kinetics of the labeled system with large timescale separations between photophysical phenomena such as excited photophysical state lifetimes and events such as transition between system states; 3) unavoidable detector artefacts; 4) background emissions; 5) unknown number of system states; and 6) both continuous and pulsed illumination. These physical features necessarily demand a novel framework that extends beyond existing tools. In particular, the theory naturally brings us to a hidden Markov model with a second-order structure and Bayesian nonparametrics on account of items 1, 2, and 5 on the list. In the second and third companion articles, we discuss the direct effects of these key complexities on the inference of parameters for continuous and pulsed illumination, respectively.
Affiliation(s)
- Ayush Saurabh
  - Center for Biological Physics, Arizona State University, Tempe, Arizona
  - Department of Physics, Arizona State University, Tempe, Arizona
- Mohamadreza Fazel
  - Center for Biological Physics, Arizona State University, Tempe, Arizona
  - Department of Physics, Arizona State University, Tempe, Arizona
- Matthew Safar
  - Center for Biological Physics, Arizona State University, Tempe, Arizona
  - Department of Mathematics and Statistical Science, Arizona State University, Tempe, Arizona
- Ioannis Sgouralis
  - Department of Mathematics, University of Tennessee Knoxville, Knoxville, Tennessee
- Steve Pressé
  - Center for Biological Physics, Arizona State University, Tempe, Arizona
  - Department of Physics, Arizona State University, Tempe, Arizona
  - School of Molecular Sciences, Arizona State University, Tempe, Arizona
  - Corresponding author
8
Saurabh A, Safar M, Fazel M, Sgouralis I, Pressé S. Single-photon smFRET: II. Application to continuous illumination. Biophys Rep (N Y) 2022; 3:100087. [PMID: 36582656 PMCID: PMC9792399 DOI: 10.1016/j.bpr.2022.100087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2022] [Revised: 11/01/2022] [Accepted: 11/21/2022] [Indexed: 12/03/2022]
Abstract
Here we adapt the Bayesian nonparametrics (BNP) framework presented in the first companion article to analyze kinetics from single-photon, single-molecule Förster resonance energy transfer (smFRET) traces generated under continuous illumination. Using our sampler, BNP-FRET, we learn the escape rates and the number of system states given a photon trace. We benchmark our method by analyzing a range of synthetic and experimental data. Particularly, we apply our method to simultaneously learn the number of system states and the corresponding kinetics for intrinsically disordered proteins using two-color FRET under varying chemical conditions. Moreover, using synthetic data, we show that our method can deduce the number of system states even when kinetics occur at timescales of interphoton intervals.
Affiliation(s)
- Ayush Saurabh
  - Center for Biological Physics, Arizona State University, Tempe, Arizona
  - Department of Physics, Arizona State University, Tempe, Arizona
- Matthew Safar
  - Center for Biological Physics, Arizona State University, Tempe, Arizona
  - Department of Mathematics and Statistical Science, Arizona State University, Tempe, Arizona
- Mohamadreza Fazel
  - Center for Biological Physics, Arizona State University, Tempe, Arizona
  - Department of Physics, Arizona State University, Tempe, Arizona
- Ioannis Sgouralis
  - Department of Mathematics, University of Tennessee Knoxville, Knoxville, Tennessee
- Steve Pressé
  - Center for Biological Physics, Arizona State University, Tempe, Arizona
  - Department of Physics, Arizona State University, Tempe, Arizona
  - School of Molecular Sciences, Arizona State University, Tempe, Arizona
  - Corresponding author
9
Jazani S, Xu 徐伟青 LWQ, Sgouralis I, Shepherd DP, Pressé S. Computational Proposal for Tracking Multiple Molecules in a Multifocus Confocal Setup. ACS Photonics 2022; 9:2489-2498. [PMID: 36051355 PMCID: PMC9431897 DOI: 10.1021/acsphotonics.2c00614] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 05/09/2023]
Abstract
Tracking single molecules continues to provide new insights into the fundamental rules governing biological function. Despite continued technical advances in fluorescent and non-fluorescent labeling as well as data analysis, direct observations of trajectories and interactions of multiple molecules in dense environments remain aspirational goals. While confocal methods provide a means to deduce dynamical parameters with high temporal resolution, such as diffusion coefficients, they do so at the expense of spatial resolution. Indeed, on account of a confocal volume's symmetry, typically only distances from the center of the confocal spot can be deduced. Motivated by the need for true three-dimensional high-speed tracking in densely labeled environments, we propose a computational tool for tracking many fluorescent molecules traversing multiple, closely spaced, confocal measurement volumes providing independent observations. Various realizations of this multiple confocal volumes strategy have previously been used for long-term, large-area tracking of one fluorescent molecule in three dimensions. What is more, we achieve tracking by directly using single photon arrival times to inform our likelihood and exploit Hamiltonian Monte Carlo to efficiently sample trajectories from our posterior within a Bayesian nonparametric paradigm. A nonparametric paradigm here is warranted as the number of molecules present is itself a priori unknown. Taken together, we provide a computational framework to infer trajectories of multiple molecules at once, below the diffraction limit (the width of a confocal spot), in three dimensions at sub-millisecond or faster time scales.
Affiliation(s)
- Sina Jazani
  - Department of Biophysics and Biophysical Chemistry, Johns Hopkins Medicine, Baltimore
  - Center for Biological Physics, Department of Physics, Arizona State University, Tempe
- Lance W Q Xu 徐伟青
  - Center for Biological Physics, Department of Physics, Arizona State University, Tempe
- Ioannis Sgouralis
  - Department of Mathematics, University of Tennessee, Knoxville
  - Center for Biological Physics, Department of Physics, Arizona State University, Tempe
- Douglas P Shepherd
  - Center for Biological Physics, Department of Physics, Arizona State University, Tempe
- Steve Pressé
  - Center for Biological Physics, Department of Physics, Arizona State University, Tempe
  - School of Molecular Sciences, Arizona State University, Tempe
10
Bryan JS, Basak P, Bechhoefer J, Pressé S. Inferring potential landscapes from noisy trajectories of particles within an optical feedback trap. iScience 2022. [PMID: 36034218 PMCID: PMC9400092 DOI: 10.1016/j.isci.2022.104731] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/23/2022] [Revised: 06/27/2022] [Accepted: 07/02/2022] [Indexed: 11/22/2022]
Abstract
While particle trajectories encode information on their governing potentials, potentials can be challenging to robustly extract from trajectories. Measurement errors may corrupt a particle’s position, and sparse sampling of the potential limits data in higher energy regions such as barriers. We develop a Bayesian method to infer potentials from trajectories corrupted by Markovian measurement noise without assuming prior functional form on the potentials. As an alternative to Gaussian process priors over potentials, we introduce structured kernel interpolation to the natural sciences, which allows us to extend our analysis to large datasets. Structured-Kernel-Interpolation Priors for Potential Energy Reconstruction (SKIPPER) is validated on 1D and 2D experimental trajectories for particles in a feedback trap.
- A feedback trap was used to generate noisy Langevin microbead trajectories
- The potential energy surface is recovered using a Bayesian formulation
- The formulation uses a structured-kernel-interpolation Gaussian process (SKI-GP) to tractably approximate Gaussian process regression for larger datasets
- Thanks to our adaptation of SKI-GP, we have broadened the use of Gaussian processes for natural science applications
11
Palstra I, Koenderink AF. A Python Toolbox for Unbiased Statistical Analysis of Fluorescence Intermittency of Multilevel Emitters. J Phys Chem C Nanomater Interfaces 2021; 125:12050-12060. [PMID: 34276862 PMCID: PMC8282189 DOI: 10.1021/acs.jpcc.1c01670] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2021] [Revised: 05/05/2021] [Indexed: 06/13/2023]
Abstract
We report on a Python toolbox for unbiased statistical analysis of fluorescence intermittency properties of single emitters. Intermittency, that is, step-wise temporal variations in the instantaneous emission intensity and fluorescence decay rate properties, is common to organic fluorophores, II-VI quantum dots, and perovskite quantum dots alike. Unbiased statistical analysis of intermittency switching time distributions, involved levels, and lifetimes is important to avoid interpretation artifacts. This work provides an implementation of Bayesian changepoint analysis and level clustering applicable to time-tagged single-photon detection data of single emitters that can be applied to real experimental data and as a tool to verify the ramifications of hypothesized mechanistic intermittency models. We provide a detailed Monte Carlo analysis to illustrate these statistics tools and to benchmark the extent to which conclusions can be drawn on the photophysics of highly complex systems, such as perovskite quantum dots that switch between a plethora of states instead of just two.
Affiliation(s)
- Isabelle M. Palstra
  - Institute of Physics, University of Amsterdam, Science Park 904, 1098 XH Amsterdam, The Netherlands
  - Center for Nanophotonics, AMOLF, Science Park 104, 1098 XG Amsterdam, The Netherlands
- A. Femius Koenderink
  - Center for Nanophotonics, AMOLF, Science Park 104, 1098 XG Amsterdam, The Netherlands
12
Kilic Z, Sgouralis I, Heo W, Ishii K, Tahara T, Pressé S. Extraction of rapid kinetics from smFRET measurements using integrative detectors. Cell Rep Phys Sci 2021; 2:100409. [PMID: 34142102 PMCID: PMC8208598 DOI: 10.1016/j.xcrp.2021.100409] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 05/09/2023]
Abstract
Hidden Markov models (HMMs) are used to learn single-molecule kinetics across a range of experimental techniques. By their construction, HMMs assume that single-molecule events occur on slower timescales than those of data acquisition. To move beyond that HMM limitation and allow for single-molecule events to occur on any timescale, we must treat single-molecule events in continuous time as they occur in nature. We propose a method to learn kinetic rates from single-molecule Förster resonance energy transfer (smFRET) data collected by integrative detectors, even if those rates exceed data acquisition rates. To achieve that, we exploit our recently proposed "hidden Markov jump process" (HMJP), with which we learn transition kinetics from parallel measurements in donor and acceptor channels. HMJPs generalize the HMM paradigm in two critical ways: (1) they deal with physical smFRET systems as they switch between conformational states in continuous time, and (2) they estimate transition rates between conformational states directly without having recourse to transition probabilities or assuming slow dynamics. Our continuous-time treatment learns the transition kinetics and photon emission rates for dynamic regimes that are inaccessible to HMMs, which treat system kinetics in discrete time. We validate our framework's robustness on simulated data and demonstrate its performance on experimental data from FRET-labeled Holliday junctions.
Affiliation(s)
- Zeliha Kilic
  - Center for Biological Physics, Department of Physics, Arizona State University, Tempe, AZ 85287, USA
- Ioannis Sgouralis
  - Department of Mathematics, University of Tennessee, Knoxville, TN 37996, USA
- Wooseok Heo
  - Molecular Spectroscopy Laboratory, RIKEN, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
- Kunihiko Ishii
  - Molecular Spectroscopy Laboratory, RIKEN, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
  - Ultrafast Spectroscopy Research Team, RIKEN Center for Advanced Photonics (RAP), 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
- Tahei Tahara
  - Molecular Spectroscopy Laboratory, RIKEN, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
  - Ultrafast Spectroscopy Research Team, RIKEN Center for Advanced Photonics (RAP), 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
- Steve Pressé
  - Center for Biological Physics, Department of Physics, Arizona State University, Tempe, AZ 85287, USA
  - School of Molecular Sciences, Arizona State University, Tempe, AZ 85287, USA
  - Lead contact
13
Kilic Z, Sgouralis I, Pressé S. Residence time analysis of RNA polymerase transcription dynamics: A Bayesian sticky HMM approach. Biophys J 2021; 120:1665-1679. [PMID: 33705761 DOI: 10.1016/j.bpj.2021.02.045] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/03/2020] [Revised: 02/08/2021] [Accepted: 02/18/2021] [Indexed: 01/09/2023]
Abstract
The time spent by a single RNA polymerase (RNAP) at specific locations along the DNA, termed "residence time," reports on the initiation, elongation, and termination stages of transcription. At the single-molecule level, this information can be obtained from dual ultrastable optical trapping experiments, revealing a transcriptional elongation of RNAP interspersed with residence times of variable duration. Successfully discriminating between long and short residence times was used by previous approaches to learn about RNAP's transcription elongation dynamics. Here, we propose an approach based on the Bayesian sticky hidden Markov model that treats all residence times for an Escherichia coli RNAP on an equal footing without a priori discriminating between long and short residence times. Furthermore, our method has two additional advantages: we provide full distributions around key point statistics and directly treat the sequence dependence of RNAP's elongation rate. By applying our approach to experimental data, we find assigned relative probabilities on long versus short residence times, force-dependent average residence time transcription elongation dynamics, ∼10% drop in the average backtracking durations in the presence of GreB, and ∼20% drop in the average residence time as a function of applied force in the presence of RNaseA.
Affiliation(s)
- Zeliha Kilic
- Center for Biological Physics, Department of Physics, Arizona State University, Tempe, Arizona
| | - Ioannis Sgouralis
- Department of Mathematics, University of Tennessee, Knoxville, Tennessee
| | - Steve Pressé
- Center for Biological Physics, Department of Physics and School of Molecular Sciences, Arizona State University, Tempe, Arizona. spresse@asu.edu
| |
|
14
|
Abstract
Effective forces derived from experimental or in silico molecular dynamics time traces are critical in developing reduced and computationally efficient descriptions of otherwise complex dynamical problems. This helps motivate why it is important to develop methods to efficiently learn effective forces from time series data. A number of methods already exist to do this when data are plentiful but otherwise fail for sparse datasets or datasets where some regions of phase space are undersampled. In addition, any method developed to learn effective forces from time series data should be minimally a priori committal as to the shape of the effective force profile, exploit every data point without reducing data quality through any form of binning or pre-processing, and provide full credible intervals (error bars) about the prediction for the entirety of the effective force curve. Here, we propose a generalization of the Gaussian process, a key tool in Bayesian nonparametric inference and machine learning, which meets all of the above criteria in learning effective forces for the first time.
Affiliation(s)
- J Shepard Bryan
- Center for Biological Physics, Department of Physics, Arizona State University, Tempe, Arizona 85287, USA
| | - Ioannis Sgouralis
- Center for Biological Physics, Department of Physics, Arizona State University, Tempe, Arizona 85287, USA
| | - Steve Pressé
- Center for Biological Physics, Department of Physics, Arizona State University, Tempe, Arizona 85287, USA
| |
|
15
|
Abstract
Conformational memory in single-molecule dynamics has attracted recent attention and, in particular, has been invoked as a possible explanation of some of the intriguing properties of transition paths observed in single-molecule force spectroscopy (SMFS) studies. Here we study one candidate for a non-Markovian model that can account for conformational memory, the generalized Langevin equation with a friction force that depends not only on the instantaneous velocity but also on the velocities in the past. The memory in this model is determined by a time-dependent friction memory kernel. We propose a method for extracting this kernel directly from an experimental signal and illustrate its feasibility by applying it to a generalized Rouse model of a SMFS experiment, where the memory kernel is known exactly. Using the same model, we further study how memory affects various statistical properties of transition paths observed in SMFS experiments and evaluate the performance of recent approximate analytical theories of non-Markovian dynamics of barrier crossing. We argue that the same type of analysis can be applied to recent single-molecule observations of transition paths in protein and DNA folding.
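For orientation, the generalized Langevin equation described in this abstract is commonly written in the following textbook form. This is a sketch under stated assumptions, not necessarily the notation used in the cited paper: the symbols $m$, $U(x)$, $\Gamma(t)$, and $\xi(t)$ are labels chosen here, and the second line is the standard fluctuation-dissipation relation for an equilibrium bath.

```latex
% Generalized Langevin equation: the friction force depends on past velocities
% through the memory kernel \Gamma(t) (all symbol names are assumptions).
m\,\ddot{x}(t) = -\frac{dU(x)}{dx}
                 - \int_{0}^{t} \Gamma(t - t')\,\dot{x}(t')\,dt'
                 + \xi(t),
\qquad
% Fluctuation-dissipation relation linking the random force to the kernel.
\langle \xi(t)\,\xi(t') \rangle = k_{B}T\,\Gamma(|t - t'|).
```

In the memoryless limit $\Gamma(t) = 2\gamma\,\delta(t)$ the integral reduces to $\gamma\,\dot{x}(t)$ and the ordinary (Markovian) Langevin equation is recovered.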
|
16
|
Abstract
In an effort to answer the much-debated question of whether the time evolution of common experimental observables can be described as one-dimensional diffusion in the potential of mean force, we propose a simple criterion that allows one to test whether the Markov assumption is applicable to a single-molecule trajectory x(t). This test does not involve fitting of the data to any presupposed model and can be applied to experimental data with relatively low temporal resolution.
Affiliation(s)
- Alexander M. Berezhkovskii
- Mathematical and Statistical Computing Laboratory, Office of Intramural Research, Center for Information Technology, National Institutes of Health, Bethesda, Maryland 20892, United States
| | - Dmitrii E. Makarov
- Department of Chemistry, University of Texas at Austin, Austin, Texas 78712, United States
- Institute for Computational Engineering and Sciences, University of Texas at Austin, Austin, Texas 78712, United States
| |
|
17
|
Abstract
Gene networks with feedback often involve interactions between multiple species of biomolecules, many more than experiments can actually monitor. Coupled with this is the challenge that experiments often measure gene expression in noisy fluorescence instead of protein numbers. How do we infer biophysical information and characterize the underlying circuits from this limited and convoluted data? We address this by building stochastic models using the principle of Maximum Caliber (MaxCal). MaxCal uses the basic information on synthesis, degradation, and feedback, without invoking any other auxiliary species or ad hoc reactions, to generate stochastic trajectories similar to those typically measured in experiments. MaxCal in conjunction with Maximum Likelihood (ML) can infer parameters of the model using fluctuating trajectories of protein expression over time. We demonstrate the success of the MaxCal + ML methodology using synthetic data generated from known circuits of different genetic switches: (i) a single-gene autoactivating circuit involving five species (including mRNA), (ii) a mutually repressing two-gene circuit (toggle switch) with seven species (including mRNA) considering stochastic time traces of two proteins, and (iii) the same toggle switch circuit considering stochastic time traces of only one of the two proteins. To further challenge the MaxCal + ML inference scheme, we repeat our analysis for the second and third scenario with traces expressed in noisy fluorescence instead of protein number to closely mimic typical experiments. We show that, for all of these models with increasing complexity and obfuscation, the minimal model of MaxCal is still able to capture the fluctuations of the trajectory and infer basic underlying rate parameters when benchmarked against the known values used to generate the synthetic data. Importantly, the model also yields an effective feedback parameter that can be used to quantify interactions within these circuits. These applications show the promise of MaxCal's ability to characterize circuits with limited data, and its utility to better understand evolution and advance design strategies for specific functions.
Affiliation(s)
- Taylor Firman
- Molecular and Cellular Biophysics, University of Denver, Denver, Colorado 80209, United States
| | - Stephen Wedekind
- Department of Physics and Astronomy, University of Denver, Denver, Colorado 80209, United States
| | - T J McMorrow
- Department of Physics and Astronomy, University of Denver, Denver, Colorado 80209, United States
| | - Kingshuk Ghosh
- Department of Physics and Astronomy, University of Denver, Denver, Colorado 80209, United States
| |
|
18
|
Firman T, Balázsi G, Ghosh K. Building Predictive Models of Genetic Circuits Using the Principle of Maximum Caliber. Biophys J 2017; 113:2121-2130. [PMID: 29117534 DOI: 10.1016/j.bpj.2017.08.057] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 05/30/2017] [Revised: 08/25/2017] [Accepted: 08/31/2017] [Indexed: 11/17/2022] Open
Abstract
Learning the underlying details of a gene network is a major challenge in cellular and synthetic biology. We address this challenge by building a chemical kinetic model that utilizes information encoded in the stochastic protein expression trajectories typically measured in experiments. The applicability of the proposed method is demonstrated in an auto-activating genetic circuit, a common motif in natural and synthetic gene networks. Our approach is based on the principle of maximum caliber (MaxCal), a dynamical analog of the principle of maximum entropy, and builds a minimal model using only three constraints: 1) protein synthesis, 2) protein degradation, and 3) positive feedback. The MaxCal-generated model (described with four parameters) was benchmarked against synthetic data generated using a Gillespie algorithm on a known reaction network (with seven parameters). MaxCal accurately predicts underlying rate parameters of protein synthesis and degradation as well as experimental observables such as protein number and dwell-time distributions. Furthermore, MaxCal yields an effective feedback parameter that can be useful for circuit design. We also extend our methodology and demonstrate how to analyze trajectories that are not in protein numbers but in arbitrary fluorescence units, a more typical condition in experiments. This "top-down" methodology based on minimal information (in contrast to traditional "bottom-up" approaches that require ad hoc knowledge of circuit details) provides a powerful tool to accurately infer underlying details of feedback circuits that are not otherwise visible in experiments and to help guide circuit design.
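The abstract above mentions synthetic benchmark data generated with a Gillespie algorithm. As a rough illustration of that kind of simulation, the sketch below runs Gillespie's direct method on a deliberately simplified two-reaction network (constant synthesis, first-order degradation). It is not the seven-parameter network or the MaxCal model from the paper: the function name `gillespie_birth_death`, the rate names `k_syn` and `k_deg`, and all numerical values are hypothetical, and the positive feedback step is omitted.

```python
import random


def gillespie_birth_death(k_syn, k_deg, n0, t_end, seed=0):
    """Gillespie direct method for a toy protein birth-death process.

    Reactions (hypothetical, simplified network without feedback):
        synthesis:   0 -> P   with propensity k_syn
        degradation: P -> 0   with propensity k_deg * n
    Returns the event times and the protein count after each event.
    """
    rng = random.Random(seed)  # local generator so runs are reproducible
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        a_syn = k_syn          # propensity of synthesis
        a_deg = k_deg * n      # propensity of degradation
        a_total = a_syn + a_deg
        if a_total <= 0.0:
            break              # no reaction can fire any more
        # Waiting time to the next event is exponential with rate a_total.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Choose which reaction fires, in proportion to its propensity.
        if rng.random() * a_total < a_syn:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts


# Example run with made-up rates; the long-time mean count is k_syn / k_deg.
times, counts = gillespie_birth_death(k_syn=10.0, k_deg=0.1, n0=0, t_end=200.0)
```

Trajectories like `counts` are the kind of stochastic protein-number time trace that an inference scheme would take as input; adding feedback would make the synthesis propensity depend on `n`.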
Affiliation(s)
- Taylor Firman
- Department of Physics and Astronomy, Molecular and Cellular Biophysics, University of Denver, Denver, Colorado
| | - Gábor Balázsi
- The Louis and Beatrice Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York; Department of Biomedical Engineering, Stony Brook University, Stony Brook, New York
| | - Kingshuk Ghosh
- Department of Physics and Astronomy, Molecular and Cellular Biophysics, University of Denver, Denver, Colorado.
| |
|
19
|
Affiliation(s)
- Meysam Tavakoli
- Physics Department; Indiana University-Purdue University Indianapolis; Indianapolis IN 46202 USA
| | - J. Nicholas Taylor
- Research Institute for Electronic Science; Hokkaido University; Kita 20 Nishi 10 Kita-Ku Sapporo 001-0020 Japan
| | - Chun-Biu Li
- Research Institute for Electronic Science; Hokkaido University; Kita 20 Nishi 10 Kita-Ku Sapporo 001-0020 Japan
- Department of Mathematics; Stockholm University; 106 91 Stockholm Sweden
| | - Tamiki Komatsuzaki
- Research Institute for Electronic Science; Hokkaido University; Kita 20 Nishi 10 Kita-Ku Sapporo 001-0020 Japan
| | - Steve Pressé
- Physics Department; Indiana University-Purdue University Indianapolis; Indianapolis IN 46202 USA
- Department of Chemistry and Chemical Biology; Indiana University-Purdue University Indianapolis; Indianapolis IN 46202 USA
- Department of Cell and Integrative Physiology; Indiana University School of Medicine; Indianapolis IN 46202 USA
- Department of Physics and School of Molecular Sciences; Arizona State University; Tempe AZ 85287 USA
| |
|
20
|
Abstract
Super-resolution microscopy provides direct insight into fundamental biological processes occurring at length scales smaller than light's diffraction limit. The analysis of data at such scales has brought statistical and machine learning methods into the mainstream. Here we provide a survey of data analysis methods starting from an overview of basic statistical techniques underlying the analysis of super-resolution and, more broadly, imaging data. We subsequently break down the analysis of super-resolution data into four problems: the localization problem, the counting problem, the linking problem, and what we've termed the interpretation problem.
Affiliation(s)
- Antony Lee
- Department of Physics, University of California at Berkeley, Berkeley, California 94720, United States
- Jason L. Choy Laboratory of Single-Molecule Biophysics, University of California at Berkeley, Berkeley, California 94720, United States
| | - Konstantinos Tsekouras
- Department of Physics, University of California at Berkeley, Berkeley, California 94720, United States
- Department of Physics, Arizona State University, Tempe, Arizona 85287, United States
| | | | - Carlos Bustamante
- Jason L. Choy Laboratory of Single-Molecule Biophysics, University of California at Berkeley, Berkeley, California 94720, United States
- Biophysics Graduate Group, University of California at Berkeley, Berkeley, California 94720, United States
- Institute for Quantitative Biosciences-QB3, University of California at Berkeley, Berkeley, California 94720, United States
- Department of Molecular and Cell Biology, University of California at Berkeley, Berkeley, California 94720, United States
- Department of Chemistry, University of California at Berkeley, Berkeley, California 94720, United States
- Howard Hughes Medical Institute, University of California at Berkeley, Berkeley, California 94720, United States
- Kavli Energy Nanosciences Institute, University of California at Berkeley, Berkeley, California 94720, United States
| | - Steve Pressé
- Department of Physics, University of California at Berkeley, Berkeley, California 94720, United States
- Department of Chemistry and Chemical Biology, Indiana University–Purdue University Indianapolis, Indianapolis, Indiana 46202, United States
- Department of Cell and Integrative Physiology, Indiana University School of Medicine, Indianapolis, Indiana 46202, United States
- School of Molecular Sciences, Arizona State University, Tempe, Arizona 85287, United States
- Department of Physics, Arizona State University, Tempe, Arizona 85287, United States
| |
|
21
|
Xu H, Plaut B, Zhu X, Chen M, Mavinkurve U, Maiti A, Song G, Murari K, Mandal M. Direct Observation of Folding Energy Landscape of RNA Hairpin at Mechanical Loading Rates. J Phys Chem B 2017; 121:2220-2229. [PMID: 28248503 DOI: 10.1021/acs.jpcb.6b10362] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
By applying a controlled mechanical load using optical tweezers, we measured the diffusive barrier crossing in a 49 nt long P5ab RNA hairpin. We find that in the free-energy landscape the barrier height (ΔG‡) and transition distance (x‡) are dependent on the loading rate (r) along the pulling direction, x, as predicted by Bell. The barrier shifted toward the initial state, whereas ΔG‡ reduced significantly from 50 to 5 kT, as r increased from 0 to 32 pN/s. However, the equilibrium work (ΔG) during strand separation, as estimated by the Crooks fluctuation theorem, remained unchanged at different rates. Previously, helix formation and denaturation have been described as two-state (F ↔ U) transitions for P5ab. Herein, we report three intermediate states I1, I, and I2 located at 4, 11, and 16 nm, respectively, from the folded conformation. The intermediates were observed only when the hairpin was subjected to an optimal r, 7.6 pN/s. The results indicate that the complementary strands in P5ab can zip and unzip through complex routes, whereby mismatches act as checkpoints and often impose barriers. The study highlights the significance of loading rates in force-spectroscopy experiments that are increasingly being used to measure the folding properties of biomolecules.
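For readers unfamiliar with the two results named in this abstract, their standard textbook forms are given below. This is a reference sketch only, and the notation is an assumption: $k_0$ is the zero-force rate, $F$ the applied force, $W$ the work done on the molecule, and $P_F$, $P_R$ the work distributions for the forward (unfolding) and reverse (refolding) protocols. It does not reproduce the specific analysis of the cited paper.

```latex
% Bell model: the force-dependent rate grows exponentially with F x^{\ddagger}.
k(F) = k_{0}\,\exp\!\left(\frac{F\,x^{\ddagger}}{k_{B}T}\right)

% Crooks fluctuation theorem: the forward and reverse work distributions
% cross at W = \Delta G, which is how the equilibrium work can be estimated.
\frac{P_{F}(W)}{P_{R}(-W)} = \exp\!\left(\frac{W - \Delta G}{k_{B}T}\right)
```

Because the Crooks relation holds for any pulling protocol, an estimate of ΔG that stays constant across loading rates is the expected behavior.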
Affiliation(s)
- Huizhong Xu
- Department of Physics, Department of Mathematical Sciences, Department of Computer Science, and Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
| | - Benjamin Plaut
- Department of Physics, Department of Mathematical Sciences, Department of Computer Science, and Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
| | - Xiran Zhu
- Department of Physics, Department of Mathematical Sciences, Department of Computer Science, and Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
| | - Maverick Chen
- Department of Physics, Department of Mathematical Sciences, Department of Computer Science, and Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
| | - Udit Mavinkurve
- Department of Physics, Department of Mathematical Sciences, Department of Computer Science, and Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
| | - Anindita Maiti
- Department of Physics, Department of Mathematical Sciences, Department of Computer Science, and Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
| | - Guangtao Song
- Department of Physics, Department of Mathematical Sciences, Department of Computer Science, and Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
| | - Krishna Murari
- Department of Physics, Department of Mathematical Sciences, Department of Computer Science, and Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
| | - Maumita Mandal
- Department of Physics, Department of Mathematical Sciences, Department of Computer Science, and Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
| |
|
22
|
|
23
|
Colomb W, Sarkar SK. Extracting physics of life at the molecular level: A review of single-molecule data analyses. Phys Life Rev 2015; 13:107-37. [PMID: 25660417 DOI: 10.1016/j.plrev.2015.01.017] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Key Words] [Received: 01/08/2015] [Accepted: 01/09/2015] [Indexed: 12/31/2022]
Abstract
Studying individual biomolecules at the single-molecule level has proved very insightful recently. Single-molecule experiments allow us to probe both the equilibrium and nonequilibrium properties as well as make quantitative connections with ensemble experiments and equilibrium thermodynamics. However, it is important to be careful about the analysis of single-molecule data because of the noise present and the lack of a theoretical framework for processes far away from equilibrium. Biomolecular motion, whether it is free in solution, on a substrate, or under force, involves thermal fluctuations in varying degrees, which makes the motion noisy. In addition, the noise from the experimental setup makes it even more complex. The details of biologically relevant interactions, conformational dynamics, and activities are hidden in the noisy single-molecule data. As such, extracting biological insights from noisy data is still an active area of research. In this review, we will focus on analyzing both fluorescence-based and force-based single-molecule experiments and gaining biological insights at the single-molecule level. The inherently nonequilibrium nature of biological processes will be highlighted. Simulated trajectories of biomolecular diffusion will be used to compare and validate various analysis techniques.
Affiliation(s)
- Warren Colomb
- Department of Physics, Colorado School of Mines, Golden, CO 80401, United States
| | - Susanta K Sarkar
- Department of Physics, Colorado School of Mines, Golden, CO 80401, United States.
| |
|