1
Mehnert SA, Davidson JT, Adeoye A, Lowe BD, Ruiz EA, King JR, Jackson GP. Expert Algorithm for Substance Identification Using Mass Spectrometry: Application to the Identification of Cocaine on Different Instruments Using Binary Classification Models. J Am Soc Mass Spectrom 2023. PMID: 37254938. DOI: 10.1021/jasms.3c00090.
Abstract
This is the second of two manuscripts describing how general linear modeling (GLM) of a selection of the most abundant normalized fragment ion abundances of replicate mass spectra from one laboratory can be used in conjunction with binary classifiers to enable specific and selective identifications with reportable error rates of spectra from other laboratories. Here, the proof-of-concept uses a training set of 128 replicate cocaine spectra from one crime laboratory as the basis of GLM modeling. GLM models for the 20 most abundant fragments of cocaine were then applied to 175 additional test/validation cocaine spectra collected in more than a dozen crime laboratories and 716 known negative spectra, which included 10 spectra of three diastereomers of cocaine. Spectral similarity and dissimilarity between the measured and predicted abundances were assessed using a variety of conventional measures, including the mean absolute residual and NIST's spectral similarity score. For each spectral measure, GLM predictions were compared to the traditional exemplar approach, which used the average of the cocaine training set as the consensus spectrum for comparisons. In unsupervised models, EASI provided better than a 95% true positive rate for cocaine with a 0% false positive rate. A supervised binary logistic regression model provided 100% accuracy and no errors using EASI-predicted abundances of only four peaks at m/z 152, 198, 272, and 303. Regardless of the measure of spectral similarity, error rates for identifications using EASI were superior to the traditional exemplar/consensus approach. As a supervised binary classifier, EASI was more reliable than using Mahalanobis distances.
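To make the comparison measures named above concrete, here is a minimal Python sketch (not the authors' code; function names are illustrative, and this is only one common variant of the NIST weighting) of two of them: a NIST-style spectral match factor and the mean absolute residual between measured and predicted abundances.

```python
import math

def nist_similarity(measured, predicted):
    """NIST-style spectral match factor: cosine similarity of
    square-root-weighted intensity vectors, scaled to 0-999."""
    a = [math.sqrt(v) for v in measured]
    b = [math.sqrt(v) for v in predicted]
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 999.0 * dot / norm

def mean_absolute_residual(measured, predicted):
    """Mean absolute difference between measured and predicted abundances."""
    return sum(abs(x - y) for x, y in zip(measured, predicted)) / len(measured)
```

An identical pair of spectra scores 999; dissimilar spectra score lower, and the residual grows with the disagreement.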
Affiliation(s)
- Samantha A Mehnert
- Department of Forensic and Investigative Science, West Virginia University, Morgantown, West Virginia 26506, United States
- C. Eugene Bennett Department of Chemistry, West Virginia University, Morgantown, West Virginia 26506, United States
- J Tyler Davidson
- Department of Forensic and Investigative Science, West Virginia University, Morgantown, West Virginia 26506, United States
- Alexandra Adeoye
- Department of Forensic and Investigative Science, West Virginia University, Morgantown, West Virginia 26506, United States
- Brandon D Lowe
- C. Eugene Bennett Department of Chemistry, West Virginia University, Morgantown, West Virginia 26506, United States
- Emily A Ruiz
- C. Eugene Bennett Department of Chemistry, West Virginia University, Morgantown, West Virginia 26506, United States
- Jacob R King
- C. Eugene Bennett Department of Chemistry, West Virginia University, Morgantown, West Virginia 26506, United States
- Glen P Jackson
- Department of Forensic and Investigative Science, West Virginia University, Morgantown, West Virginia 26506, United States
- C. Eugene Bennett Department of Chemistry, West Virginia University, Morgantown, West Virginia 26506, United States
2
Jackson GP, Mehnert SA, Davidson JT, Lowe BD, Ruiz EA, King JR. Expert Algorithm for Substance Identification Using Mass Spectrometry: Statistical Foundations in Unimolecular Reaction Rate Theory. J Am Soc Mass Spectrom 2023. PMID: 37255332. DOI: 10.1021/jasms.3c00089.
Abstract
This study aims to resolve one of the longest-standing problems in mass spectrometry: how to accurately identify an organic substance from its mass spectrum when a spectrum of the suspected substance has not been analyzed contemporaneously on the same instrument. Part one of this two-part report describes how Rice-Ramsperger-Kassel-Marcus (RRKM) theory predicts that many branching ratios in replicate electron-ionization mass spectra will provide approximately linear correlations when analysis conditions change within or between instruments. Here, proof-of-concept general linear modeling is based on the 20 most abundant fragments in a database of 128 training spectra of cocaine collected over 6 months in an operational crime laboratory. The statistical validity of the approach is confirmed through both analysis of variance (ANOVA) of the regression models and assessment of the distributions of the models' residuals. The general linear models typically explain more than 90% of the variance in normalized abundances. When the linear models from the training set are applied to 175 additional known positive cocaine spectra from more than 20 different laboratories, they enable ion abundances to be predicted with an accuracy of <2% relative to the base peak, even though the measured abundances vary by more than 30%. The same models were also applied to 716 known negative spectra, including the diastereomers of cocaine (allococaine, pseudococaine, and pseudoallococaine); the residual errors were larger for the known negatives than for the known positives. The second part of the manuscript describes how general linear regression modeling can serve as the basis for binary classification and reliable identification of cocaine from its diastereomers and all other known negatives.
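The per-fragment linear models can be illustrated with an ordinary least-squares sketch (illustrative only; the paper's GLM machinery and diagnostics are richer than this): one model per fragment, relating its normalized abundance across replicates to a reference abundance.

```python
def fit_linear(x, y):
    """Ordinary least-squares fit of y ~ a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def r_squared(x, y, a, b):
    """Fraction of variance in y explained by the fitted line."""
    my = sum(y) / len(y)
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot
```

On training data, an R-squared above 0.9 would correspond to the ">90% of variance explained" claim above.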
Affiliation(s)
- Glen P Jackson
- Department of Forensic and Investigative Science, West Virginia University, Morgantown, West Virginia 26506, United States
- C. Eugene Bennett Department of Chemistry, West Virginia University, Morgantown, West Virginia 26506, United States
- Samantha A Mehnert
- Department of Forensic and Investigative Science, West Virginia University, Morgantown, West Virginia 26506, United States
- C. Eugene Bennett Department of Chemistry, West Virginia University, Morgantown, West Virginia 26506, United States
- J Tyler Davidson
- Department of Forensic and Investigative Science, West Virginia University, Morgantown, West Virginia 26506, United States
- Brandon D Lowe
- C. Eugene Bennett Department of Chemistry, West Virginia University, Morgantown, West Virginia 26506, United States
- Emily A Ruiz
- C. Eugene Bennett Department of Chemistry, West Virginia University, Morgantown, West Virginia 26506, United States
- Jacob R King
- C. Eugene Bennett Department of Chemistry, West Virginia University, Morgantown, West Virginia 26506, United States
3
Stieve BJ, Richner TJ, Krook-Magnuson C, Netoff TI, Krook-Magnuson E. Optimization of closed-loop electrical stimulation enables robust cerebellar-directed seizure control. Brain 2023; 146:91-108. PMID: 35136942. DOI: 10.1093/brain/awac051.
Abstract
Additional treatment options for temporal lobe epilepsy are needed, and potential interventions targeting the cerebellum are of interest. Previous animal work has shown strong inhibition of hippocampal seizures through on-demand optogenetic manipulation of the cerebellum. However, decades of work examining electrical stimulation-a more immediately translatable approach-targeting the cerebellum has produced very mixed results. We were therefore interested in exploring the impact that stimulation parameters may have on seizure outcomes. Using a mouse model of temporal lobe epilepsy, we conducted on-demand electrical stimulation of the cerebellar cortex, and varied stimulation charge, frequency and pulse width, resulting in over 1000 different potential combinations of settings. To explore this parameter space in an efficient, data-driven, manner, we utilized Bayesian optimization with Gaussian process regression, implemented in MATLAB with an Expected Improvement Plus acquisition function. We examined three different fitting conditions and two different electrode orientations. Following the optimization process, we conducted additional on-demand experiments to test the effectiveness of selected settings. Regardless of experimental setup, we found that Bayesian optimization allowed identification of effective intervention settings. Additionally, generally similar optimal settings were identified across animals, suggesting that personalized optimization may not always be necessary. While optimal settings were effective, stimulation with settings predicted from the Gaussian process regression to be ineffective failed to provide seizure control. Taken together, our results provide a blueprint for exploration of a large parameter space for seizure control and illustrate that robust inhibition of seizures can be achieved with electrical stimulation of the cerebellum, but only if the correct stimulation parameters are used.
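The acquisition step at the heart of the optimization described above can be sketched as follows. This is a generic Expected Improvement calculation (the study used MATLAB's Expected Improvement Plus variant, which additionally penalizes over-exploitation); all names are illustrative.

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def expected_improvement(mu, sigma, best, xi=0.01):
    """Expected Improvement for minimization: the expected amount by which
    a candidate with Gaussian-process posterior mean mu and std sigma
    improves on the current best observed value."""
    if sigma == 0:
        return 0.0
    z = (best - mu - xi) / sigma
    return (best - mu - xi) * norm_cdf(z) + sigma * norm_pdf(z)
```

The optimizer evaluates this at candidate stimulation settings and tests the setting with the highest expected improvement next, balancing exploration (large sigma) against exploitation (low mu).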
Affiliation(s)
- Bethany J Stieve
- Graduate Program in Neuroscience, University of Minnesota, Minneapolis 55455, USA
- Thomas J Richner
- Department of Biomedical Engineering, University of Minnesota, Minneapolis 55455, USA; Department of Neuroscience, University of Minnesota, Minneapolis 55455, USA
- Theoden I Netoff
- Graduate Program in Neuroscience, University of Minnesota, Minneapolis 55455, USA; Department of Biomedical Engineering, University of Minnesota, Minneapolis 55455, USA
- Esther Krook-Magnuson
- Graduate Program in Neuroscience, University of Minnesota, Minneapolis 55455, USA; Department of Neuroscience, University of Minnesota, Minneapolis 55455, USA
4
Aranyi SC, Nagy M, Opposits G, Berényi E, Emri M. Characterizing Network Search Algorithms Developed for Dynamic Causal Modeling. Front Neuroinform 2021; 15:656486. PMID: 34177506. PMCID: PMC8222613. DOI: 10.3389/fninf.2021.656486.
Abstract
Dynamic causal modeling (DCM) is a widely used tool to estimate the effective connectivity of specified models of a brain network. Finding the model explaining measured data is one of the most important outstanding problems in Bayesian modeling. Using heuristic model search algorithms enables us to find an optimal model without having to define a model set a priori. However, the development of such methods is cumbersome in the case of large model-spaces. We aimed to utilize commonly used graph theoretical search algorithms for DCM to create a framework for characterizing them, and to investigate the relevance of such methods for single-subject and group-level studies. Because of the enormous computational demand of DCM calculations, we separated the model estimation procedure from the search algorithm by providing a database containing the parameters of all models in a full model-space. A publicly available fMRI dataset of 60 subjects was used as test data. First, we reimplemented the deterministic bilinear DCM algorithm in the ReDCM R package, increasing computational speed during model estimation. Then, three network search algorithms were adapted for DCM, and we demonstrated how modifications to these methods, based on DCM posterior parameter estimates, can enhance search performance. Comparison of the results is based on model evidence, structural similarities, and the number of model estimations needed during the search. An analytical approach using Bayesian model reduction (BMR) for efficient network discovery is already available for DCM. Comparing model search methods, we found that topological algorithms often outperform analytical methods for single-subject analysis and achieve similar results for recovering common network properties of the winning model family, or set of models, obtained by multi-subject family-wise analysis. However, network search methods show their limitations in the higher-level statistical analysis of parametric empirical Bayes; for optimizing such linear modeling schemes, the BMR methods are still considered the recommended approach. We envision the freely available database of estimated model-spaces to help further studies of the DCM model-space, and the ReDCM package to be a useful contribution to Bayesian inference within and beyond the field of neuroscience.
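As a rough illustration of the topological search methods discussed above, here is a greedy hill-climb over binary edge vectors (a simplified stand-in: the `score` callback plays the role of DCM model evidence, and all names are hypothetical, not from the ReDCM package).

```python
def greedy_search(score, n_edges, start=None):
    """Greedy hill-climbing over binary edge vectors: repeatedly flip the
    single edge whose toggle most improves the model score, stopping at a
    local optimum.  Returns (best model, its score, #score evaluations)."""
    current = list(start) if start else [0] * n_edges
    best = score(current)
    n_eval = 1
    improved = True
    while improved:
        improved = False
        for i in range(n_edges):
            cand = current[:]
            cand[i] ^= 1          # toggle one edge on/off
            s = score(cand)
            n_eval += 1
            if s > best:
                best, current, improved = s, cand, True
    return current, best, n_eval
```

The evaluation count is the quantity the paper compares across search strategies: a good heuristic reaches a high-evidence model with far fewer estimations than an exhaustive sweep of the model-space.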
Affiliation(s)
- Sándor Csaba Aranyi
- Division of Nuclear Medicine and Translational Imaging, Department of Medical Imaging, Faculty of Medicine, University of Debrecen, Debrecen, Hungary
- Marianna Nagy
- Division of Radiology and Imaging Science, Department of Medical Imaging, Faculty of Medicine, University of Debrecen, Debrecen, Hungary
- Gábor Opposits
- Division of Nuclear Medicine and Translational Imaging, Department of Medical Imaging, Faculty of Medicine, University of Debrecen, Debrecen, Hungary
- Ervin Berényi
- Division of Radiology and Imaging Science, Department of Medical Imaging, Faculty of Medicine, University of Debrecen, Debrecen, Hungary
- Miklós Emri
- Division of Nuclear Medicine and Translational Imaging, Department of Medical Imaging, Faculty of Medicine, University of Debrecen, Debrecen, Hungary
5
Abstract
Many spectra have a polynomial-like baseline, and iterative polynomial fitting is one of the most popular methods for baseline correction of these spectra. However, the baseline estimated by iterative polynomial fitting may have a substantial error when the spectrum contains significantly strong peaks or has strong peaks located at its endpoints. First, iterative polynomial fitting uses a temporary baseline estimated from the current spectrum to identify peak data points. If the current spectrum contains strong peaks, then the temporary baseline deviates substantially from the true baseline, so some good baseline data points of the spectrum might be mistakenly identified as peak data points and artificially re-assigned a low value. Second, if a strong peak is located at an endpoint of the spectrum, then the endpoint region of the estimated baseline might have a significant error due to overfitting. This study proposes a search algorithm-based baseline correction method (SA) that compresses the raw spectrum into a dataset with a small number of data points and then converts the peak removal process into a search problem in artificial intelligence: minimizing an objective function by deleting peak data points. First, the raw spectrum is smoothed by the moving average method to reduce noise and then divided into dozens of unequally spaced sections on the basis of Chebyshev nodes. Finally, the minimal points of each section are collected to form a dataset for peak removal through the search algorithm. SA uses the mean absolute error as the objective function because of its sensitivity to overfitting and its rapid calculation. The baseline correction performance of SA is compared with those of three other baseline correction methods: the Lieber and Mahadevan-Jansen method, the adaptive iteratively reweighted penalized least squares method, and the improved asymmetric least squares method. Simulated and real Fourier transform infrared and Raman spectra with polynomial-like baselines are employed in the experiments. Results show that for these spectra the baseline estimated by SA has smaller error than those estimated by the three other methods.
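The compression step described above can be sketched in Python (illustrative only, not the published implementation; the exact node mapping and section handling in the paper may differ).

```python
import math

def moving_average(y, w=5):
    """Smooth a spectrum with a simple centered moving average."""
    half = w // 2
    out = []
    for i in range(len(y)):
        seg = y[max(0, i - half): i + half + 1]
        out.append(sum(seg) / len(seg))
    return out

def chebyshev_edges(n_points, n_sections):
    """Section boundaries clustered toward both ends of the index range
    [0, n_points-1], obtained by mapping Chebyshev nodes onto it."""
    return [int(round(0.5 * (n_points - 1) * (1 - math.cos(math.pi * k / n_sections))))
            for k in range(n_sections + 1)]

def section_minima(y, edges):
    """Collect each section's minimum point as a baseline candidate."""
    pts = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        j = min(range(lo, hi + 1), key=lambda k: y[k])
        pts.append((j, y[j]))
    return pts
```

The resulting small set of (index, intensity) candidates is what the search algorithm then prunes, deleting points that look like peaks until the mean-absolute-error objective is minimized.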
Affiliation(s)
- Xin Wang
- College of Mechanical Engineering and Applied Electronics Technology, Beijing University of Technology, Beijing, China
- Xia Chen
- Key Laboratory of Enhanced Heat Transfer and Energy Conservation, Ministry of Education, College of Environmental and Energy Engineering, Beijing University of Technology, Beijing, China
- Key Laboratory of Heat Transfer and Energy Conversion, Beijing Municipality, College of Environmental and Energy Engineering, Beijing University of Technology, Beijing, China
6
Abstract
While animals track or search for targets, their sensory organs make small, unexplained movements on top of the primary task-related motions. Multiple theories for these movements exist (that they support infotaxis, gain adaptation, spectral whitening, or high-pass filtering), but the trajectories they predict fit measured trajectories poorly. We propose a new theory for these movements called energy-constrained proportional betting, in which the probability of moving to a location is proportional to an expectation of how informative that location will be, balanced against the movement's predicted energetic cost. Trajectories generated in this way show good agreement with measured trajectories of fish tracking an object using electrosense, a mammal and an insect localizing an odor source, and a moth tracking a flower using vision. Our theory unifies the metabolic cost of motion with information theory. It predicts sense-organ movements in animals and can prescribe sensor motion for robots to enhance performance.
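One way to express the proposed trade-off (a hedged sketch, not the paper's exact formulation, which is grounded in ergodic information harvesting) is a softmax over expected information gain minus a weighted energy cost for each candidate sensor position:

```python
import math

def proportional_betting(info_gain, energy_cost, beta=1.0):
    """Probability of moving to each candidate location, proportional to
    expected information gain discounted by beta-weighted movement cost.
    Computed as a numerically stable softmax over the net utilities."""
    utility = [g - beta * c for g, c in zip(info_gain, energy_cost)]
    m = max(utility)
    weights = [math.exp(u - m) for u in utility]
    z = sum(weights)
    return [w / z for w in weights]
```

With beta = 0 the sensor bets purely on information; raising beta increasingly suppresses energetically expensive excursions, which is the constraint the theory's name refers to.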
Affiliation(s)
- Chen Chen
- Center for Robotics and Biosystems, Northwestern University, Evanston, United States
- Department of Biomedical Engineering, Northwestern University, Evanston, United States
- Todd D Murphey
- Center for Robotics and Biosystems, Northwestern University, Evanston, United States
- Department of Mechanical Engineering, Northwestern University, Evanston, United States
- Malcolm A MacIver
- Center for Robotics and Biosystems, Northwestern University, Evanston, United States
- Department of Biomedical Engineering, Northwestern University, Evanston, United States
- Department of Mechanical Engineering, Northwestern University, Evanston, United States
- Department of Neurobiology, Northwestern University, Evanston, United States
7
Abstract
Count responses with grouping and right censoring have long been used in surveys to study a variety of behaviors, status, and attitudes. Yet grouping or right-censoring decisions of count responses still rely on arbitrary choices made by researchers. We develop a new method for evaluating grouping and right-censoring decisions of count responses from a (semisupervised) machine-learning perspective. This article uses Poisson multinomial mixture models to conceptualize the data-generating process of count responses with grouping and right censoring and demonstrates the link between grouping-scheme choices and asymptotic distributions of the Poisson mixture. To search for the optimal grouping scheme maximizing objective functions of the Fisher information (matrix), an innovative three-step M algorithm is then proposed to process infinitely many grouping schemes based on Bayesian A-, D-, and E-optimalities. A new R package is developed to implement this algorithm and evaluate grouping schemes of count responses. Results show that an optimal grouping scheme not only leads to a more efficient sampling design but also outperforms a nonoptimal one even if the latter has more groups.
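The Fisher-information objective for grouped, right-censored counts can be sketched for the one-parameter Poisson case (illustrative only; the paper treats Poisson multinomial mixtures and Bayesian A-, D-, and E-optimalities, and all function names here are hypothetical):

```python
import math

def poisson_pmfs(lam, kmax):
    """Poisson pmf values p(0..kmax), computed recursively for stability."""
    p = [math.exp(-lam)]
    for k in range(1, kmax + 1):
        p.append(p[-1] * lam / k)
    return p

def grouped_fisher_info(lam, cutpoints, kmax=200):
    """Fisher information about lam carried by a grouped Poisson count.
    cutpoints are the upper bounds of each group; everything above the
    last cutpoint forms a right-censored tail group.  Uses the identity
    d/dlam p_k = p_{k-1} - p_k."""
    pmf = poisson_pmfs(lam, kmax)
    dpmf = [(pmf[k - 1] if k else 0.0) - pmf[k] for k in range(kmax + 1)]
    info, lo = 0.0, 0
    for hi in cutpoints:
        p, dp = sum(pmf[lo:hi + 1]), sum(dpmf[lo:hi + 1])
        if p > 0:
            info += dp * dp / p
        lo = hi + 1
    p_tail, dp_tail = sum(pmf[lo:]), sum(dpmf[lo:])
    if p_tail > 0:
        info += dp_tail * dp_tail / p_tail
    return info
```

Coarsening the grouping can only discard information, so a search over grouping schemes amounts to maximizing this quantity (or a determinant-based analogue for multi-parameter models) subject to a budget on the number of groups; ungrouped counts recover the full Poisson information 1/lam.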
Affiliation(s)
- Qiang Fu
- Department of Sociology, The University of British Columbia, Vancouver, British Columbia, Canada, V6T 1Z1
- Xin Guo
- Department of Applied Mathematics, The Hong Kong Polytechnic University, Hong Kong
- Kenneth C. Land
- John Franklin Crowell Professor Emeritus, Department of Sociology, Social Science Research Institute, Duke University, Durham, NC, USA
8
Abstract
The exemplary search capabilities of flying insects have established them as one of the most diverse taxa on Earth. However, we still lack the fundamental ability to quantify, represent, and predict trajectories under natural contexts to understand search and its applications. For example, flying insects have evolved in complex multimodal three-dimensional (3D) environments, but we do not yet understand which features of the natural world are used to locate distant objects. Here, we independently and dynamically manipulate 3D objects, airflow fields, and odor plumes in virtual reality over large spatial and temporal scales. We demonstrate that flies make use of features such as foreground segmentation, perspective, motion parallax, and integration of multiple modalities to navigate to objects in a complex 3D landscape while in flight. We first show that tethered flying insects of multiple species navigate to virtual 3D objects. Using the apple fly Rhagoletis pomonella, we then measure their reactive distance to objects and show that these flies use perspective and local parallax cues to distinguish and navigate to virtual objects of different sizes and distances. We also show that apple flies can orient in the absence of optic flow by using only directional airflow cues, and require simultaneous odor and directional airflow input for plume following to a host volatile blend. The elucidation of these features unlocks the opportunity to quantify parameters underlying insect behavior such as reactive space, optimal foraging, and dispersal, as well as develop strategies for pest management, pollination, robotics, and search algorithms.
9
Van de Meulebroucke C, Beckers J, Corten K. What Can We Expect Following Anterior Total Hip Arthroplasty on a Regular Operating Table? A Validation Study of an Artificial Intelligence Algorithm to Monitor Adverse Events in a High-Volume, Nonacademic Setting. J Arthroplasty 2019; 34:2260-2266. PMID: 31445868. DOI: 10.1016/j.arth.2019.07.039.
Abstract
BACKGROUND Quality monitoring is increasingly important to support and assure sustainability of the orthopedic practice. Surgeons in nonacademic settings often lack resources to accurately monitor quality of care. Widespread use of electronic medical records (EMRs) provides easier access to medical information, facilitating its analysis. However, manual review of EMRs is highly inefficient. Artificial intelligence (AI) software allows for the development of algorithms for extracting relevant complications from EMRs. We hypothesized that an AI-supported algorithm for complication data extraction would have an accuracy level equal to or higher than manual review after total hip arthroplasty (THA). METHODS A total of 532 consecutive patients underwent 613 THAs between January 1 and December 31, 2017. A random derivation cohort (100 patients, 115 hips) was used to determine accuracy. After generation of a gold standard, the algorithm was compared to manual extraction to validate performance in raw data extraction. The full cohort (532 patients, 613 hips) was used to determine recall, precision, and F-measure. RESULTS AI accuracy was 95.0%, compared to 94.5% for manual review (P = .69). Recall of 96.0% (84.0%-100%), precision of 88.0% (33%-100%), and F-measure of 0.85 (0.5-1) were achieved for all adverse events. No adverse events were recorded in 80.6% of cases, 1.3% required reintervention, and 18.1% had "transient" events. CONCLUSION The use of an automated, AI-supported search algorithm for EMRs provided continuous feedback on the quality of care with a performance level comparable to manual data extraction, but with greater speed. New clinical information surfaced, as 18.1% of patients can be expected to have "transient" problems.
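For reference, the recall, precision, and F-measure reported above follow the standard definitions over true positives (tp), false positives (fp), and false negatives (fn):

```python
def precision(tp, fp):
    """Fraction of flagged events that are real complications."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Fraction of real complications the algorithm finds."""
    return tp / (tp + fn)

def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)
```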
Affiliation(s)
- Joris Beckers
- Orthopedic Department, Hip Unit, Ziekenhuis Oost-Limburg, Genk, Belgium
- Kristoff Corten
- Orthopedic Department, Hip Unit, Ziekenhuis Oost-Limburg, Genk, Belgium
10
Abstract
Most database search tools for proteomics have their own scoring parameter sets depending on experimental conditions such as fragmentation methods, instruments, digestion enzymes, and so on. These scoring parameter sets are usually predefined by tool developers and cannot be modified by users. The number of different experimental conditions grows as the technology develops, and the given set of scoring parameters could be suboptimal for tandem mass spectrometry data acquired using new sample preparation or fragmentation methods. Here we introduce a new approach to optimize scoring parameters in a data-dependent manner using a spectrum quality filter. The new approach conducts a preliminary search for the spectra selected by the spectrum quality filter. Search results from the preliminary search are used to generate data-dependent scoring parameters; then, the full search over the entire input spectra is conducted using the learned scoring parameters. We show that the new approach yields more and better peptide-spectrum matches than the conventional search using built-in scoring parameters when compared at the same 1% false discovery rate.
Affiliation(s)
- Hyunjin Jo
- Department of Computer Science, Hanyang University, Seongdong-gu, Seoul 04763, Korea
- Eunok Paek
- Department of Computer Science, Hanyang University, Seongdong-gu, Seoul 04763, Korea
11
Abstract
Negative electron-transfer dissociation (NETD) has emerged as a premier tool for peptide anion analysis, offering access to acidic post-translational modifications and regions of the proteome that are intractable with traditional positive-mode approaches. Whole-proteome scale characterization is now possible with NETD, but proper informatic tools are needed to capitalize on advances in instrumentation. Currently only one database search algorithm (OMSSA) can process NETD data. Here we implement NETD search capabilities into the Byonic platform to improve the sensitivity of negative-mode data analyses, and we benchmark these improvements using 90 min LC-MS/MS analyses of tryptic peptides from human embryonic stem cells. With this new algorithm for searching NETD data, we improved the number of successfully identified spectra by as much as 80% and identified 8665 unique peptides, 24 639 peptide spectral matches, and 1338 proteins in activated-ion NETD analyses, more than doubling identifications from previous negative-mode characterizations of the human proteome. Furthermore, we reanalyzed our recently published large-scale, multienzyme negative-mode yeast proteome data, improving peptide and peptide spectral match identifications and considerably increasing protein sequence coverage. In all, we show that new informatics tools, in combination with recent advances in data acquisition, can significantly improve proteome characterization in negative-mode approaches.
Affiliation(s)
- Nicholas M Riley
- Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States; Genome Center of Wisconsin, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Marshall Bern
- Protein Metrics, Inc., San Carlos, California 94070, United States
- Michael S Westphall
- Genome Center of Wisconsin, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Joshua J Coon
- Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States; Genome Center of Wisconsin, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States; Department of Biomolecular Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
12
Xiao X, Agris PF, Hall CK. Introducing folding stability into the score function for computational design of RNA-binding peptides boosts the probability of success. Proteins 2016; 84:700-11. PMID: 26914059. DOI: 10.1002/prot.25021.
Abstract
A computational strategy that integrates our peptide search algorithm with atomistic molecular dynamics simulation was used to design rational peptide drugs that recognize and bind to the anticodon stem and loop domain (ASL(Lys3)) of human tRNA(Lys3) (anticodon UUU) for the purpose of interrupting HIV replication. The score function of the search algorithm was improved by adding a peptide stability term weighted by an adjustable factor λ to the peptide binding free energy. The five best peptide sequences associated with five different values of λ were determined using the search algorithm and then input in atomistic simulations to examine the stability of the peptides' folded conformations and their ability to bind to ASL(Lys3). Simulation results demonstrated that setting an intermediate value of λ achieves a good balance between optimizing the peptide's binding ability and stabilizing its folded conformation during the sequence evolution process, and hence leads to optimal binding to the target ASL(Lys3). Thus, addition of a peptide stability term significantly improves the success rate for our peptide design search.
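The λ-weighted score function described above can be sketched as follows (the free-energy values and candidate names are hypothetical, and the real terms come from the authors' search algorithm, not from this toy ranking):

```python
def design_score(binding_dG, fold_dG, lam):
    """Combined design objective: binding free energy plus a lambda-weighted
    folding-stability term.  Lower (more negative) is better for both."""
    return binding_dG + lam * fold_dG

def pick_best(candidates, lam):
    """candidates: (name, binding_dG, fold_dG) tuples; lowest score wins."""
    return min(candidates, key=lambda c: design_score(c[1], c[2], lam))[0]
```

With λ = 0 the search optimizes binding alone; an intermediate λ trades some raw binding affinity for a stably folded conformation, which is the balance the study found to give the best actual binders.

```python
peptides = [("tight_binder", -12.0, 1.0), ("stable_fold", -8.0, -5.0)]
```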
Affiliation(s)
- Xingqing Xiao
- Chemical and Biomolecular Engineering Department, North Carolina State University, Raleigh, North Carolina, 27695-7905
- Paul F Agris
- The RNA Institute, University at Albany, State University of New York, Albany, New York, 12222
- Carol K Hall
- Chemical and Biomolecular Engineering Department, North Carolina State University, Raleigh, North Carolina, 27695-7905
13
Tien M, Kashyap R, Wilson GA, Hernandez-Torres V, Jacob AK, Schroeder DR, Mantilla CB. Retrospective Derivation and Validation of an Automated Electronic Search Algorithm to Identify Postoperative Cardiovascular and Thromboembolic Complications. Appl Clin Inform 2015; 6:565-76. PMID: 26448798. DOI: 10.4338/aci-2015-03-ra-0026.
Abstract
BACKGROUND With increasing numbers of hospitals adopting electronic medical records, electronic search algorithms for identifying postoperative complications can be invaluable tools to expedite data abstraction and clinical research to improve patient outcomes. OBJECTIVES To derive and validate an electronic search algorithm to identify postoperative thromboembolic and cardiovascular complications such as deep venous thrombosis, pulmonary embolism, or myocardial infarction within 30 days of total hip or knee arthroplasty. METHODS A total of 34 517 patients undergoing total hip or knee arthroplasty between January 1, 1996 and December 31, 2013 were identified. Using a derivation cohort of 418 patients, several iterations of a free-text electronic search were developed and refined for each complication. Subsequently, the automated search algorithm was validated on an independent cohort of 2 857 patients, and the sensitivity and specificities were compared to the results of manual chart review. RESULTS In the final derivation subset, the automated search algorithm achieved a sensitivity of 91% and specificity of 85% for deep vein thrombosis, a sensitivity of 96% and specificity of 100% for pulmonary embolism, and a sensitivity of 100% and specificity of 95% for myocardial infarction. When applied to the validation cohort, the search algorithm achieved a sensitivity of 97% and specificity of 99% for deep vein thrombosis, a sensitivity of 97% and specificity of 100% for pulmonary embolism, and a sensitivity of 100% and specificity of 99% for myocardial infarction. CONCLUSIONS The derivation and validation of an electronic search strategy can accelerate the data abstraction process for research, quality improvement, and enhancement of patient care, while maintaining superb reliability compared to manual review.
Affiliation(s)
- M Tien, Mayo Clinic College of Medicine, Rochester, MN, United States
- R Kashyap, Mayo Clinic, Department of Anesthesiology, Rochester, MN, United States
- G A Wilson, Mayo Clinic, Division of Pulmonary and Critical Care Medicine, Rochester, MN, United States
- V Hernandez-Torres, Mayo Clinic, Department of Anesthesiology, Rochester, MN, United States
- A K Jacob, Mayo Clinic, Department of Anesthesiology, Rochester, MN, United States
- D R Schroeder, Mayo Clinic, Health Sciences Research - Biomedical Statistics and Informatics, Rochester, MN, United States
- C B Mantilla, Mayo Clinic, Department of Anesthesiology, Rochester, MN, United States
14
Carroll AJ, Zhang P, Whitehead L, Kaines S, Tcherkez G, Badger MR. PhenoMeter: A Metabolome Database Search Tool Using Statistical Similarity Matching of Metabolic Phenotypes for High-Confidence Detection of Functional Links. Front Bioeng Biotechnol 2015; 3:106. [PMID: 26284240] [PMCID: PMC4518198] [DOI: 10.3389/fbioe.2015.00106]
Abstract
This article describes PhenoMeter (PM), a new type of metabolomics database search that accepts metabolite response patterns as queries and searches the MetaPhen database of reference patterns for responses that are statistically significantly similar or inverse for the purposes of detecting functional links. To identify a similarity measure that would detect functional links as reliably as possible, we compared the performance of four statistics in correctly top-matching metabolic phenotypes of Arabidopsis thaliana metabolism mutants affected in different steps of the photorespiration metabolic pathway to reference phenotypes of mutants affected in the same enzymes by independent mutations. The best performing statistic, the PM score, was a function of both Pearson correlation and Fisher's Exact Test of directional overlap. This statistic outperformed Pearson correlation, biweight midcorrelation and Fisher's Exact Test used alone. To demonstrate general applicability, we show that the PM reliably retrieved the most closely functionally linked response in the database when queried with responses to a wide variety of environmental and genetic perturbations. Attempts to match metabolic phenotypes between independent studies were met with varying success and possible reasons for this are discussed. Overall, our results suggest that integration of pattern-based search tools into metabolomics databases will aid functional annotation of newly recorded metabolic phenotypes analogously to the way sequence similarity search algorithms have aided the functional annotation of genes and proteins. PM is freely available at MetabolomeExpress (https://www.metabolome-express.org/phenometer.php).
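The abstract states that the PM score combines Pearson correlation with a Fisher's Exact Test of directional overlap, but does not give the functional form. The toy version below is therefore only a sketch: it pairs a hand-rolled Pearson correlation with a simple sign-agreement fraction in place of the Fisher test, and the combination by product is my assumption, not the published formula:

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length response vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def directional_overlap(x, y):
    """Fraction of metabolites whose responses share the same sign."""
    agree = sum(1 for a, b in zip(x, y) if a * b > 0)
    return agree / len(x)

def pm_like_score(x, y):
    """Toy PM-like score: correlation weighted by sign agreement.
    The published score uses Fisher's Exact Test of directional
    overlap; this product form is purely illustrative."""
    return pearson(x, y) * directional_overlap(x, y)
```

Querying with a profile identical to a reference yields the maximal score, and an exactly inverted profile yields a strongly negative correlation with zero sign agreement, which is the qualitative behaviour the PM search relies on.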
Affiliation(s)
- Adam J. Carroll, Peng Zhang, Lynne Whitehead, Sarah Kaines, Guillaume Tcherkez, and Murray R. Badger: College of Medicine, Biology and Environment, Research School of Biology, The Australian National University, Canberra, ACT, Australia
15
Ho YY, Cope LM, Parmigiani G. Modular network construction using eQTL data: an analysis of computational costs and benefits. Front Genet 2014; 5:40. [PMID: 24616734] [PMCID: PMC3935177] [DOI: 10.3389/fgene.2014.00040]
Abstract
Background: In this paper, we consider analytic methods for the integrated analysis of genomic DNA variation and mRNA expression (also known as eQTL data) to discover genetic networks that are associated with a complex trait of interest. Our focus is the systematic evaluation of the trade-off between network size and network search efficiency in the construction of these networks. Results: We developed a modular approach to network construction, building from smaller networks to larger ones, thereby reducing the search space while including more variables in the analysis. The goal is to achieve a lower computational cost while maintaining high confidence in the resulting networks. As demonstrated in our simulation results, networks built in this way have a low node/edge false discovery rate (FDR) and high edge sensitivity compared to greedy search. We further demonstrate our method on a data set of cellular responses to two chemotherapeutic agents, docetaxel and 5-fluorouracil (5-FU), and identify biologically plausible networks that might describe resistance to these drugs. Conclusion: In this study, we suggest that guided comprehensive searches for parsimonious networks should be considered as an alternative to greedy network searches.
Affiliation(s)
- Yen-Yi Ho, Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, USA
- Leslie M Cope, The Sidney Kimmel Cancer Center, Johns Hopkins School of Medicine, Baltimore, MD, USA
- Giovanni Parmigiani, Dana-Farber Cancer Institute and Harvard School of Public Health, Boston, MA, USA
16
Xiao X, Hall CK, Agris PF. The design of a peptide sequence to inhibit HIV replication: a search algorithm combining Monte Carlo and self-consistent mean field techniques. J Biomol Struct Dyn 2013; 32:1523-36. [PMID: 24147736] [DOI: 10.1080/07391102.2013.825757]
Abstract
We developed a search algorithm combining Monte Carlo (MC) and self-consistent mean field techniques to evolve a peptide sequence that binds well to the anticodon stem and loop (ASL) of the human lysine tRNA species tRNA(Lys3), with the ultimate purpose of breaking the replication cycle of human immunodeficiency virus-1. The starting point is the 15-amino-acid sequence RVTHHAFLGAHRTVG, found experimentally by Agris and co-workers to bind selectively to hypermodified tRNA(Lys3). The peptide backbone conformation is determined via atomistic simulation of the peptide-ASL(Lys3) complex and then held fixed throughout the search. The proportion of amino acids of various types (hydrophobic, polar, charged, etc.) is varied to mimic different peptide hydration properties. Three different sets of hydration properties were examined in the search algorithm to see how this affects evolution to the best-binding peptide sequences. Certain amino acids are commonly found at fixed sites for all three hydration states, some necessary for binding affinity and some necessary for binding specificity. Analysis of the binding structure and the various contributions to the binding energy shows that (1) two hydrophilic residues (the asparagine at site 11 and the cysteine at site 12) "recognize" the ASL(Lys3) through the van der Waals (VDW) energy and thereby contribute to its binding specificity, and (2) the positively charged arginines at sites 4 and 13 preferentially attract the negatively charged sugar rings and phosphate linkages, and thereby contribute to the binding affinity.
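The Monte Carlo half of such a sequence search can be sketched as Metropolis sampling over point mutations with the backbone held fixed. Everything below is a toy: the energy function, alphabet, and target string are stand-ins for the paper's physics-based binding energy, not its actual model:

```python
import math
import random

def metropolis_sequence_search(seq, energy, alphabet, temperature, steps, rng):
    """Toy Metropolis Monte Carlo search over a fixed-length sequence.
    `energy` is any user-supplied scoring callable; the paper evaluates
    a physics-based peptide-ASL binding energy over a fixed backbone."""
    current = list(seq)
    e_cur = energy(current)
    best, e_best = list(current), e_cur
    for _ in range(steps):
        trial = list(current)
        site = rng.randrange(len(trial))
        trial[site] = rng.choice(alphabet)  # propose a point mutation
        e_trial = energy(trial)
        # Metropolis acceptance: always accept downhill, Boltzmann-weighted uphill
        if e_trial <= e_cur or rng.random() < math.exp((e_cur - e_trial) / temperature):
            current, e_cur = trial, e_trial
            if e_cur < e_best:
                best, e_best = list(current), e_cur
    return "".join(best), e_best

# Toy target: recover a known sequence by counting mismatches (hypothetical).
target = "RVTHH"
energy = lambda s: sum(a != b for a, b in zip(s, target))
rng = random.Random(0)
seq, e = metropolis_sequence_search("AAAAA", energy, "ARVTH", 0.5, 2000, rng)
```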
Affiliation(s)
- Xingqing Xiao, Chemical and Biomolecular Engineering Department, North Carolina State University, Raleigh, NC 27695-7905, USA
17
Smischney NJ, Velagapudi VM, Onigkeit JA, Pickering BW, Herasevich V, Kashyap R. Retrospective derivation and validation of a search algorithm to identify emergent endotracheal intubations in the intensive care unit. Appl Clin Inform 2013; 4:419-27. [PMID: 24155793] [DOI: 10.4338/aci-2013-05-ra-0033]
Abstract
BACKGROUND The development and validation of automated electronic medical record (EMR) search strategies are important in identifying emergent endotracheal intubations in the intensive care unit (ICU). OBJECTIVE To develop and validate an automated search algorithm (strategy) for emergent endotracheal intubation in the critically ill patient. METHODS The EMR search algorithm was created through sequential steps with keywords applied to an institutional EMR database. The search strategy was derived retrospectively through a secondary analysis of a 450-patient subset from the 2,684 patients admitted to either a medical or surgical ICU from January 1, 2010, through December 31, 2011. This search algorithm was validated against an additional 450 randomly selected patients. Sensitivity, specificity, and negative and positive predictive values of the automated search algorithm were compared with a manual medical record review (the reference standard) for data extraction of emergent endotracheal intubations. RESULTS In the derivation subset, the automated electronic note search strategy achieved a sensitivity of 74% (95% CI, 69%-79%) and a specificity of 98% (95% CI, 92%-100%). With refinements in the search algorithm, sensitivity increased to 95% (95% CI, 91%-97%) and specificity decreased to 96% (95% CI, 92%-98%) in this subset. After validation of the algorithm through a separate patient subset, the final reported sensitivity and specificity were 95% (95% CI, 86%-99%) and 100% (95% CI, 98%-100%). CONCLUSIONS Use of electronic search algorithms allows for correct extraction of emergent endotracheal intubations in the ICU, with high degrees of sensitivity and specificity. Such search algorithms are a reliable alternative to manual chart review for identification of emergent endotracheal intubations.
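Alongside sensitivity and specificity, this study reports positive and negative predictive values, which come from the same confusion matrix. A minimal sketch with made-up counts (not the study's tallies):

```python
def predictive_values(tp, fp, tn, fn):
    """Positive and negative predictive values from confusion counts."""
    ppv = tp / (tp + fp)  # probability a flagged record is a true emergent intubation
    npv = tn / (tn + fn)  # probability an unflagged record is truly negative
    return ppv, npv

# Hypothetical counts for illustration only:
ppv, npv = predictive_values(tp=95, fp=0, tn=100, fn=5)
```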
18
Abstract
Various insects and small animals can navigate in turbulent streams to find their mates (or food) from sparse pheromone (odor) detections. Whether they have access to internal space perception and use cognitive maps is still heavily debated, but for some of them, limited space perception seems to be the rule. However, this poor space perception does not prevent impressive search capacities. Here, as an attempt to model these behaviors, we propose a scheme that can perform searches in turbulent streams even without a detailed internal space map. The algorithm is based on a standardized projection of the probability of the source position to remove space perception, and on the evaluation of a free energy whose minimization along the path gives direction to the searcher. An internal "temperature" allows active control of the exploration/exploitation balance during the search. We demonstrate the efficiency of the scheme numerically, with a computational model of odor plume propagation, and experimentally, with robotic searches for thermal sources in turbulent streams. In addition to being a model of animals' searches, this scheme may be applied to robotic searches in complex varying media without odometry error corrections and to problems in which active control of the exploration/exploitation balance is profitable.
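The core idea of a temperature-controlled free energy can be illustrated on a 1-D grid. This sketch keeps an explicit space map for clarity, which the published scheme deliberately removes, and its functional form (expected distance minus a temperature-weighted detection-entropy bonus) is my assumption, not the paper's:

```python
import math

def bernoulli_entropy(q):
    """Entropy of a detection/no-detection outcome with probability q."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -(q * math.log(q) + (1 - q) * math.log(1 - q))

def free_energy(cell, posterior, temperature):
    """Toy free energy of moving to `cell`: expected distance to the
    source under the posterior (exploitation) minus a temperature-
    weighted term rewarding informative detections (exploration)."""
    expected_dist = sum(p * abs(cell - s) for s, p in enumerate(posterior))
    return expected_dist - temperature * bernoulli_entropy(posterior[cell])

def choose_step(position, posterior, temperature):
    """Move to the neighboring cell with the lowest free energy."""
    candidates = [c for c in (position - 1, position + 1)
                  if 0 <= c < len(posterior)]
    return min(candidates, key=lambda c: free_energy(c, posterior, temperature))
```

At temperature zero the searcher greedily descends toward the posterior's center of mass; raising the temperature biases it toward cells where a detection attempt is most informative, which is the exploration/exploitation dial the abstract describes.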
19
Juliá M, Gil A, Reinoso O. Searching dynamic agents with a team of mobile robots. Sensors (Basel) 2012; 12:8815-31. [PMID: 23012519] [DOI: 10.3390/s120708815]
Abstract
This paper presents a new algorithm that allows a team of robots to cooperatively search for a set of moving targets. An estimation of the areas of the environment that are more likely to hold a target agent is obtained using a grid-based Bayesian filter. The robot sensor readings and the maximum speed of the moving targets are used to update the grid. This representation is used in a search algorithm that commands the robots to the areas most likely to contain target agents. The algorithm splits the environment into a tree of connected regions using dynamic programming. This tree is used to decide the destination of each robot in a coordinated manner. The algorithm has been successfully tested in known and unknown environments, showing the validity of the approach.
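A grid-based Bayesian filter of this kind alternates a predict step, which diffuses probability to every cell a target could reach given its maximum speed, and an update step, which zeroes cells a sensor has just observed empty and renormalizes. The 1-D sketch below illustrates both steps; the paper works on a 2-D grid with real sensor models:

```python
def predict(grid, max_speed):
    """Diffuse target probability: a target may move up to `max_speed`
    cells per time step (uniform motion model, 1-D grid for brevity)."""
    n = len(grid)
    new = [0.0] * n
    for i, p in enumerate(grid):
        lo, hi = max(0, i - max_speed), min(n - 1, i + max_speed)
        share = p / (hi - lo + 1)
        for j in range(lo, hi + 1):
            new[j] += share
    return new

def update(grid, cleared_cells):
    """Sensor update: cells observed to be empty get zero probability,
    then the grid is renormalized."""
    new = [0.0 if i in cleared_cells else p for i, p in enumerate(grid)]
    total = sum(new)
    return [p / total for p in new] if total > 0 else new
```

Repeating predict/update concentrates probability mass in the unexplored regions a moving target could still occupy, which is exactly the map the search tree then partitions among the robots.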
20
Petrella RJ. A versatile method for systematic conformational searches: application to CheY. J Comput Chem 2011; 32:2369-85. [PMID: 21557263] [PMCID: PMC3298744] [DOI: 10.1002/jcc.21817]
Abstract
A novel molecular structure prediction method, the Z Method, is described. It provides a versatile platform for the development and use of systematic, grid-based conformational search protocols, in which statistical information (i.e., rotamers) can also be included. The Z Method generates trial structures by applying many changes of the same type to a single starting structure, thereby sampling the conformation space in an unbiased way. The method, implemented in the CHARMM program as the Z Module, is applied here to an illustrative model problem in which rigid, systematic searches are performed in a 36-dimensional conformational space that describes the relative positions of the 10 secondary structural elements of the protein CheY. A polar hydrogen representation with an implicit solvation term (EEF1) is used to evaluate successively larger fragments of the protein generated in a hierarchical build-up procedure. After a final refinement stage, and a total computational time of about two-and-a-half CPU days on AMD Opteron processors, the prediction is within 1.56 Å of the native structure. The errors in the predicted backbone dihedral angles are found to approximately cancel. Monte Carlo and simulated annealing trials on the same or smaller versions of the problem, using the same atomic model and energy terms, are shown to result in less accurate predictions. Although the problem solved here is a limited one, the findings illustrate the utility of systematic searches with atom-based models for macromolecular structure prediction and the importance of unbiased sampling in structure prediction methods.
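The essence of a systematic, grid-based search is exhaustive enumeration of a discretized conformation space. The minimal sketch below enumerates a dihedral-angle grid with a toy energy function (the function and its minimum are invented for illustration); the paper's 36-dimensional problem is far too large for a flat enumeration like this, which is precisely why it uses a hierarchical build-up over fragments:

```python
from itertools import product

def systematic_search(energy, n_angles, step_deg):
    """Exhaustive grid search over `n_angles` dihedral angles sampled
    every `step_deg` degrees; returns the lowest-energy grid point."""
    grid = range(0, 360, step_deg)
    best = min(product(grid, repeat=n_angles), key=energy)
    return best, energy(best)

# Toy quadratic energy with a known minimum at (120, 240):
energy = lambda a: (a[0] - 120) ** 2 + (a[1] - 240) ** 2
angles, e = systematic_search(energy, n_angles=2, step_deg=60)
```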
Affiliation(s)
- Robert J Petrella, Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts, USA
21
Furenlid LR, Hesterman JY, Barrett HH. Fast maximum-likelihood estimation methods for scintillation cameras and other optical sensors. Proc SPIE Int Soc Opt Eng 2007; 6707. [PMID: 26347027] [DOI: 10.1117/12.740321]
Abstract
Maximum-likelihood estimation methods offer many advantages for processing experimental data to extract information, especially when combined with carefully measured calibration data. There are many tasks relevant to x-ray and gamma-ray detection that can be addressed with a new, fast ML-search algorithm that can be implemented in hardware or software. Example applications include gamma-ray event position, energy, and timing estimation, as well as general applications in optical testing and wave-front sensing.
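One fast ML-search pattern that fits this description is a contracting-grid search: evaluate the likelihood on a coarse grid, recenter on the best point, shrink the grid, and repeat. The 1-D sketch below, with an invented toy log-likelihood, illustrates the idea only; the abstract's hardware/software implementation details are not specified here:

```python
def contracting_grid_search(loglike, center, span, n_points=5, n_iter=6):
    """Contracting-grid maximum-likelihood search (1-D sketch):
    each pass evaluates `loglike` on an evenly spaced grid, keeps the
    best point, and halves the grid span around it."""
    x = center
    for _ in range(n_iter):
        step = span / (n_points - 1)
        grid = [x - span / 2 + k * step for k in range(n_points)]
        x = max(grid, key=loglike)   # best grid point so far
        span /= 2                    # contract the grid around it
    return x

# Toy log-likelihood peaked at x = 3.2 (hypothetical):
ll = lambda x: -(x - 3.2) ** 2
estimate = contracting_grid_search(ll, center=0.0, span=16.0)
```

Each iteration needs only `n_points` likelihood evaluations, so the cost grows logarithmically with the desired resolution instead of linearly with the number of grid cells, which is what makes this style of search attractive for event-by-event position and energy estimation.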
Affiliation(s)
- L R Furenlid, Department of Radiology and College of Optical Sciences, University of Arizona, Tucson, AZ 85724
- J Y Hesterman, College of Optical Sciences, University of Arizona, Tucson, AZ 85724
- H H Barrett, Department of Radiology and College of Optical Sciences, University of Arizona, Tucson, AZ 85724