1
Poon BK, Terwilliger TC, Adams PD. The Phenix-AlphaFold webservice: Enabling AlphaFold predictions for use in Phenix. Protein Sci 2024; 33:e4992. [PMID: 38647406 PMCID: PMC11034488 DOI: 10.1002/pro.4992] [Received: 11/04/2023] [Revised: 03/01/2024] [Accepted: 03/31/2024] [Indexed: 04/25/2024]
Abstract
Advances in machine learning have enabled sufficiently accurate predictions of protein structure to be used in macromolecular structure determination with crystallography and cryo-electron microscopy data. The Phenix software suite has AlphaFold predictions integrated into an automated pipeline that can start with an amino acid sequence and data, and automatically perform model-building and refinement to return a protein model fitted into the data. Due to the steep technical requirements of running AlphaFold efficiently, we have implemented a Phenix-AlphaFold webservice that enables all Phenix users to run AlphaFold predictions remotely from the Phenix GUI starting with the official 1.21 release. This webservice will be improved based on how it is used by the research community and the future research directions for Phenix.
Affiliation(s)
- Billy K. Poon
  - Molecular Biophysics & Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
- Thomas C. Terwilliger
  - New Mexico Consortium, Los Alamos, New Mexico, USA
  - Los Alamos National Laboratory, Los Alamos, New Mexico, USA
- Paul D. Adams
  - Molecular Biophysics & Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
  - Department of Bioengineering, University of California, Berkeley, Berkeley, California, USA
2
Stachowski TR, Fischer M. FLEXR GUI: a graphical user interface for multi-conformer modeling of proteins. J Appl Crystallogr 2024; 57:580-586. [PMID: 38596743 PMCID: PMC11001397 DOI: 10.1107/s1600576724001523] [Received: 01/08/2024] [Accepted: 02/14/2024] [Indexed: 04/11/2024]
Abstract
Proteins are well known 'shapeshifters' which change conformation to function. In crystallography, multiple conformational states are often present within the crystal and the resulting electron-density map. Yet, explicitly incorporating alternative states into models to disentangle multi-conformer ensembles is challenging. We previously reported the tool FLEXR, which, within a few minutes, automatically separates conformational signal from noise and builds the corresponding, often missing, structural features into a multi-conformer model. To make the method widely accessible for routine multi-conformer building as part of the computational toolkit for macromolecular crystallography, we present a graphical user interface (GUI) for FLEXR, designed as a plugin for Coot 1. The GUI implementation seamlessly connects FLEXR models with the existing suite of validation and modeling tools available in Coot. We envision that FLEXR will aid crystallographers by increasing access to a multi-conformer modeling method that will ultimately lead to a better representation of protein conformational heterogeneity in the Protein Data Bank. In turn, deeper insights into the protein conformational landscape may inform biology or provide new opportunities for ligand design. The code is open source and freely available on GitHub at https://github.com/TheFischerLab/FLEXR-GUI.
Affiliation(s)
- Timothy R. Stachowski
  - Department of Chemical Biology and Therapeutics, St Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Marcus Fischer
  - Department of Chemical Biology and Therapeutics, St Jude Children’s Research Hospital, Memphis, TN 38105, USA
3
Usón I, Sheldrick GM. Modes and model building in SHELXE. Acta Crystallogr D Struct Biol 2024; 80:4-15. [PMID: 38088896 PMCID: PMC10833347 DOI: 10.1107/s2059798323010082] [Received: 10/01/2023] [Accepted: 11/21/2023] [Indexed: 01/12/2024]
Abstract
Density modification is a standard step to provide a route for routine structure solution by any experimental phasing method, with single-wavelength or multi-wavelength anomalous diffraction being the most popular methods, as well as to extend fragments or incomplete models into a full solution. The effect of density modification on the starting maps from either source is illustrated in the case of SHELXE. The different modes in which the program can run are reviewed; these include less well known uses such as reading external phase values and weights or phase distributions encoded in Hendrickson-Lattman coefficients. Typically in SHELXE, initial phases are calculated from experimental data, from a partial model or map, or from a combination of both sources. The initial phase set is improved and extended by density modification and, if the resolution of the data and the type of structure permits, polyalanine tracing. As a feature to systematically eliminate model bias from phases derived from predicted models, the trace can be set to exclude the area occupied by the starting model. The trace now includes an extension into the gamma position or hydrophobic and aromatic side chains if a sequence is provided, which is performed in every tracing cycle. Once a correlation coefficient of over 30% between the structure factors calculated from such a trace and the native data indicates that the structure has been solved, the sequence is docked in all model-building cycles and side chains are fitted if the map supports it. The extensions to the tracing algorithm brought in to provide a complete model are discussed. The improvement in phasing performance is assessed using a set of tests.
Affiliation(s)
- Isabel Usón
  - ICREA, Institució Catalana de Recerca i Estudis Avançats, Passeig Lluís Companys, 23, Barcelona, E-08003, Spain
  - Crystallographic Methods, Institute of Molecular Biology of Barcelona (IBMB-CSIC), Barcelona Science Park, Helix Building, Baldiri Reixach, 15, Barcelona, 08028, Spain
- George M. Sheldrick
  - Department of Structural Chemistry, Georg-August Universität Göttingen, Tammannstrasse 4, 37077 Göttingen, Germany
4
Lin YT, Lin PT, Lin CC, Wu TH, Liu LT, Su CW, Teng W, Tsai CY, Huang CH, Chen WT, Chan KM, Hsu CW, Lin CY, Lin SM, Chien RN. Adding nutritional status to the original BCLC stage improves mortality prediction for hepatocellular carcinoma patients in HBV-endemic regions. Am J Cancer Res 2023; 13:3618-3628. [PMID: 37693156 PMCID: PMC10492128] [Received: 05/10/2023] [Accepted: 07/25/2023] [Indexed: 09/12/2023]
Abstract
Hepatocellular carcinoma (HCC) is associated with high mortality, especially in Asian populations where chronic HBV infection is a major cause. Accurate prediction of mortality can assist clinical decision-making. We aim to (i) compare the ability of Barcelona Clinic Liver Cancer classification (BCLC) stage, neutrophil-to-lymphocyte ratio (NLR), and Albumin-Bilirubin (ALBI) score to predict short-term (one- and two-year) mortality and (ii) develop a novel model with improved accuracy compared to the conventional models. This study enrolled 298 consecutive HCC patients from our hepatology department. The prognostic values for mortality were assessed by area under the receiver operating characteristic curve (AUROC) analysis. A novel model was established and internally validated using 5-fold cross-validation, followed by external validation in a cohort of 100 patients. The primary etiology of cirrhosis was hepatitis B virus (HBV), with 81.2% of HCC patients having preserved liver function. Significant differences were observed in hemoglobin (Hb) and serum albumin levels, which reflect patients' nutrition status, between patients who survived for one year and those who died. BCLC exhibited superior predictive accuracy compared to NLR but was only borderline superior to the ALBI score. Therefore, a novel model incorporating BCLC, Hb, and serum albumin was developed, validated internally and externally, and examined in a subgroup sensitivity analysis. The model exhibited significantly higher predictive accuracy for one- and two-year mortality than conventional prognostic predictors, with AUROC values of 0.841 and 0.805, respectively. The novel "BCLC-Nutrition Model", which incorporates BCLC, Hb, and serum albumin, may provide improved predictive accuracy for short-term mortality in HCC patients compared to commonly used prognostic scores. This emphasizes the importance of nutrition in the management of HCC patients.
Affiliation(s)
- Yan-Ting Lin
  - Division of Hepatology, Department of Gastroenterology and Hepatology, Chang-Gung Memorial Hospital, Linkou Medical Center, Taoyuan, Taiwan
- Po-Ting Lin
  - Division of Hepatology, Department of Gastroenterology and Hepatology, Chang-Gung Memorial Hospital, Linkou Medical Center, Taoyuan, Taiwan
  - College of Medicine, Chang-Gung University, Taoyuan, Taiwan
- Chen-Chun Lin
  - Division of Hepatology, Department of Gastroenterology and Hepatology, Chang-Gung Memorial Hospital, Linkou Medical Center, Taoyuan, Taiwan
  - College of Medicine, Chang-Gung University, Taoyuan, Taiwan
- Tsung-Han Wu
  - College of Medicine, Chang-Gung University, Taoyuan, Taiwan
  - Department of General Surgery, Chang-Gung Memorial Hospital, Linkou Medical Center, Taoyuan, Taiwan
- Li-Tong Liu
  - Division of Hepatology, Department of Gastroenterology and Hepatology, Chang-Gung Memorial Hospital, Linkou Medical Center, Taoyuan, Taiwan
- Chung-Wei Su
  - Division of Hepatology, Department of Gastroenterology and Hepatology, Chang-Gung Memorial Hospital, Linkou Medical Center, Taoyuan, Taiwan
  - College of Medicine, Chang-Gung University, Taoyuan, Taiwan
- Wei Teng
  - Division of Hepatology, Department of Gastroenterology and Hepatology, Chang-Gung Memorial Hospital, Linkou Medical Center, Taoyuan, Taiwan
  - College of Medicine, Chang-Gung University, Taoyuan, Taiwan
- Chun-Yi Tsai
  - College of Medicine, Chang-Gung University, Taoyuan, Taiwan
  - Department of General Surgery, Chang-Gung Memorial Hospital, Linkou Medical Center, Taoyuan, Taiwan
- Chien-Hao Huang
  - Division of Hepatology, Department of Gastroenterology and Hepatology, Chang-Gung Memorial Hospital, Linkou Medical Center, Taoyuan, Taiwan
  - College of Medicine, Chang-Gung University, Taoyuan, Taiwan
- Wei-Ting Chen
  - Division of Hepatology, Department of Gastroenterology and Hepatology, Chang-Gung Memorial Hospital, Linkou Medical Center, Taoyuan, Taiwan
  - College of Medicine, Chang-Gung University, Taoyuan, Taiwan
- Kun-Ming Chan
  - College of Medicine, Chang-Gung University, Taoyuan, Taiwan
  - Department of General Surgery, Chang-Gung Memorial Hospital, Linkou Medical Center, Taoyuan, Taiwan
- Chao-Wei Hsu
  - Division of Hepatology, Department of Gastroenterology and Hepatology, Chang-Gung Memorial Hospital, Linkou Medical Center, Taoyuan, Taiwan
  - College of Medicine, Chang-Gung University, Taoyuan, Taiwan
- Chun-Yen Lin
  - Division of Hepatology, Department of Gastroenterology and Hepatology, Chang-Gung Memorial Hospital, Linkou Medical Center, Taoyuan, Taiwan
  - College of Medicine, Chang-Gung University, Taoyuan, Taiwan
- Shi-Ming Lin
  - Division of Hepatology, Department of Gastroenterology and Hepatology, Chang-Gung Memorial Hospital, Linkou Medical Center, Taoyuan, Taiwan
  - College of Medicine, Chang-Gung University, Taoyuan, Taiwan
- Rong-Nan Chien
  - Division of Hepatology, Department of Gastroenterology and Hepatology, Chang-Gung Memorial Hospital, Linkou Medical Center, Taoyuan, Taiwan
  - College of Medicine, Chang-Gung University, Taoyuan, Taiwan
5
Reggiano G, Lugmayr W, Farrell D, Marlovits TC, DiMaio F. Residue-level error detection in cryoelectron microscopy models. Structure 2023; 31:860-869.e4. [PMID: 37253357 PMCID: PMC10330749 DOI: 10.1016/j.str.2023.05.002] [Received: 01/11/2023] [Revised: 02/16/2023] [Accepted: 05/03/2023] [Indexed: 06/01/2023]
Abstract
Building accurate protein models into moderate resolution (3-5 Å) cryoelectron microscopy (cryo-EM) maps is challenging and error prone. We have developed MEDIC (Model Error Detection in Cryo-EM), a robust statistical model that identifies local backbone errors in protein structures built into cryo-EM maps by combining local fit-to-density with deep-learning-derived structural information. MEDIC is validated on a set of 28 structures that were subsequently solved to higher resolutions, where we identify the differences between low- and high-resolution structures with 68% precision and 60% recall. We additionally use this model to fix over 100 errors in 12 deposited structures and to identify errors in 4 refined AlphaFold predictions with 80% precision and 60% recall. As modelers more frequently use deep learning predictions as a starting point for refinement and rebuilding, MEDIC's ability to handle errors in structures derived from hand-building and machine learning methods makes it a powerful tool for structural biologists.
Affiliation(s)
- Gabriella Reggiano
  - Department of Biochemistry, University of Washington, Seattle, WA 98195, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
- Wolfgang Lugmayr
  - University Medical Center Hamburg-Eppendorf (UKE), Institute of Structural and Systems Biology, Hamburg, Germany
  - CSSB Centre for Structural Systems Biology, Hamburg, Germany
  - Deutsches Elektronen Synchrotron (DESY), Hamburg, Germany
- Thomas C Marlovits
  - University Medical Center Hamburg-Eppendorf (UKE), Institute of Structural and Systems Biology, Hamburg, Germany
  - CSSB Centre for Structural Systems Biology, Hamburg, Germany
  - Deutsches Elektronen Synchrotron (DESY), Hamburg, Germany
- Frank DiMaio
  - Department of Biochemistry, University of Washington, Seattle, WA 98195, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
6
Stachowski TR, Fischer M. FLEXR: automated multi-conformer model building using electron-density map sampling. Acta Crystallogr D Struct Biol 2023; 79:354-367. [PMID: 37071395 PMCID: PMC10167668 DOI: 10.1107/s2059798323002498] [Received: 09/21/2022] [Accepted: 03/13/2023] [Indexed: 04/19/2023]
Abstract
Protein conformational dynamics that may inform biology often lie dormant in high-resolution electron-density maps. While an estimated ∼18% of side chains in high-resolution models contain alternative conformations, these are underrepresented in current PDB models due to difficulties in manually detecting, building and inspecting alternative conformers. To overcome this challenge, we developed an automated multi-conformer modeling program, FLEXR. Using Ringer-based electron-density sampling, FLEXR builds explicit multi-conformer models for refinement. Thereby, it bridges the gap of detecting hidden alternate states in electron-density maps and including them in structural models for refinement, inspection and deposition. Using a series of high-quality crystal structures (0.8-1.85 Å resolution), we show that the multi-conformer models produced by FLEXR uncover new insights that are missing in models built either manually or using current tools. Specifically, FLEXR models revealed hidden side chains and backbone conformations in ligand-binding sites that may redefine protein-ligand binding mechanisms. Ultimately, the tool facilitates crystallographers with opportunities to include explicit multi-conformer states in their high-resolution crystallographic models. One key advantage is that such models may better reflect interesting higher energy features in electron-density maps that are rarely consulted by the community at large, which can then be productively used for ligand discovery downstream. FLEXR is open source and publicly available on GitHub at https://github.com/TheFischerLab/FLEXR.
Affiliation(s)
- Timothy R Stachowski
  - Department of Chemical Biology and Therapeutics, St Jude Children's Research Hospital, Memphis, TN 38105, USA
- Marcus Fischer
  - Department of Chemical Biology and Therapeutics, St Jude Children's Research Hospital, Memphis, TN 38105, USA
7
Alharbi E, Calinescu R, Cowtan K. Buccaneer model building with neural network fragment selection. Acta Crystallogr D Struct Biol 2023; 79:326-338. [PMID: 36974965 PMCID: PMC10071564 DOI: 10.1107/s205979832300181x] [Received: 08/14/2022] [Accepted: 02/27/2023] [Indexed: 03/29/2023]
Abstract
Tracing the backbone is a critical step in protein model building, as incorrect tracing leads to poor protein models. Here, a neural network trained to identify unfavourable fragments and remove them from the model-building process in order to improve backbone tracing is presented. Moreover, a decision tree was trained to select an optimal threshold to eliminate unfavourable fragments. The neural network was tested on experimental phasing data sets from the Joint Center for Structural Genomics (JCSG), recently deposited experimental phasing data sets (from 2015 to 2021) and molecular-replacement data sets. The experimental results show that using the neural network in the Buccaneer protein-model-building software can produce significantly more complete protein models than those built using Buccaneer alone. In particular, Buccaneer with the neural network built protein models with a completeness that was at least 5% higher for 25% and 50% of the original and truncated resolution JCSG experimental phasing data sets, respectively, for 28% of the recently collected experimental phasing data sets and for 43% of the molecular-replacement data sets.
Affiliation(s)
- Emad Alharbi
  - Department of Computer Science, University of York, Heslington, York YO10 5GH, United Kingdom
- Radu Calinescu
  - Department of Computer Science, University of York, Heslington, York YO10 5GH, United Kingdom
- Kevin Cowtan
  - Department of Chemistry, University of York, Heslington, York YO10 5DD, United Kingdom
8
Terwilliger TC, Afonine PV, Liebschner D, Croll TI, McCoy AJ, Oeffner RD, Williams CJ, Poon BK, Richardson JS, Read RJ, Adams PD. Accelerating crystal structure determination with iterative AlphaFold prediction. Acta Crystallogr D Struct Biol 2023; 79:234-244. [PMID: 36876433 PMCID: PMC9986801 DOI: 10.1107/s205979832300102x] [Received: 11/30/2022] [Accepted: 02/03/2023] [Indexed: 02/28/2023]
Abstract
Experimental structure determination can be accelerated with artificial intelligence (AI)-based structure-prediction methods such as AlphaFold. Here, an automatic procedure requiring only sequence information and crystallographic data is presented that uses AlphaFold predictions to produce an electron-density map and a structural model. Iterating through cycles of structure prediction is a key element of this procedure: a predicted model rebuilt in one cycle is used as a template for prediction in the next cycle. This procedure was applied to X-ray data for 215 structures released by the Protein Data Bank in a recent six-month period. In 87% of cases our procedure yielded a model with at least 50% of Cα atoms matching those in the deposited models within 2 Å. Predictions from the iterative template-guided prediction procedure were more accurate than those obtained without templates. It is concluded that AlphaFold predictions obtained based on sequence information alone are usually accurate enough to solve the crystallographic phase problem with molecular replacement, and a general strategy for macromolecular structure determination that includes AI-based prediction both as a starting point and as a method of model optimization is suggested.
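The success criterion quoted in this abstract (at least 50% of Cα atoms within 2 Å of the deposited model) can be illustrated with a minimal sketch. This is not the procedure used in the paper or in Phenix: it assumes the two models are already superposed in a common frame and that Cα atoms are paired by index, and the function name `ca_match_fraction` and the toy coordinates are hypothetical.

```python
import math


def ca_match_fraction(built, deposited, cutoff=2.0):
    """Return the fraction of paired C-alpha atoms within `cutoff` angstroms.

    Assumption (not from the source): both coordinate lists are in the same
    frame and atom i in `built` corresponds to atom i in `deposited`.
    """
    if len(built) != len(deposited):
        raise ValueError("coordinate lists must have the same length")
    if not deposited:
        return 0.0
    matched = sum(
        1 for a, b in zip(built, deposited) if math.dist(a, b) <= cutoff
    )
    return matched / len(deposited)


# Toy coordinates (hypothetical), in angstroms.
deposited = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
built = [(0.5, 0.0, 0.0), (3.8, 1.0, 0.0), (7.6, 0.0, 3.0), (11.4, 5.0, 0.0)]

fraction = ca_match_fraction(built, deposited)
# True when the model meets the 50% threshold described in the abstract.
print(fraction >= 0.5)
```

A real comparison would also need to handle superposition, crystallographic symmetry and unpaired residues, none of which this sketch attempts.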
Affiliation(s)
- Pavel V Afonine
  - Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Dorothee Liebschner
  - Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Tristan I Croll
  - Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Hills Road, Cambridge CB2 0XY, United Kingdom
- Airlie J McCoy
  - Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Hills Road, Cambridge CB2 0XY, United Kingdom
- Robert D Oeffner
  - Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Hills Road, Cambridge CB2 0XY, United Kingdom
- Billy K Poon
  - Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Randy J Read
  - Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Hills Road, Cambridge CB2 0XY, United Kingdom
- Paul D Adams
  - Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
9
Kirchebner J, Lau S, Machetanz L. Offenders and non-offenders with schizophrenia spectrum disorders: Do they really differ in known risk factors for aggression? Front Psychiatry 2023; 14:1145644. [PMID: 37139319 PMCID: PMC10150953 DOI: 10.3389/fpsyt.2023.1145644] [Received: 01/16/2023] [Accepted: 03/17/2023] [Indexed: 05/05/2023]
Abstract
Introduction: Individuals with schizophrenia spectrum disorders (SSD) have an elevated risk for aggressive behavior, and several factors contributing to this risk have been identified, e.g. comorbid substance use disorders. From this knowledge, it could be inferred that offender patients show a higher expression of said risk factors than non-offender patients. Yet, there is a lack of comparative studies between those two groups, and findings gathered from one of the two are not directly applicable to the other due to numerous structural differences. The aim of this study therefore was to identify key differences between offender patients and non-offender patients regarding aggressive behavior through application of supervised machine learning, and to quantify the performance of the model.
Methods: For this purpose, we applied seven different machine learning (ML) algorithms to a dataset comprising 370 offender patients and a comparison group of 370 non-offender patients, both with a schizophrenia spectrum disorder.
Results: With a balanced accuracy of 79.9%, an AUC of 0.87, a sensitivity of 77.3% and a specificity of 82.5%, gradient boosting emerged as the best-performing model and was able to correctly identify offender patients in over 4/5 of the cases. Out of 69 possible predictor variables, the following emerged as the ones with the most indicative power in distinguishing between the two groups: olanzapine equivalent dose at the time of discharge from the referenced hospitalization, failures during temporary leave, being born outside of Switzerland, lack of compulsory school graduation, out- and inpatient treatment(s) prior to the referenced hospitalization, physical or neurological illness as well as medication compliance.
Discussion: Interestingly, factors related both to psychopathology and to the frequency and expression of aggression itself did not yield a high indicative power in the interplay of variables, thus suggesting that while they individually contribute to aggression as a negative outcome, they are compensable through certain interventions. The findings contribute to our understanding of differences between offenders and non-offenders with SSD, showing that previously described risk factors of aggression may be counteracted through sufficient treatment and integration in the mental health care system.
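As a quick consistency check on the metrics reported in the Results: balanced accuracy is conventionally defined as the mean of sensitivity and specificity. The abstract does not state which definition the authors used, so the sketch below only assumes the conventional one; the function name is illustrative.

```python
def balanced_accuracy(sensitivity, specificity):
    """Mean of sensitivity and specificity (the conventional definition)."""
    return (sensitivity + specificity) / 2


# Values taken from the abstract, in percent.
value = balanced_accuracy(77.3, 82.5)
print(round(value, 1))  # 79.9, matching the reported balanced accuracy
```

Under that assumption, the reported sensitivity and specificity reproduce the stated 79.9%.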
10
Machetanz L, Huber D, Lau S, Kirchebner J. Model Building in Forensic Psychiatry: A Machine Learning Approach to Screening Offender Patients with SSD. Diagnostics (Basel) 2022; 12:2509. [PMID: 36292198 PMCID: PMC9600890 DOI: 10.3390/diagnostics12102509] [Received: 08/29/2022] [Revised: 09/28/2022] [Accepted: 10/13/2022] [Indexed: 11/16/2022]
Abstract
Today’s extensive availability of medical data enables the development of predictive models, but this requires suitable statistical methods, such as machine learning (ML). Especially in forensic psychiatry, a complex and cost-intensive field with risk assessments and predictions of treatment outcomes as central tasks, there is a need for such predictive tools, for example, to anticipate complex treatment courses and to be able to offer appropriate therapy on an individualized basis. This study aimed to develop a first basic model for the anticipation of adverse treatment courses based on prior compulsory admission and/or conviction as simple and easily objectifiable parameters in offender patients with a schizophrenia spectrum disorder (SSD). With a balanced accuracy of 67% and an AUC of 0.72, gradient boosting proved to be the optimal ML algorithm. Antisocial behavior, physical violence against staff, rule breaking, hyperactivity, delusions of grandeur, fewer feelings of guilt, the need for compulsory isolation, cannabis abuse/dependence, a higher dose of antipsychotics (measured by the olanzapine half-life) and an unfavorable legal prognosis emerged as the ten most influential variables out of a dataset with 209 parameters. Our findings could demonstrate an example of the use of ML in the development of an easy-to-use predictive model based on few objectifiable factors.
11
Alharbi E, Bond P, Calinescu R, Cowtan K. Predicting the performance of automated crystallographic model-building pipelines. Acta Crystallogr D Struct Biol 2021; 77:1591-1601. [PMID: 34866614 PMCID: PMC8647178 DOI: 10.1107/s2059798321010500] [Received: 06/23/2021] [Accepted: 10/10/2021] [Indexed: 12/02/2022]
Abstract
Proteins are macromolecules that perform essential biological functions which depend on their three-dimensional structure. Determining this structure involves complex laboratory and computational work. For the computational work, multiple software pipelines have been developed to build models of the protein structure from crystallographic data. Each of these pipelines performs differently depending on the characteristics of the electron-density map received as input. Identifying the best pipeline to use for a protein structure is difficult, as the pipeline performance differs significantly from one protein structure to another. As such, researchers often select pipelines that do not produce the best possible protein models from the available data. Here, a software tool is introduced which predicts key quality measures of the protein structures that a range of pipelines would generate if supplied with a given crystallographic data set. These measures are crystallographic quality-of-fit indicators based on included and withheld observations, and structure completeness. Extensive experiments carried out using over 2500 data sets show that the tool yields accurate predictions for both experimental phasing data sets (at resolutions between 1.2 and 4.0 Å) and molecular-replacement data sets (at resolutions between 1.0 and 3.5 Å). The tool can therefore provide a recommendation to the user concerning the pipelines that should be run in order to proceed most efficiently to a depositable model.
Affiliation(s)
- Emad Alharbi
  - Department of Computer Science, University of York, Heslington, York YO10 5GH, United Kingdom
  - Department of Information Technology, University of Tabuk, Tabuk, Saudi Arabia
- Paul Bond
  - Department of Chemistry, University of York, Heslington, York YO10 5DD, United Kingdom
- Radu Calinescu
  - Department of Computer Science, University of York, Heslington, York YO10 5GH, United Kingdom
- Kevin Cowtan
  - Department of Chemistry, University of York, Heslington, York YO10 5DD, United Kingdom
12
Yang Y, Li D. Medical Data Feature Learning Based on Probability and Depth Learning Mining: Model Development and Validation. JMIR Med Inform 2021; 9:e19055. [PMID: 33830067 PMCID: PMC8063096 DOI: 10.2196/19055] [Received: 04/02/2020] [Revised: 05/08/2020] [Accepted: 05/08/2020] [Indexed: 11/13/2022]
Abstract
BACKGROUND: Big data technology provides unlimited potential for efficient storage, processing, querying, and analysis of medical data. Technologies such as deep learning and machine learning simulate human thinking, assist physicians in diagnosis and treatment, provide personalized health care services, and promote the use of intelligent processes in health care applications.
OBJECTIVE: The aim of this paper was to analyze health care data and develop an intelligent application to predict the number of hospital outpatient visits for mass health impact and analyze the characteristics of health care big data. Designing a corresponding data feature learning model will help patients receive more effective treatment and will enable rational use of medical resources.
METHODS: A cascaded depth model was successfully implemented by constructing a cascaded depth learning framework and by studying and analyzing the specific feature transformation, feature selection, and classifier algorithm used in the framework. To develop a medical data feature learning model based on probabilistic and deep learning mining, we mined information from medical big data and developed an intelligent application that studies the differences in medical data for disease risk assessment and enables feature learning of the related multimodal data. Thus, we propose a cascaded data feature learning model.
RESULTS: The depth model created in this paper is more suitable for forecasting daily outpatient volumes than weekly or monthly volumes. We believe that there are two reasons for this: on the one hand, the training data set in the daily outpatient volume forecast model is larger, so the training parameters of the model more closely fit the actual data relationship. On the other hand, the weekly and monthly outpatient volume is the cumulative daily outpatient volume; therefore, errors caused by the prediction will gradually accumulate, and the greater the interval, the lower the prediction accuracy.
CONCLUSIONS: Several data feature learning models are proposed to extract the relationships between outpatient volume data and obtain the precise predictive value of the outpatient volume, which is very helpful for the rational allocation of medical resources and the promotion of intelligent medical treatment.
Affiliation(s)
- Yuanlin Yang
  - Department of Logistics Management, West China Second University Hospital, Sichuan University, Chengdu, China; Key Laboratory of Obstetric and Gynecologic and Pediatric Disease and Birth Defects of Ministry of Education, Sichuan University, Chengdu, China
- Dehua Li
  - Key Laboratory of Obstetric and Gynecologic and Pediatric Disease and Birth Defects of Ministry of Education, Sichuan University, Chengdu, China; Quality Assessment Office, Nursing Department, West China Second University Hospital, Sichuan University, Chengdu, China
|
13
|
Terwilliger TC, Sobolev OV, Afonine PV, Adams PD, Ho CM, Li X, Zhou ZH. Protein identification from electron cryomicroscopy maps by automated model building and side-chain matching. Acta Crystallogr D Struct Biol 2021; 77:457-462. [PMID: 33825706 PMCID: PMC8025881 DOI: 10.1107/s2059798321001765] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/03/2020] [Accepted: 02/12/2021] [Indexed: 11/10/2022] Open
Abstract
Using single-particle electron cryo-microscopy (cryo-EM), it is possible to obtain multiple reconstructions showing the 3D structures of proteins imaged as a mixture. Here, it is shown that automatic map interpretation based on such reconstructions can be used to create atomic models of proteins as well as to match the proteins to the correct sequences and thereby to identify them. This procedure was tested using two proteins previously identified from a mixture at resolutions of 3.2 Å, as well as using 91 deposited maps with resolutions between 2 and 4.5 Å. The approach is found to be highly effective for maps obtained at resolutions of 3.5 Å and better, and to have some utility at resolutions as low as 4 Å.
Affiliation(s)
- Thomas C. Terwilliger
  - New Mexico Consortium, Los Alamos, NM 87544, USA
  - Bioscience Division, Los Alamos National Laboratory, Mail Stop M888, Los Alamos, NM 87545, USA
- Oleg V. Sobolev
  - Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Pavel V. Afonine
  - Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Paul D. Adams
  - Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
  - Department of Bioengineering, University of California Berkeley, Berkeley, California, USA
- Chi-Min Ho
  - The Molecular Biology Institute, University of California, Los Angeles, CA 90095, USA
  - Department of Microbiology, Immunology and Molecular Genetics, University of California, Los Angeles, CA 90095, USA
  - California NanoSystems Institute, University of California, Los Angeles, CA 90095, USA
  - Department of Microbiology and Immunology, Vagelos College of Physicians and Surgeons, Columbia University, New York, USA
- Xiaorun Li
  - California NanoSystems Institute, University of California, Los Angeles, CA 90095, USA
  - Hefei National Laboratory for Physical Sciences at Microscale, University of Science and Technology of China, Hefei, Anhui 230026, People’s Republic of China
- Z. Hong Zhou
  - The Molecular Biology Institute, University of California, Los Angeles, CA 90095, USA
  - Department of Microbiology, Immunology and Molecular Genetics, University of California, Los Angeles, CA 90095, USA
  - California NanoSystems Institute, University of California, Los Angeles, CA 90095, USA
|
14
|
Abstract
When building atomic models into weak and/or low-resolution density, a common strategy is to restrain their conformation to that of a higher resolution model of the same or similar sequence. When doing so, it is important to avoid over-restraining to the reference model in the face of disagreement with the experimental data. The most common strategy for this is the use of 'top-out' potentials. These act like simple harmonic restraints within a defined range, but gradually weaken when the deviation between the model and reference grows beyond that range. In each current implementation the rate at which the potential flattens at large deviations follows a fixed form, although the form chosen varies among implementations. A restraint potential with a tuneable rate of flattening would provide greater flexibility to encode the confidence in any given restraint. Here, two new such potentials are described: a Cartesian distance restraint derived from a recent generalization of common loss functions and a periodic torsion restraint based on a renormalization of the von Mises distribution. Further, their implementation as user-adjustable/switchable restraints in ISOLDE is described and their use in some real-world examples is demonstrated.
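The behaviour the abstract attributes to top-out potentials (harmonic near the reference, flattening at large deviations) can be illustrated with a short sketch. The exponential form below is only one fixed-form example chosen for illustration; it is not one of the tuneable potentials the paper introduces, and the function names and the `weight` and `limit` parameters are assumptions made for this example.

```python
import math


def harmonic(delta, weight=1.0):
    """Plain harmonic restraint energy: grows without bound as delta grows."""
    return weight * delta ** 2


def top_out(delta, weight=1.0, limit=1.0):
    """Illustrative top-out energy (assumed exponential form).

    For |delta| much smaller than `limit` this is close to weight * delta**2;
    for large |delta| it levels off at weight * limit**2.
    """
    plateau = weight * limit ** 2
    return plateau * (1.0 - math.exp(-weight * delta ** 2 / plateau))


if __name__ == "__main__":
    for delta in (0.1, 0.5, 1.0, 2.0, 5.0):
        print(f"delta={delta:4.1f}  harmonic={harmonic(delta):8.3f}  "
              f"top_out={top_out(delta):6.3f}")
```

Because the energy levels off, the restoring force on a strongly deviating atom fades rather than growing, which is what lets the experimental data override the reference model where the two disagree.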
Affiliation(s)
- Tristan Ian Croll
  - Cambridge Institute for Medical Research, Keith Peters Building, Cambridge CB2 0XY, United Kingdom
- Randy J. Read
  - Cambridge Institute for Medical Research, Keith Peters Building, Cambridge CB2 0XY, United Kingdom
|
15
|
Chojnowski G, Sobolev E, Heuser P, Lamzin VS. The accuracy of protein models automatically built into cryo-EM maps with ARP/wARP. Acta Crystallogr D Struct Biol 2021; 77:142-150. [PMID: 33559604 PMCID: PMC7869898 DOI: 10.1107/s2059798320016332] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 09/07/2020] [Accepted: 12/16/2020] [Indexed: 11/26/2022] Open
Abstract
A new module of the ARP/wARP suite for automated model building into cryo-EM maps is presented. Recent developments in cryogenic electron microscopy (cryo-EM) have enabled structural studies of large macromolecular complexes at resolutions previously only attainable using macromolecular crystallography. Although a number of methods can already assist in de novo building of models into high-resolution cryo-EM maps, automated and reliable map interpretation remains a challenge. Presented here is a systematic study of the accuracy of models built into cryo-EM maps using ARP/wARP. It is demonstrated that the local resolution is a good indicator of map interpretability, and for the majority of the test cases ARP/wARP correctly builds 90% of main-chain fragments in regions where the local resolution is 4.0 Å or better. It is also demonstrated that the coordinate accuracy for models built into cryo-EM maps is comparable to that of X-ray crystallographic models at similar local cryo-EM and crystallographic resolutions. The model accuracy also correlates with the refined atomic displacement parameters.
Affiliation(s)
- Grzegorz Chojnowski
  - European Molecular Biology Laboratory, c/o DESY, Notkestrasse 85, 22607 Hamburg, Germany
- Egor Sobolev
  - European Molecular Biology Laboratory, c/o DESY, Notkestrasse 85, 22607 Hamburg, Germany
- Philipp Heuser
  - European Molecular Biology Laboratory, c/o DESY, Notkestrasse 85, 22607 Hamburg, Germany
- Victor S Lamzin
  - European Molecular Biology Laboratory, c/o DESY, Notkestrasse 85, 22607 Hamburg, Germany
|
16
|
Todd H, Emsley P. Development and assessment of CootVR, a virtual reality computer program for model building. Acta Crystallogr D Struct Biol 2021; 77:19-27. [PMID: 33404522 PMCID: PMC7787110 DOI: 10.1107/s2059798320013625] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/11/2020] [Accepted: 10/12/2020] [Indexed: 12/03/2022]
Abstract
Virtual reality-specific tools for model building are possible, and can provide an order-of-magnitude speedup over mouse-and-keyboard tools in certain situations. Biological macromolecules have complex three-dimensional shapes that are experimentally examined using X-ray crystallography and electron cryo-microscopy. Interpreting the data that these methods yield involves building 3D atomic models. With almost every data set, some portion of the time put into creating these models must be spent manually modifying the model in order to make it consistent with the data; this is difficult and time-consuming, in part because the data are ‘blurry’ in three dimensions. This paper describes the design and assessment of CootVR (available at http://hamishtodd1.github.io/cvr), a prototype computer program for performing this task in virtual reality, allowing structural biologists to build molecular models into cryo-EM and crystallographic data using their hands. CootVR was timed against Coot for a very specific model-building task, and was found to give an order-of-magnitude speedup for this task. A from-scratch model build using CootVR was also attempted; from this experience it is concluded that currently CootVR does not give a speedup over Coot overall.
Affiliation(s)
- Hamish Todd
  - Huawei Research and Development, Cambridge, United Kingdom
- Paul Emsley
  - MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
|
17
|
Xu J, Nie H, He J, Wang X, Liao K, Tu L, Xiong Z. Using Machine Learning Modeling to Explore New Immune-Related Prognostic Markers in Non-Small Cell Lung Cancer. Front Oncol 2020; 10:550002. [PMID: 33215029 PMCID: PMC7665579 DOI: 10.3389/fonc.2020.550002] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 04/08/2020] [Accepted: 09/30/2020] [Indexed: 02/06/2023] Open
Abstract
OBJECTIVE To find new immune-related prognostic markers for non-small cell lung cancer (NSCLC). METHODS We found that GSE14814 in the GEO database is related to NSCLC. The non-small cell lung cancer observation (NSCLC-OBS) group was evaluated for immunity and divided into high and low groups for differential gene screening according to the score of immune evaluation. A single factor COX regression analysis was performed to select the genes related to prognosis. A prognostic model was constructed by machine learning and tested for its prognostic efficacy. A chip-in-chip non-small cell lung cancer chemotherapy (NSCLC-ACT) sample was used as a validation dataset for the same validation and prognostic analysis of the model. The coexpression genes of hub genes were obtained by Pearson analysis and subjected to gene enrichment, function enrichment and protein interaction analysis. The tumor samples of patients with different clinical stages were examined by immunohistochemistry, and the expression difference of prognostic genes in tumor tissues of patients with different stages was compared. RESULTS By screening, we found that LYN, C3, COPG2IT1, HLA.DQA1, and TNFRSF17 are closely related to prognosis. After machine learning, we constructed the immune prognosis model from these 5 genes, and the model AUC values were greater than 0.9 at three time periods of 1, 3, and 5 years; the total survival period of the low-risk group was significantly better than that of the high-risk group. The results of prognosis analysis in ACT samples were consistent with those of the OBS group. The coexpression genes are mainly involved in the B cell receptor signaling pathway and are mainly enriched in apoptotic cell clearance. Prognostic key genes are highly correlated with the PDCD1, PDCD1LG2, LAG3, and CTLA4 immune checkpoints. The immunohistochemical results showed that the expression of COPG2IT1 and HLA.DQA1 in stage III increased significantly and the expression of LYN, C3, and TNFRSF17 in stage III decreased significantly compared with that of stage I. The experimental results are consistent with the previous analysis. CONCLUSION LYN, C3, COPG2IT1, HLA.DQA1, and TNFRSF17 may be new immune markers to judge the prognosis of patients with non-small cell lung cancer.
Affiliation(s)
- Jiasheng Xu
  - Department of Pathology, The First Affiliated Hospital of Nanchang University, Nanchang, China
- Han Nie
  - Department of Vascular Surgery, The Second Affiliated Hospital of Nanchang University, Nanchang, China
- Jiarui He
  - Department of Vascular Surgery, The Second Affiliated Hospital of Nanchang University, Nanchang, China
- Xinlu Wang
  - Department of Vascular Surgery, The Second Affiliated Hospital of Nanchang University, Nanchang, China
- Kaili Liao
  - Department of Clinical Laboratory, The Second Affiliated Hospital of Nanchang University, Nanchang, China
- Luxia Tu
  - Department of Pathology, The First Affiliated Hospital of Nanchang University, Nanchang, China
- Zhenfang Xiong
  - Department of Pathology, The First Affiliated Hospital of Nanchang University, Nanchang, China
|
18
|
Rochira W, Agirre J. Iris: Interactive all-in-one graphical validation of 3D protein model iterations. Protein Sci 2020; 30:93-107. [PMID: 32964594 PMCID: PMC7737763 DOI: 10.1002/pro.3955] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/04/2020] [Revised: 09/15/2020] [Accepted: 09/15/2020] [Indexed: 11/12/2022]
Abstract
Iris validation is a Python package created to represent comprehensive per‐residue validation metrics for entire protein chains in a compact, readable and interactive view. These metrics can either be calculated by Iris, or by a third‐party program such as MolProbity. We show that those parts of a protein model requiring attention may generate ripples across the metrics on the diagram, immediately catching the modeler's attention. Iris can run as a standalone tool, or be plugged into existing structural biology software to display per‐chain model quality at a glance, with a particular emphasis on evaluating incremental changes resulting from the iterative nature of model building and refinement. Finally, the integration of Iris into the CCP4i2 graphical user interface is provided as a showcase of its pluggable design.
Affiliation(s)
- William Rochira
  - Department of Chemistry, York Structural Biology Laboratory, University of York, York, UK
- Jon Agirre
  - Department of Chemistry, York Structural Biology Laboratory, University of York, York, UK
|
19
|
Alharbi E, Calinescu R, Cowtan K. Pairwise running of automated crystallographic model-building pipelines. Acta Crystallogr D Struct Biol 2020; 76:814-823. [PMID: 32876057 PMCID: PMC7466752 DOI: 10.1107/s2059798320010542] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/11/2020] [Accepted: 07/31/2020] [Indexed: 11/11/2022] Open
Abstract
For the last two decades, researchers have worked independently to automate protein model building, and four widely used software pipelines have been developed for this purpose: ARP/wARP, Buccaneer, Phenix AutoBuild and SHELXE. Here, the usefulness of combining these pipelines to improve the built protein structures by running them in pairwise combinations is examined. The results show that integrating these pipelines can lead to significant improvements in structure completeness and Rfree. In particular, running Phenix AutoBuild after Buccaneer improved structure completeness for 29% and 75% of the data sets that were examined at the original resolution and at a simulated lower resolution, respectively, compared with running Phenix AutoBuild on its own. In contrast, Phenix AutoBuild alone produced better structure completeness than the two pipelines combined for only 7% and 3% of these data sets.
Affiliation(s)
- Emad Alharbi
  - Department of Computer Science, University of York, Heslington, York YO10 5GH, United Kingdom
  - Department of Information Technology, University of Tabuk, Tabuk, Saudi Arabia
- Radu Calinescu
  - Department of Computer Science, University of York, Heslington, York YO10 5GH, United Kingdom
- Kevin Cowtan
  - Department of Chemistry, University of York, Heslington, York YO10 5DD, United Kingdom
|
20
|
Melillo N, Grandoni S, Cesari N, Brogin G, Puccini P, Magni P. Inter-compound and Intra-compound Global Sensitivity Analysis of a Physiological Model for Pulmonary Absorption of Inhaled Compounds. AAPS J 2020; 22:116. [PMID: 32862303 PMCID: PMC7456635 DOI: 10.1208/s12248-020-00499-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/26/2020] [Accepted: 08/06/2020] [Indexed: 12/25/2022] Open
Abstract
In recent years, global sensitivity analysis (GSA) has gained interest in physiologically based pharmacokinetics (PBPK) modelling and simulation from pharmaceutical industry, regulatory authorities, and academia. With the case study of an in-house PBPK model for inhaled compounds in rats, the aim of this work is to show how GSA can contribute in PBPK model development and daily use. We identified two types of GSA that differ in the aims and, thus, in the parameter variability: inter-compound and intra-compound GSA. The inter-compound GSA aims to understand which are the parameters that mostly influence the variability of the metrics of interest in the whole space of the drugs' properties, and thus, it is useful during the model development. On the other hand, the intra-compound GSA aims to highlight how much the uncertainty associated with the parameters of a given drug impacts the uncertainty in the model prediction and so, it is useful during routine PBPK use. In this work, inter-compound GSA highlighted that dissolution- and formulation-related parameters were mostly important for the prediction of the fraction absorbed, while the permeability is the most important parameter for lung AUC and MRT. Intra-compound GSA highlighted that, for all the considered compounds, the permeability was one of the most important parameters for lung AUC, MRT and plasma MRT, while the extraction ratio and the dose for the plasma AUC. GSA is a crucial instrument for the quality assessment of model-based inference; for this reason, we suggest its use during both PBPK model development and use.
Affiliation(s)
- Nicola Melillo
  - Laboratory of Bioinformatics, Mathematical Modelling and Synthetic Biology, Department of Electrical, Computer and Biomedical Engineering, Università degli Studi di Pavia, Via Ferrata 5, I-27100, Pavia, Italy
- Silvia Grandoni
  - Laboratory of Bioinformatics, Mathematical Modelling and Synthetic Biology, Department of Electrical, Computer and Biomedical Engineering, Università degli Studi di Pavia, Via Ferrata 5, I-27100, Pavia, Italy
- Nicola Cesari
  - Pharmacokinetics, Biochemistry and Metabolism Department, Chiesi Farmaceutici S.p.A., Parma, Italy
- Giandomenico Brogin
  - Pharmacokinetics, Biochemistry and Metabolism Department, Chiesi Farmaceutici S.p.A., Parma, Italy
- Paola Puccini
  - Pharmacokinetics, Biochemistry and Metabolism Department, Chiesi Farmaceutici S.p.A., Parma, Italy
- Paolo Magni
  - Laboratory of Bioinformatics, Mathematical Modelling and Synthetic Biology, Department of Electrical, Computer and Biomedical Engineering, Università degli Studi di Pavia, Via Ferrata 5, I-27100, Pavia, Italy
|
21
|
Zhang H, Wang X, Ding R, Shen L, Gao P, Xu H, Xiu C, Zhang H, Song D, Han B. Characterization and imaging of surgical specimens of invasive breast cancer and normal breast tissues with the application of Raman spectral mapping: A feasibility study and comparison with randomized single-point detection method. Oncol Lett 2020; 20:2969-2976. [PMID: 32782614 PMCID: PMC7400922 DOI: 10.3892/ol.2020.11804] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 04/24/2019] [Accepted: 01/24/2020] [Indexed: 12/24/2022] Open
Abstract
A mapping technique was used in the present study to explore the biological and imaging characteristics of invasive breast cancer and normal breast tissues in Raman examination data and construct a diagnostic model for breast cancer. Raman examination data reflect the biochemical or molecular characteristics of the target tissues. A total of 45 specimens from patients with breast cancer who underwent surgery and 25 adjacent normal breast tissue specimens were included in the present study. Using the specimens, a total of 53 sets of mapping data and 2,597 pieces of Raman spectral data were obtained. The collected spectra were corrected and fitted, the Raman spectra were analyzed by robust statistical methods, and a diagnostic model was constructed using the k-Nearest Neighbor (KNN) method. The KNN classification method was applied to analyze the characteristics of the mapping test application. The percentage of outliers in the mapping data for malignant and normal breast tissues was 12.7 and 6.6%, respectively. The percentage of outlier data in the conventional single-point detection data for malignant and normal breast tissues was 24.5 and 26.0%, respectively. Analysis using a t-test identified a significant difference in the number of outliers between mapping and single-point detection for malignant (t=−6.169; P<0.001) and normal breast tissues (t=−8.873; P<0.001). Based on the mapping data, the accuracy, sensitivity and specificity for breast cancer detection by the diagnostic model constructed using the KNN method was 99.56, 96.6 and 98.48%, respectively. The positive and negative predictive value of this model was 99.56 and 89.04%, respectively. The data obtained by mapping technology demonstrated improved stability and contained less outliers compared with single-point detection. The diagnostic model constructed using the mapping data demonstrated excellent diagnostic performance and good correspondence with pathological results. 
The findings of the present study demonstrated the feasibility of the application of the diagnostic model for intraoperative real-time imaging for patients with breast cancer. This study provided the foundation of Raman spectroscopy-based diagnostic imaging at the molecular level.
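The five figures quoted in this abstract (accuracy, sensitivity, specificity, positive and negative predictive value) all derive from one binary confusion matrix. A minimal sketch of those standard definitions follows; the counts used are made-up placeholders and not data from the study, and the function name is an assumption for this example.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard binary diagnostic metrics from confusion-matrix counts.

    tp/fn: diseased samples classified as positive/negative.
    tn/fp: normal samples classified as negative/positive.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true-positive rate
        "specificity": tn / (tn + fp),   # true-negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Hypothetical counts, chosen only to exercise the formulas.
example = diagnostic_metrics(tp=90, fp=20, tn=80, fn=10)

if __name__ == "__main__":
    for name, value in example.items():
        print(f"{name}: {value:.4f}")
```

Sensitivity and specificity depend only on the classifier, whereas the predictive values also depend on the proportion of malignant samples in the data set.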
Affiliation(s)
- Haipeng Zhang
  - Department of Gynaecology, The First Hospital of Jilin University, Changchun, Jilin 130021, P.R. China
- Xiaozhen Wang
  - Department of Breast Surgery, The First Hospital of Jilin University, Changchun, Jilin 130021, P.R. China
- Rongbo Ding
  - Department of Breast Surgery, The First Hospital of Jilin University, Changchun, Jilin 130021, P.R. China
- Lishengnan Shen
  - Department of Breast Surgery, The First Hospital of Jilin University, Changchun, Jilin 130021, P.R. China
- Pin Gao
  - Department of Breast Surgery, The First Hospital of Jilin University, Changchun, Jilin 130021, P.R. China
- Hui Xu
  - Department of Ophthalmology, The First Hospital of Jilin University, Changchun, Jilin 130021, P.R. China
- Caifeng Xiu
  - Department of Gynaecology, The First Hospital of Jilin University, Changchun, Jilin 130021, P.R. China
- Huanxia Zhang
  - Department of Gynaecology, The First Hospital of Jilin University, Changchun, Jilin 130021, P.R. China
- Dong Song
  - Department of Breast Surgery, The First Hospital of Jilin University, Changchun, Jilin 130021, P.R. China
- Bing Han
  - Department of Breast Surgery, The First Hospital of Jilin University, Changchun, Jilin 130021, P.R. China
|
22
|
Abstract
Manually identifying and correcting errors in protein models can be a slow process, but improvements in validation tools and automated model-building software can contribute to reducing this burden. This article presents a new correctness score that is produced by combining multiple sources of information using a neural network. The residues in 639 automatically built models were marked as correct or incorrect by comparing them with the coordinates deposited in the PDB. A number of features were also calculated for each residue using Coot, including map-to-model correlation, density values, B factors, clashes, Ramachandran scores, rotamer scores and resolution. Two neural networks were created using these features as inputs: one to predict the correctness of main-chain atoms and the other for side chains. The 639 structures were split into 511 that were used to train the neural networks and 128 that were used to test performance. The predicted correctness scores could correctly categorize 92.3% of the main-chain atoms and 87.6% of the side chains. A Coot ML Correctness script was written to display the scores in a graphical user interface as well as for the automatic pruning of chains, residues and side chains with low scores. The automatic pruning function was added to the CCP4i2 Buccaneer automated model-building pipeline, leading to significant improvements, especially for high-resolution structures.
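The 511/128 division of the 639 structures is what an 80/20 split rounded down gives. The abstract does not say how structures were assigned to each set, so the seeded shuffle below is an assumption, as are the function name and the placeholder identifiers; it only shows how such a split can be made reproducibly at the structure level.

```python
import random


def split_structures(ids, train_fraction=0.8, seed=0):
    """Shuffle structure identifiers reproducibly and split them in two.

    Splitting by structure (not by residue) keeps all residues of one
    model on the same side of the train/test boundary.
    """
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder identifiers standing in for the 639 built models.
structure_ids = [f"model_{i:03d}" for i in range(639)]
train_ids, test_ids = split_structures(structure_ids)

if __name__ == "__main__":
    print(len(train_ids), len(test_ids))
```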
Affiliation(s)
- Paul S. Bond
  - Department of Chemistry, University of York, York YO10 5DD, United Kingdom
- Keith S. Wilson
  - Department of Chemistry, University of York, York YO10 5DD, United Kingdom
- Kevin D. Cowtan
  - Department of Chemistry, University of York, York YO10 5DD, United Kingdom
|
23
|
Hoh SW, Burnley T, Cowtan K. Current approaches for automated model building into cryo-EM maps using Buccaneer with CCP-EM. Acta Crystallogr D Struct Biol 2020; 76:531-541. [PMID: 32496215 PMCID: PMC7271950 DOI: 10.1107/s2059798320005513] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 12/10/2019] [Accepted: 04/20/2020] [Indexed: 11/11/2022] Open
Abstract
This work focuses on the use of the existing protein-model-building software Buccaneer to provide structural interpretation of electron cryo-microscopy (cryo-EM) maps. Originally developed for application to X-ray crystallography, the necessary steps to optimise the usage of Buccaneer with cryo-EM maps are shown. This approach has been applied to the data sets of 208 cryo-EM maps with resolutions of better than 4 Å. The results obtained also show an evident improvement in the sequencing step when the initial reference map and model used for crystallographic cases are replaced by a cryo-EM reference. All other necessary changes to settings in Buccaneer are implemented in the model-building pipeline from within the CCP-EM interface (as of version 1.4.0).
Affiliation(s)
- Soon Wen Hoh
  - York Structural Biology Laboratory, Department of Chemistry, University of York, York YO10 5DD, United Kingdom
- Tom Burnley
  - Scientific Computing Department, Science and Technology Facilities Council, Research Complex at Harwell, Didcot OX11 0FA, United Kingdom
- Kevin Cowtan
  - York Structural Biology Laboratory, Department of Chemistry, University of York, York YO10 5DD, United Kingdom
|
24
|
Chojnowski G, Choudhury K, Heuser P, Sobolev E, Pereira J, Oezugurel U, Lamzin VS. The use of local structural similarity of distant homologues for crystallographic model building from a molecular-replacement solution. Acta Crystallogr D Struct Biol 2020; 76:248-260. [PMID: 32133989 PMCID: PMC7057216 DOI: 10.1107/s2059798320000455] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 08/09/2019] [Accepted: 01/14/2020] [Indexed: 12/18/2022] Open
Abstract
The performance of automated protein model building usually decreases with resolution, mainly owing to the lower information content of the experimental data. This calls for a more elaborate use of the available structural information about macromolecules. Here, a new method is presented that uses structural homologues to improve the quality of protein models automatically constructed using ARP/wARP. The method uses local structural similarity between deposited models and the model being built, and results in longer main-chain fragments that in turn can be more reliably docked to the protein sequence. The application of the homology-based model extension method to the example of a CFA synthase at 2.7 Å resolution resulted in a more complete model with almost all of the residues correctly built and docked to the sequence. The method was also evaluated on 1493 molecular-replacement solutions at a resolution of 4.0 Å and better that were submitted to the ARP/wARP web service for model building. A significant improvement in the completeness and sequence coverage of the built models has been observed.
Affiliation(s)
- Grzegorz Chojnowski
  - European Molecular Biology Laboratory, c/o DESY, Notkestrasse 85, 22607 Hamburg, Germany
- Koushik Choudhury
  - European Molecular Biology Laboratory, c/o DESY, Notkestrasse 85, 22607 Hamburg, Germany
- Philipp Heuser
  - European Molecular Biology Laboratory, c/o DESY, Notkestrasse 85, 22607 Hamburg, Germany
- Egor Sobolev
  - European Molecular Biology Laboratory, c/o DESY, Notkestrasse 85, 22607 Hamburg, Germany
- Joana Pereira
  - European Molecular Biology Laboratory, c/o DESY, Notkestrasse 85, 22607 Hamburg, Germany
- Umut Oezugurel
  - European Molecular Biology Laboratory, c/o DESY, Notkestrasse 85, 22607 Hamburg, Germany
- Victor S. Lamzin
  - European Molecular Biology Laboratory, c/o DESY, Notkestrasse 85, 22607 Hamburg, Germany
|
25
|
Zeng L, Ding W, Hao Q. Using cryo-electron microscopy maps for X-ray structure determination of homologues. Acta Crystallogr D Struct Biol 2020; 76:63-72. [PMID: 31909744 DOI: 10.1107/s2059798319015924] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2019] [Accepted: 11/25/2019] [Indexed: 11/10/2022]
Abstract
The combination of cryo-electron microscopy (cryo-EM) and X-ray crystallography reflects an important trend in structural biology. In a previously published study, a hybrid method for the determination of X-ray structures using initial phases provided by the corresponding parts of cryo-EM maps was presented. However, if the target structure of X-ray crystallography is not identical but homologous to the corresponding molecular model of the cryo-EM map, then the decrease in the accuracy of the starting phases makes the whole process more difficult. Here, a modified hybrid method is presented to handle such cases. The whole process includes three steps: cryo-EM map replacement, phase extension by NCS averaging and dual-space iterative model building. When the resolution gap between the cryo-EM and X-ray crystallographic data is large and the sequence identity is low, an intermediate stage of model building is necessary. Six test cases have been studied with sequence identity between the corresponding molecules in the cryo-EM and X-ray structures ranging from 34 to 52% and with sequence similarity ranging from 86 to 91%. This hybrid method consistently produced models with reasonable Rwork and Rfree values which agree well with the previously determined X-ray structures for all test cases, thus indicating the general applicability of the method for X-ray structure determination of homologues using cryo-EM maps as a starting point.
Affiliation(s)
- Lingxiao Zeng
- School of Biomedical Sciences, University of Hong Kong, 21 Sassoon Road, Hong Kong
| | - Wei Ding
- Institute of Physics, Chinese Academy of Sciences, Beijing 100190, People's Republic of China
| | - Quan Hao
- School of Biomedical Sciences, University of Hong Kong, 21 Sassoon Road, Hong Kong
| |
|
26
|
Alharbi E, Bond PS, Calinescu R, Cowtan K. Comparison of automated crystallographic model-building pipelines. Acta Crystallogr D Struct Biol 2019; 75:1119-1128. [PMID: 31793905 DOI: 10.1107/s2059798319014918] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 06/14/2019] [Accepted: 11/04/2019] [Indexed: 11/10/2022]
Abstract
A comparison of four protein model-building pipelines (ARP/wARP, Buccaneer, PHENIX AutoBuild and SHELXE) was performed using data sets from 202 experimentally phased cases, both with the data as observed and truncated to simulate lower resolutions. All pipelines were run using default parameters. Additionally, an ARP/wARP run was completed using models from Buccaneer. All pipelines achieved nearly complete protein structures and low Rwork/Rfree at resolutions between 1.2 and 1.9 Å, with PHENIX AutoBuild and ARP/wARP producing slightly lower R factors. At lower resolutions, Buccaneer leads to significantly more complete models.
Affiliation(s)
- Emad Alharbi
- Department of Computer Science, University of York, Heslington, York YO10 5GH, England
| | - Paul S Bond
- Department of Chemistry, University of York, Heslington, York YO10 5DD, England
| | - Radu Calinescu
- Department of Computer Science, University of York, Heslington, York YO10 5GH, England
| | - Kevin Cowtan
- Department of Chemistry, University of York, Heslington, York YO10 5DD, England
| |
|
27
|
Zeraatkar D, Cheung K, Milio K, Zworth M, Gupta A, Bhasin A, Bartoszko JJ, Kiflen M, Morassut RE, Noor ST, Lawson DO, Johnston BC, Bangdiwala SI, de Souza RJ. Methods for the Selection of Covariates in Nutritional Epidemiology Studies: A Meta-Epidemiological Review. Curr Dev Nutr 2019; 3:nzz104. [PMID: 31598577 PMCID: PMC6778415 DOI: 10.1093/cdn/nzz104] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 08/20/2019] [Accepted: 09/05/2019] [Indexed: 11/23/2022] Open
Abstract
BACKGROUND Observational studies provide important information about the effects of exposures that cannot be easily studied in clinical trials, such as nutritional exposures, but are subject to confounding. Investigators adjust for confounders by entering them as covariates in analytic models. OBJECTIVE The aim of this study was to evaluate the reporting and credibility of methods for selection of covariates in nutritional epidemiology studies. METHODS We sampled 150 nutritional epidemiology studies published in 2007/2008 and 2017/2018 from the top 5 high-impact nutrition and medical journals and extracted information on methods for selection of covariates. RESULTS Most studies did not report selecting covariates a priori (94.0%) or criteria for selection of covariates (63.3%). There was general inconsistency in choice of covariates, even among studies investigating similar questions. One-third of studies did not acknowledge potential for residual confounding in their discussion. CONCLUSION Studies often do not report methods for selection of covariates, do not follow available guidance for selection of covariates, and do not discuss potential for residual confounding.
Affiliation(s)
- Dena Zeraatkar
- Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, Ontario, Canada
| | - Kevin Cheung
- Department of Medicine, McMaster University, Hamilton, Ontario, Canada
| | - Kirolos Milio
- Department of Medicine, McMaster University, Hamilton, Ontario, Canada
| | - Max Zworth
- Department of Medicine, McMaster University, Hamilton, Ontario, Canada
| | - Arnav Gupta
- Department of Medicine, University of Ottawa, Ottawa, Ontario, Canada
| | - Arrti Bhasin
- Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, Ontario, Canada
| | - Jessica J Bartoszko
- Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, Ontario, Canada
| | - Michel Kiflen
- Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, Ontario, Canada
- Population Health Research Institute, McMaster University, Hamilton, Ontario, Canada
| | - Rita E Morassut
- Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada
| | - Salmi T Noor
- Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, Ontario, Canada
| | - Daeria O Lawson
- Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, Ontario, Canada
| | - Bradley C Johnston
- Department of Community Health and Epidemiology, Faculty of Medicine, Dalhousie University, Halifax, Nova Scotia, Canada
| | - Shrikant I Bangdiwala
- Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, Ontario, Canada
- Population Health Research Institute, McMaster University, Hamilton, Ontario, Canada
| | - Russell J de Souza
- Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, Ontario, Canada
- Population Health Research Institute, McMaster University, Hamilton, Ontario, Canada
| |
|
28
|
Chojnowski G, Pereira J, Lamzin VS. Sequence assignment for low-resolution modelling of protein crystal structures. Acta Crystallogr D Struct Biol 2019; 75:753-763. [PMID: 31373574 PMCID: PMC6677015 DOI: 10.1107/s2059798319009392] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 04/02/2019] [Accepted: 06/30/2019] [Indexed: 01/08/2023]
Abstract
Recent advances in automated protein model building using ARP/wARP are presented. The new methods include machine-learning-enhanced sequence assignment and loop building using a fragment database. The performance of automated model building in crystal structure determination usually decreases with the resolution of the experimental data, and may result in fragmented models and incorrect side-chain assignment. Presented here are new methods for machine-learning-based docking of main-chain fragments to the sequence and for their sequence-independent connection using a dedicated library of protein fragments. The combined use of these new methods noticeably increases sequence coverage and reduces fragmentation of the protein models automatically built with ARP/wARP.
Affiliation(s)
- Grzegorz Chojnowski
- European Molecular Biology Laboratory, c/o DESY, Notkestrasse 85, 22607 Hamburg, Germany
| | - Joana Pereira
- European Molecular Biology Laboratory, c/o DESY, Notkestrasse 85, 22607 Hamburg, Germany
| | - Victor S Lamzin
- European Molecular Biology Laboratory, c/o DESY, Notkestrasse 85, 22607 Hamburg, Germany
| |
|
29
|
van Beusekom B, Wezel N, Hekkelman ML, Perrakis A, Emsley P, Joosten RP. Building and rebuilding N-glycans in protein structure models. Acta Crystallogr D Struct Biol 2019; 75:416-425. [PMID: 30988258 PMCID: PMC6465985 DOI: 10.1107/s2059798319003875] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 01/31/2019] [Accepted: 03/20/2019] [Indexed: 01/16/2023]
Abstract
Carbohydrates are automatically built and rebuilt using Coot in the PDB-REDO pipeline. N-Glycosylation is one of the most common post-translational modifications and is implicated in, for example, protein folding and interaction with ligands and receptors. N-Glycosylation trees are complex structures of linked carbohydrate residues attached to asparagine residues. While carbohydrates are typically modeled in protein structures, they are often incomplete or have the wrong chemistry. Here, new tools are presented to automatically rebuild existing glycosylation trees, to extend them where possible, and to add new glycosylation trees if they are missing from the model. The method has been incorporated in the PDB-REDO pipeline and has been applied to build or rebuild 16 452 carbohydrate residues in 11 651 glycosylation trees in 4498 structure models, and is also available from the PDB-REDO web server. With better modeling of N-glycosylation, the biological function of this important modification can be better and more easily understood.
Affiliation(s)
- Bart van Beusekom
- Department of Biochemistry, The Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, The Netherlands
| | - Natasja Wezel
- Department of Biochemistry, The Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, The Netherlands
| | - Maarten L Hekkelman
- Department of Biochemistry, The Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, The Netherlands
| | - Anastassis Perrakis
- Department of Biochemistry, The Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, The Netherlands
| | - Paul Emsley
- MRC Laboratory for Molecular Biology, Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge CB2 0QH, England
| | - Robbie P Joosten
- Department of Biochemistry, The Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, The Netherlands
| |
|
30
|
Ben-Aharon Z, Levitt M, Kalisman N. Automatic Inference of Sequence from Low-Resolution Crystallographic Data. Structure 2018; 26:1546-1554.e2. [PMID: 30293812 DOI: 10.1016/j.str.2018.08.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2017] [Revised: 04/18/2018] [Accepted: 08/23/2018] [Indexed: 11/17/2022]
Abstract
At resolutions worse than 3.5 Å, the electron density is weak or nonexistent at the locations of the side chains. Consequently, the assignment of the protein sequences to their correct positions along the backbone is a difficult problem. In this work, we propose a fully automated computational approach to assign sequence at low resolution. It is based on our surprising observation that standard reciprocal-space indicators, such as the initial unrefined R value, are sensitive enough to detect an erroneous sequence assignment of even a single backbone position. Our approach correctly determines the amino acid type for 15%, 13%, and 9% of the backbone positions in crystallographic datasets with resolutions of 4.0 Å, 4.5 Å, and 5.0 Å, respectively. We implement these findings in an application for threading a sequence onto a backbone structure. For the three resolution ranges, the application threads 83%, 81%, and 64% of the sequences exactly as in the deposited PDB structures.
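For readers unfamiliar with the reciprocal-space indicator mentioned in this abstract, the conventional crystallographic R value compares observed and calculated structure-factor amplitudes as R = Σ| |F_obs| − |F_calc| | / Σ|F_obs|. The following is a minimal sketch of that standard definition only, not the authors' implementation. The function name `r_value`, the assumption that both amplitude lists are already on a common scale, and the toy amplitudes for the two candidate assignments are all hypothetical.

```python
def r_value(f_obs, f_calc):
    """Conventional R value: sum(| |Fo| - |Fc| |) / sum(|Fo|).

    Assumes (hypothetically) that f_obs and f_calc are amplitude lists
    for the same reflections, already placed on a common scale.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    return numerator / denominator


# Toy, made-up amplitudes: one observed set and two candidate models.
f_obs = [120.0, 85.0, 40.0, 15.0]
candidates = {
    "assignment_a": [118.0, 88.0, 37.0, 17.0],
    "assignment_b": [100.0, 70.0, 55.0, 30.0],
}

# Score every candidate and keep the one with the lowest R value.
scores = {name: r_value(f_obs, f_calc) for name, f_calc in candidates.items()}
best = min(scores, key=scores.get)
```

A lower R value indicates better agreement between the model and the data, which is why a comparison like this can in principle rank alternative assignments.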
Affiliation(s)
- Ziv Ben-Aharon
- Department of Biological Chemistry, The Hebrew University of Jerusalem, Jerusalem 91904, Israel
| | - Michael Levitt
- Department of Structural Biology, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Nir Kalisman
- Department of Biological Chemistry, The Hebrew University of Jerusalem, Jerusalem 91904, Israel.
| |
|
31
|
|
32
|
Abstract
X-ray crystallography and cryo-electron microscopy (cryo-EM) are complementary techniques for structure determination. Crystallography usually reveals more detailed information, while cryo-EM is an extremely useful technique for studying large-sized macromolecules. As the gap between the resolution of crystallography and cryo-EM data narrows, the cryo-EM map of a macromolecule could serve as an initial model to solve the phase problem of crystal diffraction for high-resolution structure determination. FSEARCH is a procedure to utilize the low-resolution molecular shape for crystallographic phasing. The IPCAS (Iterative Protein Crystal structure Automatic Solution) pipeline is an automatic direct-methods-aided dual-space iterative phasing and model-building procedure. When only an electron-density map is available as the starting point, IPCAS is capable of generating a completed model from the phases of the input map automatically, without the requirement of an initial model. In this study, a hybrid method integrating X-ray crystallography with cryo-EM to help with structure determination is presented. With a cryo-EM map as the starting point, the workflow of the method involves three steps. (1) Cryo-EM map replacement: FSEARCH is utilized to find the correct translation and orientation of the cryo-EM map in the crystallographic unit cell and generates the initial low-resolution map. (2) Phase extension: the phases calculated from the correctly placed cryo-EM map are extended to high-resolution X-ray data by non-crystallographic symmetry averaging with phenix.resolve. (3) Model building: IPCAS is used to generate an initial model using the phase-extended map and perform model completion by iteration. Four cases (the lowest cryo-EM map resolution being 6.9 Å) have been tested for the general applicability of the hybrid method, and almost complete models have been generated for all test cases with reasonable Rwork/Rfree. The hybrid method therefore provides an automated tool for X-ray structure determination using a cryo-EM map as the starting point.
Affiliation(s)
- Lingxiao Zeng
- School of Biomedical Sciences, University of Hong Kong, 21 Sassoon Road, Hong Kong
| | - Wei Ding
- Institute of Physics, Chinese Academy of Sciences, Beijing 100190, People’s Republic of China
| | - Quan Hao
- School of Biomedical Sciences, University of Hong Kong, 21 Sassoon Road, Hong Kong
- Institute of Physics, Chinese Academy of Sciences, Beijing 100190, People’s Republic of China
| |
|
33
|
Croll TI. ISOLDE: a physically realistic environment for model building into low-resolution electron-density maps. Acta Crystallogr D Struct Biol 2018; 74:519-530. [PMID: 29872003 PMCID: PMC6096486 DOI: 10.1107/s2059798318002425] [Citation(s) in RCA: 812] [Impact Index Per Article: 135.3] [Received: 11/11/2017] [Accepted: 02/09/2018] [Indexed: 01/19/2023] Open
Abstract
This paper introduces ISOLDE, a new software package designed to provide an intuitive environment for high-fidelity interactive remodelling/refinement of macromolecular models into electron-density maps. ISOLDE combines interactive molecular-dynamics flexible fitting with modern molecular-graphics visualization and established structural biology libraries to provide an immersive interface wherein the model constantly acts to maintain physically realistic conformations as the user interacts with it by directly tugging atoms with a mouse or haptic interface or applying/removing restraints. In addition, common validation tasks are accelerated and visualized in real time. Using the recently described 3.8 Å resolution cryo-EM structure of the eukaryotic minichromosome maintenance (MCM) helicase complex as a case study, it is demonstrated how ISOLDE can be used alongside other modern refinement tools to avoid common pitfalls of low-resolution modelling and improve the quality of the final model. A detailed analysis of changes between the initial and final model provides a somewhat sobering insight into the dangers of relying on a small number of validation metrics to judge the quality of a low-resolution model.
Affiliation(s)
- Tristan Ian Croll
- Cambridge Institute for Medical Research, University of Cambridge, Wellcome Trust/MRC Building, Cambridge CB2 0XY, England
| |
|
34
|
Ryoo JH, Wang C, Swearer SM, Hull M, Shi D. Longitudinal Model Building Using Latent Transition Analysis: An Example Using School Bullying Data. Front Psychol 2018; 9:675. [PMID: 29867652 PMCID: PMC5953336 DOI: 10.3389/fpsyg.2018.00675] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Received: 05/20/2017] [Accepted: 04/19/2018] [Indexed: 11/22/2022] Open
Abstract
Applications of latent transition analysis (LTA) have emerged since the early 1990s, with numerous scientific findings being published in many areas, including social and behavioral sciences, education, and public health. Although LTA is effective as a statistical analytic tool for a person-centered model using longitudinal data, model building in LTA has often been subjective and confusing for applied researchers. To fill this gap in the literature, we review the components of LTA, recommend a framework of fitting LTA, and summarize what acceptable model evaluation tools should be used in practice. The proposed framework of fitting LTA consists of six steps depicted in Figure 1 from step 0 (exploring data) to step 5 (fitting distal variables). We also illustrate the framework of fitting LTA with data on concerns about school bullying from a sample of 1,180 students ranging from 5th to 9th grade (mean age = 12.2 years, SD = 1.29 years at Time 1) over three semesters. We identified four groups of students with distinct patterns of bullying concerns, and found that their concerns about bullying decreased and narrowed to specific concerns about rumors, gossip, and social exclusion over time. The data and command (syntax) files needed for reproducing the results using SAS PROC LCA and PROC LTA (Version 1.3.2) (2015) and Mplus 7.4 (Muthén and Muthén, 1998–2015) are provided as online supplementary materials.
Affiliation(s)
- Ji Hoon Ryoo
- Department of Educational Leadership, Foundations, and Policy, University of Virginia, Charlottesville, VA, United States
| | - Cixin Wang
- Department of Counseling, Higher Education, and Special Education, University of Maryland, College Park, College Park, MD, United States
| | - Susan M Swearer
- Department of Educational Psychology, University of Nebraska-Lincoln, Lincoln, NE, United States
| | - Michael Hull
- Department of Educational Leadership, Foundations, and Policy, University of Virginia, Charlottesville, VA, United States
| | - Dingjing Shi
- Department of Psychology, University of Virginia, Charlottesville, VA, United States
| |
|
35
|
Sazzed S, Song J, Kovacs JA, Wriggers W, Auer M, He J. Tracing Actin Filament Bundles in Three-Dimensional Electron Tomography Density Maps of Hair Cell Stereocilia. Molecules 2018; 23:E882. [PMID: 29641472 DOI: 10.3390/molecules23040882] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 12/23/2017] [Revised: 03/14/2018] [Accepted: 03/22/2018] [Indexed: 12/20/2022] Open
Abstract
Cryo-electron tomography (cryo-ET) is a powerful method of visualizing the three-dimensional organization of supramolecular complexes, such as the cytoskeleton, in their native cell and tissue contexts. Due to its minimal electron dose and reconstruction artifacts arising from the missing wedge during data collection, cryo-ET typically results in noisy density maps that display anisotropic XY versus Z resolution. Molecular crowding further exacerbates the challenge of automatically detecting supramolecular complexes, such as the actin bundle in hair cell stereocilia. Stereocilia are pivotal to the mechanoelectrical transduction process in inner ear sensory epithelial hair cells. Given the complexity and dense arrangement of actin bundles, traditional approaches to filament detection and tracing have failed in these cases. In this study, we introduce BundleTrac, an effective method to trace hundreds of filaments in a bundle. A comparison between BundleTrac and manually tracing the actin filaments in a stereocilium showed that BundleTrac accurately built 326 of 330 filaments (98.8%), with an overall cross-distance of 1.3 voxels for the 330 filaments. BundleTrac is an effective semi-automatic modeling approach in which a seed point is provided for each filament and the rest of the filament is computationally identified. We also demonstrate the potential of a denoising method that uses a polynomial regression to address the resolution and high-noise anisotropic environment of the density map.
|
36
|
Kim M, Hsu HY, Kwok OM, Seo S. The Optimal Starting Model to Search for the Accurate Growth Trajectory in Latent Growth Models. Front Psychol 2018; 9:349. [PMID: 29636712 PMCID: PMC5880923 DOI: 10.3389/fpsyg.2018.00349] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/16/2017] [Accepted: 03/02/2018] [Indexed: 11/13/2022] Open
Abstract
This simulation study aims to propose an optimal starting model to search for the accurate growth trajectory in Latent Growth Models (LGM). We examine the performance of four different starting models in terms of the complexity of the mean and within-subject variance-covariance (V-CV) structures when there are time-invariant covariates embedded in the population models. Results showed that the model search starting with the fully saturated model (i.e., the most complex mean and within-subject V-CV model) best recovers the true growth trajectory in simulations. Specifically, the fully saturated starting model using ΔBIC and ΔAIC performed best (over 95%) and is recommended for researchers. An illustration of the proposed method is given using an empirical secondary dataset. Implications of the findings and limitations are discussed.
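As background for the ΔAIC and ΔBIC criteria named in this abstract, the sketch below computes the two information criteria from their standard definitions (AIC = 2k − 2 ln L, BIC = k ln n − 2 ln L) and takes their differences between two nested models. It illustrates the general technique only, not the study's actual search procedure. The function names, the log-likelihoods, the parameter counts, and the sample size are hypothetical values chosen for illustration.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical fits: a fully saturated model and a reduced model.
n_obs = 300
saturated = {"log_likelihood": -1200.0, "n_params": 12}
reduced = {"log_likelihood": -1203.0, "n_params": 8}

# Differences are taken as (reduced - saturated); a negative value
# means the reduced model has the lower (better) criterion.
delta_aic = aic(**reduced) - aic(**saturated)
delta_bic = bic(n_obs=n_obs, **reduced) - bic(n_obs=n_obs, **saturated)
```

With these made-up numbers both differences are negative, so a search that starts from the saturated model would move to the reduced one. BIC penalizes the four extra parameters more heavily than AIC because its penalty grows with ln n.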
Affiliation(s)
- Minjung Kim
- Quantitative Research, Evaluation, and Measurement, Department of Educational Studies, The Ohio State University, Columbus, OH, United States
| | - Hsien-Yuan Hsu
- Children's Learning Institute, University of Texas Health Science Center at Houston, Houston, TX, United States
| | - Oi-Man Kwok
- Department of Educational Psychology, Texas A&M University, College Station, TX, United States
| | - Sunmi Seo
- Department of Psychology, University of Alabama, Tuscaloosa, AL, United States
| |
|
37
|
Abstract
Atomic models based on high-resolution density maps are the ultimate result of the cryo-EM structure determination process. Here, we introduce a general procedure for local sharpening of cryo-EM density maps based on prior knowledge of an atomic reference structure. The procedure optimizes contrast of cryo-EM densities by amplitude scaling against the radially averaged local falloff estimated from a windowed reference model. By testing the procedure using six cryo-EM structures of TRPV1, β-galactosidase, γ-secretase, ribosome-EF-Tu complex, 20S proteasome and RNA polymerase III, we illustrate how local sharpening can increase interpretability of density maps in particular in cases of resolution variation and facilitates model building and atomic model refinement.
Affiliation(s)
- Arjen J Jakobi
- Structural and Computational Biology, European Molecular Biology Laboratory, Heidelberg, Germany; Hamburg Unit c/o DESY, European Molecular Biology Laboratory, Hamburg, Germany; The Hamburg Centre for Ultrafast Imaging, Hamburg, Germany
| | - Matthias Wilmanns
- Hamburg Unit c/o DESY, European Molecular Biology Laboratory, Hamburg, Germany
| | - Carsten Sachse
- Structural and Computational Biology, European Molecular Biology Laboratory, Heidelberg, Germany
| |
|
38
|
Abstract
Coot is a molecular-graphics program primarily aimed at model building using X-ray data. Recently, tools for the manipulation and representation of ligands have been introduced. Here, these new tools for ligand validation and comparison are described. Ligands in the wwPDB have been scored by density-fit, distortion and atom-clash metrics. The distributions of these scores can be used to assess the relative merits of the particular ligand in the protein-ligand complex of interest by means of 'sliders' akin to those now available for each accession code on the wwPDB websites.
Affiliation(s)
- Paul Emsley
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge CB2 0QH, England
| |
|
39
|
Abstract
Crystal structures of protein-ligand complexes are often used to infer biology and inform structure-based drug discovery. Hence, it is important to build accurate, reliable models of ligands that give confidence in the interpretation of the respective protein-ligand complex. This paper discusses key stages in the ligand-fitting process, including ligand binding-site identification, ligand description and conformer generation, ligand fitting, refinement and subsequent validation. The CCP4 suite contains a number of software tools that facilitate this task: AceDRG for the creation of ligand descriptions and conformers, Lidia and JLigand for two-dimensional and three-dimensional ligand editing and visual analysis, Coot for density interpretation, ligand fitting, analysis and validation, and REFMAC5 for macromolecular refinement. In addition to recent advancements in automatic carbohydrate building in Coot (LO/Carb) and ligand-validation tools (FLEV), the release of the CCP4i2 GUI provides an integrated solution that streamlines the ligand-fitting workflow, seamlessly passing results from one program to the next. The ligand-fitting process is illustrated using instructive practical examples, including problematic cases such as post-translational modifications, highlighting the need for careful analysis and rigorous validation.
Affiliation(s)
- Robert A. Nicholls
- Structural Studies, MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, England
| |
|
40
|
Abstract
Three different methods (logistic regression, covariate shift and k-NN) were applied to five internal datasets and one external, publicly available dataset where covariate shift existed. In all cases, k-NN’s performance was inferior to either logistic regression or covariate shift. Surprisingly, there was no obvious advantage for using covariate shift to reweight the training data in the examined datasets.
Affiliation(s)
| | | | - Brian Goldman
- Modeling & Informatics, Vertex Pharmaceuticals, Boston, MA, USA
| |
|
41
|
Esdar M, Hübner U, Liebe JD, Hüsers J, Thye J. Understanding latent structures of clinical information logistics: A bottom-up approach for model building and validating the workflow composite score. Int J Med Inform 2016; 97:210-220. [PMID: 27919379 DOI: 10.1016/j.ijmedinf.2016.10.011] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 01/13/2016] [Revised: 10/04/2016] [Accepted: 10/10/2016] [Indexed: 10/20/2022]
Abstract
BACKGROUND AND PURPOSE Clinical information logistics is a construct that aims to describe and explain various phenomena of information provision to drive clinical processes. It can be measured by the workflow composite score, an aggregated indicator of the degree of IT support in clinical processes. This study primarily aimed to investigate the yet unknown empirical patterns constituting this construct. The second goal was to derive a data-driven weighting scheme for the constituents of the workflow composite score and to contrast this scheme with a literature based, top-down procedure. This approach should finally test the validity and robustness of the workflow composite score. METHODS Based on secondary data from 183 German hospitals, a tiered factor analytic approach (confirmatory and subsequent exploratory factor analysis) was pursued. A weighting scheme, which was based on factor loadings obtained in the analyses, was put into practice. RESULTS We were able to identify five statistically significant factors of clinical information logistics that accounted for 63% of the overall variance. These factors were "flow of data and information", "mobility", "clinical decision support and patient safety", "electronic patient record" and "integration and distribution". The system of weights derived from the factor loadings resulted in values for the workflow composite score that differed only slightly from the score values that had been previously published based on a top-down approach. CONCLUSION Our findings give insight into the internal composition of clinical information logistics both in terms of factors and weights. They also allowed us to propose a coherent model of clinical information logistics from a technical perspective that joins empirical findings with theoretical knowledge. Despite the new scheme of weights applied to the calculation of the workflow composite score, the score behaved robustly, which is yet another hint of its validity and therefore its usefulness.
Affiliation(s)
- Moritz Esdar
- Health Informatics Research Group, University of Applied Sciences Osnabrück, Faculty of Business Management and Social Sciences, Caprivistr. 30A, D-49076 Osnabrück, Germany.
| | - Ursula Hübner
- Health Informatics Research Group, University of Applied Sciences Osnabrück, Faculty of Business Management and Social Sciences, Caprivistr. 30A, D-49076 Osnabrück, Germany.
| | - Jan-David Liebe
- Health Informatics Research Group, University of Applied Sciences Osnabrück, Faculty of Business Management and Social Sciences, Caprivistr. 30A, D-49076 Osnabrück, Germany.
| | - Jens Hüsers
- Health Informatics Research Group, University of Applied Sciences Osnabrück, Faculty of Business Management and Social Sciences, Caprivistr. 30A, D-49076 Osnabrück, Germany.
| | - Johannes Thye
- Health Informatics Research Group, University of Applied Sciences Osnabrück, Faculty of Business Management and Social Sciences, Caprivistr. 30A, D-49076 Osnabrück, Germany.
| |
|
42
|
Croll TI, Andersen GR. Re-evaluation of low-resolution crystal structures via interactive molecular-dynamics flexible fitting (iMDFF): a case study in complement C4. Acta Crystallogr D Struct Biol 2016; 72:1006-16. [PMID: 27599733 DOI: 10.1107/s2059798316012201] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 05/10/2016] [Accepted: 07/27/2016] [Indexed: 11/10/2022]
Abstract
While the rapid proliferation of high-resolution structures in the Protein Data Bank provides a rich set of templates for starting models, it remains the case that a great many structures both past and present are built at least in part by hand-threading through low-resolution and/or weak electron density. With current model-building tools this task can be challenging, and the de facto standard for acceptable error rates (in the form of atomic clashes and unfavourable backbone and side-chain conformations) in structures based on data with dmax not exceeding 3.5 Å reflects this. When combined with other factors such as model bias, these residual errors can conspire to make more serious errors in the protein fold difficult or impossible to detect. The three recently published 3.6-4.2 Å resolution structures of complement C4 (PDB entries 4fxg, 4fxk and 4xam) rank in the top quartile of structures of comparable resolution both in terms of Rfree and MolProbity score, yet, as shown here, contain register errors in six β-strands. By applying a molecular-dynamics force field that explicitly models interatomic forces and hence excludes most physically impossible conformations, the recently developed interactive molecular-dynamics flexible fitting (iMDFF) approach significantly reduces the complexity of the conformational space to be searched during manual rebuilding. This substantially improves the rate of detection and correction of register errors, and allows user-guided model building in maps with a resolution lower than 3.5 Å to converge to solutions with a stereochemical quality comparable to atomic resolution structures. Here, iMDFF has been used to individually correct and re-refine these three structures to MolProbity scores of <1.7, and strategies for working with such challenging data sets are suggested. Notably, the improved model allowed the resolution for complement C4b to be extended from 4.2 to 3.5 Å as demonstrated by paired refinement.
Affiliation(s)
- Tristan Ian Croll
- Institute of Health and Biomedical Innovation, Queensland University of Technology, GPO Box 2434, Brisbane, QLD 4001, Australia
| | - Gregers Rom Andersen
- Department of Molecular Biology and Genetics, Aarhus University, Gustav Wieds Vej 10C, 8000 Aarhus, Denmark
| |
|
43
|
Selvarajah G, Selvarajah S. Model building to facilitate understanding of Holliday junction and heteroduplex formation, and Holliday junction resolution. Biochem Mol Biol Educ 2016; 44:381-390. [PMID: 26899144 DOI: 10.1002/bmb.20964] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2015] [Revised: 12/30/2015] [Accepted: 01/26/2016] [Indexed: 06/05/2023]
Abstract
Students frequently expressed difficulty in understanding the molecular mechanisms involved in chromosomal recombination. Therefore, we explored alternative methods for presenting the two concepts of the double-strand break model: Holliday junction and heteroduplex formation, and Holliday junction resolution. In addition to a lecture and computer-animated video, we included a model building activity using pipe cleaners. Biotechnology undergraduates (n = 108) used the model to simulate Holliday junction and heteroduplex formation, and Holliday junction resolution. Based on student perception, an average of 12.85% and 78.35% of students reported that they completely and partially understood the two concepts, respectively. A test conducted to ascertain their understanding of the two concepts showed that 66.1% of the students provided the correct response to the three multiple choice questions. A majority of the 108 students attributed their better understanding of Holliday junction and heteroduplex formation, and Holliday junction resolution, to the inclusion of model building. This underlines the importance of incorporating model building, particularly in concepts that require spatial visualization. © 2016 by The International Union of Biochemistry and Molecular Biology, 44(4):381-390, 2016.
|
44
|
|
45
|
Buyel JF, Gruchow HM, Fischer R. Depth Filters Containing Diatomite Achieve More Efficient Particle Retention than Filters Solely Containing Cellulose Fibers. Front Plant Sci 2015; 6:1134. [PMID: 26734037 PMCID: PMC4685141 DOI: 10.3389/fpls.2015.01134] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 10/07/2015] [Accepted: 11/30/2015] [Indexed: 05/30/2023]
Abstract
The clarification of biological feed stocks during the production of biopharmaceutical proteins is challenging when large quantities of particles must be removed, e.g., when processing crude plant extracts. Single-use depth filters are often preferred for clarification because they are simple to integrate and have a good safety profile. However, the combination of filter layers must be optimized in terms of nominal retention ratings to account for the unique particle size distribution in each feed stock. We have recently shown that predictive models can facilitate filter screening and the selection of appropriate filter layers. Here we expand our previous study by testing several filters with different retention ratings. The filters typically contain diatomite to facilitate the removal of fine particles. However, diatomite can interfere with the recovery of large biopharmaceutical molecules such as virus-like particles and aggregated proteins. Therefore, we also tested filtration devices composed solely of cellulose fibers and cohesive resin. The capacities of both filter types varied from 10 to 50 L m⁻² when challenged with tobacco leaf extracts, but the filtrate turbidity was ~500-fold lower (~3.5 NTU) when diatomite filters were used. We also tested pre-coat filtration with dispersed diatomite, which achieved capacities of up to 120 L m⁻² with turbidities of ~100 NTU using bulk plant extracts, and in contrast to the other depth filters did not require an upstream bag filter. Single pre-coat filtration devices can thus replace combinations of bag and depth filters to simplify the processing of plant extracts, potentially saving on time, labor and consumables. The protein concentrations of TSP, DsRed and antibody 2G12 were not affected by pre-coat filtration, indicating its general applicability during the manufacture of plant-derived biopharmaceutical proteins.
Affiliation(s)
- Johannes F. Buyel
- Integrated Production Platforms, Fraunhofer-Institute for Molecular Biology and Applied Ecology IME, Aachen, Germany
- Institute for Molecular Biotechnology, RWTH Aachen University, Aachen, Germany
| | - Hannah M. Gruchow
- Integrated Production Platforms, Fraunhofer-Institute for Molecular Biology and Applied Ecology IME, Aachen, Germany
- Institute for Molecular Biotechnology, RWTH Aachen University, Aachen, Germany
| | - Rainer Fischer
- Integrated Production Platforms, Fraunhofer-Institute for Molecular Biology and Applied Ecology IME, Aachen, Germany
- Institute for Molecular Biotechnology, RWTH Aachen University, Aachen, Germany
| |
|
46
|
Molinaro AM, Wrensch MR, Jenkins RB, Eckel-Passow JE. Statistical considerations on prognostic models for glioma. Neuro Oncol 2015; 18:609-23. [PMID: 26657835 DOI: 10.1093/neuonc/nov255] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 04/06/2015] [Accepted: 09/14/2015] [Indexed: 12/16/2022]
Abstract
Given the lack of beneficial treatments in glioma, there is a need for prognostic models for therapeutic decision making and life planning. Recently several studies defining subtypes of glioma have been published. Here, we review the statistical considerations of how to build and validate prognostic models, explain the models presented in the current glioma literature, and discuss advantages and disadvantages of each model. The 3 statistical considerations to establishing clinically useful prognostic models are: study design, model building, and validation. Careful study design helps to ensure that the model is unbiased and generalizable to the population of interest. During model building, a discovery cohort of patients can be used to choose variables, construct models, and estimate prediction performance via internal validation. Via external validation, an independent dataset can assess how well the model performs. It is imperative that published models properly detail the study design and methods for both model building and validation. This provides readers the information necessary to assess the bias in a study, compare other published models, and determine the model's clinical usefulness. As editors, reviewers, and readers of the relevant literature, we should be cognizant of the needed statistical considerations and insist on their use.
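The distinction the abstract draws between internal and external validation can be illustrated with a minimal Python sketch. It is not taken from the article: the one-variable least-squares model, the function names (`fit_line`, `mse`, `k_fold_mse`) and all numbers are hypothetical stand-ins chosen for illustration, and a real prognostic model would typically be a survival model with several covariates.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x (hypothetical stand-in for a prognostic model)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def mse(model, xs, ys):
    """Mean squared prediction error of a fitted (a, b) model on a cohort."""
    a, b = model
    return mean((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


def k_fold_mse(xs, ys, k):
    """Internal validation: average held-out error over k folds of the discovery cohort."""
    errors = []
    for fold in range(k):
        held_out = set(range(fold, len(xs), k))
        pairs = list(zip(xs, ys))
        train = [p for i, p in enumerate(pairs) if i not in held_out]
        test = [p for i, p in enumerate(pairs) if i in held_out]
        model = fit_line(*zip(*train))  # refit without the held-out patients
        errors.append(mse(model, *zip(*test)))
    return mean(errors)


# Synthetic discovery cohort (illustrative numbers only, not real patient data).
discovery_x = [1, 2, 3, 4, 5, 6, 7, 8]
discovery_y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

# Synthetic independent cohort, never used for fitting.
external_x = [2.5, 4.5, 6.5]
external_y = [5.2, 8.9, 13.1]

# Internal validation estimates performance using the discovery cohort alone.
internal_mse = k_fold_mse(discovery_x, discovery_y, k=4)

# The final model is fitted on the whole discovery cohort,
# then assessed once on the independent dataset (external validation).
final_model = fit_line(discovery_x, discovery_y)
external_mse = mse(final_model, external_x, external_y)

print(f"internal (4-fold) MSE: {internal_mse:.4f}")
print(f"external MSE:          {external_mse:.4f}")
```

The point of the design is that the external cohort never touches model fitting or variable selection, so a large gap between the internal and external error would suggest the model does not generalize to the population of interest.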
Affiliation(s)
- Annette M Molinaro
- Department of Neurological Surgery, University of California San Francisco (UCSF), San Francisco, California
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California
| | - Margaret R Wrensch
- Department of Neurological Surgery, University of California San Francisco (UCSF), San Francisco, California
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California
- Institute of Human Genetics, University of California San Francisco, San Francisco, California
| | - Robert B Jenkins
- Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, Minnesota
| | - Jeanette E Eckel-Passow
- Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, Minnesota
| |
|
47
|
Fitton LC, Prôa M, Rowland C, Toro-Ibacache V, O'Higgins P. The impact of simplifications on the performance of a finite element model of a Macaca fascicularis cranium. Anat Rec (Hoboken) 2015; 298:107-21. [PMID: 25339306 DOI: 10.1002/ar.23075] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Received: 10/03/2014] [Accepted: 10/11/2014] [Indexed: 11/08/2022]
Abstract
In recent years, finite element analysis (FEA) has emerged as a useful tool for the analysis of skeletal form-function relationships. While this approach has obvious appeal for the study of fossil specimens, such material is often fragmentary with disrupted internal architecture and can contain matrix that leads to errors in segmentation. Here we examine the effects of varying the detail of segmentation and material properties of teeth on the performance of a finite element model of a Macaca fascicularis cranium within a comparative functional framework. Cranial deformations were compared using strain maps to assess differences in strain contours, and Procrustes size and shape analyses, from geometric morphometrics, were employed to compare large scale deformations. We show that a macaque model subjected to biting can be made solid, and teeth altered in material properties, with minimal impact on large scale modes of deformation. The models clustered tightly by bite point rather than by modeling simplification approach, and were distinct from another species. However, localized fluctuations in predicted strain magnitudes were recorded with different modeling approaches, particularly over the alveolar region. This study indicates that, while any model simplification should be undertaken with care and attention to its effects, future applications of FEA to fossils with unknown internal architecture may produce reliable results with regard to general modes of deformation, even when detail of internal bone architecture cannot be reliably modeled.
Affiliation(s)
- Laura C Fitton
- Centre for Anatomical and Human Sciences, Department of Archaeology and Hull York Medical School, University of York, York, United Kingdom
| |
|
48
|
Zhang W, Zhang H, Zhang T, Fan H, Hao Q. Protein-complex structure completion using IPCAS (Iterative Protein Crystal structure Automatic Solution). Acta Crystallogr D Biol Crystallogr 2015; 71:1487-92. [PMID: 26143920 DOI: 10.1107/s1399004715008597] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 11/21/2014] [Accepted: 05/02/2015] [Indexed: 11/10/2022]
Abstract
Protein complexes are essential components in many cellular processes. In this study, a procedure to determine the protein-complex structure from a partial molecular-replacement (MR) solution is demonstrated using a direct-method-aided dual-space iterative phasing and model-building program suite, IPCAS (Iterative Protein Crystal structure Automatic Solution). The IPCAS iteration procedure involves (i) real-space model building and refinement, (ii) direct-method-aided reciprocal-space phase refinement and (iii) phase improvement through density modification. The procedure has been tested with four protein complexes, including two previously unknown structures. It was possible to use IPCAS to build the whole complex structure from one or less than one subunit once the molecular-replacement method was able to give a partial solution. In the most challenging case, IPCAS was able to extend to the full length starting from less than 30% of the complex structure, while conventional model-building procedures were unsuccessful.
Affiliation(s)
- Weizhe Zhang
- Department of Physiology, University of Hong Kong, Hong Kong
| | - Hongmin Zhang
- Department of Physiology, University of Hong Kong, Hong Kong
| | - Tao Zhang
- Institute of Physics, Chinese Academy of Sciences, Beijing 100080, People's Republic of China
| | - Haifu Fan
- Institute of Physics, Chinese Academy of Sciences, Beijing 100080, People's Republic of China
| | - Quan Hao
- Department of Physiology, University of Hong Kong, Hong Kong
| |
|
49
|
Buyel JF. Controlling the interplay between Agrobacterium tumefaciens and plants during the transient expression of proteins. Bioengineered 2015; 6:242-4. [PMID: 25997443 PMCID: PMC4601233 DOI: 10.1080/21655979.2015.1052920] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 04/15/2015] [Revised: 05/13/2015] [Accepted: 05/15/2015] [Indexed: 12/31/2022]
Abstract
In May 2012, the first plant-derived biopharmaceutical protein received full regulatory approval for therapeutic use in humans. Although plant-based expression systems have many advantages, they can suffer from low expression levels and, depending on the species, the presence of potentially toxic secondary metabolites. Transient expression mediated by Agrobacterium tumefaciens can be used to increase product yields but may also increase the concentration of secondary metabolites generated by plant defense responses. We have recently investigated the sequence of defense responses triggered by A. tumefaciens in tobacco plants and considered how these can be modulated by the transient expression of type III effectors from Pseudomonas syringae. Here we discuss the limitations of this approach, potential solutions and additional issues concerning transient expression in plants that should be investigated in greater detail.
Affiliation(s)
- J F Buyel
- Institute for Molecular Biotechnology; RWTH Aachen University; Aachen, Germany
- Fraunhofer-Institute for Molecular Biology and Applied Ecology IME; Aachen, Germany
| |
|
50
|
Morshed N, Echols N, Adams PD. Using support vector machines to improve elemental ion identification in macromolecular crystal structures. Acta Crystallogr D Biol Crystallogr 2015; 71:1147-58. [PMID: 25945580 PMCID: PMC4427199 DOI: 10.1107/s1399004715004241] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 09/29/2014] [Accepted: 03/01/2015] [Indexed: 11/11/2022]
Abstract
A method to automatically identify possible elemental ions in X-ray crystal structures has been extended to use support vector machine (SVM) classifiers trained on selected structures in the PDB, with significantly improved sensitivity over manually encoded heuristics. In the process of macromolecular model building, crystallographers must examine electron density for isolated atoms and differentiate sites containing structured solvent molecules from those containing elemental ions. This task requires specific knowledge of metal-binding chemistry and scattering properties and is prone to error. A method has previously been described to identify ions based on manually chosen criteria for a number of elements. Here, the use of support vector machines (SVMs) to automatically classify isolated atoms as either solvent or one of various ions is described. Two data sets of protein crystal structures, one containing manually curated structures deposited with anomalous diffraction data and another with automatically filtered, high-resolution structures, were constructed. On the manually curated data set, an SVM classifier was able to distinguish calcium from manganese, zinc, iron and nickel, as well as all five of these ions from water molecules, with a high degree of accuracy. Additionally, SVMs trained on the automatically curated set of high-resolution structures were able to successfully classify most common elemental ions in an independent validation test set. This method is readily extensible to other elemental ions and can also be used in conjunction with previous methods based on a priori expectations of the chemical environment and X-ray scattering.
Affiliation(s)
- Nader Morshed
- College of Letters and Science, University of California, Berkeley, CA 94720, USA
| | - Nathaniel Echols
- Physical Biosciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Paul D Adams
- Physical Biosciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| |
|