1. Mu Y, Nguyen T, Hawickhorst B, Wriggers W, Sun J, He J. The combined focal loss and dice loss function improves the segmentation of beta-sheets in medium-resolution cryo-electron-microscopy density maps. BIOINFORMATICS ADVANCES 2024; 4:vbae169. [PMID: 39600382 PMCID: PMC11590252 DOI: 10.1093/bioadv/vbae169] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2024] [Revised: 08/17/2024] [Accepted: 11/19/2024] [Indexed: 11/29/2024]
Abstract
Summary: Although multiple neural networks have been proposed for detecting secondary structures from medium-resolution (5-10 Å) cryo-electron microscopy (cryo-EM) maps, the loss functions used in the existing deep learning networks are primarily based on cross-entropy loss, which is known to be sensitive to class imbalances. We investigated five loss functions: cross-entropy, Focal loss, Dice loss, and two combined loss functions. Using a U-Net architecture in our DeepSSETracer method and a dataset composed of 1355 box-cropped atomic-structure/density-map pairs, we found that a newly designed loss function that combines Focal loss and Dice loss provides the best overall detection accuracy for secondary structures. For β-sheet voxels, which are generally much harder to detect than helix voxels, the combined loss function achieved a significant improvement (an 8.8% increase in the F1 score) compared to the cross-entropy loss function and a noticeable improvement from the Dice loss function. This study demonstrates the potential for designing more effective loss functions for hard cases in the segmentation of secondary structures. The newly trained model was incorporated into DeepSSETracer 1.1 for the segmentation of protein secondary structures in medium-resolution cryo-EM map components. DeepSSETracer can be integrated into ChimeraX, a popular molecular visualization software. Availability and implementation: https://www.cs.odu.edu/~bioinfo/B2I_Tools/.
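To make the idea of combining Focal loss and Dice loss concrete, here is a minimal sketch in plain Python. It is not the DeepSSETracer implementation: the abstract does not give the exact formula, so the weighted sum, the `weight` and `gamma` parameters, and all function names are assumptions chosen for illustration.

```python
import math

def focal_loss(probs, labels, gamma=2.0, eps=1e-7):
    """Mean multi-class focal loss.

    probs  : list of per-voxel class-probability lists
    labels : list of true class indices, one per voxel
    gamma  : focusing parameter (assumed value); gamma=0 gives cross-entropy
    """
    total = 0.0
    for p, y in zip(probs, labels):
        pt = min(max(p[y], eps), 1.0)          # probability of the true class
        total += -((1.0 - pt) ** gamma) * math.log(pt)
    return total / len(labels)

def dice_loss(probs, labels, num_classes, eps=1e-7):
    """One minus the soft Dice coefficient averaged over classes."""
    dice_sum = 0.0
    for c in range(num_classes):
        inter = sum(p[c] for p, y in zip(probs, labels) if y == c)
        denom = sum(p[c] for p in probs) + sum(1 for y in labels if y == c)
        dice_sum += (2.0 * inter + eps) / (denom + eps)
    return 1.0 - dice_sum / num_classes

def combined_loss(probs, labels, num_classes, weight=0.5, gamma=2.0):
    """Hypothetical combination: a weighted sum of the two losses."""
    return (weight * focal_loss(probs, labels, gamma)
            + (1.0 - weight) * dice_loss(probs, labels, num_classes))

# Three voxels with classes 0 = background, 1 = helix, 2 = beta-sheet
probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4]]
labels = [0, 1, 2]
print(combined_loss(probs, labels, num_classes=3))
```

The focal term down-weights voxels that are already classified confidently, while the Dice term scores overlap per class and is therefore less dominated by the abundant background class. That is the usual motivation for pairing the two when classes are imbalanced.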
Affiliation(s)
- Yongcheng Mu: Department of Computer Science, Old Dominion University, Norfolk, VA 23529, United States
- Thu Nguyen: Department of Computer Science, Old Dominion University, Norfolk, VA 23529, United States
- Bryan Hawickhorst: Department of Computer Science, Old Dominion University, Norfolk, VA 23529, United States
- Willy Wriggers: Department of Mechanical and Aerospace Engineering, Old Dominion University, Norfolk, VA 23529, United States
- Jiangwen Sun: Department of Computer Science, Old Dominion University, Norfolk, VA 23529, United States
- Jing He: Department of Computer Science, Old Dominion University, Norfolk, VA 23529, United States
2. Sazzed S. Determining Protein Secondary Structures in Heterogeneous Medium-Resolution Cryo-EM Images Using CryoSSESeg. ACS OMEGA 2024; 9:26409-26416. [PMID: 38911779 PMCID: PMC11191131 DOI: 10.1021/acsomega.4c02608] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2024] [Revised: 05/02/2024] [Accepted: 05/09/2024] [Indexed: 06/25/2024]
Abstract
While the acquisition of cryo-electron microscopy (cryo-EM) at near-atomic resolution is becoming more prevalent, a considerable number of density maps are still resolved only at intermediate resolutions (5-10 Å). Due to the large variation in quality among these medium-resolution density maps, extracting structural information from them remains a challenging task. This study introduces a convolutional neural network (CNN)-based framework, cryoSSESeg, to determine the organization of protein secondary structure elements in medium-resolution cryo-EM images. CryoSSESeg is trained on approximately 1300 protein chains derived from around 500 experimental cryo-EM density maps of varied quality. It demonstrates strong performance with residue-level F1 scores of 0.76 for helix detection and 0.60 for β-sheet detection on average across a set of testing chains. In comparison to traditional image processing tools like SSETracer, which demand significant manual intervention and preprocessing steps, cryoSSESeg demonstrates comparable or superior performance. Additionally, it demonstrates competitive performance alongside another deep learning-based model, Emap2sec. Furthermore, this study underscores the importance of secondary structure quality, particularly adherence to expected shapes, in detection performance, emphasizing the necessity for careful evaluation of the data quality.
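For readers unfamiliar with the metric, the following sketch shows how a per-class F1 score can be computed from predicted and reference labels. It illustrates the standard definition (the harmonic mean of precision and recall) and is not taken from cryoSSESeg. The one-letter labels (`H` helix, `E` sheet, `-` other) and the function name are assumptions.

```python
def f1_score(predicted, actual, positive):
    """F1 score for one class, treating `positive` as the class of interest."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == positive and a == positive)
    fp = sum(1 for p, a in pairs if p == positive and a != positive)
    fn = sum(1 for p, a in pairs if p != positive and a == positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical residue-level labels for a six-residue chain
predicted = "HHHEE-"
actual = "HHEEE-"
print(f1_score(predicted, actual, "H"))
print(f1_score(predicted, actual, "E"))
```

Computing the score separately per class matters here because helix and β-sheet detection differ considerably in difficulty, as the scores quoted in the abstract show.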
3. Beton JG, Cragnolini T, Kaleel M, Mulvaney T, Sweeney A, Topf M. Integrating model simulation tools and cryo-electron microscopy. WIRES COMPUTATIONAL MOLECULAR SCIENCE 2022. [DOI: 10.1002/wcms.1642] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Affiliation(s)
- Joseph George Beton: Centre for Structural Systems Biology (CSSB), Leibniz-Institut für Virologie (LIV), Hamburg, Germany
- Tristan Cragnolini: Institute of Structural and Molecular Biology, Birkbeck and University College London, London, UK
- Manaz Kaleel: Centre for Structural Systems Biology (CSSB), Leibniz-Institut für Virologie (LIV), Hamburg, Germany
- Thomas Mulvaney: Centre for Structural Systems Biology (CSSB), Leibniz-Institut für Virologie (LIV), Hamburg, Germany
- Aaron Sweeney: Centre for Structural Systems Biology (CSSB), Leibniz-Institut für Virologie (LIV), Hamburg, Germany
- Maya Topf: Centre for Structural Systems Biology (CSSB), Leibniz-Institut für Virologie (LIV), Hamburg, Germany
4. Zumbado-Corrales M, Esquivel-Rodríguez J. EvoSeg: Automated Electron Microscopy Segmentation through Random Forests and Evolutionary Optimization. Biomimetics (Basel) 2021; 6:biomimetics6020037. [PMID: 34206006 PMCID: PMC8293153 DOI: 10.3390/biomimetics6020037] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/16/2021] [Revised: 05/17/2021] [Accepted: 05/28/2021] [Indexed: 11/30/2022]
Abstract
Electron Microscopy Maps are key in the study of bio-molecular structures, ranging from borderline atomic level to the sub-cellular range. These maps describe the envelopes that cover possibly a very large number of proteins that form molecular machines within the cell. Within those envelopes, we are interested to find what regions correspond to specific proteins so that we can understand how they function, and design drugs that can enhance or suppress a process that they are involved in, along with other experimental purposes. A classic approach by which we can begin the exploration of map regions is to apply a segmentation algorithm. This yields a mask where each voxel in 3D space is assigned an identifier that maps it to a segment; an ideal segmentation would map each segment to one protein unit, which is rarely the case. In this work, we present a method that uses bio-inspired optimization, through an Evolutionary-Optimized Segmentation algorithm, to iteratively improve upon baseline segments obtained from a classical approach, called watershed segmentation. The cost function used by the evolutionary optimization is based on an ideal segmentation classifier trained as part of this development, which uses basic structural information available to scientists, such as the number of expected units, volume and topology. We show that a basic initial segmentation with the additional information allows our evolutionary method to find better segmentation results, compared to the baseline generated by the watershed.
5. Sazzed S, Scheible P, Alshammari M, Wriggers W, He J. Cylindrical Similarity Measurement for Helices in Medium-Resolution Cryo-Electron Microscopy Density Maps. J Chem Inf Model 2020; 60:2644-2650. [PMID: 32216344 PMCID: PMC8279803 DOI: 10.1021/acs.jcim.0c00010] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/29/2022]
Abstract
Cryo-electron microscopy (cryo-EM) density maps at medium resolution (5-10 Å) reveal secondary structural features such as α-helices and β-sheets, but they lack the side chain details that would enable a direct structure determination. Among the more than 800 entries in the Electron Microscopy Data Bank (EMDB) of medium-resolution density maps that are associated with atomic models, a wide variety of similarities can be observed between maps and models. To validate such atomic models and to classify structural features, a local similarity criterion, the F1 score, is proposed and evaluated in this study. The F1 score is theoretically normalized to a range from zero to one, providing a local measure of cylindrical agreement between the density and atomic model of a helix. A systematic scan of 30,994 helices (among 3,247 protein chains modeled into medium-resolution density maps) reveals an actual range of observed F1 scores from 0.171 to 0.848, suggesting that the cylindrical fit of the current data is well stratified by the proposed measure. The best (highest) F1 scores tend to be associated with regions that exhibit high and spatially homogeneous local resolution (between 5 Å and 7.5 Å) in the helical density. The proposed F1 scores can be used as a discriminative classifier for validation studies and as a ranking criterion for cryo-EM density features in databases.
Affiliation(s)
- Salim Sazzed: Department of Computer Science, Old Dominion University, Norfolk, Virginia 23529, United States
- Peter Scheible: Department of Computer Science, Old Dominion University, Norfolk, Virginia 23529, United States
- Maytha Alshammari: Department of Computer Science, Old Dominion University, Norfolk, Virginia 23529, United States
- Willy Wriggers: Department of Mechanical and Aerospace Engineering, Old Dominion University, Norfolk, Virginia 23529, United States
- Jing He: Department of Computer Science, Old Dominion University, Norfolk, Virginia 23529, United States
6. Terwilliger TC, Adams PD, Afonine PV, Sobolev OV. Cryo-EM map interpretation and protein model-building using iterative map segmentation. Protein Sci 2020; 29:87-99. [PMID: 31599033 PMCID: PMC6933853 DOI: 10.1002/pro.3740] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 06/27/2019] [Revised: 09/30/2019] [Accepted: 10/01/2019] [Indexed: 11/17/2022]
Abstract
A procedure for building protein chains into maps produced by single-particle electron cryo-microscopy (cryo-EM) is described. The procedure is similar to the way an experienced structural biologist might analyze a map, focusing first on secondary structure elements such as helices and sheets, then varying the contour level to identify connections between these elements. Since the high density in a map typically follows the main-chain of the protein, the main-chain connection between secondary structure elements can often be identified as the unbranched path between them with the highest minimum value along the path. This chain-tracing procedure is then combined with finding side-chain positions based on the presence of density extending away from the main path of the chain, allowing generation of a Cα model. The Cα model is converted to an all-atom model and is refined against the map. We show that this procedure is as effective as other existing methods for interpretation of cryo-EM maps and that it is considerably faster and produces models with fewer chain breaks than our previous methods that were based on approaches developed for crystallographic maps.
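The "unbranched path with the highest minimum value" described above is an instance of the maximum-bottleneck (widest) path problem. The sketch below shows one common way to solve it, a Dijkstra-style search with a max-heap, on a small 2D grid of density values. It is an illustration only, not the authors' implementation: the real procedure works on 3D maps, and the grid, the 4-connectivity, and the function name are assumptions.

```python
import heapq

def best_bottleneck_path(grid, start, goal):
    """Find the 4-connected path from start to goal whose lowest
    grid value is as high as possible.

    Returns (bottleneck_value, path) with path as a list of (row, col).
    """
    rows, cols = len(grid), len(grid[0])
    best = {start: grid[start[0]][start[1]]}   # best bottleneck found per cell
    parent = {start: None}
    heap = [(-best[start], start)]             # negate values to get a max-heap
    while heap:
        neg_b, cell = heapq.heappop(heap)
        b = -neg_b
        if b < best[cell]:
            continue                           # stale heap entry
        if cell == goal:
            break
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                nb = min(b, grid[nr][nc])      # bottleneck if we step here
                if nb > best.get((nr, nc), float("-inf")):
                    best[(nr, nc)] = nb
                    parent[(nr, nc)] = cell
                    heapq.heappush(heap, (-nb, (nr, nc)))
    path, cell = [], goal
    while cell is not None:                    # walk back along parent links
        path.append(cell)
        cell = parent[cell]
    return best[goal], path[::-1]

# Toy density slice: the direct route crosses low density (value 1),
# so the search detours through the bottom row instead.
density = [
    [9, 1, 9],
    [8, 1, 7],
    [6, 5, 7],
]
bottleneck, path = best_bottleneck_path(density, (0, 0), (0, 2))
print(bottleneck, path)
```

A path with a high minimum density stays connected as the contour level is raised, which matches the abstract's description of varying the contour level to identify connections between secondary structure elements.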
Affiliation(s)
- Thomas C. Terwilliger: Los Alamos National Laboratory, Los Alamos, New Mexico; New Mexico Consortium, Los Alamos, New Mexico
- Paul D. Adams: Molecular Biophysics & Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, California; Department of Bioengineering, University of California Berkeley, Berkeley, California
- Pavel V. Afonine: Molecular Biophysics & Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, California
- Oleg V. Sobolev: Molecular Biophysics & Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, California
7. Terwilliger TC, Adams PD, Afonine PV, Sobolev OV. Map segmentation, automated model-building and their application to the Cryo-EM Model Challenge. J Struct Biol 2018; 204:338-343. [PMID: 30063987 PMCID: PMC6163059 DOI: 10.1016/j.jsb.2018.07.016] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 04/04/2018] [Revised: 07/11/2018] [Accepted: 07/27/2018] [Indexed: 11/27/2022]
Abstract
A recently-developed method for identifying a compact, contiguous region representing the unique part of a density map was applied to 218 Cryo-EM maps with resolutions of 4.5 Å or better. The key elements of the segmentation procedure are (1) identification of all regions of density above a threshold and (2) choice of a unique set of these regions, taking symmetry into consideration, that maximize connectivity and compactness. This segmentation approach was then combined with tools for automated map sharpening and model-building to generate models for the 12 maps in the 2016 Cryo-EM Model Challenge in a fully automated manner. The resulting models have completeness from 24% to 82% and RMS distances from reference interpretations of 0.6 Å-2.1 Å.
Affiliation(s)
- Thomas C Terwilliger: Los Alamos National Laboratory, Los Alamos, NM 87545, USA; New Mexico Consortium, Los Alamos, NM 87544, USA
- Paul D Adams: Molecular Biophysics & Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720-8235, USA; Department of Bioengineering, University of California Berkeley, Berkeley, CA, USA
- Pavel V Afonine: Molecular Biophysics & Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720-8235, USA; Department of Physics and International Centre for Quantum and Molecular Structures, Shanghai University, Shanghai 200444, People's Republic of China
- Oleg V Sobolev: Molecular Biophysics & Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720-8235, USA
8. Haslam D, Zeng T, Li R, He J. Exploratory Studies Detecting Secondary Structures in Medium Resolution 3D Cryo-EM Images Using Deep Convolutional Neural Networks. ACM-BCB ... ... : THE ... ACM CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY AND BIOMEDICINE. ACM CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY AND BIOMEDICINE 2018; 2018:628-632. [PMID: 35838356 PMCID: PMC9279009 DOI: 10.1145/3233547.3233704] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/15/2023]
Abstract
Cryo-electron microscopy (cryo-EM) is an emerging biophysical technique for structural determination of protein complexes. However, accurate detection of secondary structures is still challenging when cryo-EM density maps are at medium resolutions (5-10 Å). Most of existing methods are image processing methods that do not fully utilize available images in the cryo-EM database. In this paper, we present a deep learning approach to segment secondary structure elements as helices and β-sheets from medium-resolution density maps. The proposed 3D convolutional neural network is shown to detect secondary structure locations with an F1 score between 0.79 and 0.88 for six simulated test cases. The architecture was also applied to an experimentally-derived cryo-EM density map with good accuracy.
Affiliation(s)
- Devin Haslam: Department of Computer Science, Old Dominion University, Norfolk, VA 23529
- Tao Zeng: Department of Computer Science, Washington State University, Pullman, WA 99164
- Jing He: Corresponding author
9. Li B, Fooksa M, Heinze S, Meiler J. Finding the needle in the haystack: towards solving the protein-folding problem computationally. Crit Rev Biochem Mol Biol 2018; 53:1-28. [PMID: 28976219 PMCID: PMC6790072 DOI: 10.1080/10409238.2017.1380596] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 05/16/2017] [Revised: 08/22/2017] [Accepted: 09/13/2017] [Indexed: 12/22/2022]
Abstract
Prediction of protein tertiary structures from amino acid sequence and understanding the mechanisms of how proteins fold, collectively known as "the protein folding problem," has been a grand challenge in molecular biology for over half a century. Theories have been developed that provide us with an unprecedented understanding of protein folding mechanisms. However, computational simulation of protein folding is still difficult, and prediction of protein tertiary structure from amino acid sequence is an unsolved problem. Progress toward a satisfying solution has been slow due to challenges in sampling the vast conformational space and deriving sufficiently accurate energy functions. Nevertheless, several techniques and algorithms have been adopted to overcome these challenges, and the last two decades have seen exciting advances in enhanced sampling algorithms, computational power and tertiary structure prediction methodologies. This review aims at summarizing these computational techniques, specifically conformational sampling algorithms and energy approximations that have been frequently used to study protein-folding mechanisms or to de novo predict protein tertiary structures. We hope that this review can serve as an overview on how the protein-folding problem can be studied computationally and, in cases where experimental approaches are prohibitive, help the researcher choose the most relevant computational approach for the problem at hand. We conclude with a summary of current challenges faced and an outlook on potential future directions.
Affiliation(s)
- Bian Li: Department of Chemistry, Vanderbilt University, Nashville, TN, USA; Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
- Michaela Fooksa: Center for Structural Biology, Vanderbilt University, Nashville, TN, USA; Chemical and Physical Biology Graduate Program, Vanderbilt University, Nashville, TN, USA
- Sten Heinze: Department of Chemistry, Vanderbilt University, Nashville, TN, USA; Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
- Jens Meiler: Department of Chemistry, Vanderbilt University, Nashville, TN, USA; Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
10. Xu G, Ma T, Zang T, Wang Q, Ma J. OPUS-CSF: A C-atom-based scoring function for ranking protein structural models. Protein Sci 2017; 27:286-292. [PMID: 29047165 PMCID: PMC5734313 DOI: 10.1002/pro.3327] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 07/12/2017] [Revised: 10/14/2017] [Accepted: 10/16/2017] [Indexed: 12/12/2022]
Abstract
We report a C‐atom‐based scoring function, named OPUS‐CSF, for ranking protein structural models. Rather than using traditional Boltzmann formula, we built a scoring function (CSF score) based on the native distributions (derived from the entire PDB) of coordinate components of mainchain C (carbonyl) atoms on selected residues of peptide segments of 5, 7, 9, and 11 residues in length. In testing OPUS‐CSF on decoy recognition, it maximally recognized 257 native structures out of 278 targets in 11 commonly used decoy sets, significantly outperforming other popular all‐atom empirical potentials. The average correlation coefficient with TM‐score was also comparable with those of other potentials. OPUS‐CSF is a highly coarse‐grained scoring function, which only requires input of partial mainchain information, and very fast. Thus, it is suitable for applications at early stage of structural building.
Affiliation(s)
- Gang Xu: School of Life Sciences, Tsinghua University, Beijing, China
- Tianqi Ma: Applied Physics Program, Rice University, Houston, Texas; Department of Bioengineering, Rice University, Houston, Texas
- Tianwu Zang: Applied Physics Program, Rice University, Houston, Texas; Department of Bioengineering, Rice University, Houston, Texas
- Qinghua Wang: Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, One Baylor Plaza, BCM-125, Houston, Texas
- Jianpeng Ma: School of Life Sciences, Tsinghua University, Beijing, China; Applied Physics Program, Rice University, Houston, Texas; Department of Bioengineering, Rice University, Houston, Texas; Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, One Baylor Plaza, BCM-125, Houston, Texas
11. Biswas A, Ranjan D, Zubair M, Zeil S, Nasr KA, He J. An Effective Computational Method Incorporating Multiple Secondary Structure Predictions in Topology Determination for Cryo-EM Images. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2017; 14:578-586. [PMID: 27008671 PMCID: PMC5071113 DOI: 10.1109/tcbb.2016.2543721] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 06/05/2023]
Abstract
A key idea in de novo modeling of a medium-resolution density image obtained from cryo-electron microscopy is to compute the optimal mapping between the secondary structure traces observed in the density image and those predicted on the protein sequence. When secondary structures are not determined precisely, either from the image or from the amino acid sequence of the protein, the computational problem becomes more complex. We present an efficient method that addresses the secondary structure placement problem in presence of multiple secondary structure predictions and computes the optimal mapping. We tested the method using 12 simulated images from α-proteins and two Cryo-EM images of α-β proteins. We observed that the rank of the true topologies is consistently improved by using multiple secondary structure predictions instead of a single prediction. The results show that the algorithm is robust and works well even when errors/misses in the predicted secondary structures are present in the image or the sequence. The results also show that the algorithm is efficient and is able to handle proteins with as many as 33 helices.
Affiliation(s)
- Abhishek Biswas: Dept. of Computer Science, Old Dominion University, Norfolk, VA 23529
- Desh Ranjan: Dept. of Computer Science, Old Dominion University, Norfolk, VA 23529
- Mohammad Zubair: Dept. of Computer Science, Old Dominion University, Norfolk, VA 23529
- Stephanie Zeil: Dept. of Computer Science, Old Dominion University, Norfolk, VA 23529
- Kamal Al Nasr: Dept. of Computer Science, Tennessee State University, Nashville, TN 37209
- Jing He: Dept. of Computer Science, Old Dominion University, Norfolk, VA 23529
12. Li R, Si D, Zeng T, Ji S, He J. Deep Convolutional Neural Networks for Detecting Secondary Structures in Protein Density Maps from Cryo-Electron Microscopy. PROCEEDINGS. IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE 2017; 2016:41-46. [PMID: 29770260 DOI: 10.1109/bibm.2016.7822490] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Indexed: 11/09/2022]
Abstract
The detection of secondary structure of proteins using three dimensional (3D) cryo-electron microscopy (cryo-EM) images is still a challenging task when the spatial resolution of cryo-EM images is at medium level (5-10 Å). Prior researches focused on the usage of local features that may not capture the global information of image objects. In this study, we propose to use deep learning methods to extract high representative global features and then automatically detect secondary structures of proteins. In particular, we build a convolutional neural network (CNN) classifier that predicts the probability of label for every individual voxel in 3D cryo-EM image with respect to the secondary structure elements of proteins such as α-helix, β-sheet and background. To effectively incorporate the 3D spatial information in protein structures, we propose to perform 3D convolutions in the convolutional layers of CNNs. We show that the proposed CNN classifier can outperform existing SVM method on identifying the secondary structure elements of proteins from 3D cryo-EM medium resolution images.
Affiliation(s)
- Rongjian Li: Department of Computer Science, Old Dominion University, Norfolk, Virginia 23529
- Dong Si: Division of Computing and Software Systems, University of Washington Bothell, Bothell, WA 98011
- Tao Zeng: School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA 99164
- Shuiwang Ji: School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA 99164
- Jing He: Department of Computer Science, Old Dominion University, Norfolk, Virginia 23529
13. Si D, He J. Modeling Beta-Traces for Beta-Barrels from Cryo-EM Density Maps. BIOMED RESEARCH INTERNATIONAL 2017; 2017:1793213. [PMID: 28164115 PMCID: PMC5259677 DOI: 10.1155/2017/1793213] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 09/15/2016] [Accepted: 12/08/2016] [Indexed: 01/09/2023]
Abstract
Cryo-electron microscopy (cryo-EM) has produced density maps of various resolutions. Although α-helices can be detected from density maps at 5-8 Å resolutions, β-strands are challenging to detect at such density maps due to close-spacing of β-strands. The variety of shapes of β-sheets adds the complexity of β-strands detection from density maps. We propose a new approach to model traces of β-strands for β-barrel density regions that are extracted from cryo-EM density maps. In the test containing eight β-barrels extracted from experimental cryo-EM density maps at 5.5 Å-8.25 Å resolution, StrandRoller detected about 74.26% of the amino acids in the β-strands with an overall 2.05 Å 2-way distance between the detected β-traces and the observed ones, if the best of the fifteen detection cases is considered.
Affiliation(s)
- Dong Si: Division of Computing and Software Systems, University of Washington Bothell, Bothell, WA 98011, USA
- Jing He: Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA
14. Haslam D, Zubair M, Ranjan D, Biswas A, He J. CHALLENGES IN MATCHING SECONDARY STRUCTURES IN CRYO-EM: AN EXPLORATION. PROCEEDINGS. IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE 2016; 2016:1714-1719. [PMID: 29770261 PMCID: PMC5952047 DOI: 10.1109/bibm.2016.7822776] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
Cryo-electron microscopy is a fast emerging biophysical technique for structural determination of large protein complexes. While more atomic structures are being determined using this technique, it is still challenging to derive atomic structures from density maps produced at medium resolution when no suitable templates are available. A critical step in structure determination is how a protein chain threads through the 3-dimensional density map. A dynamic programming method was previously developed to generate K best matches of secondary structures between the density map and its protein sequence using shortest paths in a related weighted graph. We discuss challenges associated with the creation of the weighted graph and explore heuristic methods to solve the problem of matching secondary structures.
Affiliation(s)
- Devin Haslam: Department of Computer Science, Old Dominion University, Norfolk, VA 23529
- Mohammad Zubair: Department of Computer Science, Old Dominion University, Norfolk, VA 23529
- Desh Ranjan: Department of Computer Science, Old Dominion University, Norfolk, VA 23529
- Jing He: Department of Computer Science, Old Dominion University, Norfolk, VA 23529
15. Constrained cyclic coordinate descent for cryo-EM images at medium resolutions: beyond the protein loop closure problem. ROBOTICA 2016; 34:1777-1790. [DOI: 10.1017/s0263574716000242] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Indexed: 11/05/2022]
Abstract
Summary: The cyclic coordinate descent (CCD) method is a popular loop closure method in protein structure modeling. It is a robotics algorithm originally developed for inverse kinematic applications. We demonstrate an effective method of building the backbone of protein structure models using the principle of CCD and a guiding trace. For medium-resolution 3-dimensional (3D) images derived using cryo-electron microscopy (cryo-EM), it is possible to obtain guiding traces of secondary structures and their skeleton connections. Our new method, constrained cyclic coordinate descent (CCCD), builds α-helices, β-strands, and loops quickly and fairly accurately along predefined traces. We show that it is possible to build the entire backbone of a protein fairly accurately when the guiding traces are accurate. In a test of 10 proteins, the models constructed using CCCD show an average of 3.91 Å of backbone root mean square deviation (RMSD). When the CCCD method is incorporated in a simulated annealing framework to sample possible shift, translation, and rotation freedom, the models built with the true topology were ranked high on the list, with an average backbone RMSD100 of 3.76 Å. CCCD is an effective method for modeling atomic structures after secondary structure traces and skeletons are extracted from 3D cryo-EM images.
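As background for the CCD principle mentioned above, the sketch below implements plain cyclic coordinate descent for a planar (2D) chain of joints. It is the textbook inverse-kinematics version, not the constrained CCCD method of the paper: there is no guiding trace, and the 2D setting, the function name, and the parameter defaults are all assumptions.

```python
import math

def ccd_2d(points, target, iterations=200, tol=1e-6):
    """Move the last point of a planar chain toward `target` by rotating
    one joint at a time, keeping the first point and all segment lengths fixed.

    points : list of (x, y) joint positions, the last one being the chain end
    """
    pts = [list(p) for p in points]
    for _ in range(iterations):
        # Sweep joints from the one nearest the chain end back to the base.
        for i in range(len(pts) - 2, -1, -1):
            px, py = pts[i]
            ex, ey = pts[-1]
            # Rotation that points the joint-to-end direction at the target.
            theta = (math.atan2(target[1] - py, target[0] - px)
                     - math.atan2(ey - py, ex - px))
            cos_t, sin_t = math.cos(theta), math.sin(theta)
            # Rigidly rotate every point beyond joint i about joint i.
            for j in range(i + 1, len(pts)):
                dx, dy = pts[j][0] - px, pts[j][1] - py
                pts[j][0] = px + dx * cos_t - dy * sin_t
                pts[j][1] = py + dx * sin_t + dy * cos_t
        if math.hypot(pts[-1][0] - target[0], pts[-1][1] - target[1]) < tol:
            break
    return [tuple(p) for p in pts]

# Three unit-length segments, initially straight along the x axis
chain = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
solved = ccd_2d(chain, target=(1.5, 1.5))
print(solved[-1])
```

Each step solves a one-joint problem in closed form, which is why CCD is fast. In protein loop closure the adjustable variables are backbone dihedral angles in 3D rather than planar joint angles.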
16. Wriggers W, He J. Numerical geometry of map and model assessment. J Struct Biol 2015; 192:255-61. [PMID: 26416532 DOI: 10.1016/j.jsb.2015.09.011] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 07/01/2015] [Revised: 09/18/2015] [Accepted: 09/24/2015] [Indexed: 10/23/2022]
Abstract
We are describing best practices and assessment strategies for the atomic interpretation of cryo-electron microscopy (cryo-EM) maps. Multiscale numerical geometry strategies in the Situs package and in secondary structure detection software are currently evolving due to the recent increases in cryo-EM resolution. Criteria that aim to predict the accuracy of fitted atomic models at low (worse than 8Å) and medium (4-8 Å) resolutions remain challenging. However, a high level of confidence in atomic models can be achieved by combining such criteria. The observed errors are due to map-model discrepancies and due to the effect of imperfect global docking strategies. Extending the earlier motion capture approach developed for flexible fitting, we use simulated fiducials (pseudoatoms) at varying levels of coarse-graining to track the local drift of structural features. We compare three tracking approaches: naïve vector quantization, a smoothly deformable model, and a tessellation of the structure into rigid Voronoi cells, which are fitted using a multi-fragment refinement approach. The lowest error is an upper bound for the (small) discrepancy between the crystal structure and the EM map due to different conditions in their structure determination. When internal features such as secondary structures are visible in medium-resolution EM maps, it is possible to extend the idea of point-based fiducials to more complex geometric representations such as helical axes, strands, and skeletons. We propose quantitative strategies to assess map-model pairs when such secondary structure patterns are prominent.
Affiliation(s)
- Willy Wriggers, Department of Mechanical & Aerospace Engineering, Old Dominion University, Norfolk, VA 23529, United States.
- Jing He, Department of Computer Science, Old Dominion University, Norfolk, VA 23529, United States.
|
17
|
Biswas A, Ranjan D, Zubair M, He J. A Dynamic Programming Algorithm for Finding the Optimal Placement of a Secondary Structure Topology in Cryo-EM Data. J Comput Biol 2015; 22:837-43. [PMID: 26244416 DOI: 10.1089/cmb.2015.0120] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 11/13/2022]
Abstract
The determination of secondary structure topology is a critical step in deriving the atomic structures from the protein density maps obtained from electron cryomicroscopy technique. This step often relies on matching the secondary structure traces detected from the protein density map to the secondary structure sequence segments predicted from the amino acid sequence. Due to inaccuracies in both sources of information, a pool of possible secondary structure positions needs to be sampled. One way to approach the problem is to first derive a small number of possible topologies using existing matching algorithms, and then find the optimal placement for each possible topology. We present a dynamic programming method of Θ(Nq^2 h) to find the optimal placement for a secondary structure topology. We show that our algorithm requires significantly less computational time than the brute force method that is in the order of Θ(q^N h).
Affiliation(s)
- Abhishek Biswas, Department of Computer Science, Old Dominion University, Norfolk, Virginia
- Desh Ranjan, Department of Computer Science, Old Dominion University, Norfolk, Virginia
- Mohammad Zubair, Department of Computer Science, Old Dominion University, Norfolk, Virginia
- Jing He, Department of Computer Science, Old Dominion University, Norfolk, Virginia
|
18
|
Si D, He J. Tracing Beta Strands Using StrandTwister from Cryo-EM Density Maps at Medium Resolutions. Structure 2014; 22:1665-76. [DOI: 10.1016/j.str.2014.08.017] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 02/04/2014] [Revised: 08/07/2014] [Accepted: 08/08/2014] [Indexed: 10/24/2022]
|
19
|
López-Blanco JR, Chacón P. Structural modeling from electron microscopy data. WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL MOLECULAR SCIENCE 2014. [DOI: 10.1002/wcms.1199] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Indexed: 12/25/2022]
Affiliation(s)
- José Ramón López-Blanco, Department of Biological Physical Chemistry; Rocasolano Physical Chemistry Institute, CSIC; Madrid, Spain
- Pablo Chacón, Department of Biological Physical Chemistry; Rocasolano Physical Chemistry Institute, CSIC; Madrid, Spain
|
20
|
Al Nasr K, Ranjan D, Zubair M, Chen L, He J. Solving the Secondary Structure Matching Problem in Cryo-EM De Novo Modeling Using a Constrained K-Shortest Path Graph Algorithm. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2014; 11:419-430. [PMID: 26355788 DOI: 10.1109/tcbb.2014.2302803] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Indexed: 06/05/2023]
Abstract
Electron cryomicroscopy is becoming a major experimental technique in solving the structures of large molecular assemblies. More and more three-dimensional images have been obtained at the medium resolutions between 5 and 10 Å. At this resolution range, major α-helices can be detected as cylindrical sticks and β-sheets can be detected as plane-like regions. A critical question in de novo modeling from cryo-EM images is to determine the match between the detected secondary structures from the image and those on the protein sequence. We formulate this matching problem into a constrained graph problem and present an O(Δ^2 N^2 2^N) algorithm for this NP-hard problem. The algorithm incorporates the dynamic programming approach into a constrained K-shortest path algorithm. Our method, DP-TOSS, has been tested using α-proteins with maximum 33 helices and α-β proteins up to five helices and 12 β-strands. The correct match was ranked within the top 35 for 19 of the 20 α-proteins and all nine α-β proteins tested. The results demonstrate that DP-TOSS improves accuracy, time and memory space in deriving the topologies of the secondary structure elements for proteins with a large number of secondary structures and a complex skeleton.
|
21
|
Si D, He J. Combining image processing and modeling to generate traces of beta-strands from cryo-EM density images of beta-barrels. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2014; 2014:3941-3944. [PMID: 25570854 DOI: 10.1109/embc.2014.6944486] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/04/2023]
Abstract
Electron cryo-microscopy (Cryo-EM) technique produces 3-dimensional (3D) density images of proteins. When resolution of the images is not high enough to resolve the molecular details, it is challenging for image processing methods to enhance the molecular features. β-barrel is a particular structure feature that is formed by multiple β-strands in a barrel shape. There is no existing method to derive β-strands from the 3D image of a β-barrel at medium resolutions. We propose a new method, StrandRoller, to generate a small set of possible β-traces from the density images at medium resolutions of 5-10Å. StrandRoller has been tested using eleven β-barrel images simulated to 10Å resolution and one image isolated from the experimentally derived cryo-EM density image at 6.7Å resolution. StrandRoller was able to detect 81.84% of the β-strands with an overall 1.5Å 2-way distance between the detected and the observed β-traces, if the best of fifteen detections is considered. Our results suggest that it is possible to derive a small set of possible β-traces from the β-barrel cryo-EM image at medium resolutions even when no separation of the β-strands is visible in the images.
|
22
|
Esquivel-Rodríguez J, Kihara D. Computational methods for constructing protein structure models from 3D electron microscopy maps. J Struct Biol 2013; 184:93-102. [PMID: 23796504 DOI: 10.1016/j.jsb.2013.06.008] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.0] [Received: 12/18/2012] [Revised: 06/11/2013] [Accepted: 06/13/2013] [Indexed: 12/31/2022]
Abstract
Protein structure determination by cryo-electron microscopy (EM) has made significant progress in the past decades. Resolutions of EM maps have been improving as evidenced by recently reported structures that are solved at high resolutions close to 3Å. Computational methods play a key role in interpreting EM data. Among many computational procedures applied to an EM map to obtain protein structure information, in this article we focus on reviewing computational methods that model protein three-dimensional (3D) structures from a 3D EM density map that is constructed from two-dimensional (2D) maps. The computational methods we discuss range from de novo methods, which identify structural elements in an EM map, to structure fitting methods, where known high resolution structures are fit into a low-resolution EM map. A list of available computational tools is also provided.
Affiliation(s)
- Juan Esquivel-Rodríguez, Department of Computer Science, College of Science, Purdue University, West Lafayette, IN 47907, USA
|
23
|
Baker ML, Baker MR, Hryc CF, Ju T, Chiu W. Gorgon and pathwalking: macromolecular modeling tools for subnanometer resolution density maps. Biopolymers 2012; 97:655-68. [PMID: 22696403 PMCID: PMC3899894 DOI: 10.1002/bip.22065] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.7] [Indexed: 11/09/2022]
Abstract
The complex interplay of proteins and other molecules, often in the form of large transitory assemblies, is critical to cellular function. Today, X-ray crystallography and electron cryo-microscopy (cryo-EM) are routinely used to image these macromolecular complexes, though often at limited resolutions. Despite the rapidly growing number of macromolecular structures, few tools exist for modeling and annotating structures in the range of 3-10 Å resolution. To address this need, we have developed a number of utilities specifically targeting subnanometer resolution density maps. As part of the 2010 Cryo-EM Modeling Challenge, we demonstrated two of our latest de novo modeling tools, Pathwalking and Gorgon, as well as a tool for secondary structure identification (SSEHunter) and a new rigid-body/flexible fitting tool in Gorgon. In total, we submitted 30 structural models from ten different subnanometer resolution data sets in four of the six challenge categories. Each of our utilities produced accurate structural models and annotations across the various density maps. In the end, the utilities that we present here offer users a robust toolkit for analyzing and modeling protein structure in macromolecular assemblies at non-atomic resolutions.
Affiliation(s)
- Matthew L Baker, Verna and Marrs McLean Department of Biochemistry and Molecular Biology, National Center for Macromolecular Imaging, Baylor College of Medicine, Houston, TX 77030, USA.
|
24
|
Zhang Q, Bettadapura R, Bajaj C. Macromolecular structure modeling from 3D EM using VolRover 2.0. Biopolymers 2012; 97:709-31. [PMID: 22696407 DOI: 10.1002/bip.22052] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Indexed: 11/10/2022]
Abstract
We review tools for structure identification and model-based refinement from three-dimensional electron microscopy implemented in our in-house software package, VOLROVER 2.0. For viral density maps with icosahedral symmetry, we segment the capsid, polymeric, and monomeric subunits using techniques based on automatic symmetry detection and multidomain fast marching. For large biomolecules without symmetry information, we again use our multidomain fast-marching method with manual or fit-based multiseeding to segment meaningful substructures. In either case, we subject the resulting segmented subunit to secondary structure detection when the EM resolution is sufficiently high, and rigid-body structure fitting when the corresponding X-ray structure is available. Secondary structure elements are identified by three techniques: our earlier volume-based and boundary-based skeletonization methods as well as a new method, currently in development, based on solving the grassfire flow equation. For rigid-body fitting, we adapt our earlier fast Fourier-based correlation scheme F2Dock. Our reported segmentation, secondary structure elements identification, and rigid-body fitting techniques, implemented in VOLROVER 2.0 are applied to the PSB 2011 cryo-EM modeling challenge data, and our results are briefly compared to similar results submitted from other research groups. The comparisons show that our techniques are equally capable of segmenting relatively accurate subunits from a viral or protein assembly, and that high segmentation quality leads in turn to higher-quality results of secondary structure elements identification and correlation-based rigid-body fitting.
Affiliation(s)
- Qin Zhang, Institute for Computational Engineering and Sciences, The University of Texas, Austin, TX 78712, USA
|
25
|
Biswas A, Si D, Al Nasr K, Ranjan D, Zubair M, He J. Improved efficiency in cryo-EM secondary structure topology determination from inaccurate data. J Bioinform Comput Biol 2012; 10:1242006. [DOI: 10.1142/s0219720012420061] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 11/18/2022]
Abstract
The determination of the secondary structure topology is a critical step in deriving the atomic structure from the protein density map obtained from electron cryo-microscopy technique. This step often relies on the matching of two sources of information. One source comes from the secondary structures detected from the protein density map at the medium resolution, such as 5–10 Å. The other source comes from the predicted secondary structures from the amino acid sequence. Due to the inaccuracy in either source of information, a pool of possible secondary structure positions needs to be sampled. This paper studies how to reduce the computation of the mapping when the inaccuracy of the secondary structure predictions is considered. We present a method that combines the concept of dynamic graph with our previous work of using constrained shortest path to identify the topology of the secondary structures. We show a reduction of 34.55% of run-time compared with the naïve way of handling the inaccuracies. We also show an improved accuracy when the potential secondary structure errors are explicitly sampled versus the use of one consensus prediction. Our framework demonstrated the potential of developing computationally effective exact algorithms to identify the optimal topology of the secondary structures when the inaccuracy of the predicted data is considered.
Affiliation(s)
- Abhishek Biswas, Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA
- Dong Si, Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA
- Kamal Al Nasr, Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA
- Desh Ranjan, Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA
- Mohammad Zubair, Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA
- Jing He, Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA
|
26
|
Si D, Ji S, Nasr KA, He J. A Machine Learning Approach for the Identification of Protein Secondary Structure Elements from Electron Cryo-Microscopy Density Maps. Biopolymers 2012; 97:698-708. [DOI: 10.1002/bip.22063] [Citation(s) in RCA: 75] [Impact Index Per Article: 5.8] [Indexed: 01/21/2023]
|
27
|
Reconstructing virus structures from nanometer to near-atomic resolutions with cryo-electron microscopy and tomography. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2012; 726:49-90. [PMID: 22297510 DOI: 10.1007/978-1-4614-0980-9_4] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.8] [Indexed: 12/02/2022]
Abstract
The past few decades have seen tremendous advances in single-particle electron cryo-microscopy (cryo-EM). The field has matured to the point that near-atomic resolution density maps can be generated for icosahedral viruses without the need for crystallization. In parallel, substantial progress has been made in determining the structures of nonicosahedrally arranged proteins in viruses by employing either single-particle cryo-EM or cryo-electron tomography (cryo-ET). Implicit in this course have been the availability of a new generation of electron cryo-microscopes and the development of the computational tools that are essential for generating these maps and models. This methodology has enabled structural biologists to analyze structures in increasing detail for virus particles that are in different morphogenetic states. Furthermore, electron imaging of frozen, hydrated cells, in the process of being infected by viruses, has also opened up a new avenue for studying virus structures "in situ". Here we present the common techniques used to acquire and process cryo-EM and cryo-ET data and discuss their implications for structural virology both now and in the future.
|
28
|
Bajaj C, Goswami S, Zhang Q. Detection of secondary and supersecondary structures of proteins from cryo-electron microscopy. J Struct Biol 2011; 177:367-81. [PMID: 22186625 DOI: 10.1016/j.jsb.2011.11.032] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 03/28/2011] [Revised: 11/09/2011] [Accepted: 11/15/2011] [Indexed: 11/30/2022]
Abstract
Recent advances in three-dimensional electron microscopy (3D EM) have enabled the quantitative visualization of the structural building blocks of proteins at improved resolutions. We provide algorithms to detect the secondary structures (α-helices and β-sheets) from proteins for which the volumetric maps are reconstructed at 6-10Å resolution. Additionally, we show that when the resolution is coarser than 10Å, some of the supersecondary structures can be detected from 3D EM maps. For both these algorithms, we employ tools from computational geometry and differential topology, specifically the computation of stable/unstable manifolds of certain critical points of the distance function induced by the molecular surface. Our results connect mathematically well-defined constructions with bio-chemically induced structures observed in proteins.
Affiliation(s)
- Chandrajit Bajaj, Center for Computational Visualization, The Institute for Computational Engineering and Sciences, Department of Computer Science, The University of Texas at Austin, University Station C0200, Austin, TX 78712, USA.
|
29
|
Al Nasr K, Ranjan D, Zubair M, He J. Ranking valid topologies of the secondary structure elements using a constraint graph. J Bioinform Comput Biol 2011; 9:415-30. [DOI: 10.1142/s0219720011005604] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.1] [Received: 02/04/2011] [Revised: 04/12/2011] [Accepted: 04/17/2011] [Indexed: 11/18/2022]
Abstract
Electron cryo-microscopy is a fast advancing biophysical technique to derive three-dimensional structures of large protein complexes. Using this technique, many density maps have been generated at intermediate resolution such as 6–10 Å resolution. Although it is challenging to derive the backbone of the protein directly from such density maps, secondary structure elements such as helices and β-sheets can be computationally detected. Our work in this paper provides an approach to enumerate the top-ranked possible topologies instead of enumerating the entire population of the topologies. This approach is particularly practical for large proteins. We developed a directed weighted graph, the topology graph, to represent the secondary structure assignment problem. We prove that the problem of finding the valid topology with the minimum cost is NP-hard. We developed an O(N^2 2^N) dynamic programming algorithm to identify the topology with the minimum cost. The test of 15 proteins suggests that our dynamic programming approach is feasible to work with proteins of much larger size than we could before. The largest protein in the test contains 18 helical sticks detected from the density map out of 33 helices in the protein.
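To make the O(N^2 2^N) bound in the abstract above concrete, here is a minimal sketch of a subset dynamic program that picks an order and a direction for N helical sticks. It is an illustration only and not the authors' topology-graph algorithm: the cost model (`link_cost`, comparing the gap between consecutive sticks with an assumed loop length), the stick representation as endpoint pairs, and all function names are hypothetical. A brute-force enumerator over all N! 2^N topologies is included for comparison.

```python
from itertools import permutations, product
from math import dist, inf


def link_cost(sticks, loop_len, k, s, d, t, e):
    """Hypothetical penalty for joining stick s (direction d) to stick t (direction e).

    A stick traversed in direction d is entered at endpoint d and left at endpoint 1 - d.
    loop_len[k] is an assumed spatial length of the k-th connecting loop.
    """
    exit_pt = sticks[s][1 - d]
    entry_pt = sticks[t][e]
    return abs(dist(exit_pt, entry_pt) - loop_len[k])


def min_cost_topology(sticks, loop_len):
    """Subset DP: O(2^N * N) states, O(N) transitions each, so O(N^2 2^N) overall."""
    n = len(sticks)
    full = (1 << n) - 1
    # dp[(mask, s, d)] = cheapest way to use exactly the sticks in mask, ending on s with direction d
    dp = {(1 << s, s, d): 0.0 for s in range(n) for d in (0, 1)}
    for mask in range(1, full + 1):
        k = bin(mask).count("1") - 1  # index of the next loop to place
        for s in range(n):
            if not (mask >> s) & 1:
                continue
            for d in (0, 1):
                cur = dp.get((mask, s, d), inf)
                if cur == inf:
                    continue
                for t in range(n):
                    if (mask >> t) & 1:
                        continue
                    for e in (0, 1):
                        key = (mask | (1 << t), t, e)
                        cand = cur + link_cost(sticks, loop_len, k, s, d, t, e)
                        if cand < dp.get(key, inf):
                            dp[key] = cand
    return min(dp[(full, s, d)] for s in range(n) for d in (0, 1))


def brute_force(sticks, loop_len):
    """Reference answer: enumerate every order and direction (N! * 2^N topologies)."""
    n = len(sticks)
    best = inf
    for perm in permutations(range(n)):
        for dirs in product((0, 1), repeat=n):
            total = 0.0
            for k in range(n - 1):
                total += link_cost(sticks, loop_len, k, perm[k], dirs[k], perm[k + 1], dirs[k + 1])
            best = min(best, total)
    return best


if __name__ == "__main__":
    # Three made-up sticks, each given as a pair of 3D endpoints, and two loop lengths.
    sticks = [((0, 0, 0), (0, 0, 10)), ((5, 0, 10), (5, 0, 0)), ((10, 0, 0), (10, 0, 10))]
    loops = [5.0, 5.0]
    print(min_cost_topology(sticks, loops), brute_force(sticks, loops))
```

Both functions return the same minimum, but the dynamic program reuses partial results keyed by the set of sticks already placed, which is what replaces the factorial term with 2^N.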
Affiliation(s)
- Kamal Al Nasr, Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA
- Desh Ranjan, Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA
- Mohammad Zubair, Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA
- Jing He, Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA
|
30
|
Lu Y, He J, Strauss CEM. Deriving topology and sequence alignment for the helix skeleton in low-resolution protein density maps. J Bioinform Comput Biol 2011; 6:183-201. [DOI: 10.1142/s0219720008003357] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Received: 06/01/2007] [Revised: 10/07/2007] [Accepted: 10/13/2007] [Indexed: 11/18/2022]
Abstract
Cryoelectron microscopy (cryoEM) is an experimental technique to determine the three-dimensional (3D) structure of large protein complexes. Currently, this technique is able to generate protein density maps at 6–9 Å resolution, at which the skeleton of the structure (which is composed of α-helices and β-sheets) can be visualized. As a step towards predicting the entire backbone of the protein from the protein density map, we developed a method to predict the topology and sequence alignment for the skeleton helices. Our method combines the geometrical information of the skeleton helices with the Rosetta ab initio structure prediction method to derive a consensus topology and sequence alignment for the skeleton helices. We tested the method with 60 proteins. For 45 proteins, the majority of the skeleton helices were assigned a correct topology from one of our top ten predictions. The offsets of the alignment for most of the assigned helices were within ±2 amino acids in the sequence. We also analyzed the use of the skeleton helices as a clustering tool for the decoy structures generated by Rosetta. Our comparison suggests that the topology clustering is a better method than a general overlap clustering method to enrich the ranking of decoys, particularly when the decoy pool is small.
Affiliation(s)
- Yonggang Lu, Department of Computer Science, New Mexico State University, Las Cruces, NM 88003, USA
- Jing He, Department of Computer Science, New Mexico State University, Las Cruces, NM 88003, USA
- Charlie E. M. Strauss, Bioscience Division, M888, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
|
31
|
Baker ML, Abeysinghe SS, Schuh S, Coleman RA, Abrams A, Marsh MP, Hryc CF, Ruths T, Chiu W, Ju T. Modeling protein structure at near atomic resolutions with Gorgon. J Struct Biol 2011; 174:360-73. [PMID: 21296162 PMCID: PMC3078171 DOI: 10.1016/j.jsb.2011.01.015] [Citation(s) in RCA: 66] [Impact Index Per Article: 4.7] [Received: 11/30/2010] [Revised: 01/27/2011] [Accepted: 01/31/2011] [Indexed: 11/29/2022]
Abstract
Electron cryo-microscopy (cryo-EM) has played an increasingly important role in elucidating the structure and function of macromolecular assemblies in near native solution conditions. Typically, however, only non-atomic resolution reconstructions have been obtained for these large complexes, necessitating computational tools for integrating and extracting structural details. With recent advances in cryo-EM, maps at near-atomic resolutions have been achieved for several macromolecular assemblies from which models have been manually constructed. In this work, we describe a new interactive modeling toolkit called Gorgon targeted at intermediate to near-atomic resolution density maps (10-3.5 Å), particularly from cryo-EM. Gorgon's de novo modeling procedure couples sequence-based secondary structure prediction with feature detection and geometric modeling techniques to generate initial protein backbone models. Beyond model building, Gorgon is an extensible interactive visualization platform with a variety of computational tools for annotating a wide variety of 3D volumes. Examples from cryo-EM maps of Rotavirus and Rice Dwarf Virus are used to demonstrate its applicability to modeling protein structure.
Affiliation(s)
- Matthew L Baker, National Center for Macromolecular Imaging, Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX 77030, USA.
|
32
|
Al Nasr K, Sun W, He J. Structure prediction for the helical skeletons detected from the low resolution protein density map. BMC Bioinformatics 2010; 11 Suppl 1:S44. [PMID: 20122218 PMCID: PMC3009517 DOI: 10.1186/1471-2105-11-s1-s44] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 11/19/2022]
Abstract
Background: The current advances in electron cryo-microscopy technique have made it possible to obtain protein density maps at about 6-10 Å resolution. Although it is hard to derive the protein chain directly from such a low resolution map, the location of the secondary structures such as helices and strands can be computationally detected. It has been demonstrated that such a low-resolution map can be used during the protein structure prediction process to enhance the structure prediction. Results: We have developed an approach to predict the 3-dimensional structure for the helical skeletons that can be detected from the low resolution protein density map. This approach does not require the construction of the entire chain and distinguishes the structures based on the conformation of the helices. A test with 35 low resolution density maps shows that the highest ranked structure with the correct topology can be found within the top 1% of the list ranked by the effective energy formed by the helices. Conclusion: The results in this paper suggest that it is possible to eliminate the great majority of the bad conformations of the helices even without the construction of the entire chain of the protein. For many proteins, the effective contact energy formed by the secondary structures alone can distinguish a small set of likely structures from the pool.
Affiliation(s)
- Kamal Al Nasr, Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA.
|
33
|
Abstract
Today, electron cryomicroscopy (cryo-EM) can routinely achieve subnanometer resolutions of complex macromolecular assemblies. From a density map, one can extract key structural and functional information using a variety of computational analysis tools. At subnanometer resolution, these tools make it possible to isolate individual subunits, identify secondary structures, and accurately fit atomic models. With several cryo-EM studies achieving resolutions beyond 5Å, computational modeling and feature recognition tools have been employed to construct backbone and atomic models of the protein components directly from a density map. In this chapter, we describe several common classes of computational tools that can be used to analyze and model subnanometer resolution reconstructions from cryo-EM. A general protocol for analyzing subnanometer resolution density maps is presented along with a full description of steps used in analyzing the 4.3Å resolution structure of Mm-cpn.
Affiliation(s)
- Matthew L Baker, National Center for Macromolecular Imaging, Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas, USA
|
34
|
An automated procedure for detecting protein folds from sub-nanometer resolution electron density. J Struct Biol 2009; 170:513-21. [PMID: 20026407 DOI: 10.1016/j.jsb.2009.12.014] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 11/17/2009] [Revised: 12/09/2009] [Accepted: 12/15/2009] [Indexed: 11/24/2022]
Abstract
The use of sub-nanometer resolution electron density as spatial constraints for de novo and ab initio structure prediction requires knowledge of protein boundaries to accurately segment the electron density for the prediction algorithms. Here we present a procedure where even poorly segmented density can be used to determine the fold of the protein. The method is automated, fast, capable of searching for multiple copies of a protein fold, and accessible to densities encompassing more than a thousand residues. The automation is particularly powerful as it allows the procedure to take full advantage of the expanding repository in the Protein Data Bank. We have tested the method on nine segmented sub-nanometer image reconstruction electron densities. The method successfully identifies the correct fold for the six densities for which an atomic structure is known, identifies a fold that agrees with prior structural data, a fold that agrees with predictions from the Fold & Function Assignment server, and a fold that correlates with secondary structure prediction. The identified folds in the last three examples can be used as templates for comparative modeling of the bacteriophage P22 tail-machine (a 3MDa complex composed of 39 protein subunits).
|
35
|
Sun W, He J. Native secondary structure topology has near minimum contact energy among all possible geometrically constrained topologies. Proteins 2009; 77:159-73. [PMID: 19415754 DOI: 10.1002/prot.22427] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Indexed: 11/11/2022]
Abstract
Secondary structure topology in this article refers to the order and the direction of the secondary structures, such as helices and strands, with respect to the protein sequence. Even when the locations of the secondary structure Cα atoms are known, there are still (N! 2^N)(M! 2^M) different possible topologies for a protein with N helices and M strands. This work explored the question of whether the native topology is likely to be identified among a large set of all possible geometrically constrained topologies through an evaluation of the residue contact energy formed by the secondary structures, instead of the entire chain. We developed a contact pair specific and distance specific multiwell function based on the statistical characterization of the side chain distances of 413 proteins in the Protein Data Bank. The multiwell function has specific parameters to each of the 210 pairs of residue contacts. We illustrated a general mathematical method to extend a single well function to a multiwell function to represent the statistical data. We have performed a mutation analysis using 50 proteins to generate all the possible geometrically constrained topologies of the secondary structures. The result shows that the native topology is within the top 25% of the list ranked by the effective contact energies of the secondary structures for all the 50 proteins, and is within the top 5% for 34 proteins. As an application, the method was used to derive the structure of the skeletons from a low resolution density map that can be obtained through electron cryomicroscopy.
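The topology count quoted in the abstract above grows very quickly. As a small worked illustration (the function name and the sample values of N and M are chosen here for demonstration and do not come from the paper), the count (N! 2^N)(M! 2^M) can be evaluated directly:

```python
from math import factorial


def topology_count(n_helices: int, n_strands: int) -> int:
    """Return (N! 2^N)(M! 2^M): every ordering of the N helices and of the M strands,
    with each element traversed in one of two directions."""
    helix_part = factorial(n_helices) * 2 ** n_helices
    strand_part = factorial(n_strands) * 2 ** n_strands
    return helix_part * strand_part


if __name__ == "__main__":
    # Illustrative sizes only; these are not proteins from the study.
    for n, m in [(3, 0), (5, 4), (10, 6)]:
        print(f"N={n}, M={m}: {topology_count(n, m)} topologies")
```

Three helices alone already give 3! x 2^3 = 48 topologies, and five helices with four strands give 1,474,560, which is why ranking candidates by an energy term rather than inspecting each one becomes attractive.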
Affiliation(s)
- Weitao Sun, Department of Computer Science, New Mexico State University, Las Cruces, New Mexico 88003, USA
|
36
|
Heuser P, Langer GG, Lamzin VS. Interpretation of very low resolution X-ray electron-density maps using core objects. Acta Crystallographica. Section D, Biological Crystallography 2009; 65:690-6. [PMID: 19564689 PMCID: PMC2703575 DOI: 10.1107/s090744490901991x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 02/25/2009] [Accepted: 05/25/2009] [Indexed: 11/11/2022]
Abstract
A novel approach to obtaining structural information from macromolecular X-ray data extending to resolutions as low as 20 Å is presented. Following a simple map-segmentation procedure, the approximate shapes of the domains forming the structure are identified. A pattern-recognition comparative analysis of these shapes and those derived from the structures of domains from the PDB results in candidate structural models that can be used for a fit into the density map. It is shown that the placed candidate models can be employed for subsequent phase extension to higher resolution.
Affiliation(s)
- Philipp Heuser
- Hamburg Unit, European Molecular Biology Laboratory, c/o DESY, Notkestrasse 85, Hamburg 22603, Germany.
|
37
|
Lindert S, Stewart PL, Meiler J. Hybrid approaches: applying computational methods in cryo-electron microscopy. Curr Opin Struct Biol 2009; 19:218-25. [PMID: 19339173 PMCID: PMC2726835 DOI: 10.1016/j.sbi.2009.02.010] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.9] [Received: 12/09/2008] [Accepted: 02/26/2009] [Indexed: 12/20/2022]
Abstract
Recent advances in cryo-electron microscopy have led to an increasing number of high (3-5 Å) to medium (5-10 Å) resolution cryoEM density maps. These density maps contain valuable information about the protein structure but frequently require computational algorithms to aid their structural interpretation. It is these hybrid approaches between cryoEM and computational protein structure prediction algorithms that will shape protein structure elucidation from density maps.
Affiliation(s)
- Steffen Lindert
- Department of Chemistry, Vanderbilt University, Nashville, TN 37212, USA
- Center for Structural Biology, Vanderbilt University, Nashville, TN 37212, USA
- Phoebe L. Stewart
- Department of Molecular Physiology and Biophysics, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Center for Structural Biology, Vanderbilt University, Nashville, TN 37212, USA
- Jens Meiler
- Department of Chemistry, Vanderbilt University, Nashville, TN 37212, USA
- Center for Structural Biology, Vanderbilt University, Nashville, TN 37212, USA
|
38
|
Sun W, He J. Reduction of the secondary structure topological space through direct estimation of the contact energy formed by the secondary structures. BMC Bioinformatics 2009; 10 Suppl 1:S40. [PMID: 19208142 PMCID: PMC2648730 DOI: 10.1186/1471-2105-10-s1-s40] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022]
Abstract
Background: Electron cryomicroscopy is a fast developing technique aiming at the determination of the 3-dimensional structures of large protein complexes. Using this technique, protein density maps can be generated with 6 to 10 Å resolution. At such resolutions, the secondary structure elements such as helices and β-strands appear to be skeletons and can be computationally detected. However, it is not known which segment of the protein sequence corresponds to which of the skeletons. The topology in this paper refers to the linear order and the directionality of the secondary structures. For a protein with N helices and M strands, there are (N! 2^N)(M! 2^M) different topologies, each of which maps N helix segments and M strand segments on the protein sequence to N helix and M strand skeletons. Since the backbone position is not available in the skeleton, each topology of the skeletons corresponds to additional freedom to position the atoms in the skeletons. Results: We have developed a method to construct the possible atomic structures for the helix skeletons by sampling the solution space of all the possible topologies of the skeletons. Our method also ranks the possible structures based on the contact energy formed by the secondary structures, rather than the entire chain. If we assume that the backbone atomic positions are known for the skeletons, then the native topology of the secondary structures can be found in the top 30% of the ranked list of all possible topologies for all the 30 proteins tested, and within the top 5% for most of the 30 proteins. Without assuming the backbone location of the skeletons, the possible atomic structures of the skeletons can be constructed using the axis of the skeleton and the sequence segments. The best constructed structure for the skeletons has RMSD to native between 4 and 5 Å for the four tested α-proteins. These best constructed structures were ranked the 17th, 31st, 16th and 5th respectively for the four proteins out of 32066, 391833, 98755 and 192935 possible assignments in the pool. Conclusion: Our work suggested that the direct estimation of the contact energy formed by the secondary structures is quite effective in reducing the topological space to a small subset that includes a near native structure for the skeletons.
Affiliation(s)
- Weitao Sun
- Department of Computer Science, New Mexico State University, Las Cruces, 88003, USA.
|
39
|
Yu Z, Bajaj C. Computational approaches for automatic structural analysis of large biomolecular complexes. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2008; 5:568-582. [PMID: 18989044 DOI: 10.1109/tcbb.2007.70226] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.5] [Indexed: 05/27/2023]
Abstract
We present computational solutions to two problems of macromolecular structure interpretation from reconstructed three-dimensional electron microscopy (3D-EM) maps of large bio-molecular complexes at intermediate resolution (5-15 Å). The two problems addressed are: 1) 3D structural alignment (matching) between identified and segmented 3D maps of structure units (e.g., trimeric configuration of proteins), and 2) the secondary structure identification of a segmented protein 3D map (i.e., locations of alpha-helices, beta-sheets). For problem 1, we present an efficient algorithm to correlate spatially (and structurally) two 3D maps of structure units. Besides providing a similarity score between structure units, the algorithm yields an effective technique for resolution refinement of repeated structure units, by 3D alignment and averaging. For problem 2, we present an efficient algorithm to compute eigenvalues and link eigenvectors of a Gaussian convoluted structure tensor derived from the protein 3D map, thereby identifying and locating secondary structural motifs of proteins. The efficiency and performance of our approach are demonstrated on several experimentally reconstructed 3D maps of virus capsid shells from single-particle cryo-electron microscopy (cryo-EM), as well as computationally simulated protein structure density 3D maps generated from protein model entries in the Protein Data Bank.
Affiliation(s)
- Zeyun Yu
- Department of Computer Science, University of Wisconsin, Milwaukee, WI 53211, USA.
|
40
|
Wu Y, Tian X, Lu M, Chen M, Wang Q, Ma J. Folding of small helical proteins assisted by small-angle X-ray scattering profiles. Structure 2005; 13:1587-97. [PMID: 16271882 DOI: 10.1016/j.str.2005.07.023] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Received: 05/25/2005] [Revised: 07/21/2005] [Accepted: 07/22/2005] [Indexed: 10/25/2022]
Abstract
This paper reports a computational method for folding small helical proteins. The goal was to determine the overall topology of proteins given secondary structure assignment on sequence. In doing so, a Monte Carlo protocol, which combines coarse-grained normal modes and a Hamiltonian at a different scale, was developed to enhance sampling. In addition to the knowledge-based potential functions, a small-angle X-ray scattering (SAXS) profile was also used as a weak constraint for guiding the folding. The algorithm can deliver structural models with overall correct topology, which makes them similar to those of 5-6 Å cryo-EM density maps. The success could contribute to making the SAXS technique a fast and inexpensive solution-phase experimental method for determining the overall topology of small, soluble, but noncrystallizable, helical proteins.
Affiliation(s)
- Yinghao Wu
- Department of Bioengineering, Rice University, Houston, Texas 77005, USA
|
41
|
Baker ML, Ju T, Chiu W. Identification of secondary structure elements in intermediate-resolution density maps. Structure 2007; 15:7-19. [PMID: 17223528 PMCID: PMC1810566 DOI: 10.1016/j.str.2006.11.008] [Citation(s) in RCA: 135] [Impact Index Per Article: 7.5] [Received: 08/03/2006] [Revised: 11/10/2006] [Accepted: 11/18/2006] [Indexed: 11/25/2022]
Abstract
An increasing number of structural studies of large macromolecular complexes, both in X-ray crystallography and cryo-electron microscopy, have resulted in intermediate-resolution (5-10 Å) density maps. Despite being limited in resolution, significant structural and functional information may be extractable from these maps. To aid in the analysis and annotation of these complexes, we have developed SSEhunter, a tool for the quantitative detection of alpha helices and beta sheets. Based on density skeletonization, local geometry calculations, and a template-based search, SSEhunter has been tested and validated on a variety of simulated and authentic subnanometer-resolution density maps. The result is a robust, user-friendly approach that allows users to quickly visualize, assess, and annotate intermediate-resolution density maps. Beyond secondary structure element identification, the skeletonization algorithm in SSEhunter provides secondary structure topology, which is potentially useful in leading to structural models of individual molecular components directly from the density.
Affiliation(s)
- Matthew L. Baker
- National Center for Macromolecular Imaging, Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX 77030
- Tao Ju
- Department of Computer Science and Engineering, Washington University in St. Louis, St. Louis, MO 63130
- Wah Chiu
- National Center for Macromolecular Imaging, Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX 77030
- *Corresponding author, Phone: 713-798-6985, Fax: 713-798-8682
|
42
|
Dror O, Lasker K, Nussinov R, Wolfson H. EMatch: an efficient method for aligning atomic resolution subunits into intermediate-resolution cryo-EM maps of large macromolecular assemblies. Acta Crystallographica. Section D, Biological Crystallography 2007; 63:42-9. [PMID: 17164525 PMCID: PMC2483490 DOI: 10.1107/s0907444906041059] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Received: 02/09/2006] [Accepted: 10/08/2006] [Indexed: 11/22/2022]
Abstract
Structural analysis of biological machines is essential for inferring their function and mechanism. Nevertheless, owing to their large size and instability, deciphering the atomic structure of macromolecular assemblies is still considered as a challenging task that cannot keep up with the rapid advances in the protein-identification process. In contrast, structural data at lower resolution is becoming more and more available owing to recent advances in cryo-electron microscopy (cryo-EM) techniques. Once a cryo-EM map is acquired, one of the basic questions asked is what are the folds of the components in the assembly and what is their configuration. Here, a novel knowledge-based computational method, named EMatch, towards tackling this task for cryo-EM maps at 6-10 Å resolution is presented. The method recognizes and locates possible atomic resolution structural homologues of protein domains in the assembly. The strengths of EMatch are demonstrated on a cryo-EM map of native GroEL at 6 Å resolution.
Affiliation(s)
- Oranit Dror
- School of Computer Science, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv 69978, Israel
- Keren Lasker
- School of Computer Science, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv 69978, Israel
- Ruth Nussinov
- Department of Human Genetics and Molecular Medicine, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
- Basic Research Program, SAIC-Frederick, Center for Cancer Research Nanobiology Program, NCI-Frederick, Building 469, Room 151, Frederick, MD 21702 USA
- Haim Wolfson
- School of Computer Science, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv 69978, Israel
|
43
|
Lasker K, Dror O, Shatsky M, Nussinov R, Wolfson HJ. EMatch: discovery of high resolution structural homologues of protein domains in intermediate resolution cryo-EM maps. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2007; 4:28-39. [PMID: 17277411 DOI: 10.1109/tcbb.2007.1003] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.5] [Indexed: 05/13/2023]
Abstract
Cryo-EM has become an increasingly powerful technique for elucidating the structure, dynamics, and function of large flexible macromolecule assemblies that cannot be determined at atomic resolution. However, due to the relatively low resolution of cryo-EM data, a major challenge is to identify components of complexes appearing in cryo-EM maps. Here, we describe EMatch, a novel integrated approach for recognizing structural homologues of protein domains present in a 6-10 Å resolution cryo-EM map and constructing a quasi-atomic structural model of their assembly. The method is highly efficient and has been successfully validated on various simulated data. The strength of the method is demonstrated by a domain assembly of an experimental cryo-EM map of native GroEL at 6 Å resolution.
Affiliation(s)
- Keren Lasker
- School of Computer Science, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Israel.
|
44
|
Baker ML, Jiang W, Wedemeyer WJ, Rixon FJ, Baker D, Chiu W. Ab initio modeling of the herpesvirus VP26 core domain assessed by CryoEM density. PLoS Comput Biol 2006; 2:e146. [PMID: 17069457 PMCID: PMC1626159 DOI: 10.1371/journal.pcbi.0020146] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.2] [Received: 04/24/2006] [Accepted: 09/26/2006] [Indexed: 12/22/2022]
Abstract
Efforts in structural biology have targeted the systematic determination of all protein structures through experimental determination or modeling. In recent years, 3-D electron cryomicroscopy (cryoEM) has assumed an increasingly important role in determining the structures of these large macromolecular assemblies to intermediate resolutions (6-10 Å). While these structures provide a snapshot of the assembly and its components in well-defined functional states, the resolution limits the ability to build accurate structural models. In contrast, sequence-based modeling techniques are capable of producing relatively robust structural models for isolated proteins or domains. In this work, we developed and applied a hybrid modeling approach, utilizing cryoEM density and ab initio modeling to produce a structural model for the core domain of a herpesvirus structural protein, VP26. Specifically, this method, first tested on simulated data, utilizes the cryoEM density map as a geometrical constraint in identifying the most native-like models from a gallery of models generated by ab initio modeling. The resulting model for the core domain of VP26, based on the 8.5-Å resolution herpes simplex virus type 1 (HSV-1) capsid cryoEM structure and mutational data, exhibited a novel fold. Additionally, the core domain of VP26 appeared to have a complementary interface to the known upper-domain structure of VP5, its cognate binding partner. While this new model provides for a better understanding of the assembly and interactions of VP26 in HSV-1, the approach itself may have broader applications in modeling the components of large macromolecular assemblies.
Affiliation(s)
- Matthew L Baker
- National Center for Macromolecular Imaging, Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas, United States of America
- Wen Jiang
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, United States of America
- William J Wedemeyer
- Department of Biochemistry, Michigan State University, East Lansing, Michigan, United States of America
- Frazer J Rixon
- MRC Virology Unit, Institute of Virology, Glasgow, United Kingdom
- David Baker
- Department of Biochemistry, University of Washington, Seattle, Washington, United States of America
- Wah Chiu
- National Center for Macromolecular Imaging, Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas, United States of America
|
45
|
Chiu W, Baker ML, Almo SC. Structural biology of cellular machines. Trends Cell Biol 2006; 16:144-50. [PMID: 16459078 DOI: 10.1016/j.tcb.2006.01.002] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.8] [Received: 04/02/2005] [Revised: 12/06/2005] [Accepted: 01/19/2006] [Indexed: 01/29/2023]
Abstract
Multi-component macromolecular machines contribute to all essential biological processes, from cell motility and signal transduction to information storage and processing. Structural analysis of assemblies at atomic resolution is emerging as the field of structural cell biology. Several recent studies, including those focused on the ribosome, the acrosomal bundle and bacterial flagella, have demonstrated the ability of a hybrid approach that combines imaging, crystallography and computational tools to generate testable atomic models of fundamental biological machines. A complete understanding of cellular and systems biology will require the detailed structural understanding of hundreds of biological machines. The realization of this goal demands a concerted effort to develop and apply new strategies for the systematic identification, isolation, structural characterization and mechanistic analysis of multi-component assemblies at all resolution ranges. The establishment of a database describing the structural and dynamic properties of protein assemblies will provide novel opportunities to define the molecular and atomic mechanisms controlling overall cell physiology.
Affiliation(s)
- Wah Chiu
- National Center for Macromolecular Imaging and Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX 77030, USA.
|
46
|
Bajaj C, Goswami S. Secondary and Tertiary Structural Fold Elucidation from 3D EM Maps of Macromolecules. Computer Vision, Graphics and Image Processing 2006. [DOI: 10.1007/11949619_24] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 01/12/2023]
|
47
|
Topf M, Sali A. Combining electron microscopy and comparative protein structure modeling. Curr Opin Struct Biol 2005; 15:578-85. [PMID: 16118050 DOI: 10.1016/j.sbi.2005.08.001] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.7] [Received: 05/24/2005] [Revised: 07/01/2005] [Accepted: 08/10/2005] [Indexed: 10/25/2022]
Abstract
Recently, advances have been made in methods and applications that integrate electron microscopy density maps and comparative modeling to produce atomic structures of macromolecular assemblies. Electron microscopy can benefit from comparative modeling through the fitting of comparative models into electron microscopy density maps. Also, comparative modeling can benefit from electron microscopy through the use of intermediate-resolution density maps in fold recognition, template selection and sequence-structure alignment.
Affiliation(s)
- Maya Topf
- Department of Biopharmaceutical Sciences, University of California San Francisco, San Francisco, CA 94143, USA
|
48
|
Rossmann MG, Morais MC, Leiman PG, Zhang W. Combining X-ray crystallography and electron microscopy. Structure 2005; 13:355-62. [PMID: 15766536 PMCID: PMC7173138 DOI: 10.1016/j.str.2005.01.005] [Citation(s) in RCA: 87] [Impact Index Per Article: 4.4] [Received: 11/16/2004] [Revised: 12/29/2004] [Accepted: 01/01/2005] [Indexed: 11/30/2022]
Abstract
The combination of cryo-electron microscopy to study large biological assemblies at low resolution with crystallography to determine near atomic structures of assembly fragments is quickly expanding the horizon of structural biology. This technique can be used to advantage in the study of large structures that cannot be crystallized, to follow dynamic processes, and to “purify” samples by visual selection of particles. Factors affecting the quality of cryo-electron microscopy maps and limits of accuracy in fitting known structural fragments are discussed.
Affiliation(s)
- Michael G Rossmann
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana 47907-2054, USA.
|
49
|
Wu Y, Chen M, Lu M, Wang Q, Ma J. Determining Protein Topology from Skeletons of Secondary Structures. J Mol Biol 2005; 350:571-86. [PMID: 15961102 DOI: 10.1016/j.jmb.2005.04.064] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.4] [Received: 11/24/2004] [Revised: 04/24/2005] [Accepted: 04/27/2005] [Indexed: 11/16/2022]
Abstract
We report a novel computational procedure for determining protein native topology, or fold, by defining loop connectivity based on skeletons of secondary structures that can usually be obtained from low to intermediate-resolution density maps. The procedure primarily involves a knowledge-based geometry filter followed by an energetics-based evaluation. It was tested on a large set of skeletons covering a wide range of protein architecture, including one modeled from an experimentally determined 7.6 Å cryo-electron microscopy (cryo-EM) density map. The results showed that the new procedure could effectively deduce protein folds without high-resolution structural data, a feature that could also be used to recognize native fold in structure prediction and to interpret data in fields like structure genomics. Most importantly, in the energetics-based evaluation, it was revealed that, despite the inevitable errors in the artificially constructed structures and limited accuracy of knowledge-based potential functions, the average energy of an ensemble of structures with slightly different configurations around the native skeleton is a much more robust parameter for marking native topology than the energy of individual structures in the ensemble. This result implies that, among all the possible topology candidates for a given skeleton, evolution has selected the native topology as the one that can accommodate the largest structural variations, not the one rigidly trapped in a deep, but narrow, conformational energy well.
Affiliation(s)
- Yinghao Wu
- Department of Bioengineering, Rice University, Houston, TX 77005, USA
|
50
|
Chiu W, Baker ML, Jiang W, Dougherty M, Schmid MF. Electron cryomicroscopy of biological machines at subnanometer resolution. Structure 2005; 13:363-72. [PMID: 15766537 DOI: 10.1016/j.str.2004.12.016] [Citation(s) in RCA: 101] [Impact Index Per Article: 5.1] [Received: 12/03/2004] [Revised: 12/22/2004] [Accepted: 12/25/2004] [Indexed: 01/29/2023]
Abstract
Advances in electron cryomicroscopy (cryo-EM) have made possible the structural determination of large biological machines in the resolution range of 6-9 angstroms. Rice dwarf virus and the acrosomal bundle represent two distinct types of machines amenable to cryo-EM investigations at subnanometer resolutions. However, calculating the density map is only the first step, and much analysis remains to extract structural insights and the mechanism of action in these machines. This paper will review the computational and visualization methodologies necessary for analysis (structure mining) of the computed cryo-EM maps of these machines. These steps include component segmentation, averaging based on local symmetry among components, density connectivity trace, incorporation of bioinformatics analysis, and fitting of high-resolution component data, if available. The consequences of these analyses can not only identify accurately some of the secondary structure elements of the molecular components in machines but also suggest structural mechanisms related to their biological functions.
Affiliation(s)
- Wah Chiu
- National Center for Macromolecular Imaging, Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas 77030, USA.
|