1
Punuru P, Jain A, Kihara D. Secondary Structure Detection and Structure Modeling for Cryo-EM. Methods Mol Biol 2025; 2870:341-355. [PMID: 39543043 DOI: 10.1007/978-1-0716-4213-9_17] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2024]
Abstract
Rapid advancements in cryogenic electron microscopy (cryo-EM) have revolutionized the field of structural biology by enabling the determination of complex macromolecular structures at unprecedented resolutions. When cryo-EM density maps have a resolution around 3 Å, the atomic structure can be modeled manually. However, as the resolution decreases, analyzing these density maps becomes increasingly challenging. For modeling structures in lower resolution maps, deep learning can be used to identify structural features in the maps to assist in structure modeling. Here, we present a suite of deep learning-based tools developed by our lab that enable structural biologists to work with cryo-EM maps of a wide range of resolutions. For cryo-EM maps at near-atomic resolution (5 Å or better), DeepMainmast automatically models all-atom structures by tracing the main chain from local map features of amino acids and atoms detected by deep learning; DAQ score quantifies map-model fit and indicates potential misassignments in protein models. In intermediate resolution maps (5-10 Å), Emap2sec and Emap2sec+ can accurately detect protein secondary structures and nucleic acids. These tools and more are available at our web server: https://em.kiharalab.org/.
Affiliation(s)
- Pranav Punuru
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Anika Jain
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Daisuke Kihara
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
2
Luo D, Alsuwaykit Z, Khan D, Strnad O, Isenberg T, Viola I. DiffFit: Visually-Guided Differentiable Fitting of Molecule Structures to a Cryo-EM Map. IEEE Transactions on Visualization and Computer Graphics 2025; 31:558-568. [PMID: 39255135 DOI: 10.1109/tvcg.2024.3456404] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/12/2024]
Abstract
We introduce DiffFit, a differentiable algorithm for fitting protein atomistic structures into an experimental reconstructed Cryo-Electron Microscopy (cryo-EM) volume map. In structural biology, this process is necessary to semi-automatically composite large mesoscale models of complex protein assemblies and complete cellular structures that are based on measured cryo-EM data. The current approaches require manual fitting in three dimensions to start, resulting in approximately aligned structures followed by an automated fine-tuning of the alignment. The DiffFit approach enables domain scientists to fit new structures automatically and visualize the results for inspection and interactive revision. The fitting begins with differentiable three-dimensional (3D) rigid transformations of the protein atom coordinates followed by sampling the density values at the atom coordinates from the target cryo-EM volume. To ensure a meaningful correlation between the sampled densities and the protein structure, we proposed a novel loss function based on a multi-resolution volume-array approach and the exploitation of the negative space. This loss function serves as a critical metric for assessing the fitting quality, ensuring the fitting accuracy and an improved visualization of the results. We assessed the placement quality of DiffFit with several large, realistic datasets and found it to be superior to that of previous methods. We further evaluated our method in two use cases: automating the integration of known composite structures into larger protein complexes and facilitating the fitting of predicted protein domains into volume densities to aid researchers in identifying unknown proteins. We implemented our algorithm as an open-source plugin (github.com/nanovis/DiffFit) in ChimeraX, a leading visualization software in the field. All supplemental materials are available at osf.io/5tx4q.
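The two steps named in the abstract above, a rigid transformation of the atom coordinates followed by sampling the map density at the transformed positions, can be illustrated with a minimal NumPy sketch. This is not the DiffFit implementation: it shows the forward pass only (no gradients or optimizer), the function names are hypothetical, coordinates are assumed to be in voxel units, and the mean sampled density used as a score is a simple stand-in rather than the multi-resolution, negative-space loss the authors describe.

```python
import numpy as np


def rotation_from_axis_angle(axis, angle):
    """Rotation matrix for `angle` radians about `axis` (Rodrigues' formula)."""
    x, y, z = np.asarray(axis, dtype=float) / np.linalg.norm(axis)
    K = np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)


def rigid_transform(coords, R, t):
    """Apply x -> R x + t to every row of an (N, 3) coordinate array."""
    return np.asarray(coords, dtype=float) @ R.T + t


def sample_density(volume, coords):
    """Trilinearly interpolate `volume` at fractional voxel coordinates (N, 3).

    Coordinates use the same axis order as `volume`. Points outside the grid
    are clamped to its boundary (an assumption made for this sketch).
    """
    shape = np.array(volume.shape)
    c = np.clip(np.asarray(coords, dtype=float), 0.0, shape - 1.0)
    i0 = np.minimum(np.floor(c).astype(int), shape - 2)  # lower corner voxel
    f = c - i0                                           # fractional offsets
    out = np.zeros(len(c))
    for dx in (0, 1):
        wx = f[:, 0] if dx else 1.0 - f[:, 0]
        for dy in (0, 1):
            wy = f[:, 1] if dy else 1.0 - f[:, 1]
            for dz in (0, 1):
                wz = f[:, 2] if dz else 1.0 - f[:, 2]
                corner = volume[i0[:, 0] + dx, i0[:, 1] + dy, i0[:, 2] + dz]
                out += wx * wy * wz * corner
    return out


def fit_score(volume, coords, R, t):
    """Mean density at the transformed atoms (illustrative score only)."""
    return float(sample_density(volume, rigid_transform(coords, R, t)).mean())
```

In a differentiable setting, the same two functions would be written in an autodiff framework so that the score can be maximized with respect to the rotation and translation parameters.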
3
Baghirov J, Zhu H, Wang X, Kihara D. Protein Secondary Structure and DNA/RNA Detection for Cryo-EM and Cryo-ET Using Emap2sec and Emap2sec+. Methods Mol Biol 2025; 2867:105-120. [PMID: 39576577 DOI: 10.1007/978-1-0716-4196-5_6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2024]
Abstract
Cryo-electron microscopy (cryo-EM) has become a powerful tool for determining the structures of macromolecules, such as proteins and DNA/RNA complexes. While high-resolution cryo-EM maps are increasingly available, there is still a substantial number of maps determined at intermediate or low resolution. These maps present challenges when it comes to extracting structural information. In response, our group has developed two computational methods, Emap2sec and Emap2sec+, to address these challenges and benefit the analysis of cryo-EM maps. In this chapter, we describe how to use the web servers for these two cryo-EM structure analysis programs. Both methods identify local structures in medium-resolution EM maps of 5-10 Å to help find and fit protein and DNA/RNA structures in EM maps. Emap2sec identifies the secondary structures of proteins, while Emap2sec+ also identifies DNA/RNA locations in cryo-EM maps. As cryo-electron tomography (cryo-ET) has started to produce data at this resolution, these methods should be useful for cryo-ET as well. Both methods are available in the form of web servers and source code at https://kiharalab.org/emsuites/.
Affiliation(s)
- Javad Baghirov
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Han Zhu
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Xiao Wang
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Daisuke Kihara
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
4
Wang X, Zhu H, Terashi G, Taluja M, Kihara D. DiffModeler: large macromolecular structure modeling for cryo-EM maps using a diffusion model. Nat Methods 2024; 21:2307-2317. [PMID: 39433880 DOI: 10.1038/s41592-024-02479-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2024] [Accepted: 09/19/2024] [Indexed: 10/23/2024]
Abstract
Cryogenic electron microscopy (cryo-EM) has now been widely used for determining multichain protein complexes. However, modeling a large complex structure, such as those with more than ten chains, is challenging, particularly when the map resolution decreases. Here we present DiffModeler, a fully automated method for modeling large protein complex structures. DiffModeler employs a diffusion model for backbone tracing and integrates AlphaFold2-predicted single-chain structures for structure fitting. DiffModeler showed an average template modeling score of 0.88 and 0.91 for two datasets of cryo-EM maps of 0-5 Å resolution and 0.92 for intermediate resolution maps (5-10 Å), substantially outperforming existing methodologies. Further benchmarking at low resolutions (10-20 Å) confirms its versatility, demonstrating plausible performance.
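The template modeling score (TM-score) quoted above is a length-normalized similarity measure between a model and a reference structure. As a rough illustration, the sketch below evaluates the standard TM-score formula for one fixed superposition, given the distances between already-aligned residue pairs. The function names are hypothetical, and this is not how DiffModeler or any published TM-score program is implemented: real tools also search for the superposition that maximizes the score and treat very short chains specially, both of which are omitted here.

```python
import math


def tm_d0(target_length):
    """Distance scale d0 of the standard TM-score formula, in angstroms."""
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8 if target_length > 15 else 0.0
    if d0 <= 0.0:
        # Short chains are handled differently by different programs;
        # this sketch refuses them rather than guess a convention.
        raise ValueError("target_length is too short for this sketch")
    return d0


def tm_score_fixed_superposition(distances, target_length):
    """TM-score for one given superposition.

    `distances` holds the distance (in angstroms) between each aligned residue
    pair after superposition; unaligned target residues contribute zero.
    """
    d0 = tm_d0(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


# Example: a 100-residue target where every residue is aligned at 2 A.
example = tm_score_fixed_superposition([2.0] * 100, 100)
```

Scores range from 0 to 1, with 1 reached only when every target residue is aligned at zero distance, which is why averages near 0.9 indicate models that closely match the reference.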
Affiliation(s)
- Xiao Wang
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Han Zhu
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Genki Terashi
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Manav Taluja
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
  - School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, India
- Daisuke Kihara
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
5
Wang X, Zhu H, Terashi G, Taluja M, Kihara D. DiffModeler: Large Macromolecular Structure Modeling in Low-Resolution Cryo-EM Maps Using Diffusion Model. bioRxiv 2024:2024.01.20.576370. [PMID: 38328203 PMCID: PMC10849514 DOI: 10.1101/2024.01.20.576370] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/09/2024]
Abstract
Cryogenic electron microscopy (cryo-EM) has now been widely used for determining multi-chain protein complexes. However, modeling a complex structure is challenging, particularly when the map resolution is low, typically in the intermediate resolution range of 5 to 10 Å. Within this resolution range, even accurate structure fitting is difficult, let alone de novo modeling. To address this challenge, here we present DiffModeler, a fully automated method for modeling protein complex structures. DiffModeler employs a diffusion model for backbone tracing and integrates AlphaFold2-predicted single-chain structures for structure fitting. Extensive testing on cryo-EM maps at intermediate resolutions demonstrates the exceptional accuracy of DiffModeler in structure modeling, achieving an average TM-Score of 0.92, surpassing existing methodologies significantly. Notably, DiffModeler successfully modeled a protein complex composed of 47 chains and 13,462 residues, achieving a high TM-Score of 0.94. Further benchmarking at low resolutions (10-20 Å) confirms its versatility, demonstrating plausible performance. Moreover, when coupled with CryoREAD, DiffModeler excels in constructing protein-DNA/RNA complex structures for near-atomic resolution maps (0-5 Å), showcasing state-of-the-art performance with average TM-Scores of 0.88 and 0.91 across two datasets.
Affiliation(s)
- Xiao Wang
  - Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
- Han Zhu
  - Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
- Genki Terashi
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907, USA
- Manav Taluja
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907, USA
  - School of Computer Science and Engineering, Vellore Institute of Technology, Tamil Nadu 642014, India
- Daisuke Kihara
  - Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907, USA
6
Christoffer C, Kihara D. Domain-Based Protein Docking with Extremely Large Conformational Changes. J Mol Biol 2022; 434:167820. [PMID: 36089054 PMCID: PMC9992458 DOI: 10.1016/j.jmb.2022.167820] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 05/27/2022] [Revised: 08/31/2022] [Accepted: 09/03/2022] [Indexed: 11/17/2022]
Abstract
Proteins are key components in many processes in living cells, and physical interactions with other proteins and nucleic acids often form key parts of their functions. In many cases, large flexibility of proteins as they interact is key to their function. To understand the mechanisms of these processes, it is necessary to consider the 3D structures of such protein complexes. When such structures are not yet experimentally determined, protein docking has long been used to computationally generate useful structure models. However, protein docking has long had the limitation that the consideration of flexibility is usually limited to very small movements or very small structures. Methods have been developed which handle minor flexibility via normal mode or other structure sampling, but new methods are required to model ordered proteins which undergo large-scale conformational changes to elucidate their function at the molecular level. Here, we present Flex-LZerD, a framework for docking such complexes. Via partial assembly multidomain docking and an iterative normal mode analysis admitting curvilinear motions, we demonstrate the ability to model the assembly of a variety of protein-protein and protein-nucleic acid complexes.
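For readers unfamiliar with normal mode analysis, the sketch below builds an anisotropic network model (ANM), a common coarse-grained elastic network, and extracts its modes with NumPy. It is a generic illustration under stated assumptions, not the Flex-LZerD procedure: the cutoff and spring constant are arbitrary defaults, the function names are hypothetical, and displacing a structure along a mode as done here is a straight-line motion, whereas the abstract describes an iterative scheme that admits curvilinear motions.

```python
import numpy as np


def anm_hessian(coords, cutoff=15.0, gamma=1.0):
    """3N x 3N Hessian of an anisotropic network model.

    Every pair of nodes closer than `cutoff` is joined by a spring of
    stiffness `gamma` (both values are assumptions of this sketch).
    """
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    hessian = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            dist2 = float(d @ d)
            if dist2 > cutoff ** 2:
                continue
            block = -gamma * np.outer(d, d) / dist2
            si, sj = slice(3 * i, 3 * i + 3), slice(3 * j, 3 * j + 3)
            hessian[si, sj] = block
            hessian[sj, si] = block
            hessian[si, si] -= block  # diagonal blocks balance the off-diagonals
            hessian[sj, sj] -= block
    return hessian


def normal_modes(coords, cutoff=15.0, gamma=1.0):
    """Eigenvalues and eigenvectors of the ANM Hessian, in ascending order.

    For a connected, rigid network the first six eigenvalues are (numerically)
    zero and correspond to rigid-body translation and rotation.
    """
    return np.linalg.eigh(anm_hessian(coords, cutoff, gamma))


def displace_along_mode(coords, mode_vector, amplitude):
    """Move each node linearly along one mode (a straight-line approximation)."""
    return np.asarray(coords, dtype=float) + amplitude * mode_vector.reshape(-1, 3)
```

The lowest non-zero modes usually describe the largest collective motions, which is why mode-based sampling is a natural way to explore domain-scale flexibility during docking.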
Affiliation(s)
- Charles Christoffer
  - Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
- Daisuke Kihara
  - Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
  - Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
  - Purdue University Center for Cancer Research, Purdue University, West Lafayette, IN 47907, USA