1
Posani E, Janoš P, Haack D, Toor N, Bonomi M, Magistrato A, Bussi G. Ensemble refinement of mismodeled cryo-EM RNA structures using all-atom simulations. Nat Commun 2025; 16:4549. [PMID: 40379699 PMCID: PMC12084557 DOI: 10.1038/s41467-025-59769-0] [Received: 10/14/2024] [Accepted: 05/02/2025] [Indexed: 05/19/2025] Open
Abstract
The advent of single-particle cryogenic electron microscopy (cryo-EM) has enabled near-atomic resolution imaging of large macromolecules, enhancing functional insights. However, current cryo-EM refinement tools condense all single-particle images into a single structure, which can misrepresent highly flexible molecules like RNAs. Here, we combine molecular dynamics simulations with cryo-EM density maps to better account for the structural dynamics of a complex and biologically relevant RNA macromolecule. Namely, using metainference, a Bayesian method, we reconstruct an ensemble of structures of the group II intron ribozyme, which better matches experimental data, and we reveal inaccuracies of single-structure approaches in modeling flexible regions. An analysis of all RNA-containing structures deposited in the Protein Data Bank reveals that this issue affects most cryo-EM structures in the 2.5-4 Å range. Thus, RNA structures determined by cryo-EM require careful handling, and our method may be broadly applicable to other RNA systems.
Affiliation(s)
- Elisa Posani
  - Scuola Internazionale Superiore di Studi Avanzati (SISSA), Trieste, Italy
- Daniel Haack
  - Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA, USA
- Navtej Toor
  - Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA, USA
- Massimiliano Bonomi
  - Institut Pasteur, Université Paris Cité, CNRS UMR 3528, Computational Structural Biology Unit, Paris, France
- Giovanni Bussi
  - Scuola Internazionale Superiore di Studi Avanzati (SISSA), Trieste, Italy
2
Zhang Z, Xu L, Zhang S, Peng C, Zhang G, Zhou X. DEMO-EMol: modeling protein-nucleic acid complex structures from cryo-EM maps by coupling chain assembly with map segmentation. Nucleic Acids Res 2025:gkaf416. [PMID: 40366028 DOI: 10.1093/nar/gkaf416] [Received: 03/13/2025] [Revised: 04/29/2025] [Accepted: 05/03/2025] [Indexed: 05/15/2025] Open
Abstract
Atomic structure modeling is a crucial step in determining the structures of protein complexes using cryo-electron microscopy (cryo-EM). This work introduces DEMO-EMol, an improved server that integrates deep learning-based map segmentation and chain fitting to accurately assemble protein-nucleic acid (NA) complex structures from cryo-EM density maps. Starting from a density map and independently modeled chain structures, DEMO-EMol first segments protein and NA regions from the density map using deep learning. The overall complex is then assembled by fitting protein and NA chain models into their respective segmented maps, followed by domain-level fitting and optimization for protein chains. The output of DEMO-EMol includes the final assembled complex model along with overall and residue-level quality assessments. DEMO-EMol was evaluated on a comprehensive benchmark set of cryo-EM maps with resolutions ranging from 1.96 to 12.77 Å, and the results demonstrated its superior performance over the state-of-the-art methods for both protein-NA and protein-protein complex modeling. The DEMO-EMol web server is freely accessible at https://zhanggroup.org/DEMO-EMol/.
Affiliation(s)
- Ziying Zhang
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Liang Xu
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Shuai Zhang
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Chunxiang Peng
  - Department of Biological Chemistry, University of Michigan, Ann Arbor, MI 48109, United States
- Guijun Zhang
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Xiaogen Zhou
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
3
Giri N, Chen X, Wang L, Cheng J. A Labeled Dataset for AI-based Cryo-EM Map Enhancement. bioRxiv 2025:2025.03.16.643562. [PMID: 40166245 PMCID: PMC11957003 DOI: 10.1101/2025.03.16.643562] [Indexed: 04/02/2025]
Abstract
Cryo-electron microscopy (cryo-EM) has transformed structural biology by enabling near-atomic resolution imaging of macromolecular complexes. However, cryo-EM density maps suffer from intrinsic noise arising from structural sources, shot noise, and digital recording, which complicates accurate atomic structure building. While various methods for denoising cryo-EM density maps exist, there is a lack of standardized datasets for benchmarking artificial intelligence (AI) approaches. Here, we present an open-source dataset for cryo-EM density map denoising comprising 650 high-resolution (1-4 Å) experimental maps paired with three types of generated label maps: regression maps capturing idealized density distributions, binary classification maps distinguishing structural elements from background, and atom-type classification maps. Each map is standardized to 1 Å voxel size and validated through Fourier Shell Correlation analysis, demonstrating substantial resolution improvements in label maps compared to experimental maps. This resource bridges the gap between structural biology and artificial intelligence communities, enabling researchers to develop and benchmark innovative methods for enhancing cryo-EM density maps.
Affiliation(s)
- Nabin Giri
  - University of Missouri, Electrical Engineering and Computer Science, Columbia, 65211, USA
  - NextGen Precision Health Institute, Columbia, 65211, USA
- Xiao Chen
  - Computer Science Department, Hamilton College, Clinton, NY, 13323, USA
- Liguo Wang
  - Laboratory for Biological Structure, Brookhaven National Laboratory, Upton, NY, 11973, USA
- Jianlin Cheng
  - University of Missouri, Electrical Engineering and Computer Science, Columbia, 65211, USA
  - NextGen Precision Health Institute, Columbia, 65211, USA
4
Selvaraj J, Wang L, Cheng J. CryoTEN: efficiently enhancing cryo-EM density maps using transformers. Bioinformatics 2025; 41:btaf092. [PMID: 40036588 PMCID: PMC11906401 DOI: 10.1093/bioinformatics/btaf092] [Received: 09/06/2024] [Revised: 01/26/2025] [Accepted: 02/24/2025] [Indexed: 03/06/2025] Open
Abstract
Motivation: Cryogenic electron microscopy (cryo-EM) is a core experimental technique used to determine the structure of macromolecules such as proteins. However, the effectiveness of cryo-EM is often hindered by the noise and missing density values in cryo-EM density maps caused by experimental conditions such as low contrast and conformational heterogeneity. Although various global and local map-sharpening techniques are widely employed to improve cryo-EM density maps, it is still challenging to efficiently improve their quality for building better protein structures from them. Results: In this study, we introduce CryoTEN, a 3D UNETR++-style transformer that improves cryo-EM maps effectively. CryoTEN is trained using a diverse set of 1295 cryo-EM maps as inputs and their corresponding simulated maps generated from known protein structures as targets. An independent test set containing 150 maps is used to evaluate CryoTEN, and the results demonstrate that it can robustly enhance the quality of cryo-EM density maps. In addition, automatic de novo protein structure modeling shows that protein structures built from the density maps processed by CryoTEN have substantially better quality than those built from the original maps. Compared to the existing state-of-the-art deep learning methods for enhancing cryo-EM density maps, CryoTEN ranks second in improving the quality of density maps, while running >10 times faster and requiring much less GPU memory. Availability and implementation: The source code and data are freely available at https://github.com/jianlin-cheng/cryoten.
Affiliation(s)
- Joel Selvaraj
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, United States
  - NextGen Precision Health, University of Missouri, Columbia, MO 65211, United States
- Liguo Wang
  - Laboratory for BioMolecular Structure (LBMS), Brookhaven National Laboratory, Upton, NY 11973, United States
- Jianlin Cheng
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, United States
  - NextGen Precision Health, University of Missouri, Columbia, MO 65211, United States
5
Gaza J, Brini E, MacCallum JL, Dill KA, Perez A. MELD in Action: Harnessing Data to Accelerate Molecular Dynamics. J Chem Inf Model 2025; 65:1685-1693. [PMID: 39893583 DOI: 10.1021/acs.jcim.4c02108] [Indexed: 02/04/2025]
Abstract
We review MELD, an accelerator of Molecular Dynamics simulations of biomolecules. MELD (Modeling Employing Limited Data) integrates molecular dynamics (MD) with a variety of types of structural information through Bayesian inference, generating ensembles of protein and DNA structures having proper Boltzmann populations. MELD minimizes the computational sampling of irrelevant regions of phase space by applying energetic penalties to areas that conflict with the available data. MELD is effective in refining protein structures using NMR or cryo-EM data or predicting protein-ligand binding poses. As a plugin for OpenMM, MELD is interoperable with other enhanced sampling methods, offering a versatile tool for structural determination in computational chemistry and biophysics.
Affiliation(s)
- Jokent Gaza
  - Department of Chemistry, University of Florida, Gainesville, Florida 32611, United States
  - Quantum Theory Project, University of Florida, Gainesville, Florida 32611, United States
- Emiliano Brini
  - School of Chemistry and Materials Science, 85 Lomb Memorial Drive, Rochester, New York 14623, United States
- Justin L MacCallum
  - Department of Chemistry, University of Calgary, Calgary, Alberta T2N 1N4, Canada
- Ken A Dill
  - Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794, United States
  - Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, United States
  - Department of Physics and Astronomy, Stony Brook University, Stony Brook, New York 11794, United States
- Alberto Perez
  - Department of Chemistry, University of Florida, Gainesville, Florida 32611, United States
  - Quantum Theory Project, University of Florida, Gainesville, Florida 32611, United States
6
Li S, Terashi G, Zhang Z, Kihara D. Advancing structure modeling from cryo-EM maps with deep learning. Biochem Soc Trans 2025; 53:BST20240784. [PMID: 39927816 DOI: 10.1042/bst20240784] [Received: 11/12/2024] [Revised: 01/16/2025] [Accepted: 01/21/2025] [Indexed: 02/11/2025]
Abstract
Cryo-electron microscopy (cryo-EM) has revolutionized structural biology by enabling the determination of biomolecular structures that are challenging to resolve using conventional methods. Interpreting a cryo-EM map requires accurate modeling of the structures of underlying biomolecules. Here, we concisely discuss the evolution and current state of automatic structure modeling from cryo-EM density maps. We classify modeling methods into two categories: de novo modeling methods from high-resolution maps (better than 5 Å) and methods that model by fitting individual structures of component proteins to maps at lower resolution (worse than 5 Å). Special attention is given to the role of deep learning in the modeling process, highlighting how AI-driven approaches are transformative in cryo-EM structure modeling. We conclude by discussing future directions in the field.
Affiliation(s)
- Shu Li
  - Department of Computer Science, Purdue University, West Lafayette, IN, U.S.A.
- Genki Terashi
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, U.S.A.
- Zicong Zhang
  - Department of Computer Science, Purdue University, West Lafayette, IN, U.S.A.
- Daisuke Kihara
  - Department of Computer Science, Purdue University, West Lafayette, IN, U.S.A.
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, U.S.A.
7
Farheen F, Terashi G, Zhu H, Kihara D. AI-based methods for biomolecular structure modeling for Cryo-EM. Curr Opin Struct Biol 2025; 90:102989. [PMID: 39864242 PMCID: PMC11793015 DOI: 10.1016/j.sbi.2025.102989] [Received: 08/26/2024] [Revised: 12/29/2024] [Accepted: 01/04/2025] [Indexed: 01/28/2025]
Abstract
Cryo-electron microscopy (Cryo-EM) has revolutionized structural biology by enabling the determination of macromolecular structures that were challenging to study with conventional methods. Processing cryo-EM data involves several computational steps to derive three-dimensional structures from raw projections. Recent advancements in artificial intelligence (AI) including deep learning have significantly improved the performance of these processes. In this review, we discuss state-of-the-art AI-based techniques used in key steps of cryo-EM data processing, including macromolecular structure modeling and heterogeneity analysis.
Affiliation(s)
- Farhanaz Farheen
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Genki Terashi
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Han Zhu
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Daisuke Kihara
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
8
Punuru P, Jain A, Kihara D. Secondary Structure Detection and Structure Modeling for Cryo-EM. Methods Mol Biol 2025; 2870:341-355. [PMID: 39543043 DOI: 10.1007/978-1-0716-4213-9_17] [Indexed: 11/17/2024]
Abstract
Rapid advancements in cryogenic electron microscopy (cryo-EM) have revolutionized the field of structural biology by enabling the determination of complex macromolecular structures at unprecedented resolutions. When cryo-EM density maps have a resolution around 3 Å, the atomic structure can be modeled manually. However, as the resolution decreases, analyzing these density maps becomes increasingly challenging. For modeling structures in lower resolution maps, deep learning can be used to identify structural features in the maps to assist in structure modeling. Here, we present a suite of deep learning-based tools developed by our lab that enable structural biologists to work with cryo-EM maps of a wide range of resolutions. For cryo-EM maps at near-atomic resolution (5 Å or better), DeepMainmast automatically models all-atom structures by tracing the main chain from local map features of amino acids and atoms detected by deep learning; DAQ score quantifies map-model fit and indicates potential misassignments in protein models. In intermediate resolution maps (5-10 Å), Emap2sec and Emap2sec+ can accurately detect protein secondary structures and nucleic acids. These tools and more are available at our web server: https://em.kiharalab.org/.
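The resolution ranges quoted in this abstract can be written down as a small lookup. The sketch below is illustrative only: the function name `suggest_tool` is hypothetical, it is not part of any of the named tools or of the web server, and treating exactly 5 Å as "near-atomic" is an assumption about how to read the overlapping boundary.

```python
def suggest_tool(resolution_angstrom):
    """Map a map resolution (in Å) to the tool family named in the abstract.

    Hypothetical helper: thresholds are taken from the abstract text,
    boundary handling at exactly 5 Å is an assumption.
    """
    if resolution_angstrom <= 0:
        raise ValueError("resolution must be a positive number of angstroms")
    if resolution_angstrom <= 5.0:
        # Near-atomic maps (5 Å or better): all-atom modeling.
        return "DeepMainmast"
    if resolution_angstrom <= 10.0:
        # Intermediate maps (5-10 Å): secondary structure / nucleic acid detection.
        return "Emap2sec / Emap2sec+"
    # The abstract does not name a tool for maps worse than 10 Å.
    return None


for res in (3.2, 7.5, 14.0):
    print(res, suggest_tool(res))
```

Returning `None` for maps worse than 10 Å keeps the sketch from implying coverage the abstract does not claim.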
Affiliation(s)
- Pranav Punuru
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Anika Jain
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Daisuke Kihara
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
9
Li T, He J, Cao H, Zhang Y, Chen J, Xiao Y, Huang SY. All-atom RNA structure determination from cryo-EM maps. Nat Biotechnol 2025; 43:97-105. [PMID: 38396075 DOI: 10.1038/s41587-024-02149-8] [Received: 06/10/2023] [Accepted: 01/24/2024] [Indexed: 02/25/2024]
Abstract
Many methods exist for determining protein structures from cryogenic electron microscopy maps, but this remains challenging for RNA structures. Here we developed EMRNA, a method for accurate, automated determination of full-length all-atom RNA structures from cryogenic electron microscopy maps. EMRNA integrates deep learning-based detection of nucleotides, three-dimensional backbone tracing and scoring with consideration of sequence and secondary structure information, and full-atom construction of the RNA structure. We validated EMRNA on 140 diverse RNA maps ranging from 37 to 423 nt at 2.0-6.0 Å resolutions, and compared EMRNA with auto-DRRAFTER, phenix.map_to_model and CryoREAD on a set of 71 cases. EMRNA achieves a median accuracy of 2.36 Å root mean square deviation and 0.86 TM-score for full-length RNA structures, compared with 6.66 Å and 0.58 for auto-DRRAFTER. EMRNA also obtains a high residue coverage and sequence match of 93.30% and 95.30% in the built models, compared with 58.20% and 42.20% for phenix.map_to_model and 56.45% and 52.3% for CryoREAD. EMRNA is fast and can build an RNA structure of 100 nt within 3 min.
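For readers unfamiliar with the accuracy figure quoted above, root mean square deviation (RMSD) can be sketched in a few lines. This is a minimal illustration, not EMRNA's evaluation code: it assumes the two structures are already superimposed and that atoms are paired one-to-one, and the function name and toy coordinates are made up for the example.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) points, in the input units.

    Assumes the structures are already superimposed; no alignment is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy data: every atom in the model is displaced by 1 Å along z.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]
print(rmsd(model, reference))  # 1.0
```

In practice an optimal superposition step usually precedes this calculation; it is omitted here to keep the sketch dependency-free.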
Affiliation(s)
- Tao Li
  - School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, China
- Jiahua He
  - School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, China
- Hong Cao
  - School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, China
- Yi Zhang
  - School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, China
- Ji Chen
  - School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, China
- Yi Xiao
  - School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, China
- Sheng-You Huang
  - School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, China
10
Wang X, Zhu H, Terashi G, Taluja M, Kihara D. DiffModeler: large macromolecular structure modeling for cryo-EM maps using a diffusion model. Nat Methods 2024; 21:2307-2317. [PMID: 39433880 DOI: 10.1038/s41592-024-02479-0] [Received: 03/25/2024] [Accepted: 09/19/2024] [Indexed: 10/23/2024]
Abstract
Cryogenic electron microscopy (cryo-EM) has now been widely used for determining multichain protein complexes. However, modeling a large complex structure, such as those with more than ten chains, is challenging, particularly when the map resolution decreases. Here we present DiffModeler, a fully automated method for modeling large protein complex structures. DiffModeler employs a diffusion model for backbone tracing and integrates AlphaFold2-predicted single-chain structures for structure fitting. DiffModeler showed an average template modeling score of 0.88 and 0.91 for two datasets of cryo-EM maps of 0-5 Å resolution and 0.92 for intermediate resolution maps (5-10 Å), substantially outperforming existing methodologies. Further benchmarking at low resolutions (10-20 Å) confirms its versatility, demonstrating plausible performance.
Affiliation(s)
- Xiao Wang
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Han Zhu
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Genki Terashi
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Manav Taluja
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
  - School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, India
- Daisuke Kihara
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
11
Li T, Cao H, He J, Huang SY. Automated detection and de novo structure modeling of nucleic acids from cryo-EM maps. Nat Commun 2024; 15:9367. [PMID: 39477926 PMCID: PMC11525807 DOI: 10.1038/s41467-024-53721-4] [Received: 05/09/2024] [Accepted: 10/18/2024] [Indexed: 11/02/2024] Open
Abstract
Cryo-electron microscopy (cryo-EM) is one of the most powerful experimental methods for macromolecular structure determination. However, accurate DNA/RNA structure modeling from cryo-EM maps is still challenging especially for protein-DNA/RNA or multi-chain DNA/RNA complexes. Here we propose a deep learning-based method for accurate de novo structure determination of DNA/RNA from cryo-EM maps at <5 Å resolutions, which is referred to as EM2NA. EM2NA is extensively evaluated on a diverse test set of 50 experimental maps at 2.0-5.0 Å resolutions, and compared with state-of-the-art methods including CryoREAD, ModelAngelo, and phenix.map_to_model. On average, EM2NA achieves a residue coverage of 83.15%, C4' RMSD of 1.06 Å, and sequence recall of 46.86%, which outperforms the existing methods. Moreover, EM2NA is applied to build the DNA/RNA structures with 10 to 5347 nt from an EMDB-wide data set of 263 unmodeled raw maps, demonstrating its ability in the blind model building of DNA/RNA from cryo-EM maps. EM2NA is fast and can normally build a DNA/RNA structure of <500 nt within 10 minutes.
Affiliation(s)
- Tao Li
  - School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, China
- Hong Cao
  - School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, China
- Jiahua He
  - School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, China
- Sheng-You Huang
  - School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, China
12
Chen S, Zhang S, Fang X, Lin L, Zhao H, Yang Y. Protein complex structure modeling by cross-modal alignment between cryo-EM maps and protein sequences. Nat Commun 2024; 15:8808. [PMID: 39394203 PMCID: PMC11470027 DOI: 10.1038/s41467-024-53116-5] [Received: 03/14/2024] [Accepted: 10/02/2024] [Indexed: 10/13/2024] Open
Abstract
Cryo-electron microscopy (cryo-EM) is widely used for protein structure determination. Current automatic cryo-EM protein complex modeling methods mostly rely on prior chain separation. However, chain separation without sequence guidance often suffers from errors caused by cross-chain interaction or noise densities, which would accumulate and mislead the subsequent steps. Here, we present EModelX, a fully automated cryo-EM protein complex structure modeling method, which achieves sequence-guided modeling through cross-modal alignments between cryo-EM maps and protein sequences. EModelX first employs multi-task deep learning to predict Cα atoms, backbone atoms, and amino acid types from cryo-EM maps, which are subsequently used to sample Cα traces with amino acid profiles. The profiles are then aligned with protein sequences to obtain initial structural models, which yielded an average RMSD of 1.17 Å in our test set, approaching atomic-level precision in recovering PDB-deposited structures. After filling unmodeled gaps through sequence-guided Cα threading, the final models achieved an average TM-score of 0.808, outperforming the state-of-the-art method. The further combination with AlphaFold can improve the average TM-score to 0.911. Analyses conducted by comparing some EModelX-built models and PDB structures highlight its potential to improve PDB structures. EModelX is accessible at https://bio-web1.nscc-gz.cn/app/EModelX.
Affiliation(s)
- Sheng Chen
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
- Sen Zhang
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
- Xiaoyu Fang
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
- Liang Lin
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
- Huiying Zhao
  - Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, China
- Yuedong Yang
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
13
Selvaraj J, Wang L, Cheng J. CryoTEN: Efficiently Enhancing Cryo-EM Density Maps Using Transformers. bioRxiv 2024:2024.09.06.611715. [PMID: 39314387 PMCID: PMC11418965 DOI: 10.1101/2024.09.06.611715] [Indexed: 09/25/2024]
Abstract
Motivation: Cryogenic Electron Microscopy (cryo-EM) is a core experimental technique used to determine the structure of macromolecules such as proteins. However, the effectiveness of cryo-EM is often hindered by the noise and missing density values in cryo-EM density maps caused by experimental conditions such as low contrast and conformational heterogeneity. Although various global and local map sharpening techniques are widely employed to improve cryo-EM density maps, it is still challenging to efficiently improve their quality for building better protein structures from them. Results: In this study, we introduce CryoTEN, a three-dimensional U-Net-style transformer that improves cryo-EM maps effectively. CryoTEN is trained using a diverse set of 1,295 cryo-EM maps as inputs and their corresponding simulated maps generated from known protein structures as targets. An independent test set containing 150 maps is used to evaluate CryoTEN, and the results demonstrate that it can robustly enhance the quality of cryo-EM density maps. In addition, automatic de novo protein structure modeling shows that the protein structures built from the density maps processed by CryoTEN have substantially better quality than those built from the original maps. Compared to the existing state-of-the-art deep learning methods for enhancing cryo-EM density maps, CryoTEN ranks second in improving the quality of density maps, while running >10 times faster and requiring much less GPU memory. Availability and implementation: The source code and data are freely available at https://github.com/jianlin-cheng/cryoten.
Affiliation(s)
- Joel Selvaraj
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, 65211, MO, United States
  - NextGen Precision Health, University of Missouri, Columbia, 65211, MO, United States
- Liguo Wang
  - Laboratory for BioMolecular Structure (LBMS), Brookhaven National Laboratory, Upton, 11973, NY, United States
- Jianlin Cheng
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, 65211, MO, United States
  - NextGen Precision Health, University of Missouri, Columbia, 65211, MO, United States
14
Song X, Bao L, Feng C, Huang Q, Zhang F, Gao X, Han R. Accurate Prediction of Protein Structural Flexibility by Deep Learning Integrating Intricate Atomic Structures and Cryo-EM Density Information. Nat Commun 2024; 15:5538. [PMID: 38956032 PMCID: PMC11219796 DOI: 10.1038/s41467-024-49858-x] [Received: 11/12/2023] [Accepted: 06/20/2024] [Indexed: 07/04/2024] Open
Abstract
The dynamics of proteins are crucial for understanding their mechanisms. However, computationally predicting protein dynamic information has proven challenging. Here, we propose a neural network model, RMSF-net, which outperforms previous methods and produces the best results in a large-scale protein dynamics dataset; this model can accurately infer the dynamic information of a protein in only a few seconds. By learning effectively from experimental protein structure data and cryo-electron microscopy (cryo-EM) data integration, our approach is able to accurately identify the interactive bidirectional constraints and supervision between cryo-EM maps and PDB models in maximizing the dynamic prediction efficacy. Rigorous 5-fold cross-validation on the dataset demonstrates that RMSF-net achieves test correlation coefficients of 0.746 ± 0.127 at the voxel level and 0.765 ± 0.109 at the residue level, showcasing its ability to deliver dynamic predictions closely approximating molecular dynamics simulations. Additionally, it offers real-time dynamic inference with minimal storage overhead on the order of megabytes. RMSF-net is a freely accessible tool and is anticipated to play an essential role in the study of protein dynamics.
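The correlation coefficients reported above compare predicted fluctuations with reference values. As a minimal sketch, assuming the reported figure is a Pearson correlation (the abstract does not say which coefficient is used), the calculation looks like this. The function name and the toy per-residue values are hypothetical and are not taken from RMSF-net.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Toy per-residue fluctuations (Å): prediction is exactly half the reference,
# so the two are perfectly linearly related.
predicted_rmsf = [0.5, 1.0, 1.5, 2.0]
reference_rmsf = [1.0, 2.0, 3.0, 4.0]
print(pearson(predicted_rmsf, reference_rmsf))  # 1.0
```

Note that a correlation of 1.0 here does not mean the magnitudes agree, only that the per-residue trend does.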
Collapse
Affiliation(s)
- Xintao Song
- Research Center for Mathematics and Interdisciplinary Sciences (Ministry of Education Frontiers Science Center for Nonlinear Expectations), Shandong University, Qingdao, China
- BioMap Research, Menlo Park, CA, USA
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal, Saudi Arabia
| | - Lei Bao
- School of Public Health, Hubei University of Medicine, Shiyan, China
| | - Chenjie Feng
- College of Medical Information and Engineering, Ningxia Medical University, Yinchuan, China
| | - Qiang Huang
- Research Center for Mathematics and Interdisciplinary Sciences (Ministry of Education Frontiers Science Center for Nonlinear Expectations), Shandong University, Qingdao, China
| | - Fa Zhang
- School of Medical Technology, Beijing Institute of Technology, Beijing, China
| | - Xin Gao
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal, Saudi Arabia
| | - Renmin Han
- Research Center for Mathematics and Interdisciplinary Sciences (Ministry of Education Frontiers Science Center for Nonlinear Expectations), Shandong University, Qingdao, China
- BioMap Research, Menlo Park, CA, USA
| |
|
15
|
Lawson CL, Kryshtafovych A, Pintilie GD, Burley SK, Černý J, Chen VB, Emsley P, Gobbi A, Joachimiak A, Noreng S, Prisant MG, Read RJ, Richardson JS, Rohou AL, Schneider B, Sellers BD, Shao C, Sourial E, Williams CI, Williams CJ, Yang Y, Abbaraju V, Afonine PV, Baker ML, Bond PS, Blundell TL, Burnley T, Campbell A, Cao R, Cheng J, Chojnowski G, Cowtan KD, DiMaio F, Esmaeeli R, Giri N, Grubmüller H, Hoh SW, Hou J, Hryc CF, Hunte C, Igaev M, Joseph AP, Kao WC, Kihara D, Kumar D, Lang L, Lin S, Maddhuri Venkata Subramaniya SR, Mittal S, Mondal A, Moriarty NW, Muenks A, Murshudov GN, Nicholls RA, Olek M, Palmer CM, Perez A, Pohjolainen E, Pothula KR, Rowley CN, Sarkar D, Schäfer LU, Schlicksup CJ, Schröder GF, Shekhar M, Si D, Singharoy A, Sobolev OV, Terashi G, Vaiana AC, Vedithi SC, Verburgt J, Wang X, Warshamanage R, Winn MD, Weyand S, Yamashita K, Zhao M, Schmid MF, Berman HM, Chiu W. Outcomes of the EMDataResource cryo-EM Ligand Modeling Challenge. Nat Methods 2024; 21:1340-1348. [PMID: 38918604 PMCID: PMC11526832 DOI: 10.1038/s41592-024-02321-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2024] [Accepted: 05/24/2024] [Indexed: 06/27/2024]
Abstract
The EMDataResource Ligand Model Challenge aimed to assess the reliability and reproducibility of modeling ligands bound to protein and protein-nucleic acid complexes in cryogenic electron microscopy (cryo-EM) maps determined at near-atomic (1.9-2.5 Å) resolution. Three published maps were selected as targets: Escherichia coli beta-galactosidase with inhibitor, SARS-CoV-2 virus RNA-dependent RNA polymerase with covalently bound nucleotide analog and SARS-CoV-2 virus ion channel ORF3a with bound lipid. Sixty-one models were submitted from 17 independent research groups, each with supporting workflow details. The quality of submitted ligand models and surrounding atoms were analyzed by visual inspection and quantification of local map quality, model-to-map fit, geometry, energetics and contact scores. A composite rather than a single score was needed to assess macromolecule+ligand model quality. These observations lead us to recommend best practices for assessing cryo-EM structures of liganded macromolecules reported at near-atomic resolution.
Affiliation(s)
- Catherine L Lawson
- RCSB Protein Data Bank and Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
| | | | - Grigore D Pintilie
- Departments of Bioengineering and of Microbiology and Immunology, Stanford University, Stanford, CA, USA
| | - Stephen K Burley
- RCSB Protein Data Bank and Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
- Rutgers Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ, USA
- RCSB Protein Data Bank and San Diego Supercomputer Center, University of California San Diego, La Jolla, CA, USA
| | - Jiří Černý
- Institute of Biotechnology, Czech Academy of Sciences, Vestec, Czech Republic
| | - Vincent B Chen
- Department of Biochemistry, Duke University, Durham, NC, USA
| | - Paul Emsley
- MRC Laboratory of Molecular Biology, Cambridge, UK
| | - Alberto Gobbi
- Discovery Chemistry, Genentech Inc., San Francisco, CA, USA
- , Berlin, Germany
| | - Andrzej Joachimiak
- Structural Biology Center, X-ray Science Division, Argonne National Laboratory, Argonne, IL, USA
- Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, IL, USA
| | - Sigrid Noreng
- Structural Biology, Genentech Inc., South San Francisco, CA, USA
- Protein Science, Septerna, South San Francisco, CA, USA
| | | | - Randy J Read
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, UK
| | | | - Alexis L Rohou
- Structural Biology, Genentech Inc., South San Francisco, CA, USA
| | - Bohdan Schneider
- Institute of Biotechnology, Czech Academy of Sciences, Vestec, Czech Republic
| | - Benjamin D Sellers
- Discovery Chemistry, Genentech Inc., San Francisco, CA, USA
- Computational Chemistry, Vilya, South San Francisco, CA, USA
| | - Chenghua Shao
- RCSB Protein Data Bank and Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
| | | | | | | | - Ying Yang
- Structural Biology, Genentech Inc., South San Francisco, CA, USA
| | - Venkat Abbaraju
- RCSB Protein Data Bank and Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
| | - Pavel V Afonine
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Matthew L Baker
- Department of Biochemistry and Molecular Biology, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Paul S Bond
- York Structural Biology Laboratory, Department of Chemistry, University of York, York, UK
| | - Tom L Blundell
- Department of Biochemistry, University of Cambridge, Cambridge, UK
| | - Tom Burnley
- Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
| | - Arthur Campbell
- Center for Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Renzhi Cao
- Department of Computer Science, Pacific Lutheran University, Tacoma, WA, USA
| | - Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
| | | | - K D Cowtan
- York Structural Biology Laboratory, Department of Chemistry, University of York, York, UK
| | - Frank DiMaio
- Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Reza Esmaeeli
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL, USA
| | - Nabin Giri
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
| | - Helmut Grubmüller
- Theoretical and Computational Biophysics Department, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
| | - Soon Wen Hoh
- York Structural Biology Laboratory, Department of Chemistry, University of York, York, UK
| | - Jie Hou
- Department of Computer Science, Saint Louis University, St. Louis, MO, USA
| | - Corey F Hryc
- Department of Biochemistry and Molecular Biology, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Carola Hunte
- Institute of Biochemistry and Molecular Biology, ZBMZ, Faculty of Medicine and CIBSS-Centre for Integrative Biological Signalling Studies, University of Freiburg, Freiburg, Germany
| | - Maxim Igaev
- Theoretical and Computational Biophysics Department, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
| | - Agnel P Joseph
- Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
| | - Wei-Chun Kao
- Institute of Biochemistry and Molecular Biology, ZBMZ, Faculty of Medicine and CIBSS-Centre for Integrative Biological Signalling Studies, University of Freiburg, Freiburg, Germany
| | - Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Department of Computer Science, Purdue University, West Lafayette, IN, USA
| | - Dilip Kumar
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX, USA
- Trivedi School of Biosciences, Ashoka University, Sonipat, India
| | - Lijun Lang
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL, USA
- The Chinese University of Hong Kong, Hong Kong, China
| | - Sean Lin
- Division of Computing & Software Systems, University of Washington, Bothell, WA, USA
| | | | - Sumit Mittal
- Biodesign Institute, Arizona State University, Tempe, AZ, USA
- School of Advanced Sciences and Languages, VIT Bhopal University, Bhopal, India
| | - Arup Mondal
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL, USA
- National Renewable Energy Laboratory (NREL), Golden, CO, USA
| | - Nigel W Moriarty
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Andrew Muenks
- Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, WA, USA
| | | | - Robert A Nicholls
- MRC Laboratory of Molecular Biology, Cambridge, UK
- Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
| | - Mateusz Olek
- York Structural Biology Laboratory, Department of Chemistry, University of York, York, UK
- Electron Bio-Imaging Centre, Diamond Light Source, Harwell Science and Innovation Campus, Didcot, UK
| | - Colin M Palmer
- Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
| | - Alberto Perez
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL, USA
| | - Emmi Pohjolainen
- Theoretical and Computational Biophysics Department, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
| | - Karunakar R Pothula
- Institute of Biological Information Processing (IBI-7, Structural Biochemistry) and Jülich Centre for Structural Biology (JuStruct), Forschungszentrum Jülich, Jülich, Germany
| | | | - Daipayan Sarkar
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Biodesign Institute, Arizona State University, Tempe, AZ, USA
- MSU-DOE Plant Research Laboratory, East Lansing, MI, USA
- School of Molecular Sciences, Arizona State University, Tempe, AZ, USA
| | - Luisa U Schäfer
- Institute of Biological Information Processing (IBI-7, Structural Biochemistry) and Jülich Centre for Structural Biology (JuStruct), Forschungszentrum Jülich, Jülich, Germany
| | - Christopher J Schlicksup
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Gunnar F Schröder
- Institute of Biological Information Processing (IBI-7, Structural Biochemistry) and Jülich Centre for Structural Biology (JuStruct), Forschungszentrum Jülich, Jülich, Germany
- Physics Department, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
| | - Mrinal Shekhar
- Center for Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Biodesign Institute, Arizona State University, Tempe, AZ, USA
| | - Dong Si
- Division of Computing & Software Systems, University of Washington, Bothell, WA, USA
| | | | - Oleg V Sobolev
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Genki Terashi
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
| | - Andrea C Vaiana
- Theoretical and Computational Biophysics Department, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
- Nature's Toolbox (NTx), Rio Rancho, NM, USA
| | | | - Jacob Verburgt
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
| | - Xiao Wang
- Department of Computer Science, Purdue University, West Lafayette, IN, USA
| | | | - Martyn D Winn
- Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
| | - Simone Weyand
- Department of Biochemistry, University of Cambridge, Cambridge, UK
| | | | - Minglei Zhao
- Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, IL, USA
| | - Michael F Schmid
- Division of Cryo-EM and Bioimaging, SSRL, SLAC National Accelerator Laboratory, Menlo Park, CA, USA
| | - Helen M Berman
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
| | - Wah Chiu
- Departments of Bioengineering and of Microbiology and Immunology, Stanford University, Stanford, CA, USA
- Division of Cryo-EM and Bioimaging, SSRL, SLAC National Accelerator Laboratory, Menlo Park, CA, USA
| |
|
16
|
Jamali K, Käll L, Zhang R, Brown A, Kimanius D, Scheres SHW. Automated model building and protein identification in cryo-EM maps. Nature 2024; 628:450-457. [PMID: 38408488 PMCID: PMC11006616 DOI: 10.1038/s41586-024-07215-4] [Citation(s) in RCA: 135] [Impact Index Per Article: 135.0] [Received: 05/17/2023] [Accepted: 02/19/2024] [Indexed: 02/28/2024]
Abstract
Interpreting electron cryo-microscopy (cryo-EM) maps with atomic models requires high levels of expertise and labour-intensive manual intervention in three-dimensional computer graphics programs [1,2]. Here we present ModelAngelo, a machine-learning approach for automated atomic model building in cryo-EM maps. By combining information from the cryo-EM map with information from protein sequence and structure in a single graph neural network, ModelAngelo builds atomic models for proteins that are of similar quality to those generated by human experts. For nucleotides, ModelAngelo builds backbones with similar accuracy to those built by humans. By using its predicted amino acid probabilities for each residue in hidden Markov model sequence searches, ModelAngelo outperforms human experts in the identification of proteins with unknown sequences. ModelAngelo will therefore remove bottlenecks and increase objectivity in cryo-EM structure determination.
Affiliation(s)
| | - Lukas Käll
- Science for Life Laboratory, KTH Royal Institute of Technology, Stockholm, Sweden
| | - Rui Zhang
- Washington University in St Louis, St Louis, MO, USA
| | - Alan Brown
- Blavatnik Institute, Harvard Medical School, Boston, MA, USA
| | | | | |
|
17
|
Chen J, Zia A, Luo A, Meng H, Wang F, Hou J, Cao R, Si D. Enhancing cryo-EM structure prediction with DeepTracer and AlphaFold2 integration. Brief Bioinform 2024; 25:bbae118. [PMID: 38609330 PMCID: PMC11014792 DOI: 10.1093/bib/bbae118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Revised: 01/23/2024] [Accepted: 03/02/2024] [Indexed: 04/14/2024] Open
Abstract
Understanding protein structures is invaluable in various biomedical applications, such as vaccine development. Protein structure model building from experimental electron density maps is a time-consuming and labor-intensive task. To address this challenge, machine learning approaches have been proposed to automate the process. Currently, the majority of the experimental maps in the database lack atomic resolution features, making it challenging for machine learning-based methods to precisely determine protein structures from cryogenic electron microscopy density maps. On the other hand, protein structure prediction methods, such as AlphaFold2, leverage evolutionary information from protein sequences and have recently achieved groundbreaking accuracy. However, these methods often require manual refinement, which is labor-intensive and time-consuming. In this study, we present DeepTracer-Refine, an automated method that refines AlphaFold-predicted structures by aligning them to DeepTracer's modeled structure. Our method was evaluated on 39 multi-domain proteins, and we improved the average residue coverage from 78.2% to 90.0% and the average local Distance Difference Test score from 0.67 to 0.71. We also compared DeepTracer-Refine with Phenix's AlphaFold refinement and demonstrated that our method not only performs better when the initial AlphaFold model is less precise but also surpasses Phenix in run-time performance.
Affiliation(s)
- Jason Chen
- Division of Computing and Software Systems, University of Washington Bothell, Bothell, WA 98011, USA
| | - Ayisha Zia
- Department of Biochemistry and Molecular Genetics, University of Alabama at Birmingham, Birmingham, AL 35233, USA
| | - Albert Luo
- Division of Computing and Software Systems, University of Washington Bothell, Bothell, WA 98011, USA
| | - Hanze Meng
- Department of Computer Science, Duke University, Durham, NC 27708, USA
| | - Fengbin Wang
- Department of Biochemistry and Molecular Genetics, University of Alabama at Birmingham, Birmingham, AL 35233, USA
| | - Jie Hou
- Department of Computer Science, Saint Louis University, Saint Louis, MO 63103, USA
| | - Renzhi Cao
- Department of Computer Science, Pacific Lutheran University, Tacoma, WA 98447, USA
| | - Dong Si
- Division of Computing and Software Systems, University of Washington Bothell, Bothell, WA 98011, USA
| |
|
18
|
Wang X, Zhu H, Terashi G, Taluja M, Kihara D. DiffModeler: Large Macromolecular Structure Modeling in Low-Resolution Cryo-EM Maps Using Diffusion Model. bioRxiv 2024:2024.01.20.576370. [PMID: 38328203 PMCID: PMC10849514 DOI: 10.1101/2024.01.20.576370] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/09/2024]
Abstract
Cryogenic electron microscopy (cryo-EM) is now widely used for determining multi-chain protein complexes. However, modeling a complex structure is challenging, particularly when the map resolution is low, typically in the intermediate resolution range of 5 to 10 Å. Within this resolution range, even accurate structure fitting is difficult, let alone de novo modeling. To address this challenge, here we present DiffModeler, a fully automated method for modeling protein complex structures. DiffModeler employs a diffusion model for backbone tracing and integrates AlphaFold2-predicted single-chain structures for structure fitting. Extensive testing on cryo-EM maps at intermediate resolutions demonstrates the exceptional accuracy of DiffModeler in structure modeling, achieving an average TM-Score of 0.92, surpassing existing methodologies significantly. Notably, DiffModeler successfully modeled a protein complex composed of 47 chains and 13,462 residues, achieving a high TM-Score of 0.94. Further benchmarking at low resolutions (10-20 Å) confirms its versatility, demonstrating plausible performance. Moreover, when coupled with CryoREAD, DiffModeler excels in constructing protein-DNA/RNA complex structures for near-atomic resolution maps (0-5 Å), showcasing state-of-the-art performance with average TM-Scores of 0.88 and 0.91 across two datasets.
Affiliation(s)
- Xiao Wang
- Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
| | - Han Zhu
- Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
| | - Genki Terashi
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907, USA
| | - Manav Taluja
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907, USA
- School of Computer Science and Engineering, Vellore Institute of Technology, Tamil Nadu 642014, India
| | - Daisuke Kihara
- Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907, USA
| |
|
19
|
He B, Zhang F, Feng C, Yang J, Gao X, Han R. Accurate global and local 3D alignment of cryo-EM density maps using local spatial structural features. Nat Commun 2024; 15:1593. [PMID: 38383438 PMCID: PMC10881975 DOI: 10.1038/s41467-024-45861-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Accepted: 02/05/2024] [Indexed: 02/23/2024] Open
Abstract
Advances in cryo-electron microscopy (cryo-EM) imaging technologies have led to a rapidly increasing number of cryo-EM density maps. Alignment and comparison of density maps play a crucial role in interpreting structural information, such as conformational heterogeneity analysis using global alignment and atomic model assembly through local alignment. Here, we present a fast and accurate global and local cryo-EM density map alignment method called CryoAlign, that leverages local density feature descriptors to capture spatial structure similarities. CryoAlign is a feature-based cryo-EM map alignment tool, in which the employment of feature-based architecture enables the rapid establishment of point pair correspondences and robust estimation of alignment parameters. Extensive experimental evaluations demonstrate the superiority of CryoAlign over the existing methods in terms of both alignment accuracy and speed.
Affiliation(s)
- Bintao He
- Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, 266237, China
| | - Fa Zhang
- School of Medical Technology, Beijing Institute of Technology, Beijing, 100081, China
| | - Chenjie Feng
- College of Medical Information and Engineering, Ningxia Medical University, Yinchuan, 750004, China
| | - Jianyi Yang
- Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, 266237, China
| | - Xin Gao
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal, 23955, Saudi Arabia
| | - Renmin Han
- Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, 266237, China
| |
|
20
|
Lawson CL, Kryshtafovych A, Pintilie GD, Burley SK, Černý J, Chen VB, Emsley P, Gobbi A, Joachimiak A, Noreng S, Prisant M, Read RJ, Richardson JS, Rohou AL, Schneider B, Sellers BD, Shao C, Sourial E, Williams CI, Williams CJ, Yang Y, Abbaraju V, Afonine PV, Baker ML, Bond PS, Blundell TL, Burnley T, Campbell A, Cao R, Cheng J, Chojnowski G, Cowtan KD, DiMaio F, Esmaeeli R, Giri N, Grubmüller H, Hoh SW, Hou J, Hryc CF, Hunte C, Igaev M, Joseph AP, Kao WC, Kihara D, Kumar D, Lang L, Lin S, Maddhuri Venkata Subramaniya SR, Mittal S, Mondal A, Moriarty NW, Muenks A, Murshudov GN, Nicholls RA, Olek M, Palmer CM, Perez A, Pohjolainen E, Pothula KR, Rowley CN, Sarkar D, Schäfer LU, Schlicksup CJ, Schröder GF, Shekhar M, Si D, Singharoy A, Sobolev OV, Terashi G, Vaiana AC, Vedithi SC, Verburgt J, Wang X, Warshamanage R, Winn MD, Weyand S, Yamashita K, Zhao M, Schmid MF, Berman HM, Chiu W. Outcomes of the EMDataResource Cryo-EM Ligand Modeling Challenge. Research Square 2024:rs.3.rs-3864137. [PMID: 38343795 PMCID: PMC10854310 DOI: 10.21203/rs.3.rs-3864137/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/18/2024]
Abstract
The EMDataResource Ligand Model Challenge aimed to assess the reliability and reproducibility of modeling ligands bound to protein and protein/nucleic-acid complexes in cryogenic electron microscopy (cryo-EM) maps determined at near-atomic (1.9-2.5 Å) resolution. Three published maps were selected as targets: E. coli beta-galactosidase with inhibitor, SARS-CoV-2 RNA-dependent RNA polymerase with covalently bound nucleotide analog, and SARS-CoV-2 ion channel ORF3a with bound lipid. Sixty-one models were submitted from 17 independent research groups, each with supporting workflow details. We found that (1) the quality of submitted ligand models and surrounding atoms varied, as judged by visual inspection and quantification of local map quality, model-to-map fit, geometry, energetics, and contact scores, and (2) a composite rather than a single score was needed to assess macromolecule+ligand model quality. These observations lead us to recommend best practices for assessing cryo-EM structures of liganded macromolecules reported at near-atomic resolution.
Affiliation(s)
- Catherine L. Lawson
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
| | | | - Grigore D. Pintilie
- Departments of Bioengineering and of Microbiology and Immunology, Stanford University, Stanford, CA, USA
| | - Stephen K. Burley
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
- Rutgers Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ USA
- San Diego Supercomputer Center, University of California San Diego, La Jolla, CA USA
| | - Jiří Černý
- Institute of Biotechnology, Czech Academy of Sciences, Vestec, CZ
| | | | - Paul Emsley
- MRC Laboratory of Molecular Biology, Cambridge, UK
| | - Alberto Gobbi
- Discovery Chemistry, Genentech Inc, South San Francisco, USA
| | - Andrzej Joachimiak
- Structural Biology Center, X-ray Science Division, Argonne National Laboratory, Argonne, IL, USA
| | - Sigrid Noreng
- Structural Biology, Genentech Inc, South San Francisco, USA
| | | | - Randy J. Read
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, UK
| | | | | | - Bohdan Schneider
- Institute of Biotechnology, Czech Academy of Sciences, Vestec, CZ
| | | | - Chenghua Shao
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
| | | | | | | | - Ying Yang
- Structural Biology, Genentech Inc, South San Francisco, USA
| | - Venkat Abbaraju
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
| | - Pavel V. Afonine
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Matthew L. Baker
- Department of Biochemistry and Molecular Biology, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Paul S. Bond
- York Structural Biology Laboratory, Department of Chemistry, University of York, York, UK
| | - Tom L. Blundell
- Department of Biochemistry, University of Cambridge, Cambridge, UK
| | - Tom Burnley
- Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
| | - Arthur Campbell
- Center for Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Renzhi Cao
- Department of Computer Science, Pacific Lutheran University, Tacoma, WA, USA
| | - Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
| | | | - Kevin D. Cowtan
- York Structural Biology Laboratory, Department of Chemistry, University of York, York, UK
| | - Frank DiMaio
- Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Reza Esmaeeli
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL, USA
| | - Nabin Giri
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
| | - Helmut Grubmüller
- Theoretical and Computational Biophysics Department, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
| | - Soon Wen Hoh
- York Structural Biology Laboratory, Department of Chemistry, University of York, York, UK
| | - Jie Hou
- Department of Computer Science, Saint Louis University, St. Louis, MO, USA
| | - Corey F. Hryc
- Department of Biochemistry and Molecular Biology, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Carola Hunte
- Institute of Biochemistry and Molecular Biology, ZBMZ, Faculty of Medicine and CIBSS - Centre for Integrative Biological Signalling Studies, University of Freiburg, 79104 Freiburg, Germany
| | - Maxim Igaev
- Theoretical and Computational Biophysics Department, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
| | - Agnel P. Joseph
- Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
| | - Wei-Chun Kao
- Institute of Biochemistry and Molecular Biology, ZBMZ, Faculty of Medicine and CIBSS - Centre for Integrative Biological Signalling Studies, University of Freiburg, 79104 Freiburg, Germany
| | - Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Department of Computer Science, Purdue University, West Lafayette, IN, USA
| | - Dilip Kumar
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX, USA
| | - Lijun Lang
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL, USA
| | - Sean Lin
- Division of Computing & Software Systems, University of Washington, Bothell, WA, USA
| | | | - Sumit Mittal
- Biodesign Institute, Arizona State University, Tempe, AZ, USA
- School of Advanced Sciences and Languages, VIT Bhopal University, Bhopal, India
| | - Arup Mondal
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL, USA
| | - Nigel W. Moriarty
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Andrew Muenks
- Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, WA, USA
| | | | | | - Mateusz Olek
- York Structural Biology Laboratory, Department of Chemistry, University of York, York, UK
- Electron Bio-Imaging Centre, Diamond Light Source, Harwell Science and Innovation Campus, Didcot, UK
| | - Colin M. Palmer
- Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
| | - Alberto Perez
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL, USA
| | - Emmi Pohjolainen
- Theoretical and Computational Biophysics Department, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
| | - Karunakar R. Pothula
- Institute of Biological Information Processing (IBI-7: Structural Biochemistry) and Jülich Centre for Structural Biology (JuStruct), Forschungszentrum Jülich, Jülich, Germany
| | | | - Daipayan Sarkar
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Biodesign Institute, Arizona State University, Tempe, AZ, USA
| | - Luisa U. Schäfer
- Institute of Biological Information Processing (IBI-7: Structural Biochemistry) and Jülich Centre for Structural Biology (JuStruct), Forschungszentrum Jülich, Jülich, Germany
| | - Christopher J. Schlicksup
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Gunnar F. Schröder
- Institute of Biological Information Processing (IBI-7: Structural Biochemistry) and Jülich Centre for Structural Biology (JuStruct), Forschungszentrum Jülich, Jülich, Germany
- Physics Department, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
| | - Mrinal Shekhar
- Center for Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Biodesign Institute, Arizona State University, Tempe, AZ, USA
| | - Dong Si
- Division of Computing & Software Systems, University of Washington, Bothell, WA, USA
| | | | - Oleg V. Sobolev
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Genki Terashi
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
| | - Andrea C. Vaiana
- Theoretical and Computational Biophysics Department, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
- Nature’s Toolbox (NTx), Rio Rancho, NM, USA
| | | | - Jacob Verburgt
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
| | - Xiao Wang
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
| | | | - Martyn D. Winn
- Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
| | - Simone Weyand
- Department of Biochemistry, University of Cambridge, Cambridge, UK
| | | | - Minglei Zhao
- Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, IL, USA
| | - Michael F. Schmid
- Division of Cryo-EM and Bioimaging, SSRL, SLAC National Accelerator Laboratory, Menlo Park, CA, USA
| | - Helen M. Berman
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
| | - Wah Chiu
- Departments of Bioengineering and of Microbiology and Immunology, Stanford University, Stanford, CA, USA
- Division of Cryo-EM and Bioimaging, SSRL, SLAC National Accelerator Laboratory, Menlo Park, CA, USA
|
21
|
Ray DD, Flagel L, Schrider DR. IntroUNET: identifying introgressed alleles via semantic segmentation. bioRxiv 2024:2023.02.07.527435. [PMID: 36865105 PMCID: PMC9979274 DOI: 10.1101/2023.02.07.527435] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Indexed: 06/18/2023]
Abstract
A growing body of evidence suggests that gene flow between closely related species is a widespread phenomenon. Alleles that introgress from one species into a close relative are typically neutral or deleterious, but sometimes confer a significant fitness advantage. Given the potential relevance to speciation and adaptation, numerous methods have therefore been devised to identify regions of the genome that have experienced introgression. Recently, supervised machine learning approaches have been shown to be highly effective for detecting introgression. One especially promising approach is to treat population genetic inference as an image classification problem, and feed an image representation of a population genetic alignment as input to a deep neural network that distinguishes among evolutionary models (i.e. introgression or no introgression). However, if we wish to investigate the full extent and fitness effects of introgression, merely identifying genomic regions in a population genetic alignment that harbor introgressed loci is insufficient; ideally we would be able to infer precisely which individuals have introgressed material and at which positions in the genome. Here we adapt a deep learning algorithm for semantic segmentation, the task of correctly identifying the type of object to which each individual pixel in an image belongs, to the task of identifying introgressed alleles. Our trained neural network is thus able to infer, for each individual in a two-population alignment, which of those individual's alleles were introgressed from the other population. We use simulated data to show that this approach is highly accurate, and that it can be readily extended to identify alleles that are introgressed from an unsampled "ghost" population, performing comparably to a supervised learning method tailored specifically to that task.
Finally, we apply this method to data from Drosophila, showing that it is able to accurately recover introgressed haplotypes from real data. This analysis reveals that introgressed alleles are typically confined to lower frequencies within genic regions, suggestive of purifying selection, but are found at much higher frequencies in a region previously shown to be affected by adaptive introgression. Our method's success in recovering introgressed haplotypes in challenging real-world scenarios underscores the utility of deep learning approaches for making richer evolutionary inferences from genomic data.
Affiliation(s)
- Dylan D. Ray
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Lex Flagel
- Division of Data Science, Gencove Inc., New York, NY 11101, USA
- Department of Plant and Microbial Biology, University of Minnesota, St Paul MN, 55108, USA
| | - Daniel R. Schrider
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
|
22
|
Heo L, Feig M. One bead per residue can describe all-atom protein structures. Structure 2024; 32:97-111.e6. [PMID: 38000367 PMCID: PMC10872525 DOI: 10.1016/j.str.2023.10.013] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 07/19/2023] [Revised: 09/16/2023] [Accepted: 10/30/2023] [Indexed: 11/26/2023]
Abstract
Atomistic resolution is the standard for high-resolution biomolecular structures, but experimental structural data are often at lower resolution. Coarse-grained models are also used extensively in computational studies to reach biologically relevant spatial and temporal scales. This study explores the use of advanced machine learning networks for reconstructing atomistic models from reduced representations. The main finding is that a single bead per amino acid residue allows construction of accurate and stereochemically realistic all-atom structures with minimal loss of information. This suggests that lower resolution representations of proteins may be sufficient for many applications when combined with a machine learning framework that encodes knowledge from known structures. Practical applications include the rapid addition of atomistic detail to low-resolution structures from experiment or computational coarse-grained models. The application of rapid, deterministic all-atom reconstruction within multi-scale frameworks is further demonstrated with a rapid protocol for the generation of accurate models from cryo-EM densities close to experimental structures.
Affiliation(s)
- Lim Heo
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
| | - Michael Feig
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA.
|
23
|
Terashi G, Wang X, Prasad D, Nakamura T, Kihara D. DeepMainmast: integrated protocol of protein structure modeling for cryo-EM with deep learning and structure prediction. Nat Methods 2024; 21:122-131. [PMID: 38066344 DOI: 10.1038/s41592-023-02099-0] [Citation(s) in RCA: 17] [Impact Index Per Article: 17.0] [Received: 03/15/2023] [Accepted: 10/22/2023] [Indexed: 12/19/2023]
Abstract
Three-dimensional structure modeling from maps is an indispensable step for studying proteins and their complexes with cryogenic electron microscopy. Although the resolution of determined cryogenic electron microscopy maps has generally improved, there are still many cases where tracing protein main chains is difficult, even in maps determined at a near-atomic resolution. Here we developed a protein structure modeling method, DeepMainmast, which employs deep learning to capture the local map features of amino acids and atoms to assist main-chain tracing. Moreover, we integrated AlphaFold2 with the de novo density tracing protocol to combine their complementary strengths and achieved even higher accuracy than each method alone. Additionally, the protocol is able to accurately assign the chain identity to the structure models of homo-multimers, which is not a trivial task for existing methods.
Affiliation(s)
- Genki Terashi
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
| | - Xiao Wang
- Department of Computer Science, Purdue University, West Lafayette, IN, USA
| | - Devashish Prasad
- Department of Computer Science, Purdue University, West Lafayette, IN, USA
| | - Tsukasa Nakamura
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
| | - Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA.
- Department of Computer Science, Purdue University, West Lafayette, IN, USA.
|
24
|
Terashi G, Wang X, Prasad D, Nakamura T, Zhu H, Kihara D. Integrated Protocol of Protein Structure Modeling for Cryo-EM with Deep Learning and Structure Prediction. bioRxiv 2023:2023.10.19.563151. [PMID: 37904978 PMCID: PMC10614963 DOI: 10.1101/2023.10.19.563151] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/02/2023]
Abstract
Structure modeling from maps is an indispensable step for studying proteins and their complexes with cryogenic electron microscopy (cryo-EM). Although the resolution of determined cryo-EM maps has generally improved, there are still many cases where tracing protein main-chains is difficult, even in maps determined at a near-atomic resolution. Here, we have developed a protein structure modeling method, called DeepMainmast, which employs deep learning to capture the local map features of amino acids and atoms to assist main-chain tracing. Moreover, since AlphaFold2 demonstrates high accuracy in protein structure prediction, we have integrated complementary strengths of de novo density tracing using deep learning with AlphaFold2's structure modeling to achieve even higher accuracy than each method alone. Additionally, the protocol is able to accurately assign chain identity to the structure models of homo-multimers.
Affiliation(s)
- Genki Terashi
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907, USA
| | - Xiao Wang
- Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
| | - Devashish Prasad
- Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
| | - Tsukasa Nakamura
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907, USA
| | - Han Zhu
- Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
| | - Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907, USA
- Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
|
25
|
Park J, Joung I, Joo K, Lee J. Application of conformational space annealing to the protein structure modeling using cryo-EM maps. J Comput Chem 2023; 44:2332-2346. [PMID: 37585026 DOI: 10.1002/jcc.27200] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2022] [Revised: 04/26/2023] [Accepted: 07/16/2023] [Indexed: 08/17/2023]
Abstract
Conformational space annealing (CSA), a global optimization method, has been applied to various protein structure modeling tasks. In this paper, we applied CSA to the cryo-EM structure modeling task by combining the Python subroutine of CSA (PyCSA) and the fast relax (FastRelax) protocol of PyRosetta. Refinement of initial structures generated from two methods, rigid fitting of predicted structures to the cryo-EM map and de novo protein modeling by tracing the cryo-EM map, was performed by CSA. In the refinement of the rigid-fitted structures, the final models showed that CSA can generate reliable atomic structures of proteins, even when large movements of protein domains were required. In the de novo modeling case, although the overall structural qualities of the final models were rather dependent on the initial models, the final models generated by CSA showed improved MolProbity scores and cross-correlation coefficients to the maps. These results suggest that CSA can accomplish flexible fitting and refinement together by sampling diverse conformations effectively and thus can be utilized for cryo-EM structure modeling.
Affiliation(s)
| | | | - Keehyoung Joo
- Center for Advanced Computations, Korea Institute for Advanced Study, Seoul, South Korea
| | - Jooyoung Lee
- School of Computational Sciences, Korea Institute for Advanced Study, Seoul, South Korea
|
26
|
Wang X, Terashi G, Kihara D. CryoREAD: de novo structure modeling for nucleic acids in cryo-EM maps using deep learning. Nat Methods 2023; 20:1739-1747. [PMID: 37783885 PMCID: PMC10841814 DOI: 10.1038/s41592-023-02032-5] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 11/28/2022] [Accepted: 08/24/2023] [Indexed: 10/04/2023]
Abstract
DNA and RNA play fundamental roles in various cellular processes, where their three-dimensional structures provide information critical to understanding the molecular mechanisms of their functions. Although an increasing number of nucleic acid structures and their complexes with proteins are determined by cryogenic electron microscopy (cryo-EM), structure modeling for DNA and RNA remains challenging, particularly when the map is determined at a resolution coarser than atomic level. Moreover, computational methods for nucleic acid structure modeling are relatively scarce. Here, we present CryoREAD, a fully automated de novo DNA/RNA atomic structure modeling method using deep learning. CryoREAD identifies phosphate, sugar and base positions in a cryo-EM map using deep learning, which are traced and modeled into a three-dimensional structure. When tested on cryo-EM maps determined at 2.0 to 5.0 Å resolution, CryoREAD built substantially more accurate models than existing methods. We also applied the method to cryo-EM maps of biomolecular complexes in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2).
Affiliation(s)
- Xiao Wang
- Department of Computer Science, Purdue University, West Lafayette, IN, USA
| | - Genki Terashi
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
| | - Daisuke Kihara
- Department of Computer Science, Purdue University, West Lafayette, IN, USA.
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA.
|
27
|
Jamali K, Käll L, Zhang R, Brown A, Kimanius D, Scheres SH. Automated model building and protein identification in cryo-EM maps. bioRxiv 2023:2023.05.16.541002. [PMID: 37292681 PMCID: PMC10245678 DOI: 10.1101/2023.05.16.541002] [Citation(s) in RCA: 32] [Impact Index Per Article: 16.0] [Indexed: 06/10/2023]
Abstract
Interpreting electron cryo-microscopy (cryo-EM) maps with atomic models requires high levels of expertise and labour-intensive manual intervention. We present ModelAngelo, a machine-learning approach for automated atomic model building in cryo-EM maps. By combining information from the cryo-EM map with information from protein sequence and structure in a single graph neural network, ModelAngelo builds atomic models for proteins that are of similar quality to those generated by human experts. For nucleotides, ModelAngelo builds backbones with accuracy similar to that of humans. By using its predicted amino acid probabilities for each residue in hidden Markov model sequence searches, ModelAngelo outperforms human experts in the identification of proteins with unknown sequences. ModelAngelo will thus remove bottlenecks and increase objectivity in cryo-EM structure determination.
Affiliation(s)
| | - Lukas Käll
- Science for Life Laboratory, KTH Royal Institute of Technology, Stockholm, Sweden
| | - Rui Zhang
- Washington University in St. Louis, St. Louis, MO, USA
| | - Alan Brown
- Blavatnik Institute, Harvard Medical School, Boston, MA, USA
|
28
|
Miyashita O, Tama F. Advancing cryo-electron microscopy data analysis through accelerated simulation-based flexible fitting approaches. Curr Opin Struct Biol 2023; 82:102653. [PMID: 37451233 DOI: 10.1016/j.sbi.2023.102653] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2023] [Revised: 05/30/2023] [Accepted: 06/19/2023] [Indexed: 07/18/2023]
Abstract
Flexible fitting based on molecular dynamics simulation is a technique for structure modeling from cryo-EM data. It has been utilized for nearly two decades, and while cryo-EM resolution has improved significantly, it remains a powerful approach that can provide structural and dynamical insights that are not directly accessible from experimental data alone. Molecular dynamics simulations provide a means to extract atomistic details of conformational changes that are encoded in cryo-EM data and can also assist in improving the quality of structural models. Additionally, molecular dynamics simulations enable the characterization of conformational heterogeneity in cryo-EM data. We will summarize the advancements made in these techniques and highlight recent developments in this field.
Affiliation(s)
- Osamu Miyashita
- RIKEN Center for Computational Science, 6-7-1, Minatojima-minami-machi, Chuo-ku, Kobe, Hyogo 650-0047, Japan.
| | - Florence Tama
- RIKEN Center for Computational Science, 6-7-1, Minatojima-minami-machi, Chuo-ku, Kobe, Hyogo 650-0047, Japan; Department of Physics, Graduate School of Science, Nagoya University, Furo-cho, Chikusa-ku, Nagoya, Aichi 464-8602, Japan; Institute of Transformative Bio-Molecules, Nagoya University, Furo-cho, Chikusa-ku, Nagoya, Aichi 464-8602, Japan.
|
29
|
Sarkar D, Lee H, Vant JW, Turilli M, Vermaas JV, Jha S, Singharoy A. Adaptive Ensemble Refinement of Protein Structures in High Resolution Electron Microscopy Density Maps with Radical Augmented Molecular Dynamics Flexible Fitting. J Chem Inf Model 2023; 63:5834-5846. [PMID: 37661856 DOI: 10.1021/acs.jcim.3c00350] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 09/05/2023]
Abstract
Recent advances in cryo-electron microscopy (cryo-EM) have enabled modeling macromolecular complexes that are essential components of the cellular machinery. The density maps derived from cryo-EM experiments are often integrated with manual, knowledge-driven or artificial intelligence-driven and physics-guided computational methods to build, fit, and refine molecular structures. Going beyond a single stationary-structure determination scheme, it is becoming more common to interpret the experimental data with an ensemble of models that contributes to an average observation. Hence, there is a need to decide on the quality of an ensemble of protein structures on-the-fly while refining them against the density maps. We introduce such an adaptive decision-making scheme during the molecular dynamics flexible fitting (MDFF) of biomolecules. Using RADICAL-Cybertools, the new RADICAL augmented MDFF implementation (R-MDFF) is examined in high-performance computing environments for refinement of two prototypical protein systems, adenylate kinase and carbon monoxide dehydrogenase. For these test cases, use of multiple replicas in flexible fitting with adaptive decision making in R-MDFF improves the overall correlation to the density by 40% relative to the refinements of the brute-force MDFF. The improvements are particularly significant at high, 2-3 Å map resolutions. More importantly, the ensemble model captures key features of biologically relevant molecular dynamics that are inaccessible to a single-model interpretation. Finally, the pipeline is applicable to systems of growing sizes, which is demonstrated using ensemble refinement of capsid proteins from the chimpanzee adenovirus. The overhead for decision making remains low and robust to computing environments. The software is publicly available on GitHub and includes a short user guide to install R-MDFF on different computing environments, from local Linux-based workstations to high-performance computing environments.
Affiliation(s)
- Daipayan Sarkar
- MSU-DOE Plant Research Laboratory, East Lansing, Michigan 48824, United States
- School of Molecular Sciences, Arizona State University, Tempe, Arizona 85281, United States
| | - Hyungro Lee
- Pacific Northwest National Laboratory, Richland, Washington 99354, United States
- Electrical & Computer Engineering, Rutgers University, New Brunswick, New Jersey 08854, United States
| | - John W Vant
- School of Molecular Sciences, Arizona State University, Tempe, Arizona 85281, United States
| | - Matteo Turilli
- Electrical & Computer Engineering, Rutgers University, New Brunswick, New Jersey 08854, United States
- Computational Science Initiative, Brookhaven National Laboratory, Upton, New York 11973, United States
| | - Josh V Vermaas
- MSU-DOE Plant Research Laboratory, East Lansing, Michigan 48824, United States
| | - Shantenu Jha
- Electrical & Computer Engineering, Rutgers University, New Brunswick, New Jersey 08854, United States
- Computational Science Initiative, Brookhaven National Laboratory, Upton, New York 11973, United States
| | - Abhishek Singharoy
- School of Molecular Sciences, Arizona State University, Tempe, Arizona 85281, United States
|
30
|
DiIorio MC, Kulczyk AW. Novel Artificial Intelligence-Based Approaches for Ab Initio Structure Determination and Atomic Model Building for Cryo-Electron Microscopy. Micromachines 2023; 14:1674. [PMID: 37763837 PMCID: PMC10534518 DOI: 10.3390/mi14091674] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/02/2023] [Revised: 08/21/2023] [Accepted: 08/25/2023] [Indexed: 09/29/2023]
Abstract
Single particle cryo-electron microscopy (cryo-EM) has emerged as the prevailing method for near-atomic structure determination, shedding light on the important molecular mechanisms of biological macromolecules. However, the inherent dynamics and structural variability of biological complexes coupled with the large number of experimental images generated by a cryo-EM experiment make data processing nontrivial. In particular, ab initio reconstruction and atomic model building remain major bottlenecks that demand substantial computational resources and manual intervention. Approaches utilizing recent innovations in artificial intelligence (AI) technology, particularly deep learning, have the potential to overcome the limitations that cannot be adequately addressed by traditional image processing approaches. Here, we review newly proposed AI-based methods for ab initio volume generation, heterogeneous 3D reconstruction, and atomic model building. We highlight the advancements made by the implementation of AI methods, as well as discuss remaining limitations and areas for future development.
Affiliation(s)
- Megan C. DiIorio
- Institute for Quantitative Biomedicine, Rutgers University, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
| | - Arkadiusz W. Kulczyk
- Institute for Quantitative Biomedicine, Rutgers University, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Department of Biochemistry & Microbiology, Rutgers University, 76 Lipman Drive, New Brunswick, NJ 08901, USA
|
31
|
Maddhuri Venkata Subramaniya SR, Terashi G, Kihara D. Enhancing cryo-EM maps with 3D deep generative networks for assisting protein structure modeling. Bioinformatics 2023; 39:btad494. [PMID: 37549063 PMCID: PMC10444963 DOI: 10.1093/bioinformatics/btad494] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/08/2022] [Revised: 07/28/2023] [Accepted: 08/04/2023] [Indexed: 08/09/2023] Open
Abstract
MOTIVATION: The tertiary structures of an increasing number of biological macromolecules have been determined using cryo-electron microscopy (cryo-EM). However, there are still many cases where the resolution is not high enough to model the molecular structures with standard computational tools. If the resolution obtained is near the empirical borderline (3-4.5 Å), improvement in the map quality facilitates structure modeling. RESULTS: We report EM-GAN, a novel approach that modifies an input cryo-EM map to assist protein structure modeling. The method uses a 3D generative adversarial network (GAN) that has been trained on high- and low-resolution density maps to learn the density patterns, and modifies the input map to enhance its suitability for modeling. The method was tested extensively on a dataset of 65 EM maps in the resolution range of 3-6 Å and showed substantial improvements in structure modeling using popular protein structure modeling tools. AVAILABILITY AND IMPLEMENTATION: https://github.com/kiharalab/EM-GAN, Google Colab: https://tinyurl.com/3ccxpttx.
Affiliation(s)
| | - Genki Terashi
- Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, United States
| | - Daisuke Kihara
- Department of Computer Science, Purdue University, West Lafayette, IN 47907, United States
- Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, United States
|
32
|
Yvonnesdotter L, Rovšnik U, Blau C, Lycksell M, Howard RJ, Lindahl E. Automated simulation-based membrane protein refinement into cryo-EM data. Biophys J 2023; 122:2773-2781. [PMID: 37277992 PMCID: PMC10397807 DOI: 10.1016/j.bpj.2023.05.033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2022] [Revised: 04/02/2023] [Accepted: 05/31/2023] [Indexed: 06/07/2023] Open
Abstract
The resolution revolution has increasingly enabled single-particle cryogenic electron microscopy (cryo-EM) reconstructions of previously inaccessible systems, including membrane proteins, a category that constitutes a disproportionate share of drug targets. We present a protocol for using density-guided molecular dynamics simulations to automatically refine atomistic models into membrane protein cryo-EM maps. Using adaptive force density-guided simulations as implemented in the GROMACS molecular dynamics package, we show how automated model refinement of a membrane protein is achieved without the need to manually tune the fitting force ad hoc. We also present selection criteria to choose the best-fit model that balances stereochemistry and goodness of fit. The proposed protocol was used to refine models into a new cryo-EM density of the membrane protein maltoporin, either in a lipid bilayer or detergent micelle, and we found that results do not substantially differ from fitting in solution. Fitted structures satisfied classical model-quality metrics and improved the quality and the model-to-map correlation of the X-ray starting structure. Additionally, the density-guided fitting in combination with generalized orientation-dependent all-atom potential was used to correct the pixel-size estimation of the experimental cryo-EM density map. This work demonstrates the applicability of a straightforward automated approach to fitting membrane protein cryo-EM densities. Such computational approaches promise to facilitate rapid refinement of proteins under different conditions or with various ligands present, including targets in the highly relevant superfamily of membrane proteins.
Affiliation(s)
- Linnea Yvonnesdotter
- Science for Life Laboratory & Swedish e-Science Research Center, Department of Applied Physics, KTH Royal Institute of Technology, Solna, Sweden
| | - Urška Rovšnik
- Science for Life Laboratory & Swedish e-Science Research Center, Department of Applied Physics, KTH Royal Institute of Technology, Solna, Sweden
| | - Christian Blau
- Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, Solna, Sweden
| | - Marie Lycksell
- Science for Life Laboratory & Swedish e-Science Research Center, Department of Applied Physics, KTH Royal Institute of Technology, Solna, Sweden
| | - Rebecca Joy Howard
- Science for Life Laboratory & Swedish e-Science Research Center, Department of Applied Physics, KTH Royal Institute of Technology, Solna, Sweden
| | - Erik Lindahl
- Science for Life Laboratory & Swedish e-Science Research Center, Department of Applied Physics, KTH Royal Institute of Technology, Solna, Sweden; Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, Solna, Sweden.
|
33
|
He J, Li T, Huang SY. Improvement of cryo-EM maps by simultaneous local and non-local deep learning. Nat Commun 2023; 14:3217. [PMID: 37270635 DOI: 10.1038/s41467-023-39031-1] [Citation(s) in RCA: 68] [Impact Index Per Article: 34.0] [Received: 08/03/2022] [Accepted: 05/25/2023] [Indexed: 06/05/2023] Open
Abstract
Cryo-EM has emerged as the most important technique for structure determination of macromolecular complexes. However, raw cryo-EM maps often exhibit loss of contrast at high resolution and heterogeneity over the entire map. As such, various post-processing methods have been proposed to improve cryo-EM maps. Nevertheless, it is still challenging to improve both the quality and interpretability of EM maps. Addressing the challenge, we present a three-dimensional Swin-Conv-UNet-based deep learning framework to improve cryo-EM maps, named EMReady, by not only implementing both local and non-local modeling modules in a multiscale UNet architecture but also simultaneously minimizing the local smooth L1 distance and maximizing the non-local structural similarity between processed experimental and simulated target maps in the loss function. EMReady was extensively evaluated on diverse test sets of 110 primary cryo-EM maps and 25 pairs of half-maps at 3.0-6.0 Å resolutions, and compared with five state-of-the-art map post-processing methods. It is shown that EMReady can not only robustly enhance the quality of cryo-EM maps in terms of map-model correlations, but also improve the interpretability of the maps in automatic de novo model building.
Affiliation(s)
- Jiahua He
- School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, China
| | - Tao Li
- School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, China
| | - Sheng-You Huang
- School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, China.
| |
|
34
|
Sekmen A, Al Nasr K, Bilgin B, Koku AB, Jones C. Mathematical and Machine Learning Approaches for Classification of Protein Secondary Structure Elements from Cα Coordinates. Biomolecules 2023; 13:923. [PMID: 37371503 DOI: 10.3390/biom13060923] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2023] [Revised: 05/16/2023] [Accepted: 05/16/2023] [Indexed: 06/29/2023] Open
Abstract
Determining Secondary Structure Elements (SSEs) for any protein is crucial as an intermediate step for experimental tertiary structure determination. SSEs are identified using popular tools such as DSSP and STRIDE. These tools use atomic information to locate hydrogen bonds to identify SSEs. When some spatial atomic details are missing, locating SSEs becomes difficult. To address this problem, three approaches for classifying SSE types using Cα atoms in protein chains were developed: (1) a mathematical approach, (2) a deep learning approach, and (3) an ensemble of five machine learning models. The proposed methods were compared against each other and with a state-of-the-art approach, PCASSO.
Affiliation(s)
- Ali Sekmen
- Department of Computer Science, Tennessee State University, Nashville, TN 37209, USA
| | - Kamal Al Nasr
- Department of Computer Science, Tennessee State University, Nashville, TN 37209, USA
| | - Bahadir Bilgin
- Department of Computer Science, Tennessee State University, Nashville, TN 37209, USA
- Department of Mechanical Engineering, Middle East Technical University, Ankara 06800, Türkiye
| | - Ahmet Bugra Koku
- Department of Mechanical Engineering, Middle East Technical University, Ankara 06800, Türkiye
- Center for Robotics and AI, Middle East Technical University, Ankara 06800, Türkiye
| | - Christopher Jones
- Department of Computer Science, Tennessee State University, Nashville, TN 37209, USA
| |
|
35
|
Chang L, Mondal A, MacCallum JL, Perez A. CryoFold 2.0: Cryo-EM Structure Determination with MELD. J Phys Chem A 2023; 127:3906-3913. [PMID: 37084537 DOI: 10.1021/acs.jpca.3c01731] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 04/23/2023]
Abstract
Cryo-electron microscopy data are becoming more prevalent and accessible at higher resolution levels, leading to the development of new computational tools to determine the atomic structure of macromolecules. However, while existing tools adapted from X-ray crystallography are suitable for the highest-resolution maps, new tools are needed for lower-resolution levels and to account for map heterogeneity. In this article, we introduce CryoFold 2.0, an integrative physics-based approach that combines Bayesian inference and the ability to handle multiple data sources with the molecular dynamics flexible fitting (MDFF) approach to determine the structures of macromolecules by using cryo-EM data. CryoFold 2.0 is incorporated into the MELD (modeling employing limited data) plugin, resulting in a pipeline that is more computationally efficient and accurate than running MELD or MDFF alone. The approach requires fewer computational resources and shorter simulation times than the original CryoFold, and it minimizes manual intervention. We demonstrate the effectiveness of the approach on eight different systems, highlighting its various benefits.
Affiliation(s)
- Liwei Chang
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, Florida 32611, United States
| | - Arup Mondal
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, Florida 32611, United States
| | - Justin L MacCallum
- Department of Chemistry, University of Calgary, Calgary, AB T2N 1N4, Canada
| | - Alberto Perez
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, Florida 32611, United States
| |
|
36
|
Giri N, Roy RS, Cheng J. Deep learning for reconstructing protein structures from cryo-EM density maps: Recent advances and future directions. Curr Opin Struct Biol 2023; 79:102536. [PMID: 36773336 PMCID: PMC10023387 DOI: 10.1016/j.sbi.2023.102536] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 09/14/2022] [Revised: 12/20/2022] [Accepted: 01/03/2023] [Indexed: 02/11/2023]
Abstract
Cryo-Electron Microscopy (cryo-EM) has emerged in recent years as a key technology to determine the structure of proteins, particularly large protein complexes and assemblies. A key challenge in cryo-EM data analysis is to automatically reconstruct accurate protein structures from cryo-EM density maps. In this review, we briefly overview various deep learning methods for building protein structures from cryo-EM density maps, analyze their impact, and discuss the challenges of preparing high-quality data sets for training deep learning models. Looking into the future, more advanced deep learning models that effectively integrate cryo-EM data with other sources of complementary data, such as protein sequences and AlphaFold-predicted structures, need to be developed to further advance the field.
Affiliation(s)
- Nabin Giri
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, 65211, Missouri, USA; NextGen Precision Health, University of Missouri, Columbia, 65211, Missouri, USA. https://twitter.com/@nvngiri
| | - Raj S Roy
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, 65211, Missouri, USA. https://twitter.com/@rajshekhorroy
| | - Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, 65211, Missouri, USA; NextGen Precision Health, University of Missouri, Columbia, 65211, Missouri, USA.
| |
|
37
|
Nakamura A, Meng H, Zhao M, Wang F, Hou J, Cao R, Si D. Fast and automated protein-DNA/RNA macromolecular complex modeling from cryo-EM maps. Brief Bioinform 2023; 24:bbac632. [PMID: 36682003 PMCID: PMC10399284 DOI: 10.1093/bib/bbac632] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 10/12/2022] [Revised: 12/15/2022] [Accepted: 12/29/2022] [Indexed: 01/23/2023] Open
Abstract
Cryo-electron microscopy (cryo-EM) allows a macromolecular structure such as protein-DNA/RNA complexes to be reconstructed in a three-dimensional coulomb potential map. The structural information of these macromolecular complexes forms the foundation for understanding the molecular mechanism including many human diseases. However, the model building of large macromolecular complexes is often difficult and time-consuming. We recently developed DeepTracer-2.0, an artificial-intelligence-based pipeline that can build amino acid and nucleic acid backbones from a single cryo-EM map, and even predict the best-fitting residues according to the density of side chains. The experiments showed improved accuracy and efficiency when benchmarking the performance on independent experimental maps of protein-DNA/RNA complexes and demonstrated the promising future of macromolecular modeling from cryo-EM maps. Our method and pipeline could benefit researchers worldwide who work in molecular biomedicine and drug discovery, and substantially increase the throughput of the cryo-EM model building. The pipeline has been integrated into the web portal https://deeptracer.uw.edu/.
Affiliation(s)
- Andrew Nakamura
- Division of Computing and Software Systems, University of Washington Bothell, Bothell, WA 98011, USA
| | - Hanze Meng
- Department of Computer Science, Duke University, Durham, NC 27708, USA
| | - Minglei Zhao
- Department of Biochemistry and Molecular Biology, The University of Chicago, Chicago, IL 60637, USA
| | - Fengbin Wang
- Department of Biochemistry and Molecular Genetics, University of Alabama Birmingham, Heersink School of Medicine, Birmingham, AL 35233, USA
| | - Jie Hou
- Department of Computer Science, Saint Louis University, Saint Louis, MO 63103, USA
| | - Renzhi Cao
- Department of Computer Science, Pacific Lutheran University, Tacoma, WA 98447, USA
| | - Dong Si
- Corresponding author: Dong Si, Division of Computing and Software Systems, University of Washington Bothell, Bothell, WA 98011, USA. E-mail:
| |
|
38
|
Muenks A, Zepeda S, Zhou G, Veesler D, DiMaio F. Automatic and accurate ligand structure determination guided by cryo-electron microscopy maps. Nat Commun 2023; 14:1164. [PMID: 36859493 PMCID: PMC9976687 DOI: 10.1038/s41467-023-36732-5] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 08/08/2022] [Accepted: 02/15/2023] [Indexed: 03/03/2023] Open
Abstract
Advances in cryo-electron microscopy (cryoEM) and deep-learning guided protein structure prediction have expedited structural studies of protein complexes. However, methods for accurately determining ligand conformations are lacking. In this manuscript, we develop EMERALD, a tool for automatically determining ligand structures guided by medium-resolution cryoEM density. We show this method is robust at predicting ligands along with surrounding side chains in maps as low as 4.5 Å local resolution. Combining this with a measure of placement confidence and running on all protein/ligand structures in the EMDB, we show that 57% of ligands replicate the deposited model, 16% confidently find alternate conformations, 22% have ambiguous density where multiple conformations might be present, and 5% are incorrectly placed. For five cases where our approach finds an alternate conformation with high confidence, high-resolution crystal structures validate our placement. EMERALD and the resulting analysis should prove critical in using cryoEM to solve protein-ligand complexes.
Affiliation(s)
- Andrew Muenks
- Department of Biochemistry, University of Washington, Seattle, WA, 98195, USA
- Institute for Protein Design, University of Washington, Seattle, WA, 98195, USA
| | - Samantha Zepeda
- Department of Biochemistry, University of Washington, Seattle, WA, 98195, USA
| | - Guangfeng Zhou
- Department of Biochemistry, University of Washington, Seattle, WA, 98195, USA
- Institute for Protein Design, University of Washington, Seattle, WA, 98195, USA
| | - David Veesler
- Department of Biochemistry, University of Washington, Seattle, WA, 98195, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, 98195, USA
| | - Frank DiMaio
- Department of Biochemistry, University of Washington, Seattle, WA, 98195, USA.
- Institute for Protein Design, University of Washington, Seattle, WA, 98195, USA.
| |
|
39
|
Lee S, Seok C, Park H. Benchmarking applicability of medium-resolution cryo-EM protein structures for structure-based drug design. J Comput Chem 2023; 44:1360-1368. [PMID: 36847771 DOI: 10.1002/jcc.27091] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/30/2022] [Revised: 01/18/2023] [Accepted: 02/05/2023] [Indexed: 03/01/2023]
Abstract
Cryo-electron microscopy (cryo-EM) is gaining considerable attention for high-resolution protein structure determination in solution. However, a very high percentage of cryo-EM structures correspond to resolutions of 3-5 Å, making the structures difficult to use in in silico drug design. In this study, we analyze how useful cryo-EM protein structures are for in silico drug design by evaluating ligand docking accuracy. In realistic cross-docking scenarios using medium-resolution (3-5 Å) cryo-EM structures and the popular docking tool Autodock-Vina, only 20% of dockings succeeded, whereas the success rate doubles in the same kind of cross-docking using high-resolution (<2 Å) crystal structures instead. We decipher the reasons for failures by decomposing the contributions from resolution-dependent and resolution-independent factors. The heterogeneity in the protein side-chain and backbone conformations is identified as the major resolution-dependent factor causing docking difficulty, while intrinsic receptor flexibility mainly comprises the resolution-independent factor. We demonstrate that the flexibility implementation in current ligand docking tools is able to rescue only a portion of failures (10%), and that the limited performance was due more to potential structural errors than to conformational changes. Our work suggests a strong need for more robust ligand docking and EM modeling methods in order to fully utilize cryo-EM structures for in silico drug design.
Affiliation(s)
- Seho Lee
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
| | - Chaok Seok
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea; Galux Inc., Seoul, Republic of Korea
| | - Hahnbeom Park
- Brain Science Institute, Korea Institute of Science and Technology, Seoul, Republic of Korea
| |
|
40
|
Beton JG, Cragnolini T, Kaleel M, Mulvaney T, Sweeney A, Topf M. Integrating model simulation tools and cryo‐electron microscopy. WIREs Computational Molecular Science 2022. [DOI: 10.1002/wcms.1642] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Affiliation(s)
- Joseph George Beton
- Centre for Structural Systems Biology (CSSB), Leibniz‐Institut für Virologie (LIV), Hamburg, Germany
| | - Tristan Cragnolini
- Institute of Structural and Molecular Biology, Birkbeck and University College London, London, UK
| | - Manaz Kaleel
- Centre for Structural Systems Biology (CSSB), Leibniz‐Institut für Virologie (LIV), Hamburg, Germany
| | - Thomas Mulvaney
- Centre for Structural Systems Biology (CSSB), Leibniz‐Institut für Virologie (LIV), Hamburg, Germany
| | - Aaron Sweeney
- Centre for Structural Systems Biology (CSSB), Leibniz‐Institut für Virologie (LIV), Hamburg, Germany
| | - Maya Topf
- Centre for Structural Systems Biology (CSSB), Leibniz‐Institut für Virologie (LIV), Hamburg, Germany
| |
|
41
|
Christoffer C, Kihara D. Domain-Based Protein Docking with Extremely Large Conformational Changes. J Mol Biol 2022; 434:167820. [PMID: 36089054 PMCID: PMC9992458 DOI: 10.1016/j.jmb.2022.167820] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 05/27/2022] [Revised: 08/31/2022] [Accepted: 09/03/2022] [Indexed: 11/17/2022]
Abstract
Proteins are key components in many processes in living cells, and physical interactions with other proteins and nucleic acids often form key parts of their functions. In many cases, large flexibility of proteins as they interact is key to their function. To understand the mechanisms of these processes, it is necessary to consider the 3D structures of such protein complexes. When such structures are not yet experimentally determined, protein docking has long been present to computationally generate useful structure models. However, protein docking has long had the limitation that the consideration of flexibility is usually limited to very small movements or very small structures. Methods have been developed which handle minor flexibility via normal mode or other structure sampling, but new methods are required to model ordered proteins which undergo large-scale conformational changes to elucidate their function at the molecular level. Here, we present Flex-LZerD, a framework for docking such complexes. Via partial assembly multidomain docking and an iterative normal mode analysis admitting curvilinear motions, we demonstrate the ability to model the assembly of a variety of protein-protein and protein-nucleic acid complexes.
Affiliation(s)
- Charles Christoffer
- Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
| | - Daisuke Kihara
- Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA; Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA; Purdue University Center for Cancer Research, Purdue University, West Lafayette, IN 47907, USA.
| |
|
42
|
Ranno N, Si D. Neural representations of cryo-EM maps and a graph-based interpretation. BMC Bioinformatics 2022; 23:397. [PMID: 36171544 PMCID: PMC9517980 DOI: 10.1186/s12859-022-04942-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2022] [Accepted: 09/16/2022] [Indexed: 11/10/2022] Open
Abstract
Background: Advances in imagery at atomic and near-atomic resolution, such as cryogenic electron microscopy (cryo-EM), have led to an influx of high resolution images of proteins and other macromolecular structures to data banks worldwide. Producing a protein structure from the discrete voxel grid data of cryo-EM maps involves interpolation into the continuous spatial domain. We present a novel data format called the neural cryo-EM map, which is formed from a set of neural networks that accurately parameterize cryo-EM maps and provide native, spatially continuous data for density and gradient. As a case study of this data format, we create graph-based interpretations of high resolution experimental cryo-EM maps.
Results: Normalized cryo-EM map values interpolated using the non-linear neural cryo-EM format are more accurate, consistently scoring less than 0.01 mean absolute error, than a conventional tri-linear interpolation, which scores up to 0.12 mean absolute error. Our graph-based interpretations of 115 experimental cryo-EM maps from 1.15 to 4.0 Å resolution provide high coverage of the underlying amino acid residue locations, while accuracy of nodes is correlated with resolution. The nodes of graphs created from atomic resolution maps (higher than 1.6 Å) provide greater than 99% residue coverage as well as 85% full atomic coverage with a mean of 0.19 Å root mean squared deviation. Other graphs have a mean 84% residue coverage with less specificity of the nodes due to experimental noise and differences of density context at lower resolutions.
Conclusions: The fully continuous and differentiable nature of the neural cryo-EM map enables the adaptation of the voxel data to alternative data formats, such as a graph that characterizes the atomic locations of the underlying protein or macromolecular structure. Graphs created from atomic resolution maps are superior in finding atom locations and may serve as input to predictive residue classification and structure segmentation methods. This work may be generalized to transform any 3D grid-based data format into non-linear, continuous, and differentiable format for downstream geometric deep learning applications.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-022-04942-1.
Affiliation(s)
- Nathan Ranno
- Department of Computing and Software Systems, University of Washington, Bothell, WA, USA
| | - Dong Si
- Department of Computing and Software Systems, University of Washington, Bothell, WA, USA.
| |
|
43
|
Bouvier G, Bardiaux B, Pellarin R, Rapisarda C, Nilges M. Building Protein Atomic Models from Cryo-EM Density Maps and Residue Co-Evolution. Biomolecules 2022; 12:1290. [PMID: 36139128 PMCID: PMC9496541 DOI: 10.3390/biom12091290] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2022] [Revised: 09/01/2022] [Accepted: 09/09/2022] [Indexed: 11/16/2022] Open
Abstract
Electron cryo-microscopy (cryo-EM) has emerged as a powerful method by which to obtain three-dimensional (3D) structures of macromolecular complexes at atomic or near-atomic resolution. However, de novo building of atomic models from near-atomic resolution (3–5 Å) cryo-EM density maps is a challenging task, in particular because poorly resolved side-chain densities hamper sequence assignment by automatic procedures at a lower resolution. Furthermore, segmentation of EM density maps into individual subunits remains a difficult problem when the structure of the subunits is not known, or when significant conformational rearrangement occurs between the isolated and associated form of the subunits. To tackle these issues, we have developed a graph-based method to thread most of the C-α trace of the protein backbone into the EM density map. The EM density is described as a weighted graph such that the resulting minimum spanning tree encompasses the high-density regions of the map. A pruning algorithm cleans the tree and finds the most probable positions of the C-α atoms, by using side-chain density when available, as a collection of C-α trace fragments. By complementing experimental EM maps with contact predictions from sequence co-evolutionary information, we demonstrate that this approach can correctly segment EM maps into individual subunits and assign amino acid sequences to backbone traces to generate atomic models.
Affiliation(s)
- Guillaume Bouvier
- Structural Bioinformatics Unit, Institut Pasteur, Université Paris Cité, CNRS UMR 3528, 75015 Paris, France
- Correspondence: (G.B.); (B.B.)
| | - Benjamin Bardiaux
- Structural Bioinformatics Unit, Institut Pasteur, Université Paris Cité, CNRS UMR 3528, 75015 Paris, France
- Correspondence: (G.B.); (B.B.)
| | - Riccardo Pellarin
- Structural Bioinformatics Unit, Institut Pasteur, Université Paris Cité, CNRS UMR 3528, 75015 Paris, France
| | - Chiara Rapisarda
- Microbiologie Fondamentale et Pathogènicité, University of Bordeaux, CNRS UMR 5234, 33076 Bordeaux, France
- Institut Européen de Chimie et Biologie, University of Bordeaux, 33600 Pessac, France
| | - Michael Nilges
- Structural Bioinformatics Unit, Institut Pasteur, Université Paris Cité, CNRS UMR 3528, 75015 Paris, France
| |
|
44
|
Chung JM, Durie CL, Lee J. Artificial Intelligence in Cryo-Electron Microscopy. Life (Basel) 2022; 12:1267. [PMID: 36013446 PMCID: PMC9410485 DOI: 10.3390/life12081267] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 07/26/2022] [Revised: 08/15/2022] [Accepted: 08/18/2022] [Indexed: 11/17/2022] Open
Abstract
Cryo-electron microscopy (cryo-EM) has become an unrivaled tool for determining the structure of macromolecular complexes. The biological function of macromolecular complexes is inextricably tied to the flexibility of these complexes. Single particle cryo-EM can reveal the conformational heterogeneity of a biochemically pure sample, leading to well-founded mechanistic hypotheses about the roles these complexes play in biology. However, the processing of increasingly large, complex datasets using traditional data processing strategies is exceedingly expensive in both user time and computational resources. Current innovations in data processing capitalize on artificial intelligence (AI) to improve the efficiency of data analysis and validation. Here, we review new tools that use AI to automate the data analysis steps of particle picking, 3D map reconstruction, and local resolution determination. We discuss how the application of AI moves the field forward, and what obstacles remain. We also introduce potential future applications of AI to use cryo-EM in understanding protein communities in cells.
Affiliation(s)
- Jeong Min Chung
- Department of Biotechnology, The Catholic University of Korea, Bucheon-si 14662, Gyeonggi, Korea
| | - Clarissa L. Durie
- Department of Biochemistry, University of Missouri, Columbia, MO 65211, USA
| | - Jinseok Lee
- Department of Biomedical Engineering, Kyung Hee University, Yongin-si 17104, Gyeonggi, Korea
| |
|
45
|
Alnabati E, Esquivel-Rodriguez J, Terashi G, Kihara D. MarkovFit: Structure Fitting for Protein Complexes in Electron Microscopy Maps Using Markov Random Field. Front Mol Biosci 2022; 9:935411. [PMID: 35959463 PMCID: PMC9358042 DOI: 10.3389/fmolb.2022.935411] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/03/2022] [Accepted: 06/13/2022] [Indexed: 11/13/2022] Open
Abstract
An increasing number of protein complex structures are determined by cryo-electron microscopy (cryo-EM). When individual protein structures have been determined and are available, an important task in structure modeling is to fit the individual structures into the density map. Here, we designed a method that fits the atomic structures of proteins in cryo-EM maps of medium to low resolutions using Markov random fields, which allows probabilistic evaluation of fitted models. Our method, MarkovFit, was more accurate than existing methods on datasets of 31 simulated cryo-EM maps of resolution 10 Å, nine experimentally determined cryo-EM maps of resolution less than 4 Å, and 28 experimentally determined cryo-EM maps of resolution 6 to 20 Å.
Affiliation(s)
- Eman Alnabati
- Department of Computer Science, Purdue University, West Lafayette, IN, United States
| | | | - Genki Terashi
- Department of Biological Sciences, Purdue University, West Lafayette, IN, United States
| | - Daisuke Kihara
- Department of Computer Science, Purdue University, West Lafayette, IN, United States
- Department of Biological Sciences, Purdue University, West Lafayette, IN, United States
| |
|
46
|
He J, Lin P, Chen J, Cao H, Huang SY. Model building of protein complexes from intermediate-resolution cryo-EM maps with deep learning-guided automatic assembly. Nat Commun 2022; 13:4066. [PMID: 35831370 PMCID: PMC9279371 DOI: 10.1038/s41467-022-31748-9] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Received: 03/11/2022] [Accepted: 06/30/2022] [Indexed: 12/29/2022] Open
Abstract
Advances in microscopy instruments and image processing algorithms have led to an increasing number of cryo-electron microscopy (cryo-EM) maps. However, building accurate models into intermediate-resolution EM maps remains challenging and labor-intensive. Here, we propose an automatic model building method of multi-chain protein complexes from intermediate-resolution cryo-EM maps, named EMBuild, by integrating AlphaFold structure prediction, FFT-based global fitting, domain-based semi-flexible refinement, and graph-based iterative assembling on the main-chain probability map predicted by a deep convolutional network. EMBuild is extensively evaluated on diverse test sets of 47 single-particle EM maps at 4.0-8.0 Å resolution and 16 subtomogram averaging maps of cryo-ET data at 3.7-9.3 Å resolution, and compared with state-of-the-art approaches. We demonstrate that EMBuild is able to build high-quality complex structures that are comparably accurate to the manually built PDB structures from the cryo-EM maps. These results demonstrate the accuracy and reliability of EMBuild in automatic model building.
Affiliation(s)
- Jiahua He
- School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
| | - Peicong Lin
- School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
| | - Ji Chen
- School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
| | - Hong Cao
- School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
| | - Sheng-You Huang
- School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China.
| |
|
47
|
Alnabati E, Terashi G, Kihara D. Protein Structural Modeling for Electron Microscopy Maps Using VESPER and MAINMAST. Curr Protoc 2022; 2:e494. [PMID: 35849043 PMCID: PMC9299282 DOI: 10.1002/cpz1.494] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/22/2022]
Abstract
An increasing number of protein structures are determined by cryo-electron microscopy (cryo-EM) and stored in the Electron Microscopy Data Bank (EMDB). To interpret determined cryo-EM maps, several methods have been developed that model the tertiary structure of biomolecules, particularly proteins. Here we show how to use two such methods, VESPER and MAINMAST, which were developed in our group. VESPER is a method mainly for two purposes: fitting protein structure models into an EM map and aligning two EM maps locally or globally to capture their similarity. VESPER represents each EM map as a set of vectors pointing toward denser points. By considering matching the directions of vectors, in general, VESPER aligns maps better than conventional methods that only consider local densities of maps. MAINMAST is a de novo protein modeling tool designed for EM maps with resolution of 3-5 Å or better. MAINMAST builds a protein main chain directly from a density map by tracing dense points in an EM map and connecting them using a tree-graph structure. This article describes how to use these two tools using three illustrative modeling examples. © 2022 The Authors. Current Protocols published by Wiley Periodicals LLC. Basic Protocol 1: Protein structure model fitting using VESPER. Alternate Protocol: Atomic model fitting using VESPER web server. Basic Protocol 2: Protein de novo modeling using MAINMAST.
Affiliation(s)
- Eman Alnabati
- Department of Computer Science, Purdue University, West Lafayette, Indiana
| | - Genki Terashi
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana
| | - Daisuke Kihara
- Department of Computer Science, Purdue University, West Lafayette, Indiana
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana
| |
|
48
|
Zhu Z, Deng Z, Wang Q, Wang Y, Zhang D, Xu R, Guo L, Wen H. Simulation and Machine Learning Methods for Ion-Channel Structure Determination, Mechanistic Studies and Drug Design. Front Pharmacol 2022; 13:939555. [PMID: 35837274 PMCID: PMC9275593 DOI: 10.3389/fphar.2022.939555] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 05/09/2022] [Accepted: 06/07/2022] [Indexed: 11/13/2022] Open
Abstract
Ion channels are expressed in almost all living cells, where they control communication into and out of the cell, making them ideal drug targets, especially for central nervous system diseases. However, owing to their dynamic nature and the presence of a membrane environment, ion channels have remained difficult targets over the past decades. Recent advances in cryo-electron microscopy and computational methods have shed light on this issue. An explosion in high-resolution ion channel structures paved the way for structure-based rational drug design, and state-of-the-art simulation and machine learning techniques have dramatically improved the efficiency and effectiveness of computer-aided drug design. Here we present an overview of how simulation and machine learning-based methods have fundamentally changed ion channel-related drug design at different levels, as well as the emerging trends in the field.
Affiliation(s)
- Zhengdan Zhu
- Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- Beijing Institute of Big Data Research, Beijing, China
| | - Zhenfeng Deng
- DP Technology, Beijing, China
- School of Pharmaceutical Sciences, Peking University, Beijing, China
| | | | | | - Duo Zhang
- Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- DP Technology, Beijing, China
| | - Ruihan Xu
- DP Technology, Beijing, China
- National Engineering Research Center of Visual Technology, Peking University, Beijing, China
| | | | - Han Wen
- DP Technology, Beijing, China
| |
|
49
|
Chua EYD, Mendez JH, Rapp M, Ilca SL, Tan YZ, Maruthi K, Kuang H, Zimanyi CM, Cheng A, Eng ET, Noble AJ, Potter CS, Carragher B. Better, Faster, Cheaper: Recent Advances in Cryo-Electron Microscopy. Annu Rev Biochem 2022; 91:1-32. [PMID: 35320683 PMCID: PMC10393189 DOI: 10.1146/annurev-biochem-032620-110705] [Citation(s) in RCA: 69] [Impact Index Per Article: 23.0] [Indexed: 11/09/2022]
Abstract
Cryo-electron microscopy (cryo-EM) continues its remarkable growth as a method for visualizing biological objects, which has been driven by advances across the entire pipeline. Developments in both single-particle analysis and in situ tomography have enabled more structures to be imaged and determined to better resolutions, at faster speeds, and with more scientists having improved access. This review highlights recent advances at each stage of the cryo-EM pipeline and provides examples of how these techniques have been used to investigate real-world problems, including antibody development against the SARS-CoV-2 spike during the recent COVID-19 pandemic.
Affiliation(s)
- Eugene Y D Chua
- New York Structural Biology Center, New York, NY, USA
- Simons Electron Microscopy Center, New York, NY, USA
- National Center for CryoEM Access and Training, New York, NY, USA
| | - Joshua H Mendez
- New York Structural Biology Center, New York, NY, USA
- Simons Electron Microscopy Center, New York, NY, USA
- National Center for CryoEM Access and Training, New York, NY, USA
| | - Micah Rapp
- New York Structural Biology Center, New York, NY, USA
- Simons Electron Microscopy Center, New York, NY, USA
| | - Serban L Ilca
- New York Structural Biology Center, New York, NY, USA
- Simons Electron Microscopy Center, New York, NY, USA
| | - Yong Zi Tan
- Department of Biological Sciences, National University of Singapore, Singapore
- Disease Intervention Technology Laboratory, Agency for Science, Technology and Research (A*STAR), Singapore
| | - Kashyap Maruthi
- New York Structural Biology Center, New York, NY, USA
- Simons Electron Microscopy Center, New York, NY, USA
- National Resource for Automated Molecular Microscopy, New York, NY, USA
| | - Huihui Kuang
- New York Structural Biology Center, New York, NY, USA
- Simons Electron Microscopy Center, New York, NY, USA
- National Resource for Automated Molecular Microscopy, New York, NY, USA
| | - Christina M Zimanyi
- New York Structural Biology Center, New York, NY, USA
- Simons Electron Microscopy Center, New York, NY, USA
- National Center for CryoEM Access and Training, New York, NY, USA
| | - Anchi Cheng
- New York Structural Biology Center, New York, NY, USA
- Simons Electron Microscopy Center, New York, NY, USA
- National Resource for Automated Molecular Microscopy, New York, NY, USA
| | - Edward T Eng
- New York Structural Biology Center, New York, NY, USA
- Simons Electron Microscopy Center, New York, NY, USA
- National Center for CryoEM Access and Training, New York, NY, USA
| | - Alex J Noble
- New York Structural Biology Center, New York, NY, USA
- Simons Electron Microscopy Center, New York, NY, USA
- National Resource for Automated Molecular Microscopy, New York, NY, USA
- National Center for In-Situ Tomographic Ultramicroscopy, New York, NY, USA
- Simons Machine Learning Center, New York, NY, USA
| | - Clinton S Potter
- New York Structural Biology Center, New York, NY, USA
- Simons Electron Microscopy Center, New York, NY, USA
- National Center for CryoEM Access and Training, New York, NY, USA
- National Resource for Automated Molecular Microscopy, New York, NY, USA
- National Center for In-Situ Tomographic Ultramicroscopy, New York, NY, USA
- Simons Machine Learning Center, New York, NY, USA
| | - Bridget Carragher
- New York Structural Biology Center, New York, NY, USA
- Simons Electron Microscopy Center, New York, NY, USA
- National Center for CryoEM Access and Training, New York, NY, USA
- National Resource for Automated Molecular Microscopy, New York, NY, USA
- National Center for In-Situ Tomographic Ultramicroscopy, New York, NY, USA
- Simons Machine Learning Center, New York, NY, USA
| |
|
50
|
Hryc CF, Baker ML. Beyond the Backbone: The Next Generation of Pathwalking Utilities for Model Building in CryoEM Density Maps. Biomolecules 2022; 12:773. [PMID: 35740898 PMCID: PMC9220806 DOI: 10.3390/biom12060773] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 04/26/2022] [Revised: 05/25/2022] [Accepted: 05/30/2022] [Indexed: 01/18/2023] Open
Abstract
Single-particle electron cryomicroscopy (cryoEM) has become an indispensable tool for studying structure and function in macromolecular assemblies. As an integral part of the cryoEM structure determination process, computational tools have been developed to build atomic models directly from a density map without structural templates. Nearly a decade ago, we created Pathwalking, a tool for de novo modeling of protein structure in near-atomic resolution cryoEM density maps. Here, we present the latest developments in Pathwalking, including the addition of probabilistic models, as well as a companion tool for modeling waters and ligands. This software was evaluated on the 2021 CryoEM Ligand Challenge density maps and was also used to identify ligands in three IP3R1 density maps at ~3 Å to 4.1 Å resolution. The results clearly demonstrate that the Pathwalking de novo modeling pipeline can construct accurate protein structures and reliably localize and identify ligand density directly from a near-atomic resolution map.
Affiliation(s)
| | - Matthew L. Baker
- Department of Biochemistry and Molecular Biology, Structural Biology Imaging Center, McGovern Medical School, The University of Texas Health Science Center, 6431 Fannin Street, Houston, TX 77030, USA
| |
|