1. Zhang Z, Xu L, Zhang S, Peng C, Zhang G, Zhou X. DEMO-EMol: modeling protein-nucleic acid complex structures from cryo-EM maps by coupling chain assembly with map segmentation. Nucleic Acids Res 2025:gkaf416. [PMID: 40366028 DOI: 10.1093/nar/gkaf416] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2025] [Revised: 04/29/2025] [Accepted: 05/03/2025] [Indexed: 05/15/2025] Open
Abstract
Atomic structure modeling is a crucial step in determining the structures of protein complexes using cryo-electron microscopy (cryo-EM). This work introduces DEMO-EMol, an improved server that integrates deep learning-based map segmentation and chain fitting to accurately assemble protein-nucleic acid (NA) complex structures from cryo-EM density maps. Starting from a density map and independently modeled chain structures, DEMO-EMol first segments protein and NA regions from the density map using deep learning. The overall complex is then assembled by fitting protein and NA chain models into their respective segmented maps, followed by domain-level fitting and optimization for protein chains. The output of DEMO-EMol includes the final assembled complex model along with overall and residue-level quality assessments. DEMO-EMol was evaluated on a comprehensive benchmark set of cryo-EM maps with resolutions ranging from 1.96 to 12.77 Å, and the results demonstrated its superior performance over the state-of-the-art methods for both protein-NA and protein-protein complex modeling. The DEMO-EMol web server is freely accessible at https://zhanggroup.org/DEMO-EMol/.
Affiliation(s)
- Ziying Zhang
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Liang Xu
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Shuai Zhang
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Chunxiang Peng
  - Department of Biological Chemistry, University of Michigan, Ann Arbor, MI 48109, United States
- Guijun Zhang
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Xiaogen Zhou
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China

2. Cao H, He J, Li T, Huang SY. Deciphering Protein Secondary Structures and Nucleic Acids in Cryo-EM Maps Using Deep Learning. J Chem Inf Model 2025; 65:1641-1652. [PMID: 39838545 DOI: 10.1021/acs.jcim.4c01971] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/23/2025]
Abstract
With the resolution revolution of cryo-electron microscopy (cryo-EM) and the rapid development of image processing technology, cryo-EM has become an indispensable experimental method for determining the three-dimensional structures of biological macromolecules. However, structural modeling from cryo-EM maps remains a difficult task for intermediate-resolution maps. In such cases, detection of protein secondary structures and nucleic acid locations in an EM map is of great value for model building. To meet this need, we present EMInfo, a deep learning-based method for detecting protein secondary structures and nucleic acid locations in cryo-EM density maps. EMInfo was extensively evaluated on two protein-nucleic acid complex test sets, comprising intermediate-resolution and high-resolution experimental maps, and compared with two state-of-the-art methods, Emap2sec+ and Haruspex. The results show that EMInfo can accurately predict different structural categories in an EM map. EMInfo is freely available at http://huanglab.phys.hust.edu.cn/EMInfo/.
Affiliation(s)
- Hong Cao
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
- Jiahua He
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
- Tao Li
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
- Sheng-You Huang
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China

3. Li T, He J, Cao H, Zhang Y, Chen J, Xiao Y, Huang SY. All-atom RNA structure determination from cryo-EM maps. Nat Biotechnol 2025; 43:97-105. [PMID: 38396075 DOI: 10.1038/s41587-024-02149-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2023] [Accepted: 01/24/2024] [Indexed: 02/25/2024]
Abstract
Many methods exist for determining protein structures from cryogenic electron microscopy maps, but this remains challenging for RNA structures. Here we developed EMRNA, a method for accurate, automated determination of full-length all-atom RNA structures from cryogenic electron microscopy maps. EMRNA integrates deep learning-based detection of nucleotides, three-dimensional backbone tracing and scoring with consideration of sequence and secondary structure information, and full-atom construction of the RNA structure. We validated EMRNA on 140 diverse RNA maps ranging from 37 to 423 nt at 2.0-6.0 Å resolutions, and compared EMRNA with auto-DRRAFTER, phenix.map_to_model and CryoREAD on a set of 71 cases. EMRNA achieves a median accuracy of 2.36 Å root mean square deviation and 0.86 TM-score for full-length RNA structures, compared with 6.66 Å and 0.58 for auto-DRRAFTER. EMRNA also obtains a high residue coverage and sequence match of 93.30% and 95.30% in the built models, compared with 58.20% and 42.20% for phenix.map_to_model and 56.45% and 52.3% for CryoREAD. EMRNA is fast and can build an RNA structure of 100 nt within 3 min.
Affiliation(s)
- Tao Li
  - School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, China
- Jiahua He
  - School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, China
- Hong Cao
  - School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, China
- Yi Zhang
  - School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, China
- Ji Chen
  - School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, China
- Yi Xiao
  - School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, China
- Sheng-You Huang
  - School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, China

4. Mu Y, Nguyen T, Hawickhorst B, Wriggers W, Sun J, He J. The combined focal loss and dice loss function improves the segmentation of beta-sheets in medium-resolution cryo-electron-microscopy density maps. Bioinformatics Advances 2024; 4:vbae169. [PMID: 39600382 PMCID: PMC11590252 DOI: 10.1093/bioadv/vbae169] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2024] [Revised: 08/17/2024] [Accepted: 11/19/2024] [Indexed: 11/29/2024]
Abstract
Summary: Although multiple neural networks have been proposed for detecting secondary structures from medium-resolution (5-10 Å) cryo-electron microscopy (cryo-EM) maps, the loss functions used in the existing deep learning networks are primarily based on cross-entropy loss, which is known to be sensitive to class imbalances. We investigated five loss functions: cross-entropy, Focal loss, Dice loss, and two combined loss functions. Using a U-Net architecture in our DeepSSETracer method and a dataset composed of 1355 box-cropped atomic-structure/density-map pairs, we found that a newly designed loss function that combines Focal loss and Dice loss provides the best overall detection accuracy for secondary structures. For β-sheet voxels, which are generally much harder to detect than helix voxels, the combined loss function achieved a significant improvement (an 8.8% increase in the F1 score) compared to the cross-entropy loss function and a noticeable improvement from the Dice loss function. This study demonstrates the potential for designing more effective loss functions for hard cases in the segmentation of secondary structures. The newly trained model was incorporated into DeepSSETracer 1.1 for the segmentation of protein secondary structures in medium-resolution cryo-EM map components. DeepSSETracer can be integrated into ChimeraX, a popular molecular visualization software.
Availability and implementation: https://www.cs.odu.edu/~bioinfo/B2I_Tools/.
Affiliation(s)
- Yongcheng Mu
  - Department of Computer Science, Old Dominion University, Norfolk, VA 23529, United States
- Thu Nguyen
  - Department of Computer Science, Old Dominion University, Norfolk, VA 23529, United States
- Bryan Hawickhorst
  - Department of Computer Science, Old Dominion University, Norfolk, VA 23529, United States
- Willy Wriggers
  - Department of Mechanical and Aerospace Engineering, Old Dominion University, Norfolk, VA 23529, United States
- Jiangwen Sun
  - Department of Computer Science, Old Dominion University, Norfolk, VA 23529, United States
- Jing He
  - Department of Computer Science, Old Dominion University, Norfolk, VA 23529, United States

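The idea in entry 4 of combining Focal loss with Dice loss can be illustrated with a minimal per-voxel sketch for a binary label (for example, β-sheet versus background). The abstract does not give the exact formulation, so the equal `weight`, the `gamma` value, and the `smooth` term below are assumptions chosen for illustration, not the settings used in DeepSSETracer.

```python
import math

def focal_loss(probs, labels, gamma=2.0, eps=1e-7):
    """Mean binary focal loss; with gamma=0 it reduces to cross-entropy."""
    total = 0.0
    for p, y in zip(probs, labels):
        p_t = p if y == 1 else 1.0 - p          # probability given to the true class
        p_t = min(max(p_t, eps), 1.0 - eps)     # clamp to avoid log(0)
        total += -((1.0 - p_t) ** gamma) * math.log(p_t)
    return total / len(probs)

def dice_loss(probs, labels, smooth=1.0):
    """Soft Dice loss: 1 - (2*overlap + smooth) / (sum(p) + sum(y) + smooth)."""
    overlap = sum(p * y for p, y in zip(probs, labels))
    denom = sum(probs) + sum(labels)
    return 1.0 - (2.0 * overlap + smooth) / (denom + smooth)

def combined_loss(probs, labels, weight=0.5, gamma=2.0):
    """Weighted sum of focal and Dice loss (the weighting is an assumption)."""
    return (weight * focal_loss(probs, labels, gamma)
            + (1.0 - weight) * dice_loss(probs, labels))

# Hypothetical voxel predictions for a rare positive class.
labels = [1, 0, 0, 0]
good = [0.9, 0.1, 0.1, 0.1]
bad = [0.1, 0.9, 0.9, 0.9]
print(combined_loss(good, labels), combined_loss(bad, labels))
```

The focal term down-weights voxels that are already classified well, while the Dice term scores overlap directly and is therefore less dominated by the majority class; this is the usual motivation for pairing them on imbalanced segmentation problems.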
5. Li T, Cao H, He J, Huang SY. Automated detection and de novo structure modeling of nucleic acids from cryo-EM maps. Nat Commun 2024; 15:9367. [PMID: 39477926 PMCID: PMC11525807 DOI: 10.1038/s41467-024-53721-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2024] [Accepted: 10/18/2024] [Indexed: 11/02/2024] Open
Abstract
Cryo-electron microscopy (cryo-EM) is one of the most powerful experimental methods for macromolecular structure determination. However, accurate DNA/RNA structure modeling from cryo-EM maps is still challenging especially for protein-DNA/RNA or multi-chain DNA/RNA complexes. Here we propose a deep learning-based method for accurate de novo structure determination of DNA/RNA from cryo-EM maps at <5 Å resolutions, which is referred to as EM2NA. EM2NA is extensively evaluated on a diverse test set of 50 experimental maps at 2.0-5.0 Å resolutions, and compared with state-of-the-art methods including CryoREAD, ModelAngelo, and phenix.map_to_model. On average, EM2NA achieves a residue coverage of 83.15%, C4' RMSD of 1.06 Å, and sequence recall of 46.86%, which outperforms the existing methods. Moreover, EM2NA is applied to build the DNA/RNA structures with 10 to 5347 nt from an EMDB-wide data set of 263 unmodeled raw maps, demonstrating its ability in the blind model building of DNA/RNA from cryo-EM maps. EM2NA is fast and can normally build a DNA/RNA structure of <500 nt within 10 minutes.
Affiliation(s)
- Tao Li
  - School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, China
- Hong Cao
  - School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, China
- Jiahua He
  - School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, China
- Sheng-You Huang
  - School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, China

6. Song X, Bao L, Feng C, Huang Q, Zhang F, Gao X, Han R. Accurate Prediction of Protein Structural Flexibility by Deep Learning Integrating Intricate Atomic Structures and Cryo-EM Density Information. Nat Commun 2024; 15:5538. [PMID: 38956032 PMCID: PMC11219796 DOI: 10.1038/s41467-024-49858-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2023] [Accepted: 06/20/2024] [Indexed: 07/04/2024] Open
Abstract
The dynamics of proteins are crucial for understanding their mechanisms. However, computationally predicting protein dynamic information has proven challenging. Here, we propose a neural network model, RMSF-net, which outperforms previous methods and produces the best results in a large-scale protein dynamics dataset; this model can accurately infer the dynamic information of a protein in only a few seconds. By learning effectively from experimental protein structure data and cryo-electron microscopy (cryo-EM) data integration, our approach is able to accurately identify the interactive bidirectional constraints and supervision between cryo-EM maps and PDB models in maximizing the dynamic prediction efficacy. Rigorous 5-fold cross-validation on the dataset demonstrates that RMSF-net achieves test correlation coefficients of 0.746 ± 0.127 at the voxel level and 0.765 ± 0.109 at the residue level, showcasing its ability to deliver dynamic predictions closely approximating molecular dynamics simulations. Additionally, it offers real-time dynamic inference with minimal storage overhead on the order of megabytes. RMSF-net is a freely accessible tool and is anticipated to play an essential role in the study of protein dynamics.
Affiliation(s)
- Xintao Song
  - Research Center for Mathematics and Interdisciplinary Sciences (Ministry of Education Frontiers Science Center for Nonlinear Expectations), Shandong University, Qingdao, China
  - BioMap Research, Menlo Park, CA, USA
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal, Saudi Arabia
- Lei Bao
  - School of Public Health, Hubei University of Medicine, Shiyan, China
- Chenjie Feng
  - College of Medical Information and Engineering, Ningxia Medical University, Yinchuan, China
- Qiang Huang
  - Research Center for Mathematics and Interdisciplinary Sciences (Ministry of Education Frontiers Science Center for Nonlinear Expectations), Shandong University, Qingdao, China
- Fa Zhang
  - School of Medical Technology, Beijing Institute of Technology, Beijing, China
- Xin Gao
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal, Saudi Arabia
- Renmin Han
  - Research Center for Mathematics and Interdisciplinary Sciences (Ministry of Education Frontiers Science Center for Nonlinear Expectations), Shandong University, Qingdao, China
  - BioMap Research, Menlo Park, CA, USA

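Entry 6 reports residue-level correlation coefficients between predicted and reference flexibility. As a rough sketch of how such a score could be computed, the snippet below evaluates a Pearson correlation over per-residue values. The abstract does not state which correlation coefficient is used, so Pearson is an assumption here, and the RMSF numbers are invented for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-residue RMSF values in Å (not taken from the paper).
reference_rmsf = [0.8, 1.1, 2.4, 3.0, 1.5]
predicted_rmsf = [0.9, 1.0, 2.0, 3.3, 1.4]
print(round(pearson(reference_rmsf, predicted_rmsf), 3))
```

Averaging this score over all proteins in each cross-validation fold would give a summary of the "mean ± standard deviation" kind quoted in the abstract.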
7. Giri N, Wang L, Cheng J. Cryo2StructData: A Large Labeled Cryo-EM Density Map Dataset for AI-based Modeling of Protein Structures. Sci Data 2024; 11:458. [PMID: 38710720 PMCID: PMC11074267 DOI: 10.1038/s41597-024-03299-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Accepted: 04/23/2024] [Indexed: 05/08/2024] Open
Abstract
The advent of single-particle cryo-electron microscopy (cryo-EM) has brought forth a new era of structural biology, enabling the routine determination of large biological molecules and their complexes at atomic resolution. The high-resolution structures of biological macromolecules and their complexes significantly expedite biomedical research and drug discovery. However, automatically and accurately building atomic models from high-resolution cryo-EM density maps is still time-consuming and challenging when template-based models are unavailable. Artificial intelligence (AI) methods such as deep learning, when trained on a limited amount of labeled cryo-EM density maps, generate inaccurate atomic models. To address this issue, we created a dataset called Cryo2StructData consisting of 7,600 preprocessed cryo-EM density maps whose voxels are labelled according to their corresponding known atomic structures for training and testing AI methods to build atomic models from cryo-EM density maps. Cryo2StructData is larger than existing, publicly available datasets for training AI methods to build atomic protein structures from cryo-EM density maps. We trained and tested deep learning models on Cryo2StructData to validate its quality, showing that it is ready to be used to train and test AI methods for building atomic models.
Affiliation(s)
- Nabin Giri
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, 65211, USA
  - Roy Blunt NextGen Precision Health, University of Missouri, Columbia, MO, 65211, USA
- Liguo Wang
  - Laboratory for BioMolecular Structure (LBMS), Brookhaven National Laboratory, Upton, NY, 11973, USA
- Jianlin Cheng
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, 65211, USA
  - Roy Blunt NextGen Precision Health, University of Missouri, Columbia, MO, 65211, USA

8. Giri N, Wang L, Cheng J. Cryo2StructData: A Large Labeled Cryo-EM Density Map Dataset for AI-based Modeling of Protein Structures. bioRxiv 2024:2023.06.14.545024. [PMID: 37398020 PMCID: PMC10312718 DOI: 10.1101/2023.06.14.545024] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 07/04/2023]
Abstract
The advent of single-particle cryo-electron microscopy (cryo-EM) has brought forth a new era of structural biology, enabling the routine determination of large biological molecules and their complexes at atomic resolution. The high-resolution structures of biological macromolecules and their complexes significantly expedite biomedical research and drug discovery. However, automatically and accurately building atomic models from high-resolution cryo-EM density maps is still time-consuming and challenging when template-based models are unavailable. Artificial intelligence (AI) methods such as deep learning, when trained on a limited amount of labeled cryo-EM density maps, generate inaccurate atomic models. To address this issue, we created a dataset called Cryo2StructData consisting of 7,600 preprocessed cryo-EM density maps whose voxels are labelled according to their corresponding known atomic structures for training and testing AI methods to build atomic models from cryo-EM density maps. It is larger and of higher quality than any existing, publicly available dataset. We trained and tested deep learning models on Cryo2StructData to make sure it is ready for the large-scale development of AI methods for building atomic models from cryo-EM density maps.
Affiliation(s)
- Nabin Giri
  - University of Missouri, Electrical Engineering and Computer Science, Columbia, 65211, USA
  - NextGen Precision Health Institute, Columbia, 65211, USA
- Liguo Wang
  - Laboratory for Biological Structure, Brookhaven National Laboratory, Upton, NY, 11973, USA
- Jianlin Cheng
  - University of Missouri, Electrical Engineering and Computer Science, Columbia, 65211, USA
  - NextGen Precision Health Institute, Columbia, 65211, USA

9. Dai X, Wu L, Yoo S, Liu Q. Integrating AlphaFold and deep learning for atomistic interpretation of cryo-EM maps. Brief Bioinform 2023; 24:bbad405. [PMID: 37982712 DOI: 10.1093/bib/bbad405] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Revised: 10/09/2023] [Accepted: 10/23/2023] [Indexed: 11/21/2023] Open
Abstract
Interpretation of cryo-electron microscopy (cryo-EM) maps requires building and fitting 3D atomic models of biological molecules. AlphaFold-predicted models generate initial 3D coordinates; however, model inaccuracy and conformational heterogeneity often necessitate labor-intensive manual model building and fitting into cryo-EM maps. In this work, we designed a protein model-building workflow, which combines a deep-learning cryo-EM map feature enhancement tool, CryoFEM (Cryo-EM Feature Enhancement Model) and AlphaFold. A benchmark test using 36 cryo-EM maps shows that CryoFEM achieves state-of-the-art performance in optimizing the Fourier Shell Correlations between the maps and the ground truth models. Furthermore, in a subset of 17 datasets where the initial AlphaFold predictions are less accurate, the workflow significantly improves their model accuracy. Our work demonstrates that the integration of modern deep learning image enhancement and AlphaFold may lead to automated model building and fitting for the atomistic interpretation of cryo-EM maps.
Affiliation(s)
- Xin Dai
  - Computational Science Initiative, Brookhaven National Laboratory, Upton, NY, USA
- Longlong Wu
  - Condensed Matter Physics and Materials Science Department, Brookhaven National Laboratory, Upton, NY, USA
- Shinjae Yoo
  - Computational Science Initiative, Brookhaven National Laboratory, Upton, NY, USA
- Qun Liu
  - Biology Department, Brookhaven National Laboratory, Upton, NY, USA

10. DiIorio MC, Kulczyk AW. Novel Artificial Intelligence-Based Approaches for Ab Initio Structure Determination and Atomic Model Building for Cryo-Electron Microscopy. Micromachines 2023; 14:1674. [PMID: 37763837 PMCID: PMC10534518 DOI: 10.3390/mi14091674] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/02/2023] [Revised: 08/21/2023] [Accepted: 08/25/2023] [Indexed: 09/29/2023]
Abstract
Single particle cryo-electron microscopy (cryo-EM) has emerged as the prevailing method for near-atomic structure determination, shedding light on the important molecular mechanisms of biological macromolecules. However, the inherent dynamics and structural variability of biological complexes coupled with the large number of experimental images generated by a cryo-EM experiment make data processing nontrivial. In particular, ab initio reconstruction and atomic model building remain major bottlenecks that demand substantial computational resources and manual intervention. Approaches utilizing recent innovations in artificial intelligence (AI) technology, particularly deep learning, have the potential to overcome the limitations that cannot be adequately addressed by traditional image processing approaches. Here, we review newly proposed AI-based methods for ab initio volume generation, heterogeneous 3D reconstruction, and atomic model building. We highlight the advancements made by the implementation of AI methods, as well as discuss remaining limitations and areas for future development.
Affiliation(s)
- Megan C. DiIorio
  - Institute for Quantitative Biomedicine, Rutgers University, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Arkadiusz W. Kulczyk
  - Institute for Quantitative Biomedicine, Rutgers University, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
  - Department of Biochemistry & Microbiology, Rutgers University, 76 Lipman Drive, New Brunswick, NJ 08901, USA

11. He J, Li T, Huang SY. Improvement of cryo-EM maps by simultaneous local and non-local deep learning. Nat Commun 2023; 14:3217. [PMID: 37270635 DOI: 10.1038/s41467-023-39031-1] [Citation(s) in RCA: 68] [Impact Index Per Article: 34.0] [Received: 08/03/2022] [Accepted: 05/25/2023] [Indexed: 06/05/2023] Open
Abstract
Cryo-EM has emerged as the most important technique for structure determination of macromolecular complexes. However, raw cryo-EM maps often exhibit loss of contrast at high resolution and heterogeneity over the entire map. As such, various post-processing methods have been proposed to improve cryo-EM maps. Nevertheless, it is still challenging to improve both the quality and interpretability of EM maps. Addressing the challenge, we present a three-dimensional Swin-Conv-UNet-based deep learning framework to improve cryo-EM maps, named EMReady, by not only implementing both local and non-local modeling modules in a multiscale UNet architecture but also simultaneously minimizing the local smooth L1 distance and maximizing the non-local structural similarity between processed experimental and simulated target maps in the loss function. EMReady was extensively evaluated on diverse test sets of 110 primary cryo-EM maps and 25 pairs of half-maps at 3.0-6.0 Å resolutions, and compared with five state-of-the-art map post-processing methods. It is shown that EMReady can not only robustly enhance the quality of cryo-EM maps in terms of map-model correlations, but also improve the interpretability of the maps in automatic de novo model building.
Affiliation(s)
- Jiahua He
  - School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, China
- Tao Li
  - School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, China
- Sheng-You Huang
  - School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, China

12. Giri N, Roy RS, Cheng J. Deep learning for reconstructing protein structures from cryo-EM density maps: Recent advances and future directions. Curr Opin Struct Biol 2023; 79:102536. [PMID: 36773336 PMCID: PMC10023387 DOI: 10.1016/j.sbi.2023.102536] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 09/14/2022] [Revised: 12/20/2022] [Accepted: 01/03/2023] [Indexed: 02/11/2023]
Abstract
Cryo-Electron Microscopy (cryo-EM) has emerged in recent years as a key technology for determining the structure of proteins, particularly large protein complexes and assemblies. A key challenge in cryo-EM data analysis is to automatically reconstruct accurate protein structures from cryo-EM density maps. In this review, we briefly overview various deep learning methods for building protein structures from cryo-EM density maps, analyze their impact, and discuss the challenges of preparing high-quality data sets for training deep learning models. Looking into the future, more advanced deep learning models that effectively integrate cryo-EM data with other sources of complementary data, such as protein sequences and AlphaFold-predicted structures, need to be developed to further advance the field.
Affiliation(s)
- Nabin Giri
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, 65211, Missouri, USA; NextGen Precision Health, University of Missouri, Columbia, 65211, Missouri, USA. https://twitter.com/@nvngiri
- Raj S Roy
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, 65211, Missouri, USA. https://twitter.com/@rajshekhorroy
- Jianlin Cheng
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, 65211, Missouri, USA; NextGen Precision Health, University of Missouri, Columbia, 65211, Missouri, USA

13. Garcia Condado J, Muñoz-Barrutia A, Sorzano COS. Automatic determination of the handedness of single-particle maps of macromolecules solved by CryoEM. J Struct Biol 2022; 214:107915. [PMID: 36341955 DOI: 10.1016/j.jsb.2022.107915] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2022] [Revised: 08/29/2022] [Accepted: 10/25/2022] [Indexed: 12/07/2022]
Abstract
Single-Particle Analysis by Cryo-Electron Microscopy is a well-established technique to elucidate the three-dimensional (3D) structure of biological macromolecules. The orientation of the acquired projection images must be initially estimated without any reference to the final structure. In this step, algorithms may find a mirrored version of all the orientations resulting in a mirrored 3D map. It is as compatible with the acquired images as its unmirrored version from the image processing point of view, only that it is not biologically plausible. In this article, we introduce HaPi (Handedness Pipeline), the first method to automatically determine the hand of electron density maps of macromolecules solved by CryoEM. HaPi is built by training two 3D convolutional neural networks. The first determines α-helices in a map, and the second determines whether the α-helix is left-handed or right-handed. A consensus strategy defines the overall map hand. The pipeline is trained on simulated and experimental data. The handedness can be detected only for maps whose resolution is better than 5 Å. HaPi can identify the hand in 89% of new simulated maps correctly. Moreover, we evaluated all the maps deposited at the Electron Microscopy Data Bank and 11 structures uploaded with the incorrect hand were identified.
Affiliation(s)
- J Garcia Condado
  - Biocruces Bizkaia Instituto Investigación Sanitaria, Cruces Plaza, 48903 Barakaldo, Bizkaia, Spain; Universidad Carlos III de Madrid, Avda. de la Universidad 30, 28911 Leganés, Madrid, Spain; Centro Nacional de Biotecnologia (CNB-CSIC), Darwin, 3, Campus Universidad Autonoma, 28049 Cantoblanco, Madrid, Spain
- A Muñoz-Barrutia
  - Universidad Carlos III de Madrid, Avda. de la Universidad 30, 28911 Leganés, Madrid, Spain
- C O S Sorzano
  - Centro Nacional de Biotecnologia (CNB-CSIC), Darwin, 3, Campus Universidad Autonoma, 28049 Cantoblanco, Madrid, Spain

14. He J, Lin P, Chen J, Cao H, Huang SY. Model building of protein complexes from intermediate-resolution cryo-EM maps with deep learning-guided automatic assembly. Nat Commun 2022; 13:4066. [PMID: 35831370 PMCID: PMC9279371 DOI: 10.1038/s41467-022-31748-9] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Received: 03/11/2022] [Accepted: 06/30/2022] [Indexed: 12/29/2022] Open
Abstract
Advances in microscopy instruments and image processing algorithms have led to an increasing number of cryo-electron microscopy (cryo-EM) maps. However, building accurate models into intermediate-resolution EM maps remains challenging and labor-intensive. Here, we propose an automatic model building method of multi-chain protein complexes from intermediate-resolution cryo-EM maps, named EMBuild, by integrating AlphaFold structure prediction, FFT-based global fitting, domain-based semi-flexible refinement, and graph-based iterative assembling on the main-chain probability map predicted by a deep convolutional network. EMBuild is extensively evaluated on diverse test sets of 47 single-particle EM maps at 4.0-8.0 Å resolution and 16 subtomogram averaging maps of cryo-ET data at 3.7-9.3 Å resolution, and compared with state-of-the-art approaches. We demonstrate that EMBuild is able to build high-quality complex structures that are comparably accurate to the manually built PDB structures from the cryo-EM maps. These results demonstrate the accuracy and reliability of EMBuild in automatic model building.
Affiliation(s)
- Jiahua He
- School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
| | - Peicong Lin
- School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
| | - Ji Chen
- School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
| | - Hong Cao
- School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
| | - Sheng-You Huang
- School of Physics and Key Laboratory of Molecular Biophysics of MOE, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China.
| |
|
15
|
Thorn A. Artificial intelligence in the experimental determination and prediction of macromolecular structures. Curr Opin Struct Biol 2022; 74:102368. [DOI: 10.1016/j.sbi.2022.102368] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/11/2021] [Revised: 02/22/2022] [Accepted: 03/08/2022] [Indexed: 11/26/2022]
|
16
|
Behkamal B, Naghibzadeh M, Pagnani A, Saberi MR, Al Nasr K. LPTD: a novel linear programming-based topology determination method for cryo-EM maps. Bioinformatics 2022; 38:2734-2741. [PMID: 35561171 PMCID: PMC9306757 DOI: 10.1093/bioinformatics/btac170] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2021] [Revised: 03/01/2022] [Accepted: 03/18/2022] [Indexed: 02/03/2023] Open
Abstract
SUMMARY Topology determination is one of the most important intermediate steps toward building the atomic structure of proteins from their medium-resolution cryo-electron microscopy (cryo-EM) map. The main goal in the topology determination is to identify correct matches (i.e. assignment and direction) between secondary structure elements (SSEs) (α-helices and β-sheets) detected in a protein sequence and cryo-EM density map. Despite many recent advances in molecular biology technologies, the problem remains a challenging issue. To overcome the problem, this article proposes a linear programming-based topology determination (LPTD) method to solve the secondary structure topology problem in three-dimensional geometrical space. Through modeling of the protein's sequence with the aid of extracting highly reliable features and a distance-based scoring function, the secondary structure matching problem is transformed into a complete weighted bipartite graph matching problem. Subsequently, an algorithm based on linear programming is developed as a decision-making strategy to extract the true topology (native topology) between all possible topologies. The proposed automatic framework is verified using 12 experimental and 15 simulated α-β proteins. Results demonstrate that LPTD is highly efficient and extremely fast in such a way that for 77% of cases in the dataset, the native topology has been detected in the first rank topology in <2 s. Besides, this method is able to successfully handle large complex proteins with as many as 65 SSEs. Such a large number of SSEs have never been solved with current tools/methods. AVAILABILITY AND IMPLEMENTATION The LPTD package (source code and data) is publicly available at https://github.com/B-Behkamal/LPTD. Moreover, two test samples as well as the instruction of utilizing the graphical user interface have been provided in the shared readme file. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
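The matching step described in this abstract can be illustrated with a toy example. The sketch below is not the LPTD algorithm: it replaces the linear programming solver with brute-force enumeration, ignores SSE direction, and uses an invented cost matrix. The name `rank_topologies` and all scores are hypothetical and serve only to show what ranking candidate topologies of a weighted bipartite matching looks like.

```python
from itertools import permutations

def rank_topologies(cost):
    """Rank every one-to-one assignment of sequence SSEs to map SSEs.

    cost[i][j] is a hypothetical mismatch score between sequence SSE i
    and density-map SSE j (lower is better). Brute force is used here
    instead of linear programming, so this only scales to tiny inputs.
    """
    n = len(cost)
    scored = []
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        scored.append((total, list(perm)))
    scored.sort()  # lowest total cost first, i.e. the first-rank topology
    return scored

# Invented scores for three sequence SSEs (rows) and three map SSEs (columns).
cost = [
    [4.0, 1.0, 5.0],
    [0.5, 3.0, 6.0],
    [2.0, 4.0, 1.5],
]

ranked = rank_topologies(cost)
best_total, best_assignment = ranked[0]
print(best_assignment, best_total)  # [1, 0, 2] 3.0
```

Enumeration grows factorially with the number of SSEs, which is why an optimization formulation such as the linear programming approach named in the abstract is needed for proteins with dozens of SSEs.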
Affiliation(s)
- Bahareh Behkamal
- Department of Computer Engineering, Faculty of Engineering, Ferdowsi University of Mashhad, Mashhad 9177948974, Iran
| | - Mahmoud Naghibzadeh
- Department of Computer Engineering, Faculty of Engineering, Ferdowsi University of Mashhad, Mashhad 9177948974, Iran
| | - Andrea Pagnani
- Department of Applied Science and Technology (DISAT), Politecnico di Torino, Torino I-10129, Italy
- Italian Institute for Genomic Medicine (IIGM), IRCC-Candiolo, Candiolo (TO) I-10060, Italy
- INFN Sezione di Torino, Torino I-10125, Italy
| | - Mohammad Reza Saberi
- Medicinal Chemistry Department, School of Pharmacy, Mashhad University of Medical Sciences, Mashhad 9177899191, Iran
- Bioinformatics Research Group, Mashhad University of Medical Sciences, Mashhad 9177899191, Iran
| | - Kamal Al Nasr
- Department of Computer Science, Tennessee State University, Nashville, TN 37209, USA
| |
|
17
|
Wu JG, Yan Y, Zhang DX, Liu BW, Zheng QB, Xie XL, Liu SQ, Ge SX, Hou ZG, Xia NS. Machine Learning for Structure Determination in Single-Particle Cryo-Electron Microscopy: A Systematic Review. IEEE Trans Neural Netw Learn Syst 2022; 33:452-472. [PMID: 34932487 DOI: 10.1109/tnnls.2021.3131325] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 06/14/2023]
Abstract
Recently, single-particle cryo-electron microscopy (cryo-EM) has become an indispensable method for determining macromolecular structures at high resolution to deeply explore the relevant molecular mechanism. Its recent breakthrough is mainly because of the rapid advances in hardware and image processing algorithms, especially machine learning. As an essential support of single-particle cryo-EM, machine learning has powered many aspects of structure determination and greatly promoted its development. In this article, we provide a systematic review of the applications of machine learning in this field. Our review begins with a brief introduction of single-particle cryo-EM, followed by the specific tasks and challenges of its image processing. Then, focusing on the workflow of structure determination, we describe relevant machine learning algorithms and applications at different steps, including particle picking, 2-D clustering, 3-D reconstruction, and other steps. As different tasks exhibit distinct characteristics, we introduce the evaluation metrics for each task and summarize their dynamics of technology development. Finally, we discuss the open issues and potential trends in this promising field.
|
18
|
Chen YX, Xie R, Yang Y, He L, Feng D, Shen HB. Fast Cryo-EM Image Alignment Algorithm Using Power Spectrum Features. J Chem Inf Model 2021; 61:4795-4806. [PMID: 34523929 DOI: 10.1021/acs.jcim.1c00745] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
Cryo-electron microscopy (cryo-EM) single-particle image analysis is a powerful technique to resolve structures of biomacromolecules, while the challenge is that the cryo-EM image is of a low signal-to-noise ratio. For both two-dimensional image analysis and three-dimensional density map analysis, image alignment is an important step to improve the precision of the image distance calculation. In this paper, we introduce a new algorithm for performing two-dimensional pairwise alignment for cryo-EM particle images, which is based on the Fourier transform and power spectrum analysis. Compared to the existing heuristic iterative alignment methods, our method utilizes the signal distribution and signal feature on images' power spectrum to directly compute the alignment parameters. It does not require iterative computations and is robust against the cryo-EM image noise. Both theoretical analysis and experimental results suggest that our power-spectrum-feature-based alignment method is highly computational-efficient and is capable of offering effective alignment results. This new alignment algorithm is publicly available at: www.csbio.sjtu.edu.cn/bioinf/EMAF/ for academic use.
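The abstract does not spell out the algorithm, so the following is only a sketch of one general property that makes power spectra useful in alignment problems: the power spectrum of a signal does not change when the signal is circularly shifted. The example uses a one-dimensional signal and a naive discrete Fourier transform from the standard library; the helper names `power_spectrum` and `circular_shift` and the sample values are assumptions made for illustration, not part of the published method.

```python
import cmath

def power_spectrum(signal):
    """Squared magnitude of each DFT coefficient (naive O(n^2) transform)."""
    n = len(signal)
    spectrum = []
    for k in range(n):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        spectrum.append(abs(coeff) ** 2)
    return spectrum

def circular_shift(signal, shift):
    """Move every sample `shift` positions to the right, wrapping around."""
    shift %= len(signal)
    return signal[-shift:] + signal[:-shift]

# Invented 1-D "image row" and a translated copy of it.
signal = [0.0, 1.0, 3.0, 2.0, 0.0, 0.0, 1.0, 0.0]
shifted = circular_shift(signal, 3)

# A shift only changes the phase of each Fourier coefficient, so the
# two power spectra agree up to floating-point rounding.
same = all(
    abs(a - b) < 1e-9
    for a, b in zip(power_spectrum(signal), power_spectrum(shifted))
)
print(same)  # True
```

Because translation drops out of the power spectrum, features computed there can be compared without first solving for the shift, which is consistent with the non-iterative character the abstract claims.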
Affiliation(s)
- Yu-Xuan Chen
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
| | - Rui Xie
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
| | - Yang Yang
- Department of Computer Science, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Lin He
- Instrumental Analysis Center, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Dagan Feng
- School of Computer Science, University of Sydney, Sydney 2006, Australia
| | - Hong-Bin Shen
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
| |
|
19
|
He J, Huang SY. Full-length de novo protein structure determination from cryo-EM maps using deep learning. Bioinformatics 2021; 37:3480-3490. [PMID: 33978686 DOI: 10.1093/bioinformatics/btab357] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 12/22/2020] [Revised: 04/03/2021] [Accepted: 05/08/2021] [Indexed: 12/11/2022] Open
Abstract
MOTIVATION Advances in microscopy instruments and image processing algorithms have led to an increasing number of cryo-EM maps. However, building accurate models for the EM maps at 3-5 Å resolution remains a challenging and time-consuming process. With the rapid growth of deposited EM maps, there is an increasing gap between the maps and reconstructed/modeled 3-dimensional (3D) structures. Therefore, automatic reconstruction of atomic-accuracy full-atom structures from EM maps is pressingly needed. RESULTS We present a semi-automatic de novo structure determination method using a deep learning-based framework, named as DeepMM, which builds atomic-accuracy all-atom models from cryo-EM maps at near-atomic resolution. In our method, the main-chain and Cα positions as well as their amino acid and secondary structure types are predicted in the EM map using Densely Connected Convolutional Networks. DeepMM was extensively validated on 40 simulated maps at 5 Å resolution and 30 experimental maps at 2.6-4.8 Å resolution as well as an EMDB-wide data set of 2931 experimental maps at 2.6-4.9 Å resolution, and compared with state-of-the-art algorithms including RosettaES, MAINMAST, and Phenix. Overall, our DeepMM algorithm obtained a significant improvement over existing methods in terms of both accuracy and coverage in building full-length protein structures on all test sets, demonstrating the efficacy and general applicability of DeepMM. AVAILABILITY http://huanglab.phys.hust.edu.cn/DeepMM. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jiahua He
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
| | - Sheng-You Huang
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
| |
|