1. Zhu H, Terashi G, Farheen F, Nakamura T, Kihara D. AI-based quality assessment methods for protein structure models from cryo-EM. Curr Res Struct Biol 2025; 9:100164. [PMID: 39996138 PMCID: PMC11848767 DOI: 10.1016/j.crstbi.2025.100164] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/16/2024] [Revised: 01/23/2025] [Accepted: 01/29/2025] [Indexed: 02/26/2025]
Abstract
Cryogenic electron microscopy (cryo-EM) has revolutionized structural biology, with an increasing number of structures being determined by cryo-EM each year, many at higher resolutions. However, challenges remain in accurately interpreting cryo-EM maps. Inaccuracies can arise in regions of locally low resolution, where manual model building is more prone to errors. Validation scores for structure models have been developed to assess both the compatibility between the map density and the structure and the geometric and stereochemical properties of protein models. Recent advancements have introduced artificial intelligence (AI) into this field. These emerging AI-driven tools offer unique capabilities in the validation and refinement of cryo-EM-derived protein atomic models, potentially leading to more accurate protein structures and deeper insights into complex biological systems.
Affiliation(s)
- Han Zhu
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Genki Terashi
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Farhanaz Farheen
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Tsukasa Nakamura
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
  - Structural Biology Research Center, High Energy Accelerator Research Organization (KEK), Tsukuba, Ibaraki, 305-0801, Japan
- Daisuke Kihara
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
2. Cao H, He J, Li T, Huang SY. Deciphering Protein Secondary Structures and Nucleic Acids in Cryo-EM Maps Using Deep Learning. J Chem Inf Model 2025; 65:1641-1652. [PMID: 39838545 DOI: 10.1021/acs.jcim.4c01971] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/23/2025]
Abstract
With the resolution revolution of cryo-electron microscopy (cryo-EM) and the rapid development of image processing technology, cryo-EM has become an indispensable experimental method for determining the three-dimensional structures of biological macromolecules. However, structural modeling from cryo-EM maps remains a difficult task for intermediate-resolution maps. In such cases, detection of protein secondary structures and nucleic acid locations in an EM map is of great value for model building of the map. To meet this need, we present a deep learning-based method for detecting protein secondary structures and nucleic acid locations in cryo-EM density maps, named EMInfo. EMInfo was extensively evaluated on two protein-nucleic acid complex test sets, comprising intermediate-resolution and high-resolution experimental maps, and compared with two state-of-the-art methods, Emap2sec+ and Haruspex. It is shown that EMInfo can accurately predict different structural categories in an EM map. EMInfo is freely available at http://huanglab.phys.hust.edu.cn/EMInfo/.
Affiliation(s)
- Hong Cao
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
- Jiahua He
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
- Tao Li
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
- Sheng-You Huang
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
3. Punuru P, Jain A, Kihara D. Secondary Structure Detection and Structure Modeling for Cryo-EM. Methods Mol Biol 2025; 2870:341-355. [PMID: 39543043 DOI: 10.1007/978-1-0716-4213-9_17] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2024]
Abstract
Rapid advancements in cryogenic electron microscopy (cryo-EM) have revolutionized the field of structural biology by enabling the determination of complex macromolecular structures at unprecedented resolutions. When cryo-EM density maps have a resolution around 3 Å, the atomic structure can be modeled manually. However, as the resolution decreases, analyzing these density maps becomes increasingly challenging. For modeling structures in lower resolution maps, deep learning can be used to identify structural features in the maps to assist in structure modeling. Here, we present a suite of deep learning-based tools developed by our lab that enable structural biologists to work with cryo-EM maps of a wide range of resolutions. For cryo-EM maps at near-atomic resolution (5 Å or better), DeepMainmast automatically models all-atom structures by tracing the main chain from local map features of amino acids and atoms detected by deep learning; DAQ score quantifies map-model fit and indicates potential misassignments in protein models. In intermediate resolution maps (5-10 Å), Emap2sec and Emap2sec+ can accurately detect protein secondary structures and nucleic acids. These tools and more are available at our web server: https://em.kiharalab.org/.
Affiliation(s)
- Pranav Punuru
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Anika Jain
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Daisuke Kihara
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
4. Baghirov J, Zhu H, Wang X, Kihara D. Protein Secondary Structure and DNA/RNA Detection for Cryo-EM and Cryo-ET Using Emap2sec and Emap2sec+. Methods Mol Biol 2025; 2867:105-120. [PMID: 39576577 DOI: 10.1007/978-1-0716-4196-5_6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2024]
Abstract
Cryo-electron microscopy (cryo-EM) has become a powerful tool for determining the structures of macromolecules, such as proteins and DNA/RNA complexes. While high-resolution cryo-EM maps are increasingly available, there is still a substantial number of maps determined at intermediate or low resolution. These maps present challenges when it comes to extracting structural information. In response to this, two computational methods, Emap2sec and Emap2sec+, have been developed by our group to address these challenges and benefit the analysis of cryo-EM maps. In this chapter, we describe how to use the web servers of two of our structure analysis programs for cryo-EM, Emap2sec and Emap2sec+. Both methods identify local structures in medium-resolution EM maps of 5-10 Å to help find and fit protein and DNA/RNA structures in EM maps. Emap2sec identifies the secondary structures of proteins, while Emap2sec+ also identifies DNA/RNA locations in cryo-EM maps. As cryo-electron tomography (cryo-ET) has started to produce data of this resolution, these methods would be useful for cryo-ET, too. Both methods are available in the form of web servers and source code at https://kiharalab.org/emsuites/.
Affiliation(s)
- Javad Baghirov
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Han Zhu
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Xiao Wang
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Daisuke Kihara
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
5. Wang X, Zhu H, Terashi G, Taluja M, Kihara D. DiffModeler: large macromolecular structure modeling for cryo-EM maps using a diffusion model. Nat Methods 2024; 21:2307-2317. [PMID: 39433880 DOI: 10.1038/s41592-024-02479-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2024] [Accepted: 09/19/2024] [Indexed: 10/23/2024]
Abstract
Cryogenic electron microscopy (cryo-EM) is now widely used for determining the structures of multichain protein complexes. However, modeling a large complex structure, such as those with more than ten chains, is challenging, particularly when the map resolution decreases. Here we present DiffModeler, a fully automated method for modeling large protein complex structures. DiffModeler employs a diffusion model for backbone tracing and integrates AlphaFold2-predicted single-chain structures for structure fitting. DiffModeler showed average template modeling scores of 0.88 and 0.91 for two datasets of cryo-EM maps of 0-5 Å resolution and 0.92 for intermediate resolution maps (5-10 Å), substantially outperforming existing methodologies. Further benchmarking at low resolutions (10-20 Å) confirms its versatility, demonstrating plausible performance.
Affiliation(s)
- Xiao Wang
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Han Zhu
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Genki Terashi
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Manav Taluja
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
  - School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, India
- Daisuke Kihara
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
6. Mu Y, Nguyen T, Hawickhorst B, Wriggers W, Sun J, He J. The combined focal loss and dice loss function improves the segmentation of beta-sheets in medium-resolution cryo-electron-microscopy density maps. Bioinform Adv 2024; 4:vbae169. [PMID: 39600382 PMCID: PMC11590252 DOI: 10.1093/bioadv/vbae169] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2024] [Revised: 08/17/2024] [Accepted: 11/19/2024] [Indexed: 11/29/2024]
Abstract
Summary: Although multiple neural networks have been proposed for detecting secondary structures from medium-resolution (5-10 Å) cryo-electron microscopy (cryo-EM) maps, the loss functions used in the existing deep learning networks are primarily based on cross-entropy loss, which is known to be sensitive to class imbalances. We investigated five loss functions: cross-entropy, Focal loss, Dice loss, and two combined loss functions. Using a U-Net architecture in our DeepSSETracer method and a dataset composed of 1355 box-cropped atomic-structure/density-map pairs, we found that a newly designed loss function that combines Focal loss and Dice loss provides the best overall detection accuracy for secondary structures. For β-sheet voxels, which are generally much harder to detect than helix voxels, the combined loss function achieved a significant improvement (an 8.8% increase in the F1 score) compared to the cross-entropy loss function and a noticeable improvement from the Dice loss function. This study demonstrates the potential for designing more effective loss functions for hard cases in the segmentation of secondary structures. The newly trained model was incorporated into DeepSSETracer 1.1 for the segmentation of protein secondary structures in medium-resolution cryo-EM map components. DeepSSETracer can be integrated into ChimeraX, a popular molecular visualization software. Availability and implementation: https://www.cs.odu.edu/~bioinfo/B2I_Tools/.
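As a rough illustration of the idea in this abstract, the sketch below combines a focal term and a soft Dice term for a binary voxel labeling (for example, β-sheet versus everything else). It is not the DeepSSETracer implementation: the abstract does not give the exact form of the combination, so the weighted sum, the `weight` and `gamma` defaults, the `smooth` constant, the binary simplification, and all function names are assumptions made for illustration only.

```python
import math

def focal_loss(probs, labels, gamma=2.0, eps=1e-7):
    """Mean binary focal loss: -(1 - p_t)^gamma * log(p_t) per voxel."""
    total = 0.0
    for p, y in zip(probs, labels):
        # p_t is the probability the model assigns to the true class.
        p_t = p if y == 1 else 1.0 - p
        p_t = min(max(p_t, eps), 1.0 - eps)  # clamp to keep log() finite
        total += -((1.0 - p_t) ** gamma) * math.log(p_t)
    return total / len(probs)

def dice_loss(probs, labels, smooth=1.0):
    """Soft Dice loss: 1 - (2 * overlap + smooth) / (sum(p) + sum(y) + smooth)."""
    overlap = sum(p * y for p, y in zip(probs, labels))
    denom = sum(probs) + sum(labels)
    return 1.0 - (2.0 * overlap + smooth) / (denom + smooth)

def combined_loss(probs, labels, weight=0.5, gamma=2.0):
    """Hypothetical combination: weighted sum of focal and Dice terms."""
    return (weight * focal_loss(probs, labels, gamma)
            + (1.0 - weight) * dice_loss(probs, labels))

if __name__ == "__main__":
    labels = [1, 0, 0, 0, 0, 0, 0, 1]          # sparse positive class
    confident = [0.9, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.8]
    uncertain = [0.4, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.4]
    print(combined_loss(confident, labels))
    print(combined_loss(uncertain, labels))
```

With `gamma=0` the focal term reduces to ordinary cross-entropy, which makes it easy to compare the two settings on the same data. The Dice term depends on overlap rather than per-voxel counts, which is why it is commonly paired with focal loss when one class is rare.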
Affiliation(s)
- Yongcheng Mu
  - Department of Computer Science, Old Dominion University, Norfolk, VA 23529, United States
- Thu Nguyen
  - Department of Computer Science, Old Dominion University, Norfolk, VA 23529, United States
- Bryan Hawickhorst
  - Department of Computer Science, Old Dominion University, Norfolk, VA 23529, United States
- Willy Wriggers
  - Department of Mechanical and Aerospace Engineering, Old Dominion University, Norfolk, VA 23529, United States
- Jiangwen Sun
  - Department of Computer Science, Old Dominion University, Norfolk, VA 23529, United States
- Jing He
  - Department of Computer Science, Old Dominion University, Norfolk, VA 23529, United States
7. Bou‐Abdallah F, Fish J, Terashi G, Zhang Y, Kihara D, Arosio P. Unveiling the stochastic nature of human heteropolymer ferritin self-assembly mechanism. Protein Sci 2024; 33:e5104. [PMID: 38995055 PMCID: PMC11241160 DOI: 10.1002/pro.5104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2024] [Revised: 06/18/2024] [Accepted: 06/23/2024] [Indexed: 07/13/2024]
Abstract
Despite ferritin's critical role in regulating cellular and systemic iron levels, our understanding of the structure and assembly mechanism of isoferritins, discovered over eight decades ago, remains limited. Unveiling how the composition and molecular architecture of hetero-oligomeric ferritins confer distinct functionality to isoferritins is essential to understanding how the structural intricacies of H and L subunits influence their interactions with cellular machinery. In this study, ferritin heteropolymers with specific H to L subunit ratios were synthesized using a uniquely engineered plasmid design, followed by high-resolution cryo-electron microscopy analysis and deep learning-based amino acid modeling. Our structural examination revealed unique architectural features during the self-assembly mechanism of heteropolymer ferritins and demonstrated a significant preference for H-L heterodimer formation over H-H or L-L homodimers. Unexpectedly, while dimers seem essential building blocks in the protein self-assembly process, the overall mechanism of ferritin self-assembly is observed to proceed randomly through diverse pathways. The physiological significance of these findings is discussed including how ferritin microheterogeneity could represent a tissue-specific adaptation process that imparts distinctive tissue-specific functions to isoferritins.
Affiliation(s)
- Fadi Bou‐Abdallah
  - Department of Chemistry, State University of New York, Potsdam, New York, USA
- Jeremie Fish
  - Department of Electrical & Computer Engineering, Coulter School of Engineering, Clarkson University, Potsdam, New York, USA
- Genki Terashi
  - Department of Biological Sciences and Department of Computer Science, Purdue University, West Lafayette, Indiana, USA
- Yuanyuan Zhang
  - Department of Biological Sciences and Department of Computer Science, Purdue University, West Lafayette, Indiana, USA
- Daisuke Kihara
  - Department of Biological Sciences and Department of Computer Science, Purdue University, West Lafayette, Indiana, USA
- Paolo Arosio
  - Department of Molecular and Translational Medicine, University of Brescia, Brescia, Italy
8. He S, Deng H, Li P, Tian Q, Yang Y, Hu J, Li H, Zhao T, Ling H, Liu Y, Liu S, Guo Q. Bimodal DNA self-origami material with nucleic acid function enhancement. J Nanobiotechnology 2024; 22:39. [PMID: 38279115 PMCID: PMC10821560 DOI: 10.1186/s12951-024-02296-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2023] [Accepted: 01/02/2024] [Indexed: 01/28/2024]
Abstract
BACKGROUND: The design of DNA materials with specific nanostructures for biomedical tissue engineering applications remains a challenge. High-dimensional DNA nanomaterials are difficult to prepare and are unstable; moreover, their synthesis relies on heavy metal ions. Herein, we developed a bimodal DNA self-origami material with good biocompatibility and differing functions using a simple synthesis method. We simulated and characterized this material using a combination of oxDNA, freeze-fracture electron microscopy, and atomic force microscopy. Subsequently, we optimized the synthesis procedure to fix the morphology of this material. RESULTS: Using molecular dynamics simulation, we found that the bimodal DNA self-origami material exhibited properties of spontaneous stretching and curling and could be fixed in a single morphology via synthesis control. The application of different functional nucleic acids enabled the achievement of various biological functions, and the performance of functional nucleic acids was significantly enhanced in the material. Consequently, leveraging the various functional nucleic acids enhanced by this material will facilitate the attainment of diverse biological functions. CONCLUSION: The developed design can comprehensively reveal the morphology and dynamics of DNA materials. We thus report a novel strategy for the construction of high-dimensional DNA materials and the application of functional nucleic acid-enhancing materials.
Affiliation(s)
- Songlin He
  - Institute of Orthopedics, First Medical Center, Chinese PLA General Hospital; Beijing Key Laboratory of Regenerative Medicine in Orthopedics; Key Laboratory of Musculoskeletal Trauma and War Injuries PLA, 28 Fuxing Road, Haidian District, Beijing, 100853, China
  - School of Medicine, Nankai University, Tianjin, 300071, China
- Haotian Deng
  - Institute of Orthopedics, First Medical Center, Chinese PLA General Hospital; Beijing Key Laboratory of Regenerative Medicine in Orthopedics; Key Laboratory of Musculoskeletal Trauma and War Injuries PLA, 28 Fuxing Road, Haidian District, Beijing, 100853, China
  - School of Medicine, Nankai University, Tianjin, 300071, China
- Peiqi Li
  - Institute of Orthopedics, First Medical Center, Chinese PLA General Hospital; Beijing Key Laboratory of Regenerative Medicine in Orthopedics; Key Laboratory of Musculoskeletal Trauma and War Injuries PLA, 28 Fuxing Road, Haidian District, Beijing, 100853, China
  - School of Medicine, Nankai University, Tianjin, 300071, China
- Qinyu Tian
  - Institute of Orthopedics, First Medical Center, Chinese PLA General Hospital; Beijing Key Laboratory of Regenerative Medicine in Orthopedics; Key Laboratory of Musculoskeletal Trauma and War Injuries PLA, 28 Fuxing Road, Haidian District, Beijing, 100853, China
- Yongkang Yang
  - Institute of Orthopedics, First Medical Center, Chinese PLA General Hospital; Beijing Key Laboratory of Regenerative Medicine in Orthopedics; Key Laboratory of Musculoskeletal Trauma and War Injuries PLA, 28 Fuxing Road, Haidian District, Beijing, 100853, China
  - School of Medicine, Nankai University, Tianjin, 300071, China
- Jingjing Hu
  - Institute of Orthopedics, First Medical Center, Chinese PLA General Hospital; Beijing Key Laboratory of Regenerative Medicine in Orthopedics; Key Laboratory of Musculoskeletal Trauma and War Injuries PLA, 28 Fuxing Road, Haidian District, Beijing, 100853, China
  - Department of Gastroenterology, the Second Medical Center and National Clinical Research Center of Geriatric Diseases, 28 Fuxing Road, Haidian District, Beijing, 100853, China
- Hao Li
  - Institute of Orthopedics, First Medical Center, Chinese PLA General Hospital; Beijing Key Laboratory of Regenerative Medicine in Orthopedics; Key Laboratory of Musculoskeletal Trauma and War Injuries PLA, 28 Fuxing Road, Haidian District, Beijing, 100853, China
  - School of Medicine, Nankai University, Tianjin, 300071, China
- Tianyuan Zhao
  - Institute of Orthopedics, First Medical Center, Chinese PLA General Hospital; Beijing Key Laboratory of Regenerative Medicine in Orthopedics; Key Laboratory of Musculoskeletal Trauma and War Injuries PLA, 28 Fuxing Road, Haidian District, Beijing, 100853, China
- Hongkun Ling
  - School of Medicine, Nankai University, Tianjin, 300071, China
- Yin Liu
  - School of Medicine, Nankai University, Tianjin, 300071, China
  - Nankai University Eye Institute, Nankai University, Tianjin, 300071, China
- Shuyun Liu
  - Institute of Orthopedics, First Medical Center, Chinese PLA General Hospital; Beijing Key Laboratory of Regenerative Medicine in Orthopedics; Key Laboratory of Musculoskeletal Trauma and War Injuries PLA, 28 Fuxing Road, Haidian District, Beijing, 100853, China
  - School of Medicine, Nankai University, Tianjin, 300071, China
- Quanyi Guo
  - Institute of Orthopedics, First Medical Center, Chinese PLA General Hospital; Beijing Key Laboratory of Regenerative Medicine in Orthopedics; Key Laboratory of Musculoskeletal Trauma and War Injuries PLA, 28 Fuxing Road, Haidian District, Beijing, 100853, China
  - School of Medicine, Nankai University, Tianjin, 300071, China
9. Zhang Y, Wang X, Zhang Z, Huang Y, Kihara D. Assessment of Protein-Protein Docking Models Using Deep Learning. Methods Mol Biol 2024; 2780:149-162. [PMID: 38987469 DOI: 10.1007/978-1-0716-3985-6_10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/12/2024]
Abstract
Protein-protein interactions are involved in almost all processes in a living cell and determine the biological functions of proteins. To obtain mechanistic understandings of protein-protein interactions, the tertiary structures of protein complexes have been determined by biophysical experimental methods, such as X-ray crystallography and cryogenic electron microscopy. However, as experimental methods are costly in resources, many computational methods have been developed that model protein complex structures. One of the difficulties in computational protein complex modeling (protein docking) is to select the most accurate models among many models that are usually generated by a docking method. This article reviews advances in protein docking model assessment methods, focusing on recent developments that apply deep learning to several network architectures.
Affiliation(s)
- Yuanyuan Zhang
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Xiao Wang
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Zicong Zhang
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Yunhan Huang
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Daisuke Kihara
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
10. Terashi G, Wang X, Prasad D, Nakamura T, Zhu H, Kihara D. Integrated Protocol of Protein Structure Modeling for Cryo-EM with Deep Learning and Structure Prediction. bioRxiv 2023:2023.10.19.563151. [PMID: 37904978 PMCID: PMC10614963 DOI: 10.1101/2023.10.19.563151] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/02/2023]
Abstract
Structure modeling from maps is an indispensable step for studying proteins and their complexes with cryogenic electron microscopy (cryo-EM). Although the resolution of determined cryo-EM maps has generally improved, there are still many cases where tracing protein main-chains is difficult, even in maps determined at a near atomic resolution. Here, we have developed a protein structure modeling method, called DeepMainmast, which employs deep learning to capture the local map features of amino acids and atoms to assist main-chain tracing. Moreover, since AlphaFold2 demonstrates high accuracy in protein structure prediction, we have integrated the complementary strengths of de novo density tracing using deep learning with AlphaFold2's structure modeling to achieve even higher accuracy than each method alone. Additionally, the protocol is able to accurately assign chain identity to the structure models of homo-multimers.
Affiliation(s)
- Genki Terashi
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907, USA
- Xiao Wang
  - Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
- Devashish Prasad
  - Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
- Tsukasa Nakamura
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907, USA
- Han Zhu
  - Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
- Daisuke Kihara
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907, USA
  - Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
11. Wang X, Terashi G, Kihara D. CryoREAD: de novo structure modeling for nucleic acids in cryo-EM maps using deep learning. Nat Methods 2023; 20:1739-1747. [PMID: 37783885 PMCID: PMC10841814 DOI: 10.1038/s41592-023-02032-5] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 11/28/2022] [Accepted: 08/24/2023] [Indexed: 10/04/2023]
Abstract
DNA and RNA play fundamental roles in various cellular processes, where their three-dimensional structures provide information critical to understanding the molecular mechanisms of their functions. Although an increasing number of nucleic acid structures and their complexes with proteins are determined by cryogenic electron microscopy (cryo-EM), structure modeling for DNA and RNA remains challenging, particularly when the map is determined at a resolution coarser than atomic level. Moreover, computational methods for nucleic acid structure modeling are relatively scarce. Here, we present CryoREAD, a fully automated de novo DNA/RNA atomic structure modeling method using deep learning. CryoREAD identifies phosphate, sugar and base positions in a cryo-EM map using deep learning, which are traced and modeled into a three-dimensional structure. When tested on cryo-EM maps determined at 2.0 to 5.0 Å resolution, CryoREAD built substantially more accurate models than existing methods. We also applied the method to cryo-EM maps of biomolecular complexes in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2).
Affiliation(s)
- Xiao Wang
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Genki Terashi
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Daisuke Kihara
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
12. Zeng X, Kahng A, Xue L, Mahamid J, Chang YW, Xu M. High-throughput cryo-ET structural pattern mining by unsupervised deep iterative subtomogram clustering. Proc Natl Acad Sci U S A 2023; 120:e2213149120. [PMID: 37027429 PMCID: PMC10104553 DOI: 10.1073/pnas.2213149120] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 08/01/2022] [Accepted: 02/24/2023] [Indexed: 04/08/2023]
Abstract
Cryoelectron tomography directly visualizes heterogeneous macromolecular structures in their native and complex cellular environments. However, existing computer-assisted structure sorting approaches are low throughput or inherently limited due to their dependency on available templates and manual labels. Here, we introduce a high-throughput template-and-label-free deep learning approach, Deep Iterative Subtomogram Clustering Approach (DISCA), that automatically detects subsets of homogeneous structures by learning and modeling 3D structural features and their distributions. Evaluation on five experimental cryo-ET datasets shows that an unsupervised deep learning-based method can detect diverse structures with a wide range of molecular sizes. This unsupervised detection paves the way for systematic unbiased recognition of macromolecular complexes in situ.
Affiliation(s)
- Xiangrui Zeng
  - Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213
- Anson Kahng
  - Computer Science Department, University of Rochester, Rochester, NY 14620
- Liang Xue
  - Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg 69117, Germany
  - Faculty of Biosciences, Collaboration for joint PhD degree between European Molecular Biology Laboratory and Heidelberg University, Heidelberg 69117, Germany
- Julia Mahamid
  - Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg 69117, Germany
- Yi-Wei Chang
  - Department of Biochemistry and Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
- Min Xu
  - Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213
13. Jiang J, Li J, Li J, Pei H, Li M, Zou Q, Lv Z. A Machine Learning Method to Identify Umami Peptide Sequences by Using Multiplicative LSTM Embedded Features. Foods 2023; 12:1498. [PMID: 37048319 PMCID: PMC10094688 DOI: 10.3390/foods12071498] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 02/26/2023] [Revised: 03/24/2023] [Accepted: 03/30/2023] [Indexed: 04/05/2023]
Abstract
Umami peptides enhance the umami taste of food and have good food processing properties, nutritional value, and numerous potential applications. Wet-lab testing for the identification of umami peptides is a time-consuming and expensive process. Here, we report iUmami-DRLF, which applies a logistic regression (LR) classifier to features extracted from peptide sequences by a pre-trained deep learning model, unified representation (UniRep, based on a multiplicative LSTM). The findings demonstrate that deep representation learning significantly enhanced the models' ability to identify umami peptides, and their predictive precision, based solely on peptide sequence information. Newly validated taste sequences were also used to test iUmami-DRLF and other predictors, and the results indicate that iUmami-DRLF has better robustness and accuracy and remains valid at higher probability thresholds. The iUmami-DRLF method can aid further studies on enhancing the umami flavor of food to satisfy the need for an umami-flavored diet.
Collapse
Affiliation(s)
- Jici Jiang
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
| | - Jiayu Li
- College of Life Science, Sichuan University, Chengdu 610065, China
| | - Junxian Li
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
| | - Hongdi Pei
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
- Wu Yuzhang Honors College, Sichuan University, Chengdu 610065, China
| | - Mingxin Li
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China
| | - Zhibin Lv
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
|
14
|
Nakamura A, Meng H, Zhao M, Wang F, Hou J, Cao R, Si D. Fast and automated protein-DNA/RNA macromolecular complex modeling from cryo-EM maps. Brief Bioinform 2023; 24:bbac632. [PMID: 36682003 PMCID: PMC10399284 DOI: 10.1093/bib/bbac632] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2022] [Revised: 12/15/2022] [Accepted: 12/29/2022] [Indexed: 01/23/2023] Open
Abstract
Cryo-electron microscopy (cryo-EM) allows a macromolecular structure such as protein-DNA/RNA complexes to be reconstructed in a three-dimensional Coulomb potential map. The structural information of these macromolecular complexes forms the foundation for understanding molecular mechanisms, including those underlying many human diseases. However, the model building of large macromolecular complexes is often difficult and time-consuming. We recently developed DeepTracer-2.0, an artificial-intelligence-based pipeline that can build amino acid and nucleic acid backbones from a single cryo-EM map, and even predict the best-fitting residues according to the density of side chains. The experiments showed improved accuracy and efficiency when benchmarking the performance on independent experimental maps of protein-DNA/RNA complexes and demonstrated the promising future of macromolecular modeling from cryo-EM maps. Our method and pipeline could benefit researchers worldwide who work in molecular biomedicine and drug discovery, and substantially increase the throughput of the cryo-EM model building. The pipeline has been integrated into the web portal https://deeptracer.uw.edu/.
Affiliation(s)
- Andrew Nakamura
- Division of Computing and Software Systems, University of Washington Bothell, Bothell, WA 98011, USA
| | - Hanze Meng
- Department of Computer Science, Duke University, Durham, NC 27708, USA
| | - Minglei Zhao
- Department of Biochemistry and Molecular Biology, The University of Chicago, Chicago, IL 60637, USA
| | - Fengbin Wang
- Department of Biochemistry and Molecular Genetics, University of Alabama Birmingham, Heersink School of Medicine, Birmingham, AL 35233, USA
| | - Jie Hou
- Department of Computer Science, Saint Louis University, Saint Louis, MO 63103, USA
| | - Renzhi Cao
- Department of Computer Science, Pacific Lutheran University, Tacoma, WA 98447, USA
| | - Dong Si
- Corresponding author: Dong Si, Division of Computing and Software Systems, University of Washington Bothell, Bothell, WA 98011, USA. E-mail:
|
15
|
Garcia Condado J, Muñoz-Barrutia A, Sorzano COS. Automatic determination of the handedness of single-particle maps of macromolecules solved by CryoEM. J Struct Biol 2022; 214:107915. [PMID: 36341955 DOI: 10.1016/j.jsb.2022.107915] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/12/2022] [Revised: 08/29/2022] [Accepted: 10/25/2022] [Indexed: 12/07/2022]
Abstract
Single-Particle Analysis by Cryo-Electron Microscopy is a well-established technique to elucidate the three-dimensional (3D) structure of biological macromolecules. The orientation of the acquired projection images must be initially estimated without any reference to the final structure. In this step, algorithms may find a mirrored version of all the orientations resulting in a mirrored 3D map. It is as compatible with the acquired images as its unmirrored version from the image processing point of view, only that it is not biologically plausible. In this article, we introduce HaPi (Handedness Pipeline), the first method to automatically determine the hand of electron density maps of macromolecules solved by CryoEM. HaPi is built by training two 3D convolutional neural networks. The first determines α-helices in a map, and the second determines whether the α-helix is left-handed or right-handed. A consensus strategy defines the overall map hand. The pipeline is trained on simulated and experimental data. The handedness can be detected only for maps whose resolution is better than 5 Å. HaPi can identify the hand in 89% of new simulated maps correctly. Moreover, we evaluated all the maps deposited at the Electron Microscopy Data Bank and 11 structures uploaded with the incorrect hand were identified.
Affiliation(s)
- J Garcia Condado
- Biocruces Bizkaia Instituto Investigación Sanitaria, Cruces Plaza, 48903 Barakaldo, Bizkaia, Spain; Universidad Carlos III de Madrid, Avda. de la Universidad 30, 28911 Leganés, Madrid, Spain; Centro Nacional de Biotecnologia (CNB-CSIC), Darwin, 3, Campus Universidad Autonoma, 28049 Cantoblanco, Madrid, Spain
| | - A Muñoz-Barrutia
- Universidad Carlos III de Madrid, Avda. de la Universidad 30, 28911 Leganés, Madrid, Spain
| | - C O S Sorzano
- Centro Nacional de Biotecnologia (CNB-CSIC), Darwin, 3, Campus Universidad Autonoma, 28049 Cantoblanco, Madrid, Spain.
|
16
|
Beton JG, Cragnolini T, Kaleel M, Mulvaney T, Sweeney A, Topf M. Integrating model simulation tools and cryo‐electron microscopy. WIRES COMPUTATIONAL MOLECULAR SCIENCE 2022. [DOI: 10.1002/wcms.1642] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
Affiliation(s)
- Joseph George Beton
- Centre for Structural Systems Biology (CSSB) Leibniz‐Institut für Virologie (LIV) Hamburg Germany
| | - Tristan Cragnolini
- Institute of Structural and Molecular Biology, Birkbeck and University College London London UK
| | - Manaz Kaleel
- Centre for Structural Systems Biology (CSSB) Leibniz‐Institut für Virologie (LIV) Hamburg Germany
| | - Thomas Mulvaney
- Centre for Structural Systems Biology (CSSB) Leibniz‐Institut für Virologie (LIV) Hamburg Germany
| | - Aaron Sweeney
- Centre for Structural Systems Biology (CSSB) Leibniz‐Institut für Virologie (LIV) Hamburg Germany
| | - Maya Topf
- Centre for Structural Systems Biology (CSSB) Leibniz‐Institut für Virologie (LIV) Hamburg Germany
|
17
|
Yang J, Cai Y, Zhao K, Xie H, Chen X. Concepts and applications of chemical fingerprint for hit and lead screening. Drug Discov Today 2022; 27:103356. [PMID: 36113834 DOI: 10.1016/j.drudis.2022.103356] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/07/2021] [Revised: 07/28/2022] [Accepted: 09/08/2022] [Indexed: 11/22/2022]
Abstract
Molecular fingerprints are used to represent chemical (structural, physicochemical, etc.) properties of large-scale chemical sets in a low computational cost way. They have a prominent role in transforming chemical data sets into consistent input formats (bit strings or numeric values) suitable for in silico approaches. In this review, we summarize and classify common and state-of-the-art fingerprints into eight different types (dictionary based, circular, topological, pharmacophore, protein-ligand interaction, shape based, reinforced, and multi). We also highlight applications of fingerprints in early drug research and development (R&D). Thus, this review provides a guide for the selection of appropriate fingerprints of compounds (or ligand-protein complexes) for use in drug R&D.
Affiliation(s)
- Jingbo Yang
- Department of Pharmagenomics, College of Bioinformatics Science and Technology, Harbin Medical University, 150081 Harbin, Heilongjiang, China
| | - Yiyang Cai
- Department of Pharmagenomics, College of Bioinformatics Science and Technology, Harbin Medical University, 150081 Harbin, Heilongjiang, China
| | - Kairui Zhao
- Department of Pharmagenomics, College of Bioinformatics Science and Technology, Harbin Medical University, 150081 Harbin, Heilongjiang, China
| | - Hongbo Xie
- Department of Pharmagenomics, College of Bioinformatics Science and Technology, Harbin Medical University, 150081 Harbin, Heilongjiang, China.
| | - Xiujie Chen
- Department of Pharmagenomics, College of Bioinformatics Science and Technology, Harbin Medical University, 150081 Harbin, Heilongjiang, China.
|
18
|
Residue-wise local quality estimation for protein models from cryo-EM maps. Nat Methods 2022; 19:1116-1125. [PMID: 35953671 PMCID: PMC10024464 DOI: 10.1038/s41592-022-01574-4] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/20/2021] [Accepted: 07/11/2022] [Indexed: 01/31/2023]
Abstract
An increasing number of protein structures are being determined by cryogenic electron microscopy (cryo-EM). Although the resolution of determined cryo-EM density maps is improving in general, there are still many cases where amino acids of a protein are assigned with different levels of confidence. Here we developed a method that identifies potential misassignment of residues in the map, including residue shifts along an otherwise correct main-chain trace. The score, named DAQ, computes the likelihood that the local density corresponds to different amino acids, atoms, and secondary structures, estimated via deep learning, and assesses the consistency of the amino acid assignment in the protein structure model with that likelihood. When DAQ was applied to different versions of model structures in the Protein Data Bank that were derived from the same density maps, a clear improvement in the DAQ score was observed in the newer versions of the models. DAQ also found potential misassignment errors in a substantial number of deposited protein structure models built into cryo-EM maps.
|
19
|
Cryo-EM structures of Escherichia coli Ec86 retron complexes reveal architecture and defence mechanism. Nat Microbiol 2022; 7:1480-1489. [PMID: 35982312 DOI: 10.1038/s41564-022-01197-7] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/03/2021] [Accepted: 07/05/2022] [Indexed: 11/09/2022]
Abstract
First discovered in the 1980s, retrons are bacterial genetic elements consisting of a reverse transcriptase and a non-coding RNA (ncRNA). Retrons mediate antiphage defence in bacteria but their structure and defence mechanisms are unknown. Here, we investigate the Escherichia coli Ec86 retron and use cryo-electron microscopy to determine the structures of the Ec86 (3.1 Å) and cognate effector-bound Ec86 (2.5 Å) complexes. The Ec86 reverse transcriptase exhibits a characteristic right-hand-like fold consisting of finger, palm and thumb subdomains. Ec86 reverse transcriptase reverse-transcribes part of the ncRNA into satellite, multicopy single-stranded DNA (msDNA, a DNA-RNA hybrid) that we show wraps around the reverse transcriptase electropositive surface. In msDNA, both inverted repeats are present and the 3' sides of the DNA/RNA chains are close to the reverse transcriptase active site. The Ec86 effector adopts a two-lobe fold and directly binds reverse transcriptase and msDNA. These findings offer insights into the structure-function relationship of the retron-effector unit and provide a structural basis for the optimization of retron-based genome editing systems.
|
20
|
Alnabati E, Esquivel-Rodriguez J, Terashi G, Kihara D. MarkovFit: Structure Fitting for Protein Complexes in Electron Microscopy Maps Using Markov Random Field. Front Mol Biosci 2022; 9:935411. [PMID: 35959463 PMCID: PMC9358042 DOI: 10.3389/fmolb.2022.935411] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/03/2022] [Accepted: 06/13/2022] [Indexed: 11/13/2022] Open
Abstract
An increasing number of protein complex structures are determined by cryo-electron microscopy (cryo-EM). When individual protein structures have been determined and are available, an important task in structure modeling is to fit the individual structures into the density map. Here, we designed a method that fits the atomic structures of proteins in cryo-EM maps of medium to low resolutions using Markov random fields, which allows probabilistic evaluation of fitted models. Our method, MarkovFit, was more accurate than existing methods on datasets of 31 simulated cryo-EM maps of resolution 10 Å, nine experimentally determined cryo-EM maps of resolution less than 4 Å, and 28 experimentally determined cryo-EM maps of resolution 6 to 20 Å.
Affiliation(s)
- Eman Alnabati
- Department of Computer Science, Purdue University, West Lafayette, IN, United States
| | - Genki Terashi
- Department of Biological Sciences, Purdue University, West Lafayette, IN, United States
| | - Daisuke Kihara
- Department of Computer Science, Purdue University, West Lafayette, IN, United States
- Department of Biological Sciences, Purdue University, West Lafayette, IN, United States
|
21
|
Xu B, Zhu Y, Cao C, Chen H, Jin Q, Li G, Ma J, Yang SL, Zhao J, Zhu J, Ding Y, Fang X, Jin Y, Kwok CK, Ren A, Wan Y, Wang Z, Xue Y, Zhang H, Zhang QC, Zhou Y. Recent advances in RNA structurome. SCIENCE CHINA. LIFE SCIENCES 2022; 65:1285-1324. [PMID: 35717434 PMCID: PMC9206424 DOI: 10.1007/s11427-021-2116-2] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Subscribe] [Scholar Register] [Received: 03/10/2022] [Accepted: 04/01/2022] [Indexed: 12/27/2022]
Abstract
RNA structures are essential to support RNA functions and regulation in various biological processes. Recently, a range of novel technologies have been developed to decode genome-wide RNA structures and novel modes of functionality across a wide range of species. In this review, we summarize key strategies for probing the RNA structurome and discuss the pros and cons of representative technologies. In particular, these new technologies have been applied to dissect the structural landscape of the SARS-CoV-2 RNA genome. We also summarize the functionalities of RNA structures discovered in different regulatory layers (including RNA processing, transport, localization, and mRNA translation) across viruses, bacteria, animals, and plants. We review many versatile RNA structural elements in the context of different physiological and pathological processes (e.g., cell differentiation, stress response, and viral replication). Finally, we discuss future prospects for RNA structural studies to map the RNA structurome at higher resolution and at the single-molecule and single-cell level, and to decipher novel modes of RNA structures and functions for innovative applications.
Affiliation(s)
- Bingbing Xu
- MOE Laboratory of Biosystems Homeostasis & Protection, Innovation Center for Cell Signaling Network, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China
| | - Yanda Zhu
- MOE Laboratory of Biosystems Homeostasis & Protection, Innovation Center for Cell Signaling Network, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China
| | - Changchang Cao
- Key Laboratory of RNA Biology, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
| | - Hao Chen
- Life Sciences Institute, Zhejiang University, Hangzhou, 310058, China
| | - Qiongli Jin
- State Key Laboratory of Plant Physiology and Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China
| | - Guangnan Li
- State Key Laboratory of Virology, College of Life Sciences, Wuhan University, Wuhan, 430072, China
| | - Junfeng Ma
- Beijing Advanced Innovation Center for Structural Biology, School of Life Sciences, Tsinghua University, Beijing, 100084, China
| | - Siwy Ling Yang
- Stem Cell and Regenerative Biology, Genome Institute of Singapore, A*STAR, Singapore, Singapore
| | - Jieyu Zhao
- Department of Chemistry, and State Key Laboratory of Marine Pollution, City University of Hong Kong, Kowloon Tong, Hong Kong SAR, China
| | - Jianghui Zhu
- MOE Key Laboratory of Bioinformatics, Beijing Advanced Innovation Center for Structural Biology and Frontier Research Center for Biological Structure, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing, 100084, China
- Tsinghua-Peking Center for Life Sciences, Beijing, 100084, China
| | - Yiliang Ding
- Department of Cell and Developmental Biology, John Innes Centre, Norwich Research Park, Norwich, NR4 7UH, United Kingdom.
| | - Xianyang Fang
- Beijing Advanced Innovation Center for Structural Biology, School of Life Sciences, Tsinghua University, Beijing, 100084, China.
| | - Yongfeng Jin
- MOE Laboratory of Biosystems Homeostasis & Protection, Innovation Center for Cell Signaling Network, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China.
| | - Chun Kit Kwok
- Department of Chemistry, and State Key Laboratory of Marine Pollution, City University of Hong Kong, Kowloon Tong, Hong Kong SAR, China.
- Shenzhen Research Institute of City University of Hong Kong, Shenzhen, 518057, China.
| | - Aiming Ren
- Life Sciences Institute, Zhejiang University, Hangzhou, 310058, China.
| | - Yue Wan
- Stem Cell and Regenerative Biology, Genome Institute of Singapore, A*STAR, Singapore, Singapore.
| | - Zhiye Wang
- State Key Laboratory of Plant Physiology and Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China.
| | - Yuanchao Xue
- Key Laboratory of RNA Biology, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China.
- University of Chinese Academy of Sciences, Beijing, 100101, China.
| | - Huakun Zhang
- Key Laboratory of Molecular Epigenetics of the Ministry of Education, Northeast Normal University, Changchun, 130024, China.
| | - Qiangfeng Cliff Zhang
- MOE Key Laboratory of Bioinformatics, Beijing Advanced Innovation Center for Structural Biology and Frontier Research Center for Biological Structure, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing, 100084, China.
- Tsinghua-Peking Center for Life Sciences, Beijing, 100084, China.
| | - Yu Zhou
- State Key Laboratory of Virology, College of Life Sciences, Wuhan University, Wuhan, 430072, China.
|
22
|
Alnabati E, Terashi G, Kihara D. Protein Structural Modeling for Electron Microscopy Maps Using VESPER and MAINMAST. Curr Protoc 2022; 2:e494. [PMID: 35849043 PMCID: PMC9299282 DOI: 10.1002/cpz1.494] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
Abstract
An increasing number of protein structures are determined by cryo-electron microscopy (cryo-EM) and stored in the Electron Microscopy Data Bank (EMDB). To interpret determined cryo-EM maps, several methods have been developed that model the tertiary structure of biomolecules, particularly proteins. Here we show how to use two such methods, VESPER and MAINMAST, which were developed in our group. VESPER is a method mainly for two purposes: fitting protein structure models into an EM map and aligning two EM maps locally or globally to capture their similarity. VESPER represents each EM map as a set of vectors pointing toward denser points. By considering matching the directions of vectors, in general, VESPER aligns maps better than conventional methods that only consider local densities of maps. MAINMAST is a de novo protein modeling tool designed for EM maps with resolution of 3-5 Å or better. MAINMAST builds a protein main chain directly from a density map by tracing dense points in an EM map and connecting them using a tree-graph structure. This article describes how to use these two tools using three illustrative modeling examples. © 2022 The Authors. Current Protocols published by Wiley Periodicals LLC. Basic Protocol 1: Protein structure model fitting using VESPER. Alternate Protocol: Atomic model fitting using VESPER web server. Basic Protocol 2: Protein de novo modeling using MAINMAST.
Affiliation(s)
- Eman Alnabati
- Department of Computer Science, Purdue University, West Lafayette, Indiana
| | - Genki Terashi
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana
| | - Daisuke Kihara
- Department of Computer Science, Purdue University, West Lafayette, Indiana
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana
|
23
|
From single-omics to interactomics: How can ligand-induced perturbations modulate single-cell phenotypes? ADVANCES IN PROTEIN CHEMISTRY AND STRUCTURAL BIOLOGY 2022; 131:45-83. [PMID: 35871896 DOI: 10.1016/bs.apcsb.2022.05.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/26/2023]
Abstract
Cells suffer from perturbations by different stimuli, which, consequently, give rise to individual alterations in their profile and function that may end up affecting the tissue as a whole. This is no different if we consider the effect of a therapeutic agent on a biological system. As cells are exposed to external ligands their profile can change at different single-omics levels. Detecting how these changes take place through different sequencing technologies is key to a better understanding of the effects of therapeutic agents. Single-cell RNA-sequencing stands out as one of the most common approaches for cell profiling and perturbation analysis. As a result, single-cell transcriptomics data can be integrated with other omics data sources, such as proteomics and epigenomics data, to clarify the perturbation effects and mechanism at the cell level. Appropriate computational tools are key to process and integrate the available information. This chapter focuses on the recent advances on ligand-induced perturbation and single-cell omics computational tools and algorithms, their current limitations, and how the deluge of data can be used to improve the current process of drug research and development.
|
24
|
Hyperspectral Image Classification with Imbalanced Data Based on Semi-Supervised Learning. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12083943] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/10/2022]
Abstract
Hyperspectral remote sensing image classification has been widely employed for numerous applications, such as environmental monitoring, agriculture, and mineralogy. During such classification, the number of training samples in each class often varies significantly. This imbalance in the dataset is often not identified because most classifiers are designed under a balanced dataset assumption, which can distort the minority classes or even treat them as noise. This may lead to biased and inaccurate classification results. This issue can be alleviated by applying preprocessing techniques that enable a uniform distribution of the imbalanced data for further classification. However, it is difficult to add new natural features to a training model by artificial combination of samples by using existing preprocessing techniques. For minority classes with sparse samples, the addition of sufficient natural features can effectively alleviate bias and improve the generalization. For such an imbalanced problem, semi-supervised learning is a creative solution that utilizes the rich natural features of unlabeled data, which can be collected at a low cost in the remote sensing classification. In this paper, we propose a novel semi-supervised learning-based preprocessing solution called NearPseudo. In NearPseudo, pseudo-labels are created by the initialization classifier and added to minority classes with the corresponding unlabeled samples. Simultaneously, to increase reliability and reduce the misclassification cost of pseudo-labels, we created a feedback mechanism based on a consistency check to effectively select the unlabeled data and its pseudo-labels. Experiments were conducted on a state-of-the-art representative hyperspectral dataset to verify the proposed method. The experimental results demonstrate that NearPseudo can achieve better classification accuracy than other common processing methods. 
Furthermore, it can be flexibly applied to most typical classifiers to improve their classification accuracy. With the intervention of NearPseudo, the accuracy of random forest, k-nearest neighbors, logistic regression, and classification and regression tree increased by 1.8%, 4.0%, 6.4%, and 3.7%, respectively. This study addresses a research gap by tackling the limitations that imbalanced data impose on hyperspectral image classification.
|
25
|
Ofusa K, Chijimatsu R, Ishii H. Techniques to detect epitranscriptomic marks. Am J Physiol Cell Physiol 2022; 322:C787-C793. [PMID: 35294846 DOI: 10.1152/ajpcell.00460.2021] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
Abstract
Similar to epigenetic DNA modification, RNA can be methylated and altered for stability and processing. RNA modifications, i.e., epitranscriptomes, involve three functions: writing, erasing, and reading of marks. Methods for measurement and position detection are useful for the assessment of cellular function and human disease biomarkers. Since the first detection of pyrimidine 5-methylcytosine a hundred years ago, numerous techniques have been developed to study the modifications of nucleotides, including RNAs. Recent studies focused on high throughput and direct measurements to investigate the precise function of epitranscriptomes, including the characterization of SARS-CoV-2. The current work presents an overview of the development of detection techniques for epitranscriptomic marks and updates recent progress in the related field.
Affiliation(s)
- Ken Ofusa
- Prophoenix Division, Food and Life-Science Laboratory, Idea Consultants, Inc., Osaka-city, Osaka, Japan; Center of Medical Innovation and Translational Research, Osaka University Graduate School of Medicine, Suita, Osaka, Japan
| | - Ryota Chijimatsu
- Center of Medical Innovation and Translational Research, Osaka University Graduate School of Medicine, Suita, Osaka, Japan
| | - Hideshi Ishii
- Center of Medical Innovation and Translational Research, Osaka University Graduate School of Medicine, Suita, Osaka, Japan
|
26
|
Wu JG, Yan Y, Zhang DX, Liu BW, Zheng QB, Xie XL, Liu SQ, Ge SX, Hou ZG, Xia NS. Machine Learning for Structure Determination in Single-Particle Cryo-Electron Microscopy: A Systematic Review. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:452-472. [PMID: 34932487 DOI: 10.1109/tnnls.2021.3131325] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
Recently, single-particle cryo-electron microscopy (cryo-EM) has become an indispensable method for determining macromolecular structures at high resolution to deeply explore the relevant molecular mechanism. Its recent breakthrough is mainly because of the rapid advances in hardware and image processing algorithms, especially machine learning. As an essential support of single-particle cryo-EM, machine learning has powered many aspects of structure determination and greatly promoted its development. In this article, we provide a systematic review of the applications of machine learning in this field. Our review begins with a brief introduction of single-particle cryo-EM, followed by the specific tasks and challenges of its image processing. Then, focusing on the workflow of structure determination, we describe relevant machine learning algorithms and applications at different steps, including particle picking, 2-D clustering, 3-D reconstruction, and other steps. As different tasks exhibit distinct characteristics, we introduce the evaluation metrics for each task and summarize their dynamics of technology development. Finally, we discuss the open issues and potential trends in this promising field.
|
27
|
Wood DM, Dobson RC, Horne CR. Using cryo-EM to uncover mechanisms of bacterial transcriptional regulation. Biochem Soc Trans 2021; 49:2711-2726. [PMID: 34854920 PMCID: PMC8786299 DOI: 10.1042/bst20210674] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/01/2021] [Revised: 11/10/2021] [Accepted: 11/15/2021] [Indexed: 11/17/2022]
Abstract
Transcription is the principal control point for bacterial gene expression, and it enables a global cellular response to an intracellular or environmental trigger. Transcriptional regulation is orchestrated by transcription factors, which activate or repress transcription of target genes by modulating the activity of RNA polymerase. Dissecting the nature and precise choreography of these interactions is essential for developing a molecular understanding of transcriptional regulation. While the contribution of X-ray crystallography has been invaluable, the 'resolution revolution' of cryo-electron microscopy has transformed our structural investigations, enabling large, dynamic and often transient transcription complexes to be resolved that in many cases had resisted crystallisation. In this review, we highlight the impact cryo-electron microscopy has had in gaining a deeper understanding of transcriptional regulation in bacteria. We also provide readers working within the field with an overview of the recent innovations available for cryo-electron microscopy sample preparation and image reconstruction of transcription complexes.
Affiliation(s)
- David M. Wood
- Biomolecular Interaction Centre and School of Biological Sciences, University of Canterbury, Christchurch, New Zealand
| | - Renwick C.J. Dobson
- Biomolecular Interaction Centre and School of Biological Sciences, University of Canterbury, Christchurch, New Zealand
- Bio21 Molecular Science and Biotechnology Institute, Department of Biochemistry and Pharmacology, University of Melbourne, Parkville, VIC, Australia
| | - Christopher R. Horne
- Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, VIC 3052, Australia
- Department of Medical Biology, University of Melbourne, Parkville, VIC 3052, Australia
|
28
|
Chang WH, Huang SH, Lin HH, Chung SC, Tu IP. Cryo-EM Analyses Permit Visualization of Structural Polymorphism of Biological Macromolecules. FRONTIERS IN BIOINFORMATICS 2021; 1:788308. [PMID: 36303748 PMCID: PMC9580929 DOI: 10.3389/fbinf.2021.788308] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/02/2021] [Accepted: 11/16/2021] [Indexed: 11/13/2022] Open
Abstract
The functions of biological macromolecules are often associated with the conformational malleability of their structures. The phenomenon of chemically identical molecules adopting different structures is termed structural polymorphism. Conventionally, structural polymorphism is observed directly at the density map level by X-ray crystallography. Although the crystallographic approach can report the conformation of a macromolecule with the position of each atom accurately defined, the exploration of structural polymorphism and the interpretation of biological function in terms of crystal structures are largely constrained by crystal packing. An alternative approach to studying the macromolecule of interest in solution is thus desirable. With advances in instrumentation and in computational methods for image analysis and reconstruction, cryo-electron microscopy (cryo-EM) can now routinely produce “in solution” structures of macromolecules at resolutions comparable to crystallography, but without the need for crystals. Since sample preparation for single-particle cryo-EM allows all forms co-existing in solution to be frozen simultaneously, the image data contain rich information about structural polymorphism. The ensemble of structural information can subsequently be disentangled through three-dimensional (3D) classification analyses. In this review, we highlight important examples of protein structural polymorphism in relation to allostery, subunit cooperativity and functional plasticity recently revealed by cryo-EM analyses, and review recent developments in 3D classification algorithms, including neural network/deep learning approaches, that enable cryo-EM analyses in this regard. Finally, we briefly survey the frontier of cryo-EM structure determination of RNA molecules, where the resolution of structural polymorphism is just beginning.
Affiliation(s)
- Wei-Hau Chang
- Institute of Chemistry, Academia Sinica, Taipei, Taiwan
- *Correspondence: Wei-Hau Chang
| | | | - Hsin-Hung Lin
- Institute of Chemistry, Academia Sinica, Taipei, Taiwan
| | - Szu-Chi Chung
- Department of Applied Mathematics, National Sun Yat-sen University, Kaohsiung, Taiwan
| | - I-Ping Tu
- Institute of Statistical Science, Academia Sinica, Taipei, Taiwan
| |
|
29
|
Mu Y, Sazzed S, Alshammari M, Sun J, He J. A Tool for Segmentation of Secondary Structures in 3D Cryo-EM Density Map Components Using Deep Convolutional Neural Networks. FRONTIERS IN BIOINFORMATICS 2021; 1:710119. [PMID: 36303800 PMCID: PMC9581063 DOI: 10.3389/fbinf.2021.710119] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/15/2021] [Accepted: 09/28/2021] [Indexed: 07/20/2023] Open
Abstract
Although cryo-electron microscopy (cryo-EM) has been successfully used to derive atomic structures for many proteins, it is still challenging to derive atomic structures when the resolution of cryo-EM density maps is in the medium range, such as 5-10 Å. Detection of protein secondary structures, such as helices and β-sheets, from cryo-EM density maps provides constraints for deriving atomic structures from such maps. As more deep learning methodologies are developed for solving various molecular problems, effective tools are needed for users to access them. We have developed a software bundle, DeepSSETracer, for the detection of protein secondary structure from cryo-EM component maps at medium resolution. The bundle contains the network architecture and a U-Net model trained with a curriculum and gradient of episodic memory (GEM). The bundle integrates the deep neural network with the visualization capacity provided in ChimeraX. Using a Linux server that is remotely accessed by Windows users, it takes about 6 s on one CPU and one GPU for the trained deep neural network to detect secondary structures in a cryo-EM component map containing 446 amino acids. A test using 28 chain components of cryo-EM maps shows overall residue-level F1 scores of 0.72 and 0.65 for detecting helices and β-sheets, respectively. Although deep learning applications are built on software frameworks such as PyTorch and TensorFlow, our pioneering work here shows that integrating deep learning applications with ChimeraX is a promising and effective approach. Our experiments show that the F1 score measured at the residue level is an effective evaluation of secondary structure detection for individual classes. The test using 28 cryo-EM component maps shows that DeepSSETracer detects β-sheets more accurately than Emap2sec+, with weighted average residue-level F1 scores of 0.65 and 0.42, respectively. It also shows that Emap2sec+ detects helices more accurately than DeepSSETracer, with weighted average residue-level F1 scores of 0.77 and 0.72, respectively.
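The residue-level F1 scores quoted in this abstract can be illustrated with a minimal sketch. The one-letter labels (`H`, `E`, `C`), the function names, and the toy chains are assumptions made for illustration, and the abstract does not say how the weighted average is formed; weighting each chain by its number of true residues of the class is one plausible choice, not necessarily the authors'.

```python
def f1_for_class(true_labels, pred_labels, cls):
    """Residue-level F1 for a single class, e.g. 'H' (helix) or 'E' (sheet)."""
    pairs = list(zip(true_labels, pred_labels))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)  # correctly detected
    fp = sum(1 for t, p in pairs if t != cls and p == cls)  # wrongly detected
    fn = sum(1 for t, p in pairs if t == cls and p != cls)  # missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def weighted_average_f1(chains, cls):
    """Average per-chain F1, weighted by true residues of `cls` (assumed scheme)."""
    total_weight = 0
    weighted_sum = 0.0
    for true_labels, pred_labels in chains:
        weight = sum(1 for t in true_labels if t == cls)
        weighted_sum += weight * f1_for_class(true_labels, pred_labels, cls)
        total_weight += weight
    return weighted_sum / total_weight if total_weight else 0.0


# Hypothetical toy data: (true, predicted) label strings for two chains.
chains = [("HHHHEEEECC", "HHHEEEEECC"), ("HH", "HC")]
helix_f1 = weighted_average_f1(chains, "H")
sheet_f1 = weighted_average_f1(chains, "E")
```

Scoring each class separately, as here, is what lets a comparison report that one tool is better on helices while another is better on β-sheets.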
Affiliation(s)
| | | | | | | | - Jing He
- *Correspondence: Jing He; Jiangwen Sun
| |
|
30
|
Zumbado-Corrales M, Esquivel-Rodríguez J. EvoSeg: Automated Electron Microscopy Segmentation through Random Forests and Evolutionary Optimization. Biomimetics (Basel) 2021; 6:biomimetics6020037. [PMID: 34206006 PMCID: PMC8293153 DOI: 10.3390/biomimetics6020037] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/16/2021] [Revised: 05/17/2021] [Accepted: 05/28/2021] [Indexed: 11/30/2022] Open
Abstract
Electron microscopy maps are key to the study of bio-molecular structures, ranging from near-atomic level to the sub-cellular range. These maps describe the envelopes that cover a possibly very large number of proteins that form molecular machines within the cell. Within those envelopes, we are interested in finding which regions correspond to specific proteins, so that we can understand how they function and design drugs that enhance or suppress a process they are involved in, along with other experimental purposes. A classic way to begin exploring map regions is to apply a segmentation algorithm. This yields a mask in which each voxel in 3D space is assigned an identifier that maps it to a segment; an ideal segmentation would map each segment to one protein unit, which is rarely the case. In this work, we present a method that uses bio-inspired optimization, through an Evolutionary-Optimized Segmentation algorithm, to iteratively improve upon baseline segments obtained from a classical approach called watershed segmentation. The cost function used by the evolutionary optimization is based on an ideal segmentation classifier trained as part of this development, which uses basic structural information available to scientists, such as the number of expected units, volume and topology. We show that a basic initial segmentation combined with this additional information allows our evolutionary method to find better segmentation results than the baseline generated by the watershed.
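The idea of iteratively improving an over-segmented mask against a cost function can be sketched with a toy example. This is not the EvoSeg algorithm: a 1D list of labels stands in for the 3D watershed mask, a hand-written cost (distance from the expected number of units plus a size-imbalance term) stands in for the trained classifier, and all names are hypothetical.

```python
import random


def segment_sizes(labels):
    """Count voxels per segment identifier."""
    sizes = {}
    for lab in labels:
        sizes[lab] = sizes.get(lab, 0) + 1
    return sizes


def cost(labels, expected_units):
    """Toy cost: wrong segment count plus relative size imbalance (assumed)."""
    sizes = list(segment_sizes(labels).values())
    mean = sum(sizes) / len(sizes)
    variance = sum((s - mean) ** 2 for s in sizes) / len(sizes)
    return abs(len(sizes) - expected_units) + variance / (mean ** 2)


def mutate(labels, rng):
    """Merge one randomly chosen pair of neighbouring segments."""
    boundaries = [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]
    if not boundaries:
        return list(labels)
    i = rng.choice(boundaries)
    old, new = labels[i], labels[i - 1]
    return [new if lab == old else lab for lab in labels]


def evolve(labels, expected_units, generations=200, seed=0):
    """Keep a mutated candidate only when it strictly lowers the cost."""
    rng = random.Random(seed)
    best = list(labels)
    best_cost = cost(best, expected_units)
    for _ in range(generations):
        candidate = mutate(best, rng)
        candidate_cost = cost(candidate, expected_units)
        if candidate_cost < best_cost:
            best, best_cost = candidate, candidate_cost
    return best, best_cost


# Hypothetical over-segmented baseline: 5 segments where 2 units are expected.
initial = [0, 0, 1, 1, 1, 2, 3, 3, 3, 3, 4, 4]
best, best_cost = evolve(initial, expected_units=2)
```

A real pipeline would operate on a 3D label volume and allow richer mutations than merging; the sketch only shows how prior knowledge such as the expected number of units can steer the search.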
|
31
|
Wang X, Alnabati E, Aderinwale TW, Maddhuri Venkata Subramaniya SR, Terashi G, Kihara D. Detecting protein and DNA/RNA structures in cryo-EM maps of intermediate resolution using deep learning. Nat Commun 2021; 12:2302. [PMID: 33863902 PMCID: PMC8052361 DOI: 10.1038/s41467-021-22577-3] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 08/25/2020] [Accepted: 03/19/2021] [Indexed: 12/21/2022] Open
Abstract
An increasing number of density maps of macromolecular structures, including proteins and DNA/RNA complexes, have been determined by cryo-electron microscopy (cryo-EM). Although maps at near-atomic resolution are now routinely reported, a substantial fraction of maps are still determined at intermediate or low resolution, where extracting structural information is not trivial. Here, we report a new computational method, Emap2sec+, which identifies DNA or RNA as well as the secondary structures of proteins in cryo-EM maps of 5 to 10 Å resolution. Emap2sec+ employs a deep residual convolutional neural network. It assigns structural labels with associated probabilities to each voxel in a cryo-EM map, which helps structure modeling in the map. When tested on simulated and experimental maps, Emap2sec+ showed stable and high assignment accuracy for nucleotides in low-resolution maps and better performance for protein secondary structure assignment than its earlier version.
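Assigning a structural label with an associated probability to each voxel can be pictured with a short sketch. The class names, the use of a softmax over raw network scores, and the function names are assumptions for illustration; the abstract does not describe how Emap2sec+ produces its probabilities.

```python
import math

# Hypothetical class set; the abstract mentions protein secondary
# structures and DNA/RNA but does not list the exact labels.
CLASSES = ("helix", "sheet", "other_protein", "nucleic_acid")


def softmax(scores):
    """Turn raw per-class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_voxel(scores):
    """Return the most probable class for one voxel and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs[best]


# Hypothetical raw scores for a single voxel.
label, probability = label_voxel([0.1, 0.2, 0.0, 3.0])
```

Keeping the probability alongside the label lets a downstream modeling step down-weight voxels where the assignment is uncertain.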
Affiliation(s)
- Xiao Wang
- Department of Computer Science, Purdue University, West Lafayette, IN, USA
| | - Eman Alnabati
- Department of Computer Science, Purdue University, West Lafayette, IN, USA
| | - Tunde W Aderinwale
- Department of Computer Science, Purdue University, West Lafayette, IN, USA
| | | | - Genki Terashi
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
| | - Daisuke Kihara
- Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
| |
|