1
Zhu J, Zhang Q, Zhang H, Shi Z, Hu M, Bao C. A minority of final stacks yields superior amplitude in single-particle cryo-EM. Nat Commun 2023; 14:7822. [PMID: 38072910 PMCID: PMC10711021 DOI: 10.1038/s41467-023-43555-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/19/2023] [Accepted: 11/13/2023] [Indexed: 12/18/2023] Open
Abstract
Cryogenic electron microscopy (cryo-EM) is widely used to determine near-atomic resolution structures of biological macromolecules. Due to the low signal-to-noise ratio, cryo-EM relies on averaging many images. However, a crucial question in the field of cryo-EM remains unanswered: how close can we get to the minimum number of particles required to reach a specific resolution in practice? The absence of an answer to this question has impeded progress in understanding sample behavior and the performance of sample preparation methods. To address this issue, we develop an iterative particle sorting and/or sieving method called CryoSieve. Extensive experiments demonstrate that CryoSieve outperforms other cryo-EM particle sorting algorithms, revealing that most particles are unnecessary in final stacks. The minority of particles remaining in the final stacks yield superior high-resolution amplitude in reconstructed density maps. For some datasets, the size of the finest subset approaches the theoretical limit.
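The abstract does not spell out how the sieving works, so the following is only a generic sketch of an iterative sieving loop, not the CryoSieve algorithm itself. The function name `iterative_sieve`, the `scores_fn` callback, and the `keep_fraction` and `n_iterations` parameters are all hypothetical; in a real pipeline the score would presumably be recomputed against a reconstruction from the currently retained particles.

```python
import numpy as np

def iterative_sieve(scores_fn, n_particles, keep_fraction=0.8, n_iterations=5):
    """Generic iterative sieving sketch (hypothetical, not the published method).

    scores_fn: callable taking an array of particle indices and returning one
               quality score per index (higher means better). Assumed to be
               re-evaluated each round on the retained subset.
    Returns the sorted indices of the particles retained after all rounds.
    """
    kept = np.arange(n_particles)
    for _ in range(n_iterations):
        scores = np.asarray(scores_fn(kept), dtype=float)
        # Retain a fixed fraction of the current subset, never fewer than one.
        n_keep = max(1, int(round(keep_fraction * len(kept))))
        best_first = np.argsort(scores)[::-1]
        kept = np.sort(kept[best_first[:n_keep]])
    return kept

# Toy usage: a fixed per-particle quality stands in for a real score.
quality = np.arange(16.0)
retained = iterative_sieve(lambda idx: quality[idx], 16,
                           keep_fraction=0.5, n_iterations=3)
```

With a retention fraction of 0.5, each round halves the stack, so three rounds reduce 16 toy particles to 2.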
Affiliation(s)
- Jianying Zhu
  - Yau Mathematical Sciences Center, Tsinghua University, Beijing, China
- Qi Zhang
  - Key Laboratory of Protein Sciences (Tsinghua University), Ministry of Education, Beijing, China
  - School of Life Science, Tsinghua University, Beijing, China
  - Beijing Advanced Innovation Center for Structural Biology, Beijing, China
  - Beijing Frontier Research Center for Biological Structure, Beijing, China
- Hui Zhang
  - Qiuzhen College, Tsinghua University, Beijing, China
- Zuoqiang Shi
  - Yau Mathematical Sciences Center, Tsinghua University, Beijing, China
  - Yanqi Lake Beijing Institute of Mathematical Sciences and Applications, Beijing, China
- Mingxu Hu
  - Key Laboratory of Protein Sciences (Tsinghua University), Ministry of Education, Beijing, China
  - School of Life Science, Tsinghua University, Beijing, China
  - Beijing Advanced Innovation Center for Structural Biology, Beijing, China
  - Beijing Frontier Research Center for Biological Structure, Beijing, China
  - Shenzhen Academy of Research and Translation, Shenzhen, China
- Chenglong Bao
  - Yau Mathematical Sciences Center, Tsinghua University, Beijing, China
  - Yanqi Lake Beijing Institute of Mathematical Sciences and Applications, Beijing, China
  - State Key Laboratory of Membrane Biology, School of Life Sciences, Tsinghua University, Beijing, China
2
Huang Y, Zhang Y, Ni T. Towards in situ high-resolution imaging of viruses and macromolecular complexes using cryo-electron tomography. J Struct Biol 2023; 215:108000. [PMID: 37467823 DOI: 10.1016/j.jsb.2023.108000] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2023] [Revised: 07/15/2023] [Accepted: 07/16/2023] [Indexed: 07/21/2023]
Abstract
Cryo-electron tomography and subtomogram averaging are rising, fast-evolving imaging techniques for studying biological events, providing structural information at unprecedented resolution while preserving spatial correlation in native contexts. The latest developments in technology and methodology, ranging from sample preparation to data collection and data processing, have enabled significant advances in their application to various biological systems. This review provides an overview of the current technology developments enabling high-resolution structural study in situ, highlighting the use of a priori information about biological samples to assess the quality of the subtomogram averaging pipeline. We exemplify the application of these techniques to understanding viruses and the principles of macromolecule assembly using different biological systems, ranging from in vitro to in situ samples, which provide structural information at different resolutions and in different contexts.
Affiliation(s)
- Yixin Huang
  - School of Biomedical Sciences, LKS Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong Special Administrative Region
- Yu Zhang
  - School of Biomedical Sciences, LKS Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong Special Administrative Region
- Tao Ni
  - School of Biomedical Sciences, LKS Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong Special Administrative Region
3
Sorzano COS, Vilas JL, Ramírez-Aportela E, Krieger J, Del Hoyo D, Herreros D, Fernandez-Giménez E, Marchán D, Macías JR, Sánchez I, Del Caño L, Fonseca-Reyna Y, Conesa P, García-Mena A, Burguet J, García Condado J, Méndez García J, Martínez M, Muñoz-Barrutia A, Marabini R, Vargas J, Carazo JM. Image processing tools for the validation of CryoEM maps. Faraday Discuss 2022; 240:210-227. [PMID: 35861059 DOI: 10.1039/d2fd00059h] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2023]
Abstract
The number of maps deposited in public databases (Electron Microscopy Data Bank, EMDB) determined by cryo-electron microscopy has quickly grown in recent years. With this rapid growth, it is critical to guarantee their quality. So far, map validation has primarily focused on the agreement between maps and models. From the image processing perspective, the validation has been mostly restricted to using two half-maps and the measurement of their internal consistency. In this article, we suggest that map validation can be taken much further from the point of view of image processing if 2D classes, particles, angles, coordinates, defoci, and micrographs are also provided. We present a progressive validation scheme that qualifies a result validation status from 0 to 5 and offers three optional qualifiers (A, W, and O) that can be added. The simplest validation state is 0, while the most complete would be 5AWO. This scheme has been implemented in a website https://biocomp.cnb.csic.es/EMValidationService/ to which reconstructed maps and their ESI can be uploaded.
Affiliation(s)
- C O S Sorzano
  - Natl. Center of Biotechnology, CSIC, c/Darwin, 3, 28049, Madrid, Spain
- J L Vilas
  - Natl. Center of Biotechnology, CSIC, c/Darwin, 3, 28049, Madrid, Spain
- J Krieger
  - Natl. Center of Biotechnology, CSIC, c/Darwin, 3, 28049, Madrid, Spain
- D Del Hoyo
  - Natl. Center of Biotechnology, CSIC, c/Darwin, 3, 28049, Madrid, Spain
- D Herreros
  - Natl. Center of Biotechnology, CSIC, c/Darwin, 3, 28049, Madrid, Spain
- D Marchán
  - Natl. Center of Biotechnology, CSIC, c/Darwin, 3, 28049, Madrid, Spain
- J R Macías
  - Natl. Center of Biotechnology, CSIC, c/Darwin, 3, 28049, Madrid, Spain
- I Sánchez
  - Natl. Center of Biotechnology, CSIC, c/Darwin, 3, 28049, Madrid, Spain
- L Del Caño
  - Natl. Center of Biotechnology, CSIC, c/Darwin, 3, 28049, Madrid, Spain
- Y Fonseca-Reyna
  - Natl. Center of Biotechnology, CSIC, c/Darwin, 3, 28049, Madrid, Spain
- P Conesa
  - Natl. Center of Biotechnology, CSIC, c/Darwin, 3, 28049, Madrid, Spain
- A García-Mena
  - Natl. Center of Biotechnology, CSIC, c/Darwin, 3, 28049, Madrid, Spain
- J Burguet
  - Depto. de Óptica, Univ. Complutense de Madrid, Pl. Ciencias, 1, 28040, Madrid, Spain
- J García Condado
  - Biocruces Bizkaia Instituto Investigación Sanitaria, Cruces Plaza, 48903, Barakaldo, Bizkaia, Spain
- M Martínez
  - Natl. Center of Biotechnology, CSIC, c/Darwin, 3, 28049, Madrid, Spain
- A Muñoz-Barrutia
  - Univ. Carlos III de Madrid, Avda. de la Universidad 30, 28911, Leganés, Madrid, Spain
- R Marabini
  - Escuela Politécnica Superior, Univ. Autónoma de Madrid, CSIC, C. Francisco Tomás y Valiente, 11, 28049, Madrid, Spain
- J Vargas
  - Depto. de Óptica, Univ. Complutense de Madrid, Pl. Ciencias, 1, 28040, Madrid, Spain
- J M Carazo
  - Natl. Center of Biotechnology, CSIC, c/Darwin, 3, 28049, Madrid, Spain
4
Mamizu N, Yasunaga T. Estimation of Projection Parameter Distribution and Initial Model Generation in Single-Particle Analysis. Microscopy (Oxf) 2022; 71:347-356. [PMID: 35904535 DOI: 10.1093/jmicro/dfac039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2022] [Revised: 07/25/2022] [Accepted: 07/29/2022] [Indexed: 11/13/2022] Open
Abstract
This study focused on the problem of projection parameter search in 3D reconstruction using single-particle analysis. We treated the sampling distribution for the parameter search as a prior distribution and designed a probabilistic model for efficient parameter estimation. Using our method, we showed that it is possible to perform 3D reconstruction from synthetic and actual electron microscope images using an initial model, and to generate the initial model itself. We also examined whether the optimization function used in the stochastic gradient descent method can be applied with loose constraints to improve the convergence of initial model generation and confirmed the effect. In order to investigate the advantage of generating a smooth sampling distribution from the stochastic model, we compared the distribution of estimated projection directions with the conventional method of performing a global search using spherical gridding. As a result, our method, which is simple in both mathematical model and implementation, showed no algorithmic artifacts.
Affiliation(s)
- Nobuya Mamizu
  - Imaging Technology Division, System in Frontier Inc., 2-8-3 Shinsuzuharu Bldg.4F Akebono-cho Tachikawa-shi, Tokyo 190-0012
- Takuo Yasunaga
  - Department of Physics and Information Technology, Kyushu Institute of Technology Faculty of Computer Science and Systems Engineering, 680-4 Kawazu Iizuka-shi, Fukuoka 820-8502
5
Sorzano COS, Jiménez-Moreno A, Maluenda D, Martínez M, Ramírez-Aportela E, Krieger J, Melero R, Cuervo A, Conesa J, Filipovic J, Conesa P, del Caño L, Fonseca YC, Jiménez-de la Morena J, Losana P, Sánchez-García R, Strelak D, Fernández-Giménez E, de Isidro-Gómez FP, Herreros D, Vilas JL, Marabini R, Carazo JM. On bias, variance, overfitting, gold standard and consensus in single-particle analysis by cryo-electron microscopy. Acta Crystallogr D Struct Biol 2022; 78:410-423. [PMID: 35362465 PMCID: PMC8972802 DOI: 10.1107/s2059798322001978] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 08/09/2021] [Accepted: 02/18/2022] [Indexed: 12/05/2022] Open
Abstract
Cryo-electron microscopy (cryoEM) has become a well established technique to elucidate the 3D structures of biological macromolecules. Projection images from thousands of macromolecules that are assumed to be structurally identical are combined into a single 3D map representing the Coulomb potential of the macromolecule under study. This article discusses possible caveats along the image-processing path and how to avoid them to obtain a reliable 3D structure. Some of these problems are very well known in the community. These may be referred to as sample-related (such as specimen denaturation at interfaces or non-uniform projection geometry leading to underrepresented projection directions). The rest are related to the algorithms used. While some have been discussed in depth in the literature, such as the use of an incorrect initial volume, others have received much less attention. However, they are fundamental in any data-analysis approach. Chief among them are instabilities in estimating many of the key parameters required for a correct 3D reconstruction; these occur all along the processing workflow and may significantly affect the reliability of the whole process. In the field, the term overfitting has been coined to refer to some particular kinds of artifacts. It is argued that overfitting is a statistical bias in key parameter-estimation steps in the 3D reconstruction process, including intrinsic algorithmic bias. It is also shown that common tools (Fourier shell correlation) and strategies (gold standard) that are normally used to detect or prevent overfitting do not fully protect against it. Alternatively, it is proposed that detecting the bias that leads to overfitting is much easier when addressed at the level of parameter estimation, rather than detecting it once the particle images have been combined into a 3D map. Comparing the results from multiple algorithms (or at least, independent executions of the same algorithm) can detect parameter bias. These multiple executions could then be averaged to give a lower variance estimate of the underlying parameters.
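The consensus idea in the closing sentences can be illustrated with a minimal sketch, assuming each independent run yields an in-plane shift estimate (x, y, in pixels) per particle. The function name `consensus_shifts` and the `max_spread` threshold are hypothetical and are not taken from the article. Shifts are used because they can be averaged arithmetically; orientation angles would need a proper rotation average instead.

```python
import numpy as np

def consensus_shifts(estimates, max_spread=1.0):
    """Compare per-particle shift estimates across independent runs.

    estimates: array-like of shape (n_runs, n_particles, 2), shifts in pixels.
    max_spread: hypothetical tolerance (pixels) on disagreement between runs.
    Returns (mean, spread, stable):
      mean   - per-particle average over runs (the lower-variance estimate)
      spread - largest distance of any single run from that average
      stable - boolean mask, True where the runs agree within max_spread
    """
    estimates = np.asarray(estimates, dtype=float)
    mean = estimates.mean(axis=0)
    spread = np.linalg.norm(estimates - mean, axis=2).max(axis=0)
    stable = spread <= max_spread
    return mean, spread, stable

# Toy data: three runs, two particles. The runs agree on particle 0
# and disagree strongly on particle 1.
runs = [
    [[1.0, 2.0], [0.0, 0.0]],
    [[1.2, 2.0], [5.0, 0.0]],
    [[0.8, 2.0], [-5.0, 0.0]],
]
mean, spread, stable = consensus_shifts(runs)
```

Particles flagged as unstable are the ones whose parameters differ between runs; whether to discard or re-estimate them is a separate decision that this sketch leaves open.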
Affiliation(s)
- C. O. S. Sorzano
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- A. Jiménez-Moreno
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- D. Maluenda
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- M. Martínez
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- E. Ramírez-Aportela
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- J. Krieger
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- R. Melero
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- A. Cuervo
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- J. Conesa
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- P. Conesa
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- L. del Caño
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- Y. C. Fonseca
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- J. Jiménez-de la Morena
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- P. Losana
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- R. Sánchez-García
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- D. Strelak
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
  - Masaryk University, Brno, Czech Republic
- E. Fernández-Giménez
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- F. P. de Isidro-Gómez
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- D. Herreros
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- J. L. Vilas
  - School of Engineering and Applied Science, Yale University, New Haven, CT 06520-829, USA
- R. Marabini
  - Escuela Politecnica Superior, Universidad Autónoma de Madrid, 28049 Cantoblanco, Madrid, Spain
- J. M. Carazo
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain