1
Ding HJ, Oikonomou CM, Jensen GJ. The Caltech Tomography Database and Automatic Processing Pipeline. J Struct Biol 2015; 192:279-86. [PMID: 26087141 DOI: 10.1016/j.jsb.2015.06.016] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 03/20/2015] [Revised: 06/11/2015] [Accepted: 06/13/2015] [Indexed: 10/23/2022]
Abstract
Here we describe the Caltech Tomography Database and automatic image processing pipeline, designed to process, store, display, and distribute electron tomographic data including tilt-series, sample information, data collection parameters, 3D reconstructions, correlated light microscope images, snapshots, segmentations, movies, and other associated files. Tilt-series are typically uploaded automatically during collection to a user's "Inbox" and processed automatically, but can also be entered and processed in batches via scripts or file-by-file through an internet interface. As with the video website YouTube, each tilt-series is represented on the browsing page with a link to the full record, a thumbnail image and a video icon that delivers a movie of the tomogram in a pop-out window. Annotation tools allow users to add notes and snapshots. The database is fully searchable, and sets of tilt-series can be selected and re-processed, edited, or downloaded to a personal workstation. The results of further processing and snapshots of key results can be recorded in the database, automatically linked to the appropriate tilt-series. While the database is password-protected for local browsing and searching, datasets can be made public and individual files can be shared with collaborators over the Internet. Together these tools facilitate high-throughput tomography work by both individuals and groups.
Affiliation(s)
- H Jane Ding
- Division of Biology, California Institute of Technology, 1200 E. California Blvd., Pasadena, CA 91125, United States
- Catherine M Oikonomou
- Division of Biology, California Institute of Technology, 1200 E. California Blvd., Pasadena, CA 91125, United States
- Grant J Jensen
- Division of Biology, California Institute of Technology, 1200 E. California Blvd., Pasadena, CA 91125, United States; Howard Hughes Medical Institute, United States.
3
Ludtke SJ, Nason L, Tu H, Peng L, Chiu W. Object oriented database and electronic notebook for transmission electron microscopy. Microsc Microanal 2003; 9:556-565. [PMID: 14750990 DOI: 10.1017/s1431927603030575] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.5] [Indexed: 05/24/2023]
Abstract
As high-resolution biological transmission electron microscopy (TEM) has increased in popularity over recent years, the volume of data and number of projects underway has risen dramatically. A robust tool for effective data management is essential to efficiently process large data sets and extract maximum information from the available data. We present the Electron Microscopy Electronic Notebook (EMEN), a portable, object-oriented, web-based tool for TEM data archival and project management. EMEN has several unique features. First, the database is logically organized and annotated so multiple collaborators at different geographical locations can easily access and interpret the data without assistance. Second, the database was designed to provide flexibility to the user, so it can be used much as a lab notebook would be, while maintaining a structure suitable for data mining and direct interaction with data-processing software. Finally, as an object-oriented database, the database structure is dynamic and can be easily extended to incorporate information not defined in the original database specification.
Affiliation(s)
- Steven J Ludtke
- National Center for Macromolecular Imaging, Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX 77030, USA
4
Martone ME, Gupta A, Wong M, Qian X, Sosinsky G, Ludäscher B, Ellisman MH. A cell-centered database for electron tomographic data. J Struct Biol 2002; 138:145-55. [PMID: 12160711 DOI: 10.1016/s1047-8477(02)00006-0] [Citation(s) in RCA: 93] [Impact Index Per Article: 4.0] [Indexed: 11/20/2022]
Abstract
Electron tomography is providing a wealth of 3D structural data on biological components ranging from molecules to cells. We are developing a web-accessible database tailored to high-resolution cellular level structural and protein localization data derived from electron tomography. The Cell Centered Database or CCDB is built on an object-relational framework using Oracle 8i and is housed on a server at the San Diego Supercomputer Center at the University of California, San Diego. Data can be deposited and accessed via a web interface. Each volume reconstruction is stored with a full set of descriptors along with tilt images and any derived products such as segmented objects and animations. Tomographic data are supplemented by high-resolution light microscopic data in order to provide correlated data on higher-order cellular and tissue structure. Every object segmented from a reconstruction is included as a distinct entity in the database along with measurements such as volume, surface area, diameter, and length and amount of protein labeling, allowing the querying of image-specific attributes. Data sets obtained in response to a CCDB query are retrieved via the Storage Resource Broker, a data management system for transparent access to local and distributed data collections. The CCDB is designed to provide a resource for structural biologists and to make tomographic data sets available to the scientific community at large.
Affiliation(s)
- Maryann E Martone
- National Center for Microscopy and Imaging Research, Center for Research in Biological Structure and Department of Neurosciences, University of California, San Diego, La Jolla, 92093-0608, USA.
5
Machtynger J, Shotton DM. VANQUIS, a system for the interactive semantic content analysis and spatio-temporal query by content of videos. J Microsc 2002; 205:43-52. [PMID: 11856380 DOI: 10.1046/j.0022-2720.2001.00967.x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 11/20/2022]
Abstract
Using the video metadata descriptors and data model defined in the accompanying paper (Shotton, D. M. et al. (2002) A metadata classification schema for semantic content analysis of videos. J. Microsc. 205, 33-42), we discuss how analysis of the content of scientific videos, and subsequent query by content of the resulting semantic metadata, can be enhanced by the use of an object-relational database. We illustrate this by describing VANQUIS, a Web-based prototype video analysis and query interface system for the interactive spatio-temporal analysis and subsequent query by content of videos. Using VANQUIS to generate standard SQL (structured query language) statements that address complex data types stored in an object-relational database, relationships between characters and events contained within and between videos can be identified, and the appropriate video segments containing these characters and events can be retrieved for viewing. We give examples of analysis and query implementation by using VANQUIS to analyse a biological microscopy video, and discuss the wider potential of this methodology for the analysis and query by content of videos containing more general subject matter.
Affiliation(s)
- J Machtynger
- Institute of Cognitive Neuroscience, University College London, UK
6
Shotton DM, Rodríguez A, Guil N, Trelles O. A metadata classification schema for semantic content analysis of videos. J Microsc 2002; 205:33-42. [PMID: 11856379 DOI: 10.1046/j.0022-2720.2001.00966.x] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/20/2022]
Abstract
Simple ancillary metadata, such as those encompassed by the 15 elements of the Dublin Core, may be sufficient and entirely appropriate for basic coarse-granularity cross-domain resource discovery. However, they are insufficient and inappropriate for content description of complex data types such as videos, which require more detailed relational models. We propose a metadata classification schema for the characterization of items and events in videos that permits subsequent query by content. Following MPEG-7 nomenclature, metadata intrinsic to the information content of the video are defined as either structural or semantic, where structural metadata are numerical feature primitives produced by analysing the colour, shape, texture, structure and motion within the video frames, whereas semantic metadata describe the locations and timings of individual items and particular actions or events in the video, and are thus of higher information value. In this paper, the semantic metadata required to describe the visual information content of videos are defined and classified into four distinct classes: Media Entities; Content Items; Events; and Supplementary Items, and three types of property tables are defined: Identity Tables; Spatio-Temporal Position Tables; and Event Tables, in which these metadata may be stored in a relational database.
Affiliation(s)
- D M Shotton
- Image Bioinformatics Laboratory, Department of Zoology, University of Oxford, UK.
7
Carazo JM, Stelzer EH. The BioImage Database Project: organizing multidimensional biological images in an object-relational database. J Struct Biol 1999; 125:97-102. [PMID: 10222266 DOI: 10.1006/jsbi.1999.4103] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.2] [Indexed: 11/22/2022]
Abstract
The BioImage Database Project collects and structures multidimensional data sets recorded by various microscopic techniques relevant to modern life sciences. It provides, as precisely as possible, the circumstances in which the sample was prepared and the data were recorded. It grants access to the actual data and maintains links between related data sets. In order to promote the interdisciplinary approach of modern science, it offers a large set of key words, which covers essentially all aspects of microscopy. Nonspecialists can, therefore, access and retrieve significant information recorded and submitted by specialists in other areas. A key issue of the undertaking is to exploit the available technology and to provide a well-defined yet flexible structure for dealing with data. Its pivotal element is, therefore, a modern object relational database that structures the metadata and ameliorates the provision of a complete service. The BioImage database can be accessed through the Internet.
Affiliation(s)
- J M Carazo
- Centro Nacional de Biotecnología-CSIC, Campus Universidad Autonoma, Madrid, E-28049, Spain
8
Pittet JJ, Henn C, Engel A, Heymann JB. Visualizing 3D data obtained from microscopy on the Internet. J Struct Biol 1999; 125:123-32. [PMID: 10222269 DOI: 10.1006/jsbi.1998.4075] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.7] [Indexed: 11/22/2022]
Abstract
The Internet is a powerful communication medium increasingly exploited by business and science alike, especially in structural biology and bioinformatics. The traditional presentation of static two-dimensional images of real-world objects on the limited medium of paper can now be shown interactively in three dimensions. Many facets of this new capability have already been developed, particularly in the form of VRML (virtual reality modeling language), but there is a need to extend this capability for visualizing scientific data. Here we introduce a real-time isosurfacing node for VRML, based on the marching cube approach, allowing interactive isosurfacing. A second node does three-dimensional (3D) texture-based volume-rendering for a variety of representations. The use of computers in the microscopic and structural biosciences is extensive, and many scientific file formats exist. To overcome the problem of accessing such data from VRML and other tools, we implemented extensions to SGI's IFL (image format library). IFL is a file format abstraction layer defining communication between a program and a data file. These technologies are developed in support of the BioImage project, aiming to establish a database prototype for multidimensional microscopic data with the ability to view the data within a 3D interactive environment.
Affiliation(s)
- J J Pittet
- Maurice E. Müller Institute for Microscopy, Biozentrum, University of Basel, Klingelbergstrasse 70, Basel, 4056, Switzerland
9
de Alarcón PA, Gupta A, Carazo JM. A framework for querying a database for structural information on 3D images of macromolecules: A web-based query-by-content prototype on the BioImage macromolecular server. J Struct Biol 1999; 125:112-22. [PMID: 10222268 DOI: 10.1006/jsbi.1999.4102] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/22/2022]
Abstract
Nowadays we are experiencing a remarkable growth in the number of databases that have become accessible over the Web. However, in a certain number of cases, for example, in the case of BioImage, this information is not of a textual nature, thus posing new challenges in the design of tools to handle these data. In this work, we concentrate on the development of new mechanisms aimed at "querying" these databases of complex data sets by their intrinsic content, rather than by their textual annotations only. We concentrate our efforts on a subset of BioImage containing 3D images (volumes) of biological macromolecules, implementing a first prototype of a "query-by-content" system. In the context of databases of complex data types the term query-by-content makes reference to those data modeling techniques in which user-defined functions aim at "understanding" (to some extent) the informational content of the data sets. In these systems the matching criteria introduced by the user are related to intrinsic features concerning the 3D images themselves, hence, complementing traditional queries by textual key words only. Efficient computational algorithms are required in order to "extract" structural information of the 3D images prior to storing them in the database. Also, easy-to-use interfaces should be implemented in order to obtain feedback from the expert. Our query-by-content prototype is used to construct a concrete query, making use of basic structural features, which are then evaluated over a set of three-dimensional images of biological macromolecules. This experimental implementation can be accessed via the Web at the BioImage server in Madrid, at http://www.bioimage.org/qbc/index.html.
Affiliation(s)
- P A de Alarcón
- Centro Nacional de Biotecnología-CSIC, Campus Universidad Autonoma, Cantoblanco, Madrid, 28049, Spain
10
Boudier T, Shotton DM. Video on the Internet: An introduction to the digital encoding, compression, and transmission of moving image data. J Struct Biol 1999; 125:133-55. [PMID: 10222270 DOI: 10.1006/jsbi.1999.4097] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.6] [Indexed: 11/22/2022]
Abstract
In this paper, we seek to provide an introduction to the fast-moving field of digital video on the Internet, from the viewpoint of the biological microscopist who might wish to store or access videos, for instance in image databases such as the BioImage Database (http://www.bioimage.org). We describe and evaluate the principal methods used for encoding and compressing moving image data for digital storage and transmission over the Internet, which involve compromises between compression efficiency and retention of image fidelity, and describe the existing alternate software technologies for downloading or streaming compressed digitized videos using a Web browser. We report the results of experiments on video microscopy recordings and three-dimensional confocal animations of biological specimens to evaluate the compression efficiencies of the principal video compression-decompression algorithms (codecs) and to document the artefacts associated with each of them. Because MPEG-1 gives very high compression while yet retaining reasonable image quality, these studies lead us to recommend that video databases should store both a high-resolution original version of each video, ideally either uncompressed or losslessly compressed, and a separate edited and highly compressed MPEG-1 preview version that can be rapidly downloaded for interactive viewing by the database user.
Affiliation(s)
- T Boudier
- Department of Zoology, University of Oxford, South Parks Road, Oxford, OX1 3PS, United Kingdom.