1
Jones FM, Arteta C, Zisserman A, Lempitsky V, Lintott CJ, Hart T. Processing citizen science- and machine-annotated time-lapse imagery for biologically meaningful metrics. Sci Data 2020; 7:102. PMID: 32218449; PMCID: PMC7099010; DOI: 10.1038/s41597-020-0442-6.
Abstract
Time-lapse cameras facilitate remote and high-resolution monitoring of wild animal and plant communities, but the image data produced require further processing to be useful. Here we publish pipelines to process raw time-lapse imagery, resulting in count data (number of penguins per image) and 'nearest neighbour distance' measurements. The latter provide useful summaries of colony spatial structure (which can indicate phenological stage) and can be used to detect movement - metrics which could be valuable for a number of different monitoring scenarios, including image capture during aerial surveys. We present two alternative pathways for producing counts: (1) via the Zooniverse citizen science project Penguin Watch and (2) via a computer vision algorithm (Pengbot), and share a comparison of citizen science-, machine learning-, and expert-derived counts. We provide example files for 14 Penguin Watch cameras, generated from 63,070 raw images annotated by 50,445 volunteers. We encourage the use of this large open-source dataset, and the associated processing methodologies, for both ecological studies and continued machine learning and computer vision development.
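The 'nearest neighbour distance' metric described above is simple to state in code. The sketch below is an illustrative, brute-force implementation of the metric itself (function name and example points are our own), not the published processing pipeline:

```python
import math

def nearest_neighbour_distances(points):
    """For each (x, y) point, the distance to its closest other point.

    Brute-force O(n^2) sketch; the published pipelines involve more
    (annotation aggregation, filtering), but the metric itself is this.
    """
    dists = []
    for i, (xi, yi) in enumerate(points):
        best = min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
        dists.append(best)
    return dists

# Three penguins on a line, one unit apart: each individual's nearest
# neighbour is one unit away.
print(nearest_neighbour_distances([(0, 0), (1, 0), (2, 0)]))  # → [1.0, 1.0, 1.0]
```

Summaries of these per-individual distances (e.g. the colony mean) are what track spatial structure over time.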
Affiliation(s)
- Fiona M Jones
- Department of Zoology, University of Oxford, 11a Mansfield Road, Oxford, OX1 3SZ, UK.
- Carlos Arteta
- Department of Engineering Science, University of Oxford, Parks Road, Oxford, OX1 3PJ, UK
- Andrew Zisserman
- Department of Engineering Science, University of Oxford, Parks Road, Oxford, OX1 3PJ, UK
- Victor Lempitsky
- Samsung AI Center, Butyrskiy Val Ulitsa, 10, Moscow, Russia, 125047 & Skolkovo Institute of Science and Technology (Skoltech), Bolshoy Boulevard 30, bld. 1, Moscow, 121205, Russia
- Chris J Lintott
- Zooniverse, Department of Physics, University of Oxford, Denys Wilkinson Building, Keble Road, Oxford, OX1 3RH, UK
- Tom Hart
- Department of Zoology, University of Oxford, 11a Mansfield Road, Oxford, OX1 3SZ, UK.
2
3
Kulikov V, Guo SM, Stone M, Goodman A, Carpenter A, Bathe M, Lempitsky V. DoGNet: A deep architecture for synapse detection in multiplexed fluorescence images. PLoS Comput Biol 2019; 15:e1007012. PMID: 31083649; PMCID: PMC6533009; DOI: 10.1371/journal.pcbi.1007012.
Abstract
Neuronal synapses transmit electrochemical signals between cells through the coordinated action of presynaptic vesicles, ion channels, scaffolding and adapter proteins, and membrane receptors. In situ structural characterization of numerous synaptic proteins simultaneously through multiplexed imaging facilitates a bottom-up approach to synapse classification and phenotypic description. Objective automation of efficient and reliable synapse detection within these datasets is essential for the high-throughput investigation of synaptic features. Convolutional neural networks can solve this generalized problem of synapse detection; however, these architectures require large numbers of training examples to optimize their thousands of parameters. We propose DoGNet, a neural network architecture that closes the gap between classical computer vision blob detectors, such as Difference of Gaussians (DoG) filters, and modern convolutional networks. DoGNet is optimized to analyze highly multiplexed microscopy data. Its small number of training parameters allows DoGNet to be trained with few examples, which facilitates its application to new datasets without overfitting. We evaluate the method on multiplexed fluorescence imaging data from both primary mouse neuronal cultures and mouse cortex tissue slices. We show that DoGNet outperforms convolutional networks with a low-to-moderate number of training examples, and DoGNet is efficiently transferred between datasets collected from separate research groups. DoGNet synapse localizations can then be used to guide the segmentation of individual synaptic protein locations and spatial extents, revealing their spatial organization and relative abundances within individual synapses. The source code is publicly available: https://github.com/kulikovv/dognet.
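DoGNet itself is available at the linked repository; as background, the classical Difference of Gaussians blob detector it generalizes can be sketched in a few lines. The example below is an illustrative numpy-only implementation (all function names and the synthetic image are our own), not DoGNet:

```python
import numpy as np

def gaussian_kernel(sigma):
    # 1D Gaussian kernel, truncated at ~3 sigma and normalized.
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def blur(image, sigma):
    """Separable Gaussian blur: 1D convolution along rows, then columns."""
    k = gaussian_kernel(sigma)
    out = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, image)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, out)

def dog_response(image, sigma, k=1.6):
    # Difference of Gaussians: narrow blur minus wide blur; bright blobs
    # of radius ~sigma produce strong positive responses.
    return blur(image, sigma) - blur(image, k * sigma)

def detect_blobs(image, sigma, threshold):
    """(row, col) positions of local maxima of the DoG response above threshold."""
    r = dog_response(image, sigma)
    peaks = []
    for i in range(1, r.shape[0] - 1):
        for j in range(1, r.shape[1] - 1):
            patch = r[i-1:i+2, j-1:j+2]
            if r[i, j] >= threshold and r[i, j] == patch.max():
                peaks.append((i, j))
    return peaks

# Synthetic image with two bright puncta on a dark background.
img = np.zeros((64, 64))
img[20, 20] = img[40, 45] = 1.0
img = blur(img, 1.5)  # spread the puncta into small blobs
print(detect_blobs(img, sigma=1.5, threshold=0.001))  # → [(20, 20), (40, 45)]
```

DoGNet's contribution is to make the sigmas and mixing weights of stacks of such filters trainable, which is why it needs so few parameters.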
Affiliation(s)
- Syuan-Ming Guo
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Matthew Stone
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Allen Goodman
- Imaging Platform, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, United States of America
- Anne Carpenter
- Imaging Platform, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, United States of America
- Mark Bathe
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
4
Kononenko D, Ganin Y, Sungatullina D, Lempitsky V. Photorealistic Monocular Gaze Redirection Using Machine Learning. IEEE Trans Pattern Anal Mach Intell 2018; 40:2696-2710. PMID: 28809672; DOI: 10.1109/tpami.2017.2737423.
Abstract
We propose a general approach to the gaze redirection problem in images that utilizes machine learning. The idea is to learn to re-synthesize images by training on pairs of images with known disparities between gaze directions. We show that such learning-based re-synthesis can achieve convincing gaze redirection based on monocular input, and that the learned systems generalize well to people and imaging conditions unseen during training. We describe and compare three instantiations of our idea. The first system is based on efficient decision forest predictors and redirects the gaze by a fixed angle in real time (on a single CPU), making it particularly suitable for videoconferencing gaze correction. The second system is based on a deep architecture and allows gaze redirection by a range of angles; it achieves higher photorealism, while being several times slower. The third system is based on real-time decision forests at test time, while using supervision from a "teacher" deep network during training. The third system approaches the quality of the teacher network in our experiments, and thus provides a highly realistic real-time monocular solution to the gaze correction problem. We present in-depth assessment and comparisons of the proposed systems based on quantitative measurements and a user study.
5
Liotti E, Arteta C, Zisserman A, Lui A, Lempitsky V, Grant PS. Crystal nucleation in metallic alloys using x-ray radiography and machine learning. Sci Adv 2018; 4:eaar4004. PMID: 29662954; PMCID: PMC5898834; DOI: 10.1126/sciadv.aar4004.
Abstract
The crystallization of solidifying Al-Cu alloys over a wide range of conditions was studied in situ by synchrotron x-ray radiography, and the data were analyzed using a computer vision algorithm trained using machine learning. The effect of cooling rate and solute concentration on nucleation undercooling, crystal formation rate, and crystal growth rate was measured automatically for thousands of separate crystals, which was impossible to achieve manually. Nucleation undercooling distributions confirmed the efficiency of extrinsic grain refiners and gave support to the widely assumed free growth model of heterogeneous nucleation. We show that crystallization occurred in temporal and spatial bursts associated with a solute-suppressed nucleation zone.
Affiliation(s)
- Enzo Liotti
- Department of Materials, University of Oxford, Oxford OX1 3PH, UK
- Corresponding author.
- Carlos Arteta
- Department of Engineering Science, University of Oxford, Oxford OX1 3PJ, UK
- Andrew Zisserman
- Department of Engineering Science, University of Oxford, Oxford OX1 3PJ, UK
- Andrew Lui
- Department of Materials, University of Oxford, Oxford OX1 3PH, UK
- Patrick S. Grant
- Department of Materials, University of Oxford, Oxford OX1 3PH, UK
6
Ranger BJ, Feigin M, Pestrov N, Zhang X, Lempitsky V, Herr HM, Anthony BW. Motion compensation in a tomographic ultrasound imaging system: Toward volumetric scans of a limb for prosthetic socket design. Annu Int Conf IEEE Eng Med Biol Soc 2016; 2015:7204-7. PMID: 26737954; DOI: 10.1109/embc.2015.7320054.
Abstract
Current methods of prosthetic socket fabrication remain subjective and ineffective at creating an interface to the human body that is both comfortable and functional. Though there has been recent success using methods like magnetic resonance imaging and biomechanical modeling, a low-cost, streamlined, and repeatable process has not been fully demonstrated. Medical ultrasonography, which has significant potential to expand its clinical applications, is being pursued to acquire data that may quantify and improve the design process and fabrication of prosthetic sockets. This paper presents a new multi-modal imaging approach for acquiring volumetric images of a human limb, specifically focusing on how motion of the limb is compensated for using optical imagery.
7
Fincke J, Kuzmin A, Lempitsky V, Anthony B. A single element 3D ultrasound tomography system. Annu Int Conf IEEE Eng Med Biol Soc 2016; 2015:5541-4. PMID: 26737547; DOI: 10.1109/embc.2015.7319647.
Abstract
Over the past decade, substantial effort has been directed toward developing ultrasonic systems for medical imaging. With advances in computational power, previously theorized scanning methods such as ultrasound tomography can now be realized. In this paper, we present the design, error analysis, and initial backprojection images from a single element 3D ultrasound tomography system. The system enables volumetric pulse-echo or transmission imaging of distal limbs. The motivating clinical applications include: improving prosthetic fittings, monitoring bone density, and characterizing muscle health. The system is designed as a flexible mechanical platform for iterative development of algorithms targeting imaging of soft tissue and bone. The mechanical system independently controls movement of two single element ultrasound transducers in a cylindrical water tank. Each transducer can independently circle about the center of the tank as well as move vertically in depth. High resolution positioning feedback (~1μm) and control enables flexible positioning of the transmitter and the receiver around the cylindrical tank; exchangeable transducers enable algorithm testing with varying transducer frequencies and beam geometries. High speed data acquisition (DAQ) through a dedicated National Instruments PXI setup streams digitized data directly to the host PC. System positioning error has been quantified and is within limits for the imaging requirements of the motivating applications.
8
Kuzmin A, Zakrzewski AM, Anthony BW, Lempitsky V. Multi-frame elastography using a handheld force-controlled ultrasound probe. IEEE Trans Ultrason Ferroelectr Freq Control 2015; 62:1486-500. PMID: 26276958; DOI: 10.1109/tuffc.2015.007133.
Abstract
We propose a new method for strain field estimation in quasi-static ultrasound elastography based on matching RF data frames of compressed tissues. The method benefits from using a handheld force-controlled ultrasound probe, which provides the contact force magnitude and therefore improves repeatability of displacement field estimation. The displacement field is estimated in a two-phase manner using triplets of RF data frames consisting of a pre-compression image and two post-compression images obtained with lower and higher compression ratios. First, a reliable displacement field estimate is calculated for the first post-compression frame. Second, we use this displacement estimate to warp the second post-compression frame while using linear elasticity to obtain an initial approximation. Final displacement estimation is refined using the warped image. The two-phase displacement estimation allows for higher compression ratios, thus increasing the practical resolution of the strain estimates. The strain field is computed from a displacement field using a smoothness-regularized energy functional, which takes into consideration local displacement estimation quality. The minimization is performed using an efficient primal-dual hybrid gradient algorithm, which can leverage the architecture of a graphical processing unit. The method is quantitatively evaluated using finite element simulations. We compute strain estimates for tissue-mimicking phantoms with known elastic properties and finally perform a qualitative validation using in vivo patient data.
9
Abstract
A new data structure for efficient similarity search in very large datasets of high-dimensional vectors is introduced. This structure, called the inverted multi-index, generalizes the inverted index idea by replacing the standard quantization within inverted indices with product quantization. For very similar retrieval complexity and pre-processing time, inverted multi-indices achieve a much denser subdivision of the search space compared to inverted indices, while retaining their memory efficiency. Our experiments with large datasets of SIFT and GIST vectors demonstrate that because of the denser subdivision, inverted multi-indices are able to return much shorter candidate lists with higher recall. Augmented with a suitable reranking procedure, multi-indices were able to significantly improve the speed of approximate nearest neighbor search on the dataset of 1 billion SIFT vectors compared to the best previously published systems, while achieving better recall and incurring only a few percent of memory overhead.
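As an illustration of the product quantization idea the inverted multi-index builds on, the toy sketch below quantizes the two halves of a vector against separate codebooks; the resulting pair of codeword indices addresses one of K×K cells, giving the denser subdivision the abstract describes. This is a didactic sketch with made-up codebooks, not the paper's implementation:

```python
import numpy as np

def nearest(codebook, v):
    """Index of the codeword closest to v (squared Euclidean distance)."""
    return int(np.argmin(((codebook - v) ** 2).sum(axis=1)))

def multi_index_cell(vector, codebook_a, codebook_b):
    """Inverted multi-index cell of a vector: quantize each half
    independently, yielding K*K cells from two K-word codebooks."""
    d = len(vector) // 2
    return (nearest(codebook_a, vector[:d]), nearest(codebook_b, vector[d:]))

# Toy 4-D example: two codebooks of K=3 two-dimensional codewords each,
# so the joint index has 3 * 3 = 9 cells, versus the 3 cells a single
# 3-word codebook over the full 4-D space would give.
cb_a = np.array([[0., 0.], [1., 0.], [0., 1.]])
cb_b = np.array([[0., 0.], [0., 1.], [1., 1.]])
print(multi_index_cell(np.array([0.9, 0.1, 0.1, 0.8]), cb_a, cb_b))  # → (1, 1)
```

At query time, cells are visited in increasing order of combined half-distances, which is where the shorter, higher-recall candidate lists come from.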
10
Arteta C, Lempitsky V, Noble JA, Zisserman A. Detecting overlapping instances in microscopy images using extremal region trees. Med Image Anal 2015; 27:3-16. PMID: 25980675; DOI: 10.1016/j.media.2015.03.002.
Abstract
In many microscopy applications the images may contain both regions of low and high cell densities corresponding to different tissues or colonies at different stages of growth. This poses a challenge to most previously developed automated cell detection and counting methods, which are designed to handle either the low-density scenario (through cell detection) or the high-density scenario (through density estimation or texture analysis). The objective of this work is to detect all the instances of an object of interest in microscopy images. The instances may be partially overlapping and clustered. To this end we introduce a tree-structured discrete graphical model that is used to select and label a set of non-overlapping regions in the image by a global optimization of a classification score. Each region is labeled with the number of instances it contains - for example, regions can be selected that contain two or three object instances, by defining separate classes for tuples of objects in the detection process. We show that this formulation can be learned within the structured output SVM framework and that the inference in such a model can be accomplished using dynamic programming on a tree-structured region graph. Furthermore, the learning only requires weak annotations - a dot on each instance. The candidate regions for the selection are obtained as extremal regions of a surface computed from the microscopy image, and we show that the performance of the model can be improved by considering a proxy problem for learning the surface that allows better selection of the extremal regions. Furthermore, we consider a number of variations for the loss function used in the structured output learning. The model is applied and evaluated over six quite disparate data sets of images covering: fluorescence microscopy, weak-fluorescence molecular images, phase contrast microscopy and histopathology images, and is shown to exceed the state of the art in performance.
Affiliation(s)
- Carlos Arteta
- Department of Engineering Science, University of Oxford, Oxford OX1 2JD, UK.
- Victor Lempitsky
- Skolkovo Institute of Science and Technology (Skoltech), Skolkovo 143025 Russia
- J Alison Noble
- Department of Engineering Science, University of Oxford, Oxford OX1 2JD, UK
- Andrew Zisserman
- Department of Engineering Science, University of Oxford, Oxford OX1 2JD, UK
11
12
Boykov Y, Kahl F, Lempitsky V, Schmidt FR. Guest Editorial: Energy Optimization Methods. Int J Comput Vis 2013. DOI: 10.1007/s11263-013-0637-9.
13
Abstract
Hough transform-based methods for detecting multiple objects use nonmaxima suppression or mode seeking to locate and distinguish peaks in Hough images. Such postprocessing requires the tuning of many parameters and is often fragile, especially when objects are located spatially close to each other. In this paper, we develop a new probabilistic framework for object detection which is related to the Hough transform. It shares the simplicity and wide applicability of the Hough transform but, at the same time, bypasses the problem of multiple peak identification in Hough images and permits detection of multiple objects without invoking nonmaximum suppression heuristics. Our experiments demonstrate that this method results in a significant improvement in detection accuracy both for the classical task of straight line detection and for a more modern category-level (pedestrian) detection problem.
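For contrast with the probabilistic framework described here, the classical Hough accumulator for straight lines, which does require subsequent peak identification, can be sketched as follows (a toy implementation in our own notation, not the paper's method):

```python
import math

def hough_lines(points, n_theta=180, rho_step=1.0, rho_max=100.0):
    """Classical Hough accumulator for straight lines.

    Each point votes for every (theta, rho) line through it, using the
    normal form rho = x*cos(theta) + y*sin(theta).  Peaks in the
    accumulator correspond to lines supported by many points; locating
    and separating those peaks is the fragile postprocessing step the
    probabilistic framework above avoids.
    """
    n_rho = int(2 * rho_max / rho_step)
    acc = [[0] * n_rho for _ in range(n_theta)]
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            r = int((rho + rho_max) / rho_step)
            if 0 <= r < n_rho:
                acc[t][r] += 1
    return acc

# Five collinear points on the line y = x.  In normal form this line is
# theta = 135 degrees, rho = 0, i.e. accumulator bin (135, 100).
pts = [(i, i) for i in range(5)]
acc = hough_lines(pts)
print(acc[135][100])  # → 5: all five points vote for this bin
```

Note that nearby bins also collect many votes, which is exactly why nonmaximum suppression and its tuning become necessary in the classical scheme.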
Affiliation(s)
- Olga Barinova
- Lomonosov Moscow State University, Molodezhnaya str. 111, 119296 Moscow, Russia.
14
Gall J, Yao A, Razavi N, Van Gool L, Lempitsky V. Hough forests for object detection, tracking, and action recognition. IEEE Trans Pattern Anal Mach Intell 2011; 33:2188-2202. PMID: 21464503; DOI: 10.1109/tpami.2011.70.
Abstract
The paper introduces Hough forests, which are random forests adapted to perform a generalized Hough transform in an efficient way. Compared to previous Hough-based systems such as implicit shape models, Hough forests improve the performance of the generalized Hough transform for object detection on a categorical level. At the same time, their flexibility permits extensions of the Hough transform to new domains such as object tracking and action recognition. Hough forests can be regarded as task-adapted codebooks of local appearance that allow fast supervised training and fast matching at test time. They achieve high detection accuracy since the entries of such codebooks are optimized to cast Hough votes with small variance and since their efficiency permits dense sampling of local image patches or video cuboids during detection. The efficacy of Hough forests for a set of computer vision tasks is validated through experiments on a large set of publicly available benchmark data sets and comparisons with the state-of-the-art.
Affiliation(s)
- Juergen Gall
- Department of Information Technology and Electrical Engineering, Computer Vision Laboratory, ETH Zurich, Zurich, Switzerland.
15
Abstract
The efficient application of graph cuts to Markov Random Fields (MRFs) with multiple discrete or continuous labels remains an open question. In this paper, we demonstrate one possible way of achieving this by using graph cuts to combine pairs of suboptimal labelings or solutions. We call this combination process the fusion move. By employing recently developed graph-cut-based algorithms (so-called QPBO-graph cut), the fusion move can efficiently combine two proposal labelings in a theoretically sound way, which is in practice often globally optimal. We demonstrate that fusion moves generalize many previous graph-cut approaches, which allows them to be used as building blocks within a broader variety of optimization schemes than were considered before. In particular, we propose new optimization schemes for computer vision MRFs with applications to image restoration, stereo, and optical flow, among others. Within these schemes the fusion moves are used 1) for the parallelization of MRF optimization into several threads, 2) for fast MRF optimization by combining cheap-to-compute solutions, and 3) for the optimization of highly nonconvex continuous-labeled MRFs with 2D labels. Our final example is a nonvision MRF concerned with cartographic label placement, where fusion moves can be used to improve the performance of a standard inference method (loopy belief propagation).
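A fusion move can be illustrated on a chain-structured MRF, where the binary keep-A-or-take-B problem is solved exactly by dynamic programming; the paper itself uses QPBO graph cuts on general graphs, so this is only a toy sketch with invented costs:

```python
def fuse_chain(labels_a, labels_b, unary, pairwise):
    """Fusion move on a chain MRF: each node keeps its label from proposal
    A or switches to proposal B so as to minimize
        sum_i unary(i, x_i) + sum_i pairwise(x_i, x_{i+1}).
    On a chain the binary choice problem is exactly solvable by dynamic
    programming (standing in for the QPBO graph cut used on grids).
    """
    n = len(labels_a)
    INF = float("inf")
    lab = lambda i, c: labels_a[i] if c == 0 else labels_b[i]  # c: 0=A, 1=B
    cost = [[INF, INF] for _ in range(n)]  # best energy of prefix 0..i
    back = [[0, 0] for _ in range(n)]      # backpointers for traceback
    for c in (0, 1):
        cost[0][c] = unary(0, lab(0, c))
    for i in range(1, n):
        for c in (0, 1):
            for p in (0, 1):
                e = (cost[i-1][p] + pairwise(lab(i-1, p), lab(i, c))
                     + unary(i, lab(i, c)))
                if e < cost[i][c]:
                    cost[i][c], back[i][c] = e, p
    # Trace back the optimal sequence of A/B choices.
    c = 0 if cost[n-1][0] <= cost[n-1][1] else 1
    fused = [0] * n
    for i in range(n - 1, -1, -1):
        fused[i] = lab(i, c)
        c = back[i][c]
    return fused

# Toy example: observations favour labels [0, 0, 5, 5]; proposal A is all
# zeros, proposal B is all fives.  The fusion keeps the best of both.
obs = [0, 0, 5, 5]
unary = lambda i, x: (x - obs[i]) ** 2
pairwise = lambda x, y: 1 if x != y else 0  # weak smoothness prior
print(fuse_chain([0, 0, 0, 0], [5, 5, 5, 5], unary, pairwise))  # → [0, 0, 5, 5]
```

The fused result has lower energy than either proposal, which is the guarantee that makes fusion moves safe building blocks in larger optimization schemes.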
16
Lempitsky V, Verhoek M, Noble JA, Blake A. Random Forest Classification for Automatic Delineation of Myocardium in Real-Time 3D Echocardiography. Functional Imaging and Modeling of the Heart 2009. DOI: 10.1007/978-3-642-01932-6_48.
17