1. Vo H, Kong J, Teng D, Liang Y, Aji A, Teodoro G, Wang F. MaReIA: A Cloud MapReduce Based High Performance Whole Slide Image Analysis Framework. DISTRIBUTED AND PARALLEL DATABASES 2019; 37:251-272. [PMID: 31217669 PMCID: PMC6583906 DOI: 10.1007/s10619-018-7237-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 06/09/2023]
Abstract
Recent advancements in systematic analysis of high resolution whole slide images have increased the efficiency of diagnosis, prognosis and prediction of cancer and other important diseases. Due to the enormous sizes and dimensions of whole slide images, the analysis requires extensive computing resources which are not commonly available. Images have to be tiled for processing due to computer memory limitations, which leads to inaccurate results because objects crossing tile boundaries are ignored. Thus, we propose a generic and highly scalable cloud-based image analysis framework for whole slide images. The framework enables parallelized integration of image analysis steps, such as segmentation and aggregation of micro-structures in a single pipeline, and generation of final objects manageable by databases. The core concept relies on the abstraction of objects in whole slide images as different classes of spatial geometries, which in turn can be handled as text based records in MapReduce. The framework applies an overlapping partitioning scheme on images, and provides parallelization of tiling and image segmentation based on MapReduce architecture. It further provides robust object normalization and graceful handling of boundary objects with an efficient spatial indexing based matching method to generate accurate results. Our benchmark tests on Amazon EMR show that MaReIA is highly scalable, generic and extremely cost effective.
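The overlapping partitioning idea in this abstract can be illustrated with a short Python sketch. This is not MaReIA's implementation: the function name `overlapping_tiles`, the tile size and the margin are assumptions chosen for illustration. Each tile is padded by a margin so that an object no larger than the margin that crosses a tile boundary still lies entirely inside at least one padded tile.

```python
from typing import Iterator, Tuple

# (x0, y0, x1, y1) in pixels, half-open ranges
Box = Tuple[int, int, int, int]


def overlapping_tiles(width: int, height: int, tile: int, margin: int) -> Iterator[Box]:
    """Yield boxes that cover the image, each padded by `margin` pixels.

    Hypothetical helper: name and parameters are illustrative assumptions.
    """
    if tile <= 0 or margin < 0:
        raise ValueError("tile must be positive and margin non-negative")
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            # Pad the core tile on every side, clipped to the image extent.
            yield (
                max(0, x - margin),
                max(0, y - margin),
                min(width, x + tile + margin),
                min(height, y + tile + margin),
            )


# A 10x6 image split into 4x4 core tiles with a 1-pixel overlap margin.
tiles = list(overlapping_tiles(width=10, height=6, tile=4, margin=1))
```

Objects detected in more than one padded tile would then need to be matched and deduplicated, which is the step the abstract attributes to the spatial indexing based matching method.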
Affiliation(s)
- Hoang Vo
  - Department of Computer Science, Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY
- Jun Kong
  - Department of Biomedical Informatics, Emory University, Atlanta, GA
- Dejun Teng
  - Department of Computer Science and Engineering, Ohio State University, Columbus, OH
- Yanhui Liang
  - Department of Computer Science, Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY
- George Teodoro
  - Department of Computer Science, University of Brasília, Brasília, DF, Brazil
- Fusheng Wang
  - Department of Computer Science, Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY
2. LAI BOCHENG, WU TUNGYU, CHIU TSOUHAN, LI KUNCHUN, LEE CHIAYING, CHIEN WEICHEN, WONG WINGHUNG. Towards High Performance Data Analytic on Heterogeneous Many-core Systems: A Study on Bayesian Sequential Partitioning. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING 2018; 122:36-50. [PMID: 30872894 PMCID: PMC6411309 DOI: 10.1016/j.jpdc.2018.07.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
Bayesian Sequential Partitioning (BSP) is a statistically effective density estimation method for understanding the characteristics of a high dimensional data space. The intensive computation of the statistical model and the counting of enormous amounts of data pose serious design challenges for BSP as data volumes grow. This paper proposes a high performance design of BSP by leveraging a heterogeneous CPU/GPGPU system that consists of a host CPU and a K80 GPGPU. A series of techniques, on both data structures and execution management policies, is implemented to extensively exploit the computation capability of the heterogeneous many-core system and alleviate system bottlenecks. When compared with a parallel design on a high-end CPU, the proposed techniques achieve an average runtime improvement of 48x, while the maximum speedup reaches 78.76x.
Affiliation(s)
- BO-CHENG LAI
  - National Chiao-Tung University, 1001 Da-Hsueh Rd., Hsinchu, Taiwan 30010
- TSOU-HAN CHIU
  - MediaTek Incorporation, No.1, Dusing 1st Rd., Hsinchu Science, Taiwan 300
- KUN-CHUN LI
  - HTC Corporation, No. 23, Xinghua Rd., Taoyuan Dist., Taoyuan City, Taiwan 330
- CHIA-YING LEE
  - MediaTek Incorporation, No.1, Dusing 1st Rd., Hsinchu Science, Taiwan 300
- WEI-CHEN CHIEN
  - Skymizer Corporation, No.408, Ruiguang Rd., Neihu Dist., Taipei City, Taiwan 114
3. Bremer E, Kurc T, Gao Y, Saltz J, Almeida JS. Safe "cloudification" of large images through picker APIs. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2017; 2016:342-351. [PMID: 28269829 PMCID: PMC5333212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/06/2023]
Abstract
The "Box model" allows users with no particular training in informatics, or access to specialized infrastructure, to operate generic cloud computing resources through a temporary URI dereferencing mechanism known as "drop-file-picker API" ("picker API" for short). This application programming interface (API) was popularized in the web app development community by DropBox, and is now a consumer-facing feature of all major cloud computing platforms such as Box.com, Google Drive and Amazon S3. This report describes a prototype web service application that uses picker APIs to expose a new, "cloudified", API tailored for image analysis, without compromising the private governance of the data exposed. In order to better understand this cross-platform cloud computing landscape, we first measured the time for both transfer and traversing of large image files generated by whole slide imaging (WSI) in Digital Pathology. The verification that there is extensive interconnectivity between cloud resources led to the development of a prototype software application that exposes an image-traversing REST API to image files stored in any of the consumer-facing "boxes". In summary, an image file can be uploaded/synchronized into any cloud resource with a file picker API and the prototype service described here will expose an HTTP REST API that remains within the safety of the user's own governance. The open source prototype is publicly available at sbu-bmi.github.io/imagebox. Availability: The accompanying prototype application is made publicly available, fully functional, as open source, at http://sbu-bmi.github.io/imagebox. An illustrative webcasted use of this Web App is included with the project codebase at https://github.com/SBU-BMI/imagebox.
Affiliation(s)
- Erich Bremer
  - Dept Biomedical Informatics, Stony Brook University (SUNY), NY 11794
- Tahsin Kurc
  - Dept Biomedical Informatics, Stony Brook University (SUNY), NY 11794
- Yi Gao
  - Dept Biomedical Informatics, Stony Brook University (SUNY), NY 11794
- Joel Saltz
  - Dept Biomedical Informatics, Stony Brook University (SUNY), NY 11794
- Jonas S Almeida
  - Dept Biomedical Informatics, Stony Brook University (SUNY), NY 11794
4. Teodoro G, Kurc T, Andrade G, Kong J, Ferreira R, Saltz J. Application Performance Analysis and Efficient Execution on Systems with multi-core CPUs, GPUs and MICs: A Case Study with Microscopy Image Analysis. THE INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS 2017; 31:32-51. [PMID: 28239253 PMCID: PMC5319667 DOI: 10.1177/1094342015594519] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 06/06/2023]
Abstract
We carry out a comparative performance study of multi-core CPUs, GPUs and Intel Xeon Phi (Many Integrated Core, MIC) with a microscopy image analysis application. We experimentally evaluate the performance of computing devices on core operations of the application. We correlate the observed performance with the characteristics of computing devices and data access patterns, computation complexities, and parallelization forms of the operations. The results show a significant variability in the performance of operations with respect to the device used. The performance of operations with regular data access on a MIC is comparable to, and sometimes better than, that on a GPU. GPUs are more efficient than MICs for operations that access data irregularly, because of the lower bandwidth of the MIC for random data accesses. We propose new performance-aware scheduling strategies that consider variabilities in operation speedups. Our scheduling strategies significantly improve application performance compared to classic strategies in hybrid configurations.
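The abstract does not spell out the scheduling strategies, so the following Python sketch shows only one simple policy that takes per-operation speedups into account. It is an assumption for illustration, not the authors' algorithm: an idle accelerator takes the ready task with the highest estimated speedup, and an idle CPU core takes the one with the lowest. The names `Task`, `gpu_speedup` and `pick_task` are hypothetical, and only two device types are modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    name: str
    # Estimated accelerator speedup over one CPU core (assumed to be known in advance).
    gpu_speedup: float


def pick_task(ready: List[Task], device: str) -> Task:
    """Remove and return the ready task best suited to the idle device."""
    if not ready:
        raise ValueError("no ready tasks")
    if device == "gpu":
        # The accelerator takes the task that benefits most from it.
        chosen = max(ready, key=lambda t: t.gpu_speedup)
    elif device == "cpu":
        # The CPU takes the task that loses least by not running on the accelerator.
        chosen = min(ready, key=lambda t: t.gpu_speedup)
    else:
        raise ValueError(f"unknown device: {device}")
    ready.remove(chosen)
    return chosen


# Illustrative task names and speedup values, not measurements from the paper.
ready = [Task("segment", 12.0), Task("threshold", 1.5), Task("features", 6.0)]
first_gpu = pick_task(ready, "gpu")
first_cpu = pick_task(ready, "cpu")
```

In contrast, a first-come-first-served policy would hand the head of the queue to whichever device becomes idle, regardless of how well the operation suits it.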
Affiliation(s)
- George Teodoro
  - Department of Computer Science, University of Brasília, Brasília, DF, Brazil
- Tahsin Kurc
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA
  - Scientific Data Group, Oak Ridge National Laboratory, Oak Ridge, TN, USA
- Guilherme Andrade
  - Department of Computer Science, Federal University of Minas Gerais, Belo Horizonte, MG, Brazil
- Jun Kong
  - Department of Biomedical Informatics, Emory University, Atlanta, GA, USA
- Renato Ferreira
  - Department of Computer Science, Federal University of Minas Gerais, Belo Horizonte, MG, Brazil
- Joel Saltz
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA
  - Scientific Data Group, Oak Ridge National Laboratory, Oak Ridge, TN, USA
5. Liang Y, Wang F, Treanor D, Magee D, Roberts N, Teodoro G, Zhu Y, Kong J. A Framework for 3D Vessel Analysis using Whole Slide Images of Liver Tissue Sections. INTERNATIONAL JOURNAL OF COMPUTATIONAL BIOLOGY AND DRUG DESIGN 2016; 9:102-119. [PMID: 27034719 PMCID: PMC4809644 DOI: 10.1504/ijcbdd.2016.074983] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Indexed: 11/21/2022]
Abstract
Three-dimensional (3D) high resolution microscopic images have high potential for improving the understanding of both normal and disease processes where structural changes or spatial relationship of disease features are significant. In this paper, we develop a complete framework applicable to 3D pathology analytical imaging, with an application to whole slide images of sequential liver slices for 3D vessel structure analysis. The analysis workflow consists of image registration, segmentation, vessel cross-section association, interpolation, and volumetric rendering. To identify biologically-meaningful correspondence across adjacent slides, we formulate a similarity function for four association cases. The optimal solution is then obtained by constrained Integer Programming. We quantitatively and qualitatively compare our vessel reconstruction results with human annotations. Validation results indicate a satisfactory concordance as measured both by region-based and distance-based metrics. These results demonstrate a promising 3D vessel analysis framework for whole slide images of liver tissue sections.
Affiliation(s)
- Yanhui Liang
  - Department of Biomedical Informatics, Emory University, Atlanta, GA, USA
- Fusheng Wang
  - Department of Biomedical Informatics, Emory University, Atlanta, GA, USA
- Darren Treanor
  - Department of Pathology, Leeds Teaching Hospitals NHS Trust; Leeds Institute of Cancer and Pathology, The University of Leeds, Leeds LS9 7TF, United Kingdom
- Derek Magee
  - School of Computing, The University of Leeds, Leeds, LS2 9JT, United Kingdom
- Nick Roberts
  - Leeds Institute of Cancer and Pathology, The University of Leeds, Leeds LS9 7TF, United Kingdom
- George Teodoro
  - Department of Computer Science, University of Brasília, Brasília, DF, Brazil
- Yangyang Zhu
  - Department of Mathematics and Computer Science, Emory University, Atlanta, GA, USA
- Jun Kong
  - Department of Biomedical Informatics, Emory University, Atlanta, GA, USA
6. Kurc T, Qi X, Wang D, Wang F, Teodoro G, Cooper L, Nalisnik M, Yang L, Saltz J, Foran DJ. Scalable analysis of Big pathology image data cohorts using efficient methods and high-performance computing strategies. BMC Bioinformatics 2015; 16:399. [PMID: 26627175 PMCID: PMC4667532 DOI: 10.1186/s12859-015-0831-6] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 03/25/2015] [Accepted: 11/16/2015] [Indexed: 11/16/2022]
Abstract
BACKGROUND: We describe a suite of tools and methods that form a core set of capabilities for researchers and clinical investigators to evaluate multiple analytical pipelines and quantify sensitivity and variability of the results while conducting large-scale studies in investigative pathology and oncology. The overarching objective of the current investigation is to address the challenges of large data sizes and high computational demands. RESULTS: The proposed tools and methods take advantage of state-of-the-art parallel machines and efficient content-based image searching strategies. The content based image retrieval (CBIR) algorithms can quickly detect and retrieve image patches similar to a query patch using a hierarchical analysis approach. The analysis component based on high performance computing can carry out consensus clustering on 500,000 data points using a large shared memory system. CONCLUSIONS: Our work demonstrates that efficient CBIR algorithms and high performance computing can be leveraged for efficient analysis of large microscopy images to meet the challenges of clinically salient applications in pathology. These technologies enable researchers and clinical investigators to make more effective use of the rich informational content contained within digitized microscopy specimens.
Affiliation(s)
- Tahsin Kurc
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, USA.
- Xin Qi
  - Department of Pathology & Laboratory Medicine, Rutgers -- Robert Wood Johnson Medical School, New Brunswick, USA.
  - Rutgers Cancer Institute of New Jersey, New Brunswick, USA.
- Daihou Wang
  - Department of Electrical and Computer Engineering, Rutgers University, New Brunswick, USA.
- Fusheng Wang
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, USA.
  - Department of Computer Science, Stony Brook University, Stony Brook, USA.
- George Teodoro
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, USA.
  - Department of Computer Science, University of Brasilia, Brasília, Brazil.
- Lee Cooper
  - Department of Biomedical Informatics, Emory University, Atlanta, USA.
- Michael Nalisnik
  - Department of Biomedical Informatics, Emory University, Atlanta, USA.
- Lin Yang
  - Department of Biomedical Engineering, University of Florida, Gainesville, USA.
- Joel Saltz
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, USA.
- David J Foran
  - Department of Pathology & Laboratory Medicine, Rutgers -- Robert Wood Johnson Medical School, New Brunswick, USA.
  - Rutgers Cancer Institute of New Jersey, New Brunswick, USA.
7. Teodoro G, Pan T, Kurc T, Kong J, Cooper L, Klasky S, Saltz J. Region Templates: Data Representation and Management for High-Throughput Image Analysis. PARALLEL COMPUTING 2014; 40:589-610. [PMID: 26139953 PMCID: PMC4484879 DOI: 10.1016/j.parco.2014.09.003] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 06/04/2023]
Abstract
We introduce a region template abstraction and framework for the efficient storage, management and processing of common data types in analysis of large datasets of high resolution images on clusters of hybrid computing nodes. The region template abstraction provides a generic container template for common data structures, such as points, arrays, regions, and object sets, within a spatial and temporal bounding box. It allows for different data management strategies and I/O implementations, while providing a homogeneous, unified interface to applications for data storage and retrieval. A region template application is represented as a hierarchical dataflow in which each computing stage may be represented as another dataflow of finer-grain tasks. The execution of the application is coordinated by a runtime system that implements optimizations for hybrid machines, including performance-aware scheduling for maximizing the utilization of computing devices and techniques to reduce the impact of data transfers between CPUs and GPUs. An experimental evaluation on a state-of-the-art hybrid cluster using a microscopy imaging application shows that the abstraction adds negligible overhead (about 3%) and achieves good scalability and high data transfer rates. Optimizations in a high speed disk based storage implementation of the abstraction to support asynchronous data transfers and computation result in an application performance gain of about 1.13×. Finally, a processing rate of 11,730 4K×4K tiles per minute was achieved for the microscopy imaging application on a cluster with 100 nodes (300 GPUs and 1,200 CPU cores). This computation rate enables studies with very large datasets.
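A rough Python sketch may help picture the container idea described in this abstract: named data regions inside a spatio-temporal bounding box, behind one interface that hides the storage backend. This is not the framework's actual API. The class and method names (`BoundingBox`, `Storage`, `MemoryStorage`, `RegionTemplate`, `insert`, `get`) are hypothetical, and a dictionary stands in for the storage implementations.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Protocol, Tuple


@dataclass(frozen=True)
class BoundingBox:
    """Spatial extent in pixels plus a temporal extent (hypothetical layout)."""
    x0: int
    y0: int
    x1: int
    y1: int
    t0: int = 0
    t1: int = 0


class Storage(Protocol):
    """Any backend offering put/get can sit behind the container."""
    def put(self, key: Tuple[str, str], value: Any) -> None: ...
    def get(self, key: Tuple[str, str]) -> Any: ...


class MemoryStorage:
    """In-memory backend; a disk or distributed backend would expose the same methods."""
    def __init__(self) -> None:
        self._items: Dict[Tuple[str, str], Any] = {}

    def put(self, key: Tuple[str, str], value: Any) -> None:
        self._items[key] = value

    def get(self, key: Tuple[str, str]) -> Any:
        return self._items[key]


@dataclass
class RegionTemplate:
    """Container of named data regions that share one bounding box."""
    name: str
    bbox: BoundingBox
    storage: Storage = field(default_factory=MemoryStorage)

    def insert(self, region: str, data: Any) -> None:
        self.storage.put((self.name, region), data)

    def get(self, region: str) -> Any:
        return self.storage.get((self.name, region))


# Illustrative use: one 4K x 4K tile holding a set of object centroids.
rt = RegionTemplate("tile_0", BoundingBox(0, 0, 4096, 4096))
rt.insert("nuclei", [(10, 12), (40, 7)])
nuclei = rt.get("nuclei")
```

Because application code only calls `insert` and `get`, swapping `MemoryStorage` for another backend would not change the analysis stages, which is the kind of separation the abstract describes.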
Affiliation(s)
- George Teodoro
  - Department of Computer Science, University of Brasília, Brasília, DF, Brazil
- Tony Pan
  - Biomedical Informatics Department, Emory University, Atlanta, GA, USA
- Tahsin Kurc
  - Biomedical Informatics Department, Stony Brook University, Stony Brook, NY, USA
  - Scientific Data Group, Oak Ridge National Laboratory, Oak Ridge, TN, USA
- Jun Kong
  - Biomedical Informatics Department, Emory University, Atlanta, GA, USA
- Lee Cooper
  - Biomedical Informatics Department, Emory University, Atlanta, GA, USA
- Scott Klasky
  - Scientific Data Group, Oak Ridge National Laboratory, Oak Ridge, TN, USA
- Joel Saltz
  - Biomedical Informatics Department, Stony Brook University, Stony Brook, NY, USA
8. Andrade G, Ferreira R, Teodoro G, Rocha L, Saltz JH, Kurc T. Efficient Execution of Microscopy Image Analysis on CPU, GPU, and MIC Equipped Cluster Systems. PROCEEDINGS. SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING 2014; 2014:89-96. [PMID: 26640423 PMCID: PMC4670037 DOI: 10.1109/sbac-pad.2014.15] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/11/2022]
Abstract
High performance computing is experiencing a major paradigm shift with the introduction of accelerators, such as graphics processing units (GPUs) and Intel Xeon Phi (MIC). These processors have made available tremendous computing power at low cost, and are transforming machines into hybrid systems equipped with CPUs and accelerators. Although these systems can deliver very high peak performance, making full use of their resources in real-world applications is a complex problem. Most current applications deployed to these machines are still executed on a single processor, leaving other devices underutilized. In this paper we explore a scenario in which applications are composed of hierarchical data flow tasks that are allocated to nodes of a distributed memory machine at coarse grain, while each task may itself be composed of several finer-grain tasks that can be allocated to different devices within the node. We propose and implement novel performance aware scheduling techniques that can be used to allocate tasks to devices. We evaluate our techniques using a pathology image analysis application used to investigate brain cancer morphology, and our experimental evaluation shows that the proposed scheduling strategies significantly outperform other efficient scheduling techniques, such as Heterogeneous Earliest Finish Time (HEFT), in cooperative executions using CPUs, GPUs, and MICs. We also experimentally show that our strategies are less sensitive to inaccuracy in the scheduling input data and that the performance gains are maintained as the application scales.
9. Teodoro G, Kurc T, Kong J, Cooper L, Saltz J. Comparative Performance Analysis of Intel Xeon Phi, GPU, and CPU: A Case Study from Microscopy Image Analysis. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS : A PUBLICATION OF THE IEEE COMPUTER SOCIETY 2014; 2014:1063-1072. [PMID: 25419088 PMCID: PMC4240026 DOI: 10.1109/ipdps.2014.111] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.7] [Indexed: 11/09/2022]
Abstract
We study and characterize the performance of operations in an important class of applications on GPUs and Many Integrated Core (MIC) architectures. Our work is motivated by applications that analyze low-dimensional spatial datasets captured by high resolution sensors, such as image datasets obtained from whole slide tissue specimens using microscopy scanners. Common operations in these applications involve the detection and extraction of objects (object segmentation), the computation of features of each extracted object (feature computation), and characterization of objects based on these features (object classification). In this work, we identify the data access and computation patterns of operations in the object segmentation and feature computation categories. We systematically implement and evaluate the performance of these operations on modern CPUs, GPUs, and MIC systems for a microscopy image analysis application. Our results show that the performance on a MIC of operations that perform regular data access is comparable to, or sometimes better than, that on a GPU. On the other hand, GPUs are significantly more efficient than MICs for operations that access data irregularly. This is a result of the low performance of MICs when it comes to random data access. We have also examined the coordinated use of MICs and CPUs. Our experiments show that using a performance aware task strategy for scheduling application operations improves performance by about 1.29× over a first-come-first-served strategy. This allows applications to obtain high performance efficiency on CPU-MIC systems: the example application attained an efficiency of 84% on 192 nodes (3072 CPU cores and 192 MICs).
Affiliation(s)
- George Teodoro
  - Department of Computer Science, University of Brasília, Brasília, DF, Brazil
- Tahsin Kurc
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA
  - Scientific Data Group, Oak Ridge National Laboratory, Oak Ridge, TN, USA
- Jun Kong
  - Department of Biomedical Informatics, Emory University, Atlanta, GA, USA
- Lee Cooper
  - Department of Biomedical Informatics, Emory University, Atlanta, GA, USA
- Joel Saltz
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA
10. Kong J, Wang F, Teodoro G, Cooper L, Moreno CS, Kurc T, Pan T, Saltz J, Brat D. High-Performance Computational Analysis of Glioblastoma Pathology Images with Database Support Identifies Molecular and Survival Correlates. PROCEEDINGS. IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE 2013:229-236. [PMID: 25098236 DOI: 10.1109/bibm.2013.6732495] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/07/2022]
Abstract
In this paper, we present a novel framework for microscopic image analysis of nuclei, data management, and high performance computation to support translational research involving nuclear morphometry features, molecular data, and clinical outcomes. Our image analysis pipeline consists of nuclei segmentation and feature computation facilitated by high performance computing with coordinated execution in multi-core CPUs and Graphical Processor Units (GPUs). All data derived from image analysis are managed in a spatial relational database supporting highly efficient scientific queries. We applied our image analysis workflow to 159 glioblastomas (GBM) from The Cancer Genome Atlas dataset. With integrative studies, we found that statistics of four specific nuclear features were significantly associated with patient survival. Additionally, we correlated nuclear features with molecular data and found interesting results that support pathologic domain knowledge. We found that Proneural subtype GBMs had the smallest mean of nuclear Eccentricity and the largest means of nuclear Extent and MinorAxisLength. We also found that gene expressions of stem cell marker MYC and cell proliferation marker MKI67 were correlated with nuclear features. To complement and inform pathologists of relevant diagnostic features, we queried the most representative nuclear instances from each patient population based on genetic and transcriptional classes. Our results demonstrate that specific nuclear features carry prognostic significance and associations with transcriptional and genetic classes, highlighting the potential of high throughput pathology image analysis as a complementary approach to human-based review and translational research.
Affiliation(s)
- Jun Kong
  - Department of Biomedical Informatics, Emory University
- Fusheng Wang
  - Department of Biomedical Informatics, Emory University
- George Teodoro
  - Department of Biomedical Informatics, Emory University
  - College of Computing, Georgia Institute of Technology
- Lee Cooper
  - Department of Biomedical Informatics, Emory University
- Carlos S Moreno
  - Department of Pathology and Laboratory Medicine, Emory University
- Tahsin Kurc
  - Department of Biomedical Informatics, Emory University
- Tony Pan
  - Department of Biomedical Informatics, Emory University
- Joel Saltz
  - Department of Biomedical Informatics, Emory University
- Daniel Brat
  - Department of Pathology and Laboratory Medicine, Emory University
11. Saltz J, Teodoro G, Pan T, Cooper L, Kong J, Klasky S, Kurc T. Feature-based Analysis of Large-scale Spatio-Temporal Sensor Data on Hybrid Architectures. THE INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS 2013; 27:263-272. [PMID: 28496298 PMCID: PMC5423684 DOI: 10.1177/1094342013488260] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/06/2023]
Abstract
Analysis of large sensor datasets for structural and functional features has applications in many domains, including weather and climate modeling, characterization of subsurface reservoirs, and biomedicine. The vast amount of data obtained from state-of-the-art sensors and the computational cost of analysis operations create a barrier to such analyses. In this paper, we describe middleware system support to take advantage of large clusters of hybrid CPU-GPU nodes to address the data and compute-intensive requirements of feature-based analyses in large spatio-temporal datasets.
Affiliation(s)
- Joel Saltz
  - Center for Comprehensive Informatics and Biomedical Informatics Department, Emory University
- George Teodoro
  - Center for Comprehensive Informatics and Biomedical Informatics Department, Emory University
- Tony Pan
  - Center for Comprehensive Informatics and Biomedical Informatics Department, Emory University
- Lee Cooper
  - Center for Comprehensive Informatics and Biomedical Informatics Department, Emory University
- Jun Kong
  - Center for Comprehensive Informatics and Biomedical Informatics Department, Emory University
- Scott Klasky
  - Scientific Data Group, Oak Ridge National Laboratory
- Tahsin Kurc
  - Center for Comprehensive Informatics and Biomedical Informatics Department, Emory University
  - Scientific Data Group, Oak Ridge National Laboratory
12. Teodoro G, Pan T, Kurc T, Kong J, Cooper L, Saltz J. Efficient Irregular Wavefront Propagation Algorithms on Hybrid CPU-GPU Machines. PARALLEL COMPUTING 2013; 39:189-211. [PMID: 23908562 PMCID: PMC3727669 DOI: 10.1016/j.parco.2013.03.001] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 06/02/2023]
Abstract
We address the problem of efficient execution of a computation pattern, referred to here as the irregular wavefront propagation pattern (IWPP), on hybrid systems with multiple CPUs and GPUs. The IWPP is common in several image processing operations. In the IWPP, data elements in the wavefront propagate waves to their neighboring elements on a grid if a propagation condition is satisfied. Elements receiving the propagated waves become part of the wavefront. This pattern results in irregular data accesses and computations. We develop and evaluate strategies for efficient computation and propagation of wavefronts using a multi-level queue structure. This queue structure improves the utilization of fast memories in a GPU and reduces synchronization overheads. We also develop a tile-based parallelization strategy to support execution on multiple CPUs and GPUs. We evaluate our approaches on a state-of-the-art GPU accelerated machine (equipped with 3 GPUs and 2 multicore CPUs) using the IWPP implementations of two widely used image processing operations: morphological reconstruction and Euclidean distance transform. Our results show significant performance improvements on GPUs. The cooperative use of multiple CPUs and GPUs attains speedups of 50× and 85× with respect to single core CPU executions for morphological reconstruction and Euclidean distance transform, respectively.
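The propagation pattern described above can be made concrete with a small sequential Python sketch of morphological reconstruction, one of the two operations the abstract names. It uses a single FIFO queue on the CPU, so it shows only the pattern, not the multi-level GPU queue or the tile-based parallelization from the paper. The function name `reconstruct`, the 4-connectivity, and the choice to seed the wavefront with every pixel are assumptions made for brevity.

```python
from collections import deque
from typing import List

Grid = List[List[int]]


def reconstruct(marker: Grid, mask: Grid) -> Grid:
    """Grayscale reconstruction by dilation, driven by a FIFO wavefront (4-connectivity)."""
    rows, cols = len(mask), len(mask[0])
    # The marker must never exceed the mask, so clamp it first.
    out = [[min(marker[r][c], mask[r][c]) for c in range(cols)] for r in range(rows)]
    # Simplification: every pixel starts in the wavefront.
    front = deque((r, c) for r in range(rows) for c in range(cols))
    while front:
        r, c = front.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                # Propagation condition: the neighbor can be raised
                # without exceeding its mask value.
                new_val = min(out[r][c], mask[nr][nc])
                if new_val > out[nr][nc]:
                    out[nr][nc] = new_val
                    # The updated neighbor joins the wavefront.
                    front.append((nr, nc))
    return out


# One-row example: the peak at index 1 spreads outward, capped by the mask.
result = reconstruct(marker=[[0, 3, 0, 0, 0]], mask=[[2, 3, 1, 4, 4]])
```

Which pixels enter the queue depends on the data, which is why the abstract describes the resulting accesses and computations as irregular.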
Affiliation(s)
- George Teodoro
  - Center for Comprehensive Informatics and Biomedical Informatics Department, Emory University, Atlanta, GA 30322
- Tony Pan
  - Center for Comprehensive Informatics and Biomedical Informatics Department, Emory University, Atlanta, GA 30322
- Tahsin Kurc
  - Center for Comprehensive Informatics and Biomedical Informatics Department, Emory University, Atlanta, GA 30322
- Jun Kong
  - Center for Comprehensive Informatics and Biomedical Informatics Department, Emory University, Atlanta, GA 30322
- Lee Cooper
  - Center for Comprehensive Informatics and Biomedical Informatics Department, Emory University, Atlanta, GA 30322
- Joel Saltz
  - Center for Comprehensive Informatics and Biomedical Informatics Department, Emory University, Atlanta, GA 30322