1
|
Chen J, Yang W, Jia Z, Xiao L, Liu S. Dynamic Color Assignment for Hierarchical Data. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2025; 31:338-348. [PMID: 39250387 DOI: 10.1109/tvcg.2024.3456386] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/11/2024]
Abstract
Assigning discriminable and harmonic colors to samples according to their class labels and spatial distribution can generate attractive visualizations and facilitate data exploration. However, as the number of classes increases, it is challenging to generate a high-quality color assignment result that accommodates all classes simultaneously. A practical solution is to organize classes into a hierarchy and then dynamically assign colors during exploration. However, existing color assignment methods fall short in generating high-quality color assignment results and dynamically aligning them with hierarchical structures. To address this issue, we develop a dynamic color assignment method for hierarchical data, which is formulated as a multi-objective optimization problem. This method simultaneously considers color discriminability, color harmony, and spatial distribution at each hierarchical level. By using the colors of parent classes to guide the color assignment of their child classes, our method further promotes both consistency and clarity across hierarchical levels. We demonstrate the effectiveness of our method in generating dynamic color assignment results with quantitative experiments and a user study.
Collapse
|
2
|
Quadri GJ, Nieves JA, Wiernik BM, Rosen P. Automatic Scatterplot Design Optimization for Clustering Identification. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:4312-4327. [PMID: 35816525 DOI: 10.1109/tvcg.2022.3189883] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
Scatterplots are among the most widely used visualization techniques. Compelling scatterplot visualizations improve understanding of data by leveraging visual perception to boost awareness when performing specific visual analytic tasks. Design choices in scatterplots, such as graphical encodings or data aspects, can directly impact decision-making quality for low-level tasks like clustering. Hence, constructing frameworks that consider both the perceptions of the visual encodings and the task being performed enables optimizing visualizations to maximize efficacy. In this article, we propose an automatic tool to optimize the design factors of scatterplots to reveal the most salient cluster structure. Our approach leverages the merge tree data structure to identify the clusters and optimize the choice of subsampling algorithm, sampling rate, marker size, and marker opacity used to generate a scatterplot image. We validate our approach with user and case studies that show it efficiently provides high-quality scatterplot designs from a large parameter space.
Collapse
|
3
|
Li Z, Shi R, Liu Y, Long S, Guo Z, Jia S, Zhang J. Dual Space Coupling Model Guided Overlap-Free Scatterplot. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:657-667. [PMID: 36260569 DOI: 10.1109/tvcg.2022.3209459] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/16/2023]
Abstract
The overdraw problem of scatterplots seriously interferes with the visual tasks. Existing methods, such as data sampling, node dispersion, subspace mapping, and visual abstraction, cannot guarantee the correspondence and consistency between the data points that reflect the intrinsic original data distribution and the corresponding visual units that reveal the presented data distribution, thus failing to obtain an overlap-free scatterplot with unbiased and lossless data distribution. A dual space coupling model is proposed in this paper to represent the complex bilateral relationship between data space and visual space theoretically and analytically. Under the guidance of the model, an overlap-free scatterplot method is developed through integration of the following: a geometry-based data transformation algorithm, namely DistributionTranscriptor; an efficient spatial mutual exclusion guided view transformation algorithm, namely PolarPacking; an overlap-free oriented visual encoding configuration model and a radius adjustment tool, namely frdraw. Our method can ensure complete and accurate information transfer between the two spaces, maintaining consistency between the newly created scatterplot and the original data distribution on global and local features. Quantitative evaluation proves our remarkable progress on computational efficiency compared with the state-of-the-art methods. Three applications involving pattern enhancement, interaction improvement, and overdraw mitigation of trajectory visualization demonstrate the broad prospects of our method.
Collapse
|
4
|
Pu J, Shao H, Gao B, Zhu Z, Zhu Y, Rao Y, Xiang Y. matExplorer: Visual Exploration on Predicting Ionic Conductivity for Solid-state Electrolytes. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2022; 28:65-75. [PMID: 34587048 DOI: 10.1109/tvcg.2021.3114812] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
Lithium ion batteries (LIBs) are widely used as important energy sources for mobile phones, electric vehicles, and drones. Experts have attempted to replace liquid electrolytes with solid electrolytes that have wider electrochemical window and higher stability due to the potential safety risks, such as electrolyte leakage, flammable solvents, poor thermal stability, and many side reactions caused by liquid electrolytes. However, finding suitable alternative materials using traditional approaches is very difficult due to the incredibly high cost in searching. Machine learning (ML)-based methods are currently introduced and used for material prediction. However, learning tools designed for domain experts to conduct intuitive performance comparison and analysis of ML models are rare. In this case, we propose an interactive visualization system for experts to select suitable ML models and understand and explore the predication results comprehensively. Our system uses a multifaceted visualization scheme designed to support analysis from various perspectives, such as feature distribution, data similarity, model performance, and result presentation. Case studies with actual lab experiments have been conducted by the experts, and the final results confirmed the effectiveness and helpfulness of our system.
Collapse
|
5
|
Chen X, Zhang J, Fu CW, Fekete JD, Wang Y. Pyramid-based Scatterplots Sampling for Progressive and Streaming Data Visualization. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2022; 28:593-603. [PMID: 34587089 DOI: 10.1109/tvcg.2021.3114880] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
We present a pyramid-based scatterplot sampling technique to avoid overplotting and enable progressive and streaming visualization of large data. Our technique is based on a multiresolution pyramid-based decomposition of the underlying density map and makes use of the density values in the pyramid to guide the sampling at each scale for preserving the relative data densities and outliers. We show that our technique is competitive in quality with state-of-the-art methods and runs faster by about an order of magnitude. Also, we have adapted it to deliver progressive and streaming data visualization by processing the data in chunks and updating the scatterplot areas with visible changes in the density map. A quantitative evaluation shows that our approach generates stable and faithful progressive samples that are comparable to the state-of-the-art method in preserving relative densities and superior to it in keeping outliers and stability when switching frames. We present two case studies that demonstrate the effectiveness of our approach for exploring large data.
Collapse
|
6
|
Construct boundaries and place labels for multi-class scatterplots. J Vis (Tokyo) 2021. [DOI: 10.1007/s12650-021-00791-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
|
7
|
Zheng F, Wen J, Zhang X, Chen Y, Zhang X, Liu Y, Xu T, Chen X, Wang Y, Su W, Zhou Z. Visual abstraction of large-scale geographical point data with credible spatial interpolation. J Vis (Tokyo) 2021. [DOI: 10.1007/s12650-021-00777-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
|
8
|
Visual selection of standard wells for large scale logging data via discrete choice model. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.01.105] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022]
|
9
|
Tao W, Hou X, Sah A, Battle L, Chang R, Stonebraker M. Kyrix-S: Authoring Scalable Scatterplot Visualizations of Big Data. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:401-411. [PMID: 33048700 DOI: 10.1109/tvcg.2020.3030372] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Static scatterplots often suffer from the overdraw problem on big datasets where object overlap causes undesirable visual clutter. The use of zooming in scatterplots can help alleviate this problem. With multiple zoom levels, more screen real estate is available, allowing objects to be placed in a less crowded way. We call this type of visualization scalable scatterplot visualizations, or SSV for short. Despite the potential of SSVs, existing systems and toolkits fall short in supporting the authoring of SSVs due to three limitations. First, many systems have limited scalability, assuming that data fits in the memory of one computer. Second, too much developer work, e.g., using custom code to generate mark layouts or render objects, is required. Third, many systems focus on only a small subset of the SSV design space (e.g. supporting a specific type of visual marks). To address these limitations, we have developed Kyrix-S, a system for easy authoring of SSVs at scale. Kyrix-S derives a declarative grammar that enables specification of a variety of SSVs in a few tens of lines of code, based on an existing survey of scatterplot tasks and designs. The declarative grammar is supported by a distributed layout algorithm which automatically places visual marks onto zoom levels. We store data in a multi-node database and use multi-node spatial indexes to achieve interactive browsing of large SSVs. Extensive experiments show that 1) Kyrix-S enables interactive browsing of SSVs of billions of objects, with response times under 500ms and 2) Kyrix-S achieves 4X-9X reduction in specification compared to a state-of-the-art authoring system.
Collapse
|
10
|
Lu K, Feng M, Chen X, Sedlmair M, Deussen O, Lischinski D, Cheng Z, Wang Y. Palettailor: Discriminable Colorization for Categorical Data. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:475-484. [PMID: 33048720 DOI: 10.1109/tvcg.2020.3030406] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
We present an integrated approach for creating and assigning color palettes to different visualizations such as multi-class scatterplots, line, and bar charts. While other methods separate the creation of colors from their assignment, our approach takes data characteristics into account to produce color palettes, which are then assigned in a way that fosters better visual discrimination of classes. To do so, we use a customized optimization based on simulated annealing to maximize the combination of three carefully designed color scoring functions: point distinctness, name difference, and color discrimination. We compare our approach to state-of-the-art palettes with a controlled user study for scatterplots and line charts, furthermore we performed a case study. Our results show that Palettailor, as a fully-automated approach, generates color palettes with a higher discrimination quality than existing approaches. The efficiency of our optimization allows us also to incorporate user modifications into the color selection process.
Collapse
|
11
|
Yuan J, Xiang S, Xia J, Yu L, Liu S. Evaluation of Sampling Methods for Scatterplots. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:1720-1730. [PMID: 33074820 DOI: 10.1109/tvcg.2020.3030432] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Given a scatterplot with tens of thousands of points or even more, a natural question is which sampling method should be used to create a small but "good" scatterplot for a better abstraction. We present the results of a user study that investigates the influence of different sampling strategies on multi-class scatterplots. The main goal of this study is to understand the capability of sampling methods in preserving the density, outliers, and overall shape of a scatterplot. To this end, we comprehensively review the literature and select seven typical sampling strategies as well as eight representative datasets. We then design four experiments to understand the performance of different strategies in maintaining: 1) region density; 2) class density; 3) outliers; and 4) overall shape in the sampling results. The results show that: 1) random sampling is preferred for preserving region density; 2) blue noise sampling and random sampling have comparable performance with the three multi-class sampling strategies in preserving class density; 3) outlier biased density based sampling, recursive subdivision based sampling, and blue noise sampling perform the best in keeping outliers; and 4) blue noise sampling outperforms the others in maintaining the overall shape of a scatterplot.
Collapse
|
12
|
Ma Y, Fan A, He J, Nelakurthi AR, Maciejewski R. A Visual Analytics Framework for Explaining and Diagnosing Transfer Learning Processes. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:1385-1395. [PMID: 33035164 DOI: 10.1109/tvcg.2020.3028888] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Many statistical learning models hold an assumption that the training data and the future unlabeled data are drawn from the same distribution. However, this assumption is difficult to fulfill in real-world scenarios and creates barriers in reusing existing labels from similar application domains. Transfer Learning is intended to relax this assumption by modeling relationships between domains, and is often applied in deep learning applications to reduce the demand for labeled data and training time. Despite recent advances in exploring deep learning models with visual analytics tools, little work has explored the issue of explaining and diagnosing the knowledge transfer process between deep learning models. In this paper, we present a visual analytics framework for the multi-level exploration of the transfer learning processes when training deep neural networks. Our framework establishes a multi-aspect design to explain how the learned knowledge from the existing model is transferred into the new learning task when training deep neural networks. Based on a comprehensive requirement and task analysis, we employ descriptive visualization with performance measures and detailed inspections of model behaviors from the statistical, instance, feature, and model structure levels. We demonstrate our framework through two case studies on image classification by fine-tuning AlexNets to illustrate how analysts can utilize our framework.
Collapse
|
13
|
Quadri GJ, Rosen P. Modeling the Influence of Visual Density on Cluster Perception in Scatterplots Using Topology. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:1829-1839. [PMID: 33048695 DOI: 10.1109/tvcg.2020.3030365] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Scatterplots are used for a variety of visual analytics tasks, including cluster identification, and the visual encodings used on a scatterplot play a deciding role on the level of visual separation of clusters. For visualization designers, optimizing the visual encodings is crucial to maximizing the clarity of data. This requires accurately modeling human perception of cluster separation, which remains challenging. We present a multi-stage user study focusing on four factors-distribution size of clusters, number of points, size of points, and opacity of points-that influence cluster identification in scatterplots. From these parameters, we have constructed two models, a distance-based model, and a density-based model, using the merge tree data structure from Topological Data Analysis. Our analysis demonstrates that these factors play an important role in the number of clusters perceived, and it verifies that the distance-based and density-based models can reasonably estimate the number of clusters a user observes. Finally, we demonstrate how these models can be used to optimize visual encodings on real-world data.
Collapse
|
14
|
Xiao Y, Wang C, Li K, Liu B, Guo J, Wang W. A new nonlinear dot plots visualization based on an undirected reassignment algorithm. J Vis (Tokyo) 2021. [DOI: 10.1007/s12650-020-00711-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
|
15
|
Ma Y, Maciejewski R. Visual Analysis of Class Separations With Locally Linear Segments. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:241-253. [PMID: 32746282 DOI: 10.1109/tvcg.2020.3011155] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
High-dimensional labeled data widely exists in many real-world applications such as classification and clustering. One main task in analyzing such datasets is to explore class separations and class boundaries derived from machine learning models. Dimension reduction techniques are commonly applied to support analysts in exploring the underlying decision boundary structures by depicting a low-dimensional representation of the data distributions from multiple classes. However, such projection-based analyses are limited due to their lack of ability to show separations in complex non-linear decision boundary structures and can suffer from heavy distortion and low interpretability. To overcome these issues of separability and interpretability, we propose a visual analysis approach that utilizes the power of explainability from linear projections to support analysts when exploring non-linear separation structures. Our approach is to extract a set of locally linear segments that approximate the original non-linear separations. Unlike traditional projection-based analysis where the data instances are mapped to a single scatterplot, our approach supports the exploration of complex class separations through multiple local projection results. We conduct case studies on two labeled datasets to demonstrate the effectiveness of our approach.
Collapse
|
16
|
|
17
|
Ma Y, Tung AKH, Wang W, Gao X, Pan Z, Chen W. ScatterNet: A Deep Subjective Similarity Model for Visual Analysis of Scatterplots. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:1562-1576. [PMID: 30334762 DOI: 10.1109/tvcg.2018.2875702] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Similarity measuring methods are widely adopted in a broad range of visualization applications. In this work, we address the challenge of representing human perception in the visual analysis of scatterplots by introducing a novel deep-learning-based approach, ScatterNet, captures perception-driven similarities of such plots. The approach exploits deep neural networks to extract semantic features of scatterplot images for similarity calculation. We create a large labeled dataset consisting of similar and dissimilar images of scatterplots to train the deep neural network. We conduct a set of evaluations including performance experiments and a user study to demonstrate the effectiveness and efficiency of our approach. The evaluations confirm that the learned features capture the human perception of scatterplot similarity effectively. We describe two scenarios to show how ScatterNet can be applied in visual analysis applications.
Collapse
|
18
|
|
19
|
Lu M, Wang S, Lanir J, Fish N, Yue Y, Cohen-Or D, Huang H. Winglets: Visualizing Association with Uncertainty in Multi-class Scatterplots. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:770-779. [PMID: 31562094 DOI: 10.1109/tvcg.2019.2934811] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
This work proposes Winglets, an enhancement to the classic scatterplot to better perceptually pronounce multiple classes by improving the perception of association and uncertainty of points to their related cluster. Designed as a pair of dual-sided strokes belonging to a data point, Winglets leverage the Gestalt principle of Closure to shape the perception of the form of the clusters, rather than use an explicit divisive encoding. Through a subtle design of two dominant attributes, length and orientation, Winglets enable viewers to perform a mental completion of the clusters. A controlled user study was conducted to examine the efficiency of Winglets in perceiving the cluster association and the uncertainty of certain points. The results show Winglets form a more prominent association of points into clusters and improve the perception of associating uncertainty.
Collapse
|
20
|
Mei H, Chen W, Wei Y, Hu Y, Zhou S, Lin B, Zhao Y, Xia J. RSATree: Distribution-Aware Data Representation of Large-Scale Tabular Datasets for Flexible Visual Query. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:1161-1171. [PMID: 31443022 DOI: 10.1109/tvcg.2019.2934800] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Analysts commonly investigate the data distributions derived from statistical aggregations of data that are represented by charts, such as histograms and binned scatterplots, to visualize and analyze a large-scale dataset. Aggregate queries are implicitly executed through such a process. Datasets are constantly extremely large; thus, the response time should be accelerated by calculating predefined data cubes. However, the queries are limited to the predefined binning schema of preprocessed data cubes. Such limitation hinders analysts' flexible adjustment of visual specifications to investigate the implicit patterns in the data effectively. Particularly, RSATree enables arbitrary queries and flexible binning strategies by leveraging three schemes, namely, an R-tree-based space partitioning scheme to catch the data distribution, a locality-sensitive hashing technique to achieve locality-preserving random access to data items, and a summed area table scheme to support interactive query of aggregated values with a linear computational complexity. This study presents and implements a web-based visual query system that supports visual specification, query, and exploration of large-scale tabular data with user-adjustable granularities. We demonstrate the efficiency and utility of our approach by performing various experiments on real-world datasets and analyzing time and space complexity.
Collapse
|
21
|
Chen C, Wang C, Bai X, Zhang P, Li C. GenerativeMap: Visualization and Exploration of Dynamic Density Maps via Generative Learning Model. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:216-226. [PMID: 31443026 DOI: 10.1109/tvcg.2019.2934806] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
The density map is widely used for data sampling, time-varying detection, ensemble representation, etc. The visualization of dynamic evolution is a challenging task when exploring spatiotemporal data. Many approaches have been provided to explore the variation of data patterns over time, which commonly need multiple parameters and preprocessing works. Image generation is a well-known topic in deep learning, and a variety of generating models have been promoted in recent years. In this paper, we introduce a general pipeline called GenerativeMap to extract dynamics of density maps by generating interpolation information. First, a trained generative model comprises an important part of our approach, which can generate nonlinear and natural results by implementing a few parameters. Second, a visual presentation is proposed to show the density change, which is combined with the level of detail and blue noise sampling for a better visual effect. Third, for dynamic visualization of large-scale density maps, we extend this approach to show the evolution in regions of interest, which costs less to overcome the drawback of the learning-based generative model. We demonstrate our method on different types of cases, and we evaluate and compare the approach from multiple aspects. The results help identify the effectiveness of our approach and confirm its applicability in different scenarios.
Collapse
|
22
|
Chen X, Ge T, Zhang J, Chen B, Fu CW, Deussen O, Wang Y. A Recursive Subdivision Technique for Sampling Multi-class Scatterplots. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:729-738. [PMID: 31442987 DOI: 10.1109/tvcg.2019.2934541] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
We present a non-uniform recursive sampling technique for multi-class scatterplots, with the specific goal of faithfully presenting relative data and class densities, while preserving major outliers in the plots. Our technique is based on a customized binary kd-tree, in which leaf nodes are created by recursively subdividing the underlying multi-class density map. By backtracking, we merge leaf nodes until they encompass points of all classes for our subsequently applied outlier-aware multi-class sampling strategy. A quantitative evaluation shows that our approach can better preserve outliers and at the same time relative densities in multi-class scatterplots compared to the previous approaches, several case studies demonstrate the effectiveness of our approach in exploring complex and real world data.
Collapse
|
23
|
Ma Y, Xie T, Li J, Maciejewski R. Explaining Vulnerabilities to Adversarial Machine Learning through Visual Analytics. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:1075-1085. [PMID: 31478859 DOI: 10.1109/tvcg.2019.2934631] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Machine learning models are currently being deployed in a variety of real-world applications where model predictions are used to make decisions about healthcare, bank loans, and numerous other critical tasks. As the deployment of artificial intelligence technologies becomes ubiquitous, it is unsurprising that adversaries have begun developing methods to manipulate machine learning models to their advantage. While the visual analytics community has developed methods for opening the black box of machine learning models, little work has focused on helping the user understand their model vulnerabilities in the context of adversarial attacks. In this paper, we present a visual analytics framework for explaining and exploring model vulnerabilities to adversarial attacks. Our framework employs a multi-faceted visualization scheme designed to support the analysis of data poisoning attacks from the perspective of models, data instances, features, and local structures. We demonstrate our framework through two case studies on binary classifiers and illustrate model vulnerabilities with respect to varying attack strategies.
Collapse
|
24
|
Wei Y, Mei H, Zhao Y, Zhou S, Lin B, Jiang H, Chen W. Evaluating Perceptual Bias During Geometric Scaling of Scatterplots. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:321-331. [PMID: 31403425 DOI: 10.1109/tvcg.2019.2934208] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Scatterplots are frequently scaled to fit display areas in multi-view and multi-device data analysis environments. A common method used for scaling is to enlarge or shrink the entire scatterplot together with the inside points synchronously and proportionally. This process is called geometric scaling. However, geometric scaling of scatterplots may cause a perceptual bias, that is, the perceived and physical values of visual features may be dissociated with respect to geometric scaling. For example, if a scatterplot is projected from a laptop to a large projector screen, then observers may feel that the scatterplot shown on the projector has fewer points than that viewed on the laptop. This paper presents an evaluation study on the perceptual bias of visual features in scatterplots caused by geometric scaling. The study focuses on three fundamental visual features (i.e., numerosity, correlation, and cluster separation) and three hypotheses that are formulated on the basis of our experience. We carefully design three controlled experiments by using well-prepared synthetic data and recruit participants to complete the experiments on the basis of their subjective experience. With a detailed analysis of the experimental results, we obtain a set of instructive findings. First, geometric scaling causes a bias that has a linear relationship with the scale ratio. Second, no significant difference exists between the biases measured from normally and uniformly distributed scatterplots. Third, changing the point radius can correct the bias to a certain extent. These findings can be used to inspire the design decisions of scatterplots in various scenarios.
Collapse
|
25
|
Hu R, Sha T, Van Kaick O, Deussen O, Huang H. Data Sampling in Multi-view and Multi-class Scatterplots via Set Cover Optimization. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:739-748. [PMID: 31443021 DOI: 10.1109/tvcg.2019.2934799] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
We present a method for data sampling in scatterplots by jointly optimizing point selection for different views or classes. Our method uses space-filling curves (Z-order curves) that partition a point set into subsets that, when covered each by one sample, provide a sampling or coreset with good approximation guarantees in relation to the original point set. For scatterplot matrices with multiple views, different views provide different space-filling curves, leading to different partitions of the given point set. For multi-class scatterplots, the focus on either per-class distribution or global distribution provides two different partitions of the given point set that need to be considered in the selection of the coreset. For both cases, we convert the coreset selection problem into an Exact Cover Problem (ECP), and demonstrate with quantitative and qualitative evaluations that an approximate solution that solves the ECP efficiently is able to provide high-quality samplings.
Collapse
|
26
|
Xie X, Cai X, Zhou J, Cao N, Wu Y. A Semantic-Based Method for Visualizing Large Image Collections. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2019; 25:2362-2377. [PMID: 29993720 DOI: 10.1109/tvcg.2018.2835485] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Interactive visualization of large image collections is important and useful in many applications, such as personal album management and user profiling on images. However, most prior studies focus on using low-level visual features of images, such as texture and color histogram, to create visualizations without considering the more important semantic information embedded in images. This paper proposes a novel visual analytic system to analyze images in a semantic-aware manner. The system mainly comprises two components: a semantic information extractor and a visual layout generator. The semantic information extractor employs an image captioning technique based on convolutional neural network (CNN) to produce descriptive captions for images, which can be transformed into semantic keywords. The layout generator employs a novel co-embedding model to project images and the associated semantic keywords to the same 2D space. Inspired by the galaxy metaphor, we further turn the projected 2D space to a galaxy visualization of images, in which semantic keywords and images are visually encoded as stars and planets. Our system naturally supports multi-scale visualization and navigation, in which users can immediately see a semantic overview of an image collection and drill down for detailed inspection of a certain group of images. Users can iteratively refine the visual layout by integrating their domain knowledge into the co-embedding process. Two task-based evaluations are conducted to demonstrate the effectiveness of our system.
Collapse
|
27
|
Luo X, Yuan Y, Zhang K, Xia J, Zhou Z, Chang L, Gu T. Enhancing statistical charts: toward better data visualization and analysis. J Vis (Tokyo) 2019. [DOI: 10.1007/s12650-019-00569-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
|
28
|
Raidou RG, Groller ME, Eisemann M. Relaxing Dense Scatter Plots with Pixel-Based Mappings. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2019; 25:2205-2216. [PMID: 30892214 DOI: 10.1109/tvcg.2019.2903956] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
Scatter plots are the most commonly employed technique for the visualization of bivariate data. Despite their versatility and expressiveness in showing data aspects, such as clusters, correlations, and outliers, scatter plots face a main problem. For large and dense data, the representation suffers from clutter due to overplotting. This is often partially solved with the use of density plots. Yet, data overlap may occur in certain regions of a scatter or density plot, while other regions may be partially, or even completely empty. Adequate pixel-based techniques can be employed for effectively filling the plotting space, giving an additional notion of the numerosity of data motifs or clusters. We propose the Pixel-Relaxed Scatter Plots, a new and simple variant, to improve the display of dense scatter plots, using pixel-based, space-filling mappings. Our Pixel-Relaxed Scatter Plots make better use of the plotting canvas, while avoiding data overplotting, and optimizing space coverage and insight in the presence and size of data motifs. We have employed different methods to map scatter plot points to pixels and to visually present this mapping. We demonstrate our approach on several synthetic and realistic datasets, and we discuss the suitability of our technique for different tasks. Our conducted user evaluation shows that our Pixel-Relaxed Scatter Plots can be a useful enhancement to traditional scatter plots.
Collapse
|
29
|
Abstract
The contact center industry represents a large proportion of many country’s economies. For example, 4% of the entire United States and UK’s working population is employed in this sector. As in most modern industries, contact centers generate gigabytes of operational data that require analysis to provide insight and to improve efficiency. Visualization is a valuable approach to data analysis, enabling trends and correlations to be discovered, particularly when using scatterplots. We present a feature-rich application that visualizes large call center data sets using scatterplots that support millions of points. The application features a scatterplot matrix to provide an overview of the call center data attributes, animation of call start and end times, and utilizes both the CPU and GPU acceleration for processing and filtering. We illustrate the use of the Open Computing Language (OpenCL) to utilize a commodity graphics card for the fast filtering of fields with multiple attributes. We demonstrate the use of the application with millions of call events from a month’s worth of real-world data and report domain expert feedback from our industry partner.
Collapse
|
30
|
Zhang J, Wang Y, Molino P, Li L, Ebert DS. Manifold: A Model-Agnostic Framework for Interpretation and Diagnosis of Machine Learning Models. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2019; 25:364-373. [PMID: 30130197 DOI: 10.1109/tvcg.2018.2864499] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Interpretation and diagnosis of machine learning models have gained renewed interest in recent years with breakthroughs in new approaches. We present Manifold, a framework that utilizes visual analysis techniques to support interpretation, debugging, and comparison of machine learning models in a more transparent and interactive manner. Conventional techniques usually focus on visualizing the internal logic of a specific model type (i.e., deep neural networks), lacking the ability to extend to a more complex scenario where different model types are integrated. To this end, Manifold is designed as a generic framework that does not rely on or access the internal logic of the model and solely observes the input (i.e., instances or features) and the output (i.e., the predicted result and probability distribution). We describe the workflow of Manifold as an iterative process consisting of three major phases that are commonly involved in the model development and diagnosis process: inspection (hypothesis), explanation (reasoning), and refinement (verification). The visual components supporting these tasks include a scatterplot-based visual summary that overviews the models' outcome and a customizable tabular view that reveals feature discrimination. We demonstrate current applications of the framework on the classification and regression tasks and discuss other potential machine learning use scenarios where Manifold can be applied.
Collapse
|
31
|
Exploring linear projections for revealing clusters, outliers, and trends in subsets of multi-dimensional datasets. JOURNAL OF VISUAL LANGUAGES AND COMPUTING 2018. [DOI: 10.1016/j.jvlc.2018.08.003] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/19/2022]
|
32
|
Zhou Z, Shi C, Hu M, Liu Y. Visual ranking of academic influence via paper citation. JOURNAL OF VISUAL LANGUAGES AND COMPUTING 2018. [DOI: 10.1016/j.jvlc.2018.08.007] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
|
33
|
Weng D, Chen R, Deng Z, Wu F, Chen J, Wu Y. SRVis: Towards Better Spatial Integration in Ranking Visualization. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 25:459-469. [PMID: 30188825 DOI: 10.1109/tvcg.2018.2865126] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Interactive ranking techniques have substantially promoted analysts' ability in making judicious and informed decisions effectively based on multiple criteria. However, the existing techniques cannot satisfactorily support the analysis tasks involved in ranking large-scale spatial alternatives, such as selecting optimal locations for chain stores, where the complex spatial contexts involved are essential to the decision-making process. Limitations observed in the prior attempts of integrating rankings with spatial contexts motivate us to develop a context-integrated visual ranking technique. Based on a set of generic design requirements we summarized by collaborating with domain experts, we propose SRVis, a novel spatial ranking visualization technique that supports efficient spatial multi-criteria decision-making processes by addressing three major challenges in the aforementioned context integration, namely, a) the presentation of spatial rankings and contexts, b) the scalability of rankings' visual representations, and c) the analysis of context-integrated spatial rankings. Specifically, we encode massive rankings and their cause with scalable matrix-based visualizations and stacked bar charts based on a novel two-phase optimization framework that minimizes the information loss, and the flexible spatial filtering and intuitive comparative analysis are adopted to enable the in-depth evaluation of the rankings and assist users in selecting the best spatial alternative. The effectiveness of the proposed technique has been evaluated and demonstrated with an empirical study of optimization methods, two case studies, and expert interviews.
Collapse
|
34
|
Liao H, Wu Y, Chen L, Chen W. Cluster-Based Visual Abstraction for Multivariate Scatterplots. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 24:2531-2545. [PMID: 28945596 DOI: 10.1109/tvcg.2017.2754480] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/07/2023]
Abstract
The use of scatterplots is an important method for multivariate data visualization. The point distribution on the scatterplot, along with variable values represented by each point, can help analyze underlying patterns in data. However, determining the multivariate data variation on a scatterplot generated using projection methods, such as multidimensional scaling, is difficult. Furthermore, the point distribution becomes unclear when the data scale is large and clutter problems occur. These conditions can significantly decrease the usability of scatterplots on multivariate data analysis. In this study, we present a cluster-based visual abstraction method to enhance the visualization of multivariate scatterplots. Our method leverages an adapted multilabel clustering method to provide abstractions of high quality for scatterplots. An image-based method is used to deal with large scale data problem. Furthermore, a suite of glyphs is designed to visualize the data at different levels of detail and support data exploration. The view coordination between the glyph-based visualization and the table lens can effectively enhance the multivariate data analysis. Through numerical evaluations for data abstraction quality, case studies and a user study, we demonstrate the effectiveness and usability of the proposed techniques for multivariate data analysis on scatterplots.
Collapse
|
35
|
Liu S, Chen C, Lu Y, Ouyang F, Wang B. An Interactive Method to Improve Crowdsourced Annotations. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 25:235-245. [PMID: 30130224 DOI: 10.1109/tvcg.2018.2864843] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
In order to effectively infer correct labels from noisy crowdsourced annotations, learning-from-crowds models have introduced expert validation. However, little research has been done on facilitating the validation procedure. In this paper, we propose an interactive method to assist experts in verifying uncertain instance labels and unreliable workers. Given the instance labels and worker reliability inferred from a learning-from-crowds model, candidate instances and workers are selected for expert validation. The influence of verified results is propagated to relevant instances and workers through the learning-from-crowds model. To facilitate the validation of annotations, we have developed a confusion visualization to indicate the confusing classes for further exploration, a constrained projection method to show the uncertain labels in context, and a scatter-plot-based visualization to illustrate worker reliability. The three visualizations are tightly integrated with the learning-from-crowds model to provide an iterative and progressive environment for data validation. Two case studies were conducted that demonstrate our approach offers an efficient method for validating and improving crowdsourced annotations.
Collapse
|
36
|
Zhou Z, Meng L, Tang C, Zhao Y, Guo Z, Hu M, Chen W. Visual Abstraction of Large Scale Geospatial Origin-Destination Movement Data. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 25:43-53. [PMID: 30130199 DOI: 10.1109/tvcg.2018.2864503] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
A variety of human movement datasets are represented in an Origin-Destination(OD) form, such as taxi trips, mobile phone locations, etc. As a commonly-used method to visualize OD data, flow map always fails to discover patterns of human mobility, due to massive intersections and occlusions of lines on a 2D geographical map. A large number of techniques have been proposed to reduce visual clutter of flow maps, such as filtering, clustering and edge bundling, but the correlations of OD flows are often neglected, which makes the simplified OD flow map present little semantic information. In this paper, a characterization of OD flows is established based on an analogy between OD flows and natural language processing (NPL) terms. Then, an iterative multi-objective sampling scheme is designed to select OD flows in a vectorized representation space. To enhance the readability of sampled OD flows, a set of meaningful visual encodings are designed to present the interactions of OD flows. We design and implement a visual exploration system that supports visual inspection and quantitative evaluation from a variety of perspectives. Case studies based on real-world datasets and interviews with domain experts have demonstrated the effectiveness of our system in reducing the visual clutter and enhancing correlations of OD flows.
Collapse
|
37
|
Li C, Baciu G, Han Y. StreamMap: Smooth Dynamic Visualization of High-Density Streaming Points. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 24:1381-1393. [PMID: 28212089 DOI: 10.1109/tvcg.2017.2668409] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
Interactive visualization of streaming points for real-time scatterplots and linear blending of correlation patterns is increasingly becoming the dominant mode of visual analytics for both big data and streaming data from active sensors and broadcasting media. To better visualize and interact with inter-stream patterns, it is generally necessary to smooth out gaps or distortions in the streaming data. Previous approaches either animate the points directly or present a sampled static heat-map. We propose a new approach, called StreamMap, to smoothly blend high-density streaming points and create a visual flow that emphasizes the density pattern distributions. In essence, we present three new contributions for the visualization of high-density streaming points. The first contribution is a density-based method called super kernel density estimation that aggregates streaming points using an adaptive kernel to solve the overlapping problem. The second contribution is a robust density morphing algorithm that generates several smooth intermediate frames for a given pair of frames. The third contribution is a trend representation design that can help convey the flow directions of the streaming points. The experimental results on three datasets demonstrate the effectiveness of StreamMap when dynamic visualization and visual analysis of trend patterns on streaming points are required.
Collapse
|
38
|
Liu M, Shi J, Cao K, Zhu J, Liu S. Analyzing the Training Processes of Deep Generative Models. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 24:77-87. [PMID: 28866564 DOI: 10.1109/tvcg.2017.2744938] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/07/2023]
Abstract
Among the many types of deep models, deep generative models (DGMs) provide a solution to the important problem of unsupervised and semi-supervised learning. However, training DGMs requires more skill, experience, and know-how because their training is more complex than other types of deep models such as convolutional neural networks (CNNs). We develop a visual analytics approach for better understanding and diagnosing the training process of a DGM. To help experts understand the overall training process, we first extract a large amount of time series data that represents training dynamics (e.g., activation changes over time). A blue-noise polyline sampling scheme is then introduced to select time series samples, which can both preserve outliers and reduce visual clutter. To further investigate the root cause of a failed training process, we propose a credit assignment algorithm that indicates how other neurons contribute to the output of the neuron causing the training failure. Two case studies are conducted with machine learning experts to demonstrate how our approach helps understand and diagnose the training processes of DGMs. We also show how our approach can be directly used to analyze other types of deep models, such as CNNs.
Collapse
|
39
|
Sarikaya A, Gleicher M. Scatterplots: Tasks, Data, and Designs. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 24:402-412. [PMID: 28866528 DOI: 10.1109/tvcg.2017.2744184] [Citation(s) in RCA: 39] [Impact Index Per Article: 5.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/07/2023]
Abstract
Traditional scatterplots fail to scale as the complexity and amount of data increases. In response, there exist many design options that modify or expand the traditional scatterplot design to meet these larger scales. This breadth of design options creates challenges for designers and practitioners who must select appropriate designs for particular analysis goals. In this paper, we help designers in making design choices for scatterplot visualizations. We survey the literature to catalog scatterplot-specific analysis tasks. We look at how data characteristics influence design decisions. We then survey scatterplot-like designs to understand the range of design options. Building upon these three organizations, we connect data characteristics, analysis tasks, and design choices in order to generate challenges, open questions, and example best practices for the effective design of scatterplots.
Collapse
|
40
|
Shen Q, Zeng W, Ye Y, Arisona SM, Schubiger S, Burkhard R, Qu H. StreetVizor: Visual Exploration of Human-Scale Urban Forms Based on Street Views. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 24:1004-1013. [PMID: 28866527 DOI: 10.1109/tvcg.2017.2744159] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/07/2023]
Abstract
Urban forms at human-scale, i.e., urban environments that individuals can sense (e.g., sight, smell, and touch) in their daily lives, can provide unprecedented insights on a variety of applications, such as urban planning and environment auditing. The analysis of urban forms can help planners develop high-quality urban spaces through evidence-based design. However, such analysis is complex because of the involvement of spatial, multi-scale (i.e., city, region, and street), and multivariate (e.g., greenery and sky ratios) natures of urban forms. In addition, current methods either lack quantitative measurements or are limited to a small area. The primary contribution of this work is the design of StreetVizor, an interactive visual analytics system that helps planners leverage their domain knowledge in exploring human-scale urban forms based on street view images. Our system presents two-stage visual exploration: 1) an AOI Explorer for the visual comparison of spatial distributions and quantitative measurements in two areas-of-interest (AOIs) at city- and region-scales; 2) and a Street Explorer with a novel parallel coordinate plot for the exploration of the fine-grained details of the urban forms at the street-scale. We integrate visualization techniques with machine learning models to facilitate the detection of street view patterns. We illustrate the applicability of our approach with case studies on the real-world datasets of four cities, i.e., Hong Kong, Singapore, Greater London and New York City. Interviews with domain experts demonstrate the effectiveness of our system in facilitating various analytical tasks.
Collapse
|
41
|
Wenskovitch J, Crandell I, Ramakrishnan N, House L, Leman S, North C. Towards a Systematic Combination of Dimension Reduction and Clustering in Visual Analytics. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 24:131-141. [PMID: 28866581 DOI: 10.1109/tvcg.2017.2745258] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/07/2023]
Abstract
Dimension reduction algorithms and clustering algorithms are both frequently used techniques in visual analytics. Both families of algorithms assist analysts in performing related tasks regarding the similarity of observations and finding groups in datasets. Though initially used independently, recent works have incorporated algorithms from each family into the same visualization systems. However, these algorithmic combinations are often ad hoc or disconnected, working independently and in parallel rather than integrating some degree of interdependence. A number of design decisions must be addressed when employing dimension reduction and clustering algorithms concurrently in a visualization system, including the selection of each algorithm, the order in which they are processed, and how to present and interact with the resulting projection. This paper contributes an overview of combining dimension reduction and clustering into a visualization system, discussing the challenges inherent in developing a visualization system that makes use of both families of algorithms.
Collapse
|
42
|
Liu L, Boone AP, Ruginski IT, Padilla L, Hegarty M, Creem-Regehr SH, Thompson WB, Yuksel C, House DH. Uncertainty Visualization by Representative Sampling from Prediction Ensembles. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2017; 23:2165-2178. [PMID: 28113666 DOI: 10.1109/tvcg.2016.2607204] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
Data ensembles are often used to infer statistics to be used for a summary display of an uncertain prediction. In a spatial context, these summary displays have the drawback that when uncertainty is encoded via a spatial spread, display glyph area increases in size with prediction uncertainty. This increase can be easily confounded with an increase in the size, strength or other attribute of the phenomenon being presented. We argue that by directly displaying a carefully chosen subset of a prediction ensemble, so that uncertainty is conveyed implicitly, such misinterpretations can be avoided. Since such a display does not require uncertainty annotation, an information channel remains available for encoding additional information about the prediction. We demonstrate these points in the context of hurricane prediction visualizations, showing how we avoid occlusion of selected ensemble elements while preserving the spatial statistics of the original ensemble, and how an explicit encoding of uncertainty can also be constructed from such a selection. We conclude with the results of a cognitive experiment demonstrating that the approach can be used to construct storm prediction displays that significantly reduce the confounding of uncertainty with storm size, and thus improve viewers' ability to estimate potential for storm damage.
Collapse
|
43
|
Micallef L, Palmas G, Oulasvirta A, Weinkauf T. Towards Perceptual Optimization of the Visual Design of Scatterplots. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2017; 23:1588-1599. [PMID: 28252407 DOI: 10.1109/tvcg.2017.2674978] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
Designing a good scatterplot can be difficult for non-experts in visualization, because they need to decide on many parameters, such as marker size and opacity, aspect ratio, color, and rendering order. This paper contributes to research exploring the use of perceptual models and quality metrics to set such parameters automatically for enhanced visual quality of a scatterplot. A key consideration in this paper is the construction of a cost function to capture several relevant aspects of the human visual system, examining a scatterplot design for some data analysis task. We show how the cost function can be used in an optimizer to search for the optimal visual design for a user's dataset and task objectives (e.g., "reliable linear correlation estimation is more important than class separation"). The approach is extensible to different analysis tasks. To test its performance in a realistic setting, we pre-calibrated it for correlation estimation, class separation, and outlier detection. The optimizer was able to produce designs that achieved a level of speed and success comparable to that of those using human-designed presets (e.g., in R or MATLAB). Case studies demonstrate that the approach can adapt a design to the data, to reveal patterns without user intervention.
Collapse
|
44
|
Correll M, Heer J. Surprise! Bayesian Weighting for De-Biasing Thematic Maps. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2017; 23:651-660. [PMID: 27875180 DOI: 10.1109/tvcg.2016.2598618] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
Thematic maps are commonly used for visualizing the density of events in spatial data. However, these maps can mislead by giving visual prominence to known base rates (such as population densities) or to artifacts of sample size and normalization (such as outliers arising from smaller, and thus more variable, samples). In this work, we adapt Bayesian surprise to generate maps that counter these biases. Bayesian surprise, which has shown promise for modeling human visual attention, weights information with respect to how it updates beliefs over a space of models. We introduce Surprise Maps, a visualization technique that weights event data relative to a set of spatia-temporal models. Unexpected events (those that induce large changes in belief over the model space) are visualized more prominently than those that follow expected patterns. Using both synthetic and real-world datasets, we demonstrate how Surprise Maps overcome some limitations of traditional event maps.
Collapse
|
45
|
Continuous Learning Graphical Knowledge Unit for Cluster Identification in High Density Data Sets. Symmetry (Basel) 2016. [DOI: 10.3390/sym8120152] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022] Open
|
46
|
Guiding the exploration of scatter plot data using motif-based interest measures. JOURNAL OF VISUAL LANGUAGES AND COMPUTING 2016. [DOI: 10.1016/j.jvlc.2016.07.003] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
|