1. Montambault B, Appleby G, Rogers J, Brumar CD, Li M, Chang R. DimBridge: Interactive Explanation of Visual Patterns in Dimensionality Reductions with Predicate Logic. IEEE Transactions on Visualization and Computer Graphics 2025; 31:207-217. [PMID: 39312423 DOI: 10.1109/tvcg.2024.3456391] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/25/2024]
Abstract
Dimensionality reduction techniques are widely used for visualizing high-dimensional data. However, support for interpreting patterns of dimension reduction results in the context of the original data space is often insufficient. Consequently, users may struggle to extract insights from the projections. In this paper, we introduce DimBridge, a visual analytics tool that allows users to interact with visual patterns in a projection and retrieve corresponding data patterns. DimBridge supports several interactions, allowing users to perform various analyses, from contrasting multiple clusters to explaining complex latent structures. Leveraging first-order predicate logic, DimBridge identifies subspaces in the original dimensions relevant to a queried pattern and provides an interface for users to visualize and interact with them. We demonstrate how DimBridge can help users overcome the challenges associated with interpreting visual patterns in projections.
2. Eckelt K, Hinterreiter A, Adelberger P, Walchshofer C, Dhanoa V, Humer C, Heckmann M, Steinparz C, Streit M. Visual Exploration of Relationships and Structure in Low-Dimensional Embeddings. IEEE Transactions on Visualization and Computer Graphics 2023; 29:3312-3326. [PMID: 35254984 DOI: 10.1109/tvcg.2022.3156760] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 05/27/2023]
Abstract
In this work, we propose an interactive visual approach for the exploration and formation of structural relationships in embeddings of high-dimensional data. These structural relationships, such as item sequences, associations of items with groups, and hierarchies between groups of items, are defining properties of many real-world datasets. Nevertheless, most existing methods for the visual exploration of embeddings treat these structures as second-class citizens or do not take them into account at all. In our proposed analysis workflow, users explore enriched scatterplots of the embedding, in which relationships between items and/or groups are visually highlighted. The original high-dimensional data for single items, groups of items, or differences between connected items and groups are accessible through additional summary visualizations. We carefully tailored these summary and difference visualizations to the various data types and semantic contexts. During their exploratory analysis, users can externalize their insights by setting up additional groups and relationships between items and/or groups. We demonstrate the utility and potential impact of our approach by means of two use cases and multiple examples from various domains.
3. Li J, Zhou CQ. Incorporation of Human Knowledge into Data Embeddings to Improve Pattern Significance and Interpretability. IEEE Transactions on Visualization and Computer Graphics 2023; 29:723-733. [PMID: 36155441 DOI: 10.1109/tvcg.2022.3209382] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 05/04/2023]
Abstract
Embedding is a common technique for analyzing multi-dimensional data. However, the embedding projection cannot always form significant and interpretable visual structures that foreshadow underlying data patterns. We propose an approach that incorporates human knowledge into data embeddings to improve pattern significance and interpretability. The core idea is (1) externalizing tacit human knowledge as explicit sample labels and (2) adding a classification loss in the embedding network to encode samples' classes. The approach pulls samples of the same class with similar data features closer in the projection, leading to more compact (significant) and class-consistent (interpretable) visual structures. We give an embedding network with a customized classification loss to implement the idea and integrate the network into a visualization system to form a workflow that supports flexible class creation and pattern exploration. Patterns found on open datasets in case studies, subjects' performance in a user study, and quantitative experiment results illustrate the general usability and effectiveness of the approach.
4. Huang R, Li Q, Chen L, Yuan X. A Probability Density-Based Visual Analytics Approach to Forecast Bias Calibration. IEEE Transactions on Visualization and Computer Graphics 2022; 28:1732-1744. [PMID: 32946394 DOI: 10.1109/tvcg.2020.3025072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
Biases inevitably occur in numerical weather prediction (NWP) due to an idealized numerical assumption for modeling chaotic atmospheric systems. Therefore, the rapid and accurate identification and calibration of biases is crucial for NWP in weather forecasting. Conventional approaches, such as various analog post-processing forecast methods, have been designed to aid in bias calibration. However, these approaches fail to consider the spatiotemporal correlations of forecast bias, which can considerably affect calibration efficacy. In this article, we propose a novel bias pattern extraction approach based on forecasting-observation probability density by merging historical forecasting and observation datasets. Given a spatiotemporal scope, our approach extracts and fuses bias patterns and automatically divides regions with similar bias patterns. Termed BicaVis, our spatiotemporal bias pattern visual analytics system is proposed to assist experts in drafting calibration curves on the basis of these bias patterns. To verify the effectiveness of our approach, we conduct two case studies with real-world reanalysis datasets. The feedback collected from domain experts confirms the efficacy of our approach.
5. Pu J, Shao H, Gao B, Zhu Z, Zhu Y, Rao Y, Xiang Y. matExplorer: Visual Exploration on Predicting Ionic Conductivity for Solid-state Electrolytes. IEEE Transactions on Visualization and Computer Graphics 2022; 28:65-75. [PMID: 34587048 DOI: 10.1109/tvcg.2021.3114812] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/13/2023]
Abstract
Lithium ion batteries (LIBs) are widely used as important energy sources for mobile phones, electric vehicles, and drones. Experts have attempted to replace liquid electrolytes with solid electrolytes that have a wider electrochemical window and higher stability due to the potential safety risks, such as electrolyte leakage, flammable solvents, poor thermal stability, and many side reactions caused by liquid electrolytes. However, finding suitable alternative materials using traditional approaches is very difficult due to the incredibly high cost of searching. Machine learning (ML)-based methods are currently introduced and used for material prediction. However, learning tools designed for domain experts to conduct intuitive performance comparison and analysis of ML models are rare. In this case, we propose an interactive visualization system for experts to select suitable ML models and understand and explore the prediction results comprehensively. Our system uses a multifaceted visualization scheme designed to support analysis from various perspectives, such as feature distribution, data similarity, model performance, and result presentation. Case studies with actual lab experiments have been conducted by the experts, and the final results confirmed the effectiveness and helpfulness of our system.
6. Sun L, Zhang X, Pan X, Liu Y, Yu W, Xu T, Liu F, Chen W, Wang Y, Su W, Zhou Z. Visual analytics of genealogy with attribute-enhanced topological clustering. J Vis (Tokyo) 2021. [DOI: 10.1007/s12650-021-00802-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
7. Tao W, Hou X, Sah A, Battle L, Chang R, Stonebraker M. Kyrix-S: Authoring Scalable Scatterplot Visualizations of Big Data. IEEE Transactions on Visualization and Computer Graphics 2021; 27:401-411. [PMID: 33048700 DOI: 10.1109/tvcg.2020.3030372] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
Static scatterplots often suffer from the overdraw problem on big datasets where object overlap causes undesirable visual clutter. The use of zooming in scatterplots can help alleviate this problem. With multiple zoom levels, more screen real estate is available, allowing objects to be placed in a less crowded way. We call this type of visualization scalable scatterplot visualizations, or SSV for short. Despite the potential of SSVs, existing systems and toolkits fall short in supporting the authoring of SSVs due to three limitations. First, many systems have limited scalability, assuming that data fits in the memory of one computer. Second, too much developer work, e.g., using custom code to generate mark layouts or render objects, is required. Third, many systems focus on only a small subset of the SSV design space (e.g. supporting a specific type of visual marks). To address these limitations, we have developed Kyrix-S, a system for easy authoring of SSVs at scale. Kyrix-S derives a declarative grammar that enables specification of a variety of SSVs in a few tens of lines of code, based on an existing survey of scatterplot tasks and designs. The declarative grammar is supported by a distributed layout algorithm which automatically places visual marks onto zoom levels. We store data in a multi-node database and use multi-node spatial indexes to achieve interactive browsing of large SSVs. Extensive experiments show that 1) Kyrix-S enables interactive browsing of SSVs of billions of objects, with response times under 500ms and 2) Kyrix-S achieves 4X-9X reduction in specification compared to a state-of-the-art authoring system.
8. Zhang M, Chen L, Li Q, Yuan X, Yong J. Uncertainty-Oriented Ensemble Data Visualization and Exploration using Variable Spatial Spreading. IEEE Transactions on Visualization and Computer Graphics 2021; 27:1808-1818. [PMID: 33048703 DOI: 10.1109/tvcg.2020.3030377] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
As an important method of handling potential uncertainties in numerical simulations, ensemble simulation has been widely applied in many disciplines. Visualization is a promising and powerful ensemble simulation analysis method. However, conventional visualization methods mainly aim at data simplification and highlighting important information based on domain expertise instead of providing a flexible data exploration and intervention mechanism. Trial-and-error procedures have to be repeatedly conducted by such approaches. To resolve this issue, we propose a new perspective of ensemble data analysis using the attribute variable dimension as the primary analysis dimension. Particularly, we propose a variable uncertainty calculation method based on variable spatial spreading. Based on this method, we design an interactive ensemble analysis framework that provides a flexible interactive exploration of the ensemble data. Particularly, the proposed spreading curve view, the region stability heat map view, and the temporal analysis view, together with the commonly used 2D map view, jointly support uncertainty distribution perception, region selection, and temporal analysis, as well as other analysis requirements. We verify our approach by analyzing a real-world ensemble simulation dataset. Feedback collected from domain experts confirms the efficacy of our framework.
9. Ma Y, Fan A, He J, Nelakurthi AR, Maciejewski R. A Visual Analytics Framework for Explaining and Diagnosing Transfer Learning Processes. IEEE Transactions on Visualization and Computer Graphics 2021; 27:1385-1395. [PMID: 33035164 DOI: 10.1109/tvcg.2020.3028888] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/11/2023]
Abstract
Many statistical learning models hold an assumption that the training data and the future unlabeled data are drawn from the same distribution. However, this assumption is difficult to fulfill in real-world scenarios and creates barriers in reusing existing labels from similar application domains. Transfer Learning is intended to relax this assumption by modeling relationships between domains, and is often applied in deep learning applications to reduce the demand for labeled data and training time. Despite recent advances in exploring deep learning models with visual analytics tools, little work has explored the issue of explaining and diagnosing the knowledge transfer process between deep learning models. In this paper, we present a visual analytics framework for the multi-level exploration of the transfer learning processes when training deep neural networks. Our framework establishes a multi-aspect design to explain how the learned knowledge from the existing model is transferred into the new learning task when training deep neural networks. Based on a comprehensive requirement and task analysis, we employ descriptive visualization with performance measures and detailed inspections of model behaviors from the statistical, instance, feature, and model structure levels. We demonstrate our framework through two case studies on image classification by fine-tuning AlexNets to illustrate how analysts can utilize our framework.
10. Quadri GJ, Rosen P. Modeling the Influence of Visual Density on Cluster Perception in Scatterplots Using Topology. IEEE Transactions on Visualization and Computer Graphics 2021; 27:1829-1839. [PMID: 33048695 DOI: 10.1109/tvcg.2020.3030365] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
Scatterplots are used for a variety of visual analytics tasks, including cluster identification, and the visual encodings used on a scatterplot play a deciding role in the level of visual separation of clusters. For visualization designers, optimizing the visual encodings is crucial to maximizing the clarity of data. This requires accurately modeling human perception of cluster separation, which remains challenging. We present a multi-stage user study focusing on four factors (distribution size of clusters, number of points, size of points, and opacity of points) that influence cluster identification in scatterplots. From these parameters, we have constructed two models, a distance-based model and a density-based model, using the merge tree data structure from Topological Data Analysis. Our analysis demonstrates that these factors play an important role in the number of clusters perceived, and it verifies that the distance-based and density-based models can reasonably estimate the number of clusters a user observes. Finally, we demonstrate how these models can be used to optimize visual encodings on real-world data.
11. A visual uncertainty analytics approach for weather forecast similarity measurement based on fuzzy clustering. J Vis (Tokyo) 2021. [DOI: 10.1007/s12650-020-00709-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/26/2022]
12. Ma Y, Maciejewski R. Visual Analysis of Class Separations With Locally Linear Segments. IEEE Transactions on Visualization and Computer Graphics 2021; 27:241-253. [PMID: 32746282 DOI: 10.1109/tvcg.2020.3011155] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 06/11/2023]
Abstract
High-dimensional labeled data widely exists in many real-world applications such as classification and clustering. One main task in analyzing such datasets is to explore class separations and class boundaries derived from machine learning models. Dimension reduction techniques are commonly applied to support analysts in exploring the underlying decision boundary structures by depicting a low-dimensional representation of the data distributions from multiple classes. However, such projection-based analyses are limited due to their lack of ability to show separations in complex non-linear decision boundary structures and can suffer from heavy distortion and low interpretability. To overcome these issues of separability and interpretability, we propose a visual analysis approach that utilizes the power of explainability from linear projections to support analysts when exploring non-linear separation structures. Our approach is to extract a set of locally linear segments that approximate the original non-linear separations. Unlike traditional projection-based analysis where the data instances are mapped to a single scatterplot, our approach supports the exploration of complex class separations through multiple local projection results. We conduct case studies on two labeled datasets to demonstrate the effectiveness of our approach.
13. Ma Y, Tung AKH, Wang W, Gao X, Pan Z, Chen W. ScatterNet: A Deep Subjective Similarity Model for Visual Analysis of Scatterplots. IEEE Transactions on Visualization and Computer Graphics 2020; 26:1562-1576. [PMID: 30334762 DOI: 10.1109/tvcg.2018.2875702] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 06/08/2023]
Abstract
Similarity measuring methods are widely adopted in a broad range of visualization applications. In this work, we address the challenge of representing human perception in the visual analysis of scatterplots by introducing a novel deep-learning-based approach, ScatterNet, that captures perception-driven similarities of such plots. The approach exploits deep neural networks to extract semantic features of scatterplot images for similarity calculation. We create a large labeled dataset consisting of similar and dissimilar images of scatterplots to train the deep neural network. We conduct a set of evaluations including performance experiments and a user study to demonstrate the effectiveness and efficiency of our approach. The evaluations confirm that the learned features capture the human perception of scatterplot similarity effectively. We describe two scenarios to show how ScatterNet can be applied in visual analysis applications.
14. Ma Y, Xie T, Li J, Maciejewski R. Explaining Vulnerabilities to Adversarial Machine Learning through Visual Analytics. IEEE Transactions on Visualization and Computer Graphics 2020; 26:1075-1085. [PMID: 31478859 DOI: 10.1109/tvcg.2019.2934631] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 06/10/2023]
Abstract
Machine learning models are currently being deployed in a variety of real-world applications where model predictions are used to make decisions about healthcare, bank loans, and numerous other critical tasks. As the deployment of artificial intelligence technologies becomes ubiquitous, it is unsurprising that adversaries have begun developing methods to manipulate machine learning models to their advantage. While the visual analytics community has developed methods for opening the black box of machine learning models, little work has focused on helping the user understand their model vulnerabilities in the context of adversarial attacks. In this paper, we present a visual analytics framework for explaining and exploring model vulnerabilities to adversarial attacks. Our framework employs a multi-faceted visualization scheme designed to support the analysis of data poisoning attacks from the perspective of models, data instances, features, and local structures. We demonstrate our framework through two case studies on binary classifiers and illustrate model vulnerabilities with respect to varying attack strategies.
15. Wei Y, Mei H, Zhao Y, Zhou S, Lin B, Jiang H, Chen W. Evaluating Perceptual Bias During Geometric Scaling of Scatterplots. IEEE Transactions on Visualization and Computer Graphics 2020; 26:321-331. [PMID: 31403425 DOI: 10.1109/tvcg.2019.2934208] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 06/10/2023]
Abstract
Scatterplots are frequently scaled to fit display areas in multi-view and multi-device data analysis environments. A common method used for scaling is to enlarge or shrink the entire scatterplot together with the inside points synchronously and proportionally. This process is called geometric scaling. However, geometric scaling of scatterplots may cause a perceptual bias, that is, the perceived and physical values of visual features may be dissociated with respect to geometric scaling. For example, if a scatterplot is projected from a laptop to a large projector screen, then observers may feel that the scatterplot shown on the projector has fewer points than that viewed on the laptop. This paper presents an evaluation study on the perceptual bias of visual features in scatterplots caused by geometric scaling. The study focuses on three fundamental visual features (i.e., numerosity, correlation, and cluster separation) and three hypotheses that are formulated on the basis of our experience. We carefully design three controlled experiments by using well-prepared synthetic data and recruit participants to complete the experiments on the basis of their subjective experience. With a detailed analysis of the experimental results, we obtain a set of instructive findings. First, geometric scaling causes a bias that has a linear relationship with the scale ratio. Second, no significant difference exists between the biases measured from normally and uniformly distributed scatterplots. Third, changing the point radius can correct the bias to a certain extent. These findings can be used to inspire the design decisions of scatterplots in various scenarios.
16. Luo X, Yuan Y, Zhang K, Xia J, Zhou Z, Chang L, Gu T. Enhancing statistical charts: toward better data visualization and analysis. J Vis (Tokyo) 2019. [DOI: 10.1007/s12650-019-00569-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 10/26/2022]
17. Liu S, Chen C, Lu Y, Ouyang F, Wang B. An Interactive Method to Improve Crowdsourced Annotations. IEEE Transactions on Visualization and Computer Graphics 2018; 25:235-245. [PMID: 30130224 DOI: 10.1109/tvcg.2018.2864843] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 06/08/2023]
Abstract
In order to effectively infer correct labels from noisy crowdsourced annotations, learning-from-crowds models have introduced expert validation. However, little research has been done on facilitating the validation procedure. In this paper, we propose an interactive method to assist experts in verifying uncertain instance labels and unreliable workers. Given the instance labels and worker reliability inferred from a learning-from-crowds model, candidate instances and workers are selected for expert validation. The influence of verified results is propagated to relevant instances and workers through the learning-from-crowds model. To facilitate the validation of annotations, we have developed a confusion visualization to indicate the confusing classes for further exploration, a constrained projection method to show the uncertain labels in context, and a scatter-plot-based visualization to illustrate worker reliability. The three visualizations are tightly integrated with the learning-from-crowds model to provide an iterative and progressive environment for data validation. Two case studies were conducted that demonstrate our approach offers an efficient method for validating and improving crowdsourced annotations.
18. Chen Y, Dong Y, Sun Y, Liang J. A Multi-comparable visual analytic approach for complex hierarchical data. Journal of Visual Languages and Computing 2018. [DOI: 10.1016/j.jvlc.2018.02.003] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 11/17/2022]