1
|
Wootton D, Fox AR, Peck E, Satyanarayan A. Charting EDA: Characterizing Interactive Visualization Use in Computational Notebooks with a Mixed-Methods Formalism. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2025; 31:1191-1201. [PMID: 39388331 DOI: 10.1109/tvcg.2024.3456217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/12/2024]
Abstract
Interactive visualizations are powerful tools for Exploratory Data Analysis (EDA), but how do they affect the observations analysts make about their data? We conducted a qualitative experiment with 13 professional data scientists analyzing two datasets with Jupyter notebooks, collecting a rich dataset of interaction traces and think-aloud utterances. By qualitatively coding participant utterances, we introduce a formalism that describes EDA as a sequence of analysis states, where each state consists of either a representation an analyst constructs (e.g., the output of a data frame, an interactive visualization, etc.) or an observation the analyst makes (e.g., about missing data, the relationship between variables, etc.). By applying our formalism to our dataset, we find that interactive visualizations, on average, lead to earlier and more complex insights about relationships between dataset attributes compared to static visualizations. Moreover, by calculating metrics such as revisit count and representational diversity, we uncover that some representations serve more as "planning aids" during EDA rather than tools strictly for hypothesis-answering. We show how these measures help identify other patterns of analysis behavior, such as the "80-20 rule", where a small subset of representations drove the majority of observations. Based on these findings, we offer design guidelines for interactive exploratory analysis tooling and reflect on future directions for studying the role that visualizations play in EDA.
|
2
|
Block JE, Esmaeili S, Ragan ED, Goodall JR, Richardson GD. The Influence of Visual Provenance Representations on Strategies in a Collaborative Hand-off Data Analysis Scenario. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:1113-1123. [PMID: 36155463 DOI: 10.1109/tvcg.2022.3209495] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Data analysis tasks rarely occur in isolation. Especially in intelligence analysis scenarios where different experts contribute knowledge to a shared understanding, members must communicate how insights develop to establish common ground among collaborators. The use of provenance to communicate analytic sensemaking carries promise by describing the interactions and summarizing the steps taken to reach insights. Yet, no universal guidelines exist for communicating provenance in different settings. Our work focuses on the presentation of provenance information and the resulting conclusions reached and strategies used by new analysts. In an open-ended, 30-minute, textual exploration scenario, we qualitatively compare how adding different types of provenance information (specifically data coverage and interaction history) affects analysts' confidence in conclusions developed, propensity to repeat work, filtering of data, identification of relevant information, and typical investigation strategies. We see that data coverage (i.e., what was interacted with) provides provenance information without limiting individual investigation freedom. On the other hand, while interaction history (i.e., when something was interacted with) does not significantly encourage more mimicry, it does take more time to comfortably understand, as represented by less confident conclusions and less relevant information-gathering behaviors. Our results contribute empirical data towards understanding how provenance summarizations can influence analysis behaviors.
|
3
|
Lee DJL, Setlur V, Tory M, Karahalios K, Parameswaran A. Deconstructing Categorization in Visualization Recommendation: A Taxonomy and Comparative Study. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2022; 28:4225-4239. [PMID: 34061748 DOI: 10.1109/tvcg.2021.3085751] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
Visualization recommendation (VisRec) systems provide users with suggestions for potentially interesting and useful next steps during exploratory data analysis. These recommendations are typically organized into categories based on their analytical actions, i.e., operations employed to transition from the current exploration state to a recommended visualization. However, despite the emergence of a plethora of VisRec systems in recent work, the utility of the categories employed by these systems in analytical workflows has not been systematically investigated. Our article explores the efficacy of recommendation categories by formalizing a taxonomy of common categories and developing a system, Frontier, that implements these categories. Using Frontier, we evaluate workflow strategies adopted by users and how categories influence those strategies. Participants found recommendations that add attributes to enhance the current visualization and recommendations that filter to sub-populations to be comparatively most useful during data exploration. Our findings pave the way for next-generation VisRec systems that are adaptive and personalized via carefully chosen, effective recommendation categories.
|
4
|
Zhao J, Fan M, Feng M. ChartSeer: Interactive Steering Exploratory Visual Analysis With Machine Intelligence. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2022; 28:1500-1513. [PMID: 32833636 DOI: 10.1109/tvcg.2020.3018724] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
During exploratory visual analysis (EVA), analysts need to continually determine which subsequent activities to perform, such as which data variables to explore or how to present data variables visually. Due to the vast combinations of data variables and visual encodings that are possible, it is often challenging to make such decisions. Further, while performing local explorations, analysts often fail to attend to the holistic picture that is emerging from their analysis, leading them to improperly steer their EVA. These issues become even more impactful in real-world analysis scenarios where EVA occurs in multiple asynchronous sessions that could be completed by one or more analysts. To address these challenges, this work proposes ChartSeer, a system that uses machine intelligence to enable analysts to visually monitor the current state of an EVA and effectively identify future activities to perform. ChartSeer utilizes deep learning techniques to characterize analyst-created data charts to generate visual summaries and recommend appropriate charts for further exploration based on user interactions. A case study was first conducted to demonstrate the usage of ChartSeer in practice, followed by a controlled study to compare ChartSeer's performance with a baseline during EVA tasks. The results demonstrated that ChartSeer enables analysts to adequately understand current EVA status and advance their analysis by creating charts with increased coverage and visual encoding diversity.
|
5
|
Soure EJ, Kuang E, Fan M, Zhao J. CoUX: Collaborative Visual Analysis of Think-Aloud Usability Test Videos for Digital Interfaces. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2022; 28:643-653. [PMID: 34587055 DOI: 10.1109/tvcg.2021.3114822] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/13/2023]
Abstract
Reviewing a think-aloud video is both time-consuming and demanding as it requires UX (user experience) professionals to attend to many behavioral signals of the user in the video. Moreover, challenges arise when multiple UX professionals need to collaborate to reduce bias and errors. We propose a collaborative visual analytics tool, CoUX, to facilitate UX evaluators collectively reviewing think-aloud usability test videos of digital interfaces. CoUX seamlessly supports usability problem identification, annotation, and discussion in an integrated environment. To ease the discovery of usability problems, CoUX visualizes a set of problem-indicators based on acoustic, textual, and visual features extracted from the video and audio of a think-aloud session with machine learning. CoUX further enables collaboration amongst UX evaluators for logging, commenting, and consolidating the discovered problems with a chatbox-like user interface. We designed CoUX based on a formative study with two UX experts and insights derived from the literature. We conducted a user study with six pairs of UX practitioners on collaborative think-aloud video analysis tasks. The results indicate that CoUX is useful and effective in facilitating both problem identification and collaborative teamwork. We provide insights into how different features of CoUX were used to support both independent analysis and collaboration. Furthermore, our work highlights opportunities to improve collaborative usability test video analysis.
|
6
|
Narechania A, Coscia A, Wall E, Endert A. Lumos: Increasing Awareness of Analytic Behavior during Visual Data Analysis. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2022; 28:1009-1018. [PMID: 34587059 DOI: 10.1109/tvcg.2021.3114827] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Visual data analysis tools provide people with the agency and flexibility to explore data using a variety of interactive functionalities. However, this flexibility may introduce potential consequences in situations where users unknowingly overemphasize or underemphasize specific subsets of the data or attribute space they are analyzing. For example, users may overemphasize specific attributes and/or their values (e.g., Gender is always encoded on the X axis), underemphasize others (e.g., Religion is never encoded), ignore a subset of the data (e.g., older people are filtered out), etc. In response, we present Lumos, a visual data analysis tool that captures and shows the interaction history with data to increase awareness of such analytic behaviors. Using in-situ (at the place of interaction) and ex-situ (in an external view) visualization techniques, Lumos provides real-time feedback to users for them to reflect on their activities. For example, Lumos highlights datapoints that have been previously examined in the same visualization (in-situ) and also overlays them on the underlying data distribution (i.e., baseline distribution) in a separate visualization (ex-situ). Through a user study with 24 participants, we investigate how Lumos helps users' data exploration and decision-making processes. We found that Lumos increases users' awareness of visual data analysis practices in real-time, promoting reflection upon and acknowledgement of their intentions and potentially influencing subsequent interactions.
|
8
|
Monadjemi S, Garnett R, Ottley A. Competing Models: Inferring Exploration Patterns and Information Relevance via Bayesian Model Selection. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:412-421. [PMID: 33052859 DOI: 10.1109/tvcg.2020.3030430] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
Analyzing interaction data provides an opportunity to learn about users, uncover their underlying goals, and create intelligent visualization systems. The first step for intelligent response in visualizations is to enable computers to infer user goals and strategies through observing their interactions with a system. Researchers have proposed multiple techniques to model users; however, their frameworks often depend on the visualization design, interaction space, and dataset. Due to these dependencies, many techniques do not provide a general algorithmic solution to user exploration modeling. In this paper, we construct a series of models based on the dataset and pose user exploration modeling as a Bayesian model selection problem where we maintain a belief over numerous competing models that could explain user interactions. Each of these competing models represents an exploration strategy the user could adopt during a session. The goal of our technique is to make high-level and in-depth inferences about the user by observing their low-level interactions. Although our proposed idea is applicable to various probabilistic model spaces, we demonstrate a specific instance of encoding exploration patterns as competing models to infer information relevance. We validate our technique's ability to infer exploration bias, predict future interactions, and summarize an analytic session using user study datasets. Our results indicate that depending on the application, our method outperforms established baselines for bias detection and future interaction prediction. Finally, we discuss future research directions based on our proposed modeling paradigm and suggest how practitioners can use this method to build intelligent visualization systems that understand users' goals and adapt to improve the exploration process.
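As a rough illustration of the idea in this abstract, maintaining a belief over competing models amounts to a Bayes-rule update: each model's prior weight is multiplied by the likelihood it assigns to the observed interactions, and the results are renormalized. The sketch below is not the authors' implementation. The two models, their probabilities, and the click sequence are hypothetical, and each model is assumed to be a simple categorical distribution over clickable data points with independent interactions.

```python
from math import prod
from typing import Dict, Sequence


def posterior_over_models(
    models: Dict[str, Sequence[float]],
    prior: Dict[str, float],
    interactions: Sequence[int],
) -> Dict[str, float]:
    """Return P(model | interactions) for each competing model.

    Assumption (for illustration only): every model is a categorical
    distribution over data-point indices, and interactions are treated
    as independent draws from that distribution.
    """
    unnormalized = {}
    for name, probs in models.items():
        # Likelihood of the whole interaction sequence under this model.
        likelihood = prod(probs[i] for i in interactions)
        unnormalized[name] = prior[name] * likelihood

    evidence = sum(unnormalized.values())
    if evidence == 0.0:
        raise ValueError("no model assigns non-zero probability to the data")
    return {name: value / evidence for name, value in unnormalized.items()}


# Hypothetical models over four data points.
models = {
    "uniform": [0.25, 0.25, 0.25, 0.25],
    "prefers_first_two": [0.4, 0.4, 0.1, 0.1],
}
prior = {"uniform": 0.5, "prefers_first_two": 0.5}
clicks = [0, 1, 0]  # hypothetical observed interactions (data-point indices)

beliefs = posterior_over_models(models, prior, clicks)
print(beliefs)
```

With these made-up numbers, the belief shifts toward the model that favors the first two points, because it assigns a higher likelihood to the observed clicks than the uniform model does.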
|
9
|
Han Q, Thom D, John M, Koch S, Heimerl F, Ertl T. Visual Quality Guidance for Document Exploration with Focus+Context Techniques. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:2715-2731. [PMID: 30676964 DOI: 10.1109/tvcg.2019.2895073] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 06/09/2023]
Abstract
Magic lens based focus+context techniques are powerful means for exploring document spatializations. Typically, they only offer additional summarized or abstracted views on focused documents. As a consequence, users might miss important information that is either not shown in aggregated form or that never happens to get focused. In this work, we present the design process and user study results for improving a magic lens based document exploration approach with exemplary visual quality cues to guide users in steering the exploration and support them in interpreting the summarization results. We contribute a thorough analysis of potential sources of information loss involved in these techniques, which include the visual spatialization of text documents, user-steered exploration, and the visual summarization. With lessons learned from previous research, we highlight the various ways those information losses could hamper the exploration. Furthermore, we formally define measures for the aforementioned different types of information losses and bias. Finally, we present the visual cues to depict these quality measures that are seamlessly integrated into the exploration approach. These visual cues guide users during the exploration, reduce the risk of misinterpretation, and accelerate insight generation. We conclude with the results of a controlled user study and discuss the benefits and challenges of integrating quality guidance in exploration techniques.
|
10
|
Borland D, Wang W, Gotz D. Contextual Visualization. IEEE COMPUTER GRAPHICS AND APPLICATIONS 2018; 38:17-23. [PMID: 30668452 DOI: 10.1109/mcg.2018.2874782] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
Unseen information can lead to various "threats to validity" when analyzing complex datasets using visual tools, resulting in potentially biased findings. We enumerate sources of unseen information and argue that a new focus on contextual visualization methods is needed to inform users of these threats and to mitigate their effects.
|
11
|
Feng M, Peck E, Harrison L. Patterns and Pace: Quantifying Diverse Exploration Behavior with Visualizations on the Web. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 25:501-511. [PMID: 30188824 DOI: 10.1109/tvcg.2018.2865117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
The diverse and vibrant ecosystem of interactive visualizations on the web presents an opportunity for researchers and practitioners to observe and analyze how everyday people interact with data visualizations. However, existing metrics of visualization interaction behavior used in research do not fully reveal the breadth of people's open-ended explorations with visualizations. One possible way to address this challenge is to determine high-level goals for visualization interaction metrics, and infer corresponding features from user interaction data that characterize different aspects of people's explorations of visualizations. In this paper, we identify needs for visualization behavior measurement, and develop corresponding candidate features that can be inferred from users' interaction data. We then propose metrics that capture novel aspects of people's open-ended explorations, including exploration uniqueness and exploration pacing. We evaluate these metrics along with four other metrics recently proposed in visualization literature by applying them to interaction data from prior visualization studies. The results of these evaluations suggest that these new metrics 1) reveal new characteristics of people's use of visualizations, 2) can be used to evaluate statistical differences between visualization designs, and 3) are statistically independent of prior metrics used in visualization research. We discuss implications of these results for future studies, including the potential for applying these metrics in visualization interaction analysis, as well as emerging challenges in developing and selecting metrics depicting visualization explorations.
|
12
|
Zhao J, Glueck M, Isenberg P, Chevalier F, Khan A. Supporting Handoff in Asynchronous Collaborative Sensemaking Using Knowledge-Transfer Graphs. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 24:340-350. [PMID: 28866583 DOI: 10.1109/tvcg.2017.2745279] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2023]
Abstract
During asynchronous collaborative analysis, handoff of partial findings is challenging because externalizations produced by analysts may not adequately communicate their investigative process. To address this challenge, we developed techniques to automatically capture and help encode tacit aspects of the investigative process based on an analyst's interactions, and streamline explicit authoring of handoff annotations. We designed our techniques to mediate awareness of analysis coverage, support explicit communication of progress and uncertainty with annotation, and implicit communication through playback of investigation histories. To evaluate our techniques, we developed an interactive visual analysis system, KTGraph, that supports an asynchronous investigative document analysis task. We conducted a two-phase user study to characterize a set of handoff strategies and to compare investigative performance with and without our techniques. The results suggest that our techniques promote the use of more effective handoff strategies, help increase an awareness of prior investigative process and insights, as well as improve final investigative outcomes.
|
13
|
Xia J, Ye F, Chen W, Wang Y, Chen W, Ma Y, Tung AKH. LDSScanner: Exploratory Analysis of Low-Dimensional Structures in High-Dimensional Datasets. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 24:236-245. [PMID: 28866522 DOI: 10.1109/tvcg.2017.2744098] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Indexed: 06/07/2023]
Abstract
Many approaches for analyzing a high-dimensional dataset assume that the dataset contains specific structures, e.g., clusters in linear subspaces or non-linear manifolds. This yields a trial-and-error process to verify the appropriate model and parameters. This paper contributes an exploratory interface that supports visual identification of low-dimensional structures in a high-dimensional dataset, and facilitates the optimized selection of data models and configurations. Our key idea is to abstract a set of global and local feature descriptors from the neighborhood graph-based representation of the latent low-dimensional structure, such as pairwise geodesic distance (GD) among points and pairwise local tangent space divergence (LTSD) among pointwise local tangent spaces (LTS). We propose a new LTSD-GD view, which is constructed by mapping LTSD and GD to the x axis and y axis using 1D multidimensional scaling, respectively. Unlike traditional dimensionality reduction methods that preserve various kinds of distances among points, the LTSD-GD view presents the distribution of pointwise LTS (x axis) and the variation of LTS in structures (the combination of x axis and y axis). We design and implement a suite of visual tools for navigating and reasoning about intrinsic structures of a high-dimensional dataset. Three case studies verify the effectiveness of our approach.