51
|
Knittel J, Lalama A, Koch S, Ertl T. Visual Neural Decomposition to Explain Multivariate Data Sets. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:1374-1384. [PMID: 33048724 DOI: 10.1109/tvcg.2020.3030420] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Investigating relationships between variables in multi-dimensional data sets is a common task for data analysts and engineers. More specifically, it is often valuable to understand which ranges of which input variables lead to particular values of a given target variable. Unfortunately, with an increasing number of independent variables, this process may become cumbersome and time-consuming due to the many possible combinations that have to be explored. In this paper, we propose a novel approach to visualize correlations between input variables and a target output variable that scales to hundreds of variables. We developed a visual model based on neural networks that can be explored in a guided way to help analysts find and understand such correlations. First, we train a neural network to predict the target from the input variables. Then, we visualize the inner workings of the resulting model to help understand relations within the data set. We further introduce a new regularization term for the backpropagation algorithm that encourages the neural network to learn representations that are easier to interpret visually. We apply our method to artificial and real-world data sets to show its utility.
Collapse
|
52
|
Lumina: an adaptive, automated and extensible prototype for exploring, enriching and visualizing data. J Vis (Tokyo) 2021. [DOI: 10.1007/s12650-020-00718-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/22/2022]
|
53
|
Rubab S, Tang J, Wu Y. Examining interaction techniques in data visualization authoring tools from the perspective of goals and human cognition: a survey. J Vis (Tokyo) 2021. [DOI: 10.1007/s12650-020-00705-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
|
54
|
Ben Lahmar H, Herschel M. Collaborative filtering over evolution provenance data for interactive visual data exploration. INFORM SYST 2021. [DOI: 10.1016/j.is.2020.101620] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]
|
55
|
Rahman P, Nandi A, Hebert C. Amplifying Domain Expertise in Clinical Data Pipelines. JMIR Med Inform 2020; 8:e19612. [PMID: 33151150 PMCID: PMC7677017 DOI: 10.2196/19612] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/24/2020] [Revised: 07/07/2020] [Accepted: 07/22/2020] [Indexed: 11/28/2022] Open
Abstract
Digitization of health records has allowed the health care domain to adopt data-driven algorithms for decision support. There are multiple people involved in this process: a data engineer who processes and restructures the data, a data scientist who develops statistical models, and a domain expert who informs the design of the data pipeline and consumes its results for decision support. Although there are multiple data interaction tools for data scientists, few exist to allow domain experts to interact with data meaningfully. Designing systems for domain experts requires careful thought because they have different needs and characteristics from other end users. There should be an increased emphasis on the system to optimize the experts' interaction by directing them to high-impact data tasks and reducing the total task completion time. We refer to this optimization as amplifying domain expertise. Although there is active research in making machine learning models more explainable and usable, it focuses on the final outputs of the model. However, in the clinical domain, expert involvement is needed at every pipeline step: curation, cleaning, and analysis. To this end, we review literature from the database, human-computer information, and visualization communities to demonstrate the challenges and solutions at each of the data pipeline stages. Next, we present a taxonomy of expertise amplification, which can be applied when building systems for domain experts. This includes summarization, guidance, interaction, and acceleration. Finally, we demonstrate the use of our taxonomy with a case study.
Collapse
Affiliation(s)
| | - Arnab Nandi
- The Ohio State University, Columbus, OH, United States
| | | |
Collapse
|
56
|
Yang D, Yu S, Hao Y. Visual Analysis of Sorting and Classification of Multidimensional Data. INT J PATTERN RECOGN 2020. [DOI: 10.1142/s021800142155003x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
Abstract
An important work of data analysis is to identify correlation structures and classify the data in unlabeled high-dimensional data, which usually requires iterative experiments on clustering parameters, attribute weights and instances. For a large dataset, the number of clusters may be huge, and it is a great challenge to explore in this huge space. People usually have a more comprehensive understanding of some data. For example, they think that data A is better than data B, but they do not know which attributes are important. Therefore, a powerful interactive analysis tool can help people greatly improve the effectiveness of exploratory clustering analysis. This paper provides a visual analysis method for sorting and classifying multivariate data. It can determine the weight of each attribute through user’s interaction, thus, generating sorting, and then complete classification according to sorting results. Through visual display, users can understand the characteristics of data as well as category characteristics intuitively and quickly, and it helps users improve sorting and classification results.
Collapse
Affiliation(s)
- Dongsheng Yang
- Shenyang Institute of Computing Technology, Chinese Academy of Sciences, Shenyang 110168, P. R. China
| | - Shidong Yu
- Shenyang Institute of Computing Technology, Chinese Academy of Sciences, Shenyang 110168, P. R. China
- Department of Electrical Engineering, Yingkou Institute of Technology, Yingkou 115014, P. R. China
- University of Chinese Academy of Sciences, Beijing 100049, P. R. China
| | - Ying Hao
- Department of Electrical Engineering, Yingkou Institute of Technology, Yingkou 115014, P. R. China
- Department of Information Science, Dalian Maritime University, Dalian 116026, P. R. China
| |
Collapse
|
57
|
Ivson P, Moreira A, Queiroz F, Santos W, Celes W. A Systematic Review of Visualization in Building Information Modeling. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:3109-3127. [PMID: 30932840 DOI: 10.1109/tvcg.2019.2907583] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
Building Information Modeling (BIM) employs data-rich 3D CAD models for large-scale facility design, construction, and operation. These complex datasets contain a large amount and variety of information, ranging from design specifications to real-time sensor data. They are used by architects and engineers for various analysis and simulations throughout a facility's life cycle. Many techniques from different visualization fields could be used to analyze these data. However, the BIM domain still remains largely unexplored by the visualization community. The goal of this article is to encourage visualization researchers to increase their involvement with BIM. To this end, we present the results of a systematic review of visualization in current BIM practice. We use a novel taxonomy to identify main application areas and analyze commonly employed techniques. From this domain characterization, we highlight future research opportunities brought forth by the unique features of BIM. For instance, exploring the synergies between scientific and information visualization to integrate spatial and non-spatial data. We hope this article raises awareness to interesting new challenges the BIM domain brings to the visualization community.
Collapse
|
58
|
Abstract
Exploratory data analysis is a crucial part of data-driven scientific discovery. Yet, the process of discovering insights from visualization can be a manual and painstaking process. This article discusses some of the lessons we learned from working with scientists in designing visual data exploration system, along with design considerations for future tools.
Collapse
|
59
|
Chen Z, Su Y, Wang Y, Wang Q, Qu H, Wu Y. MARVisT: Authoring Glyph-Based Visualization in Mobile Augmented Reality. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:2645-2658. [PMID: 30640614 DOI: 10.1109/tvcg.2019.2892415] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
Recent advances in mobile augmented reality (AR) techniques have shed new light on personal visualization for their advantages of fitting visualization within personal routines, situating visualization in a real-world context, and arousing users' interests. However, enabling non-experts to create data visualization in mobile AR environments is challenging given the lack of tools that allow in-situ design while supporting the binding of data to AR content. Most existing AR authoring tools require working on personal computers or manually creating each virtual object and modifying its visual attributes. We systematically study this issue by identifying the specificity of AR glyph-based visualization authoring tool and distill four design considerations. Following these design considerations, we design and implement MARVisT, a mobile authoring tool that leverages information from reality to assist non-experts in addressing relationships between data and virtual glyphs, real objects and virtual glyphs, and real objects and data. With MARVisT, users without visualization expertise can bind data to real-world objects to create expressive AR glyph-based visualizations rapidly and effortlessly, reshaping the representation of the real world with data. We use several examples to demonstrate the expressiveness of MARVisT. A user study with non-experts is also conducted to evaluate the authoring experience of MARVisT.
Collapse
|
60
|
Representing Data Visualization Goals and Tasks through Meta-Modeling to Tailor Information Dashboards. APPLIED SCIENCES-BASEL 2020. [DOI: 10.3390/app10072306] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
Information dashboards are everywhere. They support knowledge discovery in a huge variety of contexts and domains. Although powerful, these tools can be complex, not only for the end-users but also for developers and designers. Information dashboards encode complex datasets into different visual marks to ease knowledge discovery. Choosing a wrong design could compromise the entire dashboard’s effectiveness, selecting the appropriate encoding or configuration for each potential context, user, or data domain is a crucial task. For these reasons, there is a necessity to automatize the recommendation of visualizations and dashboard configurations to deliver tools adapted to their context. Recommendations can be based on different aspects, such as user characteristics, the data domain, or the goals and tasks that will be achieved or carried out through the visualizations. This work presents a dashboard meta-model that abstracts all these factors and the integration of a visualization task taxonomy to account for the different actions that can be performed with information dashboards. This meta-model has been used to design a domain specific language to specify dashboards requirements in a structured way. The ultimate goal is to obtain a dashboard generation pipeline to deliver dashboards adapted to any context, such as the educational context, in which a lot of data are generated, and there are several actors involved (students, teachers, managers, etc.) that would want to reach different insights regarding their learning performance or learning methodologies.
Collapse
|
61
|
Zhu-Tian C, Wang Y, Wang Q, Wang Y, Qu H. Towards Automated Infographic Design: Deep Learning-based Auto-Extraction of Extensible Timeline. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:917-926. [PMID: 31443028 DOI: 10.1109/tvcg.2019.2934810] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Designers need to consider not only perceptual effectiveness but also visual styles when creating an infographic. This process can be difficult and time consuming for professional designers, not to mention non-expert users, leading to the demand for automated infographics design. As a first step, we focus on timeline infographics, which have been widely used for centuries. We contribute an end-to-end approach that automatically extracts an extensible timeline template from a bitmap image. Our approach adopts a deconstruction and reconstruction paradigm. At the deconstruction stage, we propose a multi-task deep neural network that simultaneously parses two kinds of information from a bitmap timeline: 1) the global information, i.e., the representation, scale, layout, and orientation of the timeline, and 2) the local information, i.e., the location, category, and pixels of each visual element on the timeline. At the reconstruction stage, we propose a pipeline with three techniques, i.e., Non-Maximum Merging, Redundancy Recover, and DL GrabCut, to extract an extensible template from the infographic, by utilizing the deconstruction results. To evaluate the effectiveness of our approach, we synthesize a timeline dataset (4296 images) and collect a real-world timeline dataset (393 images) from the Internet. We first report quantitative evaluation results of our approach over the two datasets. Then, we present examples of automatically extracted templates and timelines automatically generated based on these templates to qualitatively demonstrate the performance. The results confirm that our approach can effectively extract extensible templates from real-world timeline infographics.
Collapse
|
62
|
Smart S, Wu K, Szafir DA. Color Crafting: Automating the Construction of Designer Quality Color Ramps. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:1215-1225. [PMID: 31425090 DOI: 10.1109/tvcg.2019.2934284] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Visualizations often encode numeric data using sequential and diverging color ramps. Effective ramps use colors that are sufficiently discriminable, align well with the data, and are aesthetically pleasing. Designers rely on years of experience to create high-quality color ramps. However, it is challenging for novice visualization developers that lack this experience to craft effective ramps as most guidelines for constructing ramps are loosely defined qualitative heuristics that are often difficult to apply. Our goal is to enable visualization developers to readily create effective color encodings using a single seed color. We do this using an algorithmic approach that models designer practices by analyzing patterns in the structure of designer-crafted color ramps. We construct these models from a corpus of 222 expert-designed color ramps, and use the results to automatically generate ramps that mimic designer practices. We evaluate our approach through an empirical study comparing the outputs of our approach with designer-crafted color ramps. Our models produce ramps that support accurate and aesthetically pleasing visualizations at least as well as designer ramps and that outperform conventional mathematical approaches.
Collapse
|
63
|
Mei H, Chen W, Wei Y, Hu Y, Zhou S, Lin B, Zhao Y, Xia J. RSATree: Distribution-Aware Data Representation of Large-Scale Tabular Datasets for Flexible Visual Query. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:1161-1171. [PMID: 31443022 DOI: 10.1109/tvcg.2019.2934800] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Analysts commonly investigate the data distributions derived from statistical aggregations of data that are represented by charts, such as histograms and binned scatterplots, to visualize and analyze a large-scale dataset. Aggregate queries are implicitly executed through such a process. Datasets are constantly extremely large; thus, the response time should be accelerated by calculating predefined data cubes. However, the queries are limited to the predefined binning schema of preprocessed data cubes. Such limitation hinders analysts' flexible adjustment of visual specifications to investigate the implicit patterns in the data effectively. Particularly, RSATree enables arbitrary queries and flexible binning strategies by leveraging three schemes, namely, an R-tree-based space partitioning scheme to catch the data distribution, a locality-sensitive hashing technique to achieve locality-preserving random access to data items, and a summed area table scheme to support interactive query of aggregated values with a linear computational complexity. This study presents and implements a web-based visual query system that supports visual specification, query, and exploration of large-scale tabular data with user-adjustable granularities. We demonstrate the efficiency and utility of our approach by performing various experiments on real-world datasets and analyzing time and space complexity.
Collapse
|
64
|
Satyanarayan A, Lee B, Ren D, Heer J, Stasko J, Thompson J, Brehmer M, Liu Z. Critical Reflections on Visualization Authoring Systems. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:461-471. [PMID: 31442976 DOI: 10.1109/tvcg.2019.2934281] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
An emerging generation of visualization authoring systems support expressive information visualization without textual programming. As they vary in their visualization models, system architectures, and user interfaces, it is challenging to directly compare these systems using traditional evaluative methods. Recognizing the value of contextualizing our decisions in the broader design space, we present critical reflections on three systems we developed -Lyra, Data Illustrator, and Charticulator. This paper surfaces knowledge that would have been daunting within the constituent papers of these three systems. We compare and contrast their (previously unmentioned) limitations and trade-offs between expressivity and learnability. We also reflect on common assumptions that we made during the development of our systems, thereby informing future research directions in visualization authoring systems.
Collapse
|
65
|
Cui W, Zhang X, Wang Y, Huang H, Chen B, Fang L, Zhang H, Lou JG, Zhang D. Text-to-Viz: Automatic Generation of Infographics from Proportion-Related Natural Language Statements. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:906-916. [PMID: 31478860 DOI: 10.1109/tvcg.2019.2934785] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Combining data content with visual embellishments, infographics can effectively deliver messages in an engaging and memorable manner. Various authoring tools have been proposed to facilitate the creation of infographics. However, creating a professional infographic with these authoring tools is still not an easy task, requiring much time and design expertise. Therefore, these tools are generally not attractive to casual users, who are either unwilling to take time to learn the tools or lacking in proper design expertise to create a professional infographic. In this paper, we explore an alternative approach: to automatically generate infographics from natural language statements. We first conducted a preliminary study to explore the design space of infographics. Based on the preliminary study, we built a proof-of-concept system that automatically converts statements about simple proportion-related statistics to a set of infographics with pre-designed styles. Finally, we demonstrated the usability and usefulness of the system through sample results, exhibits, and expert reviews.
Collapse
|
66
|
Wang Y, Sun Z, Zhang H, Cui W, Xu K, Ma X, Zhang D. DataShot: Automatic Generation of Fact Sheets from Tabular Data. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:895-905. [PMID: 31425110 DOI: 10.1109/tvcg.2019.2934398] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Fact sheets with vivid graphical design and intriguing statistical insights are prevalent for presenting raw data. They help audiences understand data-related facts effectively and make a deep impression. However, designing a fact sheet requires both data and design expertise and is a laborious and time-consuming process. One needs to not only understand the data in depth but also produce intricate graphical representations. To assist in the design process, we present DataShot which, to the best of our knowledge, is the first automated system that creates fact sheets automatically from tabular data. First, we conduct a qualitative analysis of 245 infographic examples to explore general infographic design space at both the sheet and element levels. We identify common infographic structures, sheet layouts, fact types, and visualization styles during the study. Based on these findings, we propose a fact sheet generation pipeline, consisting of fact extraction, fact composition, and presentation synthesis, for the auto-generation workflow. To validate our system, we present use cases with three real-world datasets. We conduct an in-lab user study to understand the usage of our system. Our evaluation results show that DataShot can efficiently generate satisfactory fact sheets to support further customization and data presentation.
Collapse
|
67
|
Dibia V, Demiralp C. Data2Vis: Automatic Generation of Data Visualizations Using Sequence-to-Sequence Recurrent Neural Networks. IEEE COMPUTER GRAPHICS AND APPLICATIONS 2019; 39:33-46. [PMID: 31247545 DOI: 10.1109/mcg.2019.2924636] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
Rapidly creating effective visualizations using expressive grammars is challenging for users who have limited time and limited skills in statistics and data visualization. Even high-level, dedicated visualization tools often require users to manually select among data attributes, decide which transformations to apply, and specify mappings between visual encoding variables and raw or transformed attributes. In this paper we introduce Data2Vis, an end-to-end trainable neural translation model for automatically generating visualizations from given datasets. We formulate visualization generation as a language translation problem, where data specifications are mapped to visualization specifications in a declarative language (Vega-Lite). To this end, we train a multilayered attention-based encoder-decoder network with long short-term memory (LSTM) units on a corpus of visualization specifications. Qualitative results show that our model learns the vocabulary and syntax for a valid visualization specification, appropriate transformations (count, bins, mean), and how to use common data selection patterns that occur within data visualizations. We introduce two metrics for evaluating the task of automated visualization generation (language syntax validity, visualization grammar syntax validity) and demonstrate the efficacy of bidirectional models with attention mechanisms for this task. Data2Vis generates visualizations that are comparable to manually created visualizations in a fraction of the time, with potential to learn more complex visualization strategies at scale.
Collapse
|
68
|
Giuzio A, Mecca G, Quintarelli E, Roveri M, Santoro D, Tanca L. INDIANA: An interactive system for assisting database exploration. INFORM SYST 2019. [DOI: 10.1016/j.is.2019.01.003] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]
|
69
|
Saket B, Endert A, Demiralp C. Task-Based Effectiveness of Basic Visualizations. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2019; 25:2505-2512. [PMID: 29994001 DOI: 10.1109/tvcg.2018.2829750] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Visualizations of tabular data are widely used; understanding their effectiveness in different task and data contexts is fundamental to scaling their impact. However, little is known about how basic tabular data visualizations perform across varying data analysis tasks. In this paper, we report results from a crowdsourced experiment to evaluate the effectiveness of five small scale (5-34 data points) two-dimensional visualization types-Table, Line Chart, Bar Chart, Scatterplot, and Pie Chart-across ten common data analysis tasks using two datasets. We find the effectiveness of these visualization types significantly varies across task, suggesting that visualization design would benefit from considering context-dependent effectiveness. Based on our findings, we derive recommendations on which visualizations to choose based on different tasks. We finally train a decision tree on the data we collected to drive a recommender, showcasing how to effectively engineer experimental user data into practical visualization systems.
Collapse
|
70
|
Data visualization literacy: Definitions, conceptual frameworks, exercises, and assessments. Proc Natl Acad Sci U S A 2019; 116:1857-1864. [PMID: 30718386 DOI: 10.1073/pnas.1807180116] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
In the information age, the ability to read and construct data visualizations becomes as important as the ability to read and write text. However, while standard definitions and theoretical frameworks to teach and assess textual, mathematical, and visual literacy exist, current data visualization literacy (DVL) definitions and frameworks are not comprehensive enough to guide the design of DVL teaching and assessment. This paper introduces a data visualization literacy framework (DVL-FW) that was specifically developed to define, teach, and assess DVL. The holistic DVL-FW promotes both the reading and construction of data visualizations, a pairing analogous to that of both reading and writing in textual literacy and understanding and applying in mathematical literacy. Specifically, the DVL-FW defines a hierarchical typology of core concepts and details the process steps that are required to extract insights from data. Advancing the state of the art, the DVL-FW interlinks theoretical and procedural knowledge and showcases how both can be combined to design curricula and assessment measures for DVL. Earlier versions of the DVL-FW have been used to teach DVL to more than 8,500 residential and online students, and results from this effort have helped revise and validate the DVL-FW presented here.
Collapse
|
71
|
Agency plus automation: Designing artificial intelligence into interactive systems. Proc Natl Acad Sci U S A 2019; 116:1844-1850. [PMID: 30718389 DOI: 10.1073/pnas.1807184115] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
Much contemporary rhetoric regards the prospects and pitfalls of using artificial intelligence techniques to automate an increasing range of tasks, especially those once considered the purview of people alone. These accounts are often wildly optimistic, understating outstanding challenges while turning a blind eye to the human labor that undergirds and sustains ostensibly "automated" services. This long-standing focus on purely automated methods unnecessarily cedes a promising design space: one in which computational assistance augments and enriches, rather than replaces, people's intellectual work. This tension between human agency and machine automation poses vital challenges for design and engineering. In this work, we consider the design of systems that enable rich, adaptive interaction between people and algorithms. We seek to balance the often-complementary strengths and weaknesses of each, while promoting human control and skillful action. We share case studies of interactive systems we have developed in three arenas-data wrangling, exploratory analysis, and natural language translation-that integrate proactive computational support into interactive systems. To improve outcomes and support learning by both people and machines, we describe the use of shared representations of tasks augmented with predictive models of human capabilities and actions. We conclude with a discussion of future prospects and scientific frontiers for intelligence augmentation research.
Collapse
|
72
|
Camisetty A, Chandurkar C, Sun M, Koop D. Enhancing Web-based Analytics Applications through Provenance. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 25:131-141. [PMID: 30346289 DOI: 10.1109/tvcg.2018.2865039] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Visual analytics systems continue to integrate new technologies and leverage modern environments for exploration and collaboration, making tools and techniques available to a wide audience through web browsers. Many of these systems have been developed with rich interactions, offering users the opportunity to examine details and explore hypotheses that have not been directly encoded by a designer. Understanding is enhanced when users can replay and revisit the steps in the sensemaking process, and in collaborative settings, it is especially important to be able to review not only the current state but also what decisions were made along the way. Unfortunately, many web-based systems lack the ability to capture such reasoning, and the path to a result is transient, forgotten when a user moves to a new view. This paper explores the requirements to augment existing client-side web applications with support for capturing, reviewing, sharing, and reusing steps in the reasoning process. Furthermore, it considers situations where decisions are made with streaming data, and the insights gained from revisiting those choices when more data is available. It presents a proof of concept, the Shareable Interactive Manipulation Provenance framework (SIMProv.js), that addresses these requirements in a modern, client-side JavaScript library, and describes how it can be integrated with existing frameworks.
Collapse
|
73
|
Ruotsalo T, Peltonen J, Eugster MJA, Głowacka D, Floréen P, Myllymäki P, Jacucci G, Kaski S. Interactive Intent Modeling for Exploratory Search. ACM T INFORM SYST 2018. [DOI: 10.1145/3231593] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/28/2022]
Abstract
Exploratory search requires the system to assist the user in comprehending the information space and expressing evolving search intents for iterative exploration and retrieval of information. We introduce interactive intent modeling, a technique that models a user’s evolving search intents and visualizes them as keywords for interaction. The user can provide feedback on the keywords, from which the system learns and visualizes an improved intent estimate and retrieves information. We report experiments comparing variants of a system implementing interactive intent modeling to a control system. Data comprising search logs, interaction logs, essay answers, and questionnaires indicate significant improvements in task performance, information retrieval performance over the session, information comprehension performance, and user experience. The improvements in retrieval effectiveness can be attributed to the intent modeling and the effect on users’ task performance, breadth of information comprehension, and user experience are shown to be dependent on a richer visualization. Our results demonstrate the utility of combining interactive modeling of search intentions with interactive visualization of the models that can benefit both directing the exploratory search process and making sense of the information space. Our findings can help design personalized systems that support exploratory information seeking and discovery of novel information.
Collapse
|
74
|
Alspaugh S, Zokaei N, Liu A, Jin C, Hearst MA. Futzing and Moseying: Interviews with Professional Data Analysts on Exploration Practices. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 25:22-31. [PMID: 30136976 DOI: 10.1109/tvcg.2018.2865040] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
We report the results of interviewing thirty professional data analysts working in a range of industrial, academic, and regulatory environments. This study focuses on participants' descriptions of exploratory activities and tool usage in these activities. Highlights of the findings include: distinctions between exploration as a precursor to more directed analysis versus truly open-ended exploration; confirmation that some analysts see "finding something interesting" as a valid goal of data exploration while others explicitly disavow this goal; conflicting views about the role of intelligent tools in data exploration; and pervasive use of visualization for exploration, but with only a subset using direct manipulation interfaces. These findings provide guidelines for future tool development, as well as a better understanding of the meaning of the term "data exploration" based on the words of practitioners "in the wild."
Collapse
|
75
|
Kerzner E, Goodwin S, Dykes J, Jones S, Meyer M. A Framework for Creative Visualization-Opportunities Workshops. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 25:748-758. [PMID: 30137005 DOI: 10.1109/tvcg.2018.2865241] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/07/2023]
Abstract
Applied visualization researchers often work closely with domain collaborators to explore new and useful applications of visualization. The early stages of collaborations are typically time consuming for all stakeholders as researchers piece together an understanding of domain challenges from disparate discussions and meetings. A number of recent projects, however, report on the use of creative visualization-opportunities (CVO) workshops to accelerate the early stages of applied work, eliciting a wealth of requirements in a few days of focused work. Yet, there is no established guidance for how to use such workshops effectively. In this paper, we present the results of a 2-year collaboration in which we analyzed the use of 17 workshops in 10 visualization contexts. Its primary contribution is a framework for CVO workshops that: 1) identifies a process model for using workshops; 2) describes a structure of what happens within effective workshops; 3) recommends 25 actionable guidelines for future workshops; and 4) presents an example workshop and workshop methods. The creation of this framework exemplifies the use of critical reflection to learn about visualization in practice from diverse studies and experience.
Collapse
|
76
|
Moritz D, Wang C, Nelson GL, Lin H, Smith AM, Howe B, Heer J. Formalizing Visualization Design Knowledge as Constraints: Actionable and Extensible Models in Draco. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 25:438-448. [PMID: 30137004 DOI: 10.1109/tvcg.2018.2865240] [Citation(s) in RCA: 49] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
There exists a gap between visualization design guidelines and their application in visualization tools. While empirical studies can provide design guidance, we lack a formal framework for representing design knowledge, integrating results across studies, and applying this knowledge in automated design tools that promote effective encodings and facilitate visual exploration. We propose modeling visualization design knowledge as a collection of constraints, in conjunction with a method to learn weights for soft constraints from experimental data. Using constraints, we can take theoretical design knowledge and express it in a concrete, extensible, and testable form: the resulting models can recommend visualization designs and can easily be augmented with additional constraints or updated weights. We implement our approach in Draco, a constraint-based system based on Answer Set Programming (ASP). We demonstrate how to construct increasingly sophisticated automated visualization design systems, including systems based on weights learned directly from the results of graphical perception experiments.
Collapse
|
77
|
Srinivasan A, Drucker SM, Endert A, Stasko J. Augmenting Visualizations with Interactive Data Facts to Facilitate Interpretation and Communication. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 25:672-681. [PMID: 30136989 DOI: 10.1109/tvcg.2018.2865145] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Recently, an increasing number of visualization systems have begun to incorporate natural language generation (NLG) capabilities into their interfaces. NLG-based visualization systems typically leverage a suite of statistical functions to automatically extract key facts about the underlying data and surface them as natural language sentences alongside visualizations. With current systems, users are typically required to read the system-generated sentences and mentally map them back to the accompanying visualization. However, depending on the features of the visualization (e.g., visualization type, data density) and the complexity of the data fact, mentally mapping facts to visualizations can be a challenging task. Furthermore, more than one visualization could be used to illustrate a single data fact. Unfortunately, current tools provide little or no support for users to explore such alternatives. In this paper, we explore how system-generated data facts can be treated as interactive widgets to help users interpret visualizations and communicate their findings. We present Voder, a system that lets users interact with automatically-generated data facts to explore both alternative visualizations to convey a data fact as well as a set of embellishments to highlight a fact within a visualization. Leveraging data facts as interactive widgets, Voder also facilitates data fact-based visualization search. To assess Voder's design and features, we conducted a preliminary user study with 12 participants having varying levels of experience with visualization tools. Participant feedback suggested that interactive data facts aided them in interpreting visualizations. Participants also stated that the suggestions surfaced through the facts helped them explore alternative visualizations and embellishments to communicate individual data facts.
Collapse
|
78
|
Cavallo M, Demiralp C. Clustrophile 2: Guided Visual Clustering Analysis. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 25:267-276. [PMID: 30130194 DOI: 10.1109/tvcg.2018.2864477] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Data clustering is a common unsupervised learning method frequently used in exploratory data analysis. However, identifying relevant structures in unlabeled, high-dimensional data is nontrivial, requiring iterative experimentation with clustering parameters as well as data features and instances. The number of possible clusterings for a typical dataset is vast, and navigating in this vast space is also challenging. The absence of ground-truth labels makes it impossible to define an optimal solution, thus requiring user judgment to establish what can be considered a satisfiable clustering result. Data scientists need adequate interactive tools to effectively explore and navigate the large clustering space so as to improve the effectiveness of exploratory clustering analysis. We introduce Clustrophile 2, a new interactive tool for guided clustering analysis. Clustrophile 2 guides users in clustering-based exploratory analysis, adapts user feedback to improve user guidance, facilitates the interpretation of clusters, and helps quickly reason about differences between clusterings. To this end, Clustrophile 2 contributes a novel feature, the Clustering Tour, to help users choose clustering parameters and assess the quality of different clustering results in relation to current analysis goals and user expectations. We evaluate Clustrophile 2 through a user study with 12 data scientists, who used our tool to explore and interpret sub-cohorts in a dataset of Parkinson's disease patients. Results suggest that Clustrophile 2 improves the speed and effectiveness of exploratory clustering analysis for both experts and non-experts.
Collapse
|
79
|
Correll M, Li M, Kindlmann G, Scheidegger C. Looks Good To Me: Visualizations As Sanity Checks. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 25:830-839. [PMID: 30136960 DOI: 10.1109/tvcg.2018.2864907] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Famous examples such as Anscombe's Quartet highlight that one of the core benefits of visualizations is allowing people to discover visual patterns that might otherwise be hidden by summary statistics. This visual inspection is particularly important in exploratory data analysis, where analysts can use visualizations such as histograms and dot plots to identify data quality issues. Yet, these visualizations are driven by parameters such as histogram bin size or mark opacity that have a great deal of impact on the final visual appearance of the chart, but are rarely optimized to make important features visible. In this paper, we show that data flaws have varying impact on the visual features of visualizations, and that the adversarial or merely uncritical setting of design parameters of visualizations can obscure the visual signatures of these flaws. Drawing on the framework of Algebraic Visualization Design, we present the results of a crowdsourced study showing that common visualization types can appear to reasonably summarize distributional data while hiding large and important flaws such as missing data and extraneous modes. We make use of these results to propose additional best practices for visualizations of distributions for data quality tasks.
Collapse
|
80
|
Ren D, Lee B, Brehmer M. Charticulator: Interactive Construction of Bespoke Chart Layouts. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 25:789-799. [PMID: 30136992 DOI: 10.1109/tvcg.2018.2865158] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
We present Charticulator, an interactive authoring tool that enables the creation of bespoke and reusable chart layouts. Charticulator is our response to most existing chart construction interfaces that require authors to choose from predefined chart layouts, thereby precluding the construction of novel charts. In contrast, Charticulator transforms a chart specification into mathematical layout constraints and automatically computes a set of layout attributes using a constraint-solving algorithm to realize the chart. It allows for the articulation of compound marks or glyphs as well as links between these glyphs, all without requiring any coding or knowledge of constraint satisfaction. Furthermore, thanks to the constraint-based layout approach, Charticulator can export chart designs into reusable templates that can be imported into other visualization tools. In addition to describing Charticulator's conceptual framework and design, we present three forms of evaluation: a gallery to illustrate its expressiveness, a user study to verify its usability, and a click-count comparison between Charticulator and three existing tools. Finally, we discuss the limitations and potentials of Charticulator as well as directions for future research. Charticulator is available with its source code at https://charticulator.com.
Collapse
|
81
|
Law PM, Basole RC, Wu Y. Duet: Helping Data Analysis Novices Conduct Pairwise Comparisons by Minimal Specification. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 25:427-437. [PMID: 30130204 DOI: 10.1109/tvcg.2018.2864526] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Data analysis novices often encounter barriers in executing low-level operations for pairwise comparisons. They may also run into barriers in interpreting the artifacts (e.g., visualizations) created as a result of the operations. We developed Duet, a visual analysis system designed to help data analysis novices conduct pairwise comparisons by addressing execution and interpretation barriers. To reduce the barriers in executing low-level operations during pairwise comparison, Duet employs minimal specification: when one object group (i.e. a group of records in a data table) is specified, Duet recommends object groups that are similar to or different from the specified one; when two object groups are specified, Duet recommends similar and different attributes between them. To lower the barriers in interpreting its recommendations, Duet explains the recommended groups and attributes using both visualizations and textual descriptions. We conducted a qualitative evaluation with eight participants to understand the effectiveness of Duet. The results suggest that minimal specification is easy to use and Duet's explanations are helpful for interpreting the recommendations despite some usability issues.
Collapse
|
82
|
Yalcin MA, Elmqvist N, Bederson BB. Keshif: Rapid and Expressive Tabular Data Exploration for Novices. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 24:2339-2352. [PMID: 28692978 DOI: 10.1109/tvcg.2017.2723393] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/07/2023]
Abstract
General purpose graphical interfaces for data exploration are typically based on manual visualization and interaction specifications. While designing manual specification can be very expressive, it demands high efforts to make effective decisions, therefore reducing exploratory speed. Instead, principled automated designs can increase exploratory speed, decrease learning efforts, help avoid ineffective decisions, and therefore better support data analytics novices. Towards these goals, we present Keshif, a new systematic design for tabular data exploration. To summarize a given dataset, Keshif aggregates records by value within attribute summaries, and visualizes aggregate characteristics using a consistent design based on data types. To reveal data distribution details, Keshif features three complementary linked selections: highlighting, filtering, and comparison. Keshif further increases expressiveness through aggregate metrics, absolute/part-of scale modes, calculated attributes, and saved selections, all working in synchrony. Its automated design approach also simplifies authoring of dashboards composed of summaries and individual records from raw data using fluid interaction. We show examples selected from datasets from diverse domains. Our study with novices shows that after exploring raw data for 15 minutes, our participants reached close to 30 data insights on average, comparable to other studies with skilled users using more complex tools.
Collapse
|
83
|
Harrison DG, Efford ND, Fisher QJ, Ruddle RA. PETMiner-A Visual Analysis Tool for Petrophysical Properties of Core Sample Data. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 24:1728-1741. [PMID: 28320668 DOI: 10.1109/tvcg.2017.2682865] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
The aim of the PETMiner software is to reduce the time and monetary cost of analysing petrophysical data that is obtained from reservoir sample cores. Analysis of these data requires tacit knowledge to fill 'gaps' so that predictions can be made for incomplete data. Through discussions with 30 industry and academic specialists, we identified three analysis use cases that exemplified the limitations of current petrophysics analysis tools. We used those use cases to develop nine core requirements for PETMiner, which is innovative because of its ability to display detailed images of the samples as data points, directly plot multiple sample properties and derived measures for comparison, and substantially reduce interaction cost. An 11-month evaluation demonstrated benefits across all three use cases by allowing a consultant to: (1) generate more accurate reservoir flow models, (2) discover a previously unknown relationship between one easy-to-measure property and another that is costly, and (3) make a 100-fold reduction in the time required to produce plots for a report.
Collapse
|
84
|
Harper J, Agrawala M. Converting Basic D3 Charts into Reusable Style Templates. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 24:1274-1286. [PMID: 28186898 DOI: 10.1109/tvcg.2017.2659744] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
We present a technique for converting a basic D3 chart into a reusable style template. Then, given a new data source we can apply the style template to generate a chart that depicts the new data, but in the style of the template. To construct the style template we first deconstruct the input D3 chart to recover its underlying structure: the data, the marks and the mappings that describe how the marks encode the data. We then rank the perceptual effectiveness of the deconstructed mappings. To apply the resulting style template to a new data source we first obtain importance ranks for each new data field. We then adjust the template mappings to depict the source data by matching the most important data fields to the most perceptually effective mappings. We show how the style templates can be applied to source data in the form of either a data table or another D3 chart. While our implementation focuses on generating templates for basic chart types (e.g., variants of bar charts, line charts, dot plots, scatterplots, etc.), these are the most commonly used chart types today. Users can easily find such basic D3 charts on the Web, turn them into templates, and immediately see how their own data would look in the visual style (e.g., colors, shapes, fonts, etc.) of the templates. We demonstrate the effectiveness of our approach by applying a diverse set of style templates to a variety of source datasets.
Collapse
|
85
|
Mei H, Ma Y, Wei Y, Chen W. The design space of construction tools for information visualization: A survey. ACTA ACUST UNITED AC 2018. [DOI: 10.1016/j.jvlc.2017.10.001] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/01/2022]
|
86
|
Arcia A, Woollen J, Bakken S. A Systematic Method for Exploring Data Attributes in Preparation for Designing Tailored Infographics of Patient Reported Outcomes. EGEMS (WASHINGTON, DC) 2018; 6:2. [PMID: 29881760 PMCID: PMC5983055 DOI: 10.5334/egems.190] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/17/2017] [Accepted: 11/21/2017] [Indexed: 12/01/2022]
Abstract
CONTEXT Tailored visualizations of patient reported outcomes (PROs) are valuable health communication tools to support shared decision making, health self-management, and engagement with research participants, such as cohorts in the NIH Precision Medicine Initiative. The automation of visualizations presents some unique design challenges. Efficient design processes depend upon gaining a thorough understanding of the data prior to prototyping. CASE DESCRIPTION We present a systematic method to exploring data attributes, with a specific focus on application to self-reported health data. The method entails a) determining the meaning of the variable to be visualized, b) identifying the possible and likely values, and c) understanding how values are interpreted. FINDINGS We present two case studies to illustrate how this method affected our design decisions particularly with respect to outlier and non-missing zero values. MAJOR THEMES The use of a systematic method made our process of exploring data attributes easily manageable. The limitations of the data can narrow design options but can also prompt creative solutions and innovative design opportunities. CONCLUSION A systematic method of exploration of data contributes to an efficient design process, uncovers design opportunities, and alerts the designer to design challenges.
Collapse
|
87
|
Unwin A, Valero-Mora P. Ensemble Graphics. J Comput Graph Stat 2018. [DOI: 10.1080/10618600.2017.1383264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
Affiliation(s)
- Antony Unwin
- Institute of Mathematics, University of Augsburg, Augsburg, Germany
| | | |
Collapse
|
88
|
Qu Z, Hullman J. Keeping Multiple Views Consistent: Constraints, Validations, and Exceptions in Visualization Authoring. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 24:468-477. [PMID: 28866529 DOI: 10.1109/tvcg.2017.2744198] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/07/2023]
Abstract
Visualizations often appear in multiples, either in a single display (e.g., small multiples, dashboard) or across time or space (e.g., slideshow, set of dashboards). However, existing visualization design guidelines typically focus on single rather than multiple views. Solely following these guidelines can lead to effective yet inconsistent views (e.g., the same field has different axes domains across charts), making interpretation slow and error-prone. Moreover, little is known how consistency balances with other design considerations, making it difficult to incorporate consistency mechanisms in visualization authoring software. We present a wizard-of-oz study in which we observed how Tableau users achieve and sacrifice consistency in an exploration-to-presentation visualization design scenario. We extend (from our prior work) a set of encoding-specific constraints defining consistency across multiple views. Using the constraints as a checklist in our study, we observed cases where participants spontaneously maintained consistent encodings and warned cases where consistency was overlooked. In response to the warnings, participants either revised views for consistency or stated why they thought consistency should be overwritten. We categorize participants' actions and responses as constraint validations and exceptions, depicting the relative importance of consistency and other design considerations under various circumstances (e.g., data cardinality, available encoding resources, chart layout). We discuss automatic consistency checking as a constraint-satisfaction problem and provide design implications for communicating inconsistencies to users.
Collapse
|
89
|
Vuillemot R, Boy J. Structuring Visualization Mock-Ups at the Graphical Level by Dividing the Display Space. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 24:424-434. [PMID: 28866513 DOI: 10.1109/tvcg.2017.2743998] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/07/2023]
Abstract
Mock-ups are rapid, low fidelity prototypes, that are used in many design-related fields to generate and share ideas. While their creation is supported by many mature methods and tools, surprisingly few are suited for the needs of information visualization. In this article, we introduce a novel approach to creating visualizations mock-ups, based on a dialogue between graphic design and parametric toolkit explorations. Our approach consists in iteratively subdividing the display space, while progressively informing each division with realistic data. We show that a wealth of mock-ups can easily be created using only temporary data attributes, as we wait for more realistic data to become available. We describe the implementation of this approach in a D3-based toolkit, which we use to highlight its generative power, and we discuss the potential for transitioning towards higher fidelity prototypes.
Collapse
|
90
|
Xia J, Ye F, Chen W, Wang Y, Chen W, Ma Y, Tung AKH. LDSScanner: Exploratory Analysis of Low-Dimensional Structures in High-Dimensional Datasets. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 24:236-245. [PMID: 28866522 DOI: 10.1109/tvcg.2017.2744098] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/07/2023]
Abstract
Many approaches for analyzing a high-dimensional dataset assume that the dataset contains specific structures, e.g., clusters in linear subspaces or non-linear manifolds. This yields a trial-and-error process to verify the appropriate model and parameters. This paper contributes an exploratory interface that supports visual identification of low-dimensional structures in a high-dimensional dataset, and facilitates the optimized selection of data models and configurations. Our key idea is to abstract a set of global and local feature descriptors from the neighborhood graph-based representation of the latent low-dimensional structure, such as pairwise geodesic distance (GD) among points and pairwise local tangent space divergence (LTSD) among pointwise local tangent spaces (LTS). We propose a new LTSD-GD view, which is constructed by mapping LTSD and GD to the axis and axis using 1D multidimensional scaling, respectively. Unlike traditional dimensionality reduction methods that preserve various kinds of distances among points, the LTSD-GD view presents the distribution of pointwise LTS ( axis) and the variation of LTS in structures (the combination of axis and axis). We design and implement a suite of visual tools for navigating and reasoning about intrinsic structures of a high-dimensional dataset. Three case studies verify the effectiveness of our approach.
Collapse
|
91
|
What you see is what you can change: Human-centered machine learning by interactive visualization. Neurocomputing 2017. [DOI: 10.1016/j.neucom.2017.01.105] [Citation(s) in RCA: 83] [Impact Index Per Article: 10.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
|
92
|
Kalinin AA, Palanimalai S, Dinov ID. SOCRAT Platform Design: A Web Architecture for Interactive Visual Analytics Applications. PROCEEDINGS OF THE 2ND WORKSHOP ON HUMAN-IN-THE-LOOP DATA ANALYTICS. WORKSHOP ON HUMAN-IN-THE-LOOP DATA ANALYTICS (2ND : 2017 : CHICAGO, ILL.) 2017; 2017. [PMID: 29630069 DOI: 10.1145/3077257.3077262] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/19/2022]
Abstract
The modern web is a successful platform for large scale interactive web applications, including visualizations. However, there are no established design principles for building complex visual analytics (VA) web applications that could efficiently integrate visualizations with data management, computational transformation, hypothesis testing, and knowledge discovery. This imposes a time-consuming design and development process on many researchers and developers. To address these challenges, we consider the design requirements for the development of a module-based VA system architecture, adopting existing practices of large scale web application development. We present the preliminary design and implementation of an open-source platform for Statistics Online Computational Resource Analytical Toolbox (SOCRAT). This platform defines: (1) a specification for an architecture for building VA applications with multi-level modularity, and (2) methods for optimizing module interaction, re-usage, and extension. To demonstrate how this platform can be used to integrate a number of data management, interactive visualization, and analysis tools, we implement an example application for simple VA tasks including raw data input and representation, interactive visualization and analysis.
Collapse
|
93
|
Sarvghad A, Tory M, Mahyar N. Visualizing Dimension Coverage to Support Exploratory Analysis. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2017; 23:21-30. [PMID: 27514052 DOI: 10.1109/tvcg.2016.2598466] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
Data analysis involves constantly formulating and testing new hypotheses and questions about data. When dealing with a new dataset, especially one with many dimensions, it can be cumbersome for the analyst to clearly remember which aspects of the data have been investigated (i.e., visually examined for patterns, trends, outliers etc.) and which combinations have not. Yet this information is critical to help the analyst formulate new questions that they have not already answered. We observe that for tabular data, questions are typically comprised of varying combinations of data dimensions (e.g., what are the trends of Sales and Profit for different Regions?). We propose representing analysis history from the angle of dimension coverage (i.e., which data dimensions have been investigated and in which combinations). We use scented widgets [30] to incorporate dimension coverage of the analysts' past work into interaction widgets of a visualization tool. We demonstrate how this approach can assist analysts with the question formation process. Our approach extends the concept of scented widgets to reveal aspects of one's own analysis history, and offers a different perspective on one's past work than typical visualization history tools. Results of our empirical study showed that participants with access to embedded dimension coverage information relied on this information when formulating questions, asked more questions about the data, generated more top-level findings, and showed greater breadth of their analysis without sacrificing depth.
Collapse
|
94
|
Felix C, Pandey AV, Bertini E. TextTile: An Interactive Visualization Tool for Seamless Exploratory Analysis of Structured Data and Unstructured Text. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2017; 23:161-170. [PMID: 27875139 DOI: 10.1109/tvcg.2016.2598447] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
We describe TextTile, a data visualization tool for investigation of datasets and questions that require seamless and flexible analysis of structured data and unstructured text. TextTile is based on real-world data analysis problems gathered through our interaction with a number of domain experts and provides a general purpose solution to such problems. The system integrates a set of operations that can interchangeably be applied to the structured as well as to unstructured text part of the data to generate useful data summaries. Such summaries are then organized in visual tiles in a grid layout to allow their analysis and comparison. We validate TextTile with task analysis, use cases and a user study showing the system can be easily learned and proficiently used to carry out nontrivial tasks.
Collapse
|
95
|
Saket B, Kim H, Brown ET, Endert A. Visualization by Demonstration: An Interaction Paradigm for Visual Data Exploration. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2017; 23:331-340. [PMID: 27875149 DOI: 10.1109/tvcg.2016.2598839] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
Although data visualization tools continue to improve, during the data exploration process many of them require users to manually specify visualization techniques, mappings, and parameters. In response, we present the Visualization by Demonstration paradigm, a novel interaction method for visual data exploration. A system which adopts this paradigm allows users to provide visual demonstrations of incremental changes to the visual representation. The system then recommends potential transformations (Visual Representation, Data Mapping, Axes, and View Specification transformations) from the given demonstrations. The user and the system continue to collaborate, incrementally producing more demonstrations and refining the transformations, until the most effective possible visualization is created. As a proof of concept, we present VisExemplar, a mixed-initiative prototype that allows users to explore their data by recommending appropriate transformations in response to the given demonstrations.
Collapse
|
96
|
Ceneda D, Gschwandtner T, May T, Miksch S, Schulz HJ, Streit M, Tominski C. Characterizing Guidance in Visual Analytics. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2017; 23:111-120. [PMID: 27514054 DOI: 10.1109/tvcg.2016.2598468] [Citation(s) in RCA: 42] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
Visual analytics (VA) is typically applied in scenarios where complex data has to be analyzed. Unfortunately, there is a natural correlation between the complexity of the data and the complexity of the tools to study them. An adverse effect of complicated tools is that analytical goals are more difficult to reach. Therefore, it makes sense to consider methods that guide or assist users in the visual analysis process. Several such methods already exist in the literature, yet we are lacking a general model that facilitates in-depth reasoning about guidance. We establish such a model by extending van Wijk's model of visualization with the fundamental components of guidance. Guidance is defined as a process that gradually narrows the gap that hinders effective continuation of the data analysis. We describe diverse inputs based on which guidance can be generated and discuss different degrees of guidance and means to incorporate guidance into VA tools. We use existing guidance approaches from the literature to illustrate the various aspects of our model. As a conclusion, we identify research challenges and suggest directions for future studies. With our work we take a necessary step to pave the way to a systematic development of guidance techniques that effectively support users in the context of VA.
Collapse
|
97
|
Bigelow A, Drucker S, Fisher D, Meyer M. Iterating between Tools to Create and Edit Visualizations. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2017; 23:481-490. [PMID: 27875164 DOI: 10.1109/tvcg.2016.2598609] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
A common workflow for visualization designers begins with a generative tool, like D3 or Processing, to create the initial visualization; and proceeds to a drawing tool, like Adobe Illustrator or Inkscape, for editing and cleaning. Unfortunately, this is typically a one-way process: once a visualization is exported from the generative tool into a drawing tool, it is difficult to make further, data-driven changes. In this paper, we propose a bridge model to allow designers to bring their work back from the drawing tool to re-edit in the generative tool. Our key insight is to recast this iteration challenge as a merge problem - similar to when two people are editing a document and changes between them need to reconciled. We also present a specific instantiation of this model, a tool called Hanpuku, which bridges between D3 scripts and Illustrator. We show several examples of visualizations that are iteratively created using Hanpuku in order to illustrate the flexibility of the approach. We further describe several hypothetical tools that bridge between other visualization tools to emphasize the generality of the model.
Collapse
|
98
|
Abstract
Visualizations have a distinctive advantage when dealing with the information overload problem: Because they are grounded in basic visual cognition, many people understand them. However, creating proper visualizations requires specific expertise of the domain and underlying data. Our quest in this article is to study methods to suggest appropriate visualizations autonomously. To be appropriate, a visualization has to follow known guidelines to find and distinguish patterns visually and encode data therein. A visualization tells a story of the underlying data; yet, to be appropriate, it has to clearly represent those aspects of the data the viewer is interested in. Which aspects of a visualization are important to the viewer? Can we capture and use those aspects to recommend visualizations? This article investigates strategies to recommend visualizations considering different aspects of user preferences. A multi-dimensional scale is used to estimate aspects of quality for visualizations for collaborative filtering. Alternatively, tag vectors describing visualizations are used to recommend potentially interesting visualizations based on content. Finally, a hybrid approach combines information on what a visualization is about (tags) and how good it is (ratings). We present the design principles behind
VizRec
, our visual recommender. We describe its architecture, the data acquisition approach with a crowd sourced study, and the analysis of strategies for visualization recommendation.
Collapse
|
99
|
Epstein DA, Kang JH, Pina LR, Fogarty J, Munson SA. Reconsidering the Device in the Drawer: Lapses as a Design Opportunity in Personal Informatics. PROCEEDINGS OF THE ... ACM INTERNATIONAL CONFERENCE ON UBIQUITOUS COMPUTING . UBICOMP (CONFERENCE) 2016; 2016:829-840. [PMID: 28516173 PMCID: PMC5432203 DOI: 10.1145/2971648.2971656] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]
Abstract
People stop using personal tracking tools over time, referred to as the lapsing stage of their tool use. We explore how designs can support people when they lapse in tracking, considering how to design data representations for a person who lapses in Fitbit use. Through a survey of 141 people who had lapsed in using Fitbit, we identified three use patterns and four perspectives on tracking. Participants then viewed seven visual representations of their Fitbit data and seven approaches to framing this data. Participant Fitbit use and perspective on tracking influenced their preference, which we surface in a series of contrasts. Specifically, our findings guide selecting appropriate aggregations from Fitbit use (e.g., aggregate more when someone has less data), choosing an appropriate framing technique from tracking perspective (e.g., ensure framing aligns with how the person feels about tracking), and creating appropriate social comparisons (e.g., portray the person positively compared to peers). We conclude by discussing how these contrasts suggest new designs and opportunities in other tracking domains.
Collapse
Affiliation(s)
- Daniel A Epstein
- Computer Science & Engineering, DUB Group, University of Washington
| | - Jennifer H Kang
- Computer Science & Engineering, DUB Group, University of Washington
- The Information School, DUB Group, University of Washington
| | - Laura R Pina
- Computer Science & Engineering, DUB Group, University of Washington
- Human Centered Design & Engineering, DUB Group, University of Washington
| | - James Fogarty
- Computer Science & Engineering, DUB Group, University of Washington
| | - Sean A Munson
- Human Centered Design & Engineering, DUB Group, University of Washington
| |
Collapse
|