1
Walchshofer C, Dhanoa V, Streit M, Meyer M. Transitioning to a Commercial Dashboarding System: Socio-Technical Observations and Opportunities. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2024; 30:381-391. [PMID: 37878440 DOI: 10.1109/tvcg.2023.3326525] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2023]
Abstract
Many long-established, traditional manufacturing businesses are becoming more digital and data-driven to improve their production. These companies are embracing visual analytics in these transitions through their adoption of commercial dashboarding systems. Although a number of studies have looked at the technical challenges of adopting these systems, very few have focused on the socio-technical issues that arise. In this paper, we report on the results of an interview study with 17 participants working in a range of roles at a long-established, traditional manufacturing company as they adopted Microsoft Power BI. The results highlight a number of socio-technical challenges the employees faced, including difficulties in training, using and creating dashboards, and transitioning to a modern digital company. Based on these results, we propose a number of opportunities for both companies and visualization researchers to improve these difficult transitions, as well as opportunities for rethinking how we design dashboarding systems for real-world use.
2
Ramasamy D, Sarasua C, Bacchelli A, Bernstein A. Visualising data science workflows to support third-party notebook comprehension: an empirical study. EMPIRICAL SOFTWARE ENGINEERING 2023; 28:58. [PMID: 36968214 PMCID: PMC10034906 DOI: 10.1007/s10664-023-10289-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/05/2023] [Indexed: 05/28/2023]
Abstract
Data science is an exploratory and iterative process that often leads to complex and unstructured code. This code is usually poorly documented and, consequently, hard to understand by a third party. In this paper, we first collect empirical evidence for the non-linearity of data science code from real-world Jupyter notebooks, confirming the need for new approaches that aid in data science code interaction and comprehension. Second, we propose a visualisation method that elucidates implicit workflow information in data science code and assists data scientists in navigating the so-called garden of forking paths in non-linear code. The visualisation also provides information such as the rationale and the identification of the data science pipeline step based on cell annotations. We conducted a user experiment with data scientists to evaluate the proposed method, assessing the influence of (i) different workflow visualisations and (ii) cell annotations on code comprehension. Our results show that visualising the exploration helps the users obtain an overview of the notebook, significantly improving code comprehension. Furthermore, our qualitative analysis provides more insights into the difficulties faced during data science code comprehension.
Affiliation(s)
- Cristina Sarasua: Department of Informatics, University of Zurich, Zurich, Switzerland
- Alberto Bacchelli: Department of Informatics, University of Zurich, Zurich, Switzerland
- Abraham Bernstein: Department of Informatics, University of Zurich, Zurich, Switzerland
3
Xiong K, Luo Z, Fu S, Wang Y, Xu M, Wu Y. Revealing the Semantics of Data Wrangling Scripts With Comantics. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:117-127. [PMID: 36166534 DOI: 10.1109/tvcg.2022.3209470] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Data workers usually seek to understand the semantics of data wrangling scripts in various scenarios, such as code debugging, reusing, and maintaining. However, the understanding is challenging for novice data workers due to the variety of programming languages, functions, and parameters. Based on the observation that differences between input and output tables highly relate to the type of data transformation, we outline a design space including 103 characteristics to describe table differences. Then, we develop Comantics, a three-step pipeline that automatically detects the semantics of data transformation scripts. The first step focuses on the detection of table differences for each line of wrangling code. Second, we incorporate a characteristic-based component and a Siamese convolutional neural network-based component for the detection of transformation types. Third, we derive the parameters of each data transformation by employing a "slot filling" strategy. We design experiments to evaluate the performance of Comantics. Further, we assess its flexibility using three example applications in different domains.
4
Wu A, Deng D, Cheng F, Wu Y, Liu S, Qu H. In Defence of Visual Analytics Systems: Replies to Critics. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:1026-1036. [PMID: 36179000 DOI: 10.1109/tvcg.2022.3209360] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/16/2023]
Abstract
The last decade has witnessed many visual analytics (VA) systems that have been successfully applied to wide-ranging domains like urban analytics and explainable AI. However, their research rigor and contributions have been extensively challenged within the visualization community. We come in defence of VA systems by contributing two interview studies for gathering criticisms and responses to those criticisms. First, we interview 24 researchers to collect criticisms from the review comments on their VA work. Through an iterative coding and refinement process, the interview feedback is summarized into a list of 36 common criticisms. Second, we interview 17 researchers to validate our list and collect their responses, thereby discussing implications for defending and improving the scientific values and rigor of VA systems. We highlight that the presented knowledge is deep, extensive, but also imperfect, provocative, and controversial, and thus recommend reading with an inclusive and critical eye. We hope our work can provide thoughts and foundations for conducting VA research and spark discussions to move the research field forward more rigorously and vibrantly.
5
Chen R, Weng D, Huang Y, Shu X, Zhou J, Sun G, Wu Y. Rigel: Transforming Tabular Data by Declarative Mapping. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:128-138. [PMID: 36191098 DOI: 10.1109/tvcg.2022.3209385] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/16/2023]
Abstract
We present Rigel, an interactive system for rapid transformation of tabular data. Rigel implements a new declarative mapping approach that formulates the data transformation procedure as direct mappings from data to the row, column, and cell channels of the target table. To construct such mappings, Rigel allows users to directly drag data attributes from input data to these three channels and indirectly drag or type data values in a spreadsheet. Possible mappings that do not contradict these interactions are then recommended to achieve efficient and straightforward data transformation. The recommended mappings are generated by enumerating and composing data variables based on the row, column, and cell channels, thereby revealing the possibility of alternative tabular forms and facilitating open-ended exploration in many data transformation scenarios, such as designing tables for presentation. In contrast to existing systems that transform data by composing operations (like transposing and pivoting), Rigel requires less prior knowledge of these operations, and constructing tables from the channels is more efficient and results in less ambiguity than generating operation sequences as done by the traditional by-example approaches. User study results demonstrated that Rigel is significantly less demanding in terms of time and interactions and suits more scenarios compared to the state-of-the-art by-example approach. A gallery of diverse transformation cases is also presented to show the potential of Rigel's expressiveness.
6
Tory M, Bartram L, Fiore-Gartland B, Crisan A. Finding Their Data Voice: Practices and Challenges of Dashboard Users. IEEE COMPUTER GRAPHICS AND APPLICATIONS 2023; 43:22-36. [PMID: 34928788 DOI: 10.1109/mcg.2021.3136545] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/14/2023]
Abstract
Dashboards are the ubiquitous means of data communication within organizations. Yet we have limited understanding of how they factor into data practices in the workplace, particularly for data workers who do not self-identify as professional analysts. We focus on data workers who use dashboards as a primary interface to data, reporting on an interview study that characterizes their data practices and the accompanying barriers to seamless data interaction. While dashboards are typically designed for data consumption, our findings show that dashboard users have far more diverse needs. To capture these activities, we frame data workers' practices as data conversations: conversations with data capture classic analysis (asking and answering data questions), while conversations through and around data involve constructing representations and narratives for sharing and communication. Dashboard users faced substantial barriers in their data conversations: their engagement with data was often intermittent, dependent on experts, and involved an awkward assembly of tools. We challenge the visualization and analytics community to embrace dashboard users as a population and design tools that blend seamlessly into their work contexts.
7
Panagiotidou G, Poblome J, Aerts J, Vande Moere A. Designing a Data Visualisation for Interdisciplinary Scientists. How to Transparently Convey Data Frictions? Comput Support Coop Work 2022. [DOI: 10.1007/s10606-022-09432-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
8
Nowak S, Rosin M, Stuerzlinger W, Bartram L. Visual Analytics: A Method to Explore Natural Histories of Oral Epithelial Dysplasia. FRONTIERS IN ORAL HEALTH 2022; 2:703874. [PMID: 35048041 PMCID: PMC8757761 DOI: 10.3389/froh.2021.703874] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2021] [Accepted: 07/02/2021] [Indexed: 11/17/2022]
Abstract
Risk assessment and follow-up of oral potentially malignant disorders in patients with mild or moderate oral epithelial dysplasia is an ongoing challenge for improved oral cancer prevention. Part of the challenge is a lack of understanding of how observable features of such dysplasia, gathered as data by clinicians during follow-up, relate to underlying biological processes driving progression. Current research is in an exploratory phase where the precise questions to ask are not known. While traditional statistical methods and newer machine learning and artificial intelligence methods are effective in well-defined problem spaces with large datasets, these are not the circumstances we face currently. We argue that the field is in need of exploratory methods that can better integrate clinical and scientific knowledge into analysis to iteratively generate viable hypotheses. In this perspective, we propose that visual analytics presents a set of methods well-suited to these needs. We illustrate how visual analytics excels at generating viable research hypotheses by describing our experiences using visual analytics to explore temporal shifts in the clinical presentation of epithelial dysplasia. Visual analytics complements existing methods and fulfills a critical and at-present neglected need in the formative stages of inquiry we are facing.
Affiliation(s)
- Stan Nowak: School of Interactive Arts and Technology, Simon Fraser University, Burnaby, BC, Canada
- Miriam Rosin: BC Oral Cancer Prevention Program, Cancer Control Research, BC Cancer, Vancouver, BC, Canada; Department of Biomedical Physiology and Kinesiology, Simon Fraser University, Burnaby, BC, Canada
- Wolfgang Stuerzlinger: School of Interactive Arts and Technology, Simon Fraser University, Burnaby, BC, Canada
- Lyn Bartram: School of Interactive Arts and Technology, Simon Fraser University, Burnaby, BC, Canada
9
Bartram L, Correll M, Tory M. Untidy Data: The Unreasonable Effectiveness of Tables. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2022; 28:686-696. [PMID: 34591767 DOI: 10.1109/tvcg.2021.3114830] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/13/2023]
Abstract
Working with data in table form is usually considered a preparatory and tedious step in the sensemaking pipeline; a way of getting the data ready for more sophisticated visualization and analytical tools. But for many people, spreadsheets - the quintessential table tool - remain a critical part of their information ecosystem, allowing them to interact with their data in ways that are hidden or abstracted in more complex tools. This is particularly true for data workers [61], people who work with data as part of their job but do not identify as professional analysts or data scientists. We report on a qualitative study of how these workers interact with and reason about their data. Our findings show that data tables serve a broader purpose beyond data cleanup at the initial stage of a linear analytic flow: users want to see and "get their hands on" the underlying data throughout the analytics process, reshaping and augmenting it to support sensemaking. They reorganize, mark up, layer on levels of detail, and spawn alternatives within the context of the base data. These direct interactions and human-readable table representations form a rich and cognitively important part of building understanding of what the data mean and what they can do with it. We argue that interactive tables are an important visualization idiom in their own right; that the direct data interaction they afford offers a fertile design space for visual analytics; and that sense making can be enriched by more flexible human-data interaction than is currently supported in visual analytics tools.
10
Kasica S, Berret C, Munzner T. Table Scraps: An Actionable Framework for Multi-Table Data Wrangling From An Artifact Study of Computational Journalism. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:957-966. [PMID: 33074823 DOI: 10.1109/tvcg.2020.3030462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
For the many journalists who use data and computation to report the news, data wrangling is an integral part of their work. Despite an abundance of literature on data wrangling in the context of enterprise data analysis, little is known about the specific operations, processes, and pain points journalists encounter while performing this tedious, time-consuming task. To better understand the needs of this user group, we conduct a technical observation study of 50 public repositories of data and analysis code authored by 33 professional journalists at 26 news organizations. We develop two detailed and cross-cutting taxonomies of data wrangling in computational journalism, for actions and for processes. We observe the extensive use of multiple tables, a notable gap in previous wrangling analyses. We develop a concise, actionable framework for general multi-table data wrangling that includes wrangling operations documented in our taxonomy that are without clear parallels in other work. This framework, the first to incorporate tables as first-class objects, will support future interactive wrangling tools for both computational journalism and general-purpose use. We assess the generative and descriptive power of our framework through discussion of its relationship to our set of taxonomies.