1. Chen Q, Chen Y, Zou R, Shuai W, Guo Y, Wang J, Cao N. Chart2Vec: A Universal Embedding of Context-Aware Visualizations. IEEE Transactions on Visualization and Computer Graphics 2025; 31:2167-2181. PMID: 38551829; DOI: 10.1109/tvcg.2024.3383089.
Abstract
The advances in AI-enabled techniques have accelerated the creation and automation of visualizations in the past decade. However, presenting visualizations in a descriptive and generative format remains a challenge. Moreover, current visualization embedding methods focus on standalone visualizations, neglecting the importance of contextual information for multi-view visualizations. To address these issues, we propose a new representation model, Chart2Vec, to learn a universal embedding of visualizations with context-aware information. Chart2Vec aims to support a wide range of downstream visualization tasks such as recommendation and storytelling. Our model considers both structural and semantic information of visualizations in declarative specifications. To enhance the context-aware capability, Chart2Vec employs multi-task learning on both supervised and unsupervised tasks concerning the co-occurrence of visualizations. We evaluate our method through an ablation study, a user study, and a quantitative comparison. The results verified the consistency of our embedding method with human cognition and showed its advantages over existing methods.
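As a concrete illustration of the kind of input such a model consumes, here is a minimal Python sketch that splits a declarative chart specification into structural features (mark type, encoding channels) and semantic features (field names). The schema and function names are editorial assumptions, not the authors' actual feature extractor.

```python
def extract_features(spec: dict) -> dict:
    """Split a declarative chart spec into structural and semantic features."""
    encodings = spec.get("encoding", {})
    return {
        # structural: how the chart is built
        "structural": [spec.get("mark", "unknown")]
                      + [f"{channel}:{enc.get('type', '?')}"
                         for channel, enc in encodings.items()],
        # semantic: what the chart is about
        "semantic": [enc["field"] for enc in encodings.values() if "field" in enc],
    }

spec = {
    "mark": "bar",
    "encoding": {
        "x": {"field": "region", "type": "nominal"},
        "y": {"field": "sales", "type": "quantitative"},
    },
}
print(extract_features(spec))
# {'structural': ['bar', 'x:nominal', 'y:quantitative'], 'semantic': ['region', 'sales']}
```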
2. Podo L, Prenkaj B, Velardi P. Agnostic Visual Recommendation Systems: Open Challenges and Future Directions. IEEE Transactions on Visualization and Computer Graphics 2025; 31:1902-1917. PMID: 38466597; DOI: 10.1109/tvcg.2024.3374571.
Abstract
Visualization Recommendation Systems (VRSs) are a novel and challenging field of study that aims to help generate insightful visualizations from data and support non-expert users in information discovery. Among the many contributions proposed in this area, some systems embrace the ambitious objective of imitating human analysts to identify relevant relationships in data and make appropriate design choices to represent these relationships with insightful charts. We denote these systems as "agnostic" VRSs since they do not rely on human-provided constraints and rules but try to learn the task autonomously. Despite the high application potential of agnostic VRSs, their progress is hindered by several obstacles, including the absence of standardized datasets to train recommendation algorithms, the difficulty of learning design rules, and the challenge of defining quantitative criteria for evaluating the perceptual effectiveness of generated plots. This article summarizes the literature on agnostic VRSs and outlines promising future research directions.
3. Tian Y, Cui W, Deng D, Yi X, Yang Y, Zhang H, Wu Y. ChartGPT: Leveraging LLMs to Generate Charts From Abstract Natural Language. IEEE Transactions on Visualization and Computer Graphics 2025; 31:1731-1745. PMID: 38386583; DOI: 10.1109/tvcg.2024.3368621.
Abstract
The use of natural language interfaces (NLIs) to create charts is becoming increasingly popular due to the intuitiveness of natural language interactions. One key challenge in this approach is to accurately capture user intents and transform them into proper chart specifications. This obstructs the wide use of NLIs in chart generation, as users' natural language inputs are generally abstract (i.e., ambiguous or under-specified), without a clear specification of visual encodings. Recently, pre-trained large language models (LLMs) have exhibited superior performance in understanding and generating natural language, demonstrating great potential for downstream tasks. Inspired by this major trend, we propose ChartGPT, which generates charts from abstract natural language inputs. However, LLMs struggle with complex logic problems. To enable the model to accurately specify the complex parameters and perform operations in chart generation, we decompose the generation process into a step-by-step reasoning pipeline, so that the model only needs to reason about a single, specific sub-task during each run. Moreover, LLMs are pre-trained on general datasets, which might be biased for the task of chart generation. To provide adequate visualization knowledge, we create a dataset consisting of abstract utterances and charts and improve model performance through fine-tuning. We further design an interactive interface for ChartGPT that allows users to check and modify the intermediate outputs of each step. The effectiveness of the proposed system is evaluated through quantitative evaluations and a user study.
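The following Python sketch illustrates the step-wise decomposition idea in the abstract: each call handles exactly one sub-task, with earlier outputs fed forward. The step list and the `ask_llm` stub are illustrative assumptions, not the paper's actual sub-tasks or prompts.

```python
def ask_llm(prompt: str) -> str:
    """Stand-in for a real LLM call; returns canned answers for the demo."""
    canned = {
        "select columns": "country, life_expectancy",
        "add filter": "year == 2020",
        "choose chart type": "bar",
        "map encodings": "x=country, y=life_expectancy",
    }
    return canned[prompt.split("|")[0]]

def generate_chart_spec(utterance: str) -> dict:
    """Run one focused sub-task per step instead of one monolithic prompt."""
    steps = ["select columns", "add filter", "choose chart type", "map encodings"]
    results = {}
    for step in steps:
        # Each call sees the utterance plus all earlier intermediate outputs,
        # so the model reasons about exactly one decision at a time.
        results[step] = ask_llm(f"{step}|{utterance}|{results}")
    return results

print(generate_chart_spec("how long do people live in different countries?"))
```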
4. Tong W, Shigyo K, Yuan LP, Fan M, Pong TC, Qu H, Xia M. VisTellAR: Embedding Data Visualization to Short-Form Videos Using Mobile Augmented Reality. IEEE Transactions on Visualization and Computer Graphics 2025; 31:1862-1874. PMID: 38427541; DOI: 10.1109/tvcg.2024.3372104.
Abstract
With the rise of short-form video platforms and the increasing availability of data, we see the potential for people to share short-form videos embedded with data in situ (e.g., daily steps when running) to increase the credibility and expressiveness of their stories. However, creating and sharing such videos in situ is challenging since it involves multiple steps and skills (e.g., data visualization creation and video editing), especially for amateurs. By conducting a formative study (N=10) using three design probes, we collected the motivations and design requirements. We then built VisTellAR, a mobile AR authoring tool, to help amateur video creators embed data visualizations in short-form videos in situ. A two-day user study shows that participants (N=12) successfully created various videos with data visualizations in situ and confirmed that the tool was easy to use and learn. AR pre-stage authoring helped participants set up data visualizations in the real world and supported richer designs in camera movement and in interaction with gestures and physical objects for storytelling.
5. Shen Y, Zhao Y, Wang Y, Ge T, Shi H, Lee B. Authoring Data-Driven Chart Animations Through Direct Manipulation. IEEE Transactions on Visualization and Computer Graphics 2025; 31:1613-1630. PMID: 39499609; DOI: 10.1109/tvcg.2024.3491504.
Abstract
We present an authoring tool, called CAST+ (Canis Studio Plus), that enables the interactive creation of chart animations through the direct manipulation of keyframes. It introduces the visual specification of chart animations consisting of keyframes that can be played sequentially or simultaneously, and animation parameters (e.g., duration, delay). Building on Canis (Ge et al. 2020), a declarative chart animation grammar that leverages data-enriched SVG charts, CAST+ supports auto-completion for constructing both keyframes and keyframe sequences. It also enables users to refine the animation specification (e.g., aligning keyframes across tracks to play them together, adjusting delay) with direct manipulation. We report a user study conducted with the initial version of the system to assess its visual specification and usability. We then enhanced the system's expressiveness and usability: CAST+ now supports the animation of multiple types of visual marks in the same keyframe group with new auto-completion algorithms based on generalized selection. This enables the creation of more expressive animations, while reducing the number of interactions needed to create comparable animations. We present a gallery of examples and four usage scenarios to demonstrate the expressiveness of CAST+. Finally, we discuss the limitations, comparison, and potentials of CAST+ as well as directions for future research.
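A minimal Python sketch of the keyframe model described above: keyframes with durations and delays grouped into tracks, where keyframes within a track play sequentially and tracks play simultaneously. The field names are illustrative assumptions, not the actual Canis/CAST+ grammar.

```python
from dataclasses import dataclass, field

@dataclass
class Keyframe:
    marks: list          # selection of visual marks to animate
    effect: str          # e.g. "fade-in", "grow"
    duration_ms: int = 500
    delay_ms: int = 0

@dataclass
class Track:
    keyframes: list = field(default_factory=list)

    def total_duration(self) -> int:
        # Sequential playback: durations and delays accumulate.
        return sum(k.duration_ms + k.delay_ms for k in self.keyframes)

def animation_duration(tracks: list) -> int:
    # Simultaneous tracks: overall length is the longest track.
    return max(t.total_duration() for t in tracks)

bars = Track([Keyframe(["bar-2019", "bar-2020"], "grow", 800)])
labels = Track([Keyframe(["label-2019"], "fade-in", 300, delay_ms=200),
                Keyframe(["label-2020"], "fade-in", 300)])
print(animation_duration([bars, labels]))  # 800
```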
6. Wang HW, Hoffswell J, Thazin Thane SM, Bursztyn VS, Bearfield CX. How Aligned are Human Chart Takeaways and LLM Predictions? A Case Study on Bar Charts with Varying Layouts. IEEE Transactions on Visualization and Computer Graphics 2025; 31:536-546. PMID: 39283799; DOI: 10.1109/tvcg.2024.3456378.
Abstract
Large Language Models (LLMs) have been adopted for a variety of visualization tasks, but how far are we from perceptually aware LLMs that can predict human takeaways? Graphical perception literature has shown that human chart takeaways are sensitive to visualization design choices, such as spatial layouts. In this work, we examine the extent to which LLMs exhibit such sensitivity when generating takeaways, using bar charts with varying spatial layouts as a case study. We conducted three experiments and tested four common bar chart layouts: vertically juxtaposed, horizontally juxtaposed, overlaid, and stacked. In Experiment 1, we identified the optimal configurations to generate meaningful chart takeaways by testing four LLMs, two temperature settings, nine chart specifications, and two prompting strategies. We found that even state-of-the-art LLMs struggled to generate semantically diverse and factually accurate takeaways. In Experiment 2, we used the optimal configurations to generate 30 chart takeaways each for eight visualizations across four layouts and two datasets in both zero-shot and one-shot settings. Compared to human takeaways, we found that the takeaways LLMs generated often did not match the types of comparisons made by humans. In Experiment 3, we examined the effect of chart context and data on LLM takeaways. We found that LLMs, unlike humans, exhibited variation in takeaway comparison types for different bar charts using the same bar layout. Overall, our case study evaluates the ability of LLMs to emulate human interpretations of data and points to challenges and opportunities in using LLMs to predict human chart takeaways.
7. Guo Z, Kale A, Kay M, Hullman J. VMC: A Grammar for Visualizing Statistical Model Checks. IEEE Transactions on Visualization and Computer Graphics 2025; 31:798-808. PMID: 39348251; DOI: 10.1109/tvcg.2024.3456402.
Abstract
Visualizations play a critical role in validating and improving statistical models. However, the design space of model check visualizations is not well understood, making it difficult for authors to explore and specify effective graphical model checks. VMC defines a model check visualization using four components: (1) samples of distributions of checkable quantities generated from the model, including predictive distributions for new data and distributions of model parameters; (2) transformations on observed data to facilitate comparison; (3) visual representations of distributions; and (4) layouts to facilitate comparing model samples and observed data. We contribute an implementation of VMC as an R package. We validate VMC by reproducing a set of canonical model check examples, and show how using VMC to generate model checks reduces the edit distance between visualizations relative to existing visualization toolkits. The findings of an interview study with three expert modelers who used VMC highlight challenges and opportunities for encouraging exploration of correct, effective model check visualizations.
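To make the four-component structure above concrete, here is a minimal Python sketch of what a VMC-style model check specification might look like, rendered as a plain dict. The keys mirror the four components; the names and values are editorial assumptions, not the R package's actual API.

```python
model_check_spec = {
    # (1) samples of checkable quantities drawn from the model
    "samples": {"source": "posterior_predictive", "n_draws": 100},
    # (2) transformations applied to observed data for comparison
    "transform": ["log"],
    # (3) how distributions are visually represented
    "representation": "interval",   # e.g. interval, density, dotplot
    # (4) layout for comparing model draws against observed data
    "layout": "juxtaposed",         # e.g. juxtaposed, superposed, faceted
}

def describe(spec: dict) -> str:
    return (f"Draw {spec['samples']['n_draws']} samples from "
            f"{spec['samples']['source']}, apply {spec['transform']}, "
            f"render as {spec['representation']} in a {spec['layout']} layout.")

print(describe(model_check_spec))
```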
8. Liu Z, Chen C, Hooker J. Manipulable Semantic Components: A Computational Representation of Data Visualization Scenes. IEEE Transactions on Visualization and Computer Graphics 2025; 31:732-742. PMID: 39255155; DOI: 10.1109/tvcg.2024.3456296.
Abstract
Various data visualization applications such as reverse engineering and interactive authoring require a vocabulary that describes the structure of visualization scenes and the procedure to manipulate them. A few scene abstractions have been proposed, but they are restricted to specific applications for a limited set of visualization types. A unified and expressive model of data visualization scenes for different applications has been missing. To fill this gap, we present Manipulable Semantic Components (MSC), a computational representation of data visualization scenes, to support applications in scene understanding and augmentation. MSC consists of two parts: a unified object model describing the structure of a visualization scene in terms of semantic components, and a set of operations to generate and modify the scene components. We demonstrate the benefits of MSC in three applications: visualization authoring, visualization deconstruction and reuse, and animation specification.
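The two-part design above (a unified object model plus operations over it) can be sketched in a few lines of Python: a scene decomposed into typed semantic components that can be queried and modified. Class and method names are illustrative assumptions based on the description, not the MSC implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Component:
    role: str                      # e.g. "mark", "axis", "legend"
    attrs: dict = field(default_factory=dict)

@dataclass
class Scene:
    components: list = field(default_factory=list)

    def select(self, role: str) -> list:
        """Query components by semantic role."""
        return [c for c in self.components if c.role == role]

    def update(self, role: str, **changes) -> None:
        """Operation: modify all components with a given role."""
        for c in self.select(role):
            c.attrs.update(changes)

scene = Scene([Component("mark", {"shape": "bar", "color": "steelblue"}),
               Component("axis", {"orient": "left"})])
scene.update("mark", color="firebrick")   # e.g. restyle a deconstructed chart for reuse
print(scene.select("mark")[0].attrs)
# {'shape': 'bar', 'color': 'firebrick'}
```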
9. Bako HK, Liu X, Ko G, Song H, Battle L, Liu Z. Unveiling How Examples Shape Visualization Design Outcomes. IEEE Transactions on Visualization and Computer Graphics 2025; 31:1137-1147. PMID: 39255158; DOI: 10.1109/tvcg.2024.3456407.
Abstract
Visualization designers (e.g., journalists or data analysts) often rely on examples to explore the space of possible designs, yet we have little insight into how examples shape data visualization design outcomes. While the effects of examples have been studied in other disciplines, such as web design or engineering, the results are not readily applicable to visualization due to inconsistencies in findings and challenges unique to visualization design. Towards bridging this gap, we conduct an exploratory experiment involving 32 data visualization designers focusing on the influence of five factors (timing, quantity, diversity, data topic similarity, and data schema similarity) on objectively measurable design outcomes (e.g., numbers of designs and idea transfers). Our quantitative analysis shows that when examples are introduced after initial brainstorming, designers curate examples with topics less similar to the dataset they are working on and produce more designs with a high variation in visualization components. Also, designers copy more ideas from examples with higher data schema similarities. Our qualitative analysis of participants' thought processes provides insights into why designers incorporate examples into their designs, revealing potential factors that have not been previously investigated. Finally, we discuss how our results inform how designers may use examples during design ideation as well as future research on quantifying designs and supporting example-based visualization design. All supplemental materials are available in our OSF repo.
10. Wootton D, Fox AR, Peck E, Satyanarayan A. Charting EDA: Characterizing Interactive Visualization Use in Computational Notebooks with a Mixed-Methods Formalism. IEEE Transactions on Visualization and Computer Graphics 2025; 31:1191-1201. PMID: 39388331; DOI: 10.1109/tvcg.2024.3456217.
Abstract
Interactive visualizations are powerful tools for Exploratory Data Analysis (EDA), but how do they affect the observations analysts make about their data? We conducted a qualitative experiment with 13 professional data scientists analyzing two datasets with Jupyter notebooks, collecting a rich dataset of interaction traces and think-aloud utterances. By qualitatively coding participant utterances, we introduce a formalism that describes EDA as a sequence of analysis states, where each state comprises either a representation an analyst constructs (e.g., the output of a data frame, an interactive visualization, etc.) or an observation the analyst makes (e.g., about missing data, the relationship between variables, etc.). By applying our formalism to our dataset, we identify that interactive visualizations, on average, lead to earlier and more complex insights about relationships between dataset attributes compared to static visualizations. Moreover, by calculating metrics such as revisit count and representational diversity, we uncover that some representations serve more as "planning aids" during EDA rather than tools strictly for hypothesis-answering. We show how these measures help identify other patterns of analysis behavior, such as the "80-20 rule", where a small subset of representations drove the majority of observations. Based on these findings, we offer design guidelines for interactive exploratory analysis tooling and reflect on future directions for studying the role that visualizations play in EDA.
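The state formalism and the metrics mentioned above (revisit count, representational diversity) lend themselves to a compact sketch. The encoding below is an editorial assumption in Python, not the authors' coding scheme.

```python
from collections import Counter

# Each analysis state is ("representation", id) or ("observation", text).
trace = [
    ("representation", "df.head()"),
    ("observation", "column 'age' has missing values"),
    ("representation", "scatter(age, income)"),
    ("observation", "age and income look correlated"),
    ("representation", "df.head()"),          # analyst revisits an earlier view
]

def revisit_counts(states):
    """How often each representation is returned to during the session."""
    reps = [s[1] for s in states if s[0] == "representation"]
    return {rep: n - 1 for rep, n in Counter(reps).items()}

def representational_diversity(states):
    """Number of distinct representations constructed."""
    return len({s[1] for s in states if s[0] == "representation"})

print(revisit_counts(trace))             # {'df.head()': 1, 'scatter(age, income)': 0}
print(representational_diversity(trace)) # 2
```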
11. Chen N, Zhang Y, Xu J, Ren K, Yang Y. VisEval: A Benchmark for Data Visualization in the Era of Large Language Models. IEEE Transactions on Visualization and Computer Graphics 2025; 31:1301-1311. PMID: 39255134; DOI: 10.1109/tvcg.2024.3456320.
Abstract
Translating natural language to visualization (NL2VIS) has shown great promise for visual data analysis, but it remains a challenging task that requires multiple low-level implementations, such as natural language processing and visualization design. Recent advancements in pre-trained large language models (LLMs) are opening new avenues for generating visualizations from natural language. However, the lack of a comprehensive and reliable benchmark hinders our understanding of LLMs' capabilities in visualization generation. In this paper, we address this gap by proposing a new NL2VIS benchmark called VisEval. Firstly, we introduce a high-quality and large-scale dataset. This dataset includes 2,524 representative queries covering 146 databases, paired with accurately labeled ground truths. Secondly, we advocate for a comprehensive automated evaluation methodology covering multiple dimensions, including validity, legality, and readability. By systematically scanning for potential issues with a number of heterogeneous checkers, VisEval provides reliable and trustworthy evaluation outcomes. We run VisEval on a series of state-of-the-art LLMs. Our evaluation reveals prevalent challenges and delivers essential insights for future advancements.
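A minimal Python sketch of the multi-dimension checker pipeline described above: heterogeneous checkers each inspect a generated chart spec along one dimension (validity, legality, readability) and report issues. The checker logic is an illustrative assumption, not the benchmark's actual implementation.

```python
def check_validity(spec: dict) -> list:
    """Validity: does the spec use the fields the query asked for?"""
    missing = set(spec.get("required_fields", [])) - set(spec.get("fields", []))
    return [f"missing field: {f}" for f in sorted(missing)]

def check_legality(spec: dict) -> list:
    """Legality: is the encoding well-formed at all?"""
    return [] if spec.get("mark") else ["no mark type specified"]

def check_readability(spec: dict) -> list:
    """Readability: simple heuristics such as requiring axis titles."""
    return [] if spec.get("axis_titles") else ["axes are untitled"]

def evaluate(spec: dict) -> dict:
    checkers = {"validity": check_validity,
                "legality": check_legality,
                "readability": check_readability}
    return {dim: fn(spec) for dim, fn in checkers.items()}

spec = {"mark": "bar", "fields": ["year"], "required_fields": ["year", "sales"]}
print(evaluate(spec))
# {'validity': ['missing field: sales'], 'legality': [], 'readability': ['axes are untitled']}
```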
12. Narechania A, Odak K, El-Assady M, Endert A. ProvenanceWidgets: A Library of UI Control Elements to Track and Dynamically Overlay Analytic Provenance. IEEE Transactions on Visualization and Computer Graphics 2025; 31:1235-1245. PMID: 39250388; DOI: 10.1109/tvcg.2024.3456144.
Abstract
We present ProvenanceWidgets, a JavaScript library of UI control elements such as radio buttons, checkboxes, and dropdowns to track and dynamically overlay a user's analytic provenance. These in situ overlays not only save screen space but also minimize the amount of time and effort needed to access the same information from elsewhere in the UI. In this paper, we discuss how we design modular UI control elements to track how often and how recently a user interacts with them and design visual overlays showing an aggregated summary as well as a detailed temporal history. We demonstrate the capability of ProvenanceWidgets by recreating three prior widget libraries: (1) Scented Widgets, (2) Phosphor objects, and (3) Dynamic Query Widgets. We also assessed its expressiveness and conducted case studies with visualization developers to examine its effectiveness. We find that ProvenanceWidgets enables developers to implement custom provenance-tracking applications effectively. ProvenanceWidgets is available as open-source software at https://github.com/ProvenanceWidgets to help application developers build custom provenance-based systems.
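The core bookkeeping such a provenance-tracking control needs (how often and how recently it was used, so an overlay can summarize history in situ) is small enough to sketch. This is a language-agnostic illustration in Python, not the JavaScript library's API.

```python
import time

class ProvenanceControl:
    def __init__(self, name: str):
        self.name = name
        self.history = []            # timestamped values, oldest first

    def interact(self, value):
        """Record every change to the control."""
        self.history.append((time.time(), value))

    def summary(self) -> dict:
        """Aggregate view for an in-situ overlay."""
        return {
            "control": self.name,
            "interactions": len(self.history),
            "last_used": self.history[-1][0] if self.history else None,
            "current_value": self.history[-1][1] if self.history else None,
        }

slider = ProvenanceControl("price-range")
slider.interact((0, 100))
slider.interact((20, 80))
print(slider.summary())
```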
13. Wang HW, Gordon M, Battle L, Heer J. DracoGPT: Extracting Visualization Design Preferences from Large Language Models. IEEE Transactions on Visualization and Computer Graphics 2025; 31:710-720. PMID: 39283801; DOI: 10.1109/tvcg.2024.3456350.
Abstract
Trained on vast corpora, Large Language Models (LLMs) have the potential to encode visualization design knowledge and best practices. However, if they fail to do so, they might provide unreliable visualization recommendations. What visualization design preferences, then, have LLMs learned? We contribute DracoGPT, a method for extracting, modeling, and assessing visualization design preferences from LLMs. To assess varied tasks, we develop two pipelines, DracoGPT-Rank and DracoGPT-Recommend, to model LLMs prompted to either rank or recommend visual encoding specifications. We use Draco as a shared knowledge base in which to represent LLM design preferences and compare them to best practices from empirical research. We demonstrate that DracoGPT can accurately model the preferences expressed by LLMs, enabling analysis in terms of Draco design constraints. Across a suite of backing LLMs, we find that DracoGPT-Rank and DracoGPT-Recommend moderately agree with each other, but both substantially diverge from guidelines drawn from human subjects experiments. Future work can build on our approach to expand Draco's knowledge base to model a richer set of preferences and to provide a robust and cost-effective stand-in for LLMs.
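Once preferences are expressed as weights over a shared set of design constraints (as in Draco), agreement between two preference models becomes quantifiable. The Python sketch below illustrates one simple way to do that; the constraint names, weights, and the pairwise-agreement measure are invented for illustration and are not the paper's method.

```python
def rank_agreement(weights_a: dict, weights_b: dict) -> float:
    """Fraction of constraint pairs that both weightings order the same way."""
    keys = sorted(set(weights_a) & set(weights_b))
    pairs = [(i, j) for i in range(len(keys)) for j in range(i + 1, len(keys))]
    if not pairs:
        return 1.0
    same = sum(
        (weights_a[keys[i]] - weights_a[keys[j]])
        * (weights_b[keys[i]] - weights_b[keys[j]]) > 0
        for i, j in pairs
    )
    return same / len(pairs)

llm_prefs = {"prefer_bar_for_nominal": 2.0, "penalize_dual_axis": 0.5}
human_prefs = {"prefer_bar_for_nominal": 1.5, "penalize_dual_axis": 3.0}
print(rank_agreement(llm_prefs, human_prefs))  # 0.0: the two weightings disagree on this pair
```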
14. Li G, Mi H, Liu CH, Itoh T, Wang G. HiRegEx: Interactive Visual Query and Exploration of Multivariate Hierarchical Data. IEEE Transactions on Visualization and Computer Graphics 2025; 31:699-709. PMID: 39255148; DOI: 10.1109/tvcg.2024.3456389.
Abstract
When using exploratory visual analysis to examine multivariate hierarchical data, users often need to query data to narrow down the scope of analysis. However, formulating effective query expressions remains a challenge for multivariate hierarchical data, particularly when datasets become very large. To address this issue, we develop a declarative grammar, HiRegEx (Hierarchical data Regular Expression), for querying and exploring multivariate hierarchical data. Rooted in the extended multi-level task topology framework for tree visualizations (e-MLTT), HiRegEx delineates three query targets (node, path, and subtree) and two aspects for querying these targets (features and positions), and uses operators developed based on classical regular expressions for query construction. Based on the HiRegEx grammar, we develop an exploratory framework for querying and exploring multivariate hierarchical data and integrate it into the TreeQueryER prototype system. The exploratory framework includes three major components: top-down pattern specification, bottom-up data-driven inquiry, and context-creation data overview. We validate the expressiveness of HiRegEx with the tasks from the e-MLTT framework and showcase the utility and effectiveness of the TreeQueryER system through a case study involving expert users in the analysis of a citation tree dataset.
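A minimal Python sketch of regex-style querying over hierarchical data, echoing the target/feature structure described above: a query names a predicate over node features and matches it anywhere in the tree. The query encoding is an editorial assumption, not HiRegEx syntax.

```python
tree = {"value": 10, "children": [
    {"value": 42, "children": []},
    {"value": 7, "children": [{"value": 99, "children": []}]},
]}

def match_nodes(node, predicate, path=()):
    """Yield (path, node) pairs for every node satisfying the predicate."""
    if predicate(node):
        yield path, node
    for i, child in enumerate(node["children"]):
        yield from match_nodes(child, predicate, path + (i,))

# Query: nodes whose feature 'value' exceeds 40, anywhere in the tree
# (roughly analogous to a wildcard-prefixed pattern in a regex-like grammar).
hits = list(match_nodes(tree, lambda n: n["value"] > 40))
print([(p, n["value"]) for p, n in hits])  # [((0,), 42), ((1, 0), 99)]
```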
15. L'Yi S, van den Brandt A, Adams E, Nguyen HN, Gehlenborg N. Learnable and Expressive Visualization Authoring Through Blended Interfaces. IEEE Transactions on Visualization and Computer Graphics 2025; 31:459-469. PMID: 39255109; PMCID: PMC11875996; DOI: 10.1109/tvcg.2024.3456598.
Abstract
A wide range of visualization authoring interfaces enable the creation of highly customized visualizations. However, prioritizing expressiveness often impedes the learnability of the authoring interface. The diversity of users, such as varying computational skills and prior experiences in user interfaces, makes it even more challenging for a single authoring interface to satisfy the needs of a broad audience. In this paper, we introduce a framework to balance learnability and expressivity in a visualization authoring system. Adopting insights from learnability studies, such as multimodal interaction and visualization literacy, we explore the design space of blending multiple visualization authoring interfaces for supporting authoring tasks in a complementary and flexible manner. To evaluate the effectiveness of blending interfaces, we implemented a proof-of-concept system, Blace, that combines four common visualization authoring interfaces (template-based, shelf configuration, natural language, and code editor) that are tightly linked to one another to help users easily relate unfamiliar interfaces to more familiar ones. Using the system, we conducted a user study with 12 domain experts who regularly visualize genomics data as part of their analysis workflow. Participants with varied visualization and programming backgrounds were able to successfully reproduce unfamiliar visualization examples without a guided tutorial in the study. Feedback from a post-study qualitative questionnaire further suggests that blending interfaces enabled participants to learn the system easily and assisted them in confidently editing unfamiliar visualization grammar in the code editor, enabling expressive customization. Reflecting on our study results and the design of our system, we discuss the different interaction patterns that we identified and design implications for blending visualization authoring interfaces.
16. Dhanoa V, Hinterreiter A, Fediuk V, Elmqvist N, Groller E, Streit M. D-Tour: Semi-Automatic Generation of Interactive Guided Tours for Visualization Dashboard Onboarding. IEEE Transactions on Visualization and Computer Graphics 2025; 31:721-731. PMID: 39259628; DOI: 10.1109/tvcg.2024.3456347.
Abstract
Onboarding a user to a visualization dashboard entails explaining its various components, including the chart types used, the data loaded, and the interactions available. Authoring such an onboarding experience is time-consuming, requires significant knowledge, and comes with little guidance on how best to complete this task. Depending on their levels of expertise, end users being onboarded to a new dashboard can be either confused and overwhelmed or disinterested and disengaged. We propose interactive dashboard tours (D-Tours) as semi-automated onboarding experiences that preserve the agency of users with various levels of expertise to keep them interested and engaged. Our interactive tours concept draws from open-world game design to give the user freedom in choosing their path through onboarding. We have implemented the concept in a tool called D-Tour Prototype, which allows authors to craft custom interactive dashboard tours from scratch or using automatic templates. Automatically generated tours can still be customized to use different media (e.g., video, audio, and highlighting) or new narratives to produce an onboarding experience tailored to an individual user. We demonstrate the usefulness of interactive dashboard tours through use cases and expert interviews. Our evaluation shows that authors found the automation in the D-Tour Prototype helpful and time-saving, and users found the created tours engaging and intuitive. This paper and all supplemental materials are available at https://osf.io/6fbjp/.
17. McNutt A, Stone MC, Heer J. Mixing Linters with GUIs: A Color Palette Design Probe. IEEE Transactions on Visualization and Computer Graphics 2025; 31:327-337. PMID: 39259629; DOI: 10.1109/tvcg.2024.3456317.
Abstract
Visualization linters are end-user facing evaluators that automatically identify potential chart issues. These spell-checker-like systems offer a blend of interpretability and customization that is not found in other forms of automated assistance. However, existing linters do not model context and have primarily targeted users who do not need assistance, resulting in obvious, even annoying, advice. We investigate these issues within the domain of color palette design, which serves as a microcosm of visualization design concerns. We contribute a GUI-based color palette linter as a design probe that covers perception, accessibility, context, and other design criteria, and use it to explore visual explanations, integrated fixes, and user-defined linting rules. Through a formative interview study and theory-driven analysis, we find that linters can be meaningfully integrated into graphical contexts, thereby addressing many of their core issues. We discuss implications for integrating linters into visualization tools, developing improved assertion languages, and supporting end-user-tunable advice, all laying the groundwork for more effective visualization linters in any context.
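User-definable lint rules of the kind discussed above reduce to predicates over a palette plus a message. A minimal Python sketch, where the thresholds and the crude RGB distance (a stand-in for a perceptual color metric) are editorial assumptions:

```python
def channel_distance(c1, c2) -> float:
    """Crude RGB distance as a stand-in for a perceptual color metric."""
    return sum(abs(a - b) for a, b in zip(c1, c2))

RULES = [
    ("colors too similar to tell apart",
     lambda pal: all(channel_distance(a, b) > 60
                     for i, a in enumerate(pal) for b in pal[i + 1:])),
    ("palette too large for categorical data",
     lambda pal: len(pal) <= 10),
]

def lint_palette(palette):
    """Return the messages of every rule the palette violates."""
    return [msg for msg, ok in RULES if not ok(palette)]

palette = [(31, 119, 180), (35, 120, 178), (255, 127, 14)]  # two near-duplicates
print(lint_palette(palette))  # ['colors too similar to tell apart']
```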
18. Moreira G, Hosseini M, Veiga C, Alexandre L, Colaninno N, de Oliveira D, Ferreira N, Lage M, Miranda F. Curio: A Dataflow-Based Framework for Collaborative Urban Visual Analytics. IEEE Transactions on Visualization and Computer Graphics 2025; 31:1224-1234. PMID: 39255103; DOI: 10.1109/tvcg.2024.3456353.
Abstract
Over the past decade, several urban visual analytics systems and tools have been proposed to tackle a host of challenges faced by cities, in areas as diverse as transportation, weather, and real estate. Many of these tools have been designed through collaborations with urban experts, aiming to distill intricate urban analysis workflows into interactive visualizations and interfaces. However, the design, implementation, and practical use of these tools still rely on siloed approaches, resulting in bespoke systems that are difficult to reproduce and extend. At the design level, these tools undervalue rich data workflows from urban experts, typically treating them only as data providers and evaluators. At the implementation level, they lack interoperability with other technical frameworks. At the practical use level, they tend to be narrowly focused on specific fields, inadvertently creating barriers to cross-domain collaboration. To address these gaps, we present Curio, a framework for collaborative urban visual analytics. Curio uses a dataflow model with multiple abstraction levels (code, grammar, GUI elements) to facilitate collaboration across the design and implementation of visual analytics components. The framework allows experts to intertwine data preprocessing, management, and visualization stages while tracking the provenance of code and visualizations. In collaboration with urban experts, we evaluate Curio through a diverse set of usage scenarios targeting urban accessibility, urban microclimate, and sunlight access. These scenarios use different types of data and domain methodologies to illustrate Curio's flexibility in tackling pressing societal challenges. Curio is available at urbantk.org/curio.
19. van den Brandt A, L'Yi S, Nguyen HN, Vilanova A, Gehlenborg N. Understanding Visualization Authoring Techniques for Genomics Data in the Context of Personas and Tasks. IEEE Transactions on Visualization and Computer Graphics 2025; 31:1180-1190. PMID: 39288066; PMCID: PMC11875953; DOI: 10.1109/tvcg.2024.3456298.
Abstract
Genomics experts rely on visualization to extract and share insights from complex and large-scale datasets. Beyond off-the-shelf tools for data exploration, there is an increasing need for platforms that aid experts in authoring customized visualizations for both exploration and communication of insights. A variety of interactive techniques have been proposed for authoring data visualizations, such as template editing, shelf configuration, natural language input, and code editors. However, it remains unclear how genomics experts create visualizations and which techniques best support their visualization tasks and needs. To address this gap, we conducted two user studies with genomics researchers: (1) semi-structured interviews (n=20) to identify the tasks, user contexts, and current visualization authoring techniques and (2) an exploratory study (n=13) using visual probes to elicit users' intents and desired techniques when creating visualizations. Our contributions include (1) a characterization of how visualization authoring is currently utilized in genomics visualization, identifying limitations and benefits in light of common criteria for authoring tools, and (2) generalizable design implications for genomics visualization authoring tools based on our findings on task- and user-specific usefulness of authoring techniques. All supplemental materials are available at https://osf.io/bdj4v/.
20. Keller MS, Gold I, McCallum C, Manz T, Kharchenko PV, Gehlenborg N. Vitessce: integrative visualization of multimodal and spatially resolved single-cell data. Nat Methods 2025; 22:63-67. PMID: 39333268; PMCID: PMC11725496; DOI: 10.1038/s41592-024-02436-x.
Abstract
Multiomics technologies with single-cell and spatial resolution make it possible to measure thousands of features across millions of cells. However, visual analysis of high-dimensional transcriptomic, proteomic, genome-mapped and imaging data types simultaneously remains a challenge. Here we describe Vitessce, an interactive web-based visualization framework for exploration of multimodal and spatially resolved single-cell data. We demonstrate integrative visualization of millions of data points, including cell-type annotations, gene expression quantities, spatially resolved transcripts and cell segmentations, across multiple coordinated views. The open-source software is available at http://vitessce.io.
Affiliation(s)
- Mark S Keller: Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Ilan Gold: Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Chuck McCallum: Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Trevor Manz: Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Peter V Kharchenko: Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA; Altos Labs, San Diego, CA, USA
- Nils Gehlenborg: Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
21. Choga WT, Gustani-Buss E, Tegally H, Maruapula D, Yu X, Moir M, Zuze BJL, James SE, Ndlovu NS, Seru K, Motshosi P, Blenkinsop A, Gobe I, Baxter C, Manasa J, Lockman S, Shapiro R, Makhema J, Wilkinson E, Blackard JT, Lemey P, Lessells RJ, Martin DP, de Oliveira T, Gaseitsiwe S, Moyo S. Emergence of Omicron FN.1 a descendent of BQ.1.1 in Botswana. Virus Evol 2024; 10:veae095. PMID: 39720788; PMCID: PMC11666700; DOI: 10.1093/ve/veae095.
Abstract
Botswana, like the rest of the world, has been significantly impacted by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). In December 2022, we detected a monophyletic cluster of genomes comprising a sublineage of the Omicron variant of concern (VOC) designated as B.1.1.529.5.3.1.1.1.1.1.1.74.1 (alias FN.1, clade 22E). These genomes were sourced from both epidemiologically linked and unlinked samples collected in three close locations within the district of Greater Gaborone. In this study, we assessed the worldwide prevalence of the FN.1 lineage, evaluated its mutational profile, and conducted a phylogeographic analysis to reveal its global dispersal dynamics. Among approximately 16 million publicly available SARS-CoV-2 sequences generated by 30 September 2023, only 87 were of the FN.1 lineage, including 22 from Botswana, 6 from South Africa, and 59 from the UK. The estimated time to the most recent common ancestor of the 87 FN.1 sequences was 22 October 2022 [95% highest posterior density: 2 September 2022-24 November 2022], with the earliest of the 22 Botswana sequences having been sampled on 7 December 2022. Discrete trait reconstruction of FN.1 identified Botswana as the most probable place of origin. The FN.1 lineage is derived from the BQ.1.1 lineage and carries two missense variants in the spike protein, S:K182E in the NTD and S:T478R in the RBD. Among the over 90 SARS-CoV-2 lineages circulating in Botswana between September 2020 and July 2023, FN.1 was most closely related to BQ.1.1.74 based on maximum likelihood phylogenetic inference, differing only by the S:K182E mutation found in FN.1. Given the early detection of numerous novel variants from Botswana and its neighbouring countries, our study underscores the necessity of continuous surveillance to monitor the emergence of potential VOCs, integrating molecular and spatial data to identify dissemination patterns and enhance preparedness efforts.
Affiliation(s)
- Wonderful T Choga: Research Laboratory, Botswana Harvard Health Partnership, Gaborone, Private Bag BO 320, Botswana; Faculty of Health Sciences, School of Allied Health Sciences, Gaborone, Private Bag UB 0022, Botswana; Centre for Epidemic Response and Innovation (CERI), School of Data Science and Computational Thinking, Stellenbosch University, Stellenbosch 7600, South Africa
- Emanuele Gustani-Buss: Laboratory for Clinical and Epidemiological Virology, Department of Microbiology, Immunology and Transplantation, Rega Institute, KU Leuven, Leuven 3000, Belgium
- Houriiyah Tegally: Centre for Epidemic Response and Innovation (CERI), School of Data Science and Computational Thinking, Stellenbosch University, Stellenbosch 7600, South Africa
- Dorcas Maruapula: Research Laboratory, Botswana Harvard Health Partnership, Gaborone, Private Bag BO 320, Botswana
- Xiaoyu Yu: Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3FL, Scotland, UK
- Monika Moir: Centre for Epidemic Response and Innovation (CERI), School of Data Science and Computational Thinking, Stellenbosch University, Stellenbosch 7600, South Africa
- Boitumelo J L Zuze: Research Laboratory, Botswana Harvard Health Partnership, Gaborone, Private Bag BO 320, Botswana; Faculty of Health Sciences, School of Allied Health Sciences, Gaborone, Private Bag UB 0022, Botswana
- San Emmanuel James: KwaZulu-Natal Research Innovation and Sequencing Platform (KRISP), School of Laboratory Medicine and Medical Sciences, University of KwaZulu-Natal, Durban 4001, South Africa
- Nokuthula S Ndlovu: Research Laboratory, Botswana Harvard Health Partnership, Gaborone, Private Bag BO 320, Botswana
- Kedumetse Seru: Research Laboratory, Botswana Harvard Health Partnership, Gaborone, Private Bag BO 320, Botswana
- Patience Motshosi: Research Laboratory, Botswana Harvard Health Partnership, Gaborone, Private Bag BO 320, Botswana
- Alexandra Blenkinsop: Department of Mathematics, Imperial College London, London, Westminster, SW7 2AZ, United Kingdom
- Irene Gobe: Faculty of Health Sciences, School of Allied Health Sciences, Gaborone, Private Bag UB 0022, Botswana
- Cheryl Baxter: Centre for Epidemic Response and Innovation (CERI), School of Data Science and Computational Thinking, Stellenbosch University, Stellenbosch 7600, South Africa
- Justen Manasa: Faculty of Medicine and Health Sciences, Molecular Diagnostics and Investigative Sciences, University of Zimbabwe, Harare, P.O. Box MP167, Zimbabwe
- Shahin Lockman: Research Laboratory, Botswana Harvard Health Partnership, Gaborone, Private Bag BO 320, Botswana; Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, MA 02115, United States; Division of Infectious Diseases, Brigham & Women's Hospital, Boston, MA 02115, United States; Harvard Medical School, Boston, MA 02115, United States
- Roger Shapiro: Research Laboratory, Botswana Harvard Health Partnership, Gaborone, Private Bag BO 320, Botswana; Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, MA 02115, United States
- Joseph Makhema: Research Laboratory, Botswana Harvard Health Partnership, Gaborone, Private Bag BO 320, Botswana; Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, MA 02115, United States
- Eduan Wilkinson: Centre for Epidemic Response and Innovation (CERI), School of Data Science and Computational Thinking, Stellenbosch University, Stellenbosch 7600, South Africa
- Jason T Blackard: University of Cincinnati College of Medicine, Cincinnati, OH 45267, United States
- Phillipe Lemey: Laboratory for Clinical and Epidemiological Virology, Department of Microbiology, Immunology and Transplantation, Rega Institute, KU Leuven, Leuven 3000, Belgium
- Richard J Lessells: KwaZulu-Natal Research Innovation and Sequencing Platform (KRISP), School of Laboratory Medicine and Medical Sciences, University of KwaZulu-Natal, Durban 4001, South Africa
- Darren P Martin: Division of Computational Biology, Department of Integrative Biomedical Sciences, Institute of Infectious Diseases and Molecular Medicine, University of Cape Town, Cape Town 7925, South Africa
- Tulio de Oliveira: Centre for Epidemic Response and Innovation (CERI), School of Data Science and Computational Thinking, Stellenbosch University, Stellenbosch 7600, South Africa; KwaZulu-Natal Research Innovation and Sequencing Platform (KRISP), School of Laboratory Medicine and Medical Sciences, University of KwaZulu-Natal, Durban 4001, South Africa; Department of Global Health, University of Washington, Seattle, WA 98105, United States
- Simani Gaseitsiwe: Research Laboratory, Botswana Harvard Health Partnership, Gaborone, Private Bag BO 320, Botswana; Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, MA 02115, United States
- Sikhulile Moyo: Research Laboratory, Botswana Harvard Health Partnership, Gaborone, Private Bag BO 320, Botswana; Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, MA 02115, United States; School of Health Systems and Public Health, University of Pretoria, Pretoria 0002, South Africa; Division of Medical Virology, Faculty of Medicine and Health Sciences, Stellenbosch University, Tygerberg, Cape Town 7602, South Africa
22. Warnking RP, Scheer J, Becker F, Siegel F, Trinkmann F, Nagel T. Designing interactive visualizations for analyzing chronic lung diseases in a user-centered approach. J Am Med Inform Assoc 2024; 31:2486-2495. PMID: 38796836; PMCID: PMC11491598; DOI: 10.1093/jamia/ocae113.
Abstract
OBJECTIVES: Medical practitioners analyze numerous types of data, often using archaic representations that do not meet their needs. Pneumologists who analyze lung function exams must often consult multiple exam records manually, making comparisons cumbersome. Such shortcomings can be addressed with interactive visualizations, but these must be designed carefully with practitioners' needs in mind.
MATERIALS AND METHODS: A workshop with experts was conducted to gather user requirements and common tasks. Based on the workshop results, we iteratively designed a web-based prototype, continuously consulting experts along the way. The resulting application was evaluated in a formative study via expert interviews with 3 medical practitioners.
RESULTS: Participants in our study were able to solve all tasks in accordance with experts' expectations and generally viewed our system positively, though there were some usability and utility issues in the initial prototype. An improved version of our system solves these issues and includes additional customization functionalities.
DISCUSSION: The study results showed that participants were able to use our system effectively to solve domain-relevant tasks, even though some shortcomings could be observed. Using a different framework with more fine-grained control over interactions and visual elements, we implemented design changes in an improved version of our prototype that needs to be evaluated in future work.
CONCLUSION: Employing a user-centered design approach, we developed a visual analytics system for lung function data that allows medical practitioners to more easily analyze the progression of several key parameters over time.
Affiliation(s)
- René Pascal Warnking: Department of Biomedical Informatics at the Center for Preventive Medicine and Digital Health, Medical Faculty Mannheim, Heidelberg University, 68167 Mannheim, Germany; Human Data Interaction Lab, Mannheim University of Applied Sciences, 68163 Mannheim, Germany
- Jan Scheer: Human Data Interaction Lab, Mannheim University of Applied Sciences, 68163 Mannheim, Germany
- Franziska Becker: Institute for Visualization and Interactive Systems (VIS), University of Stuttgart, 70569 Stuttgart, Germany
- Fabian Siegel: Department of Biomedical Informatics at the Center for Preventive Medicine and Digital Health, Medical Faculty Mannheim, Heidelberg University, 68167 Mannheim, Germany
- Frederik Trinkmann: Department of Biomedical Informatics at the Center for Preventive Medicine and Digital Health, Medical Faculty Mannheim, Heidelberg University, 68167 Mannheim, Germany; Department of Pneumology and Critical Care Medicine, Thoraxklinik, University of Heidelberg, Translational Lung Research Center Heidelberg (TLRC), German Center for Lung Research (DZL), 69126 Heidelberg, Germany
- Till Nagel: Human Data Interaction Lab, Mannheim University of Applied Sciences, 68163 Mannheim, Germany
23. Jana B, Liu X, Dénéréaz J, Park H, Leshchiner D, Liu B, Gallay C, Zhu J, Veening JW, van Opijnen T. CRISPRi-TnSeq maps genome-wide interactions between essential and non-essential genes in bacteria. Nat Microbiol 2024; 9:2395-2409. PMID: 39030344; PMCID: PMC11371651; DOI: 10.1038/s41564-024-01759-x.
Abstract
Genetic interactions identify functional connections between genes and pathways, establishing gene functions or druggable targets. Here we use CRISPRi-TnSeq, CRISPRi-mediated knockdown of essential genes alongside TnSeq-mediated knockout of non-essential genes, to map genome-wide interactions between essential and non-essential genes in Streptococcus pneumoniae. Transposon-mutant libraries constructed in 13 CRISPRi strains enabled screening of ~24,000 gene pairs. This identified 1,334 genetic interactions, including 754 negative and 580 positive interactions. Network analyses show that 17 non-essential genes pleiotropically interact with more than half the essential genes tested. Validation experiments confirmed that a 7-gene subset protects against perturbations. Furthermore, we reveal hidden redundancies that compensate for essential gene loss, relationships between cell wall synthesis, integrity and cell division, and show that CRISPRi-TnSeq identifies synthetic and suppressor-type relationships between both functionally linked and disparate genes and pathways. Importantly, in species where CRISPRi and Tn-Seq are established, CRISPRi-TnSeq should be straightforward to implement.
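As general background for the negative/positive interaction counts above, genetic interactions are commonly scored against a multiplicative null model: the observed double-perturbation fitness is compared with the product of the single-perturbation fitnesses. The Python sketch below illustrates that convention with invented fitness values; the study's actual scoring and statistics may differ.

```python
def interaction_score(f_knockdown: float, f_knockout: float,
                      f_double: float) -> float:
    """Positive: alleviating/suppressor-like; negative: synthetic/aggravating."""
    expected = f_knockdown * f_knockout   # multiplicative null model
    return f_double - expected

# CRISPRi knockdown of an essential gene combined with Tn-Seq knockout of a
# non-essential gene, with illustrative fitness values:
print(interaction_score(0.8, 0.9, 0.3))   # -0.42 -> negative interaction
print(interaction_score(0.8, 0.9, 0.95))  # +0.23 -> positive interaction
```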
Affiliation(s)
- Bimal Jana: Department of Biology, Boston College, Chestnut Hill, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Xue Liu: Department of Pathogen Biology, Base for International Science and Technology Cooperation: Carson Cancer Stem Cell Vaccines R&D Center, International Cancer Center, Shenzhen University Health Science Center, Shenzhen, China; Department of Fundamental Microbiology, University of Lausanne, Lausanne, Switzerland
- Julien Dénéréaz: Department of Fundamental Microbiology, University of Lausanne, Lausanne, Switzerland
- Hongshik Park: Department of Biology, Boston College, Chestnut Hill, MA, USA
- Bruce Liu: Department of Biology, Boston College, Chestnut Hill, MA, USA
- Clément Gallay: Department of Fundamental Microbiology, University of Lausanne, Lausanne, Switzerland
- Junhao Zhu: CAS Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
- Jan-Willem Veening: Department of Fundamental Microbiology, University of Lausanne, Lausanne, Switzerland
- Tim van Opijnen: Broad Institute of MIT and Harvard, Cambridge, MA, USA; Boston Children's Hospital, Division of Infectious Diseases, Harvard Medical School, Boston, MA, USA
24. Bearfield CX, Stokes C, Lovett A, Franconeri S. What Does the Chart Say? Grouping Cues Guide Viewer Comparisons and Conclusions in Bar Charts. IEEE Transactions on Visualization and Computer Graphics 2024; 30:5097-5110. PMID: 37792647; DOI: 10.1109/tvcg.2023.3289292.
Abstract
Reading a visualization is like reading a paragraph. Each sentence is a comparison: the mean of these is higher than those; this difference is smaller than that. What determines which comparisons are made first? The viewer's goals and expertise matter, but the way that values are visually grouped together within the chart also impacts those comparisons. Research from psychology suggests that comparisons involve multiple steps. First, the viewer divides the visualization into a set of units. This might include a single bar or a grouped set of bars. Then the viewer selects and compares two of these units, perhaps noting that one pair of bars is longer than another. Viewers might take an additional third step and perform a second-order comparison, perhaps determining that the difference between one pair of bars is greater than the difference between another pair. We create a visual comparison taxonomy that allows us to develop and test a sequence of hypotheses about which comparisons people are more likely to make when reading a visualization. We find that people tend to compare two groups before comparing two individual bars and that second-order comparisons are rare. Visual cues like spatial proximity and color can influence which elements are grouped together and selected for comparison, with spatial proximity being a stronger grouping cue. Interestingly, once the viewer grouped together and compared a set of bars, regardless of whether the group is formed by spatial proximity or color similarity, they no longer consider other possible groupings in their comparisons.
25. Fu Y, Stasko J. More Than Data Stories: Broadening the Role of Visualization in Contemporary Journalism. IEEE Transactions on Visualization and Computer Graphics 2024; 30:5240-5259. PMID: 37339040; DOI: 10.1109/tvcg.2023.3287585.
Abstract
Data visualization and journalism are deeply connected. From early infographics to recent data-driven storytelling, visualization has become an integrated part of contemporary journalism, primarily as a communication artifact to inform the general public. Data journalism, harnessing the power of data visualization, has emerged as a bridge between the growing volume of data and our society. Visualization research that centers around data storytelling has sought to understand and facilitate such journalistic endeavors. However, a recent metamorphosis in journalism has brought broader challenges and opportunities that extend beyond mere communication of data. We present this article to enhance our understanding of such transformations and thus broaden visualization research's scope and practical contribution to this evolving field. We first survey recent significant shifts, emerging challenges, and computational practices in journalism. We then summarize six roles of computing in journalism and their implications. Based on these implications, we provide propositions for visualization research concerning each role. Ultimately, by mapping the roles and propositions onto a proposed ecological model and contextualizing existing visualization research, we surface seven general topics and a series of research agendas that can guide future visualization research at this intersection.
26. Vohra SK, Harth P, Isoe Y, Bahl A, Fotowat H, Engert F, Hege HC, Baum D. A Visual Interface for Exploring Hypotheses About Neural Circuits. IEEE Transactions on Visualization and Computer Graphics 2024; 30:3945-3958. PMID: 37022819; PMCID: PMC11252567; DOI: 10.1109/tvcg.2023.3243668.
Abstract
One of the fundamental problems in neurobiological research is to understand how neural circuits generate behaviors in response to sensory stimuli. Elucidating such neural circuits requires anatomical and functional information about the neurons that are active during the processing of the sensory information and generation of the respective response, as well as an identification of the connections between these neurons. With modern imaging techniques, both morphological properties of individual neurons as well as functional information related to sensory processing, information integration and behavior can be obtained. Given the resulting information, neurobiologists are faced with the task of identifying the anatomical structures down to individual neurons that are linked to the studied behavior and the processing of the respective sensory stimuli. Here, we present a novel interactive tool that assists neurobiologists in the aforementioned tasks by allowing them to extract hypothetical neural circuits constrained by anatomical and functional data. Our approach is based on two types of structural data: brain regions that are anatomically or functionally defined, and morphologies of individual neurons. Both types of structural data are interlinked and augmented with additional information. The presented tool allows the expert user to identify neurons using Boolean queries. The interactive formulation of these queries is supported by linked views, using, among other things, two novel 2D abstractions of neural circuits. The approach was validated in two case studies investigating the neural basis of vision-based behavioral responses in zebrafish larvae. Despite this particular application, we believe that the presented tool will be of general interest for exploring hypotheses about neural circuits in other species, genera and taxa.
Collapse
|
27
|
Brooks CN, Field EK. Microbial community response to hydrocarbon exposure in iron oxide mats: an environmental study. Front Microbiol 2024; 15:1388973. [PMID: 38800754 PMCID: PMC11116660 DOI: 10.3389/fmicb.2024.1388973] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/20/2024] [Accepted: 04/16/2024] [Indexed: 05/29/2024] Open
Abstract
Hydrocarbon pollution is a widespread issue in both groundwater and surface-water systems; however, research on remediation at the interface of these two systems is limited. This interface is the oxic-anoxic boundary, where hydrocarbon pollutants from contaminated groundwaters flow into surface waters and iron mats are formed by microaerophilic iron-oxidizing bacteria. Iron mats are highly chemically adsorptive and host a diverse community of microbes. To elucidate the effect of hydrocarbon exposure on iron mat geochemistry and microbial community structure and function, we sampled iron mats both upstream and downstream from a leaking underground storage tank. Hydrocarbon-exposed iron mats had significantly higher concentrations of oxidized iron and significantly lower dissolved organic carbon and total dissolved phosphate than unexposed iron mats. A strong negative correlation between dissolved phosphate and benzene was observed in the hydrocarbon-exposed iron mats and water samples. In the hydrocarbon-exposed iron mats, iron and other hydrocarbons correlated positively with benzene, a pattern not observed in the water samples. The hydrocarbon-exposed iron mats represented two types, flocculent and seep, which had significantly different concentrations of iron, hydrocarbons, and phosphate, indicating that iron mat type is also an important context in studies of freshwater mats. Using constrained ordination, we found the best predictors for community structure to be dissolved oxygen, pH, and benzene. Alpha diversity and evenness were significantly lower in hydrocarbon-exposed iron mats than unexposed mats. Using 16S rDNA amplicon sequences, we found evidence of three putative nitrate-reducing iron-oxidizing taxa in microaerophile-dominated iron mats (Azospira, Paracoccus, and Thermomonas). 16S rDNA amplicons also indicated the presence of taxa that are associated with hydrocarbon degradation. Benzene remediation-associated genes were found using metagenomic analysis in both exposed and unexposed iron mats. Furthermore, the results indicated that season (summer vs. spring) exacerbates the negative effect of hydrocarbon exposure on community diversity and evenness and led to the increased abundance of numerous OTUs. This study represents the first attempt to understand how contaminant exposure, specifically hydrocarbons, influences the geochemistry and microbial community of freshwater iron mats and further develops our understanding of hydrocarbon remediation at the land-water interface.
Collapse
Affiliation(s)
- Chequita N. Brooks
- Department of Biology, East Carolina University, Greenville, NC, United States
- Louisiana Universities Marine Consortium, Chauvin, LA, United States
| | - Erin K. Field
- Department of Biology, East Carolina University, Greenville, NC, United States
| |
Collapse
|
28
|
Henkin R, Goldmann K, Lewis M, Barnes MR. shinyExprPortal: a configurable 'shiny' portal for sharing analysis of molecular expression data. Bioinformatics 2024; 40:btae172. [PMID: 38552327 PMCID: PMC11021805 DOI: 10.1093/bioinformatics/btae172] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/09/2023] [Revised: 02/13/2024] [Accepted: 03/27/2024] [Indexed: 04/18/2024] Open
Abstract
MOTIVATION The scale of omics research presents many obstacles to full sharing and access to analysis results. Current publication models impose limits on the number of pages and figures, requiring careful preparation and selection of content. At the same time, depositing data in open repositories significantly shifts the burden of access and reproduction to readers, who may include people who are not programmers or analysts. RESULTS We introduce shinyExprPortal, an R package that implements omics web portals with minimal coding effort. The portals allow exploration of transcriptomic or proteomic expression data and phenotypes, showcasing results of various types of analysis including differential expression, co-expression, and pathway analysis. The integration with bioinformatics workflows enables researchers to focus on their results and share findings using interactive and publication-quality plots. AVAILABILITY AND IMPLEMENTATION The shinyExprPortal package is available to download and install from CRAN and https://github.com/C4TB/shinyExprPortal.
Collapse
Affiliation(s)
- Rafael Henkin
- Centre for Translational Bioinformatics, William Harvey Research Institute, Queen Mary University of London, Charterhouse Square, London EC1M 6BQ, United Kingdom
- Digital Environment Research Institute, Queen Mary University of London, London E1 1HH, United Kingdom
| | - Katriona Goldmann
- Centre for Experimental Medicine and Rheumatology, William Harvey Research Institute, Queen Mary University of London, Charterhouse Square, London EC1M 6BQ, United Kingdom
- Alan Turing Institute, London NW1 2DB, United Kingdom
| | - Myles Lewis
- Centre for Translational Bioinformatics, William Harvey Research Institute, Queen Mary University of London, Charterhouse Square, London EC1M 6BQ, United Kingdom
- Centre for Experimental Medicine and Rheumatology, William Harvey Research Institute, Queen Mary University of London, Charterhouse Square, London EC1M 6BQ, United Kingdom
| | - Michael R Barnes
- Centre for Translational Bioinformatics, William Harvey Research Institute, Queen Mary University of London, Charterhouse Square, London EC1M 6BQ, United Kingdom
- Digital Environment Research Institute, Queen Mary University of London, London E1 1HH, United Kingdom
| |
Collapse
|
29
|
Ovchinnikova S, Anders S. Simple but powerful interactive data analysis in R with R/LinkedCharts. Genome Biol 2024; 25:43. [PMID: 38317238 PMCID: PMC10840235 DOI: 10.1186/s13059-024-03164-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/07/2022] [Accepted: 01/03/2024] [Indexed: 02/07/2024] Open
Abstract
In research involving data-rich assays, exploratory data analysis is a crucial step. Typically, this involves jumping back and forth between visualizations that provide an overview of the whole dataset and others that dive into details. For example, it might be helpful to have one chart showing a summary statistic for all samples, while a second chart provides details for points selected in the first chart. We present R/LinkedCharts, a framework that renders this task radically simple, requiring very few lines of code to obtain complex and general visualizations, which can later be polished to provide interactive data access of publication quality.
Collapse
Affiliation(s)
- Svetlana Ovchinnikova
- Center for Molecular Biology and BioQuant Center of the University of Heidelberg, Heidelberg, Germany
| | - Simon Anders
- Center for Molecular Biology and BioQuant Center of the University of Heidelberg, Heidelberg, Germany.
| |
Collapse
|
30
|
Lavikka K, Oikkonen J, Li Y, Muranen T, Micoli G, Marchi G, Lahtinen A, Huhtinen K, Lehtonen R, Hietanen S, Hynninen J, Virtanen A, Hautaniemi S. Deciphering cancer genomes with GenomeSpy: a grammar-based visualization toolkit. Gigascience 2024; 13:giae040. [PMID: 39101783 PMCID: PMC11299109 DOI: 10.1093/gigascience/giae040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/09/2024] [Revised: 05/13/2024] [Accepted: 06/19/2024] [Indexed: 08/06/2024] Open
Abstract
BACKGROUND Visualization is an indispensable facet of genomic data analysis. Despite the abundance of specialized visualization tools, there remains a distinct need for tailored solutions. However, their implementation typically requires extensive programming expertise from bioinformaticians and software developers, especially when building interactive applications. Toolkits based on visualization grammars offer a more accessible, declarative way to author new visualizations. Yet, current grammar-based solutions fall short in adequately supporting the interactive analysis of large datasets with extensive sample collections, a pivotal task often encountered in cancer research. FINDINGS We present GenomeSpy, a grammar-based toolkit for authoring tailored, interactive visualizations for genomic data analysis. By using combinatorial building blocks and a declarative language, users can implement new visualization designs easily and embed them in web pages or end-user-oriented applications. A distinctive element of GenomeSpy's architecture is its effective use of the graphics processing unit in all rendering, enabling a high frame rate and smoothly animated interactions, such as navigation within a genome. We demonstrate the utility of GenomeSpy by characterizing the genomic landscape of 753 ovarian cancer samples from patients in the DECIDER clinical trial. Our results expand the understanding of the genomic architecture in ovarian cancer, particularly the diversity of chromosomal instability. CONCLUSIONS GenomeSpy is a visualization toolkit applicable to a wide range of tasks pertinent to genome analysis. It offers high flexibility and exceptional performance in interactive analysis. The toolkit is open source with an MIT license, implemented in JavaScript, and available at https://genomespy.app/.
Collapse
Affiliation(s)
- Kari Lavikka
- Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, 00014 Helsinki, Finland
| | - Jaana Oikkonen
- Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, 00014 Helsinki, Finland
| | - Yilin Li
- Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, 00014 Helsinki, Finland
| | - Taru Muranen
- Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, 00014 Helsinki, Finland
| | - Giulia Micoli
- Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, 00014 Helsinki, Finland
| | - Giovanni Marchi
- Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, 00014 Helsinki, Finland
| | - Alexandra Lahtinen
- Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, 00014 Helsinki, Finland
| | - Kaisa Huhtinen
- Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, 00014 Helsinki, Finland
- Cancer Research Unit, Institute of Biomedicine and FICAN West Cancer Centre, University of Turku, 20521 Turku, Finland
| | - Rainer Lehtonen
- Applied Tumor Genomics Research Program, Research Programs Unit, Faculty of Medicine, University of Helsinki, 00014 Helsinki, Finland
| | - Sakari Hietanen
- Department of Obstetrics and Gynecology, University of Turku and Turku University Hospital, 20521 Turku, Finland
| | - Johanna Hynninen
- Department of Obstetrics and Gynecology, University of Turku and Turku University Hospital, 20521 Turku, Finland
| | - Anni Virtanen
- Department of Pathology, University of Helsinki and HUS Diagnostic Center, Helsinki University Hospital, 00260 Helsinki, Finland
| | - Sampsa Hautaniemi
- Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, 00014 Helsinki, Finland
| |
Collapse
|
31
|
Schmidt J, Pointner B, Miksch S. Visual Analytics for Understanding Draco's Knowledge Base. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2024; 30:392-402. [PMID: 37874727 DOI: 10.1109/tvcg.2023.3326912] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/26/2023]
Abstract
Draco has been developed as an automated visualization recommendation system formalizing design knowledge as logical constraints in ASP (Answer-Set Programming). With an increasing set of constraints and incorporated design knowledge, even visualization experts lose their overview of Draco and struggle to retrace the automated recommendation decisions made by the system. Our paper proposes a Visual Analytics (VA) approach to visualize and analyze Draco's constraints. The VA approach is designed to enable visualization experts to accomplish identified tasks regarding the knowledge base and to support them in better understanding Draco. We extend the existing data extraction strategy of Draco with a data processing architecture capable of extracting features of interest from the knowledge base. A revised version of the ASP grammar provides the basis for this data processing strategy. The shared features of the constraints are then visualized using a hypergraph structure inside the radially arranged constraints of the elaborated visualization. The hierarchical categories of the constraints are indicated by arcs surrounding the constraints. Our approach further enables visualization experts to interactively explore violations of the design rules by highlighting the respective constraints or recommendations. A qualitative and quantitative evaluation of the prototype confirms its effectiveness and value in acquiring insights into Draco's recommendation process and design constraints.
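For readers unfamiliar with ASP, the following minimal sketch shows how a Draco-style hard constraint can be checked with the clingo solver's Python API. The constraint (a bar chart with a continuous y-encoding must include zero) is paraphrased from the published Draco work, and the fact names are simplified, so treat the exact encoding as an assumption.

```python
# Checking a toy visualization spec against one Draco-style hard constraint
# with the clingo ASP solver (pip install clingo).
from clingo import Control

PROGRAM = """
% candidate specification, expressed as facts
mark(bar).
channel(e1, y).
continuous(e1).
% hard constraint: a bar chart with a continuous y-encoding must include zero
:- mark(bar), channel(E, y), continuous(E), not zero(E).
"""

ctl = Control()
ctl.add("base", [], PROGRAM)
ctl.ground([("base", [])])
result = ctl.solve()
print("specification valid:", result.satisfiable)  # False: zero(e1) is absent
```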
Collapse
|
32
|
Scott-Boyer MP, Dufour P, Belleau F, Ongaro-Carcy R, Plessis C, Périn O, Droit A. Use of Elasticsearch-based business intelligence tools for integration and visualization of biological data. Brief Bioinform 2023; 24:bbad348. [PMID: 37798252 DOI: 10.1093/bib/bbad348] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/29/2023] [Revised: 07/23/2023] [Accepted: 09/14/2023] [Indexed: 10/07/2023] Open
Abstract
The emergence of massive datasets exploring the multiple levels of molecular biology has made their analysis and knowledge transfer more complex. Flexible tools to manage big biological datasets could be of great help for standardizing the usage of developed data visualizations and integration methods. Business intelligence (BI) tools have been used in many fields as exploratory tools. They have numerous connectors to link numerous data repositories with a unified graphic interface, offering an overview of data and facilitating interpretation for decision makers. BI tools could be a flexible and user-friendly way of handling molecular biological data with interactive visualizations. However, it is rather uncommon to see such tools used for the exploration of massive and complex datasets in biological fields. We believe that two main obstacles could be the reason. Firstly, the ways of importing data into BI tools are not compatible with biological databases. Secondly, BI tools may not be adapted to certain particularities of complex biological data, namely, the size, the variability of datasets and the availability of specialized visualizations. This paper highlights the use of five BI tools (Elastic Kibana, Siren Investigate, Microsoft Power BI, Salesforce Tableau and Apache Superset), all of which are compatible with the massive data management repository engine Elasticsearch. Four case studies are discussed in which these BI tools were applied to biological datasets with different characteristics. We conclude that the performance of the tools depends on the complexity of the biological questions and the size of the datasets.
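As a flavor of how such a pipeline connects, here is a minimal sketch using the official Elasticsearch Python client to index and query an expression record; the index name, document fields, and local endpoint are assumptions for illustration.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Index one gene-expression document (Elasticsearch is schema-flexible).
es.index(index="expression", document={
    "gene": "TP53",
    "sample": "patient_042",
    "tpm": 12.7,
    "condition": "tumor",
})
es.indices.refresh(index="expression")       # make the document searchable

# Structured query of the kind a BI dashboard issues behind the scenes.
hits = es.search(index="expression", query={
    "bool": {
        "must": [{"match": {"gene": "TP53"}}],
        "filter": [{"range": {"tpm": {"gte": 10}}}],
    },
})
for hit in hits["hits"]["hits"]:
    print(hit["_source"])
```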
Collapse
Affiliation(s)
- Marie-Pier Scott-Boyer
- Centre de Recherche du CHU de Québec-Université Laval, Université Laval, G1V 4G2, Québec, Canada
| | - Pascal Dufour
- Centre de Recherche du CHU de Québec-Université Laval, Université Laval, G1V 4G2, Québec, Canada
| | - François Belleau
- Centre de Recherche du CHU de Québec-Université Laval, Université Laval, G1V 4G2, Québec, Canada
| | - Regis Ongaro-Carcy
- Centre de Recherche du CHU de Québec-Université Laval, Université Laval, G1V 4G2, Québec, Canada
- Département de Médecine Moléculaire, G1V 0A6, Québec, Canada
| | - Clément Plessis
- Centre de Recherche du CHU de Québec-Université Laval, Université Laval, G1V 4G2, Québec, Canada
| | - Olivier Périn
- L'Oréal Advance Research, Aulnay-sous-Bois, 93600, France
| | - Arnaud Droit
- Centre de Recherche du CHU de Québec-Université Laval, Université Laval, G1V 4G2, Québec, Canada
- Département de Médecine Moléculaire, G1V 0A6, Québec, Canada
| |
Collapse
|
33
|
Enge K, Rind A, Iber M, Höldrich R, Aigner W. Towards a unified terminology for sonification and visualization. PERSONAL AND UBIQUITOUS COMPUTING 2023; 27:1949-1963. [PMID: 37869040 PMCID: PMC10589160 DOI: 10.1007/s00779-023-01720-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 03/31/2022] [Accepted: 03/19/2023] [Indexed: 10/24/2023]
Abstract
Both sonification and visualization convey information about data by effectively using our human perceptual system, but their ways to transform the data differ. Over the past 30 years, the sonification community has repeatedly called for a holistic perspective on data representation, including audio-visual analysis. A design theory of audio-visual analysis would be a relevant step in this direction. An indispensable foundation for this endeavor is a terminology describing the combined design space. To build a bridge between the domains, we adopt three of the established theoretical constructs from visualization theory for the field of sonification. The three constructs are the spatial substrate, the visual mark, and the visual channel. In our model, we choose time to be the temporal substrate of sonification. Auditory marks are then positioned in time, just as visual marks are positioned in space. Auditory channels are encoded into auditory marks to convey information. The proposed definitions allow discussing visualization and sonification designs as well as multi-modal designs based on a common terminology. While the identified terminology can support audio-visual analytics research, it also provides a new perspective on sonification theory itself.
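A minimal sketch of these constructs in code, assuming nothing beyond numpy and the standard library: time is the substrate, each data item becomes one auditory mark (a short tone), and pitch is the auditory channel encoding the value; the frequency mapping is arbitrary.

```python
import wave
import numpy as np

values = [0.1, 0.4, 0.9, 0.6, 0.2]            # data items to sonify
rate, dur = 44100, 0.25                        # sample rate, mark duration (s)

tones = []
for v in values:
    freq = 220 + v * 660                       # pitch channel: 220-880 Hz
    t = np.arange(int(rate * dur)) / rate
    tones.append(0.5 * np.sin(2 * np.pi * freq * t))  # one auditory mark
signal = np.concatenate(tones)                 # marks positioned along time

with wave.open("sonification.wav", "wb") as f:
    f.setnchannels(1)
    f.setsampwidth(2)                          # 16-bit samples
    f.setframerate(rate)
    f.writeframes((signal * 32767).astype(np.int16).tobytes())
```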
Collapse
Affiliation(s)
- Kajetan Enge
- Institute of Creative Media Technologies, FH St. Pölten, Campusplatz 1, St. Pölten, 3100 Austria
- Institute of Electronic Music and Acoustics, University of Music and Performing Arts Graz, Leonhardstraße 15, Graz, 8010 Austria
| | - Alexander Rind
- Institute of Creative Media Technologies, FH St. Pölten, Campusplatz 1, St. Pölten, 3100 Austria
| | - Michael Iber
- Institute of Creative Media Technologies, FH St. Pölten, Campusplatz 1, St. Pölten, 3100 Austria
| | - Robert Höldrich
- Institute of Electronic Music and Acoustics, University of Music and Performing Arts Graz, Leonhardstraße 15, Graz, 8010 Austria
| | - Wolfgang Aigner
- Institute of Creative Media Technologies, FH St. Pölten, Campusplatz 1, St. Pölten, 3100 Austria
| |
Collapse
|
34
|
Eckelt K, Hinterreiter A, Adelberger P, Walchshofer C, Dhanoa V, Humer C, Heckmann M, Steinparz C, Streit M. Visual Exploration of Relationships and Structure in Low-Dimensional Embeddings. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:3312-3326. [PMID: 35254984 DOI: 10.1109/tvcg.2022.3156760] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/27/2023]
Abstract
In this work, we propose an interactive visual approach for the exploration and formation of structural relationships in embeddings of high-dimensional data. These structural relationships, such as item sequences, associations of items with groups, and hierarchies between groups of items, are defining properties of many real-world datasets. Nevertheless, most existing methods for the visual exploration of embeddings treat these structures as second-class citizens or do not take them into account at all. In our proposed analysis workflow, users explore enriched scatterplots of the embedding, in which relationships between items and/or groups are visually highlighted. The original high-dimensional data for single items, groups of items, or differences between connected items and groups are accessible through additional summary visualizations. We carefully tailored these summary and difference visualizations to the various data types and semantic contexts. During their exploratory analysis, users can externalize their insights by setting up additional groups and relationships between items and/or groups. We demonstrate the utility and potential impact of our approach by means of two use cases and multiple examples from various domains.
Collapse
|
35
|
Hinterreiter A, Humer C, Kainz B, Streit M. ParaDime: A Framework for Parametric Dimensionality Reduction. COMPUTER GRAPHICS FORUM : JOURNAL OF THE EUROPEAN ASSOCIATION FOR COMPUTER GRAPHICS 2023; 42:337-348. [PMID: 38505300 PMCID: PMC10947012 DOI: 10.1111/cgf.14834] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Indexed: 03/21/2024]
Abstract
ParaDime is a framework for parametric dimensionality reduction (DR). In parametric DR, neural networks are trained to embed high-dimensional data items in a low-dimensional space while minimizing an objective function. ParaDime builds on the idea that the objective functions of several modern DR techniques result from transformed inter-item relationships. It provides a common interface for specifying these relations and transformations and for defining how they are used within the losses that govern the training process. Through this interface, ParaDime unifies parametric versions of DR techniques such as metric MDS, t-SNE, and UMAP. It allows users to fully customize all aspects of the DR process. We show how this ease of customization makes ParaDime suitable for experimenting with interesting techniques such as hybrid classification/embedding models and supervised DR. This way, ParaDime opens up new possibilities for visualizing high-dimensional data.
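The core idea is easy to sketch. Below is an illustrative parametric DR loop in PyTorch, not ParaDime's actual API: a small network embeds items in 2D while a metric-MDS-style loss preserves pairwise distances, the simplest case of a transformed inter-item relationship.

```python
import torch

X = torch.randn(128, 10)                       # high-dimensional data items
D_high = torch.cdist(X, X)                     # inter-item relationships

net = torch.nn.Sequential(                     # the parametric embedding
    torch.nn.Linear(10, 32), torch.nn.ReLU(), torch.nn.Linear(32, 2)
)
opt = torch.optim.Adam(net.parameters(), lr=1e-2)

for step in range(500):
    Y = net(X)
    loss = ((torch.cdist(Y, Y) - D_high) ** 2).mean()  # MDS-style stress
    opt.zero_grad()
    loss.backward()
    opt.step()

# Unlike non-parametric DR, the trained network can embed unseen items.
Y_new = net(torch.randn(5, 10))
```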
Collapse
Affiliation(s)
| | | | - Bernhard Kainz
- Friedrich-Alexander-University Erlangen-Nuremberg, Germany
- Imperial College London, UK
| | | |
Collapse
|
36
|
Jana B, Liu X, Dénéréaz J, Park H, Leshchiner D, Liu B, Gallay C, Veening JW, van Opijnen T. CRISPRi-TnSeq: A genome-wide high-throughput tool for bacterial essential-nonessential genetic interaction mapping. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.05.31.543074. [PMID: 37398100 PMCID: PMC10312587 DOI: 10.1101/2023.05.31.543074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 07/04/2023]
Abstract
Genetic interaction networks can help identify functional connections between genes and pathways, which can be leveraged to establish (new) gene function, identify drug targets, and fill pathway gaps. Since there is no optimal tool that can map genetic interactions across many different bacterial strains and species, we develop CRISPRi-TnSeq, a genome-wide tool that maps genetic interactions between essential genes and nonessential genes through the knockdown of a targeted essential gene (CRISPRi) and the simultaneous knockout of individual nonessential genes (Tn-Seq). CRISPRi-TnSeq thereby identifies, on a genome-wide scale, synthetic and suppressor-type relationships between essential and nonessential genes, enabling the construction of essential-nonessential genetic interaction networks. To develop and optimize CRISPRi-TnSeq, CRISPRi strains were obtained for 13 essential genes in Streptococcus pneumoniae, involved in different biological processes including metabolism, DNA replication, transcription, cell division and cell envelope synthesis. Transposon-mutant libraries were constructed in each strain enabling screening of ∼24,000 gene-gene pairs, which led to the identification of 1,334 genetic interactions, including 754 negative and 580 positive genetic interactions. Through extensive network analyses and validation experiments, we identify a set of 17 pleiotropic genes, of which a subset tentatively functions as genetic capacitors, dampening phenotypic outcomes and protecting against perturbations. Furthermore, we focus on the relationships between cell wall synthesis, integrity and cell division and highlight: 1) how essential gene knockdown can be compensated by rerouting flux through nonessential genes in a pathway; 2) the existence of a delicate balance between Z-ring formation and localization, and septal and peripheral peptidoglycan (PG) synthesis to successfully accomplish cell division; 3) the control of c-di-AMP over intracellular K+ and turgor, and thereby modulation of the cell wall synthesis machinery; 4) the dynamic nature of cell wall protein CozEb and its effect on PG synthesis, cell shape morphology and envelope integrity; 5) functional dependency between chromosome decatenation and segregation, and the critical link with cell division and cell wall synthesis. Overall, we show that CRISPRi-TnSeq uncovers genetic interactions between closely functionally linked genes and pathways, as well as disparate genes and pathways, highlighting pathway dependencies and valuable leads for gene function. Importantly, since both CRISPRi and Tn-Seq are widely used tools, CRISPRi-TnSeq should be relatively easy to implement to construct genetic interaction networks across many different microbial strains and species.
Collapse
|
37
|
Younesy H, Pober J, Möller T, Karimi MM. ModEx: a general purpose computer model exploration system. FRONTIERS IN BIOINFORMATICS 2023; 3:1153800. [PMID: 37304402 PMCID: PMC10249055 DOI: 10.3389/fbinf.2023.1153800] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/30/2023] [Accepted: 05/09/2023] [Indexed: 06/13/2023] Open
Abstract
We present a general purpose visual analysis system that can be used for exploring parameters of a variety of computer models. Our proposed system offers key components of a visual parameter analysis framework, including parameter sampling, deriving output summaries, and an exploration interface. It also provides an API for rapid development of parameter space exploration solutions as well as the flexibility to support custom workflows for different application domains. We evaluate the effectiveness of our system by demonstrating it in three domains: data mining, machine learning, and a specific application in bioinformatics.
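As a generic illustration of those key components (not ModEx's own API), the sketch below samples a stand-in model's parameter space on a grid and derives output summaries that an exploration interface could then display.

```python
import itertools
import statistics

def model(alpha, beta):                        # stand-in computer model
    return [alpha * i + beta for i in range(100)]

param_grid = {"alpha": [0.1, 0.5, 1.0], "beta": [0, 10]}

rows = []
for combo in itertools.product(*param_grid.values()):  # parameter sampling
    params = dict(zip(param_grid, combo))
    out = model(**params)
    rows.append({**params,                     # derived output summaries
                 "mean": statistics.mean(out),
                 "max": max(out)})

for row in rows:                               # input to an exploration UI
    print(row)
```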
Collapse
Affiliation(s)
- Hamid Younesy
- School of Computing Science, Simon Fraser University, Burnaby, BC, Canada
| | | | - Torsten Möller
- Research Network Data Science and Faculty of Computer Science, University of Vienna, Vienna, Austria
| | - Mohammad M. Karimi
- Comprehensive Cancer Centre, School of Cancer and Pharmaceutical Sciences, Faculty of Life Sciences and Medicine, King's College London, London, United Kingdom
| |
Collapse
|
38
|
Reina G, Basole RC, Ferrise F. Can Image Data Facilitate Reproducibility of Graphics and Visualizations? Toward a Trusted Scientific Practice. IEEE COMPUTER GRAPHICS AND APPLICATIONS 2023; 43:89-100. [PMID: 37030835 DOI: 10.1109/mcg.2023.3241819] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/19/2023]
Abstract
Reproducibility is a cornerstone of good scientific practice; however, the ongoing "reproducibility crisis" shows that we still need to improve the way we are doing research currently. Reproducibility is crucial because it enables both the comparison to existing techniques as well as the composition and improvement of existing approaches. It can also increase trust in the respective results, which is paramount for adoption in further research and applications. While there are already many initiatives and approaches with different complexity aimed at enabling reproducible research in the context of visualization, we argue for an alternative, lightweight approach that documents the most relevant parameters with minimal overhead. It still complements complex approaches well, and integration with any existing tool or system is simple. Our approach uses the images produced by visualizations and seamlessly piggy-backs everyday communication and research collaborations, publication authoring, public outreach, and internal note-taking. We exemplify how our approach supports day-to-day work and discuss limitations and how they can be countered.
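One concrete way to realize such a lightweight, image-borne record is sketched below with Pillow's PNG text-chunk API; the parameter names are illustrative, and this is not the authors' implementation.

```python
import json
from PIL import Image
from PIL.PngImagePlugin import PngInfo

params = {"colormap": "viridis", "iso_value": 0.5, "seed": 42}

img = Image.new("RGB", (640, 480))             # stand-in for a rendered figure
meta = PngInfo()
meta.add_text("vis:parameters", json.dumps(params))
img.save("figure.png", pnginfo=meta)           # parameters travel with the image

# Anyone receiving the image can recover the provenance:
recovered = json.loads(Image.open("figure.png").text["vis:parameters"])
print(recovered)
```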
Collapse
|
39
|
Zhao J, Xu S, Chandrasegaran S, Bryan C, Du F, Mishra A, Qian X, Li Y, Ma KL. ChartStory: Automated Partitioning, Layout, and Captioning of Charts into Comic-Style Narratives. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:1384-1399. [PMID: 34559655 DOI: 10.1109/tvcg.2021.3114211] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
Visual data storytelling is gaining importance as a means of presenting data-driven information or analysis results, especially to the general public. This has resulted in design principles being proposed for data-driven storytelling, and new authoring tools being created to aid such storytelling. However, data analysts typically lack sufficient background in design and storytelling to make effective use of these principles and authoring tools. To assist this process, we present ChartStory for crafting data stories from a collection of user-created charts, using a style akin to comic panels to imply the underlying sequence and logic of data-driven narratives. Our approach is to operationalize established design principles into an advanced pipeline that characterizes charts by their properties and similarities to each other, and recommends ways to partition, layout, and caption story pieces to serve a narrative. ChartStory also augments this pipeline with intuitive user interactions for visual refinement of generated data comics. We extensively and holistically evaluate ChartStory via a trio of studies. We first assess how the tool supports data comic creation in comparison to a manual baseline tool. Data comics from this study are subsequently compared and evaluated to ChartStory's automated recommendations by a team of narrative visualization practitioners. This is followed by a pair of interview studies with data scientists using their own datasets and charts who provide an additional assessment of the system. We find that ChartStory provides cogent recommendations for narrative generation, resulting in data comics that compare favorably to manually-created ones.
Collapse
|
40
|
Manz T, L’Yi S, Gehlenborg N. Gos: a declarative library for interactive genomics visualization in Python. Bioinformatics 2023; 39:6998203. [PMID: 36688709 PMCID: PMC9891240 DOI: 10.1093/bioinformatics/btad050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/17/2022] [Revised: 11/16/2022] [Accepted: 01/20/2023] [Indexed: 01/24/2023] Open
Abstract
SUMMARY Gos is a declarative Python library designed to create interactive multiscale visualizations of genomics and epigenomics data. It provides a consistent and simple interface to the flexible Gosling visualization grammar. Gos hides technical complexities involved with configuring web-based genome browsers and integrates seamlessly within computational notebook environments to enable new interactive analysis workflows. AVAILABILITY AND IMPLEMENTATION Gos is released under the MIT License and available on the Python Package Index (PyPI). The source code is publicly available on GitHub (https://github.com/gosling-lang/gos), and documentation with examples can be found at https://gosling-lang.github.io/gos.
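A short example in the style of the Gos documentation, building a single heatmap-like track with the declarative API; the tileset URL, field names, and category labels follow the project's public demos, so treat the exact arguments as assumptions if the API has since evolved.

```python
import gosling as gos

# Multivec tileset from the Gosling demo server (illustrative URL).
data = gos.multivec(
    url="https://server.gosling-lang.org/api/v1/tileset_info/?d=cistrome-multivec",
    row="sample",
    column="position",
    value="peak",
    categories=["sample 1", "sample 2", "sample 3", "sample 4"],
)

track = gos.Track(data).mark_rect().encode(
    x=gos.X("start:G"),        # genomic coordinate channel
    xe="end:G",
    row=gos.Row("sample:N"),
    color=gos.Color("peak:Q"),
)

track.view()  # renders an interactive Gosling view, e.g. inside a notebook
```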
Collapse
Affiliation(s)
- Trevor Manz
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
| | - Sehi L’Yi
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
| | | |
Collapse
|
41
|
Li S, Yu J, Li M, Liu L, Zhang XL, Yuan X. A Framework for Multiclass Contour Visualization. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:353-362. [PMID: 36194705 DOI: 10.1109/tvcg.2022.3209482] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/16/2023]
Abstract
Multiclass contour visualization is often used to interpret complex data attributes in such fields as weather forecasting, computational fluid dynamics, and artificial intelligence. However, effectively and accurately representing underlying data patterns and correlations can be challenging in multiclass contour visualization, primarily due to the inevitable visual clutter and occlusions that arise when the number of classes is large. To address this issue, visualization designers must carefully choose design parameters to make the visualization more comprehensible. With this goal in mind, we propose a framework for multiclass contour visualization. The framework has two components: a set of four visualization design parameters, developed from an extensive review of the literature on contour visualization, and a declarative domain-specific language (DSL) for creating multiclass contour renderings, which enables fast exploration of those design parameters. A task-oriented user study was conducted to assess how those design parameters affect users' interpretations of real-world data. The study results offer suggestions on value choices for the design parameters in multiclass contour visualization.
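The clutter problem, and two of the tunable design parameters (color and line style/transparency), can be seen in a few lines of matplotlib; this generic sketch is independent of the proposed DSL.

```python
import numpy as np
import matplotlib.pyplot as plt

x, y = np.meshgrid(np.linspace(-3, 3, 200), np.linspace(-3, 3, 200))
class_a = np.exp(-((x - 1) ** 2 + y ** 2))     # one scalar field / class
class_b = np.exp(-((x + 1) ** 2 + y ** 2))     # a second, overlapping class

fig, ax = plt.subplots()
ax.contour(x, y, class_a, colors="tab:blue", alpha=0.8)
ax.contour(x, y, class_b, colors="tab:orange", alpha=0.8, linestyles="dashed")
ax.set_title("Two overlapping contour classes")
plt.show()
```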
Collapse
|
42
|
McNutt AM. No Grammar to Rule Them All: A Survey of JSON-style DSLs for Visualization. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:160-170. [PMID: 36166549 DOI: 10.1109/tvcg.2022.3209460] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/16/2023]
Abstract
There has been substantial growth in the use of JSON-based grammars, as well as other standard data serialization languages, to create visualizations. Each of these grammars serves a purpose: some focus on particular computational tasks (such as animation), some are concerned with certain chart types (such as maps), and some target specific data domains (such as ML). Despite the prominence of this interface form, there has been little detailed analysis of the characteristics of these languages. In this study, we survey and analyze the design and implementation of 57 JSON-style DSLs for visualization. We analyze these languages supported by a collected corpus of examples for each DSL (consisting of 4395 instances) across a variety of axes organized into concerns related to domain, conceptual model, language relationships, affordances, and general practicalities. We identify tensions throughout these areas, such as between formal and colloquial specifications, among types of users, and within the composition of languages. Through this work, we seek to support language implementers by elucidating the choices, opportunities, and tradeoffs in visualization DSL design.
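For readers who have not met this interface form, here is a representative JSON-style spec of the kind the survey analyzes, written as a Python dict so it stays self-contained; this particular one follows Vega-Lite's published v5 schema.

```python
import json

spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {"values": [
        {"category": "A", "count": 28},
        {"category": "B", "count": 55},
        {"category": "C", "count": 43},
    ]},
    "mark": "bar",
    "encoding": {
        "x": {"field": "category", "type": "nominal"},
        "y": {"field": "count", "type": "quantitative"},
    },
}
print(json.dumps(spec, indent=2))  # paste into a Vega-Lite editor to render
```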
Collapse
|
43
|
Sperrle F, Ceneda D, El-Assady M. Lotse: A Practical Framework for Guidance in Visual Analytics. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:1124-1134. [PMID: 36215348 DOI: 10.1109/tvcg.2022.3209393] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/16/2023]
Abstract
Co-adaptive guidance aims to enable efficient human-machine collaboration in visual analytics, as proposed by multiple theoretical frameworks. This paper bridges the gap between such conceptual frameworks and practical implementation by introducing an accessible model of guidance and an accompanying guidance library, mapping theory into practice. We contribute a model of system-provided guidance based on design templates and derived strategies. We instantiate the model in a library called Lotse that allows specifying guidance strategies in definition files and generates running code from them. Lotse is the first guidance library using such an approach. It supports the creation of reusable guidance strategies to retrofit existing applications with guidance and fosters the creation of general guidance strategy patterns. We demonstrate its effectiveness through first-use case studies with VA researchers of varying guidance design expertise and find that they are able to effectively and quickly implement guidance with Lotse. Further, we analyze our framework's cognitive dimensions to evaluate its expressiveness and outline a summary of open research questions for aligning guidance practice with its intricate theory.
Collapse
|
44
|
Wang Y, Hou Z, Shen L, Wu T, Wang J, Huang H, Zhang H, Zhang D. Towards Natural Language-Based Visualization Authoring. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:1222-1232. [PMID: 36197854 DOI: 10.1109/tvcg.2022.3209357] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/16/2023]
Abstract
A key challenge to visualization authoring is the process of getting familiar with the complex user interfaces of authoring tools. Natural Language Interfaces (NLIs) present promising benefits due to their learnability and usability. However, supporting NLIs for authoring tools requires expertise in natural language processing, while existing NLIs are mostly designed for visual analytics workflows. In this paper, we propose an authoring-oriented NLI pipeline by introducing a structured representation of users' visualization editing intents, called editing actions, based on a formative study and an extensive survey of visualization construction tools. The editing actions are executable, and thus decouple natural language interpretation and visualization applications as an intermediate layer. We implement a deep learning-based NL interpreter to translate NL utterances into editing actions. The interpreter is reusable and extensible across authoring tools, which only need to map the editing actions onto tool-specific operations. To illustrate the usage of the NL interpreter, we implement an Excel chart editor and a proof-of-concept authoring tool, VisTalk. We conduct a user study with VisTalk to understand the usage patterns of NL-based authoring systems. Finally, we discuss observations on how users author charts with natural language, as well as implications for future research.
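A hypothetical sketch of what such an editing action might look like as the intermediate layer; the field names and the toy interpreter below are invented for illustration, not taken from the paper.

```python
from dataclasses import dataclass

@dataclass
class EditingAction:
    target: str        # chart element, e.g. "mark", "y_axis", "title"
    operation: str     # e.g. "set_color", "set_range", "set_text"
    value: object      # argument of the operation

def interpret(utterance: str) -> list[EditingAction]:
    """Stand-in for the deep learning-based NL interpreter."""
    if "red" in utterance and "bars" in utterance:
        return [EditingAction("mark", "set_color", "red")]
    return []

for action in interpret("make the bars red"):
    print(action)      # a tool-specific backend would execute this action
```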
Collapse
|
45
|
Huang J, Xi Y, Hu J, Tao J. FlowNL: Asking the Flow Data in Natural Languages. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:1200-1210. [PMID: 36194710 DOI: 10.1109/tvcg.2022.3209453] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/16/2023]
Abstract
Flow visualization is essentially a tool to answer domain experts' questions about flow fields using rendered images. Static flow visualization approaches require domain experts to raise their questions to visualization experts, who develop specific techniques to extract and visualize the flow structures of interest. Interactive visualization approaches allow domain experts to ask the system directly through the visual analytic interface, which provides flexibility to support various tasks. However, in practice, the visual analytic interface may require extra learning effort, which often discourages domain experts and limits its usage in real-world scenarios. In this paper, we propose FlowNL, a novel interactive system with a natural language interface. FlowNL allows users to manipulate the flow visualization system using plain English, which greatly reduces the learning effort. We develop a natural language parser to interpret user intention and translate textual input into a declarative language. We design the declarative language as an intermediate layer between the natural language and the programming language specifically for flow visualization. The declarative language provides selection and composition rules to derive relatively complicated flow structures from primitive objects that encode various kinds of information about scalar fields, flow patterns, regions of interest, connectivities, etc. We demonstrate the effectiveness of FlowNL using multiple usage scenarios and an empirical evaluation.
Collapse
|
46
|
Li Y, Qi Y, Shi Y, Chen Q, Cao N, Chen S. Diverse Interaction Recommendation for Public Users Exploring Multi-view Visualization using Deep Learning. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:95-105. [PMID: 36155443 DOI: 10.1109/tvcg.2022.3209461] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/16/2023]
Abstract
Interaction is an important channel for offering users insights in interactive visualization systems. However, which interactions to perform and which parts of the data to explore are hard questions for public users facing a multi-view visualization for the first time. Making these decisions largely relies on professional experience and analytic ability, which is a huge challenge for non-professionals. To solve this problem, we propose a method that provides diverse, insightful, and real-time interaction recommendations for novice users. Building on the Long Short-Term Memory (LSTM) architecture, our model captures users' interactions and visual states and encodes them as numerical vectors to make further recommendations. Through an illustrative example of a visualization system about Chinese poets in a museum scenario, the model is shown to work in systems with multiple views and multiple interaction types. A further user study demonstrates the method's capability to help public users conduct more insightful and diverse interactive explorations and gain more accurate data insights.
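The general recipe, encoding recent interactions as vectors, running them through an LSTM, and scoring candidate next interactions, can be sketched in a few lines of PyTorch; the sizes and encodings below are arbitrary stand-ins, not the authors' model.

```python
import torch

n_interaction_types, hidden = 8, 32
lstm = torch.nn.LSTM(input_size=n_interaction_types,
                     hidden_size=hidden, batch_first=True)
head = torch.nn.Linear(hidden, n_interaction_types)

# One session: five past interactions, one-hot encoded.
session = torch.eye(n_interaction_types)[torch.tensor([0, 2, 2, 5, 1])]
out, _ = lstm(session.unsqueeze(0))            # shape (1, 5, hidden)
scores = head(out[:, -1])                      # score from the last state
print(scores.softmax(-1))                      # distribution over next actions
```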
Collapse
|
47
|
Gaba A, Setlur V, Srinivasan A, Hoffswell J, Xiong C. Comparison Conundrum and the Chamber of Visualizations: An Exploration of How Language Influences Visual Design. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:1211-1221. [PMID: 36155465 DOI: 10.1109/tvcg.2022.3209456] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/04/2023]
Abstract
The language for expressing comparisons is often complex and nuanced, making supporting natural language-based visual comparison a non-trivial task. To better understand how people reason about comparisons in natural language, we explore a design space of utterances for comparing data entities. We identified different parameters of comparison utterances that indicate what is being compared (i.e., data variables and attributes) as well as how these parameters are specified (i.e., explicitly or implicitly). We conducted a user study with sixteen data visualization experts and non-experts to investigate how they designed visualizations for comparisons in our design space. Based on the rich set of visualization techniques observed, we extracted key design features from the visualizations and synthesized them into a subset of sixteen representative visualization designs. We then conducted a follow-up study to validate user preferences for the sixteen representative visualizations corresponding to utterances in our design space. Findings from these studies suggest guidelines and future directions for designing natural language interfaces and recommendation tools to better support natural language comparisons in visual analytics.
Collapse
|
48
|
Zong J, Pollock J, Wootton D, Satyanarayan A. Animated Vega-Lite: Unifying Animation with a Grammar of Interactive Graphics. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:149-159. [PMID: 36215347 DOI: 10.1109/tvcg.2022.3209369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/16/2023]
Abstract
We present Animated Vega-Lite, a set of extensions to Vega-Lite that model animated visualizations as time-varying data queries. In contrast to alternate approaches for specifying animated visualizations, which prize a highly expressive design space, Animated Vega-Lite prioritizes unifying animation with the language's existing abstractions for static and interactive visualizations to enable authors to smoothly move between or combine these modalities. Thus, to compose animation with static visualizations, we represent time as an encoding channel. Time encodings map a data field to animation keyframes, providing a lightweight specification for animations without interaction. To compose animation and interaction, we also represent time as an event stream; Vega-Lite selections, which provide dynamic data queries, are now driven not only by input events but by timer ticks as well. We evaluate the expressiveness of our approach through a gallery of diverse examples that demonstrate coverage over taxonomies of both interaction and animation. We also critically reflect on the conceptual affordances and limitations of our contribution by interviewing five expert developers of existing animation grammars. These reflections highlight the key motivating role of in-the-wild examples, and identify three central tradeoffs: the language design process, the types of animated transitions supported, and how the systems model keyframes.
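Schematically, time as an encoding channel could look like the dict below, which mimics a Vega-Lite spec extended with a "time" channel that maps a field to animation keyframes; the property names follow the paper's description rather than any released schema, so treat them as assumptions.

```python
spec = {
    "data": {"url": "gapminder.json"},
    "mark": "point",
    "encoding": {
        "x": {"field": "fertility", "type": "quantitative"},
        "y": {"field": "life_expect", "type": "quantitative"},
        # the proposed extension: a time channel, one keyframe per year
        "time": {"field": "year", "type": "ordinal"},
    },
}
```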
Collapse
|
49
|
South L, Borkin MA. Photosensitive Accessibility for Interactive Data Visualizations. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:374-384. [PMID: 36166540 DOI: 10.1109/tvcg.2022.3209359] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/16/2023]
Abstract
Accessibility guidelines place restrictions on the use of animations and interactivity on webpages to lessen the likelihood of webpages inadvertently producing sequences with flashes, patterns, or color changes that may trigger seizures for individuals with photosensitive epilepsy. Online data visualizations often incorporate elements of animation and interactivity to create a narrative, engage users, or encourage exploration. These design guidelines have been empirically validated by perceptual studies in visualization literature, but the impact of animation and interaction in visualizations on users with photosensitivity, who may experience seizures in response to certain visual stimuli, has not been considered. We systematically gathered and tested 1,132 interactive and animated visualizations for seizure-inducing risk using established methods and found that currently available methods for determining photosensitive risk are not reliable when evaluating interactive visualizations, as risk scores varied significantly based on the individual interacting with the visualization. To address this issue, we introduce a theoretical model defining the degree of control visualization designers have over three determinants of photosensitive risk in potentially seizure-inducing sequences: the size, frequency, and color of flashing content. Using an analysis of 375 visualizations hosted on bl.ocks.org, we created a theoretical model of photosensitive risk in visualizations by arranging the photosensitive risk determinants according to the degree of control visualization authors have over whether content exceeds photosensitive accessibility thresholds. We then use this model to propose a new method of testing for photosensitive risk that focuses on elements of visualizations that are subject to greater authorial control - and are therefore more robust to variations in the individual user - producing more reliable risk assessments than existing methods when applied to interactive visualizations. A full copy of this paper and all study materials are available at https://osf.io/8kzmg/.
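As a toy illustration of the flash-frequency determinant (general accessibility guidance, e.g. WCAG 2.3.1, allows at most three general flashes in any one-second window), the numpy sketch below counts large swings in mean frame luminance; the thresholds are simplified placeholders, not the established testing methods the study evaluated.

```python
import numpy as np

def max_flashes_per_second(frames, fps, delta=0.1):
    """Count swings in mean luminance larger than `delta`, worst 1 s window."""
    lum = np.array([f.mean() for f in frames])
    swings = np.abs(np.diff(lum)) > delta      # one potential flash per swing
    return max(swings[i:i + fps].sum()
               for i in range(max(1, len(swings) - fps + 1)))

# Two seconds of 30 fps frames alternating dark/bright: a worst-case strobe.
frames = [np.full((64, 64), i % 2, dtype=float) for i in range(60)]
print(max_flashes_per_second(frames, fps=30))  # 30, far above the 3-flash limit
```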
Collapse
|
50
|
Deng D, Wu A, Qu H, Wu Y. DashBot: Insight-Driven Dashboard Generation Based on Deep Reinforcement Learning. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:690-700. [PMID: 36179003 DOI: 10.1109/tvcg.2022.3209468] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/16/2023]
Abstract
Analytical dashboards are popular in business intelligence for facilitating insight discovery with multiple charts. However, creating an effective dashboard is highly demanding: it requires users to have an adequate data analysis background and to be familiar with professional tools, such as Power BI. To create a dashboard, users have to configure charts by selecting data columns and exploring different chart combinations to optimize the communication of insights, a trial-and-error process. Recent research has started to use deep learning methods for dashboard generation to lower the burden of visualization creation. However, such efforts are greatly hindered by the lack of large-scale and high-quality datasets of dashboards. In this work, we propose using deep reinforcement learning to generate analytical dashboards, combining well-established visualization knowledge with the estimation capacity of reinforcement learning. Specifically, we use visualization knowledge to construct a training environment and rewards for agents to explore, and we imitate human exploration behavior with a well-designed agent network. The usefulness of the deep reinforcement learning model is demonstrated through ablation studies and user studies. In conclusion, our work opens up new opportunities to develop effective ML-based visualization recommenders without requiring training datasets prepared beforehand.
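A skeleton of the reinforcement-learning formulation with a gym-like interface; the state, action space, and reward below are stand-ins (the real system scores insight coverage of the generated dashboard with a learned agent network).

```python
import random

class DashboardEnv:
    """Toy environment: an episode incrementally builds a dashboard."""
    def __init__(self, columns):
        self.columns = columns
        self.dashboard = []                    # state: charts chosen so far

    def actions(self):
        # An action adds one chart configuration (chart type + data column).
        return [(mark, col) for mark in ("bar", "line", "scatter")
                for col in self.columns]

    def step(self, action):
        self.dashboard.append(action)
        reward = self._insight_score(action)   # visualization-knowledge reward
        done = len(self.dashboard) >= 4        # fixed dashboard size
        return self.dashboard, reward, done

    def _insight_score(self, action):
        return random.random()                 # placeholder reward model

env = DashboardEnv(columns=["sales", "region", "month"])
done = False
while not done:
    state, reward, done = env.step(random.choice(env.actions()))
print(state)
```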
Collapse
|