1. Tian Y, Cui W, Deng D, Yi X, Yang Y, Zhang H, Wu Y. ChartGPT: Leveraging LLMs to Generate Charts From Abstract Natural Language. IEEE Transactions on Visualization and Computer Graphics 2025; 31:1731-1745. [PMID: 38386583; DOI: 10.1109/tvcg.2024.3368621]
Abstract
The use of natural language interfaces (NLIs) to create charts is becoming increasingly popular due to the intuitiveness of natural language interactions. One key challenge in this approach is to accurately capture user intents and transform them into proper chart specifications, because users' natural language inputs are generally abstract (i.e., ambiguous or under-specified) and do not clearly specify visual encodings; this obstructs the wide use of NLIs in chart generation. Recently, pre-trained large language models (LLMs) have exhibited superior performance in understanding and generating natural language, demonstrating great potential for downstream tasks. Inspired by this trend, we propose ChartGPT, which generates charts from abstract natural language inputs. However, LLMs struggle with complex logic problems. To enable the model to accurately specify the complex parameters and perform the operations required in chart generation, we decompose the generation process into a step-by-step reasoning pipeline, so that the model only needs to reason about a single, specific sub-task during each run. Moreover, LLMs are pre-trained on general datasets, which might be ill-suited to the task of chart generation. To provide adequate visualization knowledge, we create a dataset consisting of abstract utterances and charts and improve model performance through fine-tuning. We further design an interactive interface for ChartGPT that allows users to check and modify the intermediate outputs of each step. The effectiveness of the proposed system is evaluated through quantitative evaluations and a user study.
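To make the decomposition concrete, here is a minimal sketch of a step-by-step generation pipeline in which each LLM call handles one narrow sub-task; the sub-task names and the `call_llm` stub are illustrative assumptions, not ChartGPT's actual prompts or interface.

```python
# A minimal sketch of the step-by-step decomposition described above.
# The concrete sub-tasks, prompts, and model interface are assumptions
# for illustration only.

def call_llm(prompt: str) -> str:
    """Stand-in for a fine-tuned LLM call; returns a canned answer here."""
    canned = {
        "columns": "country, sales",
        "filter": "year == 2020",
        "chart_type": "bar",
        "encoding": "x=country, y=sum(sales)",
    }
    return canned[prompt.split(":", 1)[0]]

def generate_chart_spec(utterance: str) -> dict:
    """Run one narrow sub-task per LLM call instead of one monolithic request."""
    spec = {}
    # Each step sees the utterance (and, in a real system, the intermediate
    # results so far), so the model reasons about a single decision at a time.
    spec["columns"] = call_llm(f"columns: pick data columns for: {utterance}")
    spec["filter"] = call_llm(f"filter: derive row filters for: {utterance}")
    spec["chart_type"] = call_llm(f"chart_type: choose a mark for: {utterance}")
    spec["encoding"] = call_llm(f"encoding: map columns to channels for: {utterance}")
    return spec

print(generate_chart_spec("How did sales compare across countries in 2020?"))
```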
2. Chen N, Zhang Y, Xu J, Ren K, Yang Y. VisEval: A Benchmark for Data Visualization in the Era of Large Language Models. IEEE Transactions on Visualization and Computer Graphics 2025; 31:1301-1311. [PMID: 39255134; DOI: 10.1109/tvcg.2024.3456320]
Abstract
Translating natural language to visualization (NL2VIS) has shown great promise for visual data analysis, but it remains a challenging task that requires multiple low-level implementations, such as natural language processing and visualization design. Recent advancements in pre-trained large language models (LLMs) are opening new avenues for generating visualizations from natural language. However, the lack of a comprehensive and reliable benchmark hinders our understanding of LLMs' capabilities in visualization generation. In this paper, we address this gap by proposing a new NL2VIS benchmark called VisEval. Firstly, we introduce a high-quality and large-scale dataset. This dataset includes 2,524 representative queries covering 146 databases, paired with accurately labeled ground truths. Secondly, we advocate for a comprehensive automated evaluation methodology covering multiple dimensions, including validity, legality, and readability. By systematically scanning for potential issues with a number of heterogeneous checkers, VisEval provides reliable and trustworthy evaluation outcomes. We run VisEval on a series of state-of-the-art LLMs. Our evaluation reveals prevalent challenges and delivers essential insights for future advancements.
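The multi-checker design can be illustrated with a small sketch in which each evaluation dimension is an independent checker scanned over a generated chart; the rules below are invented stand-ins, not VisEval's actual checkers.

```python
# A minimal sketch of multi-dimensional checking, assuming a generated chart
# arrives as a dict; the checker rules are illustrative assumptions.

def check_validity(chart: dict) -> bool:
    # Validity: the generated code ran and produced a renderable chart.
    return chart.get("rendered", False)

def check_legality(chart: dict) -> bool:
    # Legality: the chart answers the query with the expected chart type.
    return chart.get("chart_type") == chart.get("expected_chart_type")

def check_readability(chart: dict) -> bool:
    # Readability: basic presentation rules, e.g. both axes are labeled.
    return all(chart.get(k) for k in ("x_label", "y_label"))

CHECKERS = [check_validity, check_legality, check_readability]

def evaluate(chart: dict) -> dict:
    """Scan one generated visualization with every checker."""
    return {fn.__name__: fn(chart) for fn in CHECKERS}

chart = {"rendered": True, "chart_type": "bar",
         "expected_chart_type": "bar", "x_label": "genre", "y_label": ""}
print(evaluate(chart))  # readability fails: empty y-axis label
```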
3. Yang L, Guo Z, Xu X, Kang H, Lai J, Li J. An Online Multimodal Food Data Exploration Platform for Specific Population Health: Development Study. JMIR Form Res 2024; 8:e55088. [PMID: 39547662; PMCID: PMC11607570; DOI: 10.2196/55088]
Abstract
BACKGROUND: Nutrient needs vary over the lifespan. Improving the knowledge of both population groups and care providers can support healthier food choices, thereby promoting population health and preventing diseases. Providing evidence-based food knowledge online is credible, low cost, and easily accessible.
OBJECTIVE: This study aimed to develop an online multimodal food data exploration platform for easy access to evidence-based diet- and nutrition-related data.
METHODS: We developed an online platform named Food Atlas in collaboration with a multidisciplinary expert group from the National Institute for Nutrition and Health and Peking Union Medical College Hospital in China. To demonstrate its feasibility for Chinese food for pregnant women, we constructed a user-friendly, high-quality multimodal food knowledge graph and developed various interactions with the graph-structured data for easy access, including graph-based interactive visualizations, natural language retrieval, and image-text retrieval. We then evaluated Food Atlas from both the system perspective and the user perspective.
RESULTS: The constructed multimodal food knowledge graph contained a total of 2011 entities, 10,410 triplets, and 23,497 images. Its schema consisted of 11 entity types and 26 types of semantic relations. Compared with 5 other online dietary platforms (Foodwake, Boohee, Xiachufang, Allrecipes, and Yummly), Food Atlas offers a distinct and comprehensive set of data content and system functions desired by target populations. A total of 28 participants representing 4 user groups were recruited to evaluate its usability: preparing for pregnancy (n=8), pregnant (n=12), clinicians (n=5), and dietitians (n=3). The mean System Usability Scale score of the platform was 82.5 (SD 9.94; range 40.0-82.5). This above-average usability score and the use cases indicated that Food Atlas is tailored to the needs of its target users. Furthermore, 96% (27/28) of the participants stated that the platform had high consistency, illustrating the necessity and effectiveness of health professionals participating in online, evidence-based resource development.
CONCLUSIONS: This study demonstrates the development of an online multimodal food data exploration platform and its ability to meet the rising demand for accessible, credible, and appropriate evidence-based online dietary resources. Further research and broader implementation of such platforms have the potential to popularize knowledge, thereby helping populations at different life stages make healthier food choices.
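As a rough illustration of how such graph-structured food data can be queried, here is a minimal triplet-store sketch; the entities and relation names are invented examples, not Food Atlas's actual schema.

```python
# A minimal triplet-store sketch of querying graph-structured food data, as in
# the knowledge graph described above. All entities and relations are invented.

triplets = [
    ("spinach", "rich_in", "folate"),
    ("spinach", "suitable_for", "pregnant_women"),
    ("folate", "supports", "fetal_neural_development"),
]

def neighbors(entity, relation=None):
    """Return (relation, tail) pairs for an entity, optionally filtered."""
    return [(r, t) for h, r, t in triplets
            if h == entity and (relation is None or r == relation)]

print(neighbors("spinach"))                  # all facts about spinach
print(neighbors("spinach", "suitable_for"))  # only suitability facts
```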
Affiliation(s)
- Lin Yang
- Institute of Medical Information and Library, Chinese Academy of Medical Sciences/Peking Union Medical College, Beijing, China
- Key Laboratory of Medical Information Intelligent Technology, Chinese Academy of Medical Sciences, Beijing, China
- Zhen Guo
- Institute of Medical Information and Library, Chinese Academy of Medical Sciences/Peking Union Medical College, Beijing, China
- Key Laboratory of Medical Information Intelligent Technology, Chinese Academy of Medical Sciences, Beijing, China
- Xiaowei Xu
- Institute of Medical Information and Library, Chinese Academy of Medical Sciences/Peking Union Medical College, Beijing, China
- Key Laboratory of Medical Information Intelligent Technology, Chinese Academy of Medical Sciences, Beijing, China
- Hongyu Kang
- Institute of Medical Information and Library, Chinese Academy of Medical Sciences/Peking Union Medical College, Beijing, China
- Department of Biomedical Engineering, School of Medical Technology, Beijing Institute of Technology, Beijing, China
- Jianqiang Lai
- National Institute for Nutrition and Health, Chinese Center for Disease Control and Prevention, Beijing, China
- Jiao Li
- Institute of Medical Information and Library, Chinese Academy of Medical Sciences/Peking Union Medical College, Beijing, China
- Key Laboratory of Medical Information Intelligent Technology, Chinese Academy of Medical Sciences, Beijing, China
4. Tu Y, Wang X, Qiu R, Shen HW, Miller M, Rao J, Gao S, Huber PR, Hollander AD, Lange M, Garcia CR, Stubbs J. An Interactive Knowledge and Learning Environment in Smart Foodsheds. IEEE Computer Graphics and Applications 2023; 43:36-47. [PMID: 37030817; DOI: 10.1109/mcg.2023.3263960]
Abstract
The Internet of Food (IoF) is an emerging field in smart foodsheds, involving the creation of a knowledge graph (KG) about the environment, agriculture, food, diet, and health. However, the heterogeneity and size of the KG present challenges for downstream tasks, such as information retrieval and interactive exploration. To address those challenges, we propose an interactive knowledge and learning environment (IKLE) that integrates three programming and modeling languages to support multiple downstream tasks in the analysis pipeline. To make IKLE easier to use, we have developed algorithms to automate the generation of each language. In addition, we collaborated with domain experts to design and develop a dataflow visualization system, which embeds the automatic language generations into components and allows users to build their analysis pipeline by dragging and connecting components of interest. We have demonstrated the effectiveness of IKLE through three real-world case studies in smart foodsheds.
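The drag-and-connect dataflow idea can be sketched as components whose outputs feed downstream components; the component names and payloads here are illustrative assumptions, not IKLE's generated languages.

```python
# A minimal sketch of a component-based dataflow pipeline: each component wraps
# one analysis step, and connecting components composes the pipeline, much like
# dragging an edge between nodes. Names and payloads are invented.

class Component:
    def __init__(self, name, fn):
        self.name, self.fn, self.downstream = name, fn, []

    def connect(self, other):
        """Wire this component's output into another component."""
        self.downstream.append(other)
        return other

    def run(self, data):
        out = self.fn(data)
        for node in self.downstream:
            node.run(out)

retrieve = Component("retrieve", lambda q: ["triple1", "triple2"])
summarize = Component("summarize", lambda rows: f"{len(rows)} results")
display = Component("display", print)

retrieve.connect(summarize).connect(display)
retrieve.run("crops grown near watershed X")  # prints "2 results"
```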
5. Wang Y, Hou Z, Shen L, Wu T, Wang J, Huang H, Zhang H, Zhang D. Towards Natural Language-Based Visualization Authoring. IEEE Transactions on Visualization and Computer Graphics 2023; 29:1222-1232. [PMID: 36197854; DOI: 10.1109/tvcg.2022.3209357]
Abstract
A key challenge in visualization authoring is getting familiar with the complex user interfaces of authoring tools. Natural language interfaces (NLIs) present promising benefits due to their learnability and usability. However, supporting NLIs in authoring tools requires expertise in natural language processing, and existing NLIs are mostly designed for visual analytic workflows. In this paper, we propose an authoring-oriented NLI pipeline by introducing a structured representation of users' visualization editing intents, called editing actions, based on a formative study and an extensive survey of visualization construction tools. The editing actions are executable and thus decouple natural language interpretation from visualization applications as an intermediate layer. We implement a deep learning-based NL interpreter to translate NL utterances into editing actions. The interpreter is reusable and extensible across authoring tools, which only need to map the editing actions to tool-specific operations. To illustrate the usage of the NL interpreter, we implement an Excel chart editor and a proof-of-concept authoring tool, VisTalk. We conduct a user study with VisTalk to understand the usage patterns of NL-based authoring systems. Finally, we discuss observations on how users author charts with natural language, as well as implications for future research.
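A minimal sketch of the editing-action idea follows, with the action fields and the interpreter stub as illustrative assumptions rather than the paper's exact schema.

```python
# A minimal sketch of "editing actions" as an intermediate layer between an
# NL interpreter and an authoring tool. Fields and mappings are invented.

from dataclasses import dataclass

@dataclass
class EditingAction:
    target: str      # which chart element to edit, e.g. "bar", "title"
    operation: str   # what to do, e.g. "set_color", "set_text"
    value: str       # the new value

def interpret(utterance: str) -> EditingAction:
    """Stand-in for the deep learning-based interpreter described above."""
    # e.g. "make the bars red" -> edit the bar mark's color
    return EditingAction(target="bar", operation="set_color", value="red")

def apply_to_tool(action: EditingAction, chart: dict) -> dict:
    """Tool-specific mapping: each authoring tool translates actions its own way."""
    if action.operation == "set_color":
        chart.setdefault(action.target, {})["color"] = action.value
    return chart

chart = apply_to_tool(interpret("make the bars red"), {})
print(chart)  # {'bar': {'color': 'red'}}
```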
6. Huang J, Xi Y, Hu J, Tao J. FlowNL: Asking the Flow Data in Natural Languages. IEEE Transactions on Visualization and Computer Graphics 2023; 29:1200-1210. [PMID: 36194710; DOI: 10.1109/tvcg.2022.3209453]
Abstract
Flow visualization is essentially a tool to answer domain experts' questions about flow fields using rendered images. Static flow visualization approaches require domain experts to raise their questions to visualization experts, who develop specific techniques to extract and visualize the flow structures of interest. Interactive visualization approaches allow domain experts to ask the system directly through the visual analytic interface, which provides flexibility to support various tasks. However, in practice, the visual analytic interface may require extra learning effort, which often discourages domain experts and limits its usage in real-world scenarios. In this paper, we propose FlowNL, a novel interactive system with a natural language interface. FlowNL allows users to manipulate the flow visualization system using plain English, which greatly reduces the learning effort. We develop a natural language parser to interpret user intention and translate textual input into a declarative language. We design the declarative language as an intermediate layer between the natural language and the programming language specifically for flow visualization. The declarative language provides selection and composition rules to derive relatively complicated flow structures from primitive objects that encode various kinds of information about scalar fields, flow patterns, regions of interest, connectivities, etc. We demonstrate the effectiveness of FlowNL using multiple usage scenarios and an empirical evaluation.
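The selection and composition rules can be sketched as predicates over primitive flow objects combined with set algebra; the attributes and toy data below are invented, and this is not FlowNL's actual grammar.

```python
# A minimal sketch of selection and composition over primitive flow objects,
# mirroring the declarative layer described above. Attributes are invented.

streamlines = [
    {"id": 0, "max_velocity": 2.1, "region": "inlet"},
    {"id": 1, "max_velocity": 0.4, "region": "wake"},
    {"id": 2, "max_velocity": 1.7, "region": "wake"},
]

def select(objects, predicate):
    """Selection rule: keep the primitives that satisfy a predicate."""
    return {o["id"] for o in objects if predicate(o)}

fast = select(streamlines, lambda o: o["max_velocity"] > 1.0)
in_wake = select(streamlines, lambda o: o["region"] == "wake")

# Composition rule: set algebra derives more complex flow structures.
fast_wake = fast & in_wake
print(fast_wake)  # {2}: fast streamlines inside the wake region
```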
7. Ulbrich P, Waldner M, Furmanova K, Marques SM, Bednar D, Kozlikova B, Byska J. sMolBoxes: Dataflow Model for Molecular Dynamics Exploration. IEEE Transactions on Visualization and Computer Graphics 2023; 29:581-590. [PMID: 36155456; DOI: 10.1109/tvcg.2022.3209411]
Abstract
We present sMolBoxes, a dataflow representation for the exploration and analysis of long molecular dynamics (MD) simulations. When MD simulations reach millions of snapshots, frame-by-frame observation is no longer feasible, so biochemists rely largely on quantitative analysis of geometric and physico-chemical properties. However, using abstract methods to study inherently spatial data hinders exploration and poses a considerable workload. sMolBoxes link quantitative analysis of a user-defined set of properties with interactive 3D visualizations. They enable visual explanations of molecular behaviors, which lead to an efficient discovery of biochemically significant parts of the MD simulation. sMolBoxes follow a node-based model for flexible definition, combination, and immediate evaluation of the properties to be investigated. Progressive analytics enable fluid switching between multiple properties, which facilitates hypothesis generation. Each sMolBox provides quick insight into an observed property or function, available in more detail in the bigBox View. The case studies illustrate that even with relatively few sMolBoxes, it is possible to express complex analytical tasks, and their use in exploratory analysis is perceived as more efficient than traditional scripting-based methods.
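A minimal sketch of the node-based model follows, assuming a toy property (an atom-pair distance) evaluated per snapshot; this is not the sMolBoxes API.

```python
# A minimal sketch of a property node that evaluates one user-defined quantity
# per simulation snapshot; nodes can be swapped without touching the rest of
# the pipeline. The toy snapshots and property are invented.

import math

# Each snapshot: positions of two atoms of interest (x, y, z).
snapshots = [((0, 0, 0), (1, 0, 0)),
             ((0, 0, 0), (2, 0, 0)),
             ((0, 0, 0), (1, 1, 0))]

def distance(a, b):
    return math.dist(a, b)

class PropertyNode:
    def __init__(self, name, fn):
        self.name, self.fn = name, fn

    def evaluate(self, snapshots):
        """Produce one value per snapshot, ready for a linked line chart."""
        return [self.fn(*snap) for snap in snapshots]

node = PropertyNode("atom_pair_distance", distance)
print(node.evaluate(snapshots))  # [1.0, 2.0, 1.414...]
```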
8. A Natural Language Interface for Automatic Generation of Data Flow Diagram using Web Extraction Techniques. Journal of King Saud University - Computer and Information Sciences 2023. [DOI: 10.1016/j.jksuci.2023.01.006]
9. Wu J, Liu D, Guo Z, Wu Y. RASIPAM: Interactive Pattern Mining of Multivariate Event Sequences in Racket Sports. IEEE Transactions on Visualization and Computer Graphics 2023; 29:940-950. [PMID: 36179006; DOI: 10.1109/tvcg.2022.3209452]
Abstract
Experts in racket sports like tennis and badminton use tactical analysis to gain insight into competitors' playing styles. Many data-driven methods apply pattern mining to racket sports data, which is often recorded as multivariate event sequences, to uncover sports tactics. However, tactics obtained in this way are often inconsistent with those deduced by experts through their domain knowledge, which can be confusing to those experts. This work introduces RASIPAM, a RAcket-Sports Interactive PAttern Mining system, which allows experts to incorporate their knowledge into data mining algorithms to discover meaningful tactics interactively. RASIPAM consists of a constraint-based pattern mining algorithm that responds to the analysis demands of experts: experts provide suggestions for finding tactics in intuitive written language, and these suggestions are translated into constraints to run the algorithm. RASIPAM further introduces a tailored visual interface that allows experts to compare the new tactics with the original ones and decide whether to apply a given adjustment. This interactive workflow iteratively progresses until experts are satisfied with all tactics. We conduct a quantitative experiment to show that our algorithm supports real-time interaction. Two case studies, one in tennis and one in badminton, each involving two domain experts, are conducted to show the effectiveness and usefulness of the system.
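The constraint mechanism can be sketched as expert suggestions translated into predicates that the miner enforces; the tactics and constraints below are invented stand-ins for the actual algorithm.

```python
# A minimal sketch of constraint-based filtering in interactive pattern mining:
# an expert suggestion becomes a hard constraint on the next mining run.
# Patterns and constraints are invented examples.

patterns = [
    ("serve", "return", "smash"),
    ("serve", "lob"),
    ("drop", "net kill"),
]

# Expert suggestion, e.g. "tactics should start with a serve", translated
# into predicates the miner applies as constraints.
constraints = [lambda p: p[0] == "serve", lambda p: len(p) >= 2]

def mine_with_constraints(candidates, constraints):
    """Keep only candidate tactics satisfying every expert constraint."""
    return [p for p in candidates if all(c(p) for c in constraints)]

print(mine_with_constraints(patterns, constraints))
# [('serve', 'return', 'smash'), ('serve', 'lob')]
```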
10. Chen R, Shu X, Chen J, Weng D, Tang J, Fu S, Wu Y. Nebula: A Coordinating Grammar of Graphics. IEEE Transactions on Visualization and Computer Graphics 2022; 28:4127-4140. [PMID: 33909565; DOI: 10.1109/tvcg.2021.3076222]
Abstract
In multiple coordinated views (MCVs), visualizations across views update their content in response to users' interactions in other views. Interactive systems provide direct manipulation to create coordination between views, but are restricted to limited types of predefined templates. By contrast, textual specification languages enable flexible coordination but impose a technical burden. To bridge the gap, we contribute Nebula, a grammar based on natural language for coordinating visualizations in MCVs. The grammar design is informed by a novel framework based on a systematic review of 176 coordinations from existing theories and applications, which describes coordination by demonstration, i.e., how coordination is performed by users. With the framework, a Nebula specification formalizes coordination as a composition of user- and coordination-triggered interactions in origin and destination views, respectively, along with potential data transformations between the interactions. We evaluate Nebula by demonstrating its expressiveness with a gallery of diverse examples and analyzing its usability along cognitive dimensions.
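A minimal sketch of coordination specified as an origin interaction, an optional data transformation, and a destination interaction follows; the specification keys and toy views are assumptions, not Nebula's grammar.

```python
# A minimal sketch of "origin interaction -> optional data transformation ->
# destination interaction". All keys and views are invented examples.

coordination = {
    "origin": ("scatterplot", "select"),        # user-triggered
    "transform": lambda ids: [i for i in ids if i % 2 == 0],
    "destination": ("barchart", "highlight"),   # coordination-triggered
}

views = {"scatterplot": {}, "barchart": {}}

def trigger(view, interaction, payload):
    """Fire a user interaction and propagate it along the coordination."""
    if (view, interaction) == coordination["origin"]:
        payload = coordination["transform"](payload)
        dest_view, dest_act = coordination["destination"]
        views[dest_view][dest_act] = payload

trigger("scatterplot", "select", [1, 2, 3, 4])
print(views["barchart"])  # {'highlight': [2, 4]}
```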
11. Wu A, Wang Y, Shu X, Moritz D, Cui W, Zhang H, Zhang D, Qu H. AI4VIS: Survey on Artificial Intelligence Approaches for Data Visualization. IEEE Transactions on Visualization and Computer Graphics 2022; 28:5049-5070. [PMID: 34310306; DOI: 10.1109/tvcg.2021.3099002]
Abstract
Visualizations themselves have become a data format. Akin to other data formats such as text and images, visualizations are increasingly created, stored, shared, and (re-)used with artificial intelligence (AI) techniques. In this survey, we probe the underlying vision of formalizing visualizations as an emerging data format and review the recent advances in applying AI techniques to visualization data (AI4VIS). We define visualization data as the digital representations of visualizations in computers and focus on data visualization (e.g., charts and infographics). We build our survey upon a corpus spanning ten different fields in computer science with an eye toward identifying important common interests. Our resulting taxonomy is organized around WHAT visualization data is and how it is represented, and WHY and HOW to apply AI to visualization data. We highlight a set of common tasks that researchers apply to visualization data and present a detailed discussion of AI approaches developed to accomplish those tasks. Drawing upon our literature review, we discuss several important research questions surrounding the management and exploitation of visualization data, as well as the role of AI in supporting those processes. We make the list of surveyed papers and related material available online.
12. Wang Q, Chen Z, Wang Y, Qu H. A Survey on ML4VIS: Applying Machine Learning Advances to Data Visualization. IEEE Transactions on Visualization and Computer Graphics 2022; 28:5134-5153. [PMID: 34437063; DOI: 10.1109/tvcg.2021.3106142]
Abstract
Inspired by the great success of machine learning (ML), researchers have applied ML techniques to visualizations to achieve better design, development, and evaluation of visualizations. This branch of studies, known as ML4VIS, has been gaining increasing research attention in recent years. To successfully adapt ML techniques for visualizations, a structured understanding of the integration of ML4VIS is needed. In this article, we systematically survey 88 ML4VIS studies, aiming to answer two motivating questions: "what visualization processes can be assisted by ML?" and "how can ML techniques be used to solve visualization problems?" This survey reveals seven main processes where the employment of ML techniques can benefit visualizations: Data Processing4VIS, Data-VIS Mapping, Insight Communication, Style Imitation, VIS Interaction, VIS Reading, and User Profiling. The seven processes are related to existing visualization theoretical models in an ML4VIS pipeline, aiming to illuminate the role of ML-assisted visualization in general visualizations. Meanwhile, the seven processes are mapped onto the main learning tasks in ML to align the capabilities of ML with the needs in visualization. Current practices and future opportunities of ML4VIS are discussed in the context of the ML4VIS pipeline and the ML-VIS mapping. While more studies are still needed in the area of ML4VIS, we hope this article can provide a stepping-stone for future exploration. A web-based interactive browser of this survey is available at https://ml4vis.github.io.
13. Henkin R, Turkay C. Words of Estimative Correlation: Studying Verbalizations of Scatterplots. IEEE Transactions on Visualization and Computer Graphics 2022; 28:1967-1981. [PMID: 32915742; DOI: 10.1109/tvcg.2020.3023537]
Abstract
Natural language and visualization are being increasingly deployed together for supporting data analysis in different ways, from multimodal interaction to enriched data summaries and insights. Yet, researchers still lack systematic knowledge on how viewers verbalize their interpretations of visualizations, and how they interpret verbalizations of visualizations in such contexts. We describe two studies aimed at identifying characteristics of data and charts that are relevant in such tasks. The first study asks participants to verbalize what they see in scatterplots that depict various levels of correlations. The second study then asks participants to choose visualizations that match a given verbal description of correlation. We extract key concepts from responses, organize them in a taxonomy and analyze the categorized responses. We observe that participants use a wide range of vocabulary across all scatterplots, but particular concepts are preferred for higher levels of correlation. A comparison between the studies reveals the ambiguity of some of the concepts. We discuss how the results could inform the design of multimodal representations aligned with the data and analytical tasks, and present a research roadmap to deepen the understanding about visualizations and natural language.
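Studies of this kind depend on scatterplot stimuli with controlled correlation levels; the following standard construction (not necessarily the authors' exact procedure) draws points with a target Pearson correlation.

```python
# A small sketch of generating scatterplot stimuli at a target correlation
# level, a standard construction via a bivariate normal distribution.

import numpy as np

def correlated_sample(r, n=100, seed=0):
    """Draw n (x, y) points whose population correlation is r."""
    rng = np.random.default_rng(seed)
    cov = [[1.0, r], [r, 1.0]]  # unit variances, correlation r
    return rng.multivariate_normal([0.0, 0.0], cov, size=n)

pts = correlated_sample(r=0.8)
print(np.corrcoef(pts[:, 0], pts[:, 1])[0, 1])  # sample correlation near 0.8
```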
14. Khan S, Nguyen PH, Abdul-Rahman A, Bach B, Chen M, Freeman E, Turkay C. Propagating Visual Designs to Numerous Plots and Dashboards. IEEE Transactions on Visualization and Computer Graphics 2022; 28:86-95. [PMID: 34587060; DOI: 10.1109/tvcg.2021.3114828]
Abstract
In the process of developing an infrastructure for providing visualization and visual analytics (VIS) tools to epidemiologists and modeling scientists, we encountered a technical challenge: applying a number of visual designs to numerous datasets rapidly and reliably with limited development resources. In this paper, we present a technical solution to address this challenge. Operationally, we separate the tasks of data management, visual design, and plot and dashboard deployment in order to streamline the development workflow. Technically, we utilize: an ontology to bring datasets, visual designs, and deployable plots and dashboards under the same management framework; multi-criteria search and ranking algorithms for discovering potential datasets that match a visual design; and a purposely-designed user interface for propagating each visual design to the appropriate datasets (often tens or hundreds of them) and quality-assuring the propagation before deployment. This technical solution has been used in the development of the RAMPVIS infrastructure for supporting a consortium of epidemiologists and modeling scientists through visualization.
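The multi-criteria matching step can be sketched as per-criterion scores combined into a weighted ranking; the criteria and weights below are invented for illustration, not the RAMPVIS algorithms.

```python
# A minimal sketch of multi-criteria matching between a visual design and
# candidate datasets: each criterion scores a dataset, and a weighted sum
# ranks them. Criteria, weights, and data are invented examples.

datasets = [
    {"name": "cases_by_region", "fields": {"date", "region", "cases"}, "rows": 5000},
    {"name": "hospital_beds", "fields": {"date", "beds"}, "rows": 300},
]

design_needs = {"fields": {"date", "cases"}, "min_rows": 1000}

def score(ds):
    # Criterion 1: fraction of the design's required fields present.
    field_cov = len(design_needs["fields"] & ds["fields"]) / len(design_needs["fields"])
    # Criterion 2: enough rows for the design to make sense.
    size_ok = 1.0 if ds["rows"] >= design_needs["min_rows"] else 0.0
    return 0.7 * field_cov + 0.3 * size_ok

ranked = sorted(datasets, key=score, reverse=True)
print([d["name"] for d in ranked])  # best-matching datasets first
```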
15. Luo Y, Tang N, Li G, Tang J, Chai C, Qin X. Natural Language to Visualization by Neural Machine Translation. IEEE Transactions on Visualization and Computer Graphics 2022; 28:217-226. [PMID: 34784276; DOI: 10.1109/tvcg.2021.3114848]
Abstract
Supporting the translation from natural language (NL) queries to visualizations (NL2VIS) can simplify the creation of data visualizations because, if successful, anyone can generate visualizations from tabular data using natural language. The state-of-the-art NL2VIS approaches (e.g., NL4DV and FlowSense) are based on semantic parsers and heuristic algorithms, which are not end-to-end and are not designed to support (possibly) complex data transformations. Neural machine translation models powered by deep neural networks have made great strides in many machine translation tasks, which suggests that they might be viable for NL2VIS as well. In this paper, we present ncNet, a Transformer-based sequence-to-sequence model for supporting NL2VIS, with several novel visualization-aware optimizations, including attention-forcing to optimize the learning process and visualization-aware rendering to produce better visualization results. To enhance the model's ability to comprehend natural language queries, ncNet can also take an optional chart template (e.g., a pie chart or a scatter plot) as an additional input, which serves as a constraint on what can be visualized. We conducted both a quantitative evaluation and a user study, showing that ncNet achieves good accuracy on the nvBench benchmark and is easy to use.
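One way to picture the optional template input is as an extra constraint token appended to the model's source sequence; the separator tags below are assumptions for illustration, not ncNet's exact input format.

```python
# A minimal sketch of assembling a template-constrained source sequence for a
# seq2seq NL2VIS model: the NL query, the table schema, and an optional chart
# template are concatenated with separator tags. Tag names are invented.

def build_source_sequence(query, columns, template=None):
    """Assemble the model's source sequence from query, schema, and template."""
    parts = ["<nl>", query, "<col>", " ".join(columns)]
    if template is not None:
        # The template token (e.g. "bar" or "pie") constrains what the
        # decoder may generate.
        parts += ["<template>", template]
    return " ".join(parts)

src = build_source_sequence(
    "total sales per product category",
    ["category", "sales", "year"],
    template="bar",
)
print(src)
# <nl> total sales per product category <col> category sales year <template> bar
```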
16. Jiang Q, Sun G, Dong Y, Liang R. DT2VIS: A Focus+Context Answer Generation System to Facilitate Visual Exploration of Tabular Data. IEEE Computer Graphics and Applications 2021; 41:45-56. [PMID: 34260350; DOI: 10.1109/mcg.2021.3097326]
Abstract
Visual analysis dialog systems with natural language interfaces are emerging as promising data analysis tools. However, previous work has mostly focused on accurately understanding a user's query intention rather than on generating answers and prompting further exploration. In this work, we propose a focus+context answer generation approach, which allows users to obtain insight and contextual information simultaneously, to address incomplete user queries (i.e., queries that do not reflect all possible intentions of the user). We also design and implement a query recommendation algorithm, which applies a user's historical query information to recommend a follow-up query, to support in-depth exploration. These ideas are implemented in a system called DT2VIS, and specific cases of using DT2VIS to analyze data are provided. Finally, the results of a comparative study show that DT2VIS helps users easily and efficiently reach their analysis goals.
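The history-based recommendation idea can be sketched with a simple frequency heuristic; this is an invented stand-in, not DT2VIS's actual algorithm.

```python
# A minimal sketch of history-based follow-up recommendation: suggest the
# attribute the user queries most often but has not yet combined with the
# current query. The heuristic and data are invented examples.

from collections import Counter

history = [
    {"attributes": {"sales", "year"}},
    {"attributes": {"sales", "region"}},
    {"attributes": {"profit", "region"}},
]

def recommend(current):
    """Suggest a historically popular attribute missing from the query."""
    counts = Counter(a for q in history for a in q["attributes"])
    candidates = [(n, a) for a, n in counts.items() if a not in current]
    return max(candidates)[1]

print(recommend({"sales"}))  # 'region': frequent and not yet explored
```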
17. Colvis—A Structured Annotation Acquisition System for Data Visualization. Information 2021. [DOI: 10.3390/info12040158]
Abstract
Annotations produced by analysts during the exploration of a data visualization are a precious source of knowledge. Harnessing this knowledge requires not only a thorough structure for annotations but also a means to acquire them without harming user engagement. The main contribution of this article is a method, taking the form of an interface, that offers a comprehensive "subject-verb-complement" set of steps for analysts to take annotations and seamlessly translates these annotations into a prior classification framework. Technical considerations are also an integral part of this study: through a concrete web implementation, we demonstrate the feasibility of our method, but also highlight some of the unresolved challenges that remain to be addressed. After explaining all concepts related to our work, from a literature review to JSON specifications, we show two use cases that illustrate how the interface can work in concrete situations. We conclude with a substantial discussion of the limitations, the current state of the method, and the upcoming steps for this annotation interface.
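A minimal sketch of one annotation captured as a subject-verb-complement record and serialized to JSON follows; the field names echo the abstract's phrasing, and the classification label is an invented example.

```python
# A minimal sketch of a structured "subject-verb-complement" annotation
# serialized to JSON. All values and the classification label are invented.

import json

annotation = {
    "subject": "unemployment rate",    # what the analyst points at
    "verb": "increases",               # the observed behavior
    "complement": "after 2008",        # qualifier or context
    "classification": "trend",         # slot in the prior framework
    "chart_id": "line-chart-03",
}

print(json.dumps(annotation, indent=2))
```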
18. Narechania A, Srinivasan A, Stasko J. NL4DV: A Toolkit for Generating Analytic Specifications for Data Visualization from Natural Language Queries. IEEE Transactions on Visualization and Computer Graphics 2021; 27:369-379. [PMID: 33048704; DOI: 10.1109/tvcg.2020.3030378]
Abstract
Natural language interfaces (NLIs) have shown great promise for visual data analysis, allowing people to flexibly specify and interact with visualizations. However, developing visualization NLIs remains a challenging task, requiring low-level implementation of natural language processing (NLP) techniques as well as knowledge of visual analytic tasks and visualization design. We present NL4DV, a toolkit for natural language-driven data visualization. NL4DV is a Python package that takes as input a tabular dataset and a natural language query about that dataset. In response, the toolkit returns an analytic specification modeled as a JSON object containing data attributes, analytic tasks, and a list of Vega-Lite specifications relevant to the input query. In doing so, NL4DV aids visualization developers who may not have a background in NLP, enabling them to create new visualization NLIs or incorporate natural language input within their existing systems. We demonstrate NL4DV's usage and capabilities through four examples: 1) rendering visualizations using natural language in a Jupyter notebook, 2) developing an NLI to specify and edit Vega-Lite charts, 3) recreating data ambiguity widgets from the DataTone system, and 4) incorporating speech input to create a multimodal visualization system.
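A short usage sketch based on the description above, assuming the package's documented entry points (a NL4DV constructor taking a dataset and an analyze_query method returning a JSON-style dict); consult the NL4DV documentation for the exact signatures.

```python
# A usage sketch of the toolkit described above; entry points and response
# keys are assumed from the paper's description, so verify against the docs.

from nl4dv import NL4DV

nl4dv = NL4DV(data_url="movies.csv")  # tabular dataset as input
response = nl4dv.analyze_query("show average gross across genres")

# The response is a JSON-style dict: inferred data attributes, analytic
# tasks, and a ranked list of Vega-Lite specifications.
for vis in response["visList"]:
    print(vis["vlSpec"])  # each entry carries a renderable Vega-Lite spec
```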
19. Lumina: an adaptive, automated and extensible prototype for exploring, enriching and visualizing data. J Vis (Tokyo) 2021. [DOI: 10.1007/s12650-020-00718-y]
20. Chowdhury I, Moeid A, Hoque E, Kabir MA, Hossain MS, Islam MM. Designing and Evaluating Multimodal Interactions for Facilitating Visual Analysis With Dashboards. IEEE Access 2020; 9:60-71. [PMID: 34812375; PMCID: PMC8545227; DOI: 10.1109/access.2020.3046623]
Abstract
Exploring and analyzing data using visualizations is at the heart of many decision-making tasks. Typically, people perform visual data analysis using mouse and touch interactions. While such interactions are often easy to use, they can be inadequate for expressing complex information and may require many steps to complete a task. Recently, natural language interaction has emerged as a promising technique for supporting exploration with visualization, as it lets users express complex analytical questions more easily. In this paper, we investigate how to synergistically combine language and mouse-based direct manipulation so that the weakness of one modality can be complemented by the other. To this end, we have developed a novel system, named Multimodal Interactions System for Visual Analysis (MIVA), that allows users to provide input using both natural language (e.g., through speech) and direct manipulation (e.g., through mouse or touch) and presents the answers accordingly. To answer the current question in the context of past interactions, the system incorporates previous utterances and direct manipulations made by the user within a finite-state model. The uniqueness of our approach is that, unlike most previous approaches that typically support multimodal interaction with a single visualization, MIVA enables multimodal interaction with multiple coordinated visualizations of a dashboard that visually summarizes a dataset. We tested MIVA's applicability on several dashboards, including a COVID-19 dashboard that visualizes coronavirus cases around the globe. We further empirically evaluated our system through a user study with twenty participants. The results revealed that MIVA enhances the flow of visual analysis by enabling fluid, iterative exploration and refinement of data in a dashboard with multiple coordinated views.
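The finite-state context idea can be sketched as a dialog state that utterances and manipulations incrementally update, so follow-up queries resolve against prior context; the state fields are invented stand-ins, not MIVA's actual model.

```python
# A minimal sketch of keeping multimodal context in a dialog state: each
# utterance or direct manipulation updates the state, so a follow-up like
# "only in Canada" is resolved against what was shown before.

class DialogState:
    def __init__(self):
        self.measure, self.filters = None, {}

    def on_utterance(self, measure=None, **filters):
        if measure:
            self.measure = measure    # a new question sets the measure
        self.filters.update(filters)  # follow-ups refine prior context

    def current_query(self):
        return {"measure": self.measure, "filters": dict(self.filters)}

state = DialogState()
state.on_utterance(measure="confirmed cases")  # "show confirmed cases"
state.on_utterance(country="Canada")           # "only in Canada"
print(state.current_query())
# {'measure': 'confirmed cases', 'filters': {'country': 'Canada'}}
```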
Affiliation(s)
- Imran Chowdhury
- Department of Computer Science and Engineering, Chittagong University of Engineering and Technology, Chittagong 4349, Bangladesh
- Abdul Moeid
- Department of Computer Science and Engineering, Chittagong University of Engineering and Technology, Chittagong 4349, Bangladesh
- Enamul Hoque
- School of Information Technology, York University, Toronto, ON M3J 1P3, Canada
- Muhammad Ashad Kabir
- School of Computing and Mathematics, Charles Sturt University, Bathurst, NSW 2795, Australia
- Md. Sabir Hossain
- Department of Computer Science and Engineering, Chittagong University of Engineering and Technology, Chittagong 4349, Bangladesh
21. Srinivasan A, Stasko J, Keefe DF, Tory M. How to Ask What to Say?: Strategies for Evaluating Natural Language Interfaces for Data Visualization. IEEE Computer Graphics and Applications 2020; 40:96-103. [PMID: 32544054; DOI: 10.1109/mcg.2020.2986902]
Abstract
In this article, we discuss challenges and strategies for evaluating natural language interfaces (NLIs) for data visualization. Through an examination of prior studies and reflection on our own experiences in evaluating visualization NLIs, we highlight the benefits and considerations of three task framing strategies: Jeopardy-style facts, open-ended tasks, and target replication tasks. We hope the discussions in this article can guide future researchers working on visualization NLIs and help them avoid common challenges and pitfalls when evaluating these systems. Finally, to motivate future research, we highlight topics that call for further investigation, including the development of new evaluation metrics and the type of natural language input (spoken versus typed), among others.
22. Saktheeswaran A, Srinivasan A, Stasko J. Touch? Speech? or Touch and Speech? Investigating Multimodal Interaction for Visual Network Exploration and Analysis. IEEE Transactions on Visualization and Computer Graphics 2020; 26:2168-2179. [PMID: 32012017; DOI: 10.1109/tvcg.2020.2970512]
Abstract
Interaction plays a vital role during visual network exploration as users need to engage with both elements in the view (e.g., nodes, links) and interface controls (e.g., sliders, dropdown menus). Particularly as the size and complexity of a network grow, interactive displays supporting multimodal input (e.g., touch, speech, pen, gaze) exhibit the potential to facilitate fluid interaction during visual network exploration and analysis. While multimodal interaction with network visualization seems like a promising idea, many open questions remain. For instance, do users actually prefer multimodal input over unimodal input, and if so, why? Does it enable them to interact more naturally, or does having multiple modes of input confuse users? To answer such questions, we conducted a qualitative user study in the context of a network visualization tool, comparing speech- and touch-based unimodal interfaces to a multimodal interface combining the two. Our results confirm that participants strongly prefer multimodal input over unimodal input attributing their preference to: 1) the freedom of expression, 2) the complementary nature of speech and touch, and 3) integrated interactions afforded by the combination of the two modalities. We also describe the interaction patterns participants employed to perform common network visualization operations and highlight themes for future multimodal network visualization systems to consider.