1. Yang H, Li J, Chen S. TopicRefiner: Coherence-Guided Steerable LDA for Visual Topic Enhancement. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2024; 30:4542-4557. PMID: 37053067. DOI: 10.1109/tvcg.2023.3266890.
Abstract
This article presents a new Human-steerable Topic Modeling (HSTM) technique. Unlike existing techniques, which commonly rely on matrix decomposition-based topic models, we extend LDA as the fundamental component for extracting topics. LDA's high popularity and technical characteristics, such as better topic quality and no need to cherry-pick terms when constructing the document-term matrix, ensure broader applicability. Our research revolves around two inherent limitations of LDA. First, the principle of LDA is complex: its calculation process is stochastic and difficult to control. We thus propose a weighting method that incorporates users' refinements into the Gibbs sampling to control LDA. Second, LDA often runs on a corpus with massive numbers of terms and documents, forming a vast search space in which users must find semantically relevant or irrelevant objects. We thus design a visual editing framework based on the coherence metric, proven to be the most consistent with human perception in assessing topic quality, to guide users' interactive refinements. Case studies on two open real-world datasets, participants' performance in a user study, and quantitative experiment results demonstrate the usability and effectiveness of the proposed technique.
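The coherence metric that guides refinement in this abstract can be illustrated with the standard UMass variant, which scores a topic's top words by how often they co-occur in documents. A minimal sketch (the toy corpus and word lists are hypothetical, and the paper may use a different coherence formulation; each scored word is assumed to appear in at least one document):

```python
from itertools import combinations
from math import log

def umass_coherence(topic_words, documents):
    """UMass topic coherence: sum of log((D(wi, wj) + 1) / D(wj)) over
    word pairs, where D counts documents containing the given word(s).
    Higher (closer to zero) means the topic's words co-occur more often."""
    docsets = [set(d) for d in documents]
    def df(*words):
        return sum(1 for d in docsets if all(w in d for w in words))
    return sum(log((df(wi, wj) + 1) / df(wj))
               for wi, wj in combinations(topic_words, 2))

docs = [["cat", "dog", "pet"], ["cat", "dog"], ["stock", "market"], ["dog", "pet"]]
coherent = umass_coherence(["cat", "dog", "pet"], docs)       # words co-occur
incoherent = umass_coherence(["cat", "market", "pet"], docs)  # words do not
```

A steerable system can recompute such scores after each user refinement and surface the lowest-coherence topics as editing candidates.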
2. Ying L, Shu X, Deng D, Yang Y, Tang T, Yu L, Wu Y. MetaGlyph: Automatic Generation of Metaphoric Glyph-based Visualization. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:331-341. PMID: 36179002. DOI: 10.1109/tvcg.2022.3209447.
Abstract
Glyph-based visualization achieves an impressive graphic design when paired with comprehensive visual metaphors, which help audiences effectively grasp the conveyed information by revealing data semantics. However, creating such a metaphoric glyph-based visualization (MGV) is not an easy task, as it requires not only a deep understanding of data but also professional design skills. This paper proposes MetaGlyph, an automatic system for generating MGVs from a spreadsheet. To develop MetaGlyph, we first conduct a qualitative analysis to understand the design of current MGVs from the perspectives of metaphor embodiment and glyph design. Based on the results, we introduce a novel framework for generating MGVs through metaphoric image selection and MGV construction. Specifically, MetaGlyph automatically selects metaphors, with corresponding images, from online resources based on the input data semantics. We then integrate a Monte Carlo tree search algorithm that explores the design of an MGV by associating visual elements with data dimensions, given the data importance, semantic relevance, and glyph non-overlap. The system also provides editing feedback that allows users to customize the MGVs according to their design preferences. We demonstrate the use of MetaGlyph through a set of examples and a usage scenario, and validate its effectiveness through a series of expert interviews.
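The element-to-dimension binding that the Monte Carlo tree search explores can be approximated, for intuition, by a much simpler greedy assignment over the same scoring ingredients (data importance and semantic relevance). Everything below, including the dimension and element names and the scores, is a hypothetical sketch rather than MetaGlyph's actual search:

```python
def assign_elements(importance, elements, score):
    """Greedily bind each data dimension (most important first) to the free
    visual element with the highest score. A simplified stand-in for a
    Monte Carlo tree search over glyph designs; a real search would also
    penalize glyph overlap and explore many alternative bindings."""
    free = set(elements)
    binding = {}
    for dim in sorted(importance, key=importance.get, reverse=True):
        best = max(free, key=lambda e: score(dim, e))
        binding[dim] = best
        free.discard(best)
    return binding

# Toy inputs: all names and numbers are illustrative.
importance = {"sales": 0.9, "region": 0.4}
relevance = {("sales", "bar_height"): 0.8, ("sales", "icon_color"): 0.2,
             ("region", "bar_height"): 0.3, ("region", "icon_color"): 0.7}
binding = assign_elements(importance, {"bar_height", "icon_color"},
                          lambda d, e: importance[d] * relevance[(d, e)])
```

Here the most important dimension claims the most semantically relevant element first, which mirrors the priority ordering the abstract describes.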
3. Sevastjanova R, El-Assady M, Bradley A, Collins C, Butt M, Keim D. VisInReport: Complementing Visual Discourse Analytics Through Personalized Insight Reports. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2022; 28:4757-4769. PMID: 34379592. DOI: 10.1109/tvcg.2021.3104026.
Abstract
We present VisInReport, a visual analytics tool that supports the manual analysis of discourse transcripts and generates reports based on user interaction. As an integral part of scholarly work in the social sciences and humanities, discourse analysis involves an aggregation of characteristics identified in the text, which, in turn, involves a prior identification of regions of particular interest. Manual data evaluation requires extensive effort, which can be a barrier to effective analysis. Our system addresses this challenge by augmenting the users' analysis with a set of automatically generated visualization layers. These layers enable the detection and exploration of relevant parts of the discussion supporting several tasks, such as topic modeling or question categorization. The system summarizes the extracted events visually and verbally, generating a content-rich insight into the data and the analysis process. During each analysis session, VisInReport builds a shareable report containing a curated selection of interactions and annotations generated by the analyst. We evaluate our approach on real-world datasets through a qualitative study with domain experts from political science, computer science, and linguistics. The results highlight the benefit of integrating the analysis and reporting processes through a visual analytics system, which supports the communication of results among collaborating researchers.
4. Wang Q, Chen Z, Wang Y, Qu H. A Survey on ML4VIS: Applying Machine Learning Advances to Data Visualization. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2022; 28:5134-5153. PMID: 34437063. DOI: 10.1109/tvcg.2021.3106142.
Abstract
Inspired by the great success of machine learning (ML), researchers have applied ML techniques to visualization to achieve better design, development, and evaluation of visualizations. This branch of studies, known as ML4VIS, has gained increasing research attention in recent years. To successfully adapt ML techniques for visualizations, a structured understanding of the integration of ML4VIS is needed. In this article, we systematically survey 88 ML4VIS studies, aiming to answer two motivating questions: "what visualization processes can be assisted by ML?" and "how can ML techniques be used to solve visualization problems?" This survey reveals seven main processes where the employment of ML techniques can benefit visualizations: Data Processing4VIS, Data-VIS Mapping, Insight Communication, Style Imitation, VIS Interaction, VIS Reading, and User Profiling. The seven processes are related to existing visualization theoretical models in an ML4VIS pipeline, aiming to illuminate the role of ML-assisted visualization in general. Meanwhile, the seven processes are mapped onto the main learning tasks in ML to align the capabilities of ML with the needs of visualization. Current practices and future opportunities of ML4VIS are discussed in the context of the ML4VIS pipeline and the ML-VIS mapping. While more studies are still needed in the area of ML4VIS, we hope this article can provide a stepping-stone for future exploration. A web-based interactive browser of this survey is available at https://ml4vis.github.io.
5. A Survey of Domain Knowledge Elicitation in Applied Machine Learning. MULTIMODAL TECHNOLOGIES AND INTERACTION 2021. DOI: 10.3390/mti5120073.
Abstract
Eliciting knowledge from domain experts can play an important role throughout the machine learning process, from correctly specifying the task to evaluating model results. However, knowledge elicitation is also fraught with challenges. In this work, we consider why and how machine learning researchers elicit knowledge from experts in the model development process. We develop a taxonomy to characterize elicitation approaches according to the elicitation goal, elicitation target, elicitation process, and use of elicited knowledge. We analyze the elicitation trends observed in 28 papers with this taxonomy and identify opportunities for adding rigor to these elicitation approaches. We suggest future directions for research in elicitation for machine learning by highlighting avenues for further exploration and drawing on what we can learn from elicitation research in other fields.
6. Takahashi S, Uchita A, Watanabe K, Arikawa M. Gaze-driven placement of items for proactive visual exploration. J Vis (Tokyo) 2021; 25:613-633. PMID: 34785979. PMCID: PMC8581132. DOI: 10.1007/s12650-021-00808-5. Received 05/03/2021; revised 09/11/2021; accepted 10/06/2021.
Abstract
Recent advances in digital signage technology have improved the ability to visually select specific items within a group. Although this is enabled by the ability to dynamically update the display of items, the corresponding layout schemes remain a subject of research. This paper explores the sophisticated layout of items that respects the underlying context of searching for favorite items. Our study begins by formulating the static placement of items as an optimization problem that incorporates aesthetic layout criteria as constraints. This is further extended to accommodate the dynamic placement of items for more proactive visual exploration based on the ongoing search context. Our animated layout is driven by analyzing the distribution of eye gaze through an eye-tracking device, from which we infer how the most attractive items lead to the ones ultimately wanted. We create a planar layout of items as a context map to establish association rules for dynamically replacing existing items with new ones. For this purpose, we extract a set of important topics from the annotated texts associated with the items using matrix factorization. We also conduct user studies to evaluate the validity of the design criteria incorporated into both the static and dynamic placement of items. We conclude by discussing the pros and cons of the proposed approach and possible themes for future research.
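The matrix-factorization step for extracting topics from the items' annotated texts can be sketched with a minimal non-negative matrix factorization using multiplicative updates; the tiny document-term matrix below is illustrative, and the paper's exact factorization variant may differ:

```python
import numpy as np

def nmf_topics(X, k, iters=300, seed=0):
    """Factor a nonnegative docs-by-terms matrix X into W (docs x topics)
    and H (topics x terms) with Lee-Seung multiplicative updates.
    Rows of H are topic-term weights; rows of W are document-topic mixtures,
    usable as coordinates for a planar context map of the items."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, k)) + 1e-3
    H = rng.random((k, m)) + 1e-3
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + 1e-9)
        W *= (X @ H.T) / (W @ H @ H.T + 1e-9)
    return W, H

# Toy corpus with two obvious themes (first two docs vs. last two).
X = np.array([[3, 2, 0, 0],
              [2, 3, 0, 0],
              [0, 0, 3, 2],
              [0, 0, 2, 3]], dtype=float)
W, H = nmf_topics(X, k=2)
```

After fitting, each item's dominant topic (the argmax of its row in W) groups thematically similar items together, which is the structure a context map needs.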
Affiliation(s)
- Shigeo Takahashi, Department of Computer Science and Engineering, University of Aizu, Aizu-Wakamatsu, 965-8580 Japan
- Akane Uchita, Department of Computer Science and Engineering, University of Aizu, Aizu-Wakamatsu, 965-8580 Japan
- Kazuho Watanabe, Department of Computer Science and Engineering, Toyohashi University of Technology, Toyohashi, 441-8580 Japan
- Masatoshi Arikawa, Graduate School of Engineering Science, Akita University, Akita, 010-8502 Japan
7. Kim H, Drake B, Endert A, Park H. ArchiText: Interactive Hierarchical Topic Modeling. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:3644-3655. PMID: 32191890. DOI: 10.1109/tvcg.2020.2981456.
Abstract
Human-in-the-loop topic modeling allows users to explore and steer the process to produce better-quality topics that align with their needs. When integrated into visual analytic systems, many existing automated topic modeling algorithms are given interactive parameters that allow users to tune or adjust them. However, this approach is limited when the algorithms cannot be easily adapted to changes, and it is difficult to realize interactivity that is closely supported by the underlying algorithms. Instead, we emphasize the concept of tight integration, which advocates co-developing interactive algorithms and interactive visual analytic systems in parallel to allow flexibility and scalability. In this article, we describe design goals for efficiently and effectively executing the concept of tight integration among computation, visualization, and interaction for hierarchical topic modeling of text data. We propose computational base operations for interactive tasks to achieve these design goals. To instantiate our concept, we present ArchiText, a prototype system for interactive hierarchical topic modeling, which offers fast, flexible, and algorithmically valid analysis via tight integration. Utilizing interactive hierarchical topic modeling, our technique lets users generate, explore, and flexibly steer hierarchical topics to discover more informed topics and their document memberships.
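One base operation such a tightly integrated system needs is splitting a topic node into child topics. A toy analogue using 2-means on document vectors (all data below is illustrative, and ArchiText's actual base operations are algorithmically richer):

```python
import numpy as np

def split(node_docs, X, seed=0, iters=20):
    """Base-operation sketch: split one topic node's documents into two child
    topics with plain 2-means on their row vectors. Returns the two child
    document lists; rerunning it on a child deepens the topic hierarchy."""
    rng = np.random.default_rng(seed)
    idx = np.array(node_docs)
    centers = X[rng.choice(idx, size=2, replace=False)].astype(float)
    for _ in range(iters):
        dists = np.linalg.norm(X[idx][:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for c in (0, 1):
            if (labels == c).any():
                centers[c] = X[idx[labels == c]].mean(axis=0)
    return [idx[labels == 0].tolist(), idx[labels == 1].tolist()]

# Four documents with two clear themes.
X = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
children = split([0, 1, 2, 3], X)
```

Because each operation works on one node's documents only, an interactive system can run it at the speed of a single user action instead of refitting the whole model.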
8. Yang C, Liu T, Yi W, Chen X, Niu B. Identifying expertise through semantic modeling: A modified BBPSO algorithm for the reviewer assignment problem. Appl Soft Comput 2020. DOI: 10.1016/j.asoc.2020.106483.
9. El-Assady M, Kehlbeck R, Collins C, Keim D, Deussen O. Semantic Concept Spaces: Guided Topic Model Refinement using Word-Embedding Projections. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:1001-1011. PMID: 31443000. DOI: 10.1109/tvcg.2019.2934654.
Abstract
We present a framework that allows users to incorporate the semantics of their domain knowledge for topic model refinement while remaining model-agnostic. Our approach enables users to (1) understand the semantic space of the model, (2) identify regions of potential conflicts and problems, and (3) readjust the semantic relation of concepts based on their understanding, directly influencing the topic modeling. These tasks are supported by an interactive visual analytics workspace that uses word-embedding projections to define concept regions which can then be refined. The user-refined concepts are independent of a particular document collection and can be transferred to related corpora. All user interactions within the concept space directly affect the semantic relations of the underlying vector space model, which, in turn, change the topic modeling. In addition to direct manipulation, our system guides the users' decision-making process through recommended interactions that point out potential improvements. This targeted refinement aims at minimizing the feedback required for an efficient human-in-the-loop process. We confirm the improvements achieved through our approach in two user studies that show topic model quality improvements through our visual knowledge externalization and learning process.
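The kind of user refinement described here, moving a word into a concept region, can be mimicked by interpolating the word's vector toward the concept centroid so that downstream vector-space topic modeling treats it as closer to that concept. A minimal sketch (the 2-D vectors, the words, and the `alpha` strength parameter are hypothetical, not the paper's actual update rule):

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def refine_toward_concept(vectors, word, concept_words, alpha=0.5):
    """Pull one word's embedding toward the centroid of a user-defined
    concept region. `alpha` controls the interpolation strength; 0 keeps
    the original vector, 1 snaps it onto the centroid."""
    centroid = np.mean([vectors[w] for w in concept_words], axis=0)
    vectors[word] = (1 - alpha) * vectors[word] + alpha * centroid

# Toy 2-D "embeddings": the ambiguous word "bank" sits between concepts.
vecs = {"bank": np.array([0.5, 0.5]), "money": np.array([1.0, 0.0]),
        "finance": np.array([0.9, 0.1]), "river": np.array([0.0, 1.0])}
before = cosine(vecs["bank"], vecs["money"])
refine_toward_concept(vecs, "bank", ["money", "finance"])
after = cosine(vecs["bank"], vecs["money"])
```

Because only the vectors change, such a refinement is corpus-independent, which matches the abstract's claim that refined concepts transfer to related corpora.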
10. Spinner T, Schlegel U, Schafer H, El-Assady M. explAIner: A Visual Analytics Framework for Interactive and Explainable Machine Learning. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:1064-1074. PMID: 31442998. DOI: 10.1109/tvcg.2019.2934629.
Abstract
We propose a framework for interactive and explainable machine learning that enables users to (1) understand machine learning models, (2) diagnose model limitations using different explainable AI methods, and (3) refine and optimize the models. Our framework combines an iterative XAI pipeline with eight global monitoring and steering mechanisms, including quality monitoring, provenance tracking, model comparison, and trust building. To operationalize the framework, we present explAIner, a visual analytics system for interactive and explainable machine learning that instantiates all phases of the suggested pipeline within the commonly used TensorBoard environment. We performed a user study with nine participants across different expertise levels to examine their perception of our workflow and to collect suggestions for closing the gap between our system and the framework. The evaluation confirms that our tightly integrated system leads to an informed machine learning process while disclosing opportunities for further extensions.