1. Ye Y, Xiao S, Zeng X, Zeng W. ModalChorus: Visual Probing and Alignment of Multi-Modal Embeddings via Modal Fusion Map. IEEE Transactions on Visualization and Computer Graphics 2025; 31:294-304. [PMID: 39250410] [DOI: 10.1109/tvcg.2024.3456387]
Abstract
Multi-modal embeddings, of which CLIP embeddings are the most widely used for text and images, form the foundation of vision-language models. However, these embeddings are vulnerable to subtle misalignment of cross-modal features, resulting in decreased model performance and diminished generalization. To address this problem, we design ModalChorus, an interactive system for the visual probing and alignment of multi-modal embeddings. ModalChorus offers a two-stage process: 1) embedding probing with the Modal Fusion Map (MFM), a novel parametric dimensionality reduction method that integrates both metric and nonmetric objectives to enhance modality fusion; and 2) embedding alignment, which allows users to interactively articulate intentions for both point-set and set-set alignments. Quantitative and qualitative comparisons on CLIP embeddings against existing dimensionality reduction (e.g., t-SNE and MDS) and data fusion (e.g., data context map) methods demonstrate the advantages of MFM in showcasing cross-modal features over common vision-language datasets. Case studies reveal that ModalChorus facilitates intuitive discovery of misalignment and efficient re-alignment in scenarios ranging from zero-shot classification to cross-modal retrieval and generation.
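MFM itself is the paper's contribution and is not reproduced here. As a hedged illustration of the probing idea, the sketch below projects stubbed CLIP-style image and text embeddings into a shared 2-D view with t-SNE, one of the baselines the paper compares against; all data and parameters are placeholders.

```python
# Minimal sketch: probing cross-modal (image + text) embeddings in a shared
# 2-D projection. t-SNE, one of the paper's baselines, stands in for MFM;
# the CLIP embeddings are stubbed with random data for illustration.
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
img_emb = rng.normal(size=(200, 512))   # placeholder for CLIP image embeddings
txt_emb = rng.normal(size=(50, 512))    # placeholder for CLIP text embeddings

# Project both modalities jointly so cross-modal neighborhoods are comparable.
joint = np.vstack([img_emb, txt_emb])
xy = TSNE(n_components=2, metric="cosine", init="pca",
          random_state=0).fit_transform(joint)

img_xy, txt_xy = xy[: len(img_emb)], xy[len(img_emb):]
# A probing view would now scatter img_xy and txt_xy with distinct markers and
# flag text points whose 2-D neighbors disagree with their embedding-space
# neighbors, hinting at cross-modal misalignment.
```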
2. Prasad V, van Sloun RJG, Vilanova A, Pezzotti N. ProactiV: Studying Deep Learning Model Behavior Under Input Transformations. IEEE Transactions on Visualization and Computer Graphics 2024; 30:5651-5665. [PMID: 37535493] [DOI: 10.1109/tvcg.2023.3301722]
Abstract
Deep learning (DL) models have shown performance benefits across many applications, from classification to image-to-image translation. However, low interpretability often leads to unexpected model behavior once they are deployed in the real world. Usually, this unexpected behavior arises because the training data domain does not reflect the deployment data domain. Identifying a model's breaking points under input conditions and domain shifts, i.e., input transformations, is essential to improving models. Although visual analytics (VA) has shown promise in studying the behavior of model outputs under continually varying inputs, existing methods mainly focus on per-class or instance-level analysis. We aim to generalize beyond classification, to tasks where classes do not exist, and to provide a global view of model behavior under co-occurring input transformations. We present a DL model-agnostic VA method, ProactiV, that helps model developers proactively study output behavior under input transformations to identify and verify breaking points. ProactiV relies on a proposed input-optimization method that determines the changes needed to a given transformed input to achieve a desired output. The data from this optimization process allow the study of global and local model behavior under input transformations at scale. Additionally, the optimization method provides insight into the input characteristics that result in desired outputs and helps recognize model biases. We highlight how ProactiV effectively supports studying model behavior with example classification and image-to-image translation tasks.
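A hedged sketch of the input-optimization idea the abstract describes, with a toy PyTorch model standing in for the real DL model and additive noise standing in for an input transformation; the exact objective used by ProactiV may differ.

```python
# Hedged sketch of the input-optimization idea: find the change (delta) to a
# transformed input that recovers the model's output on the original input.
# The model and the "transformation" (additive noise) are toy stand-ins.
import torch

model = torch.nn.Sequential(torch.nn.Linear(16, 8), torch.nn.ReLU(),
                            torch.nn.Linear(8, 4))
model.eval()

x = torch.randn(1, 16)                 # original input
x_t = x + 0.5 * torch.randn(1, 16)     # transformed (shifted-domain) input
with torch.no_grad():
    y_target = model(x)                # desired output

delta = torch.zeros_like(x_t, requires_grad=True)
opt = torch.optim.Adam([delta], lr=1e-2)
for _ in range(500):
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(model(x_t + delta), y_target)
    loss.backward()
    opt.step()

# The magnitude and structure of delta (plus the loss trajectory) is the
# per-input signal that, aggregated over many inputs and transformations,
# could support the global behavior views described in the paper.
print(delta.norm().item(), loss.item())
```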
3. Routray SK. Visualization and Visual Analytics in Autonomous Driving. IEEE Computer Graphics and Applications 2024; 44:43-53. [PMID: 38526907] [DOI: 10.1109/mcg.2024.3381450]
Abstract
Autonomous driving is no longer a topic of science fiction; autonomous driving technologies have now advanced to the point of reliability. Effectively harnessing the information these vehicles generate is essential for enhancing the safety, reliability, and efficiency of autonomous vehicles. In this article, we explore the pivotal role of visualization and visual analytics (VA) techniques in autonomous driving. By employing sophisticated data visualization methods and VA, researchers and practitioners transform intricate datasets into intuitive visual representations, providing valuable insights for decision-making processes. This article delves into various visualization approaches, including spatial-temporal mapping, interactive dashboards, and machine learning-driven analytics, tailored specifically for autonomous driving scenarios. Furthermore, it investigates the integration of real-time sensor data, sensor coordination with VA, and machine learning algorithms to create comprehensive visualizations. This research advocates for the central role of visualization and VA in shaping the future of autonomous driving systems, fostering innovation, and ensuring the safe integration of self-driving vehicles.
4. Wang Q, L'Yi S, Gehlenborg N. DRAVA: Aligning Human Concepts with Machine Learning Latent Dimensions for the Visual Exploration of Small Multiples. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI) 2023; 2023:833. [PMID: 38074525] [PMCID: PMC10707479] [DOI: 10.1145/3544548.3581127]
Abstract
Latent vectors extracted by machine learning (ML) are widely used in data exploration (e.g., t-SNE) but suffer from a lack of interpretability. While previous studies employed disentangled representation learning (DRL) to enable more interpretable exploration, they often overlooked the potential mismatches between the concepts of humans and the semantic dimensions learned by DRL. To address this issue, we propose Drava, a visual analytics system that supports users in 1) relating the concepts of humans with the semantic dimensions of DRL and identifying mismatches, 2) providing feedback to minimize the mismatches, and 3) obtaining data insights from concept-driven exploration. Drava provides a set of visualizations and interactions based on visual piles to help users understand and refine concepts and conduct concept-driven exploration. Meanwhile, Drava employs a concept adaptor model to fine-tune the semantic dimensions of DRL based on user refinement. The usefulness of Drava is demonstrated through application scenarios and experimental validation.
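A minimal sketch of what a "concept adaptor" could look like, assuming a logistic-regression mapping from DRL latent vectors to a user-defined concept that is refit as the user relabels examples; the model choice and names are illustrative assumptions, not DRAVA's actual implementation.

```python
# Hedged sketch of a "concept adaptor": a small model that maps DRL latent
# vectors to a user-named concept, refit whenever the user corrects examples.
# The logistic-regression choice and all names are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
latents = rng.normal(size=(300, 10))          # z-vectors from a DRL model

# User feedback: item index -> concept present (1) / absent (0)
feedback = {i: int(latents[i, 3] > 0) for i in range(0, 300, 10)}

idx = np.fromiter(feedback.keys(), dtype=int)
adaptor = LogisticRegression().fit(latents[idx],
                                   np.fromiter(feedback.values(), dtype=int))

# Concept scores for all items drive concept-based ordering and filtering of
# the small multiples; refitting after each feedback round reduces the
# human-machine concept mismatch.
scores = adaptor.predict_proba(latents)[:, 1]
print(scores[:5])
```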
Affiliation(s)
- Sehi L'Yi: Harvard Medical School, Boston, MA, USA
5. Bertucci D, Hamid MM, Anand Y, Ruangrotsakun A, Tabatabai D, Perez M, Kahng M. DendroMap: Visual Exploration of Large-Scale Image Datasets for Machine Learning with Treemaps. IEEE Transactions on Visualization and Computer Graphics 2023; 29:320-330. [PMID: 36166545] [DOI: 10.1109/tvcg.2022.3209425]
Abstract
In this paper, we present DendroMap, a novel approach to interactively exploring large-scale image datasets for machine learning (ML). ML practitioners often explore image datasets by generating a grid of images or by projecting high-dimensional representations of images into 2-D using dimensionality reduction techniques (e.g., t-SNE). However, neither approach scales effectively to large datasets, because images are ineffectively organized and interactions are insufficiently supported. To address these challenges, we develop DendroMap by adapting Treemaps, a well-known visualization technique. DendroMap effectively organizes images by extracting hierarchical cluster structures from their high-dimensional representations. It enables users to make sense of the overall distribution of a dataset and to interactively zoom into specific areas of interest at multiple levels of abstraction. Our case studies with widely used image datasets for deep learning demonstrate that users can discover insights about datasets and trained models by examining the diversity of images, identifying underperforming subgroups, and analyzing classification errors. We also conducted a user study evaluating the effectiveness of DendroMap in grouping and searching tasks; compared with a gridified version of t-SNE, participants preferred DendroMap. DendroMap is available at https://div-lab.github.io/dendromap/.
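A minimal sketch of the data-side step the abstract describes: building a cluster hierarchy over high-dimensional image representations and reading off nested groups at several zoom levels (these groups are what a treemap would render). The features are random stand-ins and the linkage choice is an assumption.

```python
# Hedged sketch of DendroMap's core data step: cluster high-dimensional image
# representations hierarchically, then cut the dendrogram at several levels
# of abstraction. Random features stand in for e.g. CNN embeddings.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(2)
feats = rng.normal(size=(500, 64))       # placeholder image representations

Z = linkage(feats, method="ward")        # agglomerative clustering
for k in (4, 16, 64):                    # coarse -> fine zoom levels
    labels = fcluster(Z, t=k, criterion="maxclust")
    sizes = np.bincount(labels)[1:]      # cluster sizes (labels are 1-based)
    print(k, "clusters, largest:", sorted(sizes, reverse=True)[:5])
```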
6. Hoque MN, He W, Shekar AK, Gou L, Ren L. Visual Concept Programming: A Visual Analytics Approach to Injecting Human Intelligence at Scale. IEEE Transactions on Visualization and Computer Graphics 2023; 29:74-83. [PMID: 36166533] [DOI: 10.1109/tvcg.2022.3209466]
Abstract
Data-centric AI has emerged as a new research area that systematically engineers the data used to land AI models in real-world applications. As a core method for data-centric AI, data programming helps experts inject domain knowledge into data and label data at scale using carefully designed labeling functions (e.g., heuristic rules, logistics). Though data programming has shown great success in the NLP domain, programming image data is challenging because of a) the difficulty of describing images with a visual vocabulary in the absence of human annotations and b) the lack of efficient tools for data programming of images. We present Visual Concept Programming, a first-of-its-kind visual analytics approach that uses visual concepts to program image data at scale while requiring little human effort. Our approach is built upon three unique components. It first uses a self-supervised learning approach to learn visual representations at the pixel level and extract a dictionary of visual concepts from images without using any human annotations. The visual concepts serve as building blocks of labeling functions through which experts inject their domain knowledge. We then design interactive visualizations to explore and understand visual concepts and to compose labeling functions from concepts without writing code. Finally, with the composed labeling functions, users can label image data at scale and use the labeled data to refine the pixel-wise visual representations and concept quality. We evaluate the learned pixel-wise visual representation on the downstream task of semantic segmentation to show the effectiveness and usefulness of our approach. In addition, we demonstrate how our approach tackles the real-world problem of image retrieval for autonomous driving.
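A hedged sketch of concept-based labeling functions with simple majority-vote aggregation (a common stand-in for a learned label model); the concepts, rules, and data below are illustrative assumptions, not the system's actual components.

```python
# Hedged sketch of concept-based data programming: each image is represented
# by which visual concepts appear in it, experts write labeling functions over
# those concepts, and votes are aggregated (majority vote stands in for a
# learned label model). Concepts, rules, and data are illustrative.
import numpy as np

CONCEPTS = ["wheel", "headlight", "window", "leaf"]
rng = np.random.default_rng(3)
has_concept = rng.random((100, len(CONCEPTS))) > 0.6   # image x concept matrix

ABSTAIN, NOT_CAR, CAR = -1, 0, 1

def lf_wheel_and_headlight(c):            # expert rule: car-like evidence
    return CAR if c[0] and c[1] else ABSTAIN

def lf_leaf_no_wheel(c):                  # expert rule: vegetation, not a car
    return NOT_CAR if c[3] and not c[0] else ABSTAIN

lfs = [lf_wheel_and_headlight, lf_leaf_no_wheel]
votes = np.array([[lf(c) for lf in lfs] for c in has_concept])

# Majority vote over non-abstaining labeling functions yields weak labels
# at scale, which can then refine the representations and concept quality.
labels = [np.bincount(v[v != ABSTAIN], minlength=2).argmax()
          if (v != ABSTAIN).any() else ABSTAIN for v in votes]
print(labels[:10])
```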
7. Zhang X, Ono JP, Song H, Gou L, Ma KL, Ren L. SliceTeller: A Data Slice-Driven Approach for Machine Learning Model Validation. IEEE Transactions on Visualization and Computer Graphics 2023; 29:842-852. [PMID: 36179005] [DOI: 10.1109/tvcg.2022.3209465]
Abstract
Real-world machine learning applications need to be thoroughly evaluated to meet critical product requirements for model release, to ensure fairness for different groups or individuals, and to achieve consistent performance in various scenarios. For example, in autonomous driving, an object classification model should achieve high detection rates under different conditions of weather, distance, etc. Similarly, in the financial setting, credit-scoring models must not discriminate against minority groups. These conditions or groups are called "data slices". In product MLOps cycles, product developers must identify such critical data slices and adapt models to mitigate data slice problems. Discovering where models fail, understanding why they fail, and mitigating these problems are therefore essential tasks in the MLOps life-cycle. In this paper, we present SliceTeller, a novel tool that allows users to debug, compare, and improve machine learning models driven by critical data slices. SliceTeller automatically discovers problematic slices in the data and helps the user understand why models fail. More importantly, we present an efficient algorithm, SliceBoosting, to estimate trade-offs when prioritizing the optimization of certain slices. Furthermore, our system empowers model developers to compare and analyze different model versions during model iterations, allowing them to choose the version best suited to their applications. We evaluate our system with three use cases, including two real-world use cases of product development, to demonstrate the power of SliceTeller in the debugging and improvement of product-quality ML models.
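A minimal sketch of the slice-discovery step, assuming simple single-feature metadata slices ranked by excess error rate; SliceTeller's actual discovery and SliceBoosting algorithms are more sophisticated, and the columns below are illustrative assumptions.

```python
# Hedged sketch of slice discovery: enumerate simple metadata slices and rank
# them by how much their error rate exceeds the overall rate. The dataset is
# synthetic, with an injected failure mode on fog + far objects.
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
df = pd.DataFrame({
    "weather": rng.choice(["clear", "rain", "fog"], 1000),
    "distance": rng.choice(["near", "far"], 1000),
    "error": rng.random(1000) < 0.1,
})
mask = (df.weather == "fog") & (df.distance == "far")
df.loc[mask, "error"] = rng.random(mask.sum()) < 0.4   # injected weak slice

overall = df.error.mean()
rows = []
for col in ("weather", "distance"):
    for val, grp in df.groupby(col):
        rows.append((f"{col}={val}", len(grp), grp.error.mean() - overall))
print(sorted(rows, key=lambda r: -r[2])[:3])  # most problematic slices first
```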
8. Wu A, Deng D, Cheng F, Wu Y, Liu S, Qu H. In Defence of Visual Analytics Systems: Replies to Critics. IEEE Transactions on Visualization and Computer Graphics 2023; 29:1026-1036. [PMID: 36179000] [DOI: 10.1109/tvcg.2022.3209360]
Abstract
The last decade has witnessed many visual analytics (VA) systems that have been successfully applied to wide-ranging domains such as urban analytics and explainable AI. However, their research rigor and contributions have been extensively challenged within the visualization community. We come to the defence of VA systems by contributing two interview studies that gather criticisms and responses to those criticisms. First, we interviewed 24 researchers to collect criticisms from the review comments on their VA work. Through an iterative coding and refinement process, the interview feedback was summarized into a list of 36 common criticisms. Second, we interviewed 17 researchers to validate our list and collect their responses, thereby discussing implications for defending and improving the scientific values and rigor of VA systems. We highlight that the presented knowledge is deep and extensive, but also imperfect, provocative, and controversial, and we thus recommend reading it with an inclusive and critical eye. We hope our work can provide thoughts and foundations for conducting VA research and spark discussions that move the field forward more rigorously and vibrantly.
9. Deng Z, Weng D, Liu S, Tian Y, Xu M, Wu Y. A survey of urban visual analytics: Advances and future directions. Computational Visual Media 2022; 9:3-39. [PMID: 36277276] [PMCID: PMC9579670] [DOI: 10.1007/s41095-022-0275-7]
Abstract
Developing effective visual analytics systems demands care in the characterization of domain problems and in the integration of visualization techniques and computational models. Urban visual analytics has already achieved remarkable success in tackling urban problems and providing fundamental services for smart cities. To promote further academic research and assist the development of industrial urban analytics systems, we comprehensively review urban visual analytics studies from four perspectives. In particular, we identify 8 urban domains and 22 types of popular visualizations, analyze 7 types of computational methods, and categorize existing systems into 4 types based on their integration of visualization techniques and computational models. We conclude with potential research directions and opportunities.
Affiliation(s)
- Zikun Deng: State Key Lab of CAD & CG, Zhejiang University, Hangzhou, 310058, China
- Di Weng: Microsoft Research Asia, Beijing, 100080, China
- Shuhan Liu: State Key Lab of CAD & CG, Zhejiang University, Hangzhou, 310058, China
- Yuan Tian: State Key Lab of CAD & CG, Zhejiang University, Hangzhou, 310058, China
- Mingliang Xu: School of Information Engineering, Zhengzhou University, Zhengzhou, China; Henan Institute of Advanced Technology, Zhengzhou University, Zhengzhou, 450001, China
- Yingcai Wu: State Key Lab of CAD & CG, Zhejiang University, Hangzhou, 310058, China