1
Deng D, Zhang C, Zheng H, Pu Y, Ji S, Wu Y. AdversaFlow: Visual Red Teaming for Large Language Models with Multi-Level Adversarial Flow. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2025; 31:492-502. [PMID: 39283796 DOI: 10.1109/tvcg.2024.3456150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/05/2025]
Abstract
Large Language Models (LLMs) are powerful but also raise significant security concerns, particularly regarding the harm they can cause, such as generating fake news that manipulates public opinion on social media and providing responses to unethical activities. Traditional red teaming approaches for identifying AI vulnerabilities rely on manual prompt construction and expertise. This paper introduces AdversaFlow, a novel visual analytics system designed to enhance LLM security against adversarial attacks through human-AI collaboration. AdversaFlow involves adversarial training between a target model and a red model, featuring unique multi-level adversarial flow and fluctuation path visualizations. These features provide insights into adversarial dynamics and LLM robustness, enabling experts to identify and mitigate vulnerabilities effectively. We present quantitative evaluations and case studies validating our system's utility and offering insights for future AI security solutions. Our method can enhance LLM security, supporting downstream scenarios like social media regulation by enabling more effective detection, monitoring, and mitigation of harmful content and behaviors.
2
Zhang Z, Yang F, Cheng R, Ma Y. ParetoTracker: Understanding Population Dynamics in Multi-Objective Evolutionary Algorithms Through Visual Analytics. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2025; 31:820-830. [PMID: 39255166 DOI: 10.1109/tvcg.2024.3456142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/12/2024]
Abstract
Multi-objective evolutionary algorithms (MOEAs) have emerged as powerful tools for solving complex optimization problems characterized by multiple, often conflicting, objectives. While advancements have been made in computational efficiency as well as diversity and convergence of solutions, a critical challenge persists: the internal evolutionary mechanisms are opaque to human users. Drawing upon the successes of explainable AI in explaining complex algorithms and models, we argue that the need to understand the underlying evolutionary operators and population dynamics within MOEAs aligns well with a visual analytics paradigm. This paper introduces ParetoTracker, a visual analytics framework designed to support the comprehension and inspection of population dynamics in the evolutionary processes of MOEAs. Informed by preliminary literature review and expert interviews, the framework establishes a multi-level analysis scheme, which caters to user engagement and exploration ranging from examining overall trends in performance metrics to conducting fine-grained inspections of evolutionary operations. In contrast to conventional practices that require manual plotting of solutions for each generation, ParetoTracker facilitates the examination of temporal trends and dynamics across consecutive generations in an integrated visual interface. The effectiveness of the framework is demonstrated through case studies and expert interviews focused on widely adopted benchmark optimization problems.
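As a rough illustration of the per-generation bookkeeping that such an analysis builds on, the sketch below extracts the non-dominated (Pareto) set from each generation of a toy population. It assumes minimization on every objective; the function names and the sample data are hypothetical and are not taken from ParetoTracker.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b on every objective and
    strictly better on at least one (all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(population: List[Point]) -> List[Point]:
    """Keep only the solutions that no other solution dominates."""
    return [p for p in population
            if not any(dominates(q, p) for q in population)]


# Hypothetical two-objective populations for two consecutive generations.
generations = [
    [(4.0, 4.0), (3.0, 5.0), (5.0, 3.0), (5.0, 5.0)],
    [(2.0, 4.0), (3.0, 3.0), (4.0, 2.0), (4.0, 4.0)],
]

# One front per generation; a visual tool could plot these side by side.
fronts = [pareto_front(g) for g in generations]
front_sizes = [len(f) for f in fronts]
```

The quadratic scan is adequate for small populations; production MOEA libraries typically use faster non-dominated sorting.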
3
Li G, Wang J, Wang Y, Shan G, Zhao Y. An In-Situ Visual Analytics Framework for Deep Neural Networks. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2024; 30:6770-6786. [PMID: 38051629 DOI: 10.1109/tvcg.2023.3339585] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/07/2023]
Abstract
The past decade has witnessed the superior power of deep neural networks (DNNs) in applications across various domains. However, training a high-quality DNN remains a non-trivial task due to its massive number of parameters. Visualization has shown great potential in addressing this situation, as evidenced by numerous recent visualization works that aid in DNN training and interpretation. These works commonly employ a strategy of logging training-related data and conducting post-hoc analysis. Based on the results of offline analysis, the model can be further trained or fine-tuned. This strategy, however, does not cope with the increasing complexity of DNNs, because (1) the time-series data collected over the training are usually too large to be stored entirely; (2) the huge I/O overhead significantly impacts the training efficiency; (3) post-hoc analysis does not allow rapid human-interventions (e.g., stop training with improper hyper-parameter settings to save computational resources). To address these challenges, we propose an in-situ visualization and analysis framework for the training of DNNs. Specifically, we employ feature extraction algorithms to reduce the size of training-related data in-situ and use the reduced data for real-time visual analytics. The states of model training are disclosed to model designers in real-time, enabling human interventions on demand to steer the training. Through concrete case studies, we demonstrate how our in-situ framework helps deep learning experts optimize DNNs and improve their analysis efficiency.
4
Xie L, Ouyang Y, Chen L, Wu Z, Li Q. Towards Better Modeling With Missing Data: A Contrastive Learning-Based Visual Analytics Perspective. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2024; 30:5129-5146. [PMID: 37310838 DOI: 10.1109/tvcg.2023.3285210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Missing data can pose a challenge for machine learning (ML) modeling. To address this, current approaches are categorized into feature imputation and label prediction and are primarily focused on handling missing data to enhance ML performance. These approaches rely on the observed data to estimate the missing values and therefore encounter three main shortcomings in imputation, including the need for different imputation methods for various missing data mechanisms, heavy dependence on the assumption of data distribution, and potential introduction of bias. This study proposes a Contrastive Learning (CL) framework to model observed data with missing values, where the ML model learns the similarity between an incomplete sample and its complete counterpart and the dissimilarity between other samples. Our proposed approach demonstrates the advantages of CL without requiring any imputation. To enhance interpretability, we introduce CIVis, a visual analytics system that incorporates interpretable techniques to visualize the learning process and diagnose the model status. Users can leverage their domain knowledge through interactive sampling to identify negative and positive pairs in CL. The output of CIVis is an optimized model that takes specified features and predicts downstream tasks. We provide two usage scenarios in regression and classification tasks and conduct quantitative experiments, expert interviews, and a qualitative user study to demonstrate the effectiveness of our approach. In short, this study offers a valuable contribution to addressing the challenges associated with ML modeling in the presence of missing data by providing a practical solution that achieves high predictive accuracy and model interpretability.
5
Almeghlef SM, AL-Ghamdi AALM, Ramzan MS, Ragab M. Application Layer-Based Denial-of-Service Attacks Detection against IoT-CoAP. ELECTRONICS 2023; 12:2563. [DOI: 10.3390/electronics12122563] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/28/2024]
Abstract
The Internet of Things (IoT) is a massive network of tiny devices connected to each other and to the internet. Each connected device is uniquely identified in this network through a dedicated IP address and can share information with other devices. In contrast to its alternatives, IoT consumes less power and fewer resources; however, this makes its devices more vulnerable to different types of attacks, as they cannot execute heavy security protocols. Moreover, heavy protocols traditionally used for web-based communication, such as the Hypertext Transfer Protocol (HTTP), are too costly to execute on IoT devices, and thus specially designed lightweight protocols, such as the Constrained Application Protocol (CoAP), are employed for this purpose. However, while CoAP remains widely used, it is also susceptible to attacks, such as the Distributed Denial-of-Service (DDoS) attack, which aims to overwhelm the resources of the target and make them unavailable to legitimate users. While protocols such as Datagram Transport Layer Security (DTLS) and the Lightweight and Secure Protocol for Wireless Sensor Network (LSPWSN) can help secure CoAP against DDoS attacks, they also have their limitations. DTLS is not designed for constrained devices and is considered a heavy protocol. LSPWSN, on the other hand, operates on the network layer, in contrast to CoAP, which operates on the application layer. This paper presents a machine learning model, using the CIDAD dataset (created on 11 July 2022), that can detect DDoS attacks against CoAP with an accuracy of 98%.
Affiliation(s)
- Sultan M. Almeghlef
  - Information Systems Department, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia
- Abdullah AL-Malaise AL-Ghamdi
  - Information Systems Department, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia
  - Information Systems Department, HECI School, Dar Al-Hekma University, Jeddah 34801, Saudi Arabia
- Muhammad Sher Ramzan
  - Information Systems Department, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia
- Mahmoud Ragab
  - Information Technology Department, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia
  - Mathematics Department, Faculty of Science, Al-Azhar University, Naser City 11884, Cairo, Egypt
6
Collaris D, van Wijk JJ. StrategyAtlas: Strategy Analysis for Machine Learning Interpretability. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:2996-3008. [PMID: 35085084 DOI: 10.1109/tvcg.2022.3146806] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Businesses in high-risk environments have been reluctant to adopt modern machine learning approaches due to their complex and uninterpretable nature. Most current solutions provide local, instance-level explanations, but this is insufficient for understanding the model as a whole. In this work, we show that strategy clusters (i.e., groups of data instances that are treated distinctly by the model) can be used to understand the global behavior of a complex ML model. To support effective exploration and understanding of these clusters, we introduce StrategyAtlas, a system designed to analyze and explain model strategies. Furthermore, it supports multiple ways to utilize these strategies for simplifying and improving the reference model. In collaboration with a large insurance company, we present a use case in automatic insurance acceptance, and show how professional data scientists were enabled to understand a complex model and improve the production model based on these insights.
7
Jin S, Lee H, Park C, Chu H, Tae Y, Choo J, Ko S. A Visual Analytics System for Improving Attention-based Traffic Forecasting Models. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:1102-1112. [PMID: 36155438 DOI: 10.1109/tvcg.2022.3209462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
With deep learning (DL) outperforming conventional methods on many tasks, much effort has been devoted to applying DL in various domains. Researchers and developers in the traffic domain have also designed and improved DL models for forecasting tasks such as estimating traffic speed and time of arrival. However, analyzing these models remains challenging due to their black-box nature and the complexity of traffic data (i.e., spatio-temporal dependencies). Collaborating with domain experts, we design a visual analytics system, AttnAnalyzer, that enables users to explore how DL models make predictions by allowing effective spatio-temporal dependency analysis. The system incorporates dynamic time warping (DTW) and Granger causality tests for computational spatio-temporal dependency analysis, while providing map, table, line chart, and pixel views to assist users in performing dependency and model behavior analysis. For the evaluation, we present three case studies showing how AttnAnalyzer can effectively explore model behaviors and improve model performance in two different road networks. We also provide domain expert feedback.
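For readers unfamiliar with dynamic time warping, the following is a minimal sketch of the classic DTW distance between two series, using absolute difference as the local cost and no warping-window constraint. The abstract does not state which DTW variant or parameters AttnAnalyzer uses, so the function name and the toy speed readings are assumptions for illustration only.

```python
from typing import Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic O(len(a) * len(b)) dynamic time warping distance.

    cost[i][j] holds the cheapest cumulative cost of aligning the first
    i samples of a with the first j samples of b.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: advance a only, advance b only, advance both.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical speed readings: the downstream sensor sees the same
# slowdown one time step later than the upstream sensor.
upstream = [60.0, 58.0, 30.0, 28.0, 55.0]
downstream = [60.0, 60.0, 58.0, 30.0, 28.0, 55.0]
lagged_distance = dtw_distance(upstream, downstream)
```

Because DTW may repeat samples when aligning, a pure time shift like the one above costs nothing, which is why the measure suits comparing road segments whose congestion patterns are offset in time.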
8
Wang J, Zhang W, Yang H, Yeh CCM, Wang L. Visual Analytics for RNN-Based Deep Reinforcement Learning. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2022; 28:4141-4155. [PMID: 33929961 DOI: 10.1109/tvcg.2021.3076749] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/12/2023]
Abstract
Deep reinforcement learning (DRL) aims to train an autonomous agent to interact with a pre-defined environment and achieve specific goals through deep neural networks (DNNs). Recurrent neural network (RNN) based DRL has demonstrated superior performance, as RNNs can effectively capture the temporal evolution of the environment and respond with proper agent actions. However, apart from the outstanding performance, little is known about how RNNs understand the environment internally and what has been memorized over time. Revealing these details is extremely important for deep learning experts to understand and improve DRL models, yet it is challenging due to the complicated data transformations inside these models. In this article, we propose Deep Reinforcement Learning Interactive Visual Explorer (DRLIVE), a visual analytics system to effectively explore, interpret, and diagnose RNN-based DRL models. Focusing on DRL agents trained for different Atari games, DRLIVE accomplishes three tasks: game episode exploration, RNN hidden/cell state examination, and interactive model perturbation. Using the system, one can flexibly explore a DRL agent through interactive visualizations, discover interpretable RNN cells by prioritizing RNN hidden/cell states with a set of metrics, and further diagnose the DRL model by interactively perturbing its inputs. Through concrete studies with multiple deep learning experts, we validated the efficacy of DRLIVE.
9
Goh HA, Ho CK, Abas FS. Front-end deep learning web apps development and deployment: a review. APPL INTELL 2022; 53:15923-15945. [PMID: 36466774 PMCID: PMC9709375 DOI: 10.1007/s10489-022-04278-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/17/2022] [Indexed: 12/03/2022]
Abstract
Machine learning and deep learning models are commonly developed using programming languages such as Python, C++, or R and deployed as web apps delivered from a back-end server or as mobile apps installed from an app store. Recently, however, front-end technologies and JavaScript libraries, such as TensorFlow.js, have been introduced to make machine learning more accessible to researchers and end-users. Using JavaScript, TensorFlow.js can define, train, and run new or existing, pre-trained machine learning models entirely in the browser on the client side, which improves the user experience through interaction while preserving privacy. Deep learning models deployed in front-end browsers must be small, have fast inference, and ideally be interactive in real time. Therefore, the emphasis in development and deployment is different. This paper aims to review the development and deployment of these deep learning web apps to raise awareness of the recent advancements and encourage more researchers to take advantage of this technology for their own work. First, the rationale behind the deployment stack (front-end, JavaScript, and TensorFlow.js) is discussed. Then, the development approach for obtaining deep learning models that are optimized and suitable for front-end deployment is described. The article also presents current web applications divided into seven categories to show deep learning potential on the front end: deep learning playground; pose detection and gesture tracking; music and art creation; expression detection and facial recognition; video segmentation; image and signal analysis; and healthcare diagnosis, recognition, and identification.
Affiliation(s)
- Hock-Ann Goh
  - Faculty of Engineering and Technology, Multimedia University, Jalan Ayer Keroh Lama, Bukit Beruang, 75450 Melaka, Malaysia
- Chin-Kuan Ho
  - Asia Pacific University of Technology and Innovation, Jalan Teknologi 5, Technology Park Malaysia, 57000 Kuala Lumpur, Malaysia
- Fazly Salleh Abas
  - Faculty of Engineering and Technology, Multimedia University, Jalan Ayer Keroh Lama, Bukit Beruang, 75450 Melaka, Malaysia
10
Hinterreiter A, Ruch P, Stitz H, Ennemoser M, Bernard J, Strobelt H, Streit M. ConfusionFlow: A Model-Agnostic Visualization for Temporal Analysis of Classifier Confusion. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2022; 28:1222-1236. [PMID: 32746284 DOI: 10.1109/tvcg.2020.3012063] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 06/11/2023]
Abstract
Classifiers are among the most widely used supervised machine learning algorithms. Many classification models exist, and choosing the right one for a given task is difficult. During model selection and debugging, data scientists need to assess classifiers' performances, evaluate their learning behavior over time, and compare different models. Typically, this analysis is based on single-number performance measures such as accuracy. A more detailed evaluation of classifiers is possible by inspecting class errors. The confusion matrix is an established way for visualizing these class errors, but it was not designed with temporal or comparative analysis in mind. More generally, established performance analysis systems do not allow a combined temporal and comparative analysis of class-level information. To address this issue, we propose ConfusionFlow, an interactive, comparative visualization tool that combines the benefits of class confusion matrices with the visualization of performance characteristics over time. ConfusionFlow is model-agnostic and can be used to compare performances for different model types, model architectures, and/or training and test datasets. We demonstrate the usefulness of ConfusionFlow in a case study on instance selection strategies in active learning. We further assess the scalability of ConfusionFlow and present a use case in the context of neural network pruning.
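To make the idea of class-level confusion over time concrete, here is a small sketch that builds one confusion matrix per training epoch and then reads out a single cell as a time series. It is an assumed, simplified data layout for illustration and does not reproduce ConfusionFlow's implementation; all names and sample predictions are hypothetical.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     n_classes: int) -> List[List[int]]:
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def confusion_over_time(y_true: Sequence[int],
                        preds_per_epoch: Sequence[Sequence[int]],
                        n_classes: int,
                        true_class: int,
                        pred_class: int) -> List[int]:
    """Track one matrix cell (true_class predicted as pred_class) per epoch."""
    return [confusion_matrix(y_true, preds, n_classes)[true_class][pred_class]
            for preds in preds_per_epoch]


# Hypothetical labels and predictions logged after three epochs.
y_true = [0, 0, 1, 1, 2, 2]
preds_per_epoch = [
    [1, 1, 1, 0, 2, 1],  # epoch 0
    [0, 1, 1, 1, 2, 1],  # epoch 1
    [0, 0, 1, 1, 2, 2],  # epoch 2
]

# How often class 0 was mistaken for class 1 at each epoch.
zero_as_one = confusion_over_time(y_true, preds_per_epoch, 3,
                                  true_class=0, pred_class=1)
```

Computing the same series for a second model on the same labels gives the kind of comparative, per-cell curves that a single static confusion matrix cannot show.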
11
Saeed AQ, Sheikh Abdullah SNH, Che-Hamzah J, Abdul Ghani AT. Accuracy of Using Generative Adversarial Networks for Glaucoma Detection During the COVID-19 Pandemic: A Systematic Review and Bibliometric Analysis. J Med Internet Res 2021; 23:e27414. [PMID: 34236992 PMCID: PMC8493455 DOI: 10.2196/27414] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 01/26/2021] [Revised: 05/11/2021] [Accepted: 07/05/2021] [Indexed: 01/19/2023] Open
Abstract
Background: Glaucoma leads to irreversible blindness. Globally, it is the second most common retinal disease that leads to blindness, slightly less common than cataracts. Therefore, there is a great need to avoid the silent growth of this disease using recently developed generative adversarial networks (GANs).
Objective: This paper aims to introduce a GAN technology for the diagnosis of eye disorders, particularly glaucoma. This paper illustrates deep adversarial learning as a potential diagnostic tool and the challenges involved in its implementation. This study describes and analyzes many of the pitfalls and problems that researchers will need to overcome to implement this kind of technology.
Methods: To organize this review comprehensively, articles and reviews were collected using the following keywords: (“Glaucoma,” “optic disc,” “blood vessels”) and (“receptive field,” “loss function,” “GAN,” “Generative Adversarial Network,” “Deep learning,” “CNN,” “convolutional neural network” OR encoder). The records were identified from 5 highly reputed databases: IEEE Xplore, Web of Science, Scopus, ScienceDirect, and PubMed. These libraries broadly cover the technical and medical literature. Publications within the last 5 years, specifically 2015-2020, were included because the target GAN technique was invented only in 2014 and the publishing date of the collected papers was not earlier than 2016. Duplicate records were removed, and irrelevant titles and abstracts were excluded. In addition, we excluded papers that used optical coherence tomography and visual field images, except for those with 2D images. A large-scale systematic analysis was performed, and then a summarized taxonomy was generated. Furthermore, the results of the collected articles were summarized and a visual representation of the results was presented on a T-shaped matrix diagram. This study was conducted between March 2020 and November 2020.
Results: We found 59 articles after conducting a comprehensive survey of the literature. Among the 59 articles, 30 present actual attempts to synthesize images and provide accurate segmentation/classification using single/multiple landmarks or share certain experiences. The other 29 articles discuss the recent advances in GANs, do practical experiments, and contain analytical studies of retinal disease.
Conclusions: Recent deep learning techniques, namely GANs, have shown encouraging performance in retinal disease detection. Although this methodology involves an extensive computing budget and optimization process, it saturates the greedy nature of deep learning techniques by synthesizing images and solves major medical issues. This paper contributes to this research field by offering a thorough analysis of existing works, highlighting current limitations, and suggesting alternatives to support other researchers and participants in further improving and strengthening future work. Finally, new directions for this research have been identified.
Affiliation(s)
- Ali Q Saeed
  - Faculty of Information Science & Technology (FTSM), Universiti Kebangsaan Malaysia (UKM), 43600 Bangi, Selangor, Malaysia
  - Computer Center, Northern Technical University, Ninevah, Iraq
- Siti Norul Huda Sheikh Abdullah
  - Faculty of Information Science & Technology (FTSM), Universiti Kebangsaan Malaysia (UKM), 43600 Bangi, Selangor, Malaysia
- Jemaima Che-Hamzah
  - Department of Ophthalmology, Faculty of Medicine, Universiti Kebangsaan Malaysia (UKM), Cheras, Kuala Lumpur, Malaysia
- Ahmad Tarmizi Abdul Ghani
  - Faculty of Information Science & Technology (FTSM), Universiti Kebangsaan Malaysia (UKM), 43600 Bangi, Selangor, Malaysia
12
Cao K, Liu M, Su H, Wu J, Zhu J, Liu S. Analyzing the Noise Robustness of Deep Neural Networks. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:3289-3304. [PMID: 31985427 DOI: 10.1109/tvcg.2020.2969185] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 06/10/2023]
Abstract
Adversarial examples, generated by adding small but intentionally imperceptible perturbations to normal examples, can mislead deep neural networks (DNNs) to make incorrect predictions. Although much work has been done on both adversarial attack and defense, a fine-grained understanding of adversarial examples is still lacking. To address this issue, we present a visual analysis method to explain why adversarial examples are misclassified. The key is to compare and analyze the datapaths of both the adversarial and normal examples. A datapath is a group of critical neurons along with their connections. We formulate the datapath extraction as a subset selection problem and solve it by constructing and training a neural network. A multi-level visualization consisting of a network-level visualization of data flows, a layer-level visualization of feature maps, and a neuron-level visualization of learned features, has been designed to help investigate how datapaths of adversarial and normal examples diverge and merge in the prediction process. A quantitative evaluation and a case study were conducted to demonstrate the promise of our method to explain the misclassification of adversarial examples.
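The abstract does not name a specific attack, so as a stand-in the sketch below uses the fast gradient sign method (FGSM) on a tiny logistic-regression model to show how a small, bounded perturbation can flip a prediction. The weights, input, and step size are made-up values, and nothing here reflects the paper's datapath extraction or its visualization.

```python
import math
from typing import List


def predict(w: List[float], b: float, x: List[float]) -> float:
    """Probability of class 1 under a logistic-regression model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def sign(v: float) -> float:
    return float((v > 0) - (v < 0))


def fgsm(w: List[float], b: float, x: List[float],
         y: int, eps: float) -> List[float]:
    """Shift each feature by eps in the direction that raises the loss.

    For cross-entropy loss on a logistic model, the gradient with
    respect to the input is (p - y) * w.
    """
    p = predict(w, b, x)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


# Hypothetical model and a correctly classified input with label 1.
w, b = [2.0, -1.0], 0.0
x, y = [0.3, 0.2], 1
eps = 0.2

x_adv = fgsm(w, b, x, y, eps)
p_clean = predict(w, b, x)      # above 0.5, so classified as 1
p_adv = predict(w, b, x_adv)    # below 0.5, so classified as 0
```

Each feature moves by at most `eps`, yet the predicted class changes; on image classifiers the same idea is applied per pixel with a much smaller step.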
13
Sun G, Wu H, Zhu L, Xu C, Liang H, Xu B, Liang R. VSumVis: Interactive Visual Understanding and Diagnosis of Video Summarization Model. ACM T INTEL SYST TEC 2021. [DOI: 10.1145/3458928] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/21/2022]
Abstract
With the rapid development of the mobile Internet, the popularity of video capture devices has brought a surge in multimedia video resources. Using machine learning methods combined with well-designed features, we can automatically obtain video summaries to ease video resource consumption and retrieval issues. However, there is always a gap between the summaries produced by the model and those annotated by users. Helping users understand this difference, gain insights for improving the model, and build trust in the model remains challenging. To address these challenges, we propose VSumVis, a visual analysis system developed under a user-centered design methodology, with multi-feature examination and multi-level exploration, which helps users explore and analyze video content as well as the intrinsic relationships within our video summarization model. The system contains multiple coordinated views, i.e., video view, projection view, detail view, and sequential frames view. A multi-level analysis process that integrates video events and frames is presented through cluster and node visualizations in our system. Temporal patterns concerning the difference between the manual annotation score and the saliency score produced by our model are further investigated and distinguished with the sequential frames view. Moreover, we propose a set of rich user interactions that enable an in-depth, multi-faceted analysis of the features in our video summarization model. We conduct case studies and interviews with domain experts to provide anecdotal evidence about the effectiveness of our approach. Quantitative feedback from a user study confirms the usefulness of our visual system for exploring the video summarization model.
Affiliation(s)
- Guodao Sun
  - Zhejiang University of Technology, Hangzhou, China
- Hao Wu
  - Zhejiang University of Technology, Hangzhou, China
- Lin Zhu
  - Zhejiang University of Technology, Hangzhou, China
- Chaoqing Xu
  - Zhejiang University of Technology, Hangzhou, China
- Haoran Liang
  - Zhejiang University of Technology, Hangzhou, China
- Binwei Xu
  - Zhejiang University of Technology, Hangzhou, China
14
Bauerle A, van Onzenoodt C, Ropinski T. Net2Vis - A Visual Grammar for Automatically Generating Publication-Tailored CNN Architecture Visualizations. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:2980-2991. [PMID: 33556010 DOI: 10.1109/tvcg.2021.3057483] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 06/12/2023]
Abstract
To convey neural network architectures in publications, appropriate visualizations are of great importance. While most current deep learning papers contain such visualizations, these are usually handcrafted just before publication, which results in a lack of a common visual grammar, significant time investment, errors, and ambiguities. Current automatic network visualization tools focus on debugging the network itself and are not ideal for generating publication visualizations. Therefore, we present an approach to automate this process by translating network architectures specified in Keras into visualizations that can directly be embedded into any publication. To do so, we propose a visual grammar for convolutional neural networks (CNNs), which has been derived from an analysis of such figures extracted from all ICCV and CVPR papers published between 2013 and 2019. The proposed grammar incorporates visual encoding, network layout, layer aggregation, and legend generation. We have further realized our approach in an online system available to the community, which we have evaluated through expert feedback, and a quantitative study. It not only reduces the time needed to generate network visualizations for publications, but also enables a unified and unambiguous visualization design.
15
Kaluarachchi T, Reis A, Nanayakkara S. A Review of Recent Deep Learning Approaches in Human-Centered Machine Learning. SENSORS (BASEL, SWITZERLAND) 2021; 21:2514. [PMID: 33916850 PMCID: PMC8038476 DOI: 10.3390/s21072514] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 02/11/2021] [Revised: 03/14/2021] [Accepted: 03/30/2021] [Indexed: 11/22/2022]
Abstract
After Deep Learning (DL) regained popularity recently, the Artificial Intelligence (AI) or Machine Learning (ML) field is undergoing rapid growth concerning research and real-world application development. Deep Learning has generated complexities in algorithms, and researchers and users have raised concerns regarding the usability and adoptability of Deep Learning systems. These concerns, coupled with the increasing human-AI interactions, have created the emerging field that is Human-Centered Machine Learning (HCML). We present this review paper as an overview and analysis of existing work in HCML related to DL. Firstly, we collaborated with field domain experts to develop a working definition for HCML. Secondly, through a systematic literature review, we analyze and classify 162 publications that fall within HCML. Our classification is based on aspects including contribution type, application area, and focused human categories. Finally, we analyze the topology of the HCML landscape by identifying research gaps, highlighting conflicting interpretations, addressing current challenges, and presenting future HCML research opportunities.
Affiliation(s)
- Tharindu Kaluarachchi
- Augmented Human Lab, Auckland Bioengineering Institute, The University of Auckland, 70 Symonds Street, Grafton, Auckland 1010, New Zealand; (A.R.); (S.N.)
|
16
|
Wang ZJ, Turko R, Shaikh O, Park H, Das N, Hohman F, Kahng M, Polo Chau DH. CNN Explainer: Learning Convolutional Neural Networks with Interactive Visualization. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:1396-1406. [PMID: 33048723 DOI: 10.1109/tvcg.2020.3030418] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Indexed: 05/15/2023]
Abstract
Deep learning's great success motivates many practitioners and students to learn about this exciting technology. However, it is often challenging for beginners to take their first step due to the complexity of understanding and applying deep learning. We present CNN Explainer, an interactive visualization tool designed for non-experts to learn and examine convolutional neural networks (CNNs), a foundational deep learning model architecture. Our tool addresses key challenges that novices face while learning about CNNs, which we identify from interviews with instructors and a survey with past students. CNN Explainer tightly integrates a model overview that summarizes a CNN's structure, and on-demand, dynamic visual explanation views that help users understand the underlying components of CNNs. Through smooth transitions across levels of abstraction, our tool enables users to inspect the interplay between low-level mathematical operations and high-level model structures. A qualitative user study shows that CNN Explainer helps users more easily understand the inner workings of CNNs, and is engaging and enjoyable to use. We also derive design lessons from our study. Developed using modern web technologies, CNN Explainer runs locally in users' web browsers without the need for installation or specialized hardware, broadening the public's education access to modern deep learning techniques.
|
17
|
Huang X, Jamonnak S, Zhao Y, Wang B, Hoai M, Yager K, Xu W. Interactive Visual Study of Multiple Attributes Learning Model of X-Ray Scattering Images. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:1312-1321. [PMID: 33104509 DOI: 10.1109/tvcg.2020.3030384] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 06/11/2023]
Abstract
Existing interactive visualization tools for deep learning are mostly applied to the training, debugging, and refinement of neural network models working on natural images. However, visual analytics tools are lacking for the specific application of x-ray image classification with multiple structural attributes. In this paper, we present an interactive system for domain scientists to visually study the multiple attributes learning models applied to x-ray scattering images. It allows domain scientists to interactively explore this important type of scientific image in embedded spaces that are defined on the model prediction output, the actual labels, and the discovered feature space of neural networks. Users can flexibly select instance images and their clusters and compare them regarding the specified visual representation of attributes. The exploration is guided by the manifestation of model performance related to mutual relationships among attributes, which often affect the learning accuracy and effectiveness. The system thus helps domain scientists improve the training dataset and model, find questionable attribute labels, and identify outlier images or spurious data clusters. Case studies and scientists' feedback demonstrate its functionalities and usefulness.
|
18
|
Li G, Wang J, Shen HW, Chen K, Shan G, Lu Z. CNNPruner: Pruning Convolutional Neural Networks with Visual Analytics. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:1364-1373. [PMID: 33048744 DOI: 10.1109/tvcg.2020.3030461] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 06/11/2023]
Abstract
Convolutional neural networks (CNNs) have demonstrated extraordinarily good performance in many computer vision tasks. The increasing size of CNN models, however, prevents them from being widely deployed to devices with limited computational resources, e.g., mobile/embedded devices. The emerging topic of model pruning strives to address this problem by removing less important neurons and fine-tuning the pruned networks to minimize the accuracy loss. Nevertheless, existing automated pruning solutions often rely on a numerical threshold of the pruning criteria, lacking the flexibility to optimally balance the trade-off between efficiency and accuracy. Moreover, the complicated interplay between the stages of neuron pruning and model fine-tuning makes this process opaque and therefore difficult to optimize. In this paper, we address these challenges through a visual analytics approach named CNNPruner. It considers the importance of convolutional filters through both instability and sensitivity, and allows users to interactively create pruning plans according to a desired goal on model size or accuracy. Also, CNNPruner integrates state-of-the-art filter visualization techniques to help users understand the roles that different filters play and refine their pruning plans. Through comprehensive case studies on CNNs with real-world sizes, we validate the effectiveness of CNNPruner.
|
19
|
Ma Y, Fan A, He J, Nelakurthi AR, Maciejewski R. A Visual Analytics Framework for Explaining and Diagnosing Transfer Learning Processes. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:1385-1395. [PMID: 33035164 DOI: 10.1109/tvcg.2020.3028888] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/11/2023]
Abstract
Many statistical learning models hold an assumption that the training data and the future unlabeled data are drawn from the same distribution. However, this assumption is difficult to fulfill in real-world scenarios and creates barriers in reusing existing labels from similar application domains. Transfer Learning is intended to relax this assumption by modeling relationships between domains, and is often applied in deep learning applications to reduce the demand for labeled data and training time. Despite recent advances in exploring deep learning models with visual analytics tools, little work has explored the issue of explaining and diagnosing the knowledge transfer process between deep learning models. In this paper, we present a visual analytics framework for the multi-level exploration of the transfer learning processes when training deep neural networks. Our framework establishes a multi-aspect design to explain how the learned knowledge from the existing model is transferred into the new learning task when training deep neural networks. Based on a comprehensive requirement and task analysis, we employ descriptive visualization with performance measures and detailed inspections of model behaviors from the statistical, instance, feature, and model structure levels. We demonstrate our framework through two case studies on image classification by fine-tuning AlexNets to illustrate how analysts can utilize our framework.
|
20
|
Wang Q, Alexander W, Pegg J, Qu H, Chen M. HypoML: Visual Analysis for Hypothesis-based Evaluation of Machine Learning Models. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:1417-1426. [PMID: 33048739 DOI: 10.1109/tvcg.2020.3030449] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 06/11/2023]
Abstract
In this paper, we present a visual analytics tool for enabling hypothesis-based evaluation of machine learning (ML) models. We describe a novel ML-testing framework that combines the traditional statistical hypothesis testing (commonly used in empirical research) with logical reasoning about the conclusions of multiple hypotheses. The framework defines a controlled configuration for testing a number of hypotheses as to whether and how some extra information about a "concept" or "feature" may benefit or hinder an ML model. Because reasoning about multiple hypotheses is not always straightforward, we provide HypoML as a visual analysis tool with which the multi-thread testing results are first transformed into analytical results using statistical and logical inferences, and then into a visual representation for rapid observation of the conclusions and the logical flow between the testing results and hypotheses. We have applied HypoML to a number of hypothesized concepts, demonstrating the intuitive and explainable nature of the visual analysis.
|
21
|
Visual analysis of meteorological satellite data via model-agnostic meta-learning. J Vis (Tokyo) 2021. [DOI: 10.1007/s12650-020-00704-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
|
22
|
Abstract
Many image processing, computer graphics, and computer vision problems can be treated as image-to-image translation tasks. Such translation entails learning to map one visual representation of a given input to another representation. Image-to-image translation with generative adversarial networks (GANs) has been intensively studied and applied to various tasks, such as multimodal image-to-image translation, super-resolution translation, object transfiguration-related translation, etc. However, image-to-image translation techniques suffer from some problems, such as mode collapse, instability, and a lack of diversity. This article provides a comprehensive overview of image-to-image translation based on GAN algorithms and its variants. It also discusses and analyzes current state-of-the-art image-to-image translation techniques that are based on multimodal and multidomain representations. Finally, open issues and future research directions utilizing reinforcement learning and three-dimensional (3D) modal translation are summarized and discussed.
|
23
|
Khan A, Sohail A, Zahoora U, Qureshi AS. A survey of the recent architectures of deep convolutional neural networks. Artif Intell Rev 2020. [DOI: 10.1007/s10462-020-09825-6] [Citation(s) in RCA: 351] [Impact Index Per Article: 70.2] [Indexed: 02/06/2023]
|
24
|
Hazarika S, Li H, Wang KC, Shen HW, Chou CS. NNVA: Neural Network Assisted Visual Analysis of Yeast Cell Polarization Simulation. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:34-44. [PMID: 31425114 DOI: 10.1109/tvcg.2019.2934591] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 06/10/2023]
Abstract
Complex computational models are often designed to simulate real-world physical phenomena in many scientific disciplines. However, these simulation models tend to be computationally very expensive and involve a large number of simulation input parameters, which need to be analyzed and properly calibrated before the models can be applied for real scientific studies. We propose a visual analysis system to facilitate interactive exploratory analysis of high-dimensional input parameter space for a complex yeast cell polarization simulation. The proposed system can assist the computational biologists, who designed the simulation model, to visually calibrate the input parameters by modifying the parameter values and immediately visualizing the predicted simulation outcome without needing to run the original expensive simulation for every instance. Our proposed visual analysis system is driven by a trained neural network-based surrogate model as the backend analysis framework. In this work, we demonstrate the advantage of using neural networks as surrogate models for visual analysis by incorporating some of the recent advances in the field of uncertainty quantification, interpretability and explainability of neural network-based models. We utilize the trained network to perform interactive parameter sensitivity analysis of the original simulation as well as recommend optimal parameter configurations using the activation maximization framework of neural networks. We also facilitate detailed analysis of the trained network to extract useful insights about the simulation model, learned by the network, during the training process. We performed two case studies, and discovered multiple new parameter configurations, which can trigger high cell polarization results in the original simulation model. We evaluated our results by comparing with the original simulation model outcomes as well as the findings from previous parameter analysis performed by our experts.
|
25
|
Hohman F, Park H, Robinson C, Polo Chau DH. Summit: Scaling Deep Learning Interpretability by Visualizing Activation and Attribution Summarizations. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:1096-1106. [PMID: 31443005 DOI: 10.1109/tvcg.2019.2934659] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Indexed: 06/10/2023]
Abstract
Deep learning is increasingly used in decision-making tasks. However, understanding how neural networks produce final predictions remains a fundamental challenge. Existing work on interpreting neural network predictions for images often focuses on explaining predictions for single images or neurons. As predictions are often computed from millions of weights that are optimized over millions of images, such explanations can easily miss a bigger picture. We present Summit, an interactive system that scalably and systematically summarizes and visualizes what features a deep learning model has learned and how those features interact to make predictions. Summit introduces two new scalable summarization techniques: (1) activation aggregation discovers important neurons, and (2) neuron-influence aggregation identifies relationships among such neurons. Summit combines these techniques to create the novel attribution graph that reveals and summarizes crucial neuron associations and substructures that contribute to a model's outcomes. Summit scales to large data, such as the ImageNet dataset with 1.2M images, and leverages neural network feature visualization and dataset examples to help users distill large, complex neural network models into compact, interactive visualizations. We present neural network exploration scenarios where Summit helps us discover multiple surprising insights into a prevalent, large-scale image classifier's learned representations and informs future neural network architecture design. The Summit visualization runs in modern web browsers and is open-sourced.
|
26
|
Ma Y, Xie T, Li J, Maciejewski R. Explaining Vulnerabilities to Adversarial Machine Learning through Visual Analytics. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:1075-1085. [PMID: 31478859 DOI: 10.1109/tvcg.2019.2934631] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 06/10/2023]
Abstract
Machine learning models are currently being deployed in a variety of real-world applications where model predictions are used to make decisions about healthcare, bank loans, and numerous other critical tasks. As the deployment of artificial intelligence technologies becomes ubiquitous, it is unsurprising that adversaries have begun developing methods to manipulate machine learning models to their advantage. While the visual analytics community has developed methods for opening the black box of machine learning models, little work has focused on helping the user understand their model vulnerabilities in the context of adversarial attacks. In this paper, we present a visual analytics framework for explaining and exploring model vulnerabilities to adversarial attacks. Our framework employs a multi-faceted visualization scheme designed to support the analysis of data poisoning attacks from the perspective of models, data instances, features, and local structures. We demonstrate our framework through two case studies on binary classifiers and illustrate model vulnerabilities with respect to varying attack strategies.
|
27
|
Cashman D, Perer A, Chang R, Strobelt H. Ablate, Variate, and Contemplate: Visual Analytics for Discovering Neural Architectures. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:863-873. [PMID: 31502978 DOI: 10.1109/tvcg.2019.2934261] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 06/10/2023]
Abstract
The performance of deep learning models is dependent on the precise configuration of many layers and parameters. However, there are currently few systematic guidelines for how to configure a successful model. This means model builders often have to experiment with different configurations by manually programming different architectures (which is tedious and time consuming) or rely on purely automated approaches to generate and train the architectures (which is expensive). In this paper, we present Rapid Exploration of Model Architectures and Parameters, or REMAP, a visual analytics tool that allows a model builder to discover a deep learning model quickly via exploration and rapid experimentation of neural network architectures. In REMAP, the user explores the large and complex parameter space for neural network architectures using a combination of global inspection and local experimentation. Through a visual overview of a set of models, the user identifies interesting clusters of architectures. Based on their findings, the user can run ablation and variation experiments to identify the effects of adding, removing, or replacing layers in a given architecture and generate new models accordingly. They can also handcraft new models using a simple graphical interface. As a result, a model builder can build deep learning models quickly, efficiently, and without manual programming. We inform the design of REMAP through a design study with four deep learning model builders. Through a use case, we demonstrate that REMAP allows users to discover performant neural network architectures efficiently using visual exploration and user-defined semi-automated searches through the model space.
|
28
|
Bouwmans T, Javed S, Sultana M, Jung SK. Deep neural network concepts for background subtraction: A systematic review and comparative evaluation. Neural Netw 2019; 117:8-66. [DOI: 10.1016/j.neunet.2019.04.024] [Citation(s) in RCA: 173] [Impact Index Per Article: 28.8] [Received: 11/12/2018] [Revised: 02/27/2019] [Accepted: 04/30/2019] [Indexed: 12/16/2022]
|
29
|
Wang J, Gou L, Zhang W, Yang H, Shen HW. DeepVID: Deep Visual Interpretation and Diagnosis for Image Classifiers via Knowledge Distillation. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2019; 25:2168-2180. [PMID: 30892211 DOI: 10.1109/tvcg.2019.2903943] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Indexed: 06/09/2023]
Abstract
Deep Neural Networks (DNNs) have been extensively used in multiple disciplines due to their superior performance. However, in most cases, DNNs are considered as black-boxes and the interpretation of their internal working mechanism is usually challenging. Given that model trust is often built on the understanding of how a model works, the interpretation of DNNs becomes more important, especially in safety-critical applications (e.g., medical diagnosis, autonomous driving). In this paper, we propose DeepVID, a Deep learning approach to Visually Interpret and Diagnose DNN models, especially image classifiers. In detail, we train a small locally-faithful model to mimic the behavior of an original cumbersome DNN around a particular data instance of interest, and the local model is sufficiently simple such that it can be visually interpreted (e.g., a linear model). Knowledge distillation is used to transfer the knowledge from the cumbersome DNN to the small model, and a deep generative model (i.e., variational auto-encoder) is used to generate neighbors around the instance of interest. Those neighbors, which come with small feature variances and semantic meanings, can effectively probe the DNN's behaviors around the interested instance and help the small model to learn those behaviors. Through comprehensive evaluations, as well as case studies conducted together with deep learning experts, we validate the effectiveness of DeepVID.
|