1
Filus K, Domańska J. What is the doggest dog? Examination of typicality perception in ImageNet-trained networks. Neural Netw 2025;188:107425. PMID: 40220560. DOI: 10.1016/j.neunet.2025.107425.
Abstract
Due to the emergence of numerous model architectures in recent years, researchers finally have access to models diverse enough to be studied properly from the perspective of cognitive psychology theories, e.g., Prototype Theory. The theory assumes that the degree of membership in a basic-level category is graded. As a result, some concepts are perceived as more central (typical) than others. The most typical member of a category is called a prototype. It can be perceived as the clearest example of the category, reflecting the redundancy structure of the category as a whole. Its inverse is called an anti-prototype. Reasonable perception of prototypes and anti-prototypes is important for an accurate projection of the world structure onto the class space and for a more human-like world perception that goes beyond simple memorization. That is why it is beneficial to study deep models from the perspective of prototype theory. To enable this, we propose three methods that return the prototypes and anti-prototypes perceived by deep networks for a specific basic-level category. Additionally, one of our methods makes it possible to visualize the centrality of objects. The results on a wide range of 42 networks trained on ImageNet (convolutional networks, Vision Transformers, ConvNeXts, and hybrid models) reveal that the networks largely share their typicality perception and that this perception is not far from the human one. We release a dataset of the per-network prototypes and anti-prototypes resulting from our work to enable further research on this topic.
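As a rough illustration of the typicality idea summarized above, the sketch below ranks the images of one class by the softmax probability a pretrained ImageNet classifier assigns to that class and treats the highest- and lowest-scoring images as the prototype and anti-prototype. This is an assumed, simplified scoring scheme and not necessarily one of the authors' three methods; the model choice and function names are illustrative.

```python
# Illustrative sketch only (requires torchvision >= 0.13 for the weights enum):
# rank a class's images by the softmax score a pretrained classifier assigns to
# that class and take the extremes as "prototype" and "anti-prototype".
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

def typicality_ranking(image_paths, class_index, device="cpu"):
    model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
    model.eval().to(device)
    preprocess = T.Compose([
        T.Resize(256), T.CenterCrop(224), T.ToTensor(),
        T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])
    scored = []
    with torch.no_grad():
        for path in image_paths:
            x = preprocess(Image.open(path).convert("RGB")).unsqueeze(0).to(device)
            prob = torch.softmax(model(x), dim=1)[0, class_index].item()
            scored.append((prob, path))
    scored.sort(reverse=True)                      # most typical first
    prototype, anti_prototype = scored[0][1], scored[-1][1]
    return prototype, anti_prototype, scored
```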
Affiliation(s)
- Katarzyna Filus
- Institute of Theoretical and Applied Informatics, Polish Academy of Sciences, Gliwice, Poland.
- Joanna Domańska
- Institute of Theoretical and Applied Informatics, Polish Academy of Sciences, Gliwice, Poland
2
Tang K, Ma Y, Miao D, Song P, Gu Z, Tian Z, Wang W. Decision Fusion Networks for Image Classification. IEEE Transactions on Neural Networks and Learning Systems 2025;36:3890-3903. PMID: 35951567. DOI: 10.1109/tnnls.2022.3196129.
Abstract
Convolutional neural networks, in which each layer receives features from the previous layer(s) and then aggregates/abstracts higher-level features from them, are widely adopted for image classification. To avoid information loss during feature aggregation/abstraction and to fully utilize lower-layer features, we propose a novel decision fusion module (DFM) that makes an intermediate decision based on the features in the current layer and then fuses its results with the original features before passing them to the next layers. This decision is devised to determine an auxiliary category corresponding to the category at a higher hierarchical level, which can thus serve as category-coherent guidance for later layers. Therefore, by stacking a collection of DFMs into a classification network, the generated decision fusion network is explicitly formulated to progressively aggregate/abstract more discriminative features guided by these decisions and then refine the decisions based on the newly generated features in a layer-by-layer manner. Comprehensive results on four benchmarks validate that the proposed DFM brings significant improvements to various common classification networks at minimal additional computational cost and is superior to state-of-the-art decision fusion-based methods. In addition, we demonstrate the generalization ability of the DFM to object detection and semantic segmentation.
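The sketch below illustrates the general decision-fusion idea summarized above: an auxiliary coarse-category decision is computed from the current feature map, embedded, and fused back into the features passed to the next layer. It is a hypothetical simplification for illustration, not the authors' DFM implementation.

```python
# Sketch of a decision-fusion style block (illustrative, not the paper's DFM):
# predict an auxiliary coarse category from the current features, embed that
# decision, and fuse it back into the feature map before passing it onward.
import torch
import torch.nn as nn

class DecisionFusionBlock(nn.Module):
    def __init__(self, channels: int, num_coarse_classes: int):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.classifier = nn.Linear(channels, num_coarse_classes)  # intermediate decision
        self.embed = nn.Linear(num_coarse_classes, channels)       # decision -> feature space
        self.fuse = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, feats: torch.Tensor):
        b, c, _, _ = feats.shape
        decision = self.classifier(self.pool(feats).flatten(1))    # (B, num_coarse_classes)
        guidance = self.embed(decision.softmax(dim=1)).view(b, c, 1, 1)
        fused = self.fuse(feats + guidance)                        # category-coherent guidance
        return fused, decision                                     # decision can take its own loss

block = DecisionFusionBlock(channels=64, num_coarse_classes=10)
fused, aux_logits = block(torch.randn(2, 64, 28, 28))
```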
3
Cai D, Chen J, Zhao J, Xue Y, Yang S, Yuan W, Feng M, Weng H, Liu S, Peng Y, Zhu J, Wang K, Jackson C, Tang H, Huang J, Wang X. HiCervix: An Extensive Hierarchical Dataset and Benchmark for Cervical Cytology Classification. IEEE Transactions on Medical Imaging 2024;43:4344-4355. PMID: 38923481. DOI: 10.1109/tmi.2024.3419697.
Abstract
Cervical cytology is a critical screening strategy for early detection of pre-cancerous and cancerous cervical lesions. The challenge lies in accurately classifying various cervical cytology cell types. Existing automated cervical cytology methods are primarily trained on databases covering a narrow range of coarse-grained cell types, which fail to provide a comprehensive and detailed performance analysis that accurately represents real-world cytopathology conditions. To overcome these limitations, we introduce HiCervix, the most extensive, multi-center cervical cytology dataset currently available to the public. HiCervix includes 40,229 cervical cells from 4,496 whole slide images, categorized into 29 annotated classes. These classes are organized within a three-level hierarchical tree to capture fine-grained subtype information. To exploit the semantic correlation inherent in this hierarchical tree, we propose HierSwin, a hierarchical vision transformer-based classification network. HierSwin serves as a benchmark for detailed feature learning in both coarse-level and fine-level cervical cancer classification tasks. In our comprehensive experiments, HierSwin demonstrated remarkable performance, achieving 92.08% accuracy for coarse-level classification and 82.93% accuracy averaged across all three levels. When compared to board-certified cytopathologists, HierSwin achieved high classification performance (0.8293 versus 0.7359 averaged accuracy), highlighting its potential for clinical applications. This newly released HiCervix dataset, along with our benchmark HierSwin method, is poised to make a substantial impact on the advancement of deep learning algorithms for rapid cervical cancer screening and greatly improve cancer prevention and patient outcomes in real-world clinical settings.
4
Li G, Wang J, Wang Y, Shan G, Zhao Y. An In-Situ Visual Analytics Framework for Deep Neural Networks. IEEE Transactions on Visualization and Computer Graphics 2024;30:6770-6786. PMID: 38051629. DOI: 10.1109/tvcg.2023.3339585.
Abstract
The past decade has witnessed the superior power of deep neural networks (DNNs) in applications across various domains. However, training a high-quality DNN remains a non-trivial task due to its massive number of parameters. Visualization has shown great potential in addressing this situation, as evidenced by numerous recent visualization works that aid in DNN training and interpretation. These works commonly employ a strategy of logging training-related data and conducting post-hoc analysis. Based on the results of offline analysis, the model can be further trained or fine-tuned. This strategy, however, does not cope with the increasing complexity of DNNs, because (1) the time-series data collected over training are usually too large to be stored entirely; (2) the huge I/O overhead significantly impacts training efficiency; and (3) post-hoc analysis does not allow rapid human intervention (e.g., stopping a run with improper hyper-parameter settings to save computational resources). To address these challenges, we propose an in-situ visualization and analysis framework for the training of DNNs. Specifically, we employ feature extraction algorithms to reduce the size of training-related data in situ and use the reduced data for real-time visual analytics. The states of model training are disclosed to model designers in real time, enabling human intervention on demand to steer the training. Through concrete case studies, we demonstrate how our in-situ framework helps deep learning experts optimize DNNs and improve their analysis efficiency.
5
Prasad V, van Sloun RJG, Vilanova A, Pezzotti N. ProactiV: Studying Deep Learning Model Behavior Under Input Transformations. IEEE Transactions on Visualization and Computer Graphics 2024;30:5651-5665. PMID: 37535493. DOI: 10.1109/tvcg.2023.3301722.
Abstract
Deep learning (DL) models have shown performance benefits across many applications, from classification to image-to-image translation. However, low interpretability often leads to unexpected model behavior once deployed in the real world. Usually, this unexpected behavior is because the training data domain does not reflect the deployment data domain. Identifying a model's breaking points under input conditions and domain shifts, i.e., input transformations, is essential to improve models. Although visual analytics (VA) has shown promise in studying the behavior of model outputs under continually varying inputs, existing methods mainly focus on per-class or instance-level analysis. We aim to generalize beyond classification to tasks where classes do not exist and to provide a global view of model behavior under co-occurring input transformations. We present a DL model-agnostic VA method (ProactiV) to help model developers proactively study output behavior under input transformations to identify and verify breaking points. ProactiV relies on a proposed input optimization method to determine the changes to a given transformed input to achieve the desired output. The data from this optimization process allows the study of global and local model behavior under input transformations at scale. Additionally, the optimization method provides insights into the input characteristics that result in desired outputs and helps recognize model biases. We highlight how ProactiV effectively supports studying model behavior with example classification and image-to-image translation tasks.
6
Li Y, Wang J, Aboagye P, Yeh CCM, Zheng Y, Wang L, Zhang W, Ma KL. Visual Analytics for Efficient Image Exploration and User-Guided Image Captioning. IEEE Transactions on Visualization and Computer Graphics 2024;30:2875-2887. PMID: 38625780. PMCID: PMC11412260. DOI: 10.1109/tvcg.2024.3388514.
Abstract
Recent advancements in pre-trained language-image models have ushered in a new era of visual comprehension. Leveraging the power of these models, this article tackles two issues within the realm of visual analytics: (1) the efficient exploration of large-scale image datasets and identification of data biases within them; (2) the evaluation of image captions and steering of their generation process. On the one hand, by visually examining the captions generated from language-image models for an image dataset, we gain deeper insights into the visual contents, unearthing data biases that may be entrenched within the dataset. On the other hand, by depicting the association between visual features and textual captions, we expose the weaknesses of pre-trained language-image models in their captioning capability and propose an interactive interface to steer caption generation. The two parts have been coalesced into a coordinated visual analytics system, fostering the mutual enrichment of visual and textual contents. We validate the effectiveness of the system with domain practitioners through concrete case studies with large-scale image datasets.
7
Nalla V, Pouriyeh S, Parizi RM, Trivedi H, Sheng QZ, Hwang I, Seyyed-Kalantari L, Woo M. Deep learning for computer-aided abnormalities classification in digital mammogram: A data-centric perspective. Curr Probl Diagn Radiol 2024;53:346-352. PMID: 38302303. DOI: 10.1067/j.cpradiol.2024.01.007.
Abstract
Breast cancer is the most common type of cancer in women, and early abnormality detection using mammography can significantly improve breast cancer survival rates. Diverse datasets are required to improve the training and validation of deep learning (DL) systems for autonomous breast cancer diagnosis. However, only a small number of mammography datasets are publicly available. This constraint has created challenges when comparing different DL models using the same dataset. The primary contribution of this study is the comprehensive description of a selection of currently available public mammography datasets. The information available on publicly accessible datasets is summarized and their usability reviewed to enable more effective models to be developed for breast cancer detection and to improve understanding of existing models trained using these datasets. This study aims to bridge the existing knowledge gap by offering researchers and practitioners a valuable resource to develop and assess DL models in breast cancer diagnosis.
Affiliation(s)
- Vineela Nalla
- Department of Information Technology, Kennesaw State University, Kennesaw, Georgia, USA
- Seyedamin Pouriyeh
- Department of Information Technology, Kennesaw State University, Kennesaw, Georgia, USA
- Reza M Parizi
- Decentralized Science Lab, Kennesaw State University, Marietta, GA, USA
- Hari Trivedi
- Department of Radiology and Imaging Services, Emory University, Atlanta, Georgia, USA
- Quan Z Sheng
- School of Computing, Macquarie University, Sydney, Australia
- Inchan Hwang
- School of Data Science and Analytics, Kennesaw State University, Kennesaw, Georgia, USA
- Laleh Seyyed-Kalantari
- Department of Electrical Engineering and Computer Science, York University, Toronto, Ontario, Canada
- MinJae Woo
- School of Data Science and Analytics, Kennesaw State University, Kennesaw, Georgia, USA.
8
Statsenko Y, Babushkin V, Talako T, Kurbatova T, Smetanina D, Simiyu GL, Habuza T, Ismail F, Almansoori TM, Gorkom KNV, Szólics M, Hassan A, Ljubisavljevic M. Automatic Detection and Classification of Epileptic Seizures from EEG Data: Finding Optimal Acquisition Settings and Testing Interpretable Machine Learning Approach. Biomedicines 2023;11:2370. PMID: 37760815. PMCID: PMC10525492. DOI: 10.3390/biomedicines11092370.
Abstract
Deep learning (DL) is emerging as a successful technique for automatic detection and differentiation of spontaneous seizures that may otherwise be missed or misclassified. Herein, we propose a system architecture based on top-performing DL models for binary and multigroup classifications with the non-overlapping window technique, which we tested on the TUSZ dataset. The system accurately detects seizure episodes (87.7% Sn, 91.16% Sp) and carefully distinguishes eight seizure types (95-100% Acc). An increase in EEG sampling rate from 50 to 250 Hz boosted model performance: the precision of seizure detection rose by 5%, and seizure differentiation by 7%. A low sampling rate is a reasonable solution for training reliable models with EEG data. Decreasing the number of EEG electrodes from 21 to 8 did not affect seizure detection but worsened seizure differentiation significantly: 98.24 ± 0.17 vs. 85.14 ± 3.14% recall. In detecting epileptic episodes, all electrodes provided equally informative input, but in seizure differentiation, their informative value varied. We improved model explainability with interpretable ML. Activation maximization highlighted the presence of EEG patterns specific to eight seizure types. Cortical projection of epileptic sources depicted differences between generalized and focal seizures. Interpretable ML techniques confirmed that our system recognizes biologically meaningful features as indicators of epileptic activity in EEG.
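A minimal sketch of the non-overlapping window technique mentioned above is given below; the channel count, sampling rate, and window length are illustrative values rather than the study's settings.

```python
# Slice a multi-channel EEG recording into fixed-length, non-overlapping windows.
import numpy as np

def non_overlapping_windows(eeg: np.ndarray, fs: int, window_sec: float) -> np.ndarray:
    """eeg: (n_channels, n_samples) -> (n_windows, n_channels, window_len)."""
    window_len = int(fs * window_sec)
    n_windows = eeg.shape[1] // window_len
    trimmed = eeg[:, : n_windows * window_len]
    return trimmed.reshape(eeg.shape[0], n_windows, window_len).transpose(1, 0, 2)

recording = np.random.randn(21, 250 * 60)            # 21 electrodes, 60 s at 250 Hz (made-up data)
windows = non_overlapping_windows(recording, fs=250, window_sec=4.0)
print(windows.shape)                                  # (15, 21, 1000)
```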
Affiliation(s)
- Yauhen Statsenko
- Radiology Department, College of Medicine and Health Sciences, United Arab Emirates University, Al Ain P.O. Box 15551, United Arab Emirates
- Medical Imaging Platform, ASPIRE Precision Medicine Research Institute Abu Dhabi, Al Ain P.O. Box 15551, United Arab Emirates
- Big Data Analytics Center, United Arab Emirates University, Al Ain P.O. Box 15551, United Arab Emirates
- Vladimir Babushkin
- Radiology Department, College of Medicine and Health Sciences, United Arab Emirates University, Al Ain P.O. Box 15551, United Arab Emirates
- Tatsiana Talako
- Radiology Department, College of Medicine and Health Sciences, United Arab Emirates University, Al Ain P.O. Box 15551, United Arab Emirates
- Department of Oncohematology, Minsk Scientific and Practical Center for Surgery, Transplantology and Hematology, 220089 Minsk, Belarus
- Tetiana Kurbatova
- Radiology Department, College of Medicine and Health Sciences, United Arab Emirates University, Al Ain P.O. Box 15551, United Arab Emirates
- Darya Smetanina
- Radiology Department, College of Medicine and Health Sciences, United Arab Emirates University, Al Ain P.O. Box 15551, United Arab Emirates
- Gillian Lylian Simiyu
- Radiology Department, College of Medicine and Health Sciences, United Arab Emirates University, Al Ain P.O. Box 15551, United Arab Emirates
- Tetiana Habuza
- Big Data Analytics Center, United Arab Emirates University, Al Ain P.O. Box 15551, United Arab Emirates
- Department of Computer Science and Software Engineering, College of Information Technology, United Arab Emirates University, Al Ain P.O. Box 15551, United Arab Emirates
- Fatima Ismail
- Pediatric Department, College of Medicine and Health Sciences, United Arab Emirates University, Al Ain P.O. Box 15551, United Arab Emirates
- Taleb M. Almansoori
- Radiology Department, College of Medicine and Health Sciences, United Arab Emirates University, Al Ain P.O. Box 15551, United Arab Emirates
- Klaus N.-V. Gorkom
- Radiology Department, College of Medicine and Health Sciences, United Arab Emirates University, Al Ain P.O. Box 15551, United Arab Emirates
- Miklós Szólics
- Neurology Division, Medicine Department, Tawam Hospital, Al Ain P.O. Box 15258, United Arab Emirates
- Internal Medicine Department, College of Medicine and Health Sciences, United Arab Emirates University, Al Ain P.O. Box 15551, United Arab Emirates
- Ali Hassan
- Neurology Division, Medicine Department, Tawam Hospital, Al Ain P.O. Box 15258, United Arab Emirates
- Milos Ljubisavljevic
- Physiology Department, College of Medicine and Health Sciences, United Arab Emirates University, Al Ain P.O. Box 15551, United Arab Emirates
- Neuroscience Platform, ASPIRE Precision Medicine Research Institute Abu Dhabi, Al Ain P.O. Box 15551, United Arab Emirates
9
Collaris D, van Wijk JJ. StrategyAtlas: Strategy Analysis for Machine Learning Interpretability. IEEE Transactions on Visualization and Computer Graphics 2023;29:2996-3008. PMID: 35085084. DOI: 10.1109/tvcg.2022.3146806.
Abstract
Businesses in high-risk environments have been reluctant to adopt modern machine learning approaches due to their complex and uninterpretable nature. Most current solutions provide local, instance-level explanations, but this is insufficient for understanding the model as a whole. In this work, we show that strategy clusters (i.e., groups of data instances that are treated distinctly by the model) can be used to understand the global behavior of a complex ML model. To support effective exploration and understanding of these clusters, we introduce StrategyAtlas, a system designed to analyze and explain model strategies. Furthermore, it supports multiple ways to utilize these strategies for simplifying and improving the reference model. In collaboration with a large insurance company, we present a use case in automatic insurance acceptance, and show how professional data scientists were enabled to understand a complex model and improve the production model based on these insights.
10
Mi JX, Li N, Huang KY, Li W, Zhou L. Hierarchical neural network with efficient selection inference. Neural Netw 2023;161:535-549. PMID: 36812830. DOI: 10.1016/j.neunet.2023.02.015.
Abstract
Image classification precision has been vastly enhanced by the growing complexity of convolutional neural network (CNN) structures. However, the uneven visual separability between categories leads to various difficulties in classification. The hierarchical structure of categories can be leveraged to deal with this, but few CNNs pay attention to this characteristic of the data. Moreover, a network model with a hierarchical structure promises to extract more specific features from the data than current CNNs, since, for the latter, all categories pass through the same fixed number of layers during feed-forward computation. In this paper, we propose to use category hierarchies to integrate ResNet-style modules into a hierarchical network model in a top-down manner. To extract abundant discriminative features and improve computational efficiency, we adopt residual block selection based on coarse categories to allocate different computation paths. Each residual block works as a switch that determines the JUMP or JOIN mode for an individual coarse category. Interestingly, since some categories need less feed-forward computation than others by jumping layers, the average inference time is reduced. Extensive experiments show that our hierarchical network achieves higher prediction accuracy with similar FLOPs on the CIFAR-10, CIFAR-100, SVHN, and Tiny-ImageNet datasets compared to the original residual networks and other existing selection inference methods.
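The sketch below illustrates the JUMP/JOIN switching idea in a single residual block: the block is executed only for coarse categories assigned to it and is skipped otherwise. The interface and routing rule are assumptions for illustration, not the authors' code.

```python
# A residual block that either runs (JOIN) or is bypassed (JUMP) depending on
# the coarse category predicted for the input.
import torch
import torch.nn as nn

class SwitchableResidualBlock(nn.Module):
    def __init__(self, channels: int, join_coarse_ids: set):
        super().__init__()
        self.join_coarse_ids = join_coarse_ids       # coarse categories that use this block
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x: torch.Tensor, coarse_id: int) -> torch.Tensor:
        if coarse_id not in self.join_coarse_ids:    # JUMP: skip the computation entirely
            return x
        return torch.relu(x + self.body(x))          # JOIN: standard residual update

block = SwitchableResidualBlock(32, join_coarse_ids={0, 2})
x = torch.randn(1, 32, 16, 16)
out_join = block(x, coarse_id=0)   # block executed
out_jump = block(x, coarse_id=1)   # block skipped, saving inference time
```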
Affiliation(s)
- Jian-Xun Mi
- Chongqing Key Laboratory of Image cognition, Chongqing University of Posts and Telecommunications, 400065, Chongqing, China; College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, 400065, Chongqing, China.
- Nuo Li
- Chongqing Key Laboratory of Image cognition, Chongqing University of Posts and Telecommunications, 400065, Chongqing, China; College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, 400065, Chongqing, China.
- Ke-Yang Huang
- Chongqing Key Laboratory of Image cognition, Chongqing University of Posts and Telecommunications, 400065, Chongqing, China; College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, 400065, Chongqing, China
- Weisheng Li
- Chongqing Key Laboratory of Image cognition, Chongqing University of Posts and Telecommunications, 400065, Chongqing, China; College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, 400065, Chongqing, China
- Lifang Zhou
- Chongqing Key Laboratory of Image cognition, Chongqing University of Posts and Telecommunications, 400065, Chongqing, China; College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, 400065, Chongqing, China; College of Software, Chongqing University of Posts and Telecommunications, 400065, Chongqing, China
11
Wang J, Zhang W, Yang H, Yeh CCM, Wang L. Visual Analytics for RNN-Based Deep Reinforcement Learning. IEEE Transactions on Visualization and Computer Graphics 2022;28:4141-4155. PMID: 33929961. DOI: 10.1109/tvcg.2021.3076749.
Abstract
Deep reinforcement learning (DRL) targets to train an autonomous agent to interact with a pre-defined environment and strives to achieve specific goals through deep neural networks (DNN). Recurrent neural network (RNN) based DRL has demonstrated superior performance, as RNNs can effectively capture the temporal evolution of the environment and respond with proper agent actions. However, apart from the outstanding performance, little is known about how RNNs understand the environment internally and what has been memorized over time. Revealing these details is extremely important for deep learning experts to understand and improve DRLs, which in contrast, is also challenging due to the complicated data transformations inside these models. In this article, we propose Deep Reinforcement Learning Interactive Visual Explorer (DRLIVE), a visual analytics system to effectively explore, interpret, and diagnose RNN-based DRLs. Having focused on DRL agents trained for different Atari games, DRLIVE accomplishes three tasks: game episode exploration, RNN hidden/cell state examination, and interactive model perturbation. Using the system, one can flexibly explore a DRL agent through interactive visualizations, discover interpretable RNN cells by prioritizing RNN hidden/cell states with a set of metrics, and further diagnose the DRL model by interactively perturbing its inputs. Through concrete studies with multiple deep learning experts, we validated the efficacy of DRLIVE.
12
Li Q, Wei X, Lin H, Liu Y, Chen T, Ma X. Inspecting the Running Process of Horizontal Federated Learning via Visual Analytics. IEEE Transactions on Visualization and Computer Graphics 2022;28:4085-4100. PMID: 33872152. DOI: 10.1109/tvcg.2021.3074010.
Abstract
As a decentralized training approach, horizontal federated learning (HFL) enables distributed clients to collaboratively learn a machine learning model while keeping personal/private information on local devices. Despite the enhanced performance and efficiency of HFL over local training, clues for inspecting the behaviors of the participating clients and the federated model are usually lacking due to the privacy-preserving nature of HFL. Consequently, the users can only conduct a shallow-level analysis of potential abnormal behaviors and have limited means to assess the contributions of individual clients and implement the necessary intervention. Visualization techniques have been introduced to facilitate the HFL process inspection, usually by providing model metrics and evaluation results as a dashboard representation. Although the existing visualization methods allow a simple examination of the HFL model performance, they cannot support the intensive exploration of the HFL process. In this article, strictly following the HFL privacy-preserving protocol, we design an exploratory visual analytics system for the HFL process termed HFLens, which supports comparative visual interpretation at the overview, communication round, and client instance levels. Specifically, the proposed system facilitates the investigation of the overall process involving all clients, the correlation analysis of clients' information in one or different communication round(s), the identification of potential anomalies, and the contribution assessment of each HFL client. Two case studies confirm the efficacy of our system. Experts' feedback suggests that our approach indeed helps in understanding and diagnosing the HFL process better.
13
Xuan X, Zhang X, Kwon OH, Ma KL. VAC-CNN: A Visual Analytics System for Comparative Studies of Deep Convolutional Neural Networks. IEEE Transactions on Visualization and Computer Graphics 2022;28:2326-2337. PMID: 35389868. DOI: 10.1109/tvcg.2022.3165347.
Abstract
The rapid development of Convolutional Neural Networks (CNNs) in recent years has triggered significant breakthroughs in many machine learning (ML) applications. The ability to understand and compare various CNN models available is thus essential. The conventional approach with visualizing each model's quantitative features, such as classification accuracy and computational complexity, is not sufficient for a deeper understanding and comparison of the behaviors of different models. Moreover, most of the existing tools for assessing CNN behaviors only support comparison between two models and lack the flexibility of customizing the analysis tasks according to user needs. This paper presents a visual analytics system, VAC-CNN (Visual Analytics for Comparing CNNs), that supports the in-depth inspection of a single CNN model as well as comparative studies of two or more models. The ability to compare a larger number of (e.g., tens of) models especially distinguishes our system from previous ones. With a carefully designed model visualization and explaining support, VAC-CNN facilitates a highly interactive workflow that promptly presents both quantitative and qualitative information at each analysis stage. We demonstrate VAC-CNN's effectiveness for assisting novice ML practitioners in evaluating and comparing multiple CNN models through two use cases and one preliminary evaluation study using the image classification tasks on the ImageNet dataset.
14
Hinterreiter A, Ruch P, Stitz H, Ennemoser M, Bernard J, Strobelt H, Streit M. ConfusionFlow: A Model-Agnostic Visualization for Temporal Analysis of Classifier Confusion. IEEE Transactions on Visualization and Computer Graphics 2022;28:1222-1236. PMID: 32746284. DOI: 10.1109/tvcg.2020.3012063.
Abstract
Classifiers are among the most widely used supervised machine learning algorithms. Many classification models exist, and choosing the right one for a given task is difficult. During model selection and debugging, data scientists need to assess classifiers' performances, evaluate their learning behavior over time, and compare different models. Typically, this analysis is based on single-number performance measures such as accuracy. A more detailed evaluation of classifiers is possible by inspecting class errors. The confusion matrix is an established way for visualizing these class errors, but it was not designed with temporal or comparative analysis in mind. More generally, established performance analysis systems do not allow a combined temporal and comparative analysis of class-level information. To address this issue, we propose ConfusionFlow, an interactive, comparative visualization tool that combines the benefits of class confusion matrices with the visualization of performance characteristics over time. ConfusionFlow is model-agnostic and can be used to compare performances for different model types, model architectures, and/or training and test datasets. We demonstrate the usefulness of ConfusionFlow in a case study on instance selection strategies in active learning. We further assess the scalability of ConfusionFlow and present a use case in the context of neural network pruning.
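The data such a temporal confusion analysis consumes can be sketched as one confusion matrix per epoch, stacked over training; the logging loop below is an illustrative stand-in with synthetic predictions, not part of ConfusionFlow itself.

```python
# Log a confusion matrix per epoch so class-level errors can be inspected over time.
import numpy as np

def confusion_matrix(y_true: np.ndarray, y_pred: np.ndarray, n_classes: int) -> np.ndarray:
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

rng = np.random.default_rng(0)
n_epochs, n_classes, n_samples = 5, 3, 200
y_true = rng.integers(0, n_classes, n_samples)
history = []
for epoch in range(n_epochs):
    wrong = rng.random(n_samples) < (0.5 - 0.08 * epoch)       # synthetic: errors shrink over epochs
    y_pred = np.where(wrong, rng.integers(0, n_classes, n_samples), y_true)
    history.append(confusion_matrix(y_true, y_pred, n_classes))
history = np.stack(history)        # (n_epochs, n_classes, n_classes)
print(history[:, 0, 0])            # correct predictions for class 0 across epochs
```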
15
Wang X, He J, Jin Z, Yang M, Wang Y, Qu H. M2Lens: Visualizing and Explaining Multimodal Models for Sentiment Analysis. IEEE Transactions on Visualization and Computer Graphics 2022;28:802-812. PMID: 34587037. DOI: 10.1109/tvcg.2021.3114794.
Abstract
Multimodal sentiment analysis aims to recognize people's attitudes from multiple communication channels such as verbal content (i.e., text), voice, and facial expressions. It has become a vibrant and important research topic in natural language processing. Much research focuses on modeling the complex intra- and inter-modal interactions between different communication channels. However, current multimodal models with strong performance are often deep-learning-based techniques and work like black boxes. It is not clear how models utilize multimodal information for sentiment predictions. Despite recent advances in techniques for enhancing the explainability of machine learning models, these techniques often target unimodal scenarios (e.g., images, sentences), and little research has been done on explaining multimodal models. In this paper, we present an interactive visual analytics system, M2Lens, to visualize and explain multimodal models for sentiment analysis. M2Lens provides explanations on intra- and inter-modal interactions at the global, subset, and local levels. Specifically, it summarizes the influence of three typical interaction types (i.e., dominance, complement, and conflict) on the model predictions. Moreover, M2Lens identifies frequent and influential multimodal features and supports the multi-faceted exploration of model behaviors from language, acoustic, and visual modalities. Through two case studies and expert interviews, we demonstrate that our system can help users gain deep insights into the multimodal models for sentiment analysis.
16
Sensor-Based Human Activity Recognition Using Adaptive Class Hierarchy. Sensors 2021;21(22):7743. PMID: 34833819. PMCID: PMC8623838. DOI: 10.3390/s21227743.
Abstract
In sensor-based human activity recognition, many methods based on convolutional neural networks (CNNs) have been proposed. In the typical CNN-based activity recognition model, each class is treated independently of others. However, actual activity classes often have hierarchical relationships. It is important to consider an activity recognition model that uses the hierarchical relationship among classes to improve recognition performance. In image recognition, branch CNNs (B-CNNs) have been proposed for classification using class hierarchies. B-CNNs can easily perform classification using hand-crafted class hierarchies, but it is difficult to manually design an appropriate class hierarchy when the number of classes is large or there is little prior knowledge. Therefore, in our study, we propose a class hierarchy-adaptive B-CNN, which adds a method to the B-CNN for automatically constructing class hierarchies. Our method constructs the class hierarchy from training data automatically to effectively train the B-CNN without prior knowledge. We evaluated our method on several benchmark datasets for activity recognition. As a result, our method outperformed standard CNN models without considering the hierarchical relationship among classes. In addition, we confirmed that our method has performance comparable to a B-CNN model with a class hierarchy based on human prior knowledge.
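One plausible way to build a class hierarchy automatically from training data is to cluster classes by how often a baseline classifier confuses them; the sketch below shows such a confusion-based grouping. This specific procedure is an assumption for illustration and may differ from the authors' construction.

```python
# Derive coarse groups by agglomerative clustering of a class-confusion matrix.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def coarse_groups_from_confusion(cm: np.ndarray, n_coarse: int) -> dict:
    cm = cm.astype(float)
    sim = (cm + cm.T) / 2.0                   # symmetric confusion similarity
    np.fill_diagonal(sim, 0.0)
    dist = sim.max() - sim                    # more confusion -> smaller distance
    np.fill_diagonal(dist, 0.0)
    condensed = squareform(dist, checks=False)
    labels = fcluster(linkage(condensed, method="average"), n_coarse, criterion="maxclust")
    return {cls: int(grp) for cls, grp in enumerate(labels)}

cm = np.array([[50,  8,  1,  0],
               [ 9, 48,  2,  1],
               [ 0,  1, 47, 10],
               [ 1,  0, 12, 46]])
print(coarse_groups_from_confusion(cm, n_coarse=2))   # groups classes {0,1} and {2,3} together
```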
17
Cheng S, Li X, Shan G, Niu B, Wang Y, Luo M. ACMViz: a visual analytics approach to understand DRL-based autonomous control model. J Vis (Tokyo) 2021. DOI: 10.1007/s12650-021-00793-9.
18
Cao K, Liu M, Su H, Wu J, Zhu J, Liu S. Analyzing the Noise Robustness of Deep Neural Networks. IEEE Transactions on Visualization and Computer Graphics 2021;27:3289-3304. PMID: 31985427. DOI: 10.1109/tvcg.2020.2969185.
Abstract
Adversarial examples, generated by adding small but intentionally imperceptible perturbations to normal examples, can mislead deep neural networks (DNNs) to make incorrect predictions. Although much work has been done on both adversarial attack and defense, a fine-grained understanding of adversarial examples is still lacking. To address this issue, we present a visual analysis method to explain why adversarial examples are misclassified. The key is to compare and analyze the datapaths of both the adversarial and normal examples. A datapath is a group of critical neurons along with their connections. We formulate the datapath extraction as a subset selection problem and solve it by constructing and training a neural network. A multi-level visualization consisting of a network-level visualization of data flows, a layer-level visualization of feature maps, and a neuron-level visualization of learned features, has been designed to help investigate how datapaths of adversarial and normal examples diverge and merge in the prediction process. A quantitative evaluation and a case study were conducted to demonstrate the promise of our method to explain the misclassification of adversarial examples.
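The small, intentionally imperceptible perturbations mentioned above can be illustrated with the classic fast gradient sign method (FGSM); this generic sketch uses an off-the-shelf torchvision model for concreteness and is unrelated to the paper's datapath-extraction method.

```python
# Generate an adversarial example by taking one signed-gradient step on the input.
import torch
import torch.nn.functional as F
import torchvision.models as models

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1).eval()

def fgsm(image: torch.Tensor, label: torch.Tensor, epsilon: float = 0.01) -> torch.Tensor:
    """image: (1, 3, 224, 224) normalized input; returns the perturbed image."""
    image = image.clone().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    return (image + epsilon * image.grad.sign()).detach()

x = torch.randn(1, 3, 224, 224)      # stand-in for a preprocessed image
y = torch.tensor([207])              # label used in the loss (ImageNet index 207, golden retriever)
x_adv = fgsm(x, y)
print(model(x).argmax().item(), model(x_adv).argmax().item())
```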
19
Wang ZJ, Turko R, Shaikh O, Park H, Das N, Hohman F, Kahng M, Polo Chau DH. CNN Explainer: Learning Convolutional Neural Networks with Interactive Visualization. IEEE Transactions on Visualization and Computer Graphics 2021;27:1396-1406. PMID: 33048723. DOI: 10.1109/tvcg.2020.3030418.
Abstract
Deep learning's great success motivates many practitioners and students to learn about this exciting technology. However, it is often challenging for beginners to take their first step due to the complexity of understanding and applying deep learning. We present CNN Explainer, an interactive visualization tool designed for non-experts to learn and examine convolutional neural networks (CNNs), a foundational deep learning model architecture. Our tool addresses key challenges that novices face while learning about CNNs, which we identify from interviews with instructors and a survey with past students. CNN Explainer tightly integrates a model overview that summarizes a CNN's structure, and on-demand, dynamic visual explanation views that help users understand the underlying components of CNNs. Through smooth transitions across levels of abstraction, our tool enables users to inspect the interplay between low-level mathematical operations and high-level model structures. A qualitative user study shows that CNN Explainer helps users more easily understand the inner workings of CNNs, and is engaging and enjoyable to use. We also derive design lessons from our study. Developed using modern web technologies, CNN Explainer runs locally in users' web browsers without the need for installation or specialized hardware, broadening the public's education access to modern deep learning techniques.
20
Li G, Wang J, Shen HW, Chen K, Shan G, Lu Z. CNNPruner: Pruning Convolutional Neural Networks with Visual Analytics. IEEE Transactions on Visualization and Computer Graphics 2021;27:1364-1373. PMID: 33048744. DOI: 10.1109/tvcg.2020.3030461.
Abstract
Convolutional neural networks (CNNs) have demonstrated extraordinarily good performance in many computer vision tasks. The increasing size of CNN models, however, prevents them from being widely deployed to devices with limited computational resources, e.g., mobile/embedded devices. The emerging topic of model pruning strives to address this problem by removing less important neurons and fine-tuning the pruned networks to minimize the accuracy loss. Nevertheless, existing automated pruning solutions often rely on a numerical threshold of the pruning criteria, lacking the flexibility to optimally balance the trade-off between efficiency and accuracy. Moreover, the complicated interplay between the stages of neuron pruning and model fine-tuning makes this process opaque, and therefore becomes difficult to optimize. In this paper, we address these challenges through a visual analytics approach, named CNNPruner. It considers the importance of convolutional filters through both instability and sensitivity, and allows users to interactively create pruning plans according to a desired goal on model size or accuracy. Also, CNNPruner integrates state-of-the-art filter visualization techniques to help users understand the roles that different filters played and refine their pruning plans. Through comprehensive case studies on CNNs with real-world sizes, we validate the effectiveness of CNNPruner.
21
Cheng F, Ming Y, Qu H. DECE: Decision Explorer with Counterfactual Explanations for Machine Learning Models. IEEE Transactions on Visualization and Computer Graphics 2021;27:1438-1447. PMID: 33074811. DOI: 10.1109/tvcg.2020.3030342.
Abstract
With machine learning models being increasingly applied to various decision-making scenarios, people have spent growing effort on making machine learning models more transparent and explainable. Among various explanation techniques, counterfactual explanations have the advantage of being human-friendly and actionable: a counterfactual explanation tells the user how to obtain the desired prediction with minimal changes to the input. Counterfactual explanations can also serve as efficient probes of a model's decisions. In this work, we exploit the potential of counterfactual explanations to understand and explore the behavior of machine learning models. We design DECE, an interactive visualization system that helps users understand and explore a model's decisions on individual instances and data subsets, supporting users ranging from decision subjects to model developers. DECE supports exploratory analysis of model decisions by combining the strengths of counterfactual explanations at the instance and subgroup levels. We also introduce a set of interactions that enable users to customize the generation of counterfactual explanations to find more actionable ones that suit their needs. Through three use cases and an expert interview, we demonstrate the effectiveness of DECE in supporting decision exploration tasks and instance explanations.
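A counterfactual explanation of the kind described above can be sketched as a search for a small input change that flips a model's prediction to a desired class; the simple gradient-based search below is illustrative and is not necessarily how DECE generates its counterfactuals.

```python
# Find a sparse perturbation that pushes a toy classifier toward a target class.
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 2))   # untrained toy classifier

def counterfactual(x: torch.Tensor, target: int, steps: int = 300,
                   lr: float = 0.05, sparsity: float = 0.1) -> torch.Tensor:
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.cross_entropy(model(x + delta), torch.tensor([target])) \
               + sparsity * delta.abs().sum()                            # prefer minimal changes
        loss.backward()
        opt.step()
    return (x + delta).detach()

x = torch.tensor([[0.2, -1.0, 0.5, 0.3]])
x_cf = counterfactual(x, target=1)
print("original:", model(x).argmax().item(), "counterfactual:", model(x_cf).argmax().item())
```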
22
Gou L, Zou L, Li N, Hofmann M, Shekar AK, Wendt A, Ren L. VATLD: A Visual Analytics System to Assess, Understand and Improve Traffic Light Detection. IEEE Transactions on Visualization and Computer Graphics 2021;27:261-271. PMID: 33079663. DOI: 10.1109/tvcg.2020.3030350.
Abstract
Traffic light detection is crucial for environment perception and decision-making in autonomous driving. State-of-the-art detectors are built upon deep Convolutional Neural Networks (CNNs) and have exhibited promising performance. However, one looming concern with CNN based detectors is how to thoroughly evaluate the performance of accuracy and robustness before they can be deployed to autonomous vehicles. In this work, we propose a visual analytics system, VATLD, equipped with a disentangled representation learning and semantic adversarial learning, to assess, understand, and improve the accuracy and robustness of traffic light detectors in autonomous driving applications. The disentangled representation learning extracts data semantics to augment human cognition with human-friendly visual summarization, and the semantic adversarial learning efficiently exposes interpretable robustness risks and enables minimal human interaction for actionable insights. We also demonstrate the effectiveness of various performance improvement strategies derived from actionable insights with our visual analytics system, VATLD, and illustrate some practical implications for safety-critical applications in autonomous driving.
23
Ravindran P, Owens FC, Wade AC, Shmulsky R, Wiedenhoeft AC. Towards Sustainable North American Wood Product Value Chains, Part I: Computer Vision Identification of Diffuse Porous Hardwoods. Frontiers in Plant Science 2021;12:758455. PMID: 35126406. PMCID: PMC8815006. DOI: 10.3389/fpls.2021.758455.
Abstract
Availability of and access to wood identification expertise or technology is a critical component in the design and implementation of practical, enforceable strategies for the effective promotion, monitoring, and incentivisation of sustainable practices and conservation efforts in the forest products value chain. To address this need in the context of the multi-billion-dollar North American wood products industry, 22-class, image-based deep learning models for the macroscopic identification of North American diffuse porous hardwoods were trained for deployment on the open-source, field-deployable XyloTron platform using transverse surface images of specimens from three different xylaria, and were evaluated on specimens from a fourth xylarium that did not contribute training data. Analysis of the model performance, in the context of the anatomy of the woods considered, demonstrates the immediate readiness of the technology developed herein for field testing in a human-in-the-loop monitoring scenario. Also proposed are strategies for training, evaluating, and advancing the state of the art toward an expansive, continental-scale model for all the North American hardwoods.
Affiliation(s)
- Prabu Ravindran
- Department of Botany, University of Wisconsin, Madison, WI, United States
- Forest Products Laboratory, Center for Wood Anatomy Research, USDA Forest Service, Madison, WI, United States
- Frank C. Owens
- Department of Sustainable Bioproducts, Mississippi State University, Starkville, MS, United States
- Adam C. Wade
- Department of Sustainable Bioproducts, Mississippi State University, Starkville, MS, United States
- Rubin Shmulsky
- Department of Sustainable Bioproducts, Mississippi State University, Starkville, MS, United States
- Alex C. Wiedenhoeft
- Department of Botany, University of Wisconsin, Madison, WI, United States
- Forest Products Laboratory, Center for Wood Anatomy Research, USDA Forest Service, Madison, WI, United States
- Department of Sustainable Bioproducts, Mississippi State University, Starkville, MS, United States
- Department of Forestry and Natural Resources, Purdue University, West Lafayette, IN, United States
24
Wang Q, Yuan J, Chen S, Su H, Qu H, Liu S. Visual Genealogy of Deep Neural Networks. IEEE Transactions on Visualization and Computer Graphics 2020;26:3340-3352. PMID: 31180859. DOI: 10.1109/tvcg.2019.2921323.
Abstract
A comprehensive and comprehensible summary of existing deep neural networks (DNNs) helps practitioners understand the behaviour and evolution of DNNs, offers insights for architecture optimization, and sheds light on the working mechanisms of DNNs. However, this summary is hard to obtain because of the complexity and diversity of DNN architectures. To address this issue, we develop DNN Genealogy, an interactive visualization tool, to offer a visual summary of representative DNNs and their evolutionary relationships. DNN Genealogy enables users to learn DNNs from multiple aspects, including architecture, performance, and evolutionary relationships. Central to this tool is a systematic analysis and visualization of 66 representative DNNs based on our analysis of 140 papers. A directed acyclic graph is used to illustrate the evolutionary relationships among these DNNs and highlight the representative DNNs. A focus + context visualization is developed to orient users during their exploration. A set of network glyphs is used in the graph to facilitate the understanding and comparing of DNNs in the context of the evolution. Case studies demonstrate that DNN Genealogy provides helpful guidance in understanding, applying, and optimizing DNNs. DNN Genealogy is extensible and will continue to be updated to reflect future advances in DNNs.
25
Lechner M, Hasani R, Amini A, Henzinger TA, Rus D, Grosu R. Neural circuit policies enabling auditable autonomy. Nat Mach Intell 2020. DOI: 10.1038/s42256-020-00237-3.
26
Tree Crown Delineation Algorithm Based on a Convolutional Neural Network. Remote Sensing 2020. DOI: 10.3390/rs12081288.
Abstract
Tropical forests concentrate the largest diversity of species on the planet and play a key role in maintaining environmental processes. Due to the importance of these forests, there is growing interest in mapping their components and obtaining information at the individual-tree level to conduct reliable satellite-based forest inventory for biomass and species distribution qualification. Individual tree crown information could be gathered manually from high-resolution satellite images; however, to achieve this task at large scale, an algorithm that identifies and delineates each tree crown individually, with high accuracy, is a prerequisite. In this study, we propose the application of a convolutional neural network (the Mask R-CNN algorithm) to perform tree crown detection and delineation. The algorithm uses very high-resolution satellite images of tropical forests. The results obtained are promising: the Recall, Precision, and F1-score values obtained were 0.81, 0.91, and 0.86, respectively. In the study site, a total of 59,062 tree crowns were delineated. These results suggest that this algorithm can be used to assist the planning and conduct of forest inventories. As the algorithm is based on a deep learning approach, it can be systematically trained and used for other regions.
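For reference, the reported Recall, Precision, and F1 values relate to matched and unmatched crowns as in the sketch below; the counts used are invented purely to land in the same range as the reported scores.

```python
# Detection metrics from counts of matched (TP) and unmatched (FP, FN) crowns.
def detection_metrics(tp: int, fp: int, fn: int):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

print(detection_metrics(tp=810, fp=80, fn=190))   # ~ (0.91, 0.81, 0.86) with these made-up counts
```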
27
Hazarika S, Li H, Wang KC, Shen HW, Chou CS. NNVA: Neural Network Assisted Visual Analysis of Yeast Cell Polarization Simulation. IEEE Transactions on Visualization and Computer Graphics 2020;26:34-44. PMID: 31425114. DOI: 10.1109/tvcg.2019.2934591.
Abstract
Complex computational models are often designed to simulate real-world physical phenomena in many scientific disciplines. However, these simulation models tend to be computationally very expensive and involve a large number of simulation input parameters, which need to be analyzed and properly calibrated before the models can be applied for real scientific studies. We propose a visual analysis system to facilitate interactive exploratory analysis of the high-dimensional input parameter space of a complex yeast cell polarization simulation. The proposed system assists the computational biologists who designed the simulation model in visually calibrating the input parameters by modifying the parameter values and immediately visualizing the predicted simulation outcome, without needing to run the original expensive simulation for every instance. Our visual analysis system is driven by a trained neural-network-based surrogate model as the backend analysis framework. In this work, we demonstrate the advantage of using neural networks as surrogate models for visual analysis by incorporating some of the recent advances in uncertainty quantification, interpretability, and explainability of neural-network-based models. We utilize the trained network to perform interactive parameter sensitivity analysis of the original simulation as well as to recommend optimal parameter configurations using the activation maximization framework of neural networks. We also facilitate detailed analysis of the trained network to extract useful insights about the simulation model learned by the network during the training process. We performed two case studies and discovered multiple new parameter configurations that can trigger high cell polarization results in the original simulation model. We evaluated our results by comparing them with the original simulation model outcomes as well as the findings from a previous parameter analysis performed by our experts.
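The activation maximization step mentioned above can be sketched as gradient ascent on the surrogate network's input to maximize a chosen output; the surrogate architecture, input dimensionality, and value range below are illustrative assumptions, and the surrogate here is untrained.

```python
# Gradient-ascend the input of a surrogate network to maximize its predicted output.
import torch
import torch.nn as nn

surrogate = nn.Sequential(nn.Linear(35, 128), nn.ReLU(),
                          nn.Linear(128, 128), nn.ReLU(),
                          nn.Linear(128, 1))            # e.g., a predicted polarization score

def maximize_output(model: nn.Module, n_inputs: int, steps: int = 200, lr: float = 0.05):
    params = torch.rand(1, n_inputs, requires_grad=True)   # candidate parameter configuration
    optimizer = torch.optim.Adam([params], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        loss = -model(params).sum()                     # maximize output = minimize its negative
        loss.backward()
        optimizer.step()
        with torch.no_grad():
            params.clamp_(0.0, 1.0)                     # keep parameters in a normalized range
    return params.detach()

recommended = maximize_output(surrogate, n_inputs=35)
print(recommended.shape)                                # (1, 35) recommended configuration
```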
28
Hohman F, Park H, Robinson C, Polo Chau DH. Summit: Scaling Deep Learning Interpretability by Visualizing Activation and Attribution Summarizations. IEEE Transactions on Visualization and Computer Graphics 2020;26:1096-1106. PMID: 31443005. DOI: 10.1109/tvcg.2019.2934659.
Abstract
Deep learning is increasingly used in decision-making tasks. However, understanding how neural networks produce final predictions remains a fundamental challenge. Existing work on interpreting neural network predictions for images often focuses on explaining predictions for single images or neurons. As predictions are often computed from millions of weights that are optimized over millions of images, such explanations can easily miss a bigger picture. We present Summit, an interactive system that scalably and systematically summarizes and visualizes what features a deep learning model has learned and how those features interact to make predictions. Summit introduces two new scalable summarization techniques: (1) activation aggregation discovers important neurons, and (2) neuron-influence aggregation identifies relationships among such neurons. Summit combines these techniques to create the novel attribution graph that reveals and summarizes crucial neuron associations and substructures that contribute to a model's outcomes. Summit scales to large data, such as the ImageNet dataset with 1.2M images, and leverages neural network feature visualization and dataset examples to help users distill large, complex neural network models into compact, interactive visualizations. We present neural network exploration scenarios where Summit helps us discover multiple surprising insights into a prevalent, large-scale image classifier's learned representations and informs future neural network architecture design. The Summit visualization runs in modern web browsers and is open-sourced.
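A much-simplified sketch of the activation-aggregation idea is to average each channel's activation over many images of one class and keep the top-scoring channels; the model, layer choice, and data below are illustrative, and Summit's actual aggregation and attribution graph are more involved.

```python
# Average per-channel activations over a batch of one class and keep the top-k channels.
import torch
import torchvision.models as models

model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).eval()
layer = model.features[28]                # a late conv layer (hypothetical choice)

activations = []
hook = layer.register_forward_hook(lambda m, i, o: activations.append(o.detach()))

def top_channels_for_class(class_images: torch.Tensor, k: int = 5):
    """class_images: (N, 3, 224, 224) preprocessed images of a single class."""
    activations.clear()
    with torch.no_grad():
        model(class_images)
    acts = torch.cat(activations)                        # (N, C, H, W)
    channel_score = acts.mean(dim=(0, 2, 3))             # mean activation per channel
    return torch.topk(channel_score, k).indices.tolist()

images = torch.randn(8, 3, 224, 224)                     # stand-in batch; real use needs class images
print(top_channels_for_class(images))
hook.remove()
```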
29
Spinner T, Schlegel U, Schafer H, El-Assady M. explAIner: A Visual Analytics Framework for Interactive and Explainable Machine Learning. IEEE Transactions on Visualization and Computer Graphics 2020;26:1064-1074. PMID: 31442998. DOI: 10.1109/tvcg.2019.2934629.
Abstract
We propose a framework for interactive and explainable machine learning that enables users to (1) understand machine learning models, (2) diagnose model limitations using different explainable AI methods, and (3) refine and optimize the models. Our framework combines an iterative XAI pipeline with eight global monitoring and steering mechanisms, including quality monitoring, provenance tracking, model comparison, and trust building. To operationalize the framework, we present explAIner, a visual analytics system for interactive and explainable machine learning that instantiates all phases of the suggested pipeline within the commonly used TensorBoard environment. We performed a user study with nine participants across different expertise levels to examine their perception of our workflow and to collect suggestions for closing the gap between our system and the framework. The evaluation confirms that our tightly integrated system leads to an informed machine learning process while revealing opportunities for further extensions.
|
30
|
Survey of XAI in Digital Pathology. ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING FOR DIGITAL PATHOLOGY 2020. [DOI: 10.1007/978-3-030-50402-1_4] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/06/2023]
|
31
|
Legg P, Smith J, Downing A. Visual analytics for collaborative human-machine confidence in human-centric active learning tasks. HUMAN-CENTRIC COMPUTING AND INFORMATION SCIENCES 2019. [DOI: 10.1186/s13673-019-0167-8] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
Abstract
Active machine learning is a human-centric paradigm that leverages a small labelled dataset to build an initial weak classifier that can then be improved over time through human-machine collaboration. As new unlabelled samples are observed, the machine can either provide a prediction or query a human ‘oracle’ when it is not confident in its prediction. Of course, just as the machine may lack confidence, the same can be true of the human ‘oracle’: humans are not all-knowing, untiring oracles. A human’s ability to provide an accurate and confident response often varies between queries, according to the duration of the current interaction, their level of engagement with the system, and the difficulty of the labelling task. This poses an important question of how uncertainty can be expressed and accounted for in a human-machine collaboration. In short, how can we facilitate a mutually transparent collaboration between two uncertain actors, a person and a machine, that leads to an improved outcome? In this work, we demonstrate the benefit of human-machine collaboration within the active learning process, where limited data samples are available or labelling costs are high. To achieve this, we developed a visual analytics tool for active learning that promotes transparency, inspection, understanding, and trust in the learning process through human-machine collaboration. Confidence is fundamental to the tool: both parties can report their level of confidence during active learning tasks, and these reports are used to inform learning. Human confidence in labels can be accounted for by the machine, the machine can query for samples based on confidence measures, and the machine can report the confidence of its current predictions to the human, furthering trust and transparency between the collaborating parties. In particular, we find that this can improve the robustness of the classifier when incorrect sample labels are provided due to low confidence or fatigue. Reported confidences can also better inform human-machine sample selection in collaborative sampling. Our experiments compare the impact of different selection strategies for acquiring samples: machine-driven, human-driven, and collaborative selection. We demonstrate how a collaborative approach can improve trust in model robustness, achieving high accuracy and little user correction with only limited sample selections.
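Illustrative aside (not the authors' tool): a minimal sketch of confidence-aware querying and label weighting in pool-based active learning, assuming synthetic data, a fixed machine-confidence threshold, and a simulated human confidence score.

    # The machine queries the oracle only when unsure; each human label is weighted
    # by the reported human confidence when the classifier is retrained.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    X, y = make_classification(n_samples=500, n_features=10, random_state=0)
    labelled = list(range(20))                        # small initial labelled set
    weights = [1.0] * len(labelled)
    clf = LogisticRegression().fit(X[labelled], y[labelled])

    for i in range(20, len(X)):
        machine_conf = clf.predict_proba(X[i:i + 1]).max()
        if machine_conf < 0.7:                        # machine is unsure: ask the oracle
            human_conf = np.random.uniform(0.5, 1.0)  # stand-in for reported human confidence
            labelled.append(i)
            weights.append(human_conf)                # low-confidence labels count for less
            clf.fit(X[labelled], y[labelled], sample_weight=np.array(weights))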
|
32
|
33
|
Liu S, Li Z, Li T, Srikumar V, Pascucci V, Bremer PT. NLIZE: A Perturbation-Driven Visual Interrogation Tool for Analyzing and Interpreting Natural Language Inference Models. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 25:651-660. [PMID: 30188829 DOI: 10.1109/tvcg.2018.2865230] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
With recent advances in deep learning, neural network models have achieved state-of-the-art performance on many linguistic tasks in natural language processing. However, this rapid progress also brings enormous challenges: the opaque nature of a neural network model leads to hard-to-debug systems and difficult-to-interpret mechanisms. Here, we introduce a visualization system that, through a tight yet flexible integration between visualization elements and the underlying model, allows a user to interrogate the model by perturbing the input, internal state, and prediction while observing changes in other parts of the pipeline. We use the natural language inference problem as an example to illustrate how a perturbation-driven paradigm can help domain experts assess the potential limitations of a model, probe its inner states, and interpret and form hypotheses about fundamental model mechanisms such as attention.
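Illustrative aside (not NLIZE): a minimal sketch of the perturbation-driven idea, with a dummy function standing in for a trained NLI model. One hypothesis token is occluded at a time and the shift in the prediction distribution is recorded.

    import numpy as np

    def nli_model(premise_tokens, hypothesis_tokens):
        # Stand-in for a trained NLI model; returns P(entail, neutral, contradict).
        seed = abs(hash((tuple(premise_tokens), tuple(hypothesis_tokens)))) % 2**32
        p = np.random.default_rng(seed).random(3)
        return p / p.sum()

    premise = "a man plays a guitar on stage".split()
    hypothesis = "a person is making music".split()
    base = nli_model(premise, hypothesis)

    for i, token in enumerate(hypothesis):
        perturbed = hypothesis[:i] + hypothesis[i + 1:]            # delete one token
        shift = np.abs(nli_model(premise, perturbed) - base).sum() # L1 change in prediction
        print(f"{token:>10s} -> shift {shift:.3f}")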
|
34
|
Wang J, Gou L, Shen HW, Yang H. DQNViz: A Visual Analytics Approach to Understand Deep Q-Networks. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 25:288-298. [PMID: 30188823 DOI: 10.1109/tvcg.2018.2864504] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Deep Q-Network (DQN), a type of deep reinforcement learning model, aims to train an intelligent agent that learns optimal actions while interacting with an environment. The model is well known for surpassing professional human players across many Atari 2600 games. Despite this superhuman performance, understanding the model in depth and interpreting the sophisticated behaviors of the DQN agent remain challenging, due to the long model training process and the large number of experiences dynamically generated by the agent. In this work, we propose DQNViz, a visual analytics system that exposes details of the blind training process at four levels and enables users to dive into the agent's large experience space for comprehensive analysis. As an initial attempt at visualizing DQN models, our work focuses on Atari games with a simple action space, most notably the Breakout game. From our visual analytics of the agent's experiences, we extract useful action/reward patterns that help to interpret the model and control the training. Through multiple case studies conducted together with deep learning experts, we demonstrate that DQNViz can effectively help domain experts understand, diagnose, and potentially improve DQN models.
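Illustrative aside (not DQNViz, which visualizes rather than trains): a minimal sketch of the single Q-learning update a DQN agent performs on one sampled transition, with toy tensors standing in for the replay buffer and environment.

    import torch
    import torch.nn as nn

    n_obs, n_actions, gamma = 4, 2, 0.99
    q_net = nn.Sequential(nn.Linear(n_obs, 64), nn.ReLU(), nn.Linear(64, n_actions))
    target_net = nn.Sequential(nn.Linear(n_obs, 64), nn.ReLU(), nn.Linear(64, n_actions))
    target_net.load_state_dict(q_net.state_dict())
    optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

    # One stand-in transition (s, a, r, s', done) sampled from the replay buffer.
    s, a, r, s_next, done = torch.rand(1, n_obs), 1, 1.0, torch.rand(1, n_obs), False

    with torch.no_grad():
        td_target = r + (0.0 if done else gamma * target_net(s_next).max().item())
    q_sa = q_net(s)[0, a]                      # Q-value of the action actually taken
    loss = (q_sa - td_target) ** 2             # squared temporal-difference error
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()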
|
35
|
Kahng M, Thorat N, Chau DHP, Viegas FB, Wattenberg M. GAN Lab: Understanding Complex Deep Generative Models using Interactive Visual Experimentation. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 25:310-320. [PMID: 30130198 DOI: 10.1109/tvcg.2018.2864500] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Recent success in deep learning has generated immense interest among practitioners and students, inspiring many to learn about this new technology. While visual and interactive approaches have been successfully developed to help people learn deep learning more easily, most existing tools focus on simpler models. In this work, we present GAN Lab, the first interactive visualization tool designed for non-experts to learn about and experiment with Generative Adversarial Networks (GANs), a popular class of complex deep learning models. With GAN Lab, users can interactively train generative models and visualize the intermediate results of the dynamic training process. GAN Lab tightly integrates a model overview graph that summarizes the GAN's structure and a layered distributions view that helps users interpret the interplay between submodels. GAN Lab introduces new interactive experimentation features for learning complex deep learning models, such as step-by-step training at multiple levels of abstraction for understanding intricate training dynamics. Implemented with TensorFlow.js, GAN Lab is accessible to anyone via modern web browsers, without installation or specialized hardware, overcoming a major practical challenge in deploying interactive tools for deep learning.
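Illustrative aside (not GAN Lab, which runs in the browser with TensorFlow.js): a minimal sketch of the alternating generator/discriminator updates whose dynamics such a tool visualizes, using a toy 2-D data distribution and made-up network sizes.

    import torch
    import torch.nn as nn

    G = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 2))
    D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())
    opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
    opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
    bce = nn.BCELoss()

    for step in range(1000):
        real = torch.randn(64, 2) * 0.5 + 2.0                       # toy "real" distribution
        fake = G(torch.randn(64, 2))
        # Discriminator step: push real samples toward 1 and fakes toward 0.
        d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
        opt_d.zero_grad(); d_loss.backward(); opt_d.step()
        # Generator step: try to fool the discriminator.
        g_loss = bce(D(fake), torch.ones(64, 1))
        opt_g.zero_grad(); g_loss.backward(); opt_g.step()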
|
36
|
Ming Y, Qu H, Bertini E. RuleMatrix: Visualizing and Understanding Classifiers with Rules. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 25:342-352. [PMID: 30130210 DOI: 10.1109/tvcg.2018.2864812] [Citation(s) in RCA: 50] [Impact Index Per Article: 7.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
With the growing adoption of machine learning techniques, there is a surge of research interest in making machine learning systems more transparent and interpretable. Various visualizations have been developed to help model developers understand, diagnose, and refine machine learning models. However, a large group of potential but neglected users are domain experts who have little knowledge of machine learning yet are expected to work with machine learning systems. In this paper, we present an interactive visualization technique to help users with little machine learning expertise understand, explore, and validate predictive models. By viewing the model as a black box, we extract a standardized rule-based knowledge representation from its input-output behavior. We then design RuleMatrix, a matrix-based visualization of rules, to help users navigate and verify the rules and the black-box model. We evaluate the effectiveness of RuleMatrix via two use cases and a usability study.
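Illustrative aside (not RuleMatrix's full pipeline): a minimal sketch of the underlying idea of extracting a readable rule surrogate from a black-box classifier by fitting a shallow decision tree to its input-output behavior, assuming the iris data and a random forest as the model to be explained.

    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.tree import DecisionTreeClassifier, export_text

    X, y = load_iris(return_X_y=True)
    black_box = RandomForestClassifier(random_state=0).fit(X, y)   # model to be explained

    # Fit a shallow, readable surrogate to the black box's predictions, not the raw labels.
    surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
    surrogate.fit(X, black_box.predict(X))

    print(export_text(surrogate, feature_names=[
        "sepal length", "sepal width", "petal length", "petal width"]))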
|
37
|
Sacha D, Kraus M, Keim DA, Chen M. VIS4ML: An Ontology for Visual Analytics Assisted Machine Learning. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 25:385-395. [PMID: 30130221 DOI: 10.1109/tvcg.2018.2864838] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
While many VA workflows make use of machine-learned models to support analytical tasks, VA workflows have also become increasingly important for understanding and improving Machine Learning (ML) processes. In this paper, we propose an ontology (VIS4ML) for a subarea of VA, namely "VA-assisted ML". The purpose of VIS4ML is to describe and understand existing VA workflows used in ML, as well as to detect gaps in ML processes and the potential of introducing advanced VA techniques to such processes. Ontologies have been widely used to map out the scope of a topic in biology, medicine, and many other disciplines. We adopt established scholarly methodologies for constructing VIS4ML, including the specification, conceptualization, formalization, implementation, and validation of ontologies. In particular, we reinterpret the traditional VA pipeline to encompass model-development workflows. We introduce the necessary definitions, rules, syntaxes, and visual notations for formulating VIS4ML and make use of semantic web technologies to implement it in the Web Ontology Language (OWL). VIS4ML captures high-level knowledge about previous workflows in which VA is used to assist ML. It is consistent with established VA concepts and will continue to evolve along with future developments in VA and ML. While this ontology is an effort to build the theoretical foundation of VA, it can be used by practitioners in real-world applications to optimize model-development workflows by systematically examining the potential benefits that either machine or human capabilities can bring. Meanwhile, VIS4ML is intended to be extensible and will continue to be updated to reflect future advancements in using VA for building high-quality data-analytical models, or for building such models rapidly.
|
38
|
Hohman FM, Kahng M, Pienta R, Chau DH. Visual Analytics in Deep Learning: An Interrogative Survey for the Next Frontiers. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 25:10.1109/TVCG.2018.2843369. [PMID: 29993551 PMCID: PMC6703958 DOI: 10.1109/tvcg.2018.2843369] [Citation(s) in RCA: 103] [Impact Index Per Article: 14.7] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/20/2023]
Abstract
Deep learning has recently seen rapid development and received significant attention due to its state-of-the-art performance on problems previously considered hard. However, because of the internal complexity and nonlinear structure of deep neural networks, the decision-making processes underlying this performance are challenging, and sometimes mystifying, to interpret. As deep learning spreads across domains, it is of paramount importance that we equip its users with tools for understanding when a model works correctly, when it fails, and ultimately how to improve its performance. Standardized toolkits for building neural networks have helped democratize deep learning; visual analytics systems have now been developed to support model explanation, interpretation, debugging, and improvement. We present a survey of the role of visual analytics in deep learning research, which highlights its short yet impactful history and thoroughly summarizes the state of the art using a human-centered interrogative framework, focusing on the Five W's and How (Why, Who, What, How, When, and Where). We conclude by highlighting research directions and open research problems. This survey helps researchers and practitioners in both visual analytics and deep learning quickly learn key aspects of this young and rapidly growing body of research, whose impact spans a diverse range of domains.
|
39
|
Olah C, Satyanarayan A, Johnson I, Carter S, Schubert L, Ye K, Mordvintsev A. The Building Blocks of Interpretability. ACTA ACUST UNITED AC 2018. [DOI: 10.23915/distill.00010] [Citation(s) in RCA: 219] [Impact Index Per Article: 31.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]
|
40
|
Visually-Enabled Active Deep Learning for (Geo) Text and Image Classification: A Review. ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION 2018. [DOI: 10.3390/ijgi7020065] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
|