1
Chavlis S, Poirazi P. Dendrites endow artificial neural networks with accurate, robust and parameter-efficient learning. Nat Commun 2025; 16:943. [PMID: 39843414] [PMCID: PMC11754790] [DOI: 10.1038/s41467-025-56297-9]
Abstract
Artificial neural networks (ANNs) are at the core of most Deep Learning (DL) algorithms that successfully tackle complex problems like image recognition, autonomous driving, and natural language processing. However, unlike biological brains, which tackle similar problems very efficiently, DL algorithms require a large number of trainable parameters, making them energy-intensive and prone to overfitting. Here, we show that a new ANN architecture that incorporates the structured connectivity and restricted sampling properties of biological dendrites counteracts these limitations. We find that dendritic ANNs are more robust to overfitting and match or outperform traditional ANNs on several image classification tasks while using significantly fewer trainable parameters. These advantages are likely the result of a different learning strategy, whereby most of the nodes in dendritic ANNs respond to multiple classes, unlike classical ANNs that strive for class specificity. Our findings suggest that incorporating dendritic properties can make learning in ANNs more precise, resilient, and parameter-efficient, and they shed new light on how biological features can impact the learning strategies of ANNs.
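The restricted-sampling idea can be illustrated with a toy mask-based layer (a hypothetical sketch for intuition, not the authors' implementation): each dendritic node connects to only a small random subset of the inputs, so the number of trainable weights shrinks accordingly.

```python
import numpy as np

rng = np.random.default_rng(0)

def dendritic_mask(n_inputs, n_dendrites, k, rng):
    """Binary connectivity mask: each dendritic node samples only k of
    the n_inputs (restricted sampling), instead of all-to-all wiring."""
    mask = np.zeros((n_dendrites, n_inputs))
    for d in range(n_dendrites):
        idx = rng.choice(n_inputs, size=k, replace=False)
        mask[d, idx] = 1.0
    return mask

n_inputs, n_dendrites, k = 64, 16, 8
mask = dendritic_mask(n_inputs, n_dendrites, k, rng)
weights = rng.standard_normal((n_dendrites, n_inputs)) * mask  # non-sampled weights stay zero
x = rng.standard_normal(n_inputs)
h = np.maximum(0.0, weights @ x)  # dendritic activations (ReLU)

# Parameter count: k weights per node instead of n_inputs per node.
dense_params = n_dendrites * n_inputs      # 1024
dendritic_params = n_dendrites * k         # 128
```

Here the eightfold reduction in incoming weights per node stands in for the parameter efficiency the paper reports; the actual architecture, somatic pooling, and training procedure are described in the paper itself.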
Affiliation(s)
- Spyridon Chavlis
- Institute of Molecular Biology and Biotechnology, Foundation for Research and Technology-Hellas, Heraklion, Crete, Greece
- Panayiota Poirazi
- Institute of Molecular Biology and Biotechnology, Foundation for Research and Technology-Hellas, Heraklion, Crete, Greece.
2
Lohmann F, Allenspach S, Atz K, Schiebroek CCG, Hiss JA, Schneider G. Protein Binding Site Representation in Latent Space. Mol Inform 2025; 44:e202400205. [PMID: 39692081] [PMCID: PMC11733832] [DOI: 10.1002/minf.202400205]
Abstract
Interpretability and reliability of deep learning models are important for computer-based drug discovery. Aiming to understand feature perception by such a model, we investigate a graph neural network for affinity prediction of protein-ligand complexes. We assess a latent representation of ligand binding sites and investigate underlying geometric structure in this latent space and its relation to protein function. We introduce an automated computational pipeline for dimensionality reduction, clustering, hypothesis testing, and visualization of latent space. The results indicate that the learned protein latent space is inherently structured and not randomly distributed. Several of the identified protein binding site clusters in latent space correspond to functional protein families. Ligand size was found to be a determinant of cluster geometry. The computational pipeline proved applicable to latent space analysis and interpretation and can be adapted to work for different datasets and deep learning models.
Affiliation(s)
- Frederieke Lohmann
- Department of Chemistry and Applied Biosciences, ETH Zurich, Vladimir-Prelog-Weg 4, 8093 Zürich, Switzerland
- Stephan Allenspach
- Department of Chemistry and Applied Biosciences, ETH Zurich, Vladimir-Prelog-Weg 4, 8093 Zürich, Switzerland
- Kenneth Atz
- Department of Chemistry and Applied Biosciences, ETH Zurich, Vladimir-Prelog-Weg 4, 8093 Zürich, Switzerland
- Carl C. G. Schiebroek
- Department of Chemistry and Applied Biosciences, ETH Zurich, Vladimir-Prelog-Weg 4, 8093 Zürich, Switzerland
- Jan A. Hiss
- Department of Chemistry and Applied Biosciences, ETH Zurich, Vladimir-Prelog-Weg 4, 8093 Zürich, Switzerland
- Department of Biosystems Science and Engineering, ETH Zurich, Klingelbergstrasse 48, 4056 Basel, Switzerland
- Gisbert Schneider
- Department of Chemistry and Applied Biosciences, ETH Zurich, Vladimir-Prelog-Weg 4, 8093 Zürich, Switzerland
- Department of Biosystems Science and Engineering, ETH Zurich, Klingelbergstrasse 48, 4056 Basel, Switzerland
3
Li G, Wang J, Wang Y, Shan G, Zhao Y. An In-Situ Visual Analytics Framework for Deep Neural Networks. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2024; 30:6770-6786. [PMID: 38051629] [DOI: 10.1109/tvcg.2023.3339585]
Abstract
The past decade has witnessed the superior power of deep neural networks (DNNs) in applications across various domains. However, training a high-quality DNN remains a non-trivial task due to its massive number of parameters. Visualization has shown great potential in addressing this situation, as evidenced by numerous recent visualization works that aid in DNN training and interpretation. These works commonly employ a strategy of logging training-related data and conducting post-hoc analysis. Based on the results of offline analysis, the model can be further trained or fine-tuned. This strategy, however, does not cope with the increasing complexity of DNNs, because (1) the time-series data collected over training are usually too large to be stored entirely; (2) the huge I/O overhead significantly impacts training efficiency; (3) post-hoc analysis does not allow rapid human intervention (e.g., stopping training with improper hyper-parameter settings to save computational resources). To address these challenges, we propose an in-situ visualization and analysis framework for the training of DNNs. Specifically, we employ feature extraction algorithms to reduce the size of training-related data in-situ and use the reduced data for real-time visual analytics. The states of model training are disclosed to model designers in real-time, enabling human intervention on demand to steer the training. Through concrete case studies, we demonstrate how our in-situ framework helps deep learning experts optimize DNNs and improve their analysis efficiency.
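One way to reduce training-related data in-situ, as the abstract describes, is to replace full activation logs with streaming summaries. A minimal sketch, assuming per-neuron mean/variance is the statistic of interest (the paper's actual feature-extraction algorithms may differ):

```python
import numpy as np

class RunningSummary:
    """Streaming per-neuron mean/variance via Welford's algorithm, so
    only O(width) state is kept instead of every activation tensor."""
    def __init__(self, width):
        self.n = 0
        self.mean = np.zeros(width)
        self.m2 = np.zeros(width)

    def update(self, batch):
        """batch: (batch_size, width) activations from one training step."""
        for row in batch:
            self.n += 1
            delta = row - self.mean
            self.mean += delta / self.n
            self.m2 += delta * (row - self.mean)

    def variance(self):
        """Unbiased per-neuron variance of everything seen so far."""
        return self.m2 / max(self.n - 1, 1)

# Simulated activations streamed over five training steps.
rng = np.random.default_rng(0)
summary = RunningSummary(width=32)
data = rng.standard_normal((50, 32))
for step in range(5):
    summary.update(data[step * 10:(step + 1) * 10])
```

After the loop, `summary` matches the batch statistics of the full log while storing only two vectors and a counter, which is the kind of I/O saving the framework targets.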
4
Segev A, Jung S. Common knowledge processing patterns in networks of different systems. PLoS One 2023; 18:e0290326. [PMID: 37796927] [PMCID: PMC10553345] [DOI: 10.1371/journal.pone.0290326]
Abstract
Knowledge processing has patterns which can be found in biological neuron activity and artificial neural networks. The work explores whether an underlying structure exists for knowledge that crosses domains. The results show common data processing patterns in biological systems and human-made knowledge-based systems, present examples of human-generated knowledge processing systems, such as artificial neural networks and research topic knowledge networks, and explore the change of system patterns over time. The work analyzes nature-based systems, namely animal connectomes, and observes neuron circuitry of knowledge processing based on the complexity of the knowledge processing system. The variety of domains and the similarity in processing mechanisms raise the question: if pattern-based knowledge processing is common in natural and artificial systems, how unique is knowledge processing in humans?
Affiliation(s)
- Aviv Segev
- Department of Computer Science, University of South Alabama, Mobile, AL, United States of America
- Sukhwan Jung
- Department of Computer Science, University of South Alabama, Mobile, AL, United States of America
5
Moon J, Posada-Quintero HF, Chon KH. Genetic data visualization using literature text-based neural networks: Examples associated with myocardial infarction. Neural Netw 2023; 165:562-595. [PMID: 37364469] [DOI: 10.1016/j.neunet.2023.05.015]
Abstract
Data visualization is critical to unraveling hidden information from complex and high-dimensional data. Interpretable visualization methods are especially important in the biology and medical fields; however, there are few effective visualization methods for large genetic data. Current visualization methods are limited to lower-dimensional data, and their performance suffers if there is missing data. In this study, we propose a literature-based visualization method to reduce high-dimensional data without compromising the dynamics of single nucleotide polymorphism (SNP) data or textual interpretability. Our method is innovative because it is shown to (1) preserve both global and local structures of SNP data while reducing the dimension of the data using literature text representations, and (2) enable interpretable visualizations using textual information. For performance evaluation, we examined the proposed approach on several classification tasks, including race, myocardial infarction event age groups, and sex, using several machine learning models on the literature-derived SNP data. We used visualization approaches to examine the clustering of the data, as well as quantitative performance metrics for the classification of the risk factors examined above. Our method outperformed all popular dimensionality reduction and visualization methods for both classification and visualization, and it is robust against missing and higher-dimensional data. Moreover, we found it feasible to incorporate both genetic and other risk information obtained from the literature with our method.
Affiliation(s)
- Jihye Moon
- Department of Biomedical Engineering, University of Connecticut, Storrs, CT 06269, USA.
- Ki H Chon
- Department of Biomedical Engineering, University of Connecticut, Storrs, CT 06269, USA.
6
Collaris D, van Wijk JJ. StrategyAtlas: Strategy Analysis for Machine Learning Interpretability. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:2996-3008. [PMID: 35085084] [DOI: 10.1109/tvcg.2022.3146806]
Abstract
Businesses in high-risk environments have been reluctant to adopt modern machine learning approaches due to their complex and uninterpretable nature. Most current solutions provide local, instance-level explanations, but this is insufficient for understanding the model as a whole. In this work, we show that strategy clusters (i.e., groups of data instances that are treated distinctly by the model) can be used to understand the global behavior of a complex ML model. To support effective exploration and understanding of these clusters, we introduce StrategyAtlas, a system designed to analyze and explain model strategies. Furthermore, it supports multiple ways to utilize these strategies for simplifying and improving the reference model. In collaboration with a large insurance company, we present a use case in automatic insurance acceptance, and show how professional data scientists were enabled to understand a complex model and improve the production model based on these insights.
7
Borys K, Schmitt YA, Nauta M, Seifert C, Krämer N, Friedrich CM, Nensa F. Explainable AI in medical imaging: An overview for clinical practitioners – Beyond saliency-based XAI approaches. Eur J Radiol 2023; 162:110786. [PMID: 36990051] [DOI: 10.1016/j.ejrad.2023.110786]
Abstract
Driven by recent advances in Artificial Intelligence (AI) and Computer Vision (CV), the implementation of AI systems in the medical domain has increased correspondingly. This is especially true for medical imaging, in which AI aids several imaging-based tasks such as classification, segmentation, and registration. Moreover, AI reshapes medical research and contributes to the development of personalized clinical care. Consequently, alongside its extended implementation arises the need for an extensive understanding of AI systems and their inner workings, potentials, and limitations, which is the aim of the field of eXplainable AI (XAI). Because medical imaging is mainly associated with visual tasks, most explainability approaches incorporate saliency-based XAI methods. In contrast, this article investigates the full potential of XAI methods in the field of medical imaging by specifically focusing on XAI techniques that do not rely on saliency, and by providing diversified examples. We address a broad audience, but particularly healthcare professionals. Moreover, this work aims at establishing a common ground for understanding and exchange between Deep Learning (DL) builders and healthcare professionals, which is why we aimed for a non-technical overview. The presented XAI methods are divided by output representation into the following categories: case-based explanations, textual explanations, and auxiliary explanations.
8
Vlahek D, Mongus D. An Efficient Iterative Approach to Explainable Feature Learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2023; 34:2606-2618. [PMID: 34478388] [DOI: 10.1109/tnnls.2021.3107049]
Abstract
This article introduces a new iterative approach to explainable feature learning. During each iteration, new features are generated, first by applying arithmetic operations to the input set of features. These are then evaluated in terms of probability distribution agreement between values of samples belonging to different classes. Finally, a graph-based approach for feature selection is proposed, which allows for selecting high-quality and uncorrelated features to be used in feature generation during the next iteration. As shown by the results, the proposed method improved the accuracy of all tested classifiers, with the best accuracies achieved using random forest. In addition, the method turned out to be insensitive to both input parameters, outperformed the state of the art on nine out of 15 test sets, and achieved comparable results on the others. Finally, we demonstrate the explainability of the learned feature representation for knowledge discovery.
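The generate-evaluate-select loop described above can be sketched as follows. This is a simplified stand-in: a Fisher-style separation score replaces the paper's probability-distribution-agreement test, and a greedy correlation filter replaces its graph-based selection.

```python
import numpy as np

def generate_features(X):
    """One generation step: apply arithmetic operations (here, pairwise
    sums and products) to the current feature set."""
    n, d = X.shape
    cols = [X]
    for i in range(d):
        for j in range(i + 1, d):
            cols.append((X[:, i] + X[:, j])[:, None])
            cols.append((X[:, i] * X[:, j])[:, None])
    return np.hstack(cols)

def separation_score(X, y):
    """Per-feature class separation: |mean difference| / pooled std
    (a stand-in for the paper's distribution-agreement evaluation)."""
    a, b = X[y == 0], X[y == 1]
    pooled = np.sqrt(a.var(axis=0) + b.var(axis=0)) + 1e-12
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / pooled

def select_uncorrelated(X, scores, k, max_corr=0.95):
    """Greedy stand-in for the graph-based selection: keep high-scoring
    features that are not strongly correlated with ones already kept."""
    kept = []
    for idx in np.argsort(scores)[::-1]:
        if all(abs(np.corrcoef(X[:, idx], X[:, j])[0, 1]) < max_corr for j in kept):
            kept.append(int(idx))
        if len(kept) == k:
            break
    return kept

# One iteration: the class depends on x0*x1, which only a generated
# product feature captures (column 4 after generation).
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 3))
y = (X[:, 0] * X[:, 1] > 0).astype(int)
F = generate_features(X)          # 3 original + 6 generated = 9 features
scores = separation_score(F, y)
kept = select_uncorrelated(F, scores, k=2)
```

The selected features would then seed the next generation round, mirroring the iteration the abstract describes.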
9
Elzen SVD, Andrienko G, Andrienko N, Fisher BD, Martins RM, Peltonen J, Telea AC, Verleysen M, Rhyne TM. The Flow of Trust: A Visualization Framework to Externalize, Explore, and Explain Trust in ML Applications. IEEE COMPUTER GRAPHICS AND APPLICATIONS 2023; 43:78-88. [PMID: 37030833] [DOI: 10.1109/mcg.2023.3237286]
Abstract
We present a conceptual framework for the development of visual interactive techniques to formalize and externalize trust in machine learning (ML) workflows. Currently, trust in ML applications is an implicit process that takes place in the user's mind. As such, there is no method of feedback or communication of trust that can be acted upon. Our framework will be instrumental in developing interactive visualization approaches that will help users to efficiently and effectively build and communicate trust in ways that fit each of the ML process stages. We formulate several research questions and directions that include: 1) a typology/taxonomy of trust objects, trust issues, and possible reasons for (mis)trust; 2) formalisms to represent trust in machine-readable form; 3) means by which users can express their state of trust by interacting with a computer system (e.g., text, drawing, marking); 4) ways in which a system can facilitate users' expression and communication of the state of trust; and 5) creation of visual interactive techniques for representation and exploration of trust over all stages of an ML pipeline.
10
Chen L, Ouyang Y, Zhang H, Hong S, Li Q. RISeer: Inspecting the Status and Dynamics of Regional Industrial Structure via Visual Analytics. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:1070-1080. [PMID: 36155450] [DOI: 10.1109/tvcg.2022.3209351]
Abstract
Restructuring the regional industrial structure (RIS) has the potential to halt economic recession and achieve revitalization. Understanding the current status and dynamics of RIS greatly assists in studying and evaluating the current industrial structure. Previous studies have focused on qualitative and quantitative research to rationalize RIS from a macroscopic perspective. Although recent studies have traced information at the industrial enterprise level to complement existing research from a micro perspective, the ambiguity of the underlying variables contributing to the industrial sector and its composition, the dynamic nature of RIS, and the large number of multivariate features of RIS records have obscured a deep and fine-grained understanding of RIS. To this end, we propose an interactive visualization system, RISeer, which is based on interpretable machine learning models and enhanced visualizations designed to identify the evolutionary patterns of the RIS and facilitate inter-regional inspection and comparison. Two case studies confirm the effectiveness of our approach, and feedback from experts indicates that RISeer helps them gain a fine-grained understanding of the dynamics and evolution of the RIS.
11
Zhang C, Wang X, Zhao C, Ren Y, Zhang T, Peng Z, Fan X, Ma X, Li Q. PromotionLens: Inspecting Promotion Strategies of Online E-commerce via Visual Analytics. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:767-777. [PMID: 36155462] [DOI: 10.1109/tvcg.2022.3209440]
Abstract
Promotions are commonly used by e-commerce merchants to boost sales. Understanding the efficacy of different promotion strategies can help sellers adapt their offering to customer demand in order to survive and thrive. Current approaches to designing promotion strategies are either based on econometrics, which may not scale to large amounts of sales data, or are ad hoc and provide little explanation of sales volume. Moreover, accurately measuring the effects of promotion designs and making bootstrappable adjustments accordingly remains a challenge due to the incompleteness and complexity of the information describing promotion strategies and their market environments. We present PromotionLens, a visual analytics system for exploring, comparing, and modeling the impact of various promotion strategies. Our approach combines representative multivariate time-series forecasting models and well-designed visualizations to demonstrate and explain the impact of sales and promotional factors, and to support "what-if" analysis of promotions. Two case studies, expert feedback, and a qualitative user study demonstrate the efficacy of PromotionLens.
12
Ye Z, Chen M. Visualizing Ensemble Predictions of Music Mood. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:864-874. [PMID: 36170399] [DOI: 10.1109/tvcg.2022.3209379]
Abstract
Music mood classification has been a challenging problem in comparison with other music classification problems (e.g., genre, composer, or period). One solution for addressing this challenge is to use an ensemble of machine learning models. In this paper, we show that visualization techniques can effectively convey the most popular prediction as well as uncertainty at different music sections along the temporal axis, while enabling the analysis of individual ML models in conjunction with their application to different musical data. In addition to traditional visual designs, such as the stacked line graph, ThemeRiver, and pixel-based visualization, we introduce a new variant of ThemeRiver, called "dual-flux ThemeRiver", which allows viewers to observe and measure the most popular prediction more easily than the stacked line graph and ThemeRiver. Together with pixel-based visualization, dual-flux ThemeRiver plots can also assist in model-development workflows, in addition to annotating music using ensemble model predictions.
13
Yuan J, Liu M, Tian F, Liu S. Visual Analysis of Neural Architecture Spaces for Summarizing Design Principles. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:288-298. [PMID: 36191103] [DOI: 10.1109/tvcg.2022.3209404]
Abstract
Recent advances in artificial intelligence largely benefit from better neural network architectures. These architectures are a product of a costly process of trial-and-error. To ease this process, we develop ArchExplorer, a visual analysis method for understanding a neural architecture space and summarizing design principles. The key idea behind our method is to make the architecture space explainable by exploiting structural distances between architectures. We formulate the pairwise distance calculation as solving an all-pairs shortest path problem. To improve efficiency, we decompose this problem into a set of single-source shortest path problems. The time complexity is reduced from O(kn²N) to O(knN). Architectures are hierarchically clustered according to the distances between them. A circle-packing-based architecture visualization has been developed to convey both the global relationships between clusters and local neighborhoods of the architectures in each cluster. Two case studies and a post-analysis are presented to demonstrate the effectiveness of ArchExplorer in summarizing design principles and selecting better-performing architectures.
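The decomposition mentioned above, one single-source shortest-path run per source instead of a monolithic all-pairs computation, can be sketched with Dijkstra's algorithm. The toy graph and edge weights here are illustrative, not ArchExplorer's actual architecture-edit distances.

```python
import heapq

def dijkstra(adj, src):
    """Single-source shortest paths on a weighted graph given as
    {node: [(neighbor, weight), ...]}."""
    dist = {v: float("inf") for v in adj}
    dist[src] = 0.0
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def all_pairs(adj):
    """All-pairs distances as one single-source run per node, mirroring
    the decomposition described in the abstract."""
    return {src: dijkstra(adj, src) for src in adj}

# Tiny undirected example: the cheapest 0->2 route goes through 1.
adj = {0: [(1, 1.0), (2, 4.0)],
       1: [(0, 1.0), (2, 2.0)],
       2: [(0, 4.0), (1, 2.0)]}
d = all_pairs(adj)
```

Running one Dijkstra per source is what turns the all-pairs problem into N independent single-source problems, the efficiency step the authors highlight.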
14
Bertucci D, Hamid MM, Anand Y, Ruangrotsakun A, Tabatabai D, Perez M, Kahng M. DendroMap: Visual Exploration of Large-Scale Image Datasets for Machine Learning with Treemaps. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:320-330. [PMID: 36166545] [DOI: 10.1109/tvcg.2022.3209425]
Abstract
In this paper, we present DendroMap, a novel approach to interactively exploring large-scale image datasets for machine learning (ML). ML practitioners often explore image datasets by generating a grid of images or projecting high-dimensional representations of images into 2-D using dimensionality reduction techniques (e.g., t-SNE). However, neither approach effectively scales to large datasets because images are ineffectively organized and interactions are insufficiently supported. To address these challenges, we develop DendroMap by adapting Treemaps, a well-known visualization technique. DendroMap effectively organizes images by extracting hierarchical cluster structures from high-dimensional representations of images. It enables users to make sense of the overall distributions of datasets and interactively zoom into specific areas of interest at multiple levels of abstraction. Our case studies with widely-used image datasets for deep learning demonstrate that users can discover insights about datasets and trained models by examining the diversity of images, identifying underperforming subgroups, and analyzing classification errors. We conducted a user study that evaluates the effectiveness of DendroMap in grouping and searching tasks by comparing it with a gridified version of t-SNE and found that participants preferred DendroMap. DendroMap is available at https://div-lab.github.io/dendromap/.
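The hierarchical grouping that drives the treemap can be illustrated with a naive agglomerative pass over toy 2-D "embeddings" (single linkage, quadratic-time; DendroMap's actual clustering of deep-feature representations is described in the paper).

```python
import numpy as np

def agglomerative(points, n_clusters):
    """Naive single-linkage agglomerative clustering: start with one
    cluster per point, repeatedly merge the closest pair of clusters."""
    clusters = [[i] for i in range(len(points))]
    # Pairwise point distances, computed once up front.
    D = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    while len(clusters) > n_clusters:
        best, pair = np.inf, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between closest members.
                d = min(D[i, j] for i in clusters[a] for j in clusters[b])
                if d < best:
                    best, pair = d, (a, b)
        a, b = pair
        clusters[a] += clusters[b]
        del clusters[b]
    return clusters

# Two obvious blobs of toy "image embeddings".
points = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
clusters = agglomerative(points, n_clusters=2)
```

Recording the merge order instead of stopping at a fixed cluster count yields the full hierarchy that a treemap can then lay out level by level.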
15
Wu A, Deng D, Cheng F, Wu Y, Liu S, Qu H. In Defence of Visual Analytics Systems: Replies to Critics. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:1026-1036. [PMID: 36179000] [DOI: 10.1109/tvcg.2022.3209360]
Abstract
The last decade has witnessed many visual analytics (VA) systems that make successful applications to wide-ranging domains like urban analytics and explainable AI. However, their research rigor and contributions have been extensively challenged within the visualization community. We come in defence of VA systems by contributing two interview studies that gather criticisms and responses to them. First, we interview 24 researchers to collect criticisms from the review comments on their VA work. Through an iterative coding and refinement process, the interview feedback is summarized into a list of 36 common criticisms. Second, we interview 17 researchers to validate our list and collect their responses, thereby discussing implications for defending and improving the scientific values and rigor of VA systems. We highlight that the presented knowledge is deep and extensive, but also imperfect, provocative, and controversial, and thus recommend reading with an inclusive and critical eye. We hope our work can provide thoughts and foundations for conducting VA research and spark discussions that move the research field forward more rigorously and vibrantly.
16
Wang J, Zhang W, Yang H, Yeh CCM, Wang L. Visual Analytics for RNN-Based Deep Reinforcement Learning. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2022; 28:4141-4155. [PMID: 33929961] [DOI: 10.1109/tvcg.2021.3076749]
Abstract
Deep reinforcement learning (DRL) aims to train an autonomous agent to interact with a pre-defined environment and strives to achieve specific goals through deep neural networks (DNNs). Recurrent neural network (RNN) based DRL has demonstrated superior performance, as RNNs can effectively capture the temporal evolution of the environment and respond with proper agent actions. However, apart from the outstanding performance, little is known about how RNNs understand the environment internally and what has been memorized over time. Revealing these details is extremely important for deep learning experts to understand and improve DRL models, yet doing so is also challenging due to the complicated data transformations inside these models. In this article, we propose the Deep Reinforcement Learning Interactive Visual Explorer (DRLIVE), a visual analytics system to effectively explore, interpret, and diagnose RNN-based DRL. Focusing on DRL agents trained for different Atari games, DRLIVE accomplishes three tasks: game episode exploration, RNN hidden/cell state examination, and interactive model perturbation. Using the system, one can flexibly explore a DRL agent through interactive visualizations, discover interpretable RNN cells by prioritizing RNN hidden/cell states with a set of metrics, and further diagnose the DRL model by interactively perturbing its inputs. Through concrete studies with multiple deep learning experts, we validated the efficacy of DRLIVE.
17
Li Q, Wei X, Lin H, Liu Y, Chen T, Ma X. Inspecting the Running Process of Horizontal Federated Learning via Visual Analytics. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2022; 28:4085-4100. [PMID: 33872152] [DOI: 10.1109/tvcg.2021.3074010]
Abstract
As a decentralized training approach, horizontal federated learning (HFL) enables distributed clients to collaboratively learn a machine learning model while keeping personal/private information on local devices. Despite the enhanced performance and efficiency of HFL over local training, clues for inspecting the behaviors of the participating clients and the federated model are usually lacking due to the privacy-preserving nature of HFL. Consequently, the users can only conduct a shallow-level analysis of potential abnormal behaviors and have limited means to assess the contributions of individual clients and implement the necessary intervention. Visualization techniques have been introduced to facilitate the HFL process inspection, usually by providing model metrics and evaluation results as a dashboard representation. Although the existing visualization methods allow a simple examination of the HFL model performance, they cannot support the intensive exploration of the HFL process. In this article, strictly following the HFL privacy-preserving protocol, we design an exploratory visual analytics system for the HFL process termed HFLens, which supports comparative visual interpretation at the overview, communication round, and client instance levels. Specifically, the proposed system facilitates the investigation of the overall process involving all clients, the correlation analysis of clients' information in one or different communication round(s), the identification of potential anomalies, and the contribution assessment of each HFL client. Two case studies confirm the efficacy of our system. Experts' feedback suggests that our approach indeed helps in understanding and diagnosing the HFL process better.
18
Effect of display platforms on spatial knowledge acquisition and engagement: an evaluation with 3D geometry visualizations. J Vis (Tokyo) 2022. [DOI: 10.1007/s12650-022-00889-w]
19
Getting over High-Dimensionality: How Multidimensional Projection Methods Can Assist Data Science. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12136799]
Abstract
The exploration and analysis of multidimensional data can be complex tasks, requiring sophisticated tools able to transform large amounts of data bearing multiple parameters into helpful information. Multidimensional projection techniques are powerful tools for transforming multidimensional data into visual information according to similarity features. Integrating this class of methods into a framework devoted to data science can contribute to generating more expressive means of visual analytics. Although Principal Component Analysis (PCA) is a well-known method in this context, it is not the only one, and, sometimes, its abilities and limitations are not adequately discussed or taken into consideration by users. Therefore, in-depth knowledge of multidimensional projection techniques, their strengths, and the possible distortions they can create is of significant importance for researchers developing knowledge-discovery systems. This research presents a comprehensive overview of current state-of-the-art multidimensional projection techniques and shows example code in Python and R, all available online. The survey segment discusses the different types of techniques applied to multidimensional projection tasks, from their background, application processes, capabilities, and limitations, opening up the internal processes of the methods and demystifying their concepts. We also illustrate two problems, from a genetic experiment (supervised) and text mining (unsupervised), presenting solutions through the application of multidimensional projection. Finally, we present elements that underscore the suitability of multidimensional projection techniques for high-dimensional data visualization, commonly needed in data science solutions.
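As a minimal instance of the class of methods surveyed, PCA can be written in a few lines via SVD of the mean-centered data (NumPy here; the article's own examples are in Python and R, and its point is precisely that PCA is one projection among many).

```python
import numpy as np

def pca_project(X, k=2):
    """Project rows of X onto the top-k principal components, computed
    from the SVD of the mean-centered data."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T  # 2-D coordinates for plotting

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))   # 200 samples, 10 dimensions
Y = pca_project(X, k=2)              # 200 samples, 2 dimensions
```

The first projected axis carries the most variance by construction, which is also the source of PCA's limitations the survey discusses: nonlinear structure is flattened onto linear axes.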
Collapse
|
20
|
Dai T, Arulkumaran K, Gerbert T, Tukra S, Behbahani F, Bharath AA. Analysing deep reinforcement learning agents trained with domain randomisation. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.04.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
|
21
|
Zhang G. Research on safety simulation model and algorithm of dynamic system based on artificial neural network. Soft comput 2022. [DOI: 10.1007/s00500-022-07299-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
|
22
|
Xuan X, Zhang X, Kwon OH, Ma KL. VAC-CNN: A Visual Analytics System for Comparative Studies of Deep Convolutional Neural Networks. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2022; 28:2326-2337. [PMID: 35389868 DOI: 10.1109/tvcg.2022.3165347] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
The rapid development of Convolutional Neural Networks (CNNs) in recent years has triggered significant breakthroughs in many machine learning (ML) applications. The ability to understand and compare the various CNN models available is thus essential. The conventional approach of visualizing each model's quantitative features, such as classification accuracy and computational complexity, is not sufficient for a deeper understanding and comparison of the behaviors of different models. Moreover, most existing tools for assessing CNN behaviors only support comparison between two models and lack the flexibility to customize the analysis tasks according to user needs. This paper presents a visual analytics system, VAC-CNN (Visual Analytics for Comparing CNNs), that supports the in-depth inspection of a single CNN model as well as comparative studies of two or more models. The ability to compare a larger number of (e.g., tens of) models especially distinguishes our system from previous ones. With carefully designed model visualization and explanation support, VAC-CNN facilitates a highly interactive workflow that promptly presents both quantitative and qualitative information at each analysis stage. We demonstrate VAC-CNN's effectiveness in assisting novice ML practitioners in evaluating and comparing multiple CNN models through two use cases and one preliminary evaluation study using image classification tasks on the ImageNet dataset.
Collapse
|
23
|
Jamonnak S, Zhao Y, Huang X, Amiruzzaman M. Geo-Context Aware Study of Vision-Based Autonomous Driving Models and Spatial Video Data. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2022; 28:1019-1029. [PMID: 34596546 DOI: 10.1109/tvcg.2021.3114853] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
Vision-based deep learning (DL) methods have made great progress in learning autonomous driving models from large-scale crowd-sourced video datasets. They are trained to predict instantaneous driving behaviors from video data captured by on-vehicle cameras. In this paper, we develop a geo-context aware visualization system for the study of Autonomous Driving Model (ADM) predictions together with large-scale ADM video data. The visual study is seamlessly integrated with the geographical environment by combining DL model performance with geospatial visualization techniques. Model performance measures can be studied together with a set of geospatial attributes over map views. Users can also discover and compare prediction behaviors of multiple DL models in both city-wide and street-level analysis, together with road images and video contents. Therefore, the system provides a new visual exploration platform for DL model designers in autonomous driving. Use cases and domain expert evaluation show the utility and effectiveness of the visualization system.
Collapse
|
24
|
He W, Zou L, Shekar AK, Gou L, Ren L. Where Can We Help? A Visual Analytics Approach to Diagnosing and Improving Semantic Segmentation of Movable Objects. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2022; 28:1040-1050. [PMID: 34587077 DOI: 10.1109/tvcg.2021.3114855] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
Semantic segmentation is a critical component in autonomous driving and has to be thoroughly evaluated due to safety concerns. Deep neural network (DNN) based semantic segmentation models are widely used in autonomous driving. However, it is challenging to evaluate DNN-based models due to their black-box-like nature, and it is even more difficult to assess model performance for crucial objects, such as lost cargos and pedestrians, in autonomous driving applications. In this work, we propose VASS, a Visual Analytics approach to diagnosing and improving the accuracy and robustness of Semantic Segmentation models, especially for critical objects moving in various driving scenes. The key component of our approach is context-aware spatial representation learning, which extracts important spatial information about objects, such as position, size, and aspect ratio, with respect to given scene contexts. We first use this spatial representation to create visual summarizations for analyzing the models' performance. We then use it to guide the generation of adversarial examples to evaluate the models' spatial robustness and obtain actionable insights. We demonstrate the effectiveness of VASS via two case studies of lost cargo detection and pedestrian detection in autonomous driving. For both cases, we show quantitative evaluations of the improvement in the models' performance with actionable insights obtained from VASS.
Collapse
|
25
|
Meng L, Wei Y, Pan R, Zhou S, Zhang J, Chen W. VADAF: Visualization for Abnormal Client Detection and Analysis in Federated Learning. ACM T INTERACT INTEL 2021. [DOI: 10.1145/3426866] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
Abstract
Federated Learning (FL) provides a powerful solution to distributed machine learning on a large corpus of decentralized data. It ensures privacy and security by performing computation on devices (which we refer to as clients) based on local data to improve the shared global model. However, the inaccessibility of the data and the invisibility of the computation make it challenging to interpret and analyze the training process, especially to distinguish potential client anomalies. Identifying these anomalies can help experts diagnose and improve FL models. For this reason, we propose a visual analytics system, VADAF, to depict the training dynamics and facilitate analyzing potential client anomalies. Specifically, we design a visualization scheme that supports massive training dynamics in the FL environment. Moreover, we introduce an anomaly detection method to detect potential client anomalies, which are further analyzed based on both the client model’s visual and objective estimation. Three case studies have demonstrated the effectiveness of our system in understanding the FL training process and supporting abnormal client detection and analysis.
Collapse
Affiliation(s)
- Linhao Meng
- State Key Lab of CAD&CG, Zhejiang University, Hangzhou, China
| | - Yating Wei
- State Key Lab of CAD&CG, Zhejiang University, Hangzhou, China
| | - Rusheng Pan
- State Key Lab of CAD&CG, Zhejiang University, Hangzhou, China
| | - Shuyue Zhou
- State Key Lab of CAD&CG, Zhejiang University, Hangzhou, China
| | - Jianwei Zhang
- State Key Lab of CAD&CG, Zhejiang University, Hangzhou, China
| | - Wei Chen
- State Key Lab of CAD&CG, Zhejiang University, Hangzhou, China
| |
Collapse
|
26
|
Hinterreiter A, Steinparz C, Schöfl M, Stitz H, Streit M. Projection Path Explorer: Exploring Visual Patterns in Projected Decision-making Paths. ACM T INTERACT INTEL 2021. [DOI: 10.1145/3387165] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
Abstract
In problem-solving, a path towards a solution can be viewed as a sequence of decisions. The decisions, made by humans or computers, describe a trajectory through a high-dimensional representation space of the problem. By means of dimensionality reduction, these trajectories can be visualized in a lower-dimensional space. Such embedded trajectories have previously been applied to a wide variety of data, but analysis has focused almost exclusively on the self-similarity of single trajectories. In contrast, we describe patterns emerging from drawing many trajectories (for different initial conditions, end states, and solution strategies) in the same embedding space. We argue that general statements about problem-solving tasks and solving strategies can be made by interpreting these patterns. We explore and characterize such patterns in trajectories resulting from human- and machine-made decisions in a variety of application domains: logic puzzles (Rubik's cube), strategy games (chess), and optimization problems (neural network training). We also discuss the importance of suitably chosen representation spaces and similarity metrics for the embedding.
Collapse
Affiliation(s)
- Andreas Hinterreiter
- Johannes Kepler University Linz, Austria and Imperial College London, London, UK
| | - Holger Stitz
- Johannes Kepler University Linz, Austria and datavisyn GmbH, Austria
| | - Marc Streit
- Johannes Kepler University Linz, Linz, Austria
| |
Collapse
|
27
|
Ferreira MD, Cantareira GD, de Mello RF, Paulovich FV. Neural network training fingerprint: visual analytics of the training process in classification neural networks. J Vis (Tokyo) 2021. [DOI: 10.1007/s12650-021-00809-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
|
28
|
van der Stouwe AMM, Tuitert I, Giotis I, Calon J, Gannamani R, Dalenberg JR, van der Veen S, Klamer MR, Telea AC, Tijssen MAJ. Next move in movement disorders (NEMO): developing a computer-aided classification tool for hyperkinetic movement disorders. BMJ Open 2021; 11:e055068. [PMID: 34635535 PMCID: PMC8506849 DOI: 10.1136/bmjopen-2021-055068] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 07/06/2021] [Accepted: 09/28/2021] [Indexed: 11/05/2022] Open
Abstract
INTRODUCTION Our aim is to develop a novel approach to hyperkinetic movement disorder classification that combines clinical information, electromyography, accelerometry and video in a computer-aided classification tool. We see this as the next step towards rapid and accurate phenotype classification, the cornerstone of both the diagnostic and treatment process. METHODS AND ANALYSIS The Next Move in Movement Disorders (NEMO) study is a cross-sectional study at Expertise Centre Movement Disorders Groningen, University Medical Centre Groningen. It comprises patients with single and mixed phenotype movement disorders. Single phenotype groups will first include dystonia, myoclonus and tremor, and then chorea, tics, ataxia and spasticity. Mixed phenotypes are myoclonus-dystonia, dystonic tremor, myoclonus ataxia and jerky/tremulous functional movement disorders. Groups will contain 20 patients, or 40 healthy participants. The gold standard for inclusion consists of interobserver agreement on the phenotype among three independent clinical experts. Electromyography, accelerometry and three-dimensional video data will be recorded during the performance of a set of movement tasks, chosen by a team of specialists to elicit movement disorders. These data will serve as input for the machine learning algorithm. Labels for supervised learning are provided by the expert-based classification, allowing the algorithm to learn to predict what the output label should be when given new input data. Methods using manually engineered features based on existing clinical knowledge will be used, as well as deep learning methods that can detect relevant and possibly new features. Finally, we will employ visual analytics to visualise how the classification algorithm arrives at its decision. ETHICS AND DISSEMINATION Ethical approval has been obtained from the relevant local ethics committee. The NEMO study is designed to pioneer the application of machine learning to movement disorders.
We expect to publish articles in multiple related fields of research and patients will be informed of important results via patient associations and press releases.
Collapse
Affiliation(s)
- A M Madelein van der Stouwe
- Department of Neurology, University Medical Centre Groningen, University of Groningen, Groningen, The Netherlands
- Expertise Centre Movement Disorders Groningen, University Medical Center Groningen, Groningen, The Netherlands
| | - Inge Tuitert
- Department of Neurology, University Medical Centre Groningen, University of Groningen, Groningen, The Netherlands
- Expertise Centre Movement Disorders Groningen, University Medical Center Groningen, Groningen, The Netherlands
| | - Ioannis Giotis
- ZiuZ Visual Intelligence BV, Gorredijk, Groningen, The Netherlands
| | - Joost Calon
- ZiuZ Visual Intelligence BV, Gorredijk, Groningen, The Netherlands
| | - Rahul Gannamani
- Department of Neurology, University Medical Centre Groningen, University of Groningen, Groningen, The Netherlands
- Expertise Centre Movement Disorders Groningen, University Medical Center Groningen, Groningen, The Netherlands
| | - Jelle R Dalenberg
- Department of Neurology, University Medical Centre Groningen, University of Groningen, Groningen, The Netherlands
- Expertise Centre Movement Disorders Groningen, University Medical Center Groningen, Groningen, The Netherlands
| | - Sterre van der Veen
- Department of Neurology, University Medical Centre Groningen, University of Groningen, Groningen, The Netherlands
- Expertise Centre Movement Disorders Groningen, University Medical Center Groningen, Groningen, The Netherlands
| | - Marrit R Klamer
- Department of Neurology, University Medical Centre Groningen, University of Groningen, Groningen, The Netherlands
- Expertise Centre Movement Disorders Groningen, University Medical Center Groningen, Groningen, The Netherlands
- ZiuZ Visual Intelligence BV, Gorredijk, Groningen, The Netherlands
| | - Alex C Telea
- Department of Information and Computing Sciences, University of Utrecht, Utrecht, The Netherlands
| | - Marina A J Tijssen
- Department of Neurology, University Medical Centre Groningen, University of Groningen, Groningen, The Netherlands
- Expertise Centre Movement Disorders Groningen, University Medical Center Groningen, Groningen, The Netherlands
| |
Collapse
|
29
|
Osaku D, Gomes J, Falcão A. Convolutional neural network simplification with progressive retraining. Pattern Recognit Lett 2021. [DOI: 10.1016/j.patrec.2021.06.032] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
|
30
|
Garcia R, Munz T, Weiskopf D. Visual analytics tool for the interpretation of hidden states in recurrent neural networks. Vis Comput Ind Biomed Art 2021; 4:24. [PMID: 34585277 PMCID: PMC8479019 DOI: 10.1186/s42492-021-00090-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/04/2021] [Accepted: 08/12/2021] [Indexed: 11/29/2022] Open
Abstract
In this paper, we introduce a visual analytics approach aimed at helping machine learning experts analyze the hidden states of layers in recurrent neural networks. Our technique allows the user to interactively inspect how hidden states store and process information throughout the feeding of an input sequence into the network. The technique can help answer questions, such as which parts of the input data have a higher impact on the prediction and how the model correlates each hidden state configuration with a certain output. Our visual analytics approach comprises several components: First, our input visualization shows the input sequence and how it relates to the output (using color coding). In addition, hidden states are visualized through a nonlinear projection into a 2-D visualization space using t-distributed stochastic neighbor embedding to understand the shape of the space of the hidden states. Trajectories are also employed to show the details of the evolution of the hidden state configurations. Finally, a time-multi-class heatmap matrix visualizes the evolution of the expected predictions for multi-class classifiers, and a histogram indicates the distances between the hidden states within the original space. The different visualizations are shown simultaneously in multiple views and support brushing-and-linking to facilitate the analysis of the classifications and debugging for misclassified input sequences. To demonstrate the capability of our approach, we discuss two typical use cases for long short-term memory models applied to two widely used natural language processing datasets.
Collapse
Affiliation(s)
- Rafael Garcia
- VISUS, University of Stuttgart, 70569, Stuttgart, Germany
| | - Tanja Munz
- VISUS, University of Stuttgart, 70569, Stuttgart, Germany.
| | | |
Collapse
|
31
|
|
32
|
Chen C, Wang Z, Wu J, Wang X, Guo LZ, Li YF, Liu S. Interactive Graph Construction for Graph-Based Semi-Supervised Learning. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:3701-3716. [PMID: 34048346 DOI: 10.1109/tvcg.2021.3084694] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
Semi-supervised learning (SSL) provides a way to improve the performance of prediction models (e.g., classifier) via the usage of unlabeled samples. An effective and widely used method is to construct a graph that describes the relationship between labeled and unlabeled samples. Practical experience indicates that graph quality significantly affects the model performance. In this paper, we present a visual analysis method that interactively constructs a high-quality graph for better model performance. In particular, we propose an interactive graph construction method based on the large margin principle. We have developed a river visualization and a hybrid visualization that combines a scatterplot, a node-link diagram, and a bar chart to convey the label propagation of graph-based SSL. Based on the understanding of the propagation, a user can select regions of interest to inspect and modify the graph. We conducted two case studies to showcase how our method facilitates the exploitation of labeled and unlabeled samples for improving model performance.
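As a hedged illustration of the general mechanism this entry builds on (not the paper's own interactive construction method), graph-based label propagation on a toy graph can be sketched in a few lines of NumPy; the graph, the clamping scheme, and the iteration count are all illustrative assumptions:

```python
import numpy as np

# 6 nodes, 2 classes; nodes 0 and 5 are labeled, the rest unlabeled.
W = np.array([  # symmetric adjacency: two loose clusters joined by one edge
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
], dtype=float)
Y = np.zeros((6, 2))
Y[0, 0] = 1.0   # node 0 labeled as class 0
Y[5, 1] = 1.0   # node 5 labeled as class 1

F = Y.copy()
D_inv = np.diag(1.0 / W.sum(axis=1))   # inverse degree matrix
for _ in range(50):
    F = D_inv @ W @ F        # each node averages its neighbors' scores
    F[0], F[5] = Y[0], Y[5]  # clamp the labeled nodes to their labels

pred = F.argmax(axis=1)
print(pred)  # [0 0 0 1 1 1]: labels spread through the two clusters
```

Because the propagation follows the graph's edges, the graph's quality directly determines the predictions, which is the motivation the abstract gives for constructing the graph interactively.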
Collapse
|
33
|
Hyperspectral and Lidar Data Applied to the Urban Land Cover Machine Learning and Neural-Network-Based Classification: A Review. REMOTE SENSING 2021. [DOI: 10.3390/rs13173393] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
Rapid technological advances in airborne hyperspectral and lidar systems paved the way for using machine learning algorithms to map urban environments. Both hyperspectral and lidar systems can discriminate among many significant urban structures and material properties that are not recognizable with conventional RGB cameras. In recent years, the fusion of hyperspectral and lidar sensors has overcome challenges related to the limits of active and passive remote sensing systems, providing promising results in urban land cover classification. This paper presents principles and key features of airborne hyperspectral imaging, lidar, and the fusion of the two, as well as their applications for urban land cover classification. In addition, machine learning and deep learning classification algorithms suitable for classifying individual urban classes such as buildings, vegetation, and roads have been reviewed, focusing on the extracted features critical for the classification of urban surfaces, transferability, dimensionality, and computational expense.
Collapse
|
34
|
Classification of Explainable Artificial Intelligence Methods through Their Output Formats. MACHINE LEARNING AND KNOWLEDGE EXTRACTION 2021. [DOI: 10.3390/make3030032] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
Machine and deep learning have proven their utility to generate data-driven models with high accuracy and precision. However, their non-linear, complex structures are often difficult to interpret. Consequently, many scholars have developed a plethora of methods to explain their functioning and the logic of their inferences. This systematic review aimed to organise these methods into a hierarchical classification system that builds upon and extends existing taxonomies by adding a significant dimension—the output formats. The reviewed scientific papers were retrieved by conducting an initial search on Google Scholar with the keywords “explainable artificial intelligence”; “explainable machine learning”; and “interpretable machine learning”. A subsequent iterative search was carried out by checking the bibliography of these articles. The addition of the dimension of the explanation format makes the proposed classification system a practical tool for scholars, supporting them to select the most suitable type of explanation format for the problem at hand. Given the wide variety of challenges faced by researchers, the existing XAI methods provide several solutions to meet the requirements that differ considerably between the users, problems and application fields of artificial intelligence (AI). The task of identifying the most appropriate explanation can be daunting, thus the need for a classification system that helps with the selection of methods. This work concludes by critically identifying the limitations of the formats of explanations and by providing recommendations and possible future research directions on how to build a more generally applicable XAI method. Future work should be flexible enough to meet the many requirements posed by the widespread use of AI in several fields, and the new regulations.
Collapse
|
35
|
Cao K, Liu M, Su H, Wu J, Zhu J, Liu S. Analyzing the Noise Robustness of Deep Neural Networks. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:3289-3304. [PMID: 31985427 DOI: 10.1109/tvcg.2020.2969185] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Adversarial examples, generated by adding small but intentionally imperceptible perturbations to normal examples, can mislead deep neural networks (DNNs) to make incorrect predictions. Although much work has been done on both adversarial attack and defense, a fine-grained understanding of adversarial examples is still lacking. To address this issue, we present a visual analysis method to explain why adversarial examples are misclassified. The key is to compare and analyze the datapaths of both the adversarial and normal examples. A datapath is a group of critical neurons along with their connections. We formulate the datapath extraction as a subset selection problem and solve it by constructing and training a neural network. A multi-level visualization consisting of a network-level visualization of data flows, a layer-level visualization of feature maps, and a neuron-level visualization of learned features, has been designed to help investigate how datapaths of adversarial and normal examples diverge and merge in the prediction process. A quantitative evaluation and a case study were conducted to demonstrate the promise of our method to explain the misclassification of adversarial examples.
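As a hedged, minimal illustration of the perturbation idea the abstract describes (on a toy linear model, not the deep networks or datapath analysis the paper studies), an FGSM-style step can be sketched as follows; the weights, the example, and epsilon are illustrative assumptions:

```python
import numpy as np

w = np.array([1.0, -2.0, 0.5])      # toy weight vector; class = sign(w . x)
x = np.array([0.3, -0.2, 0.1])      # a "normal" example; w @ x = 0.75 > 0

# Move x against the gradient of the score w . x: for this linear model the
# gradient is w itself, so step each coordinate by -eps * sign(w).
eps = 0.3
x_adv = x - eps * np.sign(w)

print(w @ x, w @ x_adv)  # 0.75 vs -0.3: a small perturbation flips the sign
```

The perturbation changes each input coordinate by at most 0.3, yet the predicted class flips, which is the behavior whose internal causes the paper's datapath visualization is designed to expose.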
Collapse
|
36
|
Kleinbub JR, Testolin A, Palmieri A, Salvatore S. The phase space of meaning model of psychopathology: A computer simulation modelling study. PLoS One 2021; 16:e0249320. [PMID: 33901183 PMCID: PMC8075201 DOI: 10.1371/journal.pone.0249320] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/31/2020] [Accepted: 03/16/2021] [Indexed: 11/18/2022] Open
Abstract
INTRODUCTION The hypothesis of a general psychopathology factor that underpins all common forms of mental disorders has been gaining momentum in contemporary clinical research and is known as the p factor hypothesis. Recently, a semiotic, embodied, and psychoanalytic conceptualisation of the p factor has been proposed called the Harmonium Model, which provides a computational account of such a construct. This research tested the core tenet of the Harmonium model, which is the idea that psychopathology can be conceptualised as due to poorly-modulable cognitive processes, and modelled the concept of Phase Space of Meaning (PSM) at the computational level. METHOD Two studies were performed, both based on a simulation design implementing a deep learning model, simulating a cognitive process: a classification task. The level of performance of the task was considered the simulated equivalent to the normality-psychopathology continuum, the dimensionality of the neural network's internal computational dynamics being the simulated equivalent of the PSM's dimensionality. RESULTS The neural networks' level of performance was shown to be associated with the characteristics of the internal computational dynamics, assumed to be the simulated equivalent of poorly-modulable cognitive processes. DISCUSSION Findings supported the hypothesis. They showed that the neural network's low performance was a matter of the combination of predicted characteristics of the neural networks' internal computational dynamics. Implications, limitations, and further research directions are discussed.
Collapse
Affiliation(s)
- Johann Roland Kleinbub
- Department of Philosophy, Sociology, Education, and Applied Psychology, University of Padua, Padua, Italy
| | - Alberto Testolin
- Department of General Psychology, University of Padova, Padua, Italy
- Department of Information Engineering, University of Padova, Padua, Italy
| | - Arianna Palmieri
- Department of Philosophy, Sociology, Education, and Applied Psychology, University of Padua, Padua, Italy
- Padova Neuroscience Center, University of Padua, Padua, Italy
| | - Sergio Salvatore
- Department of Dynamic and Clinical Psychology, and Health Studies, Sapienza Università di Roma, Rome, Italy
| |
Collapse
|
37
|
Dmitriev K, Marino J, Baker K, Kaufman AE. Visual Analytics of a Computer-Aided Diagnosis System for Pancreatic Lesions. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:2174-2185. [PMID: 31613771 DOI: 10.1109/tvcg.2019.2947037] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Machine learning is a powerful and effective tool for medical image analysis to perform computer-aided diagnosis (CAD). Despite their great potential for improving the accuracy of a diagnosis, CAD systems are often analyzed only in terms of final accuracy, leading to a limited understanding of the internal decision process, an inability to gain insights, and ultimately to skepticism from clinicians. We present a visual analytics approach to uncover the decision-making process of a CAD system for classifying pancreatic cystic lesions. This CAD algorithm consists of two distinct components: a random forest (RF), which classifies a set of predefined features, including demographic features, and a convolutional neural network (CNN), which analyzes radiological (imaging) features of the lesions. We study the class probabilities generated by the RF and the semantic meaning of the features learned by the CNN. We also use an eye tracker to better understand which radiological features are particularly useful for a radiologist in making a diagnosis and to quantitatively compare them with the features that lead the CNN to its final classification decision. Additionally, we evaluate the effects and benefits of supplying the CAD system with a case-based visual aid in a second-reader setting.
Collapse
|
38
|
Huang X, Jamonnak S, Zhao Y, Wang B, Hoai M, Yager K, Xu W. Interactive Visual Study of Multiple Attributes Learning Model of X-Ray Scattering Images. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:1312-1321. [PMID: 33104509 DOI: 10.1109/tvcg.2020.3030384] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Existing interactive visualization tools for deep learning are mostly applied to the training, debugging, and refinement of neural network models working on natural images. However, visual analytics tools are lacking for the specific application of x-ray image classification with multiple structural attributes. In this paper, we present an interactive system for domain scientists to visually study multiple-attribute learning models applied to x-ray scattering images. It allows domain scientists to interactively explore this important type of scientific image in embedded spaces defined on the model prediction output, the actual labels, and the discovered feature space of the neural networks. Users can flexibly select instance images and their clusters and compare them with regard to the specified visual representation of attributes. The exploration is guided by the manifestation of model performance related to mutual relationships among attributes, which often affect learning accuracy and effectiveness. The system thus supports domain scientists in improving the training dataset and model, finding questionable attribute labels, and identifying outlier images or spurious data clusters. Case studies and scientists' feedback demonstrate its functionality and usefulness.
Collapse
|
39
|
Ma Y, Fan A, He J, Nelakurthi AR, Maciejewski R. A Visual Analytics Framework for Explaining and Diagnosing Transfer Learning Processes. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:1385-1395. [PMID: 33035164 DOI: 10.1109/tvcg.2020.3028888] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Many statistical learning models hold an assumption that the training data and the future unlabeled data are drawn from the same distribution. However, this assumption is difficult to fulfill in real-world scenarios and creates barriers in reusing existing labels from similar application domains. Transfer Learning is intended to relax this assumption by modeling relationships between domains, and is often applied in deep learning applications to reduce the demand for labeled data and training time. Despite recent advances in exploring deep learning models with visual analytics tools, little work has explored the issue of explaining and diagnosing the knowledge transfer process between deep learning models. In this paper, we present a visual analytics framework for the multi-level exploration of the transfer learning processes when training deep neural networks. Our framework establishes a multi-aspect design to explain how the learned knowledge from the existing model is transferred into the new learning task when training deep neural networks. Based on a comprehensive requirement and task analysis, we employ descriptive visualization with performance measures and detailed inspections of model behaviors from the statistical, instance, feature, and model structure levels. We demonstrate our framework through two case studies on image classification by fine-tuning AlexNets to illustrate how analysts can utilize our framework.
|
40
|
Neto MP, Paulovich FV. Explainable Matrix - Visualization for Global and Local Interpretability of Random Forest Classification Ensembles. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:1427-1437. [PMID: 33048689 DOI: 10.1109/tvcg.2020.3030354] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/23/2023]
Abstract
Over the past decades, classification models have proven to be essential machine learning tools given their potential and applicability in various domains. For years, the goal of most researchers has been to improve quantitative metrics, notwithstanding how little information about models' decisions such metrics convey. This paradigm has recently shifted, and strategies beyond tables and numbers to assist in interpreting models' decisions are increasing in importance. As part of this trend, visualization techniques have been extensively used to support the interpretability of classification models, with a significant focus on rule-based models. Despite these advances, existing approaches are limited in visual scalability, and visualizing large and complex models, such as those produced by the Random Forest (RF) technique, remains a challenge. In this paper, we propose Explainable Matrix (ExMatrix), a novel visualization method for RF interpretability that can handle models with massive quantities of rules. It employs a simple yet powerful matrix-like visual metaphor, where rows are rules, columns are features, and cells are rule predicates, enabling the analysis of entire models and the auditing of classification results. ExMatrix's applicability is confirmed via different examples, showing how it can be used in practice to promote the interpretability of RF models.
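The matrix metaphor described above (rows are rules, columns are features, cells are rule predicates) can be sketched directly. The rules below are made up for illustration; they are not from the paper.

```python
# Hypothetical sketch of the ExMatrix-style layout: each rule becomes a
# row, each feature a column, and each cell the rule's predicate on that
# feature (None where the rule does not test the feature).

def rules_to_matrix(rules, features):
    """Lay out rules as a rules-by-features matrix of predicates."""
    return [[rule["predicates"].get(f) for f in features] for rule in rules]

features = ["petal_len", "petal_wid"]
rules = [
    {"label": "setosa", "predicates": {"petal_len": "<= 2.45"}},
    {"label": "virginica", "predicates": {"petal_len": "> 2.45",
                                          "petal_wid": "> 1.75"}},
]
matrix = rules_to_matrix(rules, features)
```

A visualization layer would then color and order these cells; the data layout itself is this simple.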
|
41
|
Wang Q, Alexander W, Pegg J, Qu H, Chen M. HypoML: Visual Analysis for Hypothesis-based Evaluation of Machine Learning Models. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:1417-1426. [PMID: 33048739 DOI: 10.1109/tvcg.2020.3030449] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
In this paper, we present a visual analytics tool for enabling hypothesis-based evaluation of machine learning (ML) models. We describe a novel ML-testing framework that combines traditional statistical hypothesis testing (commonly used in empirical research) with logical reasoning about the conclusions of multiple hypotheses. The framework defines a controlled configuration for testing a number of hypotheses as to whether and how some extra information about a "concept" or "feature" may benefit or hinder an ML model. Because reasoning about multiple hypotheses is not always straightforward, we provide HypoML as a visual analysis tool, with which the multi-thread testing results are first transformed into analytical results using statistical and logical inferences, and then into a visual representation for rapid observation of the conclusions and the logical flow between the testing results and hypotheses. We have applied HypoML to a number of hypothesized concepts, demonstrating the intuitive and explainable nature of the visual analysis.
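The basic testing pattern above, comparing a model with and without extra "concept" information on the same test items and mapping the outcome to a logical conclusion, can be sketched as follows. The margin, data, and verdict labels are illustrative assumptions, not HypoML's actual procedure.

```python
# Hypothetical sketch: classify the effect of a concept feature by
# comparing per-item correctness (0/1) of two models on the same items.
from statistics import mean

def concept_effect(correct_base, correct_concept, margin=0.05):
    """Map the accuracy difference to a logical conclusion."""
    diff = mean(correct_concept) - mean(correct_base)
    if diff > margin:
        return "benefits"
    if diff < -margin:
        return "hinders"
    return "inconclusive"

base    = [1, 0, 0, 1, 0, 1, 0, 0]  # model without the concept feature
concept = [1, 1, 0, 1, 1, 1, 0, 1]  # model with it, same test items
verdict = concept_effect(base, concept)
```

A proper treatment would use a paired statistical test rather than a fixed margin; the point here is only the shape of the hypothesis-to-conclusion mapping.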
|
42
|
Pal NR. In Search of Trustworthy and Transparent Intelligent Systems With Human-Like Cognitive and Reasoning Capabilities. Front Robot AI 2021; 7:76. [PMID: 33501243 PMCID: PMC7806014 DOI: 10.3389/frobt.2020.00076] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2018] [Accepted: 05/07/2020] [Indexed: 11/25/2022] Open
Abstract
At present we are witnessing tremendous interest in Artificial Intelligence (AI), particularly in Deep Learning (DL) and Deep Neural Networks (DNNs). One of the reasons appears to be the unmatched performance achieved by such systems. This has resulted in enormous hope being placed on such techniques, which are often viewed as cure-all solutions. But most of these systems cannot explain why a particular decision is made (black box) and sometimes fail miserably in cases where other systems would not. Consequently, in critical applications such as healthcare and defense, practitioners are reluctant to trust such systems. Although an AI system is often designed taking inspiration from the brain, there is not much attempt to exploit cues from the brain in a true sense. In our opinion, to realize intelligent systems with human-like reasoning ability, we need to exploit knowledge from brain science. Here we discuss a few findings in brain science that may help in designing intelligent systems. We explain the relevance of transparency, explainability, learning from a few examples, and the trustworthiness of an AI system. We also discuss a few ways that may help to achieve these attributes in a learning system.
Affiliation(s)
- Nikhil R Pal
- Indian Statistical Institute, Electronics and Communication Sciences Unit, The Centre for Artificial Intelligence and Machine Learning, Calcutta, India
|
43
|
Development of Models for Children—Pedestrian Crossing Speed at Signalized Crosswalks. SUSTAINABILITY 2021. [DOI: 10.3390/su13020777] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/27/2023]
Abstract
Modeling the behavior of pedestrians is an important tool in the analysis of their behavior and, consequently, in ensuring the safety of pedestrian traffic. Child pedestrians show specific traffic behavior, related to their cognitive development, and the parameters that affect their traffic behavior vary widely. The aim of this paper is to develop a model of child pedestrians' crossing speed at signalized pedestrian crosswalks. For the same set of data collected in the city of Osijek, Croatia, two models were developed, one based on a neural network and one on multiple linear regression. Both models are based on 300 measurements of children's speed at signalized pedestrian crosswalks on primary city roads located near a primary school. As parameters, both models include selected traffic infrastructure features as well as the children's characteristics and movements. The models are validated on data collected at the same type of pedestrian crosswalks, using the same methodology, in two other urban environments: Rijeka, Croatia, and Enna, Italy. It was shown that the neural network model developed for Osijek can be applied with sufficient reliability to the other two cities, while the multiple linear regression model is applicable with relatively satisfactory reliability only in Rijeka. A comparative analysis of the statistical indicators of reliability of the two models showed that the neural network model achieves better results.
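The regression half of the comparison above can be sketched with a one-predictor least-squares fit trained on one "city" and checked on another. All numbers below are invented for illustration; the paper's models use multiple predictors and real measurements.

```python
# Hypothetical sketch: fit crossing speed against one predictor (say,
# child age) in a training city, then measure the error in another city.

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

def mae(model, xs, ys):
    """Mean absolute error of the fitted line on validation data."""
    a, b = model
    return sum(abs((a + b * x) - y) for x, y in zip(xs, ys)) / len(xs)

# "Training" city: crossing speed (m/s) rising with age in toy samples.
train_x, train_y = [6, 8, 10, 12], [1.0, 1.2, 1.4, 1.6]
model = fit_line(train_x, train_y)
# "Validation" city: same trend with slight noise.
valid_err = mae(model, [7, 9, 11], [1.1, 1.35, 1.5])
```

The paper's cross-city validation follows the same pattern: fit in Osijek, then compute error statistics on Rijeka and Enna data.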
|
44
|
Wang Q, Yuan J, Chen S, Su H, Qu H, Liu S. Visual Genealogy of Deep Neural Networks. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:3340-3352. [PMID: 31180859 DOI: 10.1109/tvcg.2019.2921323] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
A comprehensive and comprehensible summary of existing deep neural networks (DNNs) helps practitioners understand the behaviour and evolution of DNNs, offers insights for architecture optimization, and sheds light on the working mechanisms of DNNs. However, this summary is hard to obtain because of the complexity and diversity of DNN architectures. To address this issue, we develop DNN Genealogy, an interactive visualization tool, to offer a visual summary of representative DNNs and their evolutionary relationships. DNN Genealogy enables users to learn DNNs from multiple aspects, including architecture, performance, and evolutionary relationships. Central to this tool is a systematic analysis and visualization of 66 representative DNNs based on our analysis of 140 papers. A directed acyclic graph is used to illustrate the evolutionary relationships among these DNNs and highlight the representative DNNs. A focus + context visualization is developed to orient users during their exploration. A set of network glyphs is used in the graph to facilitate the understanding and comparison of DNNs in the context of their evolution. Case studies demonstrate that DNN Genealogy provides helpful guidance in understanding, applying, and optimizing DNNs. DNN Genealogy is extensible and will continue to be updated to reflect future advances in DNNs.
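The evolutionary DAG at the core of the tool above can be sketched as an edge list plus a lineage query. The edges below are a small, illustrative subset of well-known architecture relationships, not the tool's curated 66-network graph.

```python
# Hypothetical sketch of the genealogy DAG: edges point from a network
# to its descendants, and a fixpoint pass recovers a network's ancestors.
evolves_into = {
    "LeNet": ["AlexNet"],
    "AlexNet": ["VGG", "GoogLeNet"],
    "VGG": ["ResNet"],
    "GoogLeNet": ["ResNet"],
    "ResNet": [],
}

def ancestors(dag, target):
    """All networks with a directed path to `target`."""
    found = set()
    changed = True
    while changed:
        changed = False
        for src, dsts in dag.items():
            if src not in found and any(d == target or d in found
                                        for d in dsts):
                found.add(src)
                changed = True
    return found
```

The visualization then layers glyphs and focus + context navigation on top of exactly this kind of graph structure.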
|
45
|
Abstract
Deep Neural Networks are known for impressive results in a wide range of applications and have been responsible for many advances in technology over the past few years. However, debugging and understanding the inner workings of neural network models is a complex task, as there are several parameters and variables involved in every decision. Multidimensional projection techniques have been successfully adopted to display neural network hidden layer outputs in an explainable manner, but comparing different outputs often means overlapping projections or observing them side by side, which makes it hard for users to follow the flow of data. In this paper, we introduce a novel approach for comparing projections obtained from multiple stages in a neural network model and visualizing differences in data perception. Changes among projections are transformed into trajectories that, in turn, generate vector fields used to represent the general flow of information. This representation can then be used to create layouts that highlight new information about abstract structures identified by neural networks.
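The first step of the trajectory idea above, matching instances across two 2D projections and turning their movement into displacement vectors, can be sketched directly. The coordinates are invented; a real pipeline would derive them from projections of consecutive layer outputs.

```python
# Hypothetical sketch: per-instance displacement between two projections
# of the same data (e.g. projections of two consecutive hidden layers).

def displacement_field(proj_a, proj_b):
    """Map each instance id to its (dx, dy) from projection A to B."""
    return {k: (proj_b[k][0] - proj_a[k][0],
                proj_b[k][1] - proj_a[k][1])
            for k in proj_a}

layer1 = {"x1": (0.0, 0.0), "x2": (1.0, 1.0)}
layer2 = {"x1": (0.5, 0.0), "x2": (1.0, 2.0)}
field = displacement_field(layer1, layer2)
```

Aggregating such per-point displacements over a grid is what produces the vector-field view of information flow described in the abstract.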
|
46
|
Marcilio-Jr WE, Eler DM. SADIRE: a context-preserving sampling technique for dimensionality reduction visualizations. J Vis (Tokyo) 2020. [DOI: 10.1007/s12650-020-00685-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]
|
47
|
Abstract
Although animated characters are based on human features, these features are exaggerated. The exaggerations differ greatly by country, gender, and the character's role in the story. This study investigated the characteristics of US and Japanese character designs and the similarities and differences in their exaggerations. In particular, these similarities and differences can be used to formulate a shared set of principles for US and Japanese animated character design; 90 Japanese and 90 US cartoon characters were analyzed. Lengths of 20 body parts were obtained for prototypical real human bodies and for animated characters from Japan and the United States. The distributions of lengths were determined for all characters and for characters segmented by country, gender, and role in the story. We also compared the body part lengths of animated characters and prototypical real human bodies, noting whether exaggerations tended towards augmentation or diminishment. In addition, a decision tree classification method was used to determine the body length parameters required to identify animated characters by country, gender, and role in the story. The results indicated that both US and Japanese male animated characters tend to feature exaggerated head and body sizes, with the exaggerations for US characters being more obvious. The decision tree required only five length parameters of the head and chest to distinguish between US and Japanese animated characters (accuracy = 94.48% and 67.46% for the training and testing groups, respectively). Through a decision tree method, this study quantitatively revealed the exaggeration patterns in animated characters and their differences by country, gender, and role in the story.
The results serve as a reference for designers and researchers of animated character models with regard to quantifying and classifying character exaggerations.
|
48
|
Zurowietz M, Nattkemper TW. An Interactive Visualization for Feature Localization in Deep Neural Networks. Front Artif Intell 2020; 3:49. [PMID: 33733166 PMCID: PMC7861262 DOI: 10.3389/frai.2020.00049] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/04/2020] [Accepted: 06/15/2020] [Indexed: 11/13/2022] Open
Abstract
Deep artificial neural networks have become the go-to method for many machine learning tasks. In the field of computer vision, deep convolutional neural networks achieve state-of-the-art performance for tasks such as classification, object detection, or instance segmentation. As deep neural networks become more and more complex, their inner workings become more and more opaque, rendering them a "black box" whose decision making process is no longer comprehensible. In recent years, various methods have been presented that attempt to peek inside the black box and to visualize the inner workings of deep neural networks, with a focus on deep convolutional neural networks for computer vision. These methods can serve as a toolbox to facilitate the design and inspection of neural networks for computer vision and the interpretation of the decision making process of the network. Here, we present the new tool Interactive Feature Localization in Deep neural networks (IFeaLiD) which provides a novel visualization approach to convolutional neural network layers. The tool interprets neural network layers as multivariate feature maps and visualizes the similarity between the feature vectors of individual pixels of an input image in a heat map display. The similarity display can reveal how the input image is perceived by different layers of the network and how the perception of one particular image region compares to the perception of the remaining image. IFeaLiD runs interactively in a web browser and can process even high resolution feature maps in real time by using GPU acceleration with WebGL 2. We present examples from four computer vision datasets with feature maps from different layers of a pre-trained ResNet101. IFeaLiD is open source and available online at https://ifealid.cebitec.uni-bielefeld.de.
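The similarity display described above can be sketched by treating a layer's output as one feature vector per pixel and comparing a reference pixel against every pixel. The tiny feature map below is invented; IFeaLiD computes this on real convolutional feature maps with GPU acceleration.

```python
# Hypothetical sketch of the heat-map values: cosine similarity between
# a reference pixel's feature vector and every pixel's feature vector.
import math

def cosine(u, v):
    """Cosine similarity of two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u))
                  * math.sqrt(sum(b * b for b in v)))

def similarity_map(feature_map, ref):
    """feature_map: dict pixel -> feature vector; ref: reference pixel."""
    return {px: cosine(vec, feature_map[ref])
            for px, vec in feature_map.items()}

# A 3-pixel "layer output" with 2-dimensional features per pixel.
features = {(0, 0): [1.0, 0.0], (0, 1): [1.0, 0.0], (1, 0): [0.0, 1.0]}
heat = similarity_map(features, ref=(0, 0))
```

Rendering `heat` as colors over the image grid gives exactly the kind of per-layer perception view the tool provides.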
Affiliation(s)
- Martin Zurowietz
- Biodata Mining Group, Faculty of Technology, Bielefeld University, Bielefeld, Germany
| | - Tim W Nattkemper
- Biodata Mining Group, Faculty of Technology, Bielefeld University, Bielefeld, Germany
|
49
|
Pondenkandath V, Alberti M, Eichenberger N, Ingold R, Liwicki M. Cross-Depicted Historical Motif Categorization and Retrieval with Deep Learning. J Imaging 2020; 6:jimaging6070071. [PMID: 34460664 PMCID: PMC8321079 DOI: 10.3390/jimaging6070071] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/27/2020] [Revised: 06/30/2020] [Accepted: 07/13/2020] [Indexed: 11/26/2022] Open
Abstract
In this paper, we tackle the problem of categorizing and identifying cross-depicted historical motifs using recent deep learning techniques, with the aim of developing a content-based image retrieval system. By cross-depiction, we understand the problem that the same object can be represented (depicted) in various ways. The objects of interest in this research are watermarks, which are crucial for dating manuscripts. For watermarks, cross-depiction arises for two reasons: (i) there are many similar representations of the same motif, and (ii) there are several ways of capturing the watermarks, i.e., as watermarks are not visible on a scan or photograph, they are typically retrieved via hand tracing, rubbing, or special photographic techniques. This leads to different representations of the same (or similar) objects, making it hard for pattern recognition methods to recognize the watermarks. While this is a simple problem for human experts, computer vision techniques have problems generalizing across the various depiction possibilities. In this paper, we present a study in which we use deep neural networks to categorize watermarks at varying levels of detail. The macro-averaged F1-score on an imbalanced 12-category classification task is 88.3%, and the multi-labelling performance (Jaccard index) on a 622-label task is 79.5%. To analyze the usefulness of an image-based system for assisting humanities scholars in cataloguing manuscripts, we also measure the performance of similarity matching on expert-crafted test sets of varying sizes (50 and 1000 watermark samples). A significant outcome is that all relevant results belonging to the same super-class are found by our system (Mean Average Precision of 100%), despite the cross-depicted nature of the motifs. This result has not been achieved in the literature so far.
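The two scores reported above can be sketched as follows; the toy labels are made up, only the formulas correspond to the metrics named in the abstract (macro-averaged F1 for single-label classification, Jaccard index for multi-label sets).

```python
# Hypothetical sketch of the reported metrics on invented predictions.

def jaccard(pred, true):
    """Multi-label Jaccard index: |intersection| / |union| of label sets."""
    return len(pred & true) / len(pred | true)

def macro_f1(pairs, classes):
    """Unweighted mean of per-class F1 over (predicted, true) pairs."""
    f1s = []
    for c in classes:
        tp = sum(p == c and t == c for p, t in pairs)
        fp = sum(p == c and t != c for p, t in pairs)
        fn = sum(p != c and t == c for p, t in pairs)
        f1s.append(2 * tp / (2 * tp + fp + fn) if tp else 0.0)
    return sum(f1s) / len(f1s)

score = macro_f1([("a", "a"), ("a", "b"), ("b", "b")], classes=["a", "b"])
overlap = jaccard({"grape", "circle"}, {"grape", "crown"})
```

Macro averaging weights each motif category equally, which matters for the imbalanced 12-category task described above.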
Affiliation(s)
- Vinaychandran Pondenkandath
- Document, Image and Video Analysis Group (DIVA), University of Fribourg, 1700 Fribourg, Switzerland; (M.A.); (R.I.)
- Correspondence:
| | - Michele Alberti
- Document, Image and Video Analysis Group (DIVA), University of Fribourg, 1700 Fribourg, Switzerland; (M.A.); (R.I.)
| | - Rolf Ingold
- Document, Image and Video Analysis Group (DIVA), University of Fribourg, 1700 Fribourg, Switzerland; (M.A.); (R.I.)
| | - Marcus Liwicki
- EISLAB Machine Learning, Luleå University of Technology, 97187 Luleå, Sweden;
|
50
|
Hou BJ, Zhou ZH. Learning With Interpretable Structure From Gated RNN. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2020; 31:2267-2279. [PMID: 32071002 DOI: 10.1109/tnnls.2020.2967051] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
The interpretability of deep learning models has attracted extensive attention in recent years. It would be beneficial if we could learn an interpretable structure from deep learning models. In this article, we focus on recurrent neural networks (RNNs), especially gated RNNs, whose inner mechanism is still not clearly understood. We find that finite-state automata (FSA), which process sequential data, have a more interpretable inner mechanism according to the definition of interpretability, and can be learned from RNNs as the interpretable structure. We propose two methods to learn FSA from RNNs, based on two different clustering methods. With the learned FSA, and via experiments on artificial and real data sets, we find that the FSA is more trustworthy than the RNN from which it was learned, which gives the FSA a chance to substitute for RNNs in applications involving human lives or dangerous facilities. Besides, we analyze how the number of gates affects the performance of RNNs. Our results suggest that gates in RNNs are important, but fewer are better, which could be guidance for designing other RNNs. Finally, we observe that the FSA learned from an RNN yields semantically aggregated states, and its transition graph gives a very interesting view of how RNNs intrinsically handle text classification tasks.
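The extraction idea above can be sketched in two steps: cluster the RNN's hidden states into discrete FSA states, then tally which input symbol moves the network between clusters. The cluster assignment and runs below are invented stand-ins for the paper's clustering methods and recorded RNN traces.

```python
# Hypothetical sketch: build FSA transitions from clustered hidden states.

def build_fsa(runs, assign):
    """runs: list of (hidden, symbol, next_hidden) traces;
    assign: hidden state id -> cluster (FSA state)."""
    transitions = {}
    for run in runs:
        for h, sym, h_next in run:
            transitions[(assign[h], sym)] = assign[h_next]
    return transitions

# Pretend clustering collapsed four hidden states into two FSA states.
assign = {"h0": 0, "h1": 0, "h2": 1, "h3": 1}
runs = [[("h0", "a", "h2"), ("h2", "b", "h1")],
        [("h1", "a", "h3")]]
fsa = build_fsa(runs, assign)
```

A real extraction must also handle conflicting transitions (the same cluster and symbol leading to different clusters), which is where the choice of clustering method matters.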
|