1. Zhao J, Liu X, Tang H, Wang X, Yang S, Liu D, Chen Y, Chen YV. Mesoscopic structure graphs for interpreting uncertainty in non-linear embeddings. Comput Biol Med 2024;182:109105. PMID: 39265479. DOI: 10.1016/j.compbiomed.2024.109105.
Abstract
Probabilistic-based non-linear dimensionality reduction (PB-NL-DR) methods, such as t-SNE and UMAP, are effective in unfolding complex high-dimensional manifolds, allowing users to explore and understand the structural patterns of data. However, due to the trade-off between global and local structure preservation and the randomness during computation, these methods may introduce false neighborhood relationships, known as distortion errors, and thereby produce misleading visualizations. To address this issue, we first conduct a detailed survey to illustrate the design space of prior layout enrichment visualizations for interpreting DR results, and then propose a node-link visualization technique, ManiGraph. This technique rethinks the neighborhood fidelity between the high- and low-dimensional spaces by constructing dynamic mesoscopic structure graphs and measuring region-adapted trustworthiness. ManiGraph also addresses the overplotting issue in scatterplot visualization for large-scale datasets and supports examination in unsupervised scenarios. We demonstrate the effectiveness of ManiGraph in different analytical cases, including generic machine learning using 3D toy data illustrations and Fashion-MNIST, a computational biology study using a single-cell RNA sequencing dataset, and a deep learning-enabled colorectal cancer study with histopathology-MNIST.
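A minimal sketch of the kind of neighborhood-fidelity check described above, using scikit-learn's standard global trustworthiness score rather than the paper's region-adapted variant; the dataset and parameter choices are illustrative assumptions:

    # Sketch: quantify how faithfully a 2D embedding preserves high-dimensional
    # neighborhoods with scikit-learn's trustworthiness score. This is the global
    # baseline measure, not ManiGraph's region-adapted one.
    from sklearn.datasets import load_digits
    from sklearn.manifold import TSNE, trustworthiness

    X = load_digits().data                         # high-dimensional points
    X_2d = TSNE(n_components=2, random_state=0).fit_transform(X)

    # Fraction of low-dimensional neighbors that are also true high-dimensional
    # neighbors; 1.0 means no false neighborhood relationships were introduced.
    score = trustworthiness(X, X_2d, n_neighbors=15)
    print(f"trustworthiness (k=15): {score:.3f}")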
Affiliation(s)
- Junhan Zhao: Harvard Medical School, Boston, MA 02114, USA; Harvard T.H. Chan School of Public Health, Boston, MA 02114, USA; Purdue University, West Lafayette, IN 47907, USA
- Xiang Liu: Purdue University, West Lafayette, IN 47907, USA; Indiana University School of Medicine, Indianapolis, IN 46202, USA
- Hongping Tang: Shenzhen Maternity and Child Healthcare Hospital, Shenzhen 518048, China
- Xiyue Wang: Stanford University School of Medicine, Stanford, CA 94304, USA
- Sen Yang: Stanford University School of Medicine, Stanford, CA 94304, USA
- Dongfang Liu: Rochester Institute of Technology, Rochester, NY 14623, USA
- Yijiang Chen: Stanford University School of Medicine, Stanford, CA 94304, USA
2. Hong J, Maciejewski R, Trubuil A, Isenberg T. Visualizing and Comparing Machine Learning Predictions to Improve Human-AI Teaming on the Example of Cell Lineage. IEEE Transactions on Visualization and Computer Graphics 2024;30:1956-1969. PMID: 37665712. DOI: 10.1109/tvcg.2023.3302308.
Abstract
We visualize the predictions of multiple machine learning models to help biologists as they interactively make decisions about cell lineage, the development of a (plant) embryo from a single ovum cell. Traditionally, biologists manually constructed the cell lineage from a confocal microscopy dataset, starting from this observation and reasoning backward in time to establish cell inheritance. To speed up this tedious process, we make use of machine learning (ML) models trained on a database of manually established cell lineages to assist the biologist in cell assignment. Most biologists, however, are not familiar with ML, nor is it clear to them which model best predicts the embryo's development. We thus have developed a visualization system designed to support biologists in exploring and comparing ML models, checking the model predictions, detecting possible ML model mistakes, and deciding on the most likely embryo development. To evaluate our proposed system, we deployed our interface with six biologists in an observational study. Our results show that the visual representations of machine learning are easily understandable, and our tool, LineageD+, could potentially increase biologists' working efficiency and enhance the understanding of embryos.
3. Ye Z, Chen M. Visualizing Ensemble Predictions of Music Mood. IEEE Transactions on Visualization and Computer Graphics 2023;29:864-874. PMID: 36170399. DOI: 10.1109/tvcg.2022.3209379.
Abstract
Music mood classification has been a challenging problem in comparison with other music classification problems (e.g., genre, composer, or period). One solution for addressing this challenge is to use an ensemble of machine learning models. In this paper, we show that visualization techniques can effectively convey the popular prediction as well as uncertainty at different music sections along the temporal axis while enabling the analysis of individual ML models in conjunction with their application to different musical data. In addition to the traditional visual designs, such as stacked line graph, ThemeRiver, and pixel-based visualization, we introduce a new variant of ThemeRiver, called "dual-flux ThemeRiver", which allows viewers to observe and measure the most popular prediction more easily than stacked line graph and ThemeRiver. Together with pixel-based visualization, dual-flux ThemeRiver plots can also assist in model-development workflows, in addition to annotating music using ensemble model predictions.
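As a rough illustration of the quantities such a visualization encodes, the sketch below aggregates per-section predictions from an ensemble into the most popular mood label plus an entropy-based uncertainty; the ensemble outputs and mood labels are invented for the example:

    # Sketch: for each music section, derive the "popular prediction" and an
    # uncertainty value from an ensemble's per-model votes (hypothetical data).
    import numpy as np

    moods = ["happy", "sad", "tense", "calm"]
    rng = np.random.default_rng(0)
    # votes[m, t] = mood index predicted by model m for time section t
    votes = rng.integers(0, len(moods), size=(8, 20))     # 8 models, 20 sections

    for t in range(votes.shape[1]):
        counts = np.bincount(votes[:, t], minlength=len(moods))
        probs = counts / counts.sum()
        popular = moods[int(np.argmax(counts))]
        entropy = -np.sum(probs[probs > 0] * np.log2(probs[probs > 0]))
        print(f"section {t:2d}: {popular:5s}  agreement={probs.max():.2f}  entropy={entropy:.2f}")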
4. Yuan J, Liu M, Tian F, Liu S. Visual Analysis of Neural Architecture Spaces for Summarizing Design Principles. IEEE Transactions on Visualization and Computer Graphics 2023;29:288-298. PMID: 36191103. DOI: 10.1109/tvcg.2022.3209404.
Abstract
Recent advances in artificial intelligence largely benefit from better neural network architectures. These architectures are a product of a costly process of trial-and-error. To ease this process, we develop ArchExplorer, a visual analysis method for understanding a neural architecture space and summarizing design principles. The key idea behind our method is to make the architecture space explainable by exploiting structural distances between architectures. We formulate the pairwise distance calculation as solving an all-pairs shortest path problem. To improve efficiency, we decompose this problem into a set of single-source shortest path problems. The time complexity is reduced from O(kn²N) to O(knN). Architectures are hierarchically clustered according to the distances between them. A circle-packing-based architecture visualization has been developed to convey both the global relationships between clusters and local neighborhoods of the architectures in each cluster. Two case studies and a post-analysis are presented to demonstrate the effectiveness of ArchExplorer in summarizing design principles and selecting better-performing architectures.
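The complexity reduction quoted above comes from running one single-source shortest-path computation per node instead of a monolithic all-pairs routine; a minimal hedged sketch with networkx on a made-up graph (not the actual architecture-edit-distance graph used by ArchExplorer):

    # Sketch: obtain all pairwise distances by repeated single-source Dijkstra,
    # mirroring the decomposition described in the abstract (toy graph only).
    import networkx as nx

    G = nx.Graph()
    G.add_weighted_edges_from([
        ("arch_a", "arch_b", 1.0), ("arch_b", "arch_c", 2.0),
        ("arch_a", "arch_d", 5.0), ("arch_c", "arch_d", 1.5),
    ])

    distances = {}
    for source in G.nodes:                     # one single-source run per node
        distances[source] = nx.single_source_dijkstra_path_length(G, source)

    print(distances["arch_a"]["arch_d"])       # 4.5 via arch_b and arch_c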
5. Wang Q, Chen Z, Wang Y, Qu H. A Survey on ML4VIS: Applying Machine Learning Advances to Data Visualization. IEEE Transactions on Visualization and Computer Graphics 2022;28:5134-5153. PMID: 34437063. DOI: 10.1109/tvcg.2021.3106142.
Abstract
Inspired by the great success of machine learning (ML), researchers have applied ML techniques to visualizations to achieve a better design, development, and evaluation of visualizations. This branch of studies, known as ML4VIS, has been gaining increasing research attention in recent years. To successfully adapt ML techniques for visualizations, a structured understanding of the integration of ML4VIS is needed. In this article, we systematically survey 88 ML4VIS studies, aiming to answer two motivating questions: "what visualization processes can be assisted by ML?" and "how can ML techniques be used to solve visualization problems?" This survey reveals seven main processes where the employment of ML techniques can benefit visualizations: Data Processing4VIS, Data-VIS Mapping, Insight Communication, Style Imitation, VIS Interaction, VIS Reading, and User Profiling. The seven processes are related to existing visualization theoretical models in an ML4VIS pipeline, aiming to illuminate the role of ML-assisted visualization in general visualizations. Meanwhile, the seven processes are mapped into main learning tasks in ML to align the capabilities of ML with the needs in visualization. Current practices and future opportunities of ML4VIS are discussed in the context of the ML4VIS pipeline and the ML-VIS mapping. While more studies are still needed in the area of ML4VIS, we hope this article can provide a stepping-stone for future exploration. A web-based interactive browser of this survey is available at https://ml4vis.github.io.
6. Constructing Explainable Classifiers from the Start—Enabling Human-in-the-Loop Machine Learning. Information 2022. DOI: 10.3390/info13100464.
Abstract
Interactive machine learning (IML) enables the incorporation of human expertise because the human participates in the construction of the learned model. With human-in-the-loop machine learning (HITL-ML), the human experts drive the learning, and they can steer the learning objective not only towards accuracy but perhaps towards characterisation and discrimination rules, where separating one class from the others is the primary objective. Moreover, this interaction enables humans to explore and gain insights into the dataset as well as validate the learned models. Validation requires transparency and interpretable classifiers. The huge relevance of understandable classification has recently been emphasised for many applications under the banner of explainable artificial intelligence (XAI). We use parallel coordinates to deploy an IML system that enables the visualisation of decision tree classifiers but also the generation of interpretable splits beyond parallel axis splits. Moreover, we show that characterisation and discrimination rules are also well communicated using parallel coordinates. In particular, we report results from the largest usability study of an IML system, confirming the merits of our approach.
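A quick sketch of the underlying visual idiom, plain parallel coordinates of a labeled dataset drawn with pandas, not the interactive split-construction machinery the paper adds on top; the dataset and styling choices are illustrative assumptions:

    # Sketch: parallel coordinates for a labeled dataset, the visual idiom on
    # which the interactive decision-tree construction described above builds.
    import matplotlib.pyplot as plt
    from pandas.plotting import parallel_coordinates
    from sklearn.datasets import load_iris

    iris = load_iris(as_frame=True)
    df = iris.frame.copy()
    df["species"] = df["target"].map(dict(enumerate(iris.target_names)))
    df = df.drop(columns="target")

    parallel_coordinates(df, class_column="species", alpha=0.4)
    plt.title("Parallel coordinates of the iris dataset")
    plt.tight_layout()
    plt.show()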
7. Streeb D, Metz Y, Schlegel U, Schneider B, El-Assady M, Neth H, Chen M, Keim DA. Task-Based Visual Interactive Modeling: Decision Trees and Rule-Based Classifiers. IEEE Transactions on Visualization and Computer Graphics 2022;28:3307-3323. PMID: 33439846. DOI: 10.1109/tvcg.2020.3045560.
Abstract
Visual analytics enables the coupling of machine learning models and humans in a tightly integrated workflow, addressing various analysis tasks. Each task poses distinct demands to analysts and decision-makers. In this survey, we focus on one canonical technique for rule-based classification, namely decision tree classifiers. We provide an overview of available visualizations for decision trees with a focus on how visualizations differ with respect to 16 tasks. Further, we investigate the types of visual designs employed, and the quality measures presented. We find that (i) interactive visual analytics systems for classifier development offer a variety of visual designs, (ii) utilization tasks are sparsely covered, (iii) beyond classifier development, node-link diagrams are omnipresent, (iv) even systems designed for machine learning experts rarely feature visual representations of quality measures other than accuracy. In conclusion, we see a potential for integrating algorithmic techniques, mathematical quality measures, and tailored interactive visualizations to enable human experts to utilize their knowledge more effectively.
8. SDA-Vis: A Visualization System for Student Dropout Analysis Based on Counterfactual Exploration. Applied Sciences (Basel) 2022. DOI: 10.3390/app12125785.
Abstract
High and persistent dropout rates represent one of the biggest challenges for improving the efficiency of the educational system, particularly in underdeveloped countries. A range of features influence college dropout, some belonging to the educational field and others to non-educational fields. Understanding the interplay of these variables in identifying a student as a potential dropout could help decision makers interpret the situation and decide what to do next to reduce student dropout rates through corrective actions. This paper presents SDA-Vis, a visualization system that supports counterfactual explanations for student dropout dynamics, considering various academic, social, and economic variables. In contrast to conventional systems, our approach provides information about feature-perturbed versions of a student using counterfactual explanations. SDA-Vis comprises a set of linked views that allow users to identify variable alterations that change predefined student situations. This involves perturbing the variables of a dropout student to obtain synthetic non-dropout students. SDA-Vis has been developed under the guidance and supervision of domain experts, in line with a set of analytical objectives. We demonstrate the usefulness of SDA-Vis through case studies run in collaboration with domain experts, using a real data set from a Latin American university. The analysis reveals the effectiveness of SDA-Vis in identifying students at risk of dropping out and in proposing corrective actions, even for particular cases that have not been flagged as at risk by the traditional tools that experts use.
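The counterfactual idea above, reduced to a minimal hedged sketch: greedily perturb one feature of an instance predicted as "dropout" until a trained classifier flips its prediction. The classifier, feature choice, and step size are illustrative assumptions, not SDA-Vis's actual method:

    # Sketch: naive single-feature counterfactual search on synthetic data.
    # Perturb one variable of a "dropout" instance until the prediction flips.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    X, y = make_classification(n_samples=500, n_features=6, random_state=0)
    clf = RandomForestClassifier(random_state=0).fit(X, y)

    x = X[y == 1][0].copy()                    # pretend class 1 means "dropout"
    feature, step = 2, 0.25                    # which variable to perturb, and by how much
    for _ in range(40):
        if clf.predict(x.reshape(1, -1))[0] == 0:
            print("counterfactual found:", np.round(x, 2))
            break
        x[feature] += step                     # nudge the chosen variable upward
    else:
        print("no counterfactual found along this single feature")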
9. Chen M, Abdul-Rahman A, Archambault D, Dykes J, Ritsos P, Slingsby A, Torsney-Weir T, Turkay C, Bach B, Borgo R, Brett A, Fang H, Jianu R, Khan S, Laramee R, Matthews L, Nguyen P, Reeve R, Roberts J, Vidal F, Wang Q, Wood J, Xu K. RAMPVIS: Answering the challenges of building visualisation capabilities for large-scale emergency responses. Epidemics 2022;39:100569. PMID: 35597098. PMCID: PMC9045880. DOI: 10.1016/j.epidem.2022.100569.
10. Andrienko N, Andrienko G, Adilova L, Wrobel S, Rhyne TM. Visual Analytics for Human-Centered Machine Learning. IEEE Computer Graphics and Applications 2022;42:123-133. PMID: 35077350. DOI: 10.1109/mcg.2021.3130314.
Abstract
We introduce a new research area in visual analytics (VA) aiming to bridge existing gaps between methods of interactive machine learning (ML) and eXplainable Artificial Intelligence (XAI), on one side, and human minds, on the other side. The gaps are, first, a conceptual mismatch between ML/XAI outputs and human mental models and ways of reasoning, and second, a mismatch between the information quantity and level of detail and human capabilities to perceive and understand. A grand challenge is to adapt ML and XAI to human goals, concepts, values, and ways of thinking. Complementing the current efforts in XAI towards solving this challenge, VA can contribute by exploiting the potential of visualization as an effective way of communicating information to humans and a strong trigger of human abstractive perception and thinking. We propose a cross-disciplinary research framework and formulate research directions for VA.
11. Shafqat S, Fayyaz M, Khattak HA, Bilal M, Khan S, Ishtiaq O, Abbasi A, Shafqat F, Alnumay WS, Chatterjee P. Leveraging Deep Learning for Designing Healthcare Analytics Heuristic for Diagnostics. Neural Process Lett 2021;55:53-79. PMID: 33551665. PMCID: PMC7852051. DOI: 10.1007/s11063-021-10425-w.
Abstract
Healthcare informatics has been a much-discussed phenomenon since the early 21st century. With the evolution of new computing technologies, huge amounts of healthcare data are produced, opening several research areas. Managing the massiveness of this data is required, while extracting knowledge for decision making is the main concern today. For this task, researchers are exploring big data analytics, deep learning (an advanced form of machine learning based on deep neural networks), predictive analytics, and various other algorithms to bring innovation to healthcare. Given these innovations, it is reasonable to state that disease prediction, with anticipation of its cure, is no longer unrealistic. Dengue fever (DF) and, more recently, COVID-19 are outbreaks of lethal infectious diseases, and diagnosis at all stages is crucial to decrease the mortality rate. In the case of diabetes, clinicians and experts find timely diagnosis, and analysis of the chances of developing underlying diseases, challenging. In this paper, Louvain Mani-Hierarchical Fold Learning healthcare analytics, a hybrid deep learning technique, is proposed for medical diagnostics and is tested and validated using a real-time dataset of 104 instances of patients with dengue fever, made available by Holy Family Hospital, Pakistan, and 810 instances found on GitHub for infectious diseases including prognosis of COVID-19, SARS, ARDS, Pneumocystis, Streptococcus, Chlamydophila, Klebsiella, Legionella, Lipoid, etc. Louvain Mani-Hierarchical Fold Learning healthcare analytics showed a maximum Spearman correlation of 0.952 between two clusters when applied to 240 instances extracted from a comorbidities diagnostic data model derived from 15,696 endocrine records of multiple visits of 100 patients identified by a unique ID. Accuracy for the induced rules, evaluated with the Laplace measure, is 0.727, 0.701, and 0.203 for 41, 18, and 24 rules, respectively. The endocrine diagnostic data was made available by Shifa International Hospital, Islamabad, Pakistan. Our results show that in the future this algorithm may be tested for diagnostics on healthcare big data.
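For readers unfamiliar with the two evaluation measures quoted above, here is a small hedged illustration of how a Spearman correlation and a Laplace accuracy estimate for an induced rule are typically computed; the numbers are invented, not taken from the study:

    # Sketch: the two measures mentioned in the abstract, on toy data.
    from scipy.stats import spearmanr

    # Spearman rank correlation between two cluster feature profiles (invented values).
    profile_a = [1.2, 3.4, 2.2, 5.0, 4.1]
    profile_b = [1.0, 3.0, 2.5, 4.8, 4.4]
    rho, p_value = spearmanr(profile_a, profile_b)
    print(f"Spearman rho={rho:.3f}, p={p_value:.3f}")

    # Standard Laplace estimate of rule accuracy: (correct + 1) / (covered + classes).
    covered, correct, n_classes = 30, 21, 2
    laplace = (correct + 1) / (covered + n_classes)
    print(f"Laplace accuracy = {laplace:.3f}")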
Affiliation(s)
- Sarah Shafqat: Department of Basic and Applied Sciences, International Islamic University (IIU), Islamabad, Pakistan; Smart e-Health, Islamabad 44000, Pakistan
- Hasan Ali Khattak: National University of Sciences & Technology (NUST), Islamabad 44000, Pakistan
- Muhammad Bilal: Department of Computer Engineering, Hankuk University of Foreign Studies, Yongin-si, Gyeonggi-do 17035, Korea
- Shahid Khan: Shifa International Hospital, Islamabad, Pakistan
- Almas Abbasi: Department of Basic and Applied Sciences, International Islamic University (IIU), Islamabad, Pakistan
- Farzana Shafqat: Smart e-Health, Islamabad 44000, Pakistan; Shifa International Hospital, Islamabad, Pakistan
- Waleed S. Alnumay: Computer Science Department, King Saud University, Riyadh, Saudi Arabia
- Pushpita Chatterjee: Future Networking Research Group, Ton Duc Thang University, Ho Chi Minh City, Vietnam; Faculty of Electrical and Electronics Engineering, Ton Duc Thang University, Ho Chi Minh City, Vietnam
12. Knittel J, Lalama A, Koch S, Ertl T. Visual Neural Decomposition to Explain Multivariate Data Sets. IEEE Transactions on Visualization and Computer Graphics 2021;27:1374-1384. PMID: 33048724. DOI: 10.1109/tvcg.2020.3030420.
Abstract
Investigating relationships between variables in multi-dimensional data sets is a common task for data analysts and engineers. More specifically, it is often valuable to understand which ranges of which input variables lead to particular values of a given target variable. Unfortunately, with an increasing number of independent variables, this process may become cumbersome and time-consuming due to the many possible combinations that have to be explored. In this paper, we propose a novel approach to visualize correlations between input variables and a target output variable that scales to hundreds of variables. We developed a visual model based on neural networks that can be explored in a guided way to help analysts find and understand such correlations. First, we train a neural network to predict the target from the input variables. Then, we visualize the inner workings of the resulting model to help understand relations within the data set. We further introduce a new regularization term for the backpropagation algorithm that encourages the neural network to learn representations that are easier to interpret visually. We apply our method to artificial and real-world data sets to show its utility.
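A compact sketch of the training loop outlined above: fit a small network to predict the target variable, with an extra penalty added to the loss to encourage sparser, easier-to-read input weights. The L1 penalty here is an illustrative stand-in; the paper's actual regularization term is not reproduced:

    # Sketch: train a small predictor with an extra regularization term intended
    # to make the learned input-to-hidden weights easier to inspect visually.
    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    X = torch.randn(256, 20)                       # 20 input variables (synthetic)
    y = (X[:, 3] * 2.0 - X[:, 7]).unsqueeze(1)     # target depends on a few inputs

    model = nn.Sequential(nn.Linear(20, 16), nn.ReLU(), nn.Linear(16, 1))
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
    mse = nn.MSELoss()

    for epoch in range(200):
        optimizer.zero_grad()
        loss = mse(model(X), y)
        loss = loss + 1e-3 * model[0].weight.abs().sum()   # interpretability penalty
        loss.backward()
        optimizer.step()

    # Inspect which inputs each hidden node attends to (the quantity one would plot).
    print(model[0].weight.detach().abs().mean(dim=0))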
13. Cashman D, Xu S, Das S, Heimerl F, Liu C, Humayoun SR, Gleicher M, Endert A, Chang R. CAVA: A Visual Analytics System for Exploratory Columnar Data Augmentation Using Knowledge Graphs. IEEE Transactions on Visualization and Computer Graphics 2021;27:1731-1741. PMID: 33048737. DOI: 10.1109/tvcg.2020.3030443.
Abstract
Most visual analytics systems assume that all foraging for data happens before the analytics process; once analysis begins, the set of data attributes considered is fixed. Such separation of data construction from analysis precludes iteration that can enable foraging informed by the needs that arise in situ during the analysis. The separation of the foraging loop from the data analysis tasks can limit the pace and scope of analysis. In this paper, we present CAVA, a system that integrates data curation and data augmentation with the traditional data exploration and analysis tasks, enabling information foraging in situ during analysis. Identifying attributes to add to the dataset is difficult because it requires human knowledge to determine which available attributes will be helpful for the ensuing analytical tasks. CAVA crawls knowledge graphs to provide users with a broad set of attributes drawn from external data to choose from. Users can then specify complex operations on knowledge graphs to construct additional attributes. CAVA shows how visual analytics can help users forage for attributes by letting users visually explore the set of available data, and by serving as an interface for query construction. It also provides visualizations of the knowledge graph itself to help users understand complex joins such as multi-hop aggregations. We assess the ability of our system to enable users to perform complex data combinations without programming in a user study over two datasets. We then demonstrate the generalizability of CAVA through two additional usage scenarios. The results of the evaluation confirm that CAVA is effective in helping the user perform data foraging that leads to improved analysis outcomes, and offer evidence in support of integrating data augmentation as a part of the visual analytics pipeline.
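To make the multi-hop aggregation idea concrete, a small hedged pandas sketch: an analysis table is augmented with a new column derived from a second, external table, standing in for the knowledge-graph joins CAVA performs (the tables and column names are invented):

    # Sketch: derive a new attribute via a two-hop join plus aggregation, the kind
    # of attribute construction CAVA drives from a knowledge graph (toy tables).
    import pandas as pd

    analysis = pd.DataFrame({"city": ["Lyon", "Porto", "Graz"],
                             "country": ["France", "Portugal", "Austria"]})
    # External "knowledge" table: universities per country (hop: country -> university).
    universities = pd.DataFrame({"country": ["France", "France", "Portugal", "Austria"],
                                 "university": ["U1", "U2", "U3", "U4"],
                                 "students": [30000, 12000, 25000, 18000]})

    # Aggregate over the hop, then join the new attribute back onto the analysis table.
    per_country = universities.groupby("country")["students"].sum().rename("students_in_country")
    augmented = analysis.merge(per_country, on="country", how="left")
    print(augmented)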
14. Park H, Nam Y, Kim JH, Choo J. HyperTendril: Visual Analytics for User-Driven Hyperparameter Optimization of Deep Neural Networks. IEEE Transactions on Visualization and Computer Graphics 2021;27:1407-1416. PMID: 33048706. DOI: 10.1109/tvcg.2020.3030380.
Abstract
To mitigate the pain of manually tuning hyperparameters of deep neural networks, automated machine learning (AutoML) methods have been developed to search for an optimal set of hyperparameters in large combinatorial search spaces. However, the search results of AutoML methods significantly depend on initial configurations, making it a non-trivial task to find a proper configuration. Therefore, human intervention via a visual analytic approach bears huge potential in this task. In response, we propose HyperTendril, a web-based visual analytics system that supports user-driven hyperparameter tuning processes in a model-agnostic environment. HyperTendril takes a novel approach to effectively steering hyperparameter optimization through an iterative, interactive tuning procedure that allows users to refine the search spaces and the configuration of the AutoML method based on their own insights from given results. Using HyperTendril, users can obtain insights into the complex behaviors of various hyperparameter search algorithms and diagnose their configurations. In addition, HyperTendril supports variable importance analysis to help the users refine their search spaces based on the analysis of relative importance of different hyperparameters and their interaction effects. We present the evaluation demonstrating how HyperTendril helps users steer their tuning processes via a longitudinal user study based on the analysis of interaction logs and in-depth interviews while we deploy our system in a professional industrial environment.
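A bare-bones sketch of the user-driven refinement loop described above: sample configurations from a search space, inspect the best trials, then narrow the space around them and sample again. The search space, objective, and narrowing rule are all illustrative assumptions, not HyperTendril's implementation:

    # Sketch: two rounds of random hyperparameter search where the second round's
    # space is narrowed around the best first-round trials (toy objective).
    import random

    def objective(lr, dropout):                   # stand-in for validation accuracy
        return 1.0 - abs(lr - 0.01) * 30 - abs(dropout - 0.3)

    def sample(space, n=20):
        trials = [{"lr": random.uniform(*space["lr"]),
                   "dropout": random.uniform(*space["dropout"])} for _ in range(n)]
        return sorted(trials, key=lambda t: objective(**t), reverse=True)

    random.seed(0)
    space = {"lr": (1e-4, 1e-1), "dropout": (0.0, 0.9)}
    best = sample(space)[:5]                      # round 1: inspect the top trials

    # Round 2: the "user" narrows each range around the best observed values.
    space = {k: (min(t[k] for t in best), max(t[k] for t in best)) for k in space}
    print("refined space:", space)
    print("best round-2 trial:", sample(space)[0])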
15. Wang Q, Alexander W, Pegg J, Qu H, Chen M. HypoML: Visual Analysis for Hypothesis-based Evaluation of Machine Learning Models. IEEE Transactions on Visualization and Computer Graphics 2021;27:1417-1426. PMID: 33048739. DOI: 10.1109/tvcg.2020.3030449.
Abstract
In this paper, we present a visual analytics tool for enabling hypothesis-based evaluation of machine learning (ML) models. We describe a novel ML-testing framework that combines traditional statistical hypothesis testing (commonly used in empirical research) with logical reasoning about the conclusions of multiple hypotheses. The framework defines a controlled configuration for testing a number of hypotheses as to whether and how some extra information about a "concept" or "feature" may benefit or hinder an ML model. Because reasoning about multiple hypotheses is not always straightforward, we provide HypoML as a visual analysis tool, with which the multi-thread testing results are first transformed into analytical results using statistical and logical inferences, and then into a visual representation for rapid observation of the conclusions and the logical flow between the testing results and hypotheses. We have applied HypoML to a number of hypothesized concepts, demonstrating the intuitive and explainable nature of the visual analysis.
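A small hedged sketch of one such hypothesis test: does adding an extra "concept" feature change a model's cross-validated accuracy? A paired t-test over fold scores is used purely for illustration; HypoML's actual testing and logical-inference machinery is more involved:

    # Sketch: test whether an extra feature benefits a classifier by comparing
    # per-fold accuracies with and without it (illustrative only).
    from scipy.stats import ttest_rel
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    X, y = make_classification(n_samples=600, n_features=10, random_state=1)
    X_reduced = X[:, :-1]                          # drop one column: the "concept" feature

    scores_with = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=10)
    scores_without = cross_val_score(LogisticRegression(max_iter=1000), X_reduced, y, cv=10)

    t_stat, p_value = ttest_rel(scores_with, scores_without)
    print(f"acc with={scores_with.mean():.3f}  without={scores_without.mean():.3f}  p={p_value:.3f}")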
16. Gao W, Chen Y. Approximation analysis of ontology learning algorithm in linear combination setting. Journal of Cloud Computing: Advances, Systems and Applications 2020. DOI: 10.1186/s13677-020-00173-y.
17. Krak I, Barmak O, Manziuk E. Using visual analytics to develop human and machine-centric models: A review of approaches and proposed information technology. Comput Intell 2020. DOI: 10.1111/coin.12289.
Affiliation(s)
- Iurii Krak: Department of Theoretical Cybernetics, Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
- Olexander Barmak: Department of Computer Science and Information Technologies, National University of Khmelnytskyi, Khmelnytskyi, Ukraine
- Eduard Manziuk: Department of Computer Science and Information Technologies, National University of Khmelnytskyi, Khmelnytskyi, Ukraine
18. Spinner T, Schlegel U, Schafer H, El-Assady M. explAIner: A Visual Analytics Framework for Interactive and Explainable Machine Learning. IEEE Transactions on Visualization and Computer Graphics 2020;26:1064-1074. PMID: 31442998. DOI: 10.1109/tvcg.2019.2934629.
Abstract
We propose a framework for interactive and explainable machine learning that enables users to (1) understand machine learning models; (2) diagnose model limitations using different explainable AI methods; as well as (3) refine and optimize the models. Our framework combines an iterative XAI pipeline with eight global monitoring and steering mechanisms, including quality monitoring, provenance tracking, model comparison, and trust building. To operationalize the framework, we present explAIner, a visual analytics system for interactive and explainable machine learning that instantiates all phases of the suggested pipeline within the commonly used TensorBoard environment. We performed a user-study with nine participants across different expertise levels to examine their perception of our workflow and to collect suggestions to fill the gap between our system and framework. The evaluation confirms that our tightly integrated system leads to an informed machine learning process while disclosing opportunities for further extensions.
19. Ming Y, Xu P, Cheng F, Qu H, Ren L. ProtoSteer: Steering Deep Sequence Model with Prototypes. IEEE Transactions on Visualization and Computer Graphics 2020;26:238-248. PMID: 31514137. DOI: 10.1109/tvcg.2019.2934267.
Abstract
Recently we have witnessed growing adoption of deep sequence models (e.g. LSTMs) in many application domains, including predictive health care, natural language processing, and log analysis. However, the intricate working mechanism of these models confines their accessibility to the domain experts. Their black-box nature also makes it a challenging task to incorporate domain-specific knowledge of the experts into the model. In ProtoSteer (Prototype Steering), we tackle the challenge of directly involving the domain experts to steer a deep sequence model without relying on model developers as intermediaries. Our approach originates in case-based reasoning, which imitates the common human problem-solving process of consulting past experiences to solve new problems. We utilize ProSeNet (Prototype Sequence Network), which learns a small set of exemplar cases (i.e., prototypes) from historical data. In ProtoSteer they serve both as an efficient visual summary of the original data and explanations of model decisions. With ProtoSteer the domain experts can inspect, critique, and revise the prototypes interactively. The system then incorporates user-specified prototypes and incrementally updates the model. We conduct extensive case studies and expert interviews in application domains including sentiment analysis on texts and predictive diagnostics based on vehicle fault logs. The results demonstrate that involvements of domain users can help obtain more interpretable models with concise prototypes while retaining similar accuracy.
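A toy sketch of the prototype idea described above: classify a new sequence embedding by its nearest learned prototype, and let a user-style edit (removing a prototype) immediately change subsequent predictions. The embeddings and prototypes are random stand-ins, not ProSeNet's learned representations:

    # Sketch: nearest-prototype prediction plus a user edit to the prototype set,
    # the interaction pattern ProtoSteer builds on (random toy vectors only).
    import numpy as np

    rng = np.random.default_rng(0)
    prototypes = {"complaint": rng.normal(size=8),   # exemplar embeddings with labels
                  "praise": rng.normal(size=8),
                  "question": rng.normal(size=8)}

    def predict(embedding, protos):
        # Assign the label of the closest prototype (Euclidean distance).
        return min(protos, key=lambda label: np.linalg.norm(embedding - protos[label]))

    new_sequence = rng.normal(size=8)
    print("before edit:", predict(new_sequence, prototypes))

    # "User" inspects the prototypes and removes one they consider redundant or wrong.
    del prototypes["question"]
    print("after edit: ", predict(new_sequence, prototypes))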
20. Gehrmann S, Strobelt H, Kruger R, Pfister H, Rush AM. Visual Interaction with Deep Learning Models through Collaborative Semantic Inference. IEEE Transactions on Visualization and Computer Graphics 2020;26:884-894. PMID: 31425116. DOI: 10.1109/tvcg.2019.2934595.
Abstract
Automation of tasks can have critical consequences when humans lose agency over decision processes. Deep learning models are particularly susceptible since current black-box approaches lack explainable reasoning. We argue that both the visual interface and the model structure of deep learning systems need to take interaction design into account. We propose a framework of collaborative semantic inference (CSI) for the co-design of interactions and models to enable visual collaboration between humans and algorithms. The approach exposes the intermediate reasoning process of models, allowing semantic interactions with the visual metaphors of a problem: a user can both understand and control parts of the model's reasoning process. We demonstrate the feasibility of CSI with a co-designed case study of a document summarization system.
21. Snyder LS, Lin YS, Karimzadeh M, Goldwasser D, Ebert DS. Interactive Learning for Identifying Relevant Tweets to Support Real-time Situational Awareness. IEEE Transactions on Visualization and Computer Graphics 2020;26:558-568. PMID: 31442995. DOI: 10.1109/tvcg.2019.2934614.
Abstract
Various domain users are increasingly leveraging real-time social media data to gain rapid situational awareness. However, due to the high noise in the deluge of data, effectively determining semantically relevant information can be difficult, further complicated by the changing definition of relevancy by each end user for different events. The majority of existing methods for short text relevance classification fail to incorporate users' knowledge into the classification process. Existing methods that incorporate interactive user feedback focus on historical datasets. Therefore, classifiers cannot be interactively retrained for specific events or user-dependent needs in real-time. This limits real-time situational awareness, as streaming data that is incorrectly classified cannot be corrected immediately, permitting the possibility for important incoming data to be incorrectly classified as well. We present a novel interactive learning framework to improve the classification process in which the user iteratively corrects the relevancy of tweets in real-time to train the classification model on-the-fly for immediate predictive improvements. We computationally evaluate our classification model adapted to learn at interactive rates. Our results show that our approach outperforms state-of-the-art machine learning models. In addition, we integrate our framework with the extended Social Media Analytics and Reporting Toolkit (SMART) 2.0 system, allowing the use of our interactive learning framework within a visual analytics system tailored for real-time situational awareness. To demonstrate our framework's effectiveness, we provide domain expert feedback from first responders who used the extended SMART 2.0 system.
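A compact hedged sketch of the interactive-retraining loop described above: hashed text features plus an SGD classifier updated on the fly with user-corrected labels via partial_fit. The tweets and corrections are invented, and the real SMART pipeline is more elaborate:

    # Sketch: retrain a relevance classifier incrementally as a user corrects labels,
    # using hashing features and partial_fit so updates happen at interactive rates.
    import numpy as np
    from sklearn.feature_extraction.text import HashingVectorizer
    from sklearn.linear_model import SGDClassifier

    vectorizer = HashingVectorizer(n_features=2**16)
    clf = SGDClassifier(random_state=0)

    seed_tweets = ["flooding on main street", "great concert tonight"]
    seed_labels = np.array([1, 0])                 # 1 = relevant to the event, 0 = not
    clf.partial_fit(vectorizer.transform(seed_tweets), seed_labels, classes=[0, 1])

    stream = ["road closed due to flood water", "new album dropped today"]
    print("before correction:", clf.predict(vectorizer.transform(stream)))

    # User marks the first streaming tweet as relevant; the model is updated immediately.
    clf.partial_fit(vectorizer.transform(["road closed due to flood water"]), np.array([1]))
    print("after correction: ", clf.predict(vectorizer.transform(stream)))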
22. Jia S, Lin P, Li Z, Zhang J, Liu S. Visualizing surrogate decision trees of convolutional neural networks. J Vis (Tokyo) 2019. DOI: 10.1007/s12650-019-00607-z.