1
|
Deng D, Zhang C, Zheng H, Pu Y, Ji S, Wu Y. AdversaFlow: Visual Red Teaming for Large Language Models with Multi-Level Adversarial Flow. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2025; 31:492-502. [PMID: 39283796 DOI: 10.1109/tvcg.2024.3456150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 03/05/2025]
Abstract
Large Language Models (LLMs) are powerful but also raise significant security concerns, particularly regarding the harm they can cause, such as generating fake news that manipulates public opinion on social media and providing responses to unethical activities. Traditional red teaming approaches for identifying AI vulnerabilities rely on manual prompt construction and expertise. This paper introduces AdversaFlow, a novel visual analytics system designed to enhance LLM security against adversarial attacks through human-AI collaboration. AdversaFlow involves adversarial training between a target model and a red model, featuring unique multi-level adversarial flow and fluctuation path visualizations. These features provide insights into adversarial dynamics and LLM robustness, enabling experts to identify and mitigate vulnerabilities effectively. We present quantitative evaluations and case studies validating our system's utility and offering insights for future AI security solutions. Our method can enhance LLM security, supporting downstream scenarios like social media regulation by enabling more effective detection, monitoring, and mitigation of harmful content and behaviors.
Collapse
|
2
|
Oral E, Chawla R, Wijkstra M, Mahyar N, Dimara E. From Information to Choice: A Critical Inquiry Into Visualization Tools for Decision Making. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2024; 30:359-369. [PMID: 37871054 DOI: 10.1109/tvcg.2023.3326593] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/25/2023]
Abstract
In the face of complex decisions, people often engage in a three-stage process that spans from (1) exploring and analyzing pertinent information (intelligence); (2) generating and exploring alternative options (design); and ultimately culminating in (3) selecting the optimal decision by evaluating discerning criteria (choice). We can fairly assume that all good visualizations aid in the "intelligence" stage by enabling data exploration and analysis. Yet, to what degree and how do visualization systems currently support the other decision making stages, namely "design" and "choice"? To further explore this question, we conducted a comprehensive review of decision-focused visualization tools by examining publications in major visualization journals and conferences, including VIS, EuroVis, and CHI, spanning all available years. We employed a deductive coding method and in-depth analysis to assess whether and how visualization tools support design and choice. Specifically, we examined each visualization tool by (i) its degree of visibility for displaying decision alternatives, criteria, and preferences, and (ii) its degree of flexibility for offering means to manipulate the decision alternatives, criteria, and preferences with interactions such as adding, modifying, changing mapping, and filtering. Our review highlights the opportunities and challenges that decision-focused visualization tools face in realizing their full potential to support all stages of the decision making process. It reveals a surprising scarcity of tools that support all stages, and while most tools excel in offering visibility for decision criteria and alternatives, the degree of flexibility to manipulate these elements is often limited, and the lack of tools that accommodate decision preferences and their elicitation is notable. Based on our findings, to better support the choice stage, future research could explore enhancing flexibility levels and variety, exploring novel visualization paradigms, increasing algorithmic support, and ensuring that this automation is user-controlled via the enhanced flexibility I evels. Our curated list of the 88 surveyed visualization tools is available in the OSF link (https://osf.io/nrasz/?view_only=b92a90a34ae241449b5f2cd33383bfcb).
Collapse
|
3
|
Gaba A, Kaufman Z, Cheung J, Shvakel M, Hall KW, Brun Y, Bearfield CX. My Model is Unfair, Do People Even Care? Visual Design Affects Trust and Perceived Bias in Machine Learning. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2024; 30:327-337. [PMID: 37878441 DOI: 10.1109/tvcg.2023.3327192] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/27/2023]
Abstract
Machine learning technology has become ubiquitous, but, unfortunately, often exhibits bias. As a consequence, disparate stakeholders need to interact with and make informed decisions about using machine learning models in everyday systems. Visualization technology can support stakeholders in understanding and evaluating trade-offs between, for example, accuracy and fairness of models. This paper aims to empirically answer "Can visualization design choices affect a stakeholder's perception of model bias, trust in a model, and willingness to adopt a model?" Through a series of controlled, crowd-sourced experiments with more than 1,500 participants, we identify a set of strategies people follow in deciding which models to trust. Our results show that men and women prioritize fairness and performance differently and that visual design choices significantly affect that prioritization. For example, women trust fairer models more often than men do, participants value fairness more when it is explained using text than as a bar chart, and being explicitly told a model is biased has a bigger impact than showing past biased performance. We test the generalizability of our results by comparing the effect of multiple textual and visual design choices and offer potential explanations of the cognitive mechanisms behind the difference in fairness perception and trust. Our research guides design considerations to support future work developing visualization systems for machine learning.
Collapse
|
4
|
Jung J, Lee H, Jung H, Kim H. Essential properties and explanation effectiveness of explainable artificial intelligence in healthcare: A systematic review. Heliyon 2023; 9:e16110. [PMID: 37234618 PMCID: PMC10205582 DOI: 10.1016/j.heliyon.2023.e16110] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/06/2022] [Revised: 03/26/2023] [Accepted: 05/05/2023] [Indexed: 05/28/2023] Open
Abstract
Background Significant advancements in the field of information technology have influenced the creation of trustworthy explainable artificial intelligence (XAI) in healthcare. Despite improved performance of XAI, XAI techniques have not yet been integrated into real-time patient care. Objective The aim of this systematic review is to understand the trends and gaps in research on XAI through an assessment of the essential properties of XAI and an evaluation of explanation effectiveness in the healthcare field. Methods A search of PubMed and Embase databases for relevant peer-reviewed articles on development of an XAI model using clinical data and evaluating explanation effectiveness published between January 1, 2011, and April 30, 2022, was conducted. All retrieved papers were screened independently by the two authors. Relevant papers were also reviewed for identification of the essential properties of XAI (e.g., stakeholders and objectives of XAI, quality of personalized explanations) and the measures of explanation effectiveness (e.g., mental model, user satisfaction, trust assessment, task performance, and correctability). Results Six out of 882 articles met the criteria for eligibility. Artificial Intelligence (AI) users were the most frequently described stakeholders. XAI served various purposes, including evaluation, justification, improvement, and learning from AI. Evaluation of the quality of personalized explanations was based on fidelity, explanatory power, interpretability, and plausibility. User satisfaction was the most frequently used measure of explanation effectiveness, followed by trust assessment, correctability, and task performance. The methods of assessing these measures also varied. Conclusion XAI research should address the lack of a comprehensive and agreed-upon framework for explaining XAI and standardized approaches for evaluating the effectiveness of the explanation that XAI provides to diverse AI stakeholders.
Collapse
Affiliation(s)
- Jinsun Jung
- College of Nursing, Seoul National University, Seoul, Republic of Korea
- Center for Human-Caring Nurse Leaders for the Future by Brain Korea 21 (BK 21) Four Project, College of Nursing, Seoul National University, Seoul, Republic of Korea
| | - Hyungbok Lee
- College of Nursing, Seoul National University, Seoul, Republic of Korea
- Emergency Nursing Department, Seoul National University Hospital, Seoul, Republic of Korea
| | - Hyunggu Jung
- Department of Computer Science and Engineering, University of Seoul, Seoul, Republic of Korea
- Department of Artificial Intelligence, University of Seoul, Seoul, Republic of Korea
| | - Hyeoneui Kim
- College of Nursing, Seoul National University, Seoul, Republic of Korea
- Research Institute of Nursing Science, College of Nursing, Seoul National University, Seoul, Republic of Korea
| |
Collapse
|
5
|
Ghai B, Mueller K. D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling Algorithmic Bias. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:473-482. [PMID: 36155458 DOI: 10.1109/tvcg.2022.3209484] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/04/2023]
Abstract
With the rise of AI, algorithms have become better at learning underlying patterns from the training data including ingrained social biases based on gender, race, etc. Deployment of such algorithms to domains such as hiring, healthcare, law enforcement, etc. has raised serious concerns about fairness, accountability, trust and interpretability in machine learning algorithms. To alleviate this problem, we propose D-BIAS, a visual interactive tool that embodies human-in-the-loop AI approach for auditing and mitigating social biases from tabular datasets. It uses a graphical causal model to represent causal relationships among different features in the dataset and as a medium to inject domain knowledge. A user can detect the presence of bias against a group, say females, or a subgroup, say black females, by identifying unfair causal relationships in the causal network and using an array of fairness metrics. Thereafter, the user can mitigate bias by refining the causal model and acting on the unfair causal edges. For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset based on the current causal model while ensuring a minimal change from the original dataset. Users can visually assess the impact of their interactions on different fairness metrics, utility metrics, data distortion, and the underlying data distribution. Once satisfied, they can download the debiased dataset and use it for any downstream application for fairer predictions. We evaluate D-BIAS by conducting experiments on 3 datasets and also a formal user study. We found that D-BIAS helps reduce bias significantly compared to the baseline debiasing approach across different fairness metrics while incurring little data distortion and a small loss in utility. Moreover, our human-in-the-loop based approach significantly outperforms an automated approach on trust, interpretability and accountability.
Collapse
|
6
|
Arioz U, Smrke U, Plohl N, Mlakar I. Scoping Review on the Multimodal Classification of Depression and Experimental Study on Existing Multimodal Models. Diagnostics (Basel) 2022; 12:2683. [PMID: 36359525 PMCID: PMC9689708 DOI: 10.3390/diagnostics12112683] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/31/2022] [Revised: 10/27/2022] [Accepted: 10/31/2022] [Indexed: 12/26/2023] Open
Abstract
Depression is a prevalent comorbidity in patients with severe physical disorders, such as cancer, stroke, and coronary diseases. Although it can significantly impact the course of the primary disease, the signs of depression are often underestimated and overlooked. The aim of this paper was to review algorithms for the automatic, uniform, and multimodal classification of signs of depression from human conversations and to evaluate their accuracy. For the scoping review, the PRISMA guidelines for scoping reviews were followed. In the scoping review, the search yielded 1095 papers, out of which 20 papers (8.26%) included more than two modalities, and 3 of those papers provided codes. Within the scope of this review, supported vector machine (SVM), random forest (RF), and long short-term memory network (LSTM; with gated and non-gated recurrent units) models, as well as different combinations of features, were identified as the most widely researched techniques. We tested the models using the DAIC-WOZ dataset (original training dataset) and using the SymptomMedia dataset to further assess their reliability and dependency on the nature of the training datasets. The best performance was obtained by the LSTM with gated recurrent units (F1-score of 0.64 for the DAIC-WOZ dataset). However, with a drop to an F1-score of 0.56 for the SymptomMedia dataset, the method also appears to be the most data-dependent.
Collapse
Affiliation(s)
- Umut Arioz
- Faculty of Electrical Engineering and Computer Science, The University of Maribor, 2000 Maribor, Slovenia
| | - Urška Smrke
- Faculty of Electrical Engineering and Computer Science, The University of Maribor, 2000 Maribor, Slovenia
| | - Nejc Plohl
- Department of Psychology, Faculty of Arts, The University of Maribor, 2000 Maribor, Slovenia
| | - Izidor Mlakar
- Faculty of Electrical Engineering and Computer Science, The University of Maribor, 2000 Maribor, Slovenia
| |
Collapse
|
7
|
Nakao Y, Stumpf S, Ahmed S, Naseer A, Strappelli L. Towards Involving End-users in Interactive Human-in-the-loop AI Fairness. ACM T INTERACT INTEL 2022. [DOI: 10.1145/3514258] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/01/2022]
Abstract
Ensuring fairness in artificial intelligence (AI) is important to counteract bias and discrimination in far-reaching applications. Recent work has started to investigate how humans judge fairness and how to support machine learning (ML) experts in making their AI models fairer. Drawing inspiration from an Explainable AI (XAI) approach called
explanatory debugging
used in interactive machine learning, our work explores designing interpretable and interactive human-in-the-loop interfaces that allow ordinary end-users without any technical or domain background to identify potential fairness issues and possibly fix them in the context of loan decisions. Through workshops with end-users, we co-designed and implemented a prototype system that allowed end-users to see why predictions were made, and then to change weights on features to ”debug” fairness issues. We evaluated the use of this prototype system through an online study. To investigate the implications of diverse human values about fairness around the globe, we also explored how cultural dimensions might play a role in using this prototype. Our results contribute to the design of interfaces to allow end-users to be involved in judging and addressing AI fairness through a human-in-the-loop approach.
Collapse
Affiliation(s)
- Yuri Nakao
- Research Center for AI Ethics, Fujitsu Limited, Japan
| | | | | | | | | |
Collapse
|
8
|
Xie T, Ma Y, Kang J, Tong H, Maciejewski R. FairRankVis: A Visual Analytics Framework for Exploring Algorithmic Fairness in Graph Mining Models. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2022; 28:368-377. [PMID: 34587074 DOI: 10.1109/tvcg.2021.3114850] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
Graph mining is an essential component of recommender systems and search engines. Outputs of graph mining models typically provide a ranked list sorted by each item's relevance or utility. However, recent research has identified issues of algorithmic bias in such models, and new graph mining algorithms have been proposed to correct for bias. As such, algorithm developers need tools that can help them uncover potential biases in their models while also exploring the impacts of correcting for biases when employing fairness-aware algorithms. In this paper, we present FairRankVis, a visual analytics framework designed to enable the exploration of multi-class bias in graph mining algorithms. We support both group and individual fairness levels of comparison. Our framework is designed to enable model developers to compare multi-class fairness between algorithms (for example, comparing PageRank with a debiased PageRank algorithm) to assess the impacts of algorithmic debiasing with respect to group and individual fairness. We demonstrate our framework through two usage scenarios inspecting algorithmic fairness.
Collapse
|
9
|
Wang X, He J, Jin Z, Yang M, Wang Y, Qu H. M2Lens: Visualizing and Explaining Multimodal Models for Sentiment Analysis. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2022; 28:802-812. [PMID: 34587037 DOI: 10.1109/tvcg.2021.3114794] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
Multimodal sentiment analysis aims to recognize people's attitudes from multiple communication channels such as verbal content (i.e., text), voice, and facial expressions. It has become a vibrant and important research topic in natural language processing. Much research focuses on modeling the complex intra- and inter-modal interactions between different communication channels. However, current multimodal models with strong performance are often deep-learning-based techniques and work like black boxes. It is not clear how models utilize multimodal information for sentiment predictions. Despite recent advances in techniques for enhancing the explainability of machine learning models, they often target unimodal scenarios (e.g., images, sentences), and little research has been done on explaining multimodal models. In this paper, we present an interactive visual analytics system, M2 Lens, to visualize and explain multimodal models for sentiment analysis. M2 Lens provides explanations on intra- and inter-modal interactions at the global, subset, and local levels. Specifically, it summarizes the influence of three typical interaction types (i.e., dominance, complement, and conflict) on the model predictions. Moreover, M2 Lens identifies frequent and influential multimodal features and supports the multi-faceted exploration of model behaviors from language, acoustic, and visual modalities. Through two case studies and expert interviews, we demonstrate our system can help users gain deep insights into the multimodal models for sentiment analysis.
Collapse
|
10
|
Abstract
Today's increased availability of large amounts of human behavioral data and advances in artificial intelligence (AI) are contributing to a growing reliance on algorithms to make consequential decisions for humans, including those related to access to credit or medical treatments, hiring, etc. Algorithmic decision-making processes might lead to more objective decisions than those made by humans who may be influenced by prejudice, conflicts of interest, or fatigue. However, algorithmic decision-making has been criticized for its potential to lead to privacy invasion, information asymmetry, opacity, and discrimination. In this paper, we describe available technical solutions in three large areas that we consider to be of critical importance to achieve a human-centric AI: (1) privacy and data ownership; (2) accountability and transparency; and (3) fairness. We also highlight the criticality and urgency to engage multi-disciplinary teams of researchers, practitioners, policy makers, and citizens to co-develop and evaluate in the real-world algorithmic decision-making processes designed to maximize fairness, accountability, and transparency while respecting privacy.
Collapse
Affiliation(s)
- Bruno Lepri
- Digital Society Center, Fondazione Bruno Kessler, Trento 38123, Italy
- Data-Pop Alliance, New York, NY, USA
| | - Nuria Oliver
- ELLIS (the European Laboratory for Learning and Intelligent Systems) Unit Alicante, Alicante 03690, Spain
- Data-Pop Alliance, New York, NY, USA
| | - Alex Pentland
- Data-Pop Alliance, New York, NY, USA
- MIT Media Lab, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| |
Collapse
|
11
|
Wang Q, Xu Z, Chen Z, Wang Y, Liu S, Qu H. Visual Analysis of Discrimination in Machine Learning. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:1470-1480. [PMID: 33048751 DOI: 10.1109/tvcg.2020.3030471] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
The growing use of automated decision-making in critical applications, such as crime prediction and college admission, has raised questions about fairness in machine learning. How can we decide whether different treatments are reasonable or discriminatory? In this paper, we investigate discrimination in machine learning from a visual analytics perspective and propose an interactive visualization tool, DiscriLens, to support a more comprehensive analysis. To reveal detailed information on algorithmic discrimination, DiscriLens identifies a collection of potentially discriminatory itemsets based on causal modeling and classification rules mining. By combining an extended Euler diagram with a matrix-based visualization, we develop a novel set visualization to facilitate the exploration and interpretation of discriminatory itemsets. A user study shows that users can interpret the visually encoded information in DiscriLens quickly and accurately. Use cases demonstrate that DiscriLens provides informative guidance in understanding and reducing algorithmic discrimination.
Collapse
|
12
|
Ma Y, Fan A, He J, Nelakurthi AR, Maciejewski R. A Visual Analytics Framework for Explaining and Diagnosing Transfer Learning Processes. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:1385-1395. [PMID: 33035164 DOI: 10.1109/tvcg.2020.3028888] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Many statistical learning models hold an assumption that the training data and the future unlabeled data are drawn from the same distribution. However, this assumption is difficult to fulfill in real-world scenarios and creates barriers in reusing existing labels from similar application domains. Transfer Learning is intended to relax this assumption by modeling relationships between domains, and is often applied in deep learning applications to reduce the demand for labeled data and training time. Despite recent advances in exploring deep learning models with visual analytics tools, little work has explored the issue of explaining and diagnosing the knowledge transfer process between deep learning models. In this paper, we present a visual analytics framework for the multi-level exploration of the transfer learning processes when training deep neural networks. Our framework establishes a multi-aspect design to explain how the learned knowledge from the existing model is transferred into the new learning task when training deep neural networks. Based on a comprehensive requirement and task analysis, we employ descriptive visualization with performance measures and detailed inspections of model behaviors from the statistical, instance, feature, and model structure levels. We demonstrate our framework through two case studies on image classification by fine-tuning AlexNets to illustrate how analysts can utilize our framework.
Collapse
|
13
|
Cheng F, Ming Y, Qu H. DECE: Decision Explorer with Counterfactual Explanations for Machine Learning Models. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:1438-1447. [PMID: 33074811 DOI: 10.1109/tvcg.2020.3030342] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
With machine learning models being increasingly applied to various decision-making scenarios, people have spent growing efforts to make machine learning models more transparent and explainable. Among various explanation techniques, counterfactual explanations have the advantages of being human-friendly and actionable-a counterfactual explanation tells the user how to gain the desired prediction with minimal changes to the input. Besides, counterfactual explanations can also serve as efficient probes to the models' decisions. In this work, we exploit the potential of counterfactual explanations to understand and explore the behavior of machine learning models. We design DECE, an interactive visualization system that helps understand and explore a model's decisions on individual instances and data subsets, supporting users ranging from decision-subjects to model developers. DECE supports exploratory analysis of model decisions by combining the strengths of counterfactual explanations at instance- and subgroup-levels. We also introduce a set of interactions that enable users to customize the generation of counterfactual explanations to find more actionable ones that can suit their needs. Through three use cases and an expert interview, we demonstrate the effectiveness of DECE in supporting decision exploration tasks and instance explanations.
Collapse
|