1. Deng D, Zhang C, Zheng H, Pu Y, Ji S, Wu Y. AdversaFlow: Visual Red Teaming for Large Language Models with Multi-Level Adversarial Flow. IEEE Transactions on Visualization and Computer Graphics 2025; 31:492-502. [PMID: 39283796; DOI: 10.1109/tvcg.2024.3456150]
Abstract
Large Language Models (LLMs) are powerful but also raise significant security concerns, particularly regarding the harm they can cause, such as generating fake news that manipulates public opinion on social media and providing responses to unethical activities. Traditional red teaming approaches for identifying AI vulnerabilities rely on manual prompt construction and expertise. This paper introduces AdversaFlow, a novel visual analytics system designed to enhance LLM security against adversarial attacks through human-AI collaboration. AdversaFlow involves adversarial training between a target model and a red model, featuring unique multi-level adversarial flow and fluctuation path visualizations. These features provide insights into adversarial dynamics and LLM robustness, enabling experts to identify and mitigate vulnerabilities effectively. We present quantitative evaluations and case studies validating our system's utility and offering insights for future AI security solutions. Our method can enhance LLM security, supporting downstream scenarios like social media regulation by enabling more effective detection, monitoring, and mitigation of harmful content and behaviors.
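For readers who want the gist of the red-teaming loop the abstract alludes to, here is a minimal sketch; red_model, target_model, and safety_score are hypothetical stand-ins, and nothing here reproduces AdversaFlow's actual adversarial training or flow visualization.

```python
# Minimal red-teaming loop in the spirit of the abstract (not AdversaFlow's code).
# `red_model`, `target_model`, and `safety_score` are hypothetical stand-ins.

def red_team_round(red_model, target_model, safety_score, seed_prompts):
    """One adversarial round: the red model rewrites seed prompts into
    candidate attacks; the target model answers; a safety scorer flags
    responses that slipped past the target's guardrails."""
    findings = []
    for seed in seed_prompts:
        attack = red_model.generate(f"Rewrite as an adversarial probe: {seed}")
        response = target_model.generate(attack)
        risk = safety_score(response)  # e.g., probability the reply is harmful
        if risk > 0.5:                 # threshold is an arbitrary assumption
            findings.append({"seed": seed, "attack": attack,
                             "response": response, "risk": risk})
    # Sorted findings could feed a multi-level adversarial-flow view.
    return sorted(findings, key=lambda f: -f["risk"])
```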
2. de Hond YJ, van Haaren PM, Tijssen RH, Hurkmans CW. Uncertainty estimation in female pelvic synthetic computed tomography generated from iterative reconstructed cone-beam computed tomography. Phys Imaging Radiat Oncol 2025; 33:100743. [PMID: 40123768; PMCID: PMC11926433; DOI: 10.1016/j.phro.2025.100743]
Abstract
Background and Purpose: Iterative reconstruction (IR) can improve cone-beam computed tomography (CBCT) image quality, and from such iteratively reconstructed (iCBCT) images, synthetic CT (sCT) images can be generated to enable accurate dose calculations. The aim of this study was to evaluate the uncertainty in generating sCT from iCBCT using vendor-supplied software for online adaptive radiotherapy.
Materials and Methods: Projection data from 20 female pelvic CBCTs were used to reconstruct iCBCT images. The process was repeated with 128 different IR parameter combinations, and sCTs were generated from these iCBCTs. Voxel value variation across the 128 iCBCT and 128 sCT images per patient was quantified by the standard deviation (STD). An additional sub-analysis was performed per parameter category.
Results: Generated sCTs had significantly higher maximum STD values (median 438 HU) than the input iCBCTs (median 198 HU), indicating limited robustness to parameter changes. The highest sCT STD values occurred in bone and soft tissue rather than air, and variations in sCT numbers were parameter dependent. Scatter correction produced the highest variance in sCTs (median 358 HU) despite no visible changes in the iCBCTs, whereas total variation regularization produced the lowest variance (median 233 HU) despite increased iCBCT blurriness.
Conclusions: Variations in iCBCT reconstruction parameters affected the CT number representation in the sCT. The sCT variance depended on the parameter category, with subtle iCBCT changes leading to significant density alterations in the sCT. It is therefore recommended to evaluate both iCBCT and sCT generation, especially when updating software or settings.
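The core variability metric is straightforward to state in code. A minimal sketch, assuming the reconstruction stacks are already loaded as NumPy arrays in Hounsfield units; the array shapes and toy noise levels below are invented for illustration:

```python
import numpy as np

def voxelwise_std(recons):
    """recons: array of shape (n_variants, z, y, x) in Hounsfield units.
    Returns the per-voxel standard deviation across reconstruction variants."""
    return np.std(recons, axis=0)

rng = np.random.default_rng(0)
icbct = rng.normal(0.0, 20.0, size=(128, 8, 64, 64))  # toy iCBCT variants
sct = rng.normal(0.0, 45.0, size=(128, 8, 64, 64))    # toy sCT variants

for name, stack in (("iCBCT", icbct), ("sCT", sct)):
    print(name, "max voxel STD: %.1f HU" % voxelwise_std(stack).max())
```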
Affiliation(s)
- Yvonne J.M. de Hond, Department of Radiation Oncology, Catharina Hospital Eindhoven, the Netherlands
- Rob H.N. Tijssen, Department of Radiation Oncology, Catharina Hospital Eindhoven, the Netherlands
- Coen W. Hurkmans, Department of Radiation Oncology, Catharina Hospital Eindhoven, the Netherlands
3. Li Y, Wang J, Aboagye P, Yeh CCM, Zheng Y, Wang L, Zhang W, Ma KL. Visual Analytics for Efficient Image Exploration and User-Guided Image Captioning. IEEE Transactions on Visualization and Computer Graphics 2024; 30:2875-2887. [PMID: 38625780; PMCID: PMC11412260; DOI: 10.1109/tvcg.2024.3388514]
Abstract
Recent advancements in pre-trained language-image models have ushered in a new era of visual comprehension. Leveraging the power of these models, this article tackles two issues within the realm of visual analytics: (1) the efficient exploration of large-scale image datasets and identification of data biases within them; (2) the evaluation of image captions and steering of their generation process. On the one hand, by visually examining the captions generated from language-image models for an image dataset, we gain deeper insights into the visual contents, unearthing data biases that may be entrenched within the dataset. On the other hand, by depicting the association between visual features and textual captions, we expose the weaknesses of pre-trained language-image models in their captioning capability and propose an interactive interface to steer caption generation. The two parts have been coalesced into a coordinated visual analytics system, fostering the mutual enrichment of visual and textual contents. We validate the effectiveness of the system with domain practitioners through concrete case studies with large-scale image datasets.
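As one concrete reading of the exploration workflow, the sketch below counts keyword themes across model-generated captions so that skewed counts can surface potential dataset bias; the captioner and keyword_fn callables are assumptions, not the paper's components.

```python
from collections import Counter

def caption_themes(images, captioner, keyword_fn, top_k=20):
    """Count caption keywords across a dataset; skewed counts hint at bias.
    `captioner` and `keyword_fn` are stand-ins (e.g., a BLIP-style model
    and a noun extractor), not the paper's implementation."""
    counts = Counter()
    for img in images:
        caption = captioner(img)            # generate a caption per image
        counts.update(keyword_fn(caption))  # tally its keywords
    return counts.most_common(top_k)

# Toy run with stand-in callables:
fake_captions = iter(["a dog on a couch", "a dog in a park", "a cat on a couch"])
print(caption_themes(range(3), lambda _: next(fake_captions), str.split))
```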
4. Lyu L, Pang C, Wang J. Understanding the role of pathways in a deep neural network. Neural Netw 2024; 172:106095. [PMID: 38199152; DOI: 10.1016/j.neunet.2024.106095]
Abstract
Deep neural networks have demonstrated superior performance in artificial intelligence applications, but the opacity of their inner workings remains a major drawback. The prevailing unit-based interpretation is a statistical observation of stimulus-response data, which fails to reveal the detailed internal processes behind a network's inherent mechanisms. In this work, we analyze a convolutional neural network (CNN) trained on a classification task and present an algorithm to extract the diffusion pathways of individual pixels, identifying the locations of pixels in an input image associated with object classes. The pathways allow us to test which causal components are important for classification, and the pathway-based representations are clearly distinguishable between categories. We find that the few largest pathways of an individual pixel tend to cross the feature maps in each layer that are important for classification, and that the large pathways of images of the same category are more consistent in their trends than those of different categories. We also apply the pathways to understanding adversarial attacks, object completion, and movement perception. Furthermore, the total number of pathways on feature maps across all layers can clearly discriminate the original, deformed, and target samples.
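A heavily simplified sketch of the pathway idea: starting from one traced pixel, follow the channel with the largest contribution through successive layers. The paper's diffusion-based extraction is more involved; this only illustrates what a per-pixel route across feature maps could look like, with toy contribution arrays standing in for real layer data.

```python
import numpy as np

def greedy_pathway(contribs):
    """contribs: list of per-layer arrays, each of shape (channels,), holding
    the traced pixel's contribution to every feature map in that layer.
    Returns one channel index per layer: a greedy 'pathway' through the net."""
    return [int(np.argmax(np.abs(layer))) for layer in contribs]

rng = np.random.default_rng(1)
toy_contribs = [rng.normal(size=c) for c in (16, 32, 64)]  # three toy layers
print("pathway (channel per layer):", greedy_pathway(toy_contribs))
```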
Affiliation(s)
- Lei Lyu, School of Information Science and Engineering, Shandong Normal University, Jinan, China
- Chen Pang, School of Information Science and Engineering, Shandong Normal University, Jinan, China
- Jihua Wang, School of Information Science and Engineering, Shandong Normal University, Jinan, China
5. Xu P, Liu Q, Bao H, Zhang R, Gu L, Wang G. FDSR: An Interpretable Frequency Division Stepwise Process Based Single-Image Super-Resolution Network. IEEE Transactions on Image Processing 2024; 33:1710-1725. [PMID: 38416622; DOI: 10.1109/tip.2024.3368960]
Abstract
Deep learning has excelled in single-image super-resolution (SISR) applications, yet the lack of interpretability in most deep learning-based SR networks hinders their applicability, especially in fields such as medical imaging that require transparent computation. To address this problem, we present an interpretable frequency-division SR network that operates in the image frequency domain. It comprises a frequency division module and a step-wise reconstruction method, which divide the image into different frequencies and perform reconstruction accordingly. We develop a frequency division loss function to ensure that each reconstruction module (ReM) operates solely on one image frequency. These methods establish an interpretable framework for SR networks, visualizing the image reconstruction process and reducing the black-box nature of SR networks. Additionally, we revisit the subpixel-layer upsampling process by deriving its inverse and designing a displacement generation module; this interpretable upsampling process incorporates subpixel information and is similar to pre-upsampling frameworks. Furthermore, we develop a new ReM based on interpretable Hessian attention to enhance network performance. Extensive experiments demonstrate that our network, without the frequency division loss, outperforms state-of-the-art methods qualitatively and quantitatively. Including the frequency division loss enhances the network's interpretability and robustness while decreasing PSNR and SSIM only slightly, by an average of 0.48 dB and 0.0049, respectively.
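To make the frequency-division constraint concrete, the sketch below masks an image's 2D FFT into a radial band and measures the energy a reconstruction module leaves outside its assigned band, the kind of quantity a frequency division loss could penalize. The band edges and normalization are assumptions, not the paper's definitions.

```python
import numpy as np

def band_mask(shape, r_lo, r_hi):
    """Boolean mask selecting a radial frequency band (normalized radius)."""
    cy, cx = shape[0] / 2, shape[1] / 2
    y, x = np.ogrid[:shape[0], :shape[1]]
    r = np.hypot(y - cy, x - cx) / np.hypot(cy, cx)
    return (r >= r_lo) & (r < r_hi)

def out_of_band_energy(img, r_lo, r_hi):
    """Spectral energy outside the assigned band: a candidate loss term."""
    spec = np.fft.fftshift(np.fft.fft2(img))
    mask = band_mask(img.shape, r_lo, r_hi)
    return float(np.sum(np.abs(spec[~mask]) ** 2))

img = np.random.default_rng(2).normal(size=(64, 64))
print("energy outside the 0.00-0.25 band: %.3e" % out_of_band_energy(img, 0.0, 0.25))
```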
6. Li Y, Wang J, Fujiwara T, Ma KL. Visual Analytics of Neuron Vulnerability to Adversarial Attacks on Convolutional Neural Networks. ACM Transactions on Interactive Intelligent Systems 2023. [DOI: 10.1145/3587470]
Abstract
Adversarial attacks on a convolutional neural network (CNN), which inject human-imperceptible perturbations into an input image, can fool a high-performance CNN into making incorrect predictions. The success of adversarial attacks raises serious concerns about the robustness of CNNs and prevents their use in safety-critical applications such as medical diagnosis and autonomous driving. Our work introduces a visual analytics approach to understanding adversarial attacks by answering two questions: (1) which neurons are more vulnerable to attacks, and (2) which image features do these vulnerable neurons capture during the prediction? For the first question, we introduce multiple perturbation-based measures to break down the attacking magnitude into individual CNN neurons and rank the neurons by their vulnerability levels. For the second, we identify image features (e.g., cat ears) that highly stimulate a user-selected neuron to augment and validate the neuron's responsibility. Furthermore, we support interactive exploration of a large number of neurons through hierarchical clustering based on the neurons' roles in the prediction. To this end, a visual analytics system is designed to incorporate visual reasoning for interpreting adversarial attacks. We validate the effectiveness of our system through multiple case studies as well as feedback from domain experts.
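One plausible form of a perturbation-based vulnerability measure is sketched below: rank a layer's channels by how much their activations change between a clean image and its adversarial counterpart. The paper defines several such measures; this L2-based variant is an illustrative assumption.

```python
import numpy as np

def channel_vulnerability(act_clean, act_adv):
    """act_*: (channels, h, w) activations of one layer for the clean and
    adversarial inputs. Returns (channel, L2 change) pairs, largest first."""
    n_channels = act_clean.shape[0]
    diff = np.linalg.norm(
        (act_adv - act_clean).reshape(n_channels, -1), axis=1)
    order = np.argsort(-diff)
    return list(zip(order.tolist(), diff[order].tolist()))

rng = np.random.default_rng(3)
clean = rng.normal(size=(8, 16, 16))
adv = clean + rng.normal(scale=0.2, size=clean.shape)  # toy 'attack'
print(channel_vulnerability(clean, adv)[:3])  # three most vulnerable channels
```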
Affiliation(s)
- Yiran Li, University of California, Davis, USA
7. Wang Q, Huang K, Chandak P, Zitnik M, Gehlenborg N. Extending the Nested Model for User-Centric XAI: A Design Study on GNN-based Drug Repurposing. IEEE Transactions on Visualization and Computer Graphics 2023; 29:1266-1276. [PMID: 36223348; DOI: 10.1109/tvcg.2022.3209435]
Abstract
Whether AI explanations can help users achieve specific tasks efficiently (i.e., usable explanations) is significantly influenced by their visual presentation. While many techniques exist to generate explanations, it remains unclear how to select and visually present AI explanations based on the characteristics of domain users. This paper aims to understand this question through a multidisciplinary design study for a specific problem: explaining graph neural network (GNN) predictions to domain experts in drug repurposing, i.e., the reuse of existing drugs for new diseases. Building on the nested design model of visualization, we incorporate XAI design considerations from a literature review and from our collaborators' feedback into the design process. Specifically, we discuss XAI-related design considerations for usable visual explanations at each design layer: target user, usage context, domain explanation, and XAI goal at the domain layer; format, granularity, and operation of explanations at the abstraction layer; encodings and interactions at the visualization layer; and XAI and rendering algorithm at the algorithm layer. We present how the extended nested model motivates and informs the design of DrugExplorer, an XAI tool for drug repurposing. Based on our domain characterization, DrugExplorer provides path-based explanations and presents them both as individual paths and as meta-paths for two key XAI operations, "why" and "what else". DrugExplorer offers a novel visualization design called MetaMatrix with a set of interactions that help domain users organize and compare explanation paths at different levels of granularity to generate domain-meaningful insights. We demonstrate the effectiveness of the selected visual presentation and of DrugExplorer as a whole via a usage scenario, a user study, and expert interviews. From these evaluations, we derive insightful observations and reflections that can inform the design of XAI visualizations for other scientific applications.
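The path/meta-path distinction at the heart of DrugExplorer's explanations can be illustrated compactly: individual explanation paths are grouped by their node-type signature. The toy graph schema below is invented for illustration and is not the paper's knowledge graph.

```python
from collections import defaultdict

def to_meta_paths(paths, node_type):
    """Group explanation paths by their node-type signature (the meta-path)."""
    groups = defaultdict(list)
    for path in paths:
        signature = tuple(node_type[n] for n in path)
        groups[signature].append(path)
    return groups

# Invented schema: two Drug -> Gene -> Disease explanation paths.
node_type = {"drugA": "Drug", "geneX": "Gene", "disease1": "Disease",
             "drugB": "Drug", "geneY": "Gene"}
paths = [["drugA", "geneX", "disease1"], ["drugB", "geneY", "disease1"]]
for sig, members in to_meta_paths(paths, node_type).items():
    print(" -> ".join(sig), ":", members)
```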
8
|
Cakmak E, Jackle D, Schreck T, Keim DA, Fuchs J. Multiscale Visualization: A Structured Literature Analysis. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2022; 28:4918-4929. [PMID: 34478370 DOI: 10.1109/tvcg.2021.3109387] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
Multiscale visualizations are typically used to analyze multiscale processes and data in various application domains, such as the visual exploration of hierarchical genome structures in molecular biology. However, creating such multiscale visualizations remains challenging due to the plethora of existing work and ambiguous terminology in visualization research. To date, there has been little work comparing and categorizing multiscale visualizations to understand their design practices. In this article, we present a structured literature analysis providing an overview of common design practices in multiscale visualization research. We systematically reviewed and categorized 122 journal and conference articles published between 1995 and 2020, and organized the reviewed articles in a taxonomy that reveals common design factors. Researchers and practitioners can use our taxonomy to explore existing work and to create new multiscale navigation and visualization techniques. Based on the reviewed articles, we examine research trends and highlight open research challenges.
9. Representation and analysis of time-series data via deep embedding and visual exploration. J Vis (Tokyo) 2022. [DOI: 10.1007/s12650-022-00890-3]
10. Chen C, Wu J, Wang X, Xiang S, Zhang SH, Tang Q, Liu S. Towards Better Caption Supervision for Object Detection. IEEE Transactions on Visualization and Computer Graphics 2022; 28:1941-1954. [PMID: 34962870; DOI: 10.1109/tvcg.2021.3138933]
Abstract
As training high-performance object detectors requires expensive bounding-box annotations, recent methods resort to freely available image captions. However, detectors trained with caption supervision perform poorly because captions are usually noisy and cannot provide precise location information. To tackle this issue, we present a visual analysis method that tightly integrates caption supervision with object detection so that the two mutually enhance each other. In particular, object labels are first extracted from captions and used to train the detectors; the objects detected from images are then fed back into caption supervision for further improvement. To effectively loop users into the object detection process, a node-link-based set visualization, supported by a multi-type relational co-clustering algorithm, is developed to explain the relationships between the extracted labels and the images with detected objects. The co-clustering algorithm clusters labels and images simultaneously by utilizing both their representations and their relationships. Quantitative evaluations and a case study demonstrate the efficiency and effectiveness of the developed method in improving the performance of object detectors.
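The first step of the loop, extracting object labels from captions, can be sketched as a vocabulary match; real pipelines add lemmatization and synonym handling, and the category set below is an assumption rather than the paper's vocabulary.

```python
import re

# Assumed detector category set; a real system would use the full label space.
VOCAB = {"dog", "cat", "car", "person", "bicycle"}

def labels_from_caption(caption):
    """Harvest object labels from a free-form caption by vocabulary match."""
    tokens = set(re.findall(r"[a-z]+", caption.lower()))
    return sorted(VOCAB & tokens)

print(labels_from_caption("A person walks a dog past a parked car."))
# ['car', 'dog', 'person']
```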
11. Fu S, Li M, Bian Z, Ma J. [Performance of low-dose CT image reconstruction for detecting intracerebral hemorrhage: selection of dose, algorithms and their combinations]. Nan Fang Yi Ke Da Xue Xue Bao (Journal of Southern Medical University) 2022; 42:223-231. [PMID: 35365446; PMCID: PMC8983357; DOI: 10.12122/j.issn.1673-4254.2022.02.08]
Abstract
Objective: To investigate the performance of different low-dose CT image reconstruction algorithms for detecting intracerebral hemorrhage.
Methods: Low-dose CT imaging was simulated on CT images of intracerebral hemorrhage at 30%, 25% and 20% of the normal dose level (defined as 100% dose). Seven algorithms were tested to reconstruct the low-dose CT images for noise suppression: filtered back projection (FBP), penalized weighted least squares-total variation (PWLS-TV), non-local means filtering (NLM), block matching 3D (BM3D), the residual encoder-decoder convolutional neural network (REDCNN), the FBP convolutional neural network (FBPConvNet), and the image restoration iterative residual convolutional network (IRLNet). A deep learning-based model (CNN-LSTM) was used to detect intracerebral hemorrhage on normal-dose CT images and on the low-dose images reconstructed by the seven algorithms, and detection performance was evaluated by comparing results between the normal-dose and low-dose images.
Results: Low-dose CT images reconstructed by FBP had hemorrhage-detection accuracies of 82.21%, 74.61% and 65.55% at the 30%, 25% and 20% dose levels, respectively. At the same dose level (30%), images reconstructed by FBP, PWLS-TV, NLM, BM3D, REDCNN, FBPConvNet and IRLNet had detection accuracies of 82.21%, 86.80%, 89.37%, 81.43%, 90.05%, 90.72% and 93.51%, respectively. Images reconstructed by IRLNet at the 30%, 25% and 20% dose levels had detection accuracies of 93.51%, 93.51% and 93.06%, respectively.
Conclusion: The performance of reconstructed low-dose CT images for detecting intracerebral hemorrhage is significantly affected by both the dose and the reconstruction algorithm. In clinical practice, choosing an appropriate dose level and reconstruction algorithm can greatly reduce the radiation dose while preserving the detection performance of CT imaging for intracerebral hemorrhage.
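The evaluation protocol reduces to running one detector over every dose/algorithm combination and comparing accuracies. A minimal harness under that reading, with the detector and image pipelines as stand-ins rather than the paper's CNN-LSTM and reconstructions:

```python
def accuracy(detector, images, labels):
    """Fraction of images whose predicted label matches the ground truth."""
    hits = sum(int(detector(img) == lab) for img, lab in zip(images, labels))
    return hits / len(labels)

def compare_pipelines(detector, pipelines, labels):
    """pipelines: mapping like {'FBP@30%': [img, ...], 'IRLNet@30%': [...]}."""
    return {name: accuracy(detector, imgs, labels)
            for name, imgs in pipelines.items()}

# Toy run: a 'detector' that thresholds a scalar stand-in for an image.
labels = [1, 0, 1, 1]
pipelines = {"FBP@30%": [0.9, 0.2, 0.4, 0.8], "IRLNet@30%": [0.9, 0.1, 0.7, 0.8]}
print(compare_pipelines(lambda x: int(x > 0.5), pipelines, labels))
```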
Affiliation(s)
- Fu Shuai, School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China; Guangzhou Key Laboratory of Medical Radiation Imaging and Detection Technology, Guangzhou 510515, China
- Li Mingqiang, Pazhou Lab, Guangzhou 510515, China
- Bian Zhaoying, School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China; Guangzhou Key Laboratory of Medical Radiation Imaging and Detection Technology, Guangzhou 510515, China
- Ma Jianhua, School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China; Guangzhou Key Laboratory of Medical Radiation Imaging and Detection Technology, Guangzhou 510515, China
12. He W, Zou L, Shekar AK, Gou L, Ren L. Where Can We Help? A Visual Analytics Approach to Diagnosing and Improving Semantic Segmentation of Movable Objects. IEEE Transactions on Visualization and Computer Graphics 2022; 28:1040-1050. [PMID: 34587077; DOI: 10.1109/tvcg.2021.3114855]
Abstract
Semantic segmentation is a critical component in autonomous driving and must be thoroughly evaluated due to safety concerns. Deep neural network (DNN) based semantic segmentation models are widely used in autonomous driving, but they are challenging to evaluate due to their black-box nature, and it is even more difficult to assess model performance for crucial objects, such as lost cargo and pedestrians. In this work, we propose VASS, a visual analytics approach to diagnosing and improving the accuracy and robustness of semantic segmentation models, especially for critical objects moving through various driving scenes. The key component of our approach is context-aware spatial representation learning, which extracts important spatial information about objects, such as position, size, and aspect ratio, with respect to given scene contexts. Based on this spatial representation, we first create visual summarizations to analyze model performance, and we then use it to guide the generation of adversarial examples that evaluate the models' spatial robustness and yield actionable insights. We demonstrate the effectiveness of VASS via two case studies of lost cargo detection and pedestrian detection in autonomous driving; for both cases, we show quantitative improvements in model performance obtained with the actionable insights from VASS.
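At its simplest, the spatial representation can be approximated by per-object geometry relative to the image frame. The sketch below derives position, size, and aspect ratio from a boolean instance mask; VASS's scene-context conditioning and representation learning are omitted, so this is only an illustrative reduction.

```python
import numpy as np

def spatial_features(mask):
    """mask: boolean (h, w) array marking one object instance."""
    ys, xs = np.nonzero(mask)
    h, w = mask.shape
    box_h = ys.max() - ys.min() + 1
    box_w = xs.max() - xs.min() + 1
    return {
        "cx": xs.mean() / w,              # normalized horizontal center
        "cy": ys.mean() / h,              # normalized vertical center
        "area": mask.sum() / (h * w),     # relative object size
        "aspect": box_w / box_h,          # bounding-box aspect ratio
    }

toy = np.zeros((100, 200), dtype=bool)
toy[40:60, 90:130] = True  # a 20 x 40 rectangle near the image center
print(spatial_features(toy))
```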
13. Zeng W, Lin C, Lin J, Jiang J, Xia J, Turkay C, Chen W. Revisiting the Modifiable Areal Unit Problem in Deep Traffic Prediction with Visual Analytics. IEEE Transactions on Visualization and Computer Graphics 2021; 27:839-848. [PMID: 33074818; DOI: 10.1109/tvcg.2020.3030410]
Abstract
Deep learning methods are increasingly used for urban traffic prediction, where spatiotemporal traffic data are aggregated into sequentially organized matrices and fed into convolution-based residual neural networks. However, the widely known modifiable areal unit problem within such aggregation processes can introduce perturbations in the network inputs. This issue can significantly destabilize the feature embeddings and the predictions, rendering deep networks much less useful for experts. This paper approaches the challenge by leveraging unit visualization techniques that enable the investigation of many-to-many relationships between dynamically varied multi-scalar aggregations of urban traffic data and neural network predictions. Through regular exchanges with a domain expert, we design and develop a visual analytics solution that integrates (1) a Bivariate Map equipped with an advanced bivariate colormap to simultaneously depict input traffic and prediction errors across space, (2) a Moran's I Scatterplot that provides local indicators of spatial association, and (3) a Multi-scale Attribution View that arranges non-linear dot plots in a tree layout to support model analysis and comparison across scales. We evaluate our approach through a series of case studies involving a real-world dataset of Shenzhen taxi trips and through interviews with domain experts. We observe that geographical scale variations have an important impact on prediction performance, and that interactive visual exploration of dynamically varying inputs and outputs benefits experts in the development of deep traffic prediction models.
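The statistic behind the Moran's I Scatterplot is standard and easy to sketch: local Moran's I on a regular grid with rook-adjacency weights, applied here to a toy error field. Real analyses typically row-standardize the weight matrix; this textbook form is only illustrative and is not the paper's implementation.

```python
import numpy as np

def local_morans_i(grid):
    """Local Moran's I on a 2D grid with 4-neighbour (rook) weights:
    I_i = z_i * (sum of neighbouring z_j) / mean(z^2)."""
    z = grid - grid.mean()
    var = (z ** 2).mean()
    lag = np.zeros_like(z)
    lag[1:, :] += z[:-1, :]; lag[:-1, :] += z[1:, :]   # north/south neighbours
    lag[:, 1:] += z[:, :-1]; lag[:, :-1] += z[:, 1:]   # west/east neighbours
    return z * lag / var  # high positive values indicate clustered errors

errors = np.random.default_rng(4).normal(size=(6, 6))  # toy prediction errors
print(np.round(local_morans_i(errors), 2))
```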