1
|
Bertucci D, Hamid MM, Anand Y, Ruangrotsakun A, Tabatabai D, Perez M, Kahng M. DendroMap: Visual Exploration of Large-Scale Image Datasets for Machine Learning with Treemaps. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:320-330. [PMID: 36166545 DOI: 10.1109/tvcg.2022.3209425] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/16/2023]
Abstract
In this paper, we present DendroMap, a novel approach to interactively exploring large-scale image datasets for machine learning (ML). ML practitioners often explore image datasets by generating a grid of images or projecting high-dimensional representations of images into 2-D using dimensionality reduction techniques (e.g., t-SNE). However, neither approach effectively scales to large datasets because images are ineffectively organized and interactions are insufficiently supported. To address these challenges, we develop DendroMap by adapting Treemaps, a well-known visualization technique. DendroMap effectively organizes images by extracting hierarchical cluster structures from high-dimensional representations of images. It enables users to make sense of the overall distributions of datasets and interactively zoom into specific areas of interests at multiple levels of abstraction. Our case studies with widely-used image datasets for deep learning demonstrate that users can discover insights about datasets and trained models by examining the diversity of images, identifying underperforming subgroups, and analyzing classification errors. We conducted a user study that evaluates the effectiveness of DendroMap in grouping and searching tasks by comparing it with a gridified version of t-SNE and found that participants preferred DendroMap. DendroMap is available at https://div-lab.github.io/dendromap/.
Collapse
|
2
|
Artificial Intelligence-Based Medical Data Mining. J Pers Med 2022; 12:jpm12091359. [PMID: 36143144 PMCID: PMC9501106 DOI: 10.3390/jpm12091359] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/05/2022] [Revised: 08/02/2022] [Accepted: 08/17/2022] [Indexed: 11/17/2022] Open
Abstract
Understanding published unstructured textual data using traditional text mining approaches and tools is becoming a challenging issue due to the rapid increase in electronic open-source publications. The application of data mining techniques in the medical sciences is an emerging trend; however, traditional text-mining approaches are insufficient to cope with the current upsurge in the volume of published data. Therefore, artificial intelligence-based text mining tools are being developed and used to process large volumes of data and to explore the hidden features and correlations in the data. This review provides a clear-cut and insightful understanding of how artificial intelligence-based data-mining technology is being used to analyze medical data. We also describe a standard process of data mining based on CRISP-DM (Cross-Industry Standard Process for Data Mining) and the most common tools/libraries available for each step of medical data mining.
Collapse
|
3
|
Knittel J, Koch S, Tang T, Chen W, Wu Y, Liu S, Ertl T. Real-Time Visual Analysis of High-Volume Social Media Posts. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2022; 28:879-889. [PMID: 34587041 DOI: 10.1109/tvcg.2021.3114800] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
Breaking news and first-hand reports often trend on social media platforms before traditional news outlets cover them. The real-time analysis of posts on such platforms can reveal valuable and timely insights for journalists, politicians, business analysts, and first responders, but the high number and diversity of new posts pose a challenge. In this work, we present an interactive system that enables the visual analysis of streaming social media data on a large scale in real-time. We propose an efficient and explainable dynamic clustering algorithm that powers a continuously updated visualization of the current thematic landscape as well as detailed visual summaries of specific topics of interest. Our parallel clustering strategy provides an adaptive stream with a digestible but diverse selection of recent posts related to relevant topics. We also integrate familiar visual metaphors that are highly interlinked for enabling both explorative and more focused monitoring tasks. Analysts can gradually increase the resolution to dive deeper into particular topics. In contrast to previous work, our system also works with non-geolocated posts and avoids extensive preprocessing such as detecting events. We evaluated our dynamic clustering algorithm and discuss several use cases that show the utility of our system.
Collapse
|
4
|
Takahashi S, Uchita A, Watanabe K, Arikawa M. Gaze-driven placement of items for proactive visual exploration. J Vis (Tokyo) 2021; 25:613-633. [PMID: 34785979 PMCID: PMC8581132 DOI: 10.1007/s12650-021-00808-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/03/2021] [Revised: 09/11/2021] [Accepted: 10/06/2021] [Indexed: 11/17/2022]
Abstract
Recent advances in digital signage technology have improved the ability to visually select specific items within a group. Although this is due to the ability to dynamically update the display of items, the corresponding layout schemes remain a subject of research. This paper explores the sophisticated layout of items by respecting the underlying context of searching for favorite items. Our study begins by formulating the static placement of items as an optimization problem that incorporates aesthetic layout criteria as constraints. This is further extended to accommodate the dynamic placement of items for more proactive visual exploration based on the ongoing search context. Our animated layout is driven by analyzing the distribution of eye gaze through an eye-tracking device, by which we infer how the most attractive items lead to the finally wanted ones. We create a planar layout of items as a context map to establish association rules to dynamically replace existing items with new ones. For this purpose, we extract the set of important topics from a set of annotated texts associated with the items using matrix factorization. We also conduct user studies to evaluate the validity of the design criteria incorporated into both static and dynamic placement of items. After discussing the pros and cons of the proposed approach and possible themes for future research, we conclude this paper.
Collapse
Affiliation(s)
- Shigeo Takahashi
- Department of Computer Science and Engineering, University of Aizu, Aizu-Wakamatsu, 965-8580 Japan
| | - Akane Uchita
- Department of Computer Science and Engineering, University of Aizu, Aizu-Wakamatsu, 965-8580 Japan
| | - Kazuho Watanabe
- Department of Computer Science and Engineering, Toyohashi University of Technology, Toyohashi, 441-8580 Japan
| | - Masatoshi Arikawa
- Graduate School of Engineering Science, Akita University, Akita, 010-8502 Japan
| |
Collapse
|
5
|
Kim H, Drake B, Endert A, Park H. ArchiText: Interactive Hierarchical Topic Modeling. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:3644-3655. [PMID: 32191890 DOI: 10.1109/tvcg.2020.2981456] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Human-in-the-loop topic modeling allows users to explore and steer the process to produce better quality topics that align with their needs. When integrated into visual analytic systems, many existing automated topic modeling algorithms are given interactive parameters to allow users to tune or adjust them. However, this has limitations when the algorithms cannot be easily adapted to changes, and it is difficult to realize interactivity closely supported by underlying algorithms. Instead, we emphasize the concept of tight integration, which advocates for the need to co-develop interactive algorithms and interactive visual analytic systems in parallel to allow flexibility and scalability. In this article, we describe design goals for efficiently and effectively executing the concept of tight integration among computation, visualization, and interaction for hierarchical topic modeling of text data. We propose computational base operations for interactive tasks to achieve the design goals. To instantiate our concept, we present ArchiText, a prototype system for interactive hierarchical topic modeling, which offers fast, flexible, and algorithmically valid analysis via tight integration. Utilizing interactive hierarchical topic modeling, our technique lets users generate, explore, and flexibly steer hierarchical topics to discover more informed topics and their document memberships.
Collapse
|
6
|
Lafia S, Kuhn W, Caylor K, Hemphill L. Mapping research topics at multiple levels of detail. PATTERNS (NEW YORK, N.Y.) 2021; 2:100210. [PMID: 33748794 PMCID: PMC7961178 DOI: 10.1016/j.patter.2021.100210] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 08/31/2020] [Revised: 12/09/2020] [Accepted: 01/20/2021] [Indexed: 11/18/2022]
Abstract
The institutional review of interdisciplinary bodies of research lacks methods to systematically produce higher-level abstractions. Abstraction methods, like the "distant reading" of corpora, are increasingly important for knowledge discovery in the sciences and humanities. We demonstrate how abstraction methods complement the metrics on which research reviews currently rely. We model cross-disciplinary topics of research publications and projects emerging at multiple levels of detail in the context of an institutional review of the Earth Research Institute (ERI) at the University of California at Santa Barbara. From these, we design science maps that reveal the latent thematic structure of ERI's interdisciplinary research and enable reviewers to "read" a body of research at multiple levels of detail. We find that our approach provides decision support and reveals trends that strengthen the institutional review process by exposing regions of thematic expertise, distributions and clusters of work, and the evolution of these aspects.
Collapse
Affiliation(s)
- Sara Lafia
- ICPSR, University of Michigan, Ann Arbor, MI 48109, USA
| | - Werner Kuhn
- Department of Geography, University of California, Santa Barbara, CA 93106, USA
| | - Kelly Caylor
- Department of Geography, University of California, Santa Barbara, CA 93106, USA
| | | |
Collapse
|
7
|
SemanticAxis: exploring multi-attribute data by semantic construction and ranking analysis. J Vis (Tokyo) 2021. [DOI: 10.1007/s12650-020-00733-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]
|
8
|
Chen S, Andrienko N, Andrienko G, Adilova L, Barlet J, Kindermann J, Nguyen PH, Thonnard O, Turkay C. LDA Ensembles for Interactive Exploration and Categorization of Behaviors. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:2775-2792. [PMID: 30869622 DOI: 10.1109/tvcg.2019.2904069] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
We define behavior as a set of actions performed by some actor during a period of time. We consider the problem of analyzing a large collection of behaviors by multiple actors, more specifically, identifying typical behaviors and spotting anomalous behaviors. We propose an approach leveraging topic modeling techniques - LDA (Latent Dirichlet Allocation) Ensembles - to represent categories of typical behaviors by topics that are obtained through topic modeling a behavior collection. When such methods are applied to text in natural languages, the quality of the extracted topics are usually judged based on the semantic relatedness of the terms pertinent to the topics. This criterion, however, is not necessarily applicable to topics extracted from non-textual data, such as action sets, since relationships between actions may not be obvious. We have developed a suite of visual and interactive techniques supporting the construction of an appropriate combination of topics based on other criteria, such as distinctiveness and coverage of the behavior set. Two case studies on analyzing operation behaviors in the security management system and visiting behaviors in an amusement park, and the expert evaluation of the first case study demonstrate the effectiveness of our approach.
Collapse
|
9
|
Han Q, Thom D, John M, Koch S, Heimerl F, Ertl T. Visual Quality Guidance for Document Exploration with Focus+Context Techniques. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:2715-2731. [PMID: 30676964 DOI: 10.1109/tvcg.2019.2895073] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
Magic lens based focus+context techniques are powerful means for exploring document spatializations. Typically, they only offer additional summarized or abstracted views on focused documents. As a consequence, users might miss important information that is either not shown in aggregated form or that never happens to get focused. In this work, we present the design process and user study results for improving a magic lens based document exploration approach with exemplary visual quality cues to guide users in steering the exploration and support them in interpreting the summarization results. We contribute a thorough analysis of potential sources of information loss involved in these techniques, which include the visual spatialization of text documents, user-steered exploration, and the visual summarization. With lessons learned from previous research, we highlight the various ways those information losses could hamper the exploration. Furthermore, we formally define measures for the aforementioned different types of information losses and bias. Finally, we present the visual cues to depict these quality measures that are seamlessly integrated into the exploration approach. These visual cues guide users during the exploration and reduce the risk of misinterpretation and accelerate insight generation. We conclude with the results of a controlled user study and discuss the benefits and challenges of integrating quality guidance in exploration techniques.
Collapse
|
10
|
Abstract
Stability in social, technical, and financial systems, as well as the capacity of organizations to work across borders, requires consistency in public policy across jurisdictions. The diffusion of laws and regulations across political boundaries can reduce the tension that arises between innovation and consistency. Policy diffusion has been a topic of focus across the social sciences for several decades, but due to limitations of data and computational capacity, researchers have not taken a comprehensive and data-intensive look at the aggregate, cross-policy patterns of diffusion. This work combines visual analytics and text and network analyses to help understand how policies, as represented in digitized text, spread across states. As a result, our approach can quickly guide analysts to progressively gain insights into policy adoption data. We evaluate the effectiveness of our system via case studies with a real-world policy dataset and qualitative interviews with domain experts.
Collapse
Affiliation(s)
- Yongsu Ahn
- University of Pittsburgh, Pittsburgh, PA
| | - Yu-Ru Lin
- University of Pittsburgh, Pittsburgh, PA
| |
Collapse
|
11
|
Sherkat E, Milios EE, Minghim R. A Visual Analytics Approach for Interactive Document Clustering. ACM T INTERACT INTEL 2020. [DOI: 10.1145/3241380] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
Abstract
Document clustering is a necessary step in various analytical and automated activities. When guided by the user, algorithms are tailored to imprint a perspective on the clustering process that reflects the user’s understanding of the dataset. More than just allow for customized adjustment of the clusters, a visual analytics approach will provide tools for the user to draw new insights on the collection. While contributing his or her perspective, the user will also acquire a deeper understanding of the data set. To that effect, we propose a novel visual analytics system for interactive document clustering. We built our system on top of clustering algorithms that can adapt to user’s feedback. In the proposed system, initial clustering is created based on the user-defined number of clusters and the selected clustering algorithm. A set of coordinated visualizations allow the examination of the dataset and the results of the clustering. The visualization provides the user with the highlights of individual documents and understanding of the evolution of documents over the time period to which they relate. The users then interact with the process by means of changing key-terms that drive the process according to their knowledge of the documents domain. In key-term-based interaction, the user assigns a set of key-terms to each target cluster to guide the clustering algorithm. We have improved that process with a novel algorithm for choosing proper seeds for the clustering. Results demonstrate that not only the system has improved considerably its precision, but also its effectiveness in the document-based decision making. A set of quantitative experiments and a user study have been conducted to show the advantages of the approach for document analytics based on clustering. We performed and reported on the use of the framework in a real decision-making scenario that relates users discussion by email to decision making in improving patient care. Results show that the framework is useful even for more complex data sets such as email conversations.
Collapse
|
12
|
Dessavre DG, Ramirez-Marquez JE. Nar-A-Viz: A methodology to visually extract the narrative structure of text. COMPUT SPEECH LANG 2019. [DOI: 10.1016/j.csl.2019.02.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/27/2022]
|
13
|
Xie X, Cai X, Zhou J, Cao N, Wu Y. A Semantic-Based Method for Visualizing Large Image Collections. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2019; 25:2362-2377. [PMID: 29993720 DOI: 10.1109/tvcg.2018.2835485] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Interactive visualization of large image collections is important and useful in many applications, such as personal album management and user profiling on images. However, most prior studies focus on using low-level visual features of images, such as texture and color histogram, to create visualizations without considering the more important semantic information embedded in images. This paper proposes a novel visual analytic system to analyze images in a semantic-aware manner. The system mainly comprises two components: a semantic information extractor and a visual layout generator. The semantic information extractor employs an image captioning technique based on convolutional neural network (CNN) to produce descriptive captions for images, which can be transformed into semantic keywords. The layout generator employs a novel co-embedding model to project images and the associated semantic keywords to the same 2D space. Inspired by the galaxy metaphor, we further turn the projected 2D space to a galaxy visualization of images, in which semantic keywords and images are visually encoded as stars and planets. Our system naturally supports multi-scale visualization and navigation, in which users can immediately see a semantic overview of an image collection and drill down for detailed inspection of a certain group of images. Users can iteratively refine the visual layout by integrating their domain knowledge into the co-embedding process. Two task-based evaluations are conducted to demonstrate the effectiveness of our system.
Collapse
|
14
|
Liu S, Wang X, Collins C, Dou W, Ouyang F, El-Assady M, Jiang L, Keim DA. Bridging Text Visualization and Mining: A Task-Driven Survey. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2019; 25:2482-2504. [PMID: 29993887 DOI: 10.1109/tvcg.2018.2834341] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Visual text analytics has recently emerged as one of the most prominent topics in both academic research and the commercial world. To provide an overview of the relevant techniques and analysis tasks, as well as the relationships between them, we comprehensively analyzed 263 visualization papers and 4,346 mining papers published between 1992-2017 in two fields: visualization and text mining. From the analysis, we derived around 300 concepts (visualization techniques, mining techniques, and analysis tasks) and built a taxonomy for each type of concept. The co-occurrence relationships between the concepts were also extracted. Our research can be used as a stepping-stone for other researchers to 1) understand a common set of concepts used in this research topic; 2) facilitate the exploration of the relationships between visualization techniques, mining techniques, and analysis tasks; 3) understand the current practice in developing visual text analytics tools; 4) seek potential research opportunities by narrowing the gulf between visualization and mining techniques based on the analysis tasks; and 5) analyze other interdisciplinary research areas in a similar way. We have also contributed a web-based visualization tool for analyzing and understanding research trends and opportunities in visual text analytics.
Collapse
|
15
|
Ji X, Shen HW, Ritter A, Machiraju R, Yen PY. Visual Exploration of Neural Document Embedding in Information Retrieval: Semantics and Feature Selection. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2019; 25:2181-2192. [PMID: 30892213 DOI: 10.1109/tvcg.2019.2903946] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
Neural embeddings are widely used in language modeling and feature generation with superior computational power. Particularly, neural document embedding - converting texts of variable-length to semantic vector representations - has shown to benefit widespread downstream applications, e.g., information retrieval (IR). However, the black-box nature makes it difficult to understand how the semantics are encoded and employed. We propose visual exploration of neural document embedding to gain insights into the underlying embedding space, and promote the utilization in prevalent IR applications. In this study, we take an IR application-driven view, which is further motivated by biomedical IR in healthcare decision-making, and collaborate with domain experts to design and develop a visual analytics system. This system visualizes neural document embeddings as a configurable document map and enables guidance and reasoning; facilitates to explore the neural embedding space and identify salient neural dimensions (semantic features) per task and domain interest; and supports advisable feature selection (semantic analysis) along with instant visual feedback to promote IR performance. We demonstrate the usefulness and effectiveness of this system and present inspiring findings in use cases. This work will help designers/developers of downstream applications gain insights and confidence in neural document embedding, and exploit that to achieve more favorable performance in application domains.
Collapse
|
16
|
|
17
|
El-Assady M, Sperrle F, Deussen O, Keim D, Collins C. Visual Analytics for Topic Model Optimization based on User-Steerable Speculative Execution. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 25:374-384. [PMID: 30235133 DOI: 10.1109/tvcg.2018.2864769] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
To effectively assess the potential consequences of human interventions in model-driven analytics systems, we establish the concept of speculative execution as a visual analytics paradigm for creating user-steerable preview mechanisms. This paper presents an explainable, mixed-initiative topic modeling framework that integrates speculative execution into the algorithmic decisionmaking process. Our approach visualizes the model-space of our novel incremental hierarchical topic modeling algorithm, unveiling its inner-workings. We support the active incorporation of the user's domain knowledge in every step through explicit model manipulation interactions. In addition, users can initialize the model with expected topic seeds, the backbone priors. For a more targeted optimization, the modeling process automatically triggers a speculative execution of various optimization strategies, and requests feedback whenever the measured model quality deteriorates. Users compare the proposed optimizations to the current model state and preview their effect on the next model iterations, before applying one of them. This supervised human-in-the-loop process targets maximum improvement for minimum feedback and has proven to be effective in three independent studies that confirm topic model quality improvements.
Collapse
|
18
|
Suh S, Shin S, Lee J, Reddy CK, Choo J. Localized user-driven topic discovery via boosted ensemble of nonnegative matrix factorization. Knowl Inf Syst 2018. [DOI: 10.1007/s10115-017-1147-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
|
19
|
Liu S, Chen C, Lu Y, Ouyang F, Wang B. An Interactive Method to Improve Crowdsourced Annotations. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 25:235-245. [PMID: 30130224 DOI: 10.1109/tvcg.2018.2864843] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
In order to effectively infer correct labels from noisy crowdsourced annotations, learning-from-crowds models have introduced expert validation. However, little research has been done on facilitating the validation procedure. In this paper, we propose an interactive method to assist experts in verifying uncertain instance labels and unreliable workers. Given the instance labels and worker reliability inferred from a learning-from-crowds model, candidate instances and workers are selected for expert validation. The influence of verified results is propagated to relevant instances and workers through the learning-from-crowds model. To facilitate the validation of annotations, we have developed a confusion visualization to indicate the confusing classes for further exploration, a constrained projection method to show the uncertain labels in context, and a scatter-plot-based visualization to illustrate worker reliability. The three visualizations are tightly integrated with the learning-from-crowds model to provide an iterative and progressive environment for data validation. Two case studies were conducted that demonstrate our approach offers an efficient method for validating and improving crowdsourced annotations.
Collapse
|
20
|
Dowling M, Wenskovitch J, Fry JT, Leman S, House L, North C. SIRIUS: Dual, Symmetric, Interactive Dimension Reductions. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2018; 25:172-182. [PMID: 30136978 DOI: 10.1109/tvcg.2018.2865047] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Much research has been done regarding how to visualize and interact with observations and attributes of high-dimensional data for exploratory data analysis. From the analyst's perceptual and cognitive perspective, current visualization approaches typically treat the observations of the high-dimensional dataset very differently from the attributes. Often, the attributes are treated as inputs (e.g., sliders), and observations as outputs (e.g., projection plots), thus emphasizing investigation of the observations. However, there are many cases in which analysts wish to investigate both the observations and the attributes of the dataset, suggesting a symmetry between how analysts think about attributes and observations. To address this, we define SIRIUS (Symmetric Interactive Representations In a Unified System), a symmetric, dual projection technique to support exploratory data analysis of high-dimensional data. We provide an example implementation of SIRIUS and demonstrate how this symmetry affords additional insights.
Collapse
|
21
|
|