1. Li G, Togo R, Ogawa T, Haseyama M. Importance-aware adaptive dataset distillation. Neural Netw 2024; 172:106154. [PMID: 38309137] [DOI: 10.1016/j.neunet.2024.106154]
Abstract
Herein, we propose a novel dataset distillation method for constructing small informative datasets that preserve the information of large original datasets. The development of deep learning models is enabled by the availability of large-scale datasets. Despite their unprecedented success, large-scale datasets considerably increase storage and transmission costs, resulting in a cumbersome model training process. Moreover, using raw data for training raises privacy and copyright concerns. To address these issues, a new task named dataset distillation has been introduced, which aims to synthesize a compact dataset that retains the essential information of the large original dataset. State-of-the-art (SOTA) dataset distillation methods match gradients or network parameters obtained during training on the real and synthetic datasets. However, the contribution of different network parameters to the distillation process varies, and treating them uniformly degrades distillation performance. Based on this observation, we propose an importance-aware adaptive dataset distillation (IADD) method that improves distillation performance by automatically assigning importance weights to different network parameters during distillation, thereby synthesizing more robust distilled datasets. IADD outperforms other SOTA parameter-matching-based dataset distillation methods on multiple benchmark datasets and in cross-architecture generalization. In addition, an analysis of the self-adaptive weights demonstrates the effectiveness of IADD. Furthermore, the effectiveness of IADD is validated in a real-world medical application, namely COVID-19 detection.
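A minimal sketch of the core idea described above: parameter matching with learnable per-layer importance weights. All names here are our own and the paper's exact formulation may differ; this is an illustration, not the authors' code.
```python
import torch

def weighted_param_match_loss(student_params, teacher_params, log_w):
    """Hypothetical importance-aware parameter-matching loss.

    student_params / teacher_params: lists of per-layer tensors from networks
    trained on the synthetic and real datasets, respectively.
    log_w: learnable per-layer log-weights, optimized jointly with the
    synthetic images, so that more important layers receive larger weights.
    """
    w = torch.softmax(log_w, dim=0)  # normalized importance weights
    loss = 0.0
    for i, (ps, pt) in enumerate(zip(student_params, teacher_params)):
        # normalized distance so layers of different sizes are comparable
        loss = loss + w[i] * (ps - pt).pow(2).sum() / pt.pow(2).sum().clamp(min=1e-12)
    return loss
```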
Affiliations
- Guang Li: Education and Research Center for Mathematical and Data Science, Hokkaido University, N-12, W-7, Kita-Ku, Sapporo, 060-0812, Japan
- Ren Togo: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-Ku, Sapporo, 060-0814, Japan
- Takahiro Ogawa: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-Ku, Sapporo, 060-0814, Japan
- Miki Haseyama: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-Ku, Sapporo, 060-0814, Japan
2. Watanabe Y, Togo R, Maeda K, Ogawa T, Haseyama M. Text-Guided Image Editing Based on Post Score for Gaining Attention on Social Media. Sensors (Basel) 2024; 24:921. [PMID: 38339636] [PMCID: PMC10857700] [DOI: 10.3390/s24030921]
Abstract
Text-guided image editing has attracted attention in computer vision and natural language processing in recent years. The approach takes an image and a text prompt as input and aims to edit the image in accordance with the text prompt while preserving text-unrelated regions. The results of text-guided image editing differ depending on how the text prompt is phrased, even when prompts have the same meaning, and it is up to the user to decide which result best matches the intended use of the edited image. This paper assumes a situation in which edited images are posted to social media and proposes a novel text-guided image editing method that helps edited images gain attention from a larger audience. In the proposed method, we apply a pre-trained text-guided image editing model to obtain multiple edited images from multiple text prompts generated by a large language model. The proposed method then uses a novel model that predicts post scores representing engagement rates and selects the single edited image expected to gain the most attention on social media. Experiments with human subjects on a dataset of real Instagram posts demonstrate that the edited images of the proposed method accurately reflect the content of the text prompts and give the audience a more positive impression than those of previous text-guided image editing methods.
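The selection step lends itself to a short sketch. Here edit_fn and score_fn are hypothetical stand-ins for the pre-trained editor and the post-score predictor; the real models are not shown.
```python
def select_best_edit(image, prompts, edit_fn, score_fn):
    """Sketch of score-based selection (our reading of the pipeline):
    edit the image once per prompt, predict a post score for each result,
    and return the candidate with the highest predicted score."""
    candidates = [edit_fn(image, p) for p in prompts]
    scores = [score_fn(c) for c in candidates]
    return candidates[max(range(len(scores)), key=scores.__getitem__)]
```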
Affiliations
- Yuto Watanabe: Graduate School of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan
- Ren Togo: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan
- Keisuke Maeda: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan
- Takahiro Ogawa: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan
- Miki Haseyama: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan
3. Watanabe Y, Togo R, Maeda K, Ogawa T, Haseyama M. Manipulation Direction: Evaluating Text-Guided Image Manipulation Based on Similarity between Changes in Image and Text Modalities. Sensors (Basel) 2023; 23:9287. [PMID: 38005673] [PMCID: PMC10675000] [DOI: 10.3390/s23229287]
Abstract
At present, text-guided image manipulation is a notable subject of study in the vision and language field. Given an image and text as inputs, these methods aim to manipulate the image according to the text while preserving text-irrelevant regions. Although extensive research has improved the versatility and performance of text-guided image manipulation, research on evaluating its performance remains inadequate. This study proposes Manipulation Direction (MD), a logical and robust metric that evaluates text-guided image manipulation by focusing on changes between the image and text modalities. Specifically, we define MD as the consistency of the changes in the images and texts occurring before and after manipulation. Using MD, we can comprehensively evaluate how an image has changed through manipulation and whether this change agrees with the text. Extensive experiments on Multi-Modal-CelebA-HQ and Caltech-UCSD Birds confirmed that MD scores correlated more strongly with subjective scores for manipulated images than existing metrics did.
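One plausible reading of "consistency of changes" is a directional similarity in a joint image-text embedding space. The sketch below uses that interpretation with assumed CLIP-style encoders; it is our illustration, not the paper's exact definition of MD.
```python
import torch.nn.functional as F

def manipulation_direction(img_before, img_after, txt_before, txt_after,
                           encode_image, encode_text):
    """MD-style score sketch: cosine similarity between the change in image
    embeddings and the change in text embeddings. encode_image/encode_text
    are assumed joint-space encoders (e.g., CLIP) returning torch tensors."""
    d_img = encode_image(img_after) - encode_image(img_before)
    d_txt = encode_text(txt_after) - encode_text(txt_before)
    return F.cosine_similarity(d_img, d_txt, dim=-1)
```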
Affiliations
- Yuto Watanabe: Graduate School of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan
- Ren Togo: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan
- Keisuke Maeda: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan
- Takahiro Ogawa: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan
- Miki Haseyama: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan
4. Li G, Togo R, Ogawa T, Haseyama M. Self-supervised learning for gastritis detection with gastric X-ray images. Int J Comput Assist Radiol Surg 2023; 18:1841-1848. [PMID: 37040011] [DOI: 10.1007/s11548-023-02891-5]
Abstract
PURPOSE Manual annotation of gastric X-ray images by doctors for gastritis detection is time-consuming and expensive. To address this, a self-supervised learning method is developed in this study, and its effectiveness for gastritis detection is verified using a small number of annotated gastric X-ray images. METHODS We develop a novel method that can perform explicit self-supervised learning and learn discriminative representations from gastric X-ray images. Models trained with the proposed method were fine-tuned on datasets comprising a small number of annotated gastric X-ray images. Five self-supervised learning methods (SimSiam, BYOL, PIRL-jigsaw, PIRL-rotation, and SimCLR) were compared with the proposed method, as were three previous approaches: a model pretrained on ImageNet, a model trained from scratch, and a semi-supervised learning method. RESULTS The proposed method's harmonic mean scores of sensitivity and specificity after fine-tuning with the annotated data of 10, 20, 30, and 40 patients were 0.875, 0.911, 0.915, and 0.931, respectively. The proposed method outperformed all comparative methods, demonstrating its effectiveness for gastritis detection with few annotated gastric X-ray images. CONCLUSIONS This paper proposes a novel self-supervised learning method based on a teacher-student architecture for gastritis detection from gastric X-ray images. The method performs explicit self-supervised learning, learns discriminative representations, and shows potential for clinical use when only a few annotated gastric X-ray images are available.
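The harmonic mean score reported here combines sensitivity and specificity into a single figure of merit; a minimal helper for reference:
```python
def harmonic_mean_score(sensitivity, specificity):
    """Harmonic mean of sensitivity and specificity: 2*Se*Sp / (Se + Sp)."""
    if sensitivity + specificity == 0:
        return 0.0
    return 2 * sensitivity * specificity / (sensitivity + specificity)

# Example with illustrative values (not taken from the paper):
# harmonic_mean_score(0.90, 0.85) -> 0.8742...
```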
Affiliations
- Guang Li: Graduate School of Information Science and Technology, Hokkaido University, Sapporo, Japan
- Ren Togo: Faculty of Information Science and Technology, Hokkaido University, Sapporo, Japan
- Takahiro Ogawa: Faculty of Information Science and Technology, Hokkaido University, Sapporo, Japan
- Miki Haseyama: Faculty of Information Science and Technology, Hokkaido University, Sapporo, Japan
5. Yoshida M, Togo R, Ogawa T, Haseyama M. Off-Screen Sound Separation Based on Audio-visual Pre-training Using Binaural Audio. Sensors (Basel) 2023; 23:4540. [PMID: 37177744] [PMCID: PMC10181533] [DOI: 10.3390/s23094540]
Abstract
This study proposes a novel off-screen sound separation method based on audio-visual pre-training. In audio-visual analysis, researchers have leveraged visual information for audio manipulation tasks such as sound source separation. Although such tasks rely on correspondences between audio and video, these correspondences are not always established. Specifically, sounds coming from outside the screen have no audio-visual correspondence and thus interfere with conventional audio-visual learning. The proposed method separates such off-screen sounds based on their arrival directions using binaural audio, which provides a three-dimensional sense of sound direction. Furthermore, we propose a new pre-training method that takes the off-screen space into account and use the obtained representation to improve off-screen sound separation. Consequently, the proposed method can separate off-screen sounds irrespective of the direction from which they arrive. We evaluated the method on generated video data because ground truth for off-screen sounds is difficult to collect, and confirmed its effectiveness through off-screen sound detection and separation tasks.
Affiliations
- Masaki Yoshida: Graduate School of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan
- Ren Togo: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan
- Takahiro Ogawa: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan
- Miki Haseyama: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan
6. Li G, Togo R, Ogawa T, Haseyama M. COVID-19 detection based on self-supervised transfer learning using chest X-ray images. Int J Comput Assist Radiol Surg 2023; 18:715-722. [PMID: 36538184] [PMCID: PMC9765379] [DOI: 10.1007/s11548-022-02813-x]
Abstract
PURPOSE Given the large number of patients screened during the COVID-19 pandemic, computer-aided detection has strong potential to improve clinical workflow efficiency and reduce the incidence of infections among radiologists and healthcare providers. Since many confirmed COVID-19 cases present radiological findings of pneumonia, radiologic examinations can be useful for fast detection. Chest radiography can therefore be used to quickly screen for COVID-19 during patient triage, determining the priority of each patient's care and helping saturated medical facilities in a pandemic. METHODS We propose a new learning scheme called self-supervised transfer learning for detecting COVID-19 from chest X-ray (CXR) images. We compared six self-supervised learning (SSL) methods (Cross, BYOL, SimSiam, SimCLR, PIRL-jigsaw, and PIRL-rotation) and six pretrained DCNNs (ResNet18, ResNet50, ResNet101, CheXNet, DenseNet201, and InceptionV3) with the proposed method. We provide a quantitative evaluation on the largest open COVID-19 CXR dataset and qualitative results for visual inspection. RESULTS Our method achieved a harmonic mean (HM) score of 0.985, an AUC of 0.999, and a four-class accuracy of 0.953. We also used the visualization technique Grad-CAM++ to generate visual explanations for different classes of CXR images, increasing the interpretability of the proposed method. CONCLUSIONS Our method shows that knowledge learned from natural images through transfer learning benefits SSL on CXR images and boosts representation learning for COVID-19 detection. It promises to reduce the incidence of infections among radiologists and healthcare providers.
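A minimal sketch of the overall pipeline as we read it, not the paper's code: start from an ImageNet-pretrained encoder (transfer learning), continue pretraining with an SSL objective on unlabeled CXR images, then fine-tune a small labeled head. The backbone choice and class set below are assumptions.
```python
import torch
import torch.nn as nn
import torchvision

# 1) ImageNet-pretrained encoder (transfer learning starting point).
encoder = torchvision.models.resnet18(
    weights=torchvision.models.ResNet18_Weights.IMAGENET1K_V1)
encoder.fc = nn.Identity()  # expose 512-d features for SSL pretraining

# 2) SSL pretraining on unlabeled CXR images would go here
#    (e.g., a BYOL/SimSiam-style objective; omitted for brevity).

# 3) Fine-tuning head for four-class CXR classification (classes assumed).
head = nn.Linear(512, 4)

x = torch.randn(2, 3, 224, 224)  # dummy CXR batch
logits = head(encoder(x))        # shape: (2, 4)
```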
Affiliations
- Guang Li: Graduate School of Information Science and Technology, Hokkaido University, Sapporo, Japan
- Ren Togo: Faculty of Information Science and Technology, Hokkaido University, Sapporo, Japan
- Takahiro Ogawa: Faculty of Information Science and Technology, Hokkaido University, Sapporo, Japan
- Miki Haseyama: Faculty of Information Science and Technology, Hokkaido University, Sapporo, Japan
7. Zhu H, Togo R, Ogawa T, Haseyama M. Diversity Learning Based on Multi-Latent Space for Medical Image Visual Question Generation. Sensors (Basel) 2023; 23:1057. [PMID: 36772095] [PMCID: PMC9919063] [DOI: 10.3390/s23031057]
Abstract
Auxiliary clinical diagnosis has been studied as a way to compensate for unevenly and insufficiently distributed clinical resources. However, diagnosis is still dominated by human physicians, and how to involve intelligent systems more deeply in the diagnostic process is an emerging concern. An interactive automated clinical diagnosis that combines a question-answering system with a question generation system can capture a patient's condition from multiple perspectives with less physician involvement by asking different questions to drive and guide the diagnosis. This process requires diverse information to evaluate a patient from different perspectives and reach an accurate diagnosis, yet recently proposed medical question generation systems have not considered diversity. We therefore propose a diversity learning-based visual question generation model that uses a multi-latent space to generate informative question sets from medical images. The proposed method generates varied questions by embedding visual and language information in different latent spaces, whose diversity is encouraged by a newly proposed loss. We also add control over the categories of generated questions, making the generated questions directional. Furthermore, we use a new metric named similarity to evaluate the proposed model's performance accurately. Experimental results on the Slake and VQA-RAD datasets demonstrate that the proposed method can generate questions with diverse information. Our model can work with an answering model for interactive automated clinical diagnosis and can generate datasets, replacing an annotation process that incurs huge labor costs.
Affiliations
- He Zhu: Graduate School of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan
- Ren Togo: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan
- Takahiro Ogawa: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan
- Miki Haseyama: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan
8. Maeda K, Togo R, Ogawa T, Adachi SI, Yoshizawa F, Haseyama M. Trial Analysis of the Relationship between Taste and Biological Information Obtained While Eating Strawberries for Sensory Evaluation. Sensors (Basel) 2022; 22:9496. [PMID: 36502199] [PMCID: PMC9738716] [DOI: 10.3390/s22239496]
Abstract
This paper presents a trial analysis of the relationship between taste and biological information obtained while eating strawberries, for sensory evaluation. The study used visual analog scale (VAS) ratings, questionnaires used in previous studies, and human brain activity recorded while eating strawberries. In our analysis, we assumed that brain activity is highly correlated with taste. The relationships between brain activity and the other data, such as the VAS ratings and questionnaires, could then be analyzed through canonical correlation analysis, a multivariate method. Analyzing brain activity in this way can uncover potential relationships with taste that a simple correlation analysis does not reveal; this is the main contribution of this study. In the experiments, we discovered a potential relationship between cultural factors (in the questionnaires) and taste, and found a strong relationship between taste and individual information. In particular, the analysis of cross-loadings between brain activity and individual information suggests that acidity and the sugar-to-acid ratio are related to taste.
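For readers unfamiliar with canonical correlation analysis, a minimal sketch follows. The data are random placeholders standing in for the study's brain-activity and questionnaire/VAS features; the feature counts are arbitrary assumptions.
```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
X = rng.standard_normal((40, 8))   # placeholder brain-activity features per trial
Y = rng.standard_normal((40, 5))   # placeholder VAS / questionnaire features

cca = CCA(n_components=2)
Xc, Yc = cca.fit_transform(X, Y)   # paired canonical variates

# Canonical correlations: correlation between each pair of variates.
r = [np.corrcoef(Xc[:, k], Yc[:, k])[0, 1] for k in range(2)]
```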
Affiliations
- Keisuke Maeda: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Japan
- Ren Togo: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Japan
- Takahiro Ogawa: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Japan
- Shin-ichi Adachi: Center for Bioscience Research and Education, Utsunomiya University, 350, Mine-machi, Utsunomiya 321-8505, Japan; Faculty of Health Sciences for Welfare, Kansai University of Welfare Sciences, 3-11-1, Asahigaoka, Kashiwabara, Osaka 582-0026, Japan
- Fumiaki Yoshizawa: School of Agriculture, Utsunomiya University, 350, Mine-machi, Utsunomiya 321-8505, Japan
- Miki Haseyama: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Japan
9. Li G, Togo R, Ogawa T, Haseyama M. Compressed gastric image generation based on soft-label dataset distillation for medical data sharing. Comput Methods Programs Biomed 2022; 227:107189. [PMID: 36323177] [DOI: 10.1016/j.cmpb.2022.107189]
Abstract
BACKGROUND AND OBJECTIVE Sharing of medical data is required to enable the cross-agency flow of healthcare information and to construct high-accuracy computer-aided diagnosis systems. However, the large size of medical datasets, the large memory footprint of saved deep convolutional neural network (DCNN) models, and the need to protect patients' privacy can make medical data sharing inefficient. Therefore, this study proposes a novel soft-label dataset distillation method for medical data sharing. METHODS The proposed method distills the valid information of medical image data and generates several compressed images with different data distributions for anonymous medical data sharing. Furthermore, our method can extract the essential weights of DCNN models, reducing the memory required to save trained models. RESULTS The proposed method can compress tens of thousands of images into several soft-label images and reduce the size of a trained model to a few hundredths of its original size. The compressed images obtained after distillation are visually anonymized and therefore do not contain the patients' private information, and high detection performance can be achieved with a small number of compressed images. CONCLUSIONS The experimental results show that the proposed method can improve the efficiency and security of medical data sharing.
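A short sketch of what "soft-label" distillation means in practice: both the synthetic images and their label distributions are trainable. Shapes and the two-class setting below are assumptions for illustration only.
```python
import torch

n_distilled, n_classes = 10, 2
syn_images = torch.randn(n_distilled, 1, 128, 128, requires_grad=True)
syn_logits = torch.zeros(n_distilled, n_classes, requires_grad=True)

# Learnable soft labels: a probability distribution per distilled image,
# rather than a fixed hard class.
soft_labels = torch.softmax(syn_logits, dim=1)

# In the distillation loop (omitted), a model trained on
# (syn_images, soft_labels) is evaluated on real data, and gradients
# w.r.t. syn_images and syn_logits update the distilled set.
```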
Affiliations
- Guang Li: Graduate School of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-Ku, Sapporo, 060-0814, Japan
- Ren Togo: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-Ku, Sapporo, 060-0814, Japan
- Takahiro Ogawa: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-Ku, Sapporo, 060-0814, Japan
- Miki Haseyama: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-Ku, Sapporo, 060-0814, Japan
10. Maeda K, Takada S, Haruyama T, Togo R, Ogawa T, Haseyama M. Distress Detection in Subway Tunnel Images via Data Augmentation Based on Selective Image Cropping and Patching. Sensors (Basel) 2022; 22:8932. [PMID: 36433529] [PMCID: PMC9699127] [DOI: 10.3390/s22228932]
Abstract
Distresses such as cracks directly reflect the structural integrity of subway tunnels, so detecting subway tunnel distress is an essential task in tunnel structure maintenance. This paper improves the performance of deep learning-based distress detection to support the maintenance of subway tunnels through a new data augmentation method, selective image cropping and patching (SICAP). Specifically, we generate effective training data for the distress detection model by focusing on the distressed regions via SICAP and then train the detection model on the expanded training data. The images generated by SICAP do not change the pixel values of the original images, so little information is lost, and the generated images are effective for constructing a model that is robust across various subway tunnel lines. We conducted experiments against several comparative methods, and the results show that our data augmentation improves detection performance.
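An illustrative sketch of a selective cropping-and-patching step, consistent with the description above that pixel values are copied unchanged. This is our reading of the idea, not the authors' implementation; the patch size and grayscale, same-sized inputs are assumptions.
```python
import numpy as np

def sicap_augment(img_a, img_b, mask_a, rng, patch=32):
    """Copy a crop centered on a distressed region of img_a into img_b.

    img_a, img_b: same-sized 2-D grayscale arrays; mask_a: boolean mask of
    distressed pixels in img_a; rng: numpy Generator; patch: assumed size.
    """
    ys, xs = np.nonzero(mask_a)
    if len(ys) == 0:
        return img_b.copy()          # no distress to transplant
    i = rng.integers(len(ys))
    y0 = max(int(ys[i]) - patch // 2, 0)
    x0 = max(int(xs[i]) - patch // 2, 0)
    out = img_b.copy()
    crop = img_a[y0:y0 + patch, x0:x0 + patch]   # pixel values kept intact
    out[y0:y0 + crop.shape[0], x0:x0 + crop.shape[1]] = crop
    return out
```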
Affiliations
- Keisuke Maeda: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan
- Saya Takada: Graduate School of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan
- Tomoki Haruyama: Graduate School of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan
- Ren Togo: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan
- Takahiro Ogawa: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan
- Miki Haseyama: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan
11. Wang A, Togo R, Ogawa T, Haseyama M. Defect Detection of Subway Tunnels Using Advanced U-Net Network. Sensors (Basel) 2022; 22:2330. [PMID: 35336501] [PMCID: PMC8955254] [DOI: 10.3390/s22062330]
Abstract
In this paper, we present a novel defect detection model based on an improved U-Net architecture. As a semantic segmentation task, defect detection in real-world data faces background-foreground imbalance, multi-scale targets, and feature similarity between the background and defects. Conventional convolutional neural network (CNN)-based models are designed mainly for natural-image tasks and are insensitive to these problems. The proposed method adopts a network design for multi-scale segmentation based on the U-Net architecture that includes an atrous spatial pyramid pooling (ASPP) module and an inception module, and it can detect various types of defects better than conventional simple CNN-based methods. In experiments on a real-world subway tunnel image dataset, the proposed method outperformed general semantic segmentation models, including state-of-the-art methods, and achieved a good detection balance across multi-scale defects.
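For reference, a minimal ASPP sketch: parallel atrous (dilated) convolutions at several rates, concatenated and fused to capture multi-scale context. This is a generic illustration with assumed dilation rates, not the paper's exact module.
```python
import torch
import torch.nn as nn

class ASPP(nn.Module):
    """Minimal atrous spatial pyramid pooling sketch (illustrative only)."""
    def __init__(self, in_ch, out_ch, rates=(1, 6, 12, 18)):
        super().__init__()
        # padding == dilation keeps spatial size constant for 3x3 kernels
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r) for r in rates)
        self.fuse = nn.Conv2d(out_ch * len(rates), out_ch, 1)

    def forward(self, x):
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))

# Usage: features = ASPP(256, 64)(torch.randn(1, 256, 32, 32))
```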
Affiliations
- An Wang (corresponding author): Graduate School of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Japan
- Ren Togo: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Japan
- Takahiro Ogawa: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Japan
- Miki Haseyama: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Japan
12. Li Z, Kitajima K, Hirata K, Togo R, Takenaka J, Miyoshi Y, Kudo K, Ogawa T, Haseyama M. Preliminary study of AI-assisted diagnosis using FDG-PET/CT for axillary lymph node metastasis in patients with breast cancer. EJNMMI Res 2021; 11:10. [PMID: 33492478] [PMCID: PMC7835273] [DOI: 10.1186/s13550-021-00751-4]
Abstract
BACKGROUND To improve the diagnostic accuracy of axillary lymph node (LN) metastasis in breast cancer patients using 2-[18F]FDG-PET/CT, we constructed an artificial intelligence (AI)-assisted diagnosis system that uses deep-learning technologies. MATERIALS AND METHODS Two clinicians and the new AI system retrospectively analyzed and diagnosed 414 axillae of 407 patients with biopsy-proven breast cancer who had undergone 2-[18F]FDG-PET/CT before a mastectomy or breast-conserving surgery with a sentinel LN biopsy and/or axillary LN dissection. We designed and trained a deep 3D convolutional neural network (CNN) as the AI model and blended the clinicians' diagnoses with the AI model's diagnoses to improve diagnostic accuracy. RESULTS Although the AI model did not outperform the clinicians, the clinicians' diagnostic accuracies were considerably improved by collaborating with it: the two clinicians' sensitivities of 59.8% and 57.4% increased to 68.6% and 64.2%, respectively, whereas their specificities of 99.0% and 99.5% remained unchanged. CONCLUSIONS AI using deep-learning technologies is expected to be useful in diagnosing axillary LN metastasis with 2-[18F]FDG-PET/CT. Even if the diagnostic performance of AI is not better than that of clinicians, taking AI diagnoses into consideration may positively impact overall diagnostic accuracy.
Affiliations
- Zongyao Li: Graduate School of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo, 060-0814, Japan
- Kazuhiro Kitajima: Department of Radiology, Division of Nuclear Medicine and PET Center, Hyogo College of Medicine, 1-1 Mukogawa-cho, Nishinomiya, Hyogo, 663-8501, Japan
- Kenji Hirata: Department of Diagnostic Imaging, Graduate School of Medicine, Hokkaido University, Kita 15, Nishi 7, Kita-Ku, Sapporo, Hokkaido, 060-8638, Japan
- Ren Togo: Education and Research Center for Mathematical and Data Science, Hokkaido University, N-12, W-7, Kita-ku, Sapporo, 060-0812, Japan
- Junki Takenaka: Department of Diagnostic Imaging, Graduate School of Medicine, Hokkaido University, Kita 15, Nishi 7, Kita-Ku, Sapporo, Hokkaido, 060-8638, Japan
- Yasuo Miyoshi: Department of Breast and Endocrine Surgery, Hyogo College of Medicine, 1-1 Mukogawa-cho, Nishinomiya, Hyogo, 663-8501, Japan
- Kohsuke Kudo: Department of Diagnostic Imaging, Graduate School of Medicine, Hokkaido University, Kita 15, Nishi 7, Kita-Ku, Sapporo, Hokkaido, 060-8638, Japan; Global Center for Biomedical Science and Engineering, Faculty of Medicine, Hokkaido University, N-14, W-9, Kita-ku, Sapporo, 060-0814, Japan
- Takahiro Ogawa: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo, 060-0814, Japan
- Miki Haseyama: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo, 060-0814, Japan
13. Togo R, Watanabe H, Ogawa T, Haseyama M. Deep convolutional neural network-based anomaly detection for organ classification in gastric X-ray examination. Comput Biol Med 2020; 123:103903. [PMID: 32658795] [DOI: 10.1016/j.compbiomed.2020.103903]
Abstract
AIM The aim of this study was to determine whether our deep convolutional neural network-based anomaly detection model can distinguish esophagus images from stomach images obtained in gastric X-ray examinations. METHODS A total of 6012 subjects were analyzed. Because far fewer esophagus X-ray images than gastric X-ray images are taken in X-ray examinations, we took an anomaly detection approach to the organ classification task. We constructed a deep autoencoding Gaussian mixture model (DAGMM) with a convolutional autoencoder architecture; the trained model produces an anomaly score for a given test X-ray image. For comparison, we used the original DAGMM, AnoGAN, and a one-class support vector machine (OCSVM) trained on features obtained from a pre-trained Inception-v3 network. RESULTS The sensitivity, specificity, and harmonic mean of sensitivity and specificity of the proposed method were 0.956, 0.980, and 0.968, respectively; those of the original DAGMM were 0.932, 0.883, and 0.907; those of AnoGAN were 0.835, 0.833, and 0.834; and those of OCSVM were 0.932, 0.935, and 0.934. The experimental results show the effectiveness of the proposed method for the organ classification task. CONCLUSION Our deep convolutional neural network-based anomaly detection model has shown potential for clinical use in organ classification.
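A small sketch of how an anomaly score turns into an organ classification, as we read the setup; the thresholding rule and validation-based threshold choice are assumptions, not the paper's procedure.
```python
import numpy as np

def classify_by_anomaly_score(scores, threshold):
    """Images whose anomaly score exceeds the threshold are assigned to the
    rare class (esophagus), the rest to the common class (stomach)."""
    return np.asarray(scores) > threshold

def sensitivity_specificity(pred, truth):
    """pred/truth: boolean arrays, True = esophagus (the anomaly class)."""
    pred, truth = np.asarray(pred), np.asarray(truth)
    sens = (pred & truth).sum() / truth.sum()
    spec = (~pred & ~truth).sum() / (~truth).sum()
    return sens, spec
```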
Affiliations
- Ren Togo: Education and Research Center for Mathematical and Data Science, Hokkaido University, N-12, W-7, Kita-ku, Sapporo, 060-0812, Japan
- Haruna Watanabe: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-Ku, Sapporo, 060-0814, Japan
- Takahiro Ogawa: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-Ku, Sapporo, 060-0814, Japan
- Miki Haseyama: Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-Ku, Sapporo, 060-0814, Japan
14. Kanai M, Togo R, Ogawa T, Haseyama M. Chronic atrophic gastritis detection with a convolutional neural network considering stomach regions. World J Gastroenterol 2020; 26:3650-3659. [PMID: 32742133] [PMCID: PMC7366055] [DOI: 10.3748/wjg.v26.i25.3650]
Abstract
BACKGROUND The risk of gastric cancer increases in patients with Helicobacter pylori-associated chronic atrophic gastritis (CAG). X-ray examination can evaluate the condition of the stomach and can be used for gastric cancer mass screening. However, the number of doctors skilled in interpreting X-ray examinations is decreasing owing to the diversification of inspection methods.
AIM To evaluate the effectiveness of stomach regions automatically estimated by a deep learning-based model for CAG detection.
METHODS We used 815 gastric X-ray images (GXIs) obtained from 815 subjects. The ground truth of this study was the diagnostic results of X-ray and endoscopic examinations. For part of the training GXIs, the stomach regions were manually annotated, and a model for automatic estimation of the stomach regions was trained with these GXIs. For the rest, the stomach regions were estimated automatically. Finally, a model for automatic CAG detection was trained with all of the training GXIs.
RESULTS When the stomach regions were manually annotated for only 10 GXIs and 30 GXIs, the harmonic means of the sensitivity and specificity of CAG detection were 0.955 ± 0.002 and 0.963 ± 0.004, respectively.
CONCLUSION By estimating stomach regions automatically, our method reduces the workload of manual annotation while maintaining accurate CAG detection.
Affiliations
- Misaki Kanai: Graduate School of Information Science and Technology, Hokkaido University, Sapporo 0600814, Hokkaido, Japan
- Ren Togo: Education and Research Center for Mathematical and Data Science, Hokkaido University, Sapporo 0600812, Hokkaido, Japan
- Takahiro Ogawa: Faculty of Information Science and Technology, Hokkaido University, Sapporo 0600814, Hokkaido, Japan
- Miki Haseyama: Faculty of Information Science and Technology, Hokkaido University, Sapporo 0600814, Hokkaido, Japan
15. Togo R, Yamamichi N, Mabe K, Takahashi Y, Takeuchi C, Kato M, Sakamoto N, Ishihara K, Ogawa T, Haseyama M. Detection of gastritis by a deep convolutional neural network from double-contrast upper gastrointestinal barium X-ray radiography. J Gastroenterol 2019; 54:321-329. [PMID: 30284046] [DOI: 10.1007/s00535-018-1514-7]
Abstract
BACKGROUND Deep learning has become a major approach to image recognition tasks in medicine. We developed an automated gastritis detection system for double-contrast upper gastrointestinal barium X-ray radiography. METHODS A total of 6520 gastric X-ray images obtained from 815 subjects were analyzed. We designed a deep convolutional neural network (DCNN)-based gastritis detection scheme and evaluated its effectiveness, comparing its detection performance with that of ABC (D) stratification. RESULTS The sensitivity, specificity, and harmonic mean of sensitivity and specificity of our method were 0.962, 0.983, and 0.972, respectively; those of ABC (D) stratification were 0.925, 0.998, and 0.960. Of the 18 false-negative cases in ABC (D) stratification, 14 were correctly classified into the positive group by our method. CONCLUSIONS Deep learning techniques may be effective for evaluating gastritis/non-gastritis. Collaborative use of DCNN-based gastritis detection systems and ABC (D) stratification will provide more reliable gastric cancer risk information.
Affiliations
- Ren Togo: Graduate School of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo, Hokkaido, 060-0814, Japan
- Nobutake Yamamichi: Department of Gastroenterology, Graduate School of Medicine, The University of Tokyo, 7-3-1, Hongo, Bunkyo-ku, Tokyo, 113-8655, Japan
- Katsuhiro Mabe: Department of Gastroenterology, National Hospital Organization Hakodate Hospital, 18-16, Kawahara-cho, Hakodate City, Hokkaido, 041-8512, Japan
- Yu Takahashi: Department of Gastroenterology, Graduate School of Medicine, The University of Tokyo, 7-3-1, Hongo, Bunkyo-ku, Tokyo, 113-8655, Japan
- Chihiro Takeuchi: Department of Gastroenterology, Graduate School of Medicine, The University of Tokyo, 7-3-1, Hongo, Bunkyo-ku, Tokyo, 113-8655, Japan
- Mototsugu Kato: Department of Gastroenterology, National Hospital Organization Hakodate Hospital, 18-16, Kawahara-cho, Hakodate City, Hokkaido, 041-8512, Japan
- Naoya Sakamoto: Department of Gastroenterology, Hokkaido University Graduate School of Medicine, Sapporo, 060-8648, Japan
- Kenta Ishihara: Graduate School of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo, Hokkaido, 060-0814, Japan
- Takahiro Ogawa: Graduate School of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo, Hokkaido, 060-0814, Japan
- Miki Haseyama: Graduate School of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo, Hokkaido, 060-0814, Japan
16. Togo R, Hirata K, Manabe O, Ohira H, Tsujino I, Magota K, Ogawa T, Haseyama M, Shiga T. Cardiac sarcoidosis classification with deep convolutional neural network-based features using polar maps. Comput Biol Med 2018; 104:81-86. [PMID: 30447397] [DOI: 10.1016/j.compbiomed.2018.11.008]
Abstract
AIMS The aim of this study was to determine whether deep convolutional neural network (DCNN)-based features can represent the difference between cardiac sarcoidosis (CS) and non-CS using polar maps. METHODS A total of 85 patients (33 CS and 52 non-CS) were analyzed. One radiologist reviewed the PET/CT images and defined the left ventricle region for the construction of polar maps. We extracted high-level features from the polar maps with the Inception-v3 network and evaluated their effectiveness by applying them to a CS classification task, additionally introducing the ReliefF feature selection algorithm into our method. Classification methods based on the standardized uptake value (SUV) and on the coefficient of variance (CoV) were used for comparison. RESULTS The sensitivity, specificity, and harmonic mean of sensitivity and specificity of our method with the ReliefF algorithm were 0.839, 0.870, and 0.854, respectively; those of the SUVmax-based classification method were 0.468, 0.710, and 0.564, and those of the CoV-based classification method were 0.655, 0.750, and 0.699. CONCLUSION DCNN-based high-level features may be more effective for CS classification than the low-level features used in conventional quantitative analysis methods.
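A minimal sketch of extracting high-level features from polar maps with a pretrained Inception-v3, as the abstract describes; the torchvision backbone and dummy inputs are our assumptions (the paper's preprocessing, and the subsequent ReliefF selection and classifier, are omitted).
```python
import torch
import torchvision

# ImageNet-pretrained Inception-v3 used as a fixed feature extractor.
model = torchvision.models.inception_v3(
    weights=torchvision.models.Inception_V3_Weights.IMAGENET1K_V1)
model.fc = torch.nn.Identity()  # expose the penultimate 2048-d features
model.eval()

polar_maps = torch.randn(4, 3, 299, 299)  # dummy batch; Inception expects 299x299
with torch.no_grad():
    features = model(polar_maps)          # shape: (4, 2048)
# ReliefF would then rank these 2048 features before classification.
```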
Affiliations
- Ren Togo: Graduate School of Information Science and Technology, Hokkaido University, Hokkaido, 060-0814, Japan
- Kenji Hirata: Department of Nuclear Medicine, Hokkaido University Graduate School of Medicine, Hokkaido, 060-8638, Japan
- Osamu Manabe: Department of Nuclear Medicine, Hokkaido University Graduate School of Medicine, Hokkaido, 060-8638, Japan
- Hiroshi Ohira: First Department of Medicine, Hokkaido University Hospital, Hokkaido, 060-8638, Japan
- Ichizo Tsujino: First Department of Medicine, Hokkaido University Hospital, Hokkaido, 060-8638, Japan
- Keiichi Magota: Division of Medical Imaging and Technology, Hokkaido University Hospital, Hokkaido, 060-8638, Japan
- Takahiro Ogawa: Graduate School of Information Science and Technology, Hokkaido University, Hokkaido, 060-0814, Japan
- Miki Haseyama: Graduate School of Information Science and Technology, Hokkaido University, Hokkaido, 060-0814, Japan
- Tohru Shiga: Department of Nuclear Medicine, Hokkaido University Graduate School of Medicine, Hokkaido, 060-8638, Japan
17. Togo R, Ishihara K, Mabe K, Oizumi H, Ogawa T, Kato M, Sakamoto N, Nakajima S, Asaka M, Haseyama M. Preliminary study of automatic gastric cancer risk classification from photofluorography. World J Gastrointest Oncol 2018; 10:62-70. [PMID: 29467917] [PMCID: PMC5807881] [DOI: 10.4251/wjgo.v10.i2.62]
Abstract
AIM To perform automatic gastric cancer risk classification from photofluorography toward effective mass screening, as a preliminary study.
METHODS We used data for 2100 subjects, including X-ray images, pepsinogen I and II levels, the PGI/PGII ratio, Helicobacter pylori (H. pylori) antibody, H. pylori eradication history, and interview sheets. Our system performs two-stage classification: in the first stage, H. pylori infection status is classified and H. pylori-infected subjects are automatically detected; in the second stage, atrophic level classification is performed to validate the effectiveness of the system.
RESULTS The sensitivity, specificity, and Youden index (YI) of H. pylori infection status classification were 0.884, 0.895, and 0.779, respectively, in the first stage. In the second stage, the sensitivity, specificity, and YI of atrophic level classification for H. pylori-infected subjects were 0.777, 0.824, and 0.601, respectively.
CONCLUSION Although further improvements of the system are needed, the experimental results indicate the effectiveness of machine learning techniques for estimating gastric cancer risk.
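For reference, the Youden index reported above is simply sensitivity + specificity - 1; a minimal check against the first-stage figures:
```python
def youden_index(sensitivity, specificity):
    """Youden index: YI = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1

# First-stage figures reported above: 0.884 + 0.895 - 1 = 0.779.
assert abs(youden_index(0.884, 0.895) - 0.779) < 1e-9
```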
Affiliations
- Ren Togo: Graduate School of Information Science and Technology, Hokkaido University, Hokkaido 060-0814, Japan
- Kenta Ishihara: Graduate School of Information Science and Technology, Hokkaido University, Hokkaido 060-0814, Japan
- Katsuhiro Mabe: Department of Gastroenterology, National Hospital Organization Hakodate Hospital, Hokkaido 041-8512, Japan
- Harufumi Oizumi: Medical Examination Center of the Yamagata City Medical Association, Yamagata 990-2473, Japan
- Takahiro Ogawa: Graduate School of Information Science and Technology, Hokkaido University, Hokkaido 060-0814, Japan
- Mototsugu Kato: Department of Gastroenterology, National Hospital Organization Hakodate Hospital, Hokkaido 041-8512, Japan
- Naoya Sakamoto: Department of Gastroenterology, Hokkaido University Graduate School of Medicine, Hokkaido 060-8648, Japan
- Shigemi Nakajima: Department of General Medicine, Japan Community Healthcare Organization Shiga Hospital, Shiga 520-0846, Japan
- Masahiro Asaka: Health Sciences University of Hokkaido, Hokkaido 061-0293, Japan
- Miki Haseyama: Graduate School of Information Science and Technology, Hokkaido University, Hokkaido 060-0814, Japan
18. Togo R, Ishihara K, Ogawa T, Haseyama M. Estimation of salient regions related to chronic gastritis using gastric X-ray images. Comput Biol Med 2016; 77:9-15. [DOI: 10.1016/j.compbiomed.2016.07.014]
19. Umeki H, Takarabe M, Nishimura R, Togo R. [Unforgettable people and incidents--recording of details not expressed by figures]. Hokenfu Zasshi 1984; 40:903-907. [PMID: 6569187]