1. Shu C, Liu Y, Zheng K, Tang X, Li M, Shen Y, Zhou Y, Du W, Ma N, Zhao J. Diagnosis and Treatment of Primary Tracheobronchial Tumors. Cancer Med 2025; 14:e70893. PMID: 40289301; PMCID: PMC12034573; DOI: 10.1002/cam4.70893.
Abstract
BACKGROUND Primary tracheobronchial tumors (PTBTs) are rare but life-threatening, accounting for approximately 0.2% of all respiratory neoplasms. Owing to their nonspecific clinical symptoms, PTBTs are often misdiagnosed as bronchial asthma or bronchitis in the early stages. In addition, standardized treatments for PTBTs are currently lacking. AIMS This study aimed to provide a comprehensive review of the diagnostic challenges and treatment modalities of PTBTs. METHODS Drawing on the latest literature and clinical guidelines, we carried out a comprehensive and systematic analysis of PTBTs, focusing on diagnostic modalities and evidence-based treatment options. RESULTS AND CONCLUSIONS Primary diagnostic methods for PTBTs include pulmonary function tests, chest radiography, computed tomography, and fiberoptic bronchoscopy. Computed tomography and fiberoptic bronchoscopy may be the most valuable diagnostic tools for patients with confirmed or strongly suspected PTBTs. There are currently no consensus guidelines for PTBTs; surgery is the most effective treatment for patients with surgical indications, while radiotherapy, chemotherapy, and interventional therapy may be useful complementary treatments for inoperable patients. Immunotherapy may become a significant management strategy for PTBTs in the future. Further research should concentrate on both early identification and enhanced therapeutic management of these tumors, investigating the optimal design of systemic therapy to improve survival and reduce morbidity and mortality.
Affiliation(s)
- Chen Shu
- Department of Thoracic Surgery, Tangdu Hospital, The Fourth Military Medical University, Xi'an, Shaanxi, China
- Department of Cardiothoracic Surgery, The 902nd Hospital of the Chinese People's Liberation Army Joint Logistic Support Force, Bengbu, Anhui, China
- Yu-jian Liu
- Department of Thoracic Surgery, Tangdu Hospital, The Fourth Military Medical University, Xi'an, Shaanxi, China
- Department of Cardiothoracic Surgery, Central Theater Command General Hospital of Chinese People's Liberation Army, Wuhan, Hubei, China
- Kai-fu Zheng
- Department of Thoracic Surgery, Tangdu Hospital, The Fourth Military Medical University, Xi'an, Shaanxi, China
- Department of General Surgery, The 991st Hospital of the Chinese People's Liberation Army Joint Logistic Support Force, Xiangyang, Hubei, China
- Xi-yang Tang
- Department of Thoracic Surgery, Tangdu Hospital, The Fourth Military Medical University, Xi'an, Shaanxi, China
- Meng-chao Li
- Department of Thoracic Surgery, Tangdu Hospital, The Fourth Military Medical University, Xi'an, Shaanxi, China
- Yang Shen
- Department of Thoracic Surgery, Tangdu Hospital, The Fourth Military Medical University, Xi'an, Shaanxi, China
- Yu-long Zhou
- Department of Thoracic Surgery, Tangdu Hospital, The Fourth Military Medical University, Xi'an, Shaanxi, China
- Wei-guang Du
- Department of Thoracic Surgery, Tangdu Hospital, The Fourth Military Medical University, Xi'an, Shaanxi, China
- Nan Ma
- Department of Ophthalmology, Tangdu Hospital, The Fourth Military Medical University, Shaanxi, China
- Jin-bo Zhao
- Department of Thoracic Surgery, Tangdu Hospital, The Fourth Military Medical University, Xi'an, Shaanxi, China
2. Chu H, Qi X, Wang H, Liang Y. Multi-label pathology editing of chest X-rays with a Controlled Diffusion Model. Med Image Anal 2025; 103:103584. PMID: 40288335; DOI: 10.1016/j.media.2025.103584.
Abstract
Large-scale generative models have garnered significant attention in medical imaging, particularly for image editing with diffusion models. However, current research has predominantly concentrated on pathology editing involving single or a limited number of labels, making precise modifications challenging. Inaccurate alterations may lead to substantial discrepancies between the generated and original images, limiting the clinical applicability of these models. This paper presents a diffusion model with disentangling capabilities applied to chest X-ray image editing, incorporating a mask-based mechanism for bone and organ information. We perform multi-label pathology editing of chest X-ray images without compromising the integrity of the original thoracic structure. The proposed method comprises a chest X-ray image classifier and an intricate organ mask: the classifier supplies the feature labels that must be disentangled for the stabilized diffusion model, while the organ mask enables directed, controllable edits to chest X-rays. We assessed the proposed algorithm, named Chest X-rays_Mpe, using MS-SSIM and CLIP scores alongside qualitative evaluations conducted by radiology experts. The results indicate that our approach surpasses existing algorithms on both quantitative and qualitative metrics.
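For context, the MS-SSIM part of such an evaluation is straightforward to reproduce. Below is a minimal sketch using torchmetrics, with random tensors standing in for the original and edited chest X-rays; none of this is the authors' released code.

```python
# Hedged sketch: MS-SSIM between original and edited CXRs quantifies how well
# the thoracic structure is preserved after pathology editing (values near 1
# mean the edit left the anatomy intact). The tensors are placeholders.
import torch
from torchmetrics.image import MultiScaleStructuralSimilarityIndexMeasure

ms_ssim = MultiScaleStructuralSimilarityIndexMeasure(data_range=1.0)

original = torch.rand(4, 1, 256, 256)  # batch of CXRs in [0, 1]
edited = (original + 0.01 * torch.randn_like(original)).clamp(0, 1)  # stand-in model output

print(f"MS-SSIM: {ms_ssim(edited, original).item():.4f}")
```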
Affiliation(s)
- Huan Chu
- School of Network Security and Information Technology, Yili Normal University, Yining, 835000, China; Key Laboratory of Intelligent Computing Research and Application, Yining, 835399, China
- Xiaolong Qi
- School of Network Security and Information Technology, Yili Normal University, Yining, 835000, China; State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, 210023, China; Key Laboratory of Intelligent Computing Research and Application, Yining, 835399, China.
- Huiling Wang
- School of Network Security and Information Technology, Yili Normal University, Yining, 835000, China; State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, 210023, China; Key Laboratory of Intelligent Computing Research and Application, Yining, 835399, China
- Yi Liang
- School of Network Security and Information Technology, Yili Normal University, Yining, 835000, China; State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, 210023, China
3. Rao VM, Hla M, Moor M, Adithan S, Kwak S, Topol EJ, Rajpurkar P. Multimodal generative AI for medical image interpretation. Nature 2025; 639:888-896. PMID: 40140592; DOI: 10.1038/s41586-025-08675-y.
Abstract
Accurately interpreting medical images and generating insightful narrative reports is indispensable for patient care but places heavy burdens on clinical experts. Advances in artificial intelligence (AI), especially in an area that we refer to as multimodal generative medical image interpretation (GenMI), create opportunities to automate parts of this complex process. In this Perspective, we synthesize progress and challenges in developing AI systems for generation of medical reports from images. We focus extensively on radiology as a domain with enormous reporting needs and research efforts. In addition to analysing the strengths and applications of new models for medical report generation, we advocate for a novel paradigm to deploy GenMI in a manner that empowers clinicians and their patients. Initial research suggests that GenMI could one day match human expert performance in generating reports across disciplines, such as radiology, pathology and dermatology. However, formidable obstacles remain in validating model accuracy, ensuring transparency and eliciting nuanced impressions. If carefully implemented, GenMI could meaningfully assist clinicians in improving quality of care, enhancing medical education, reducing workloads, expanding specialty access and providing real-time expertise. Overall, we highlight opportunities alongside key challenges for developing multimodal generative AI that complements human experts for reliable medical report writing.
Affiliation(s)
- Vishwanatha M Rao
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Michael Hla
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Department of Computer Science, Harvard College, Cambridge, MA, USA
- Michael Moor
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Department of Biosystems Science and Engineering, ETH Zurich, Zurich, Switzerland
- Subathra Adithan
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Department of Radiodiagnosis, Jawaharlal Institute of Postgraduate Medical Education and Research, Puducherry, India
- Stephen Kwak
- Department of Radiology, Johns Hopkins University, Baltimore, MD, USA
- Pranav Rajpurkar
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
4. Sathish R, Sheet D. A wrapper method for finding optimal subset of multimodal Magnetic Resonance Imaging sequences for ischemic stroke lesion segmentation. Comput Biol Med 2025; 185:109590. PMID: 39724829; DOI: 10.1016/j.compbiomed.2024.109590.
Abstract
Multimodal data, while information-rich, contains complementary as well as redundant information. Depending on the target problem, some modalities are more informative and thus more relevant for decision-making. Identifying the optimal subset of modalities best suited to a particular task significantly reduces acquisition complexity without compromising performance. In this work, we propose a wrapper method for examining the importance of Magnetic Resonance Imaging (MRI) sequences for ischemic stroke lesion segmentation using a deep neural network trained for segmentation. An importance score for each modality is computed through combinatorial dropout of input modalities at inference, coupled with a systematic evaluation of the impact on the model's performance. The proposed method is evaluated on two publicly available datasets: (i) ISLES15, comprising seven MRI sequences for 30 cases, and (ii) ISLES22, comprising three MRI sequences for 250 cases. We identified DWI, Tmax, and T1c as the optimal set of MRI sequences for core-penumbra delineation and Tmax as the optimal sequence for lesion segmentation in the ISLES15 dataset. In the ISLES22 dataset, DWI was identified as the optimal sequence for lesion segmentation. In addition to the exhaustive experimental validation, visually interpretable evidence for the accuracy of the identified optimal subsets is provided in the form of saliency maps.
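The combinatorial dropout at the heart of this wrapper can be sketched compactly. The snippet below assumes a trained segmentation `model` that takes a (1, M, H, W) tensor with one channel per MRI sequence; zeroing channels stands in for dropping modalities, and the Dice scorer and 0.5 threshold are generic assumptions rather than the authors' implementation.

```python
# Hedged sketch of the wrapper idea: drop subsets of input MRI sequences at
# inference and score the impact on segmentation quality.
from itertools import combinations
import torch

def dice(pred, target, eps=1e-6):
    inter = (pred * target).sum()
    return (2 * inter + eps) / (pred.sum() + target.sum() + eps)

@torch.no_grad()
def modality_importance(model, volume, target, modalities):
    """volume: (1, M, H, W) tensor with one channel per MRI sequence."""
    scores = {}
    m = len(modalities)
    for k in range(1, m + 1):
        for kept in combinations(range(m), k):
            x = torch.zeros_like(volume)
            x[:, list(kept)] = volume[:, list(kept)]  # zero out dropped sequences
            pred = (model(x).sigmoid() > 0.5).float()  # assumes logit output
            scores[tuple(modalities[i] for i in kept)] = dice(pred, target).item()
    return scores  # subsets that score high with few sequences are most informative
```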
Affiliation(s)
- Rachana Sathish
- Department of Electrical Engineering, Indian Institute of Technology Kharagpur, India.
- Debdoot Sheet
- Department of Electrical Engineering, Indian Institute of Technology Kharagpur, India
5. Parise O, Kronenberger R, Parise G, de Asmundis C, Gelsomino S, La Meir M. CTGAN-driven synthetic data generation: A multidisciplinary, expert-guided approach (TIMA). Comput Methods Programs Biomed 2025; 259:108523. PMID: 39608216; DOI: 10.1016/j.cmpb.2024.108523.
Abstract
OBJECTIVE We generated synthetic data from a population of 238 adult SARS-CoV-2-positive patients admitted to the University Hospital of Brussels, Belgium, in 2020, using a Conditional Tabular Generative Adversarial Network (CTGAN)-based technique, with the aim of testing the performance, representativeness, realism, novelty, and diversity of synthetic data generated from a small patient sample. Our multidisciplinary approach (TIMA) incorporates active participation from a medical team throughout the process. METHODS The TIMA committee scrutinized the data for inconsistencies, implementing stringent rules for variables the system had not learned. A sensitivity analysis settled on 100,000 epochs, leading to the generation of 10,000 synthetic records. The model's performance was tested using a general-purpose dataset, comparing real and synthetic data. RESULTS The outcomes indicate the robustness of our model, with an average contingency score of 0.94 across variable pairs in synthetic and real data. Continuous variables exhibited a median correlation similarity score of 0.97. Novelty received a top score of 1. Principal Component Analysis (PCA) on the synthetic values demonstrated diversity, as no patient pair displayed a zero or near-zero distance. Remarkably, in the TIMA committee's evaluation, the synthetic data were recognized as authentic in nearly 100% of cases. CONCLUSIONS Our trained model exhibited commendable performance, yielding high representativeness in the synthetic dataset compared with the original. The synthetic dataset proved realistic, with elevated levels of novelty and diversity.
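For orientation, a comparable generation step can be sketched with the open-source `ctgan` package (pip install ctgan). The file name, column names, and epoch count below are illustrative assumptions; the TIMA layer of expert rules and committee review described above is not reproduced.

```python
# Hedged sketch of CTGAN-based tabular synthesis; the cohort file and the
# discrete column list are hypothetical. The paper's sensitivity analysis
# settled on 100,000 training epochs; a small value is used here for brevity.
import pandas as pd
from ctgan import CTGAN

real = pd.read_csv("covid_cohort.csv")          # hypothetical 238-patient table
discrete = ["sex", "icu_admission", "outcome"]  # assumed categorical columns

model = CTGAN(epochs=300)
model.fit(real, discrete_columns=discrete)

synthetic = model.sample(10_000)  # 10,000 synthetic patients, as in the study
synthetic.to_csv("synthetic_cohort.csv", index=False)
```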
Affiliation(s)
- Orlando Parise
- Department of Cardiac Surgery, Universitair Ziekenhuis Brussel, Laarbeeklaan 101, Brussels 1090, Belgium; Cardiovascular Research Institute Maastricht (CARIM), Maastricht University, Universiteitssingel 50, Maastricht 6229 ER, the Netherlands.
- Rani Kronenberger
- Department of Cardiac Surgery, Universitair Ziekenhuis Brussel, Laarbeeklaan 101, Brussels 1090, Belgium
- Gianmarco Parise
- Cardiovascular Research Institute Maastricht (CARIM), Maastricht University, Universiteitssingel 50, Maastricht 6229 ER, the Netherlands
- Carlo de Asmundis
- Heart Rhythm Management Centre, Postgraduate in Cardiac Electrophysiology and Pacing, Universitair Ziekenhuis Brussel, Laarbeeklaan 101, Brussels 1090, Belgium
- Sandro Gelsomino
- Department of Cardiac Surgery, Universitair Ziekenhuis Brussel, Laarbeeklaan 101, Brussels 1090, Belgium; Cardiovascular Research Institute Maastricht (CARIM), Maastricht University, Universiteitssingel 50, Maastricht 6229 ER, the Netherlands
- Mark La Meir
- Department of Cardiac Surgery, Universitair Ziekenhuis Brussel, Laarbeeklaan 101, Brussels 1090, Belgium
6. Siddiqui AA, Tirunagari S, Zia T, Windridge D. A latent diffusion approach to visual attribution in medical imaging. Sci Rep 2025; 15:962. PMID: 39762275; PMCID: PMC11704132; DOI: 10.1038/s41598-024-81646-x.
Abstract
Visual attribution in medical imaging seeks to make evident the diagnostically relevant components of a medical image, in contrast to the more common detection of diseased tissue deployed in standard machine vision pipelines (which are less straightforwardly interpretable/explainable to clinicians). Here we present a novel generative visual attribution technique that leverages latent diffusion models in combination with domain-specific large language models to generate normal counterparts of abnormal images. The discrepancy between the two then gives rise to a map indicating the diagnostically relevant image components. To achieve this, we deploy image priors in conjunction with appropriate conditioning mechanisms to control the image generative process, including natural language text prompts acquired from medical science and applied radiology. We perform experiments and quantitatively evaluate our results on the COVID-19 Radiography Database, containing labelled chest X-rays with differing pathologies, via the Fréchet Inception Distance (FID), Structural Similarity (SSIM), and Multi-Scale Structural Similarity (MS-SSIM) metrics computed between real and generated images. The resulting system also exhibits a range of latent capabilities, including zero-shot localized disease induction, which are evaluated with real examples from the CheXpert dataset.
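The attribution step itself reduces to a discrepancy map between an abnormal image and its generated normal counterpart. A minimal sketch follows; `normal_counterfactual` stands in for the latent-diffusion output, which is not reproduced here.

```python
# Hedged sketch: the visual attribution map is the normalized absolute
# difference between the abnormal image and its generated normal counterpart.
import numpy as np

def attribution_map(abnormal: np.ndarray, normal_counterfactual: np.ndarray) -> np.ndarray:
    """Both arrays in [0, 1] with identical shapes; returns a map in [0, 1]."""
    diff = np.abs(abnormal - normal_counterfactual)
    return diff / (diff.max() + 1e-8)  # bright pixels = diagnostically relevant regions
```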
7. Hanaoka S, Nomura Y, Hayashi N, Sato I, Miki S, Yoshikawa T, Shibata H, Nakao T, Takenaga T, Koyama H, Cho S, Kanemaru N, Fujimoto K, Sakamoto N, Nishiyama T, Matsuzaki H, Yamamichi N, Abe O. Deep generative abnormal lesion emphasization validated by nine radiologists and 1000 chest X-rays with lung nodules. PLoS One 2024; 19:e0315646. PMID: 39666722; PMCID: PMC11637395; DOI: 10.1371/journal.pone.0315646.
Abstract
A general-purpose method for emphasizing abnormal lesions in chest radiographs, named EGGPALE (Extrapolative, Generative and General-Purpose Abnormal Lesion Emphasizer), is presented. The proposed EGGPALE method is composed of a flow-based generative model and L-infinity-distance-based extrapolation in a latent space. The flow-based model is trained using only normal chest radiographs, and an invertible mapping function from the image space to the latent space is determined. In the latent space, a given unseen image is extrapolated so that the image point moves away from the normal chest X-ray hyperplane. Finally, the moved point is mapped back to the image space and the corresponding emphasized image is created. The proposed method was evaluated in an image interpretation experiment with nine radiologists and 1,000 chest radiographs, in which both positive cases with suspected lung cancer and negative cases were validated by computed tomography examinations. The sensitivity on EGGPALE-processed images showed a +0.0559 average improvement over the original images, with a -0.0192 deterioration in average specificity. The area under the receiver operating characteristic curve of the ensemble of nine radiologists showed a statistically significant improvement. These results validate the feasibility of EGGPALE for enhancing abnormal lesions. Our code is available at https://github.com/utrad-ical/Eggpale.
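The extrapolation step can be sketched generically. The snippet below assumes an invertible `flow` object exposing `forward`/`inverse` methods and approximates the normal hyperplane with a mean latent vector; this Euclidean stand-in simplifies the paper's L-infinity-distance-based scheme, for which the released code above is authoritative.

```python
# Hedged sketch of latent-space lesion emphasis with a normalizing flow:
# push the latent code away from the normal manifold, then invert.
import torch

@torch.no_grad()
def emphasize(flow, x, z_normal_mean, strength=1.5):
    z = flow.forward(x)                                       # image -> latent (assumed interface)
    z_moved = z_normal_mean + strength * (z - z_normal_mean)  # move away from the normal region
    return flow.inverse(z_moved)                              # latent -> emphasized image
```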
Affiliation(s)
- Shouhei Hanaoka
- Department of Radiology and Diagnostic Radiology and Preventive Medicine, The University of Tokyo Hospital, Bunkyo-ku, Tokyo, Japan
- Yukihiro Nomura
- Department of Computational Diagnostic Radiology and Preventive Medicine, The University of Tokyo Hospital, Bunkyo-ku, Tokyo, Japan
- Center for Frontier Medical Engineering, Chiba University, Chiba, Japan
- Naoto Hayashi
- Department of Computational Diagnostic Radiology and Preventive Medicine, The University of Tokyo Hospital, Bunkyo-ku, Tokyo, Japan
- Issei Sato
- Department of Radiology and Diagnostic Radiology and Preventive Medicine, The University of Tokyo Hospital, Bunkyo-ku, Tokyo, Japan
- Department of Computer Science, Graduate School of Information Science and Technology, The University of Tokyo, Bunkyo-ku, Japan
- Soichiro Miki
- Department of Radiology and Diagnostic Radiology and Preventive Medicine, The University of Tokyo Hospital, Bunkyo-ku, Tokyo, Japan
- Takeharu Yoshikawa
- Department of Computational Diagnostic Radiology and Preventive Medicine, The University of Tokyo Hospital, Bunkyo-ku, Tokyo, Japan
- Hisaichi Shibata
- Department of Computational Diagnostic Radiology and Preventive Medicine, The University of Tokyo Hospital, Bunkyo-ku, Tokyo, Japan
- Takahiro Nakao
- Department of Computational Diagnostic Radiology and Preventive Medicine, The University of Tokyo Hospital, Bunkyo-ku, Tokyo, Japan
- Tomomi Takenaga
- Department of Computational Diagnostic Radiology and Preventive Medicine, The University of Tokyo Hospital, Bunkyo-ku, Tokyo, Japan
- Hiroaki Koyama
- Department of Radiology and Diagnostic Radiology and Preventive Medicine, The University of Tokyo Hospital, Bunkyo-ku, Tokyo, Japan
- Noriko Kanemaru
- Kanto Rosai Hospital, Kawasaki City, Kanagawa Prefecture, Japan
- Kotaro Fujimoto
- Department of Radiology and Diagnostic Radiology and Preventive Medicine, The University of Tokyo Hospital, Bunkyo-ku, Tokyo, Japan
- Teikyo University Hospital, Itabashi-ku, Tokyo, Japan
- Naoya Sakamoto
- Department of Radiology and Diagnostic Radiology and Preventive Medicine, The University of Tokyo Hospital, Bunkyo-ku, Tokyo, Japan
- Tomoya Nishiyama
- Department of Radiology and Diagnostic Radiology and Preventive Medicine, The University of Tokyo Hospital, Bunkyo-ku, Tokyo, Japan
- Hirotaka Matsuzaki
- Center for Epidemiology and Preventive Medicine, Graduate School of Medicine, Tokyo, Bunkyo-ku, Tokyo, Japan
- Department of Respiratory Medicine, Graduate School of Medicine, Tokyo, Bunkyo-ku, Tokyo, Japan
- Nobutake Yamamichi
- Center for Epidemiology and Preventive Medicine, Graduate School of Medicine, Tokyo, Bunkyo-ku, Tokyo, Japan
- Osamu Abe
- Department of Radiology and Diagnostic Radiology and Preventive Medicine, The University of Tokyo Hospital, Bunkyo-ku, Tokyo, Japan
8. Pozzi M, Noei S, Robbi E, Cima L, Moroni M, Munari E, Torresani E, Jurman G. Generating and evaluating synthetic data in digital pathology through diffusion models. Sci Rep 2024; 14:28435. PMID: 39557989; PMCID: PMC11574254; DOI: 10.1038/s41598-024-79602-w.
Abstract
Synthetic data is becoming a valuable tool for computational pathologists, aiding in tasks like data augmentation and addressing data scarcity and privacy. However, its use necessitates careful planning and evaluation to prevent the creation of clinically irrelevant artifacts. This manuscript introduces a comprehensive pipeline for generating and evaluating synthetic pathology data using a diffusion model. The pipeline features a multifaceted evaluation strategy with an integrated explainability procedure, addressing two key aspects of synthetic data use in the medical domain. The evaluation of the generated data employs an ensemble-like approach. The first step assesses the similarity between real and synthetic data using established metrics. The second step evaluates the usability of the generated images in deep learning models accompanied by explainable AI methods. The final step verifies their histopathological realism through questionnaires answered by professional pathologists. We show that each of these evaluation steps is necessary, as they provide complementary information on the generated data's quality. The pipeline is demonstrated on the public GTEx dataset of 650 Whole Slide Images (WSIs) covering five different tissues. An equal number of tiles from each tissue are generated, and their reliability is assessed using the proposed evaluation pipeline, yielding promising results. In summary, the proposed workflow offers a comprehensive solution for generative AI in digital pathology, potentially aiding the community in its transition towards digitalization and data-driven modeling.
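The first (similarity) step of such a pipeline is commonly implemented with the Fréchet Inception Distance; a minimal torchmetrics sketch follows, with random tensors standing in for real and synthetic tiles. The classifier-usability and pathologist-questionnaire steps are not shown.

```python
# Hedged sketch: FID between real and synthetic pathology tiles (lower is
# better). Random tensors are placeholders for actual WSI tiles.
import torch
from torchmetrics.image.fid import FrechetInceptionDistance

fid = FrechetInceptionDistance(feature=2048, normalize=True)  # float inputs in [0, 1]

real_tiles = torch.rand(64, 3, 299, 299)
synthetic_tiles = torch.rand(64, 3, 299, 299)

fid.update(real_tiles, real=True)
fid.update(synthetic_tiles, real=False)
print(f"FID: {fid.compute().item():.2f}")
```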
Affiliation(s)
- Matteo Pozzi
- Data Science for Health Unit, Fondazione Bruno Kessler, Via Sommarive 18, Povo, Trento, 38123, Italy
- Department for Computational and Integrative Biology, Università degli Studi di Trento, Via Sommarive, 9, Povo, Trento, 38123, Italy
- Shahryar Noei
- Data Science for Health Unit, Fondazione Bruno Kessler, Via Sommarive 18, Povo, Trento, 38123, Italy
- Erich Robbi
- Data Science for Health Unit, Fondazione Bruno Kessler, Via Sommarive 18, Povo, Trento, 38123, Italy
- Department of Information Engineering and Computer Science, Università degli Studi di Trento, Via Sommarive, 9, Povo, Trento, 38123, Italy
- Luca Cima
- Department of Diagnostic and Public Health, Section of Pathology, University and Hospital Trust of Verona, Verona, Italy
- Monica Moroni
- Data Science for Health Unit, Fondazione Bruno Kessler, Via Sommarive 18, Povo, Trento, 38123, Italy
- Enrico Munari
- Department of Diagnostic and Public Health, Section of Pathology, University and Hospital Trust of Verona, Verona, Italy
- Evelin Torresani
- Pathology Unit, Department of Laboratory Medicine, Santa Chiara Hospital, APSS, Trento, Italy
- Giuseppe Jurman
- Data Science for Health Unit, Fondazione Bruno Kessler, Via Sommarive 18, Povo, Trento, 38123, Italy.
9. Mahawar J, Paul A. Generalizable diagnosis of chest radiographs through attention-guided decomposition of images utilizing self-consistency loss. Comput Biol Med 2024; 180:108922. PMID: 39089108; DOI: 10.1016/j.compbiomed.2024.108922.
Abstract
BACKGROUND Chest X-ray (CXR) is one of the most commonly performed imaging tests worldwide. Due to its wide usage, there is a growing need for automated and generalizable methods to accurately diagnose these images. Traditional methods for chest X-ray analysis often struggle to generalize across diverse datasets due to variations in imaging protocols, patient demographics, and the presence of overlapping anatomical structures. There is therefore significant demand for advanced diagnostic tools that can consistently identify abnormalities across different patient populations and imaging settings. We propose a method that provides a generalizable diagnosis of chest X-rays. METHOD Our method utilizes an attention-guided decomposer network (ADSC) to extract disease maps from chest X-ray images. The ADSC employs one encoder and multiple decoders, incorporating a novel self-consistency loss to ensure consistent functionality across its modules. The attention-guided encoder captures salient features of abnormalities, while three distinct decoders generate a normal synthesized image, a disease map, and a reconstructed input image, respectively. A discriminator differentiates real from synthesized normal chest X-rays, enhancing the quality of the generated images. The disease map, along with the original chest X-ray image, is fed to a DenseNet-121 classifier modified for multi-class classification of the input X-ray. RESULTS Experimental results on multiple publicly available datasets demonstrate the effectiveness of our approach. For multi-class classification, we achieve up to a 3% improvement in AUROC score for certain abnormalities compared with existing methods. For binary classification (normal versus abnormal), our method surpasses existing approaches across various datasets. To assess generalizability, we train our model on one dataset and test it on multiple others, using the standard deviation of AUROC scores across test datasets to measure performance variability. Our model exhibits superior generalization across datasets from diverse sources. CONCLUSIONS Our model shows promising results for the generalizable diagnosis of chest X-rays. The impact of the attention mechanism and the self-consistency loss is evident from the results. In the future, we plan to incorporate Explainable AI techniques to provide explanations for model decisions, and to design data augmentation techniques that reduce class imbalance.
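The self-consistency idea can be illustrated as one term in a composite loss. The sketch below is a speculative reading of the abstract, not the authors' code: the weights and the assumption that the synthesized normal image plus the disease map should re-compose the input are illustrative guesses.

```python
# Hedged sketch of a composite loss in the spirit of the ADSC description:
# reconstruction, adversarial realness of the synthesized normal image, and a
# guessed self-consistency term (normal image + disease map ~ input).
import torch
import torch.nn.functional as F

def adsc_style_loss(x, x_recon, normal_syn, disease_map, d_logits, w=(1.0, 0.1, 1.0)):
    recon = F.l1_loss(x_recon, x)
    adv = F.binary_cross_entropy_with_logits(d_logits, torch.ones_like(d_logits))
    consistency = F.l1_loss(normal_syn + disease_map, x)  # assumed re-composition rule
    return w[0] * recon + w[1] * adv + w[2] * consistency
```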
Affiliation(s)
- Jayant Mahawar
- Department of Computer Science and Engineering, Indian Institute of Technology Jodhpur, N.H. 62, Nagaur Road, Karwar, Jodhpur, 342030, Rajasthan, India.
- Angshuman Paul
- Department of Computer Science and Engineering, Indian Institute of Technology Jodhpur, N.H. 62, Nagaur Road, Karwar, Jodhpur, 342030, Rajasthan, India.
10. Park Y, Kang S, Kim MJ, Lee Y, Kim HS, Yi J. Visual defect obfuscation based self-supervised anomaly detection. Sci Rep 2024; 14:18872. PMID: 39143358; PMCID: PMC11325017; DOI: 10.1038/s41598-024-69698-5.
Abstract
Because anomalies are scarce in the early manufacturing stage, unsupervised anomaly detection (UAD) approaches that train only on normal samples are widely adopted. This approach rests on the assumption that the trained UAD model will accurately reconstruct normal patterns but struggle with unseen anomalies. To enhance UAD performance, reconstruction-by-inpainting methods have recently been investigated, especially regarding the masking strategy for suspected defective regions. However, issues remain: (1) time-consuming inference due to multiple masking, (2) output inconsistency caused by random masking, and (3) inaccurate reconstruction of normal patterns for large masked areas. Motivated by this, this study proposes a novel reconstruction-by-inpainting method, dubbed Excision And Recovery (EAR), featuring single deterministic masking based on the ImageNet-pretrained DINO-ViT and visual obfuscation for hint-providing. Experimental results on the MVTec AD dataset show that deterministic masking by pre-trained attention effectively cuts out suspected defective regions and resolves issues (1) and (2). Moreover, providing hints by mosaicing proves more effective than emptying those regions with binary masking, thereby overcoming issue (3). The proposed approach achieves high performance without any change to the model structure, and promising results are shown in laboratory tests with public industrial datasets. To support the adoption of EAR across industries as a practically deployable solution, future steps include evaluating its applicability in relevant manufacturing environments.
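The mosaicing hint is simple to sketch: instead of emptying a suspected region, it is pixelated so that coarse context survives for the inpainting network. Below, region selection by DINO-ViT attention is replaced with a given bounding box, and the cell size is an arbitrary choice.

```python
# Hedged sketch of hint-providing by mosaicing: the suspected region is
# down- and up-sampled so only coarse structure remains visible.
import torch
import torch.nn.functional as F

def mosaic_region(img, box, cell=16):
    """img: (C, H, W) tensor in [0, 1]; box: (y0, y1, x0, x1) suspected region."""
    y0, y1, x0, x1 = box
    patch = img[:, y0:y1, x0:x1].unsqueeze(0)                     # (1, C, h, w)
    small = F.interpolate(patch, scale_factor=1 / cell, mode="area")
    coarse = F.interpolate(small, size=(y1 - y0, x1 - x0), mode="nearest")
    out = img.clone()
    out[:, y0:y1, x0:x1] = coarse.squeeze(0)                      # pixelated hint
    return out
```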
Affiliation(s)
- YeongHyeon Park
- Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon, 16419, Republic of Korea
- SK Planet Co., Ltd., Seongnam, 13487, Republic of Korea
- Sungho Kang
- Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon, 16419, Republic of Korea
- Myung Jin Kim
- SK Planet Co., Ltd., Seongnam, 13487, Republic of Korea
- Yeonho Lee
- Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon, 16419, Republic of Korea
- Juneho Yi
- Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon, 16419, Republic of Korea.
11. Kobayashi K, Gu L, Hataya R, Mizuno T, Miyake M, Watanabe H, Takahashi M, Takamizawa Y, Yoshida Y, Nakamura S, Kouno N, Bolatkan A, Kurose Y, Harada T, Hamamoto R. Sketch-based semantic retrieval of medical images. Med Image Anal 2024; 92:103060. PMID: 38104401; DOI: 10.1016/j.media.2023.103060.
Abstract
The volume of medical images stored in hospitals is rapidly increasing; however, the utilization of these accumulated images remains limited. Existing content-based medical image retrieval (CBMIR) systems typically require example images, leading to practical limitations such as the lack of customizable, fine-grained image retrieval, the inability to search without example images, and difficulty in retrieving rare cases. In this paper, we introduce a sketch-based medical image retrieval (SBMIR) system that enables users to find images of interest without the need for example images. The key concept is feature decomposition of medical images, which allows the entire feature of a medical image to be decomposed into, and reconstructed from, normal and abnormal features. Building on this concept, our SBMIR system provides an easy-to-use two-step graphical user interface: users first select a template image to specify a normal feature and then draw a semantic sketch of the disease on the template image to represent an abnormal feature. The system integrates both types of input to construct a query vector and retrieves reference images. For evaluation, ten healthcare professionals participated in a user test on two datasets. Our SBMIR system enabled users to overcome previous challenges, including image retrieval based on fine-grained image characteristics, image retrieval without example images, and image retrieval for rare cases. By offering on-demand, customizable medical image retrieval, SBMIR expands the utility of medical image databases.
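The query-construction step can be sketched as combining two embeddings and ranking by cosine similarity. The two encoders below are placeholders for the paper's feature-decomposition model, and the additive combination is an illustrative assumption.

```python
# Hedged sketch of the SBMIR query step: fuse the normal feature of a template
# image with the abnormal feature drawn as a sketch, then rank the database.
import torch
import torch.nn.functional as F

@torch.no_grad()
def retrieve(normal_enc, abnormal_enc, template, sketch, database_vecs, k=5):
    """database_vecs: (N, D) precomputed embeddings of reference images."""
    query = normal_enc(template) + abnormal_enc(sketch)  # (1, D), assumed fusion rule
    sims = F.cosine_similarity(query, database_vecs)     # (N,) via broadcasting
    return sims.topk(k).indices                          # indices of top-k references
```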
Affiliation(s)
- Kazuma Kobayashi
- Division of Medical AI Research and Development, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan; Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan.
- Lin Gu
- Machine Intelligence for Medical Engineering Team, RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan; Research Center for Advanced Science and Technology, The University of Tokyo, 4-6-1 Komaba, Meguro-ku, Tokyo 153-8904, Japan.
- Ryuichiro Hataya
- Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan.
- Takaaki Mizuno
- Department of Experimental Therapeutics, National Cancer Center Hospital, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan.
- Mototaka Miyake
- Department of Diagnostic Radiology, National Cancer Center Hospital, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan.
- Hirokazu Watanabe
- Department of Diagnostic Radiology, National Cancer Center Hospital, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan.
- Masamichi Takahashi
- Department of Neurosurgery and Neuro-Oncology, National Cancer Center Hospital, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan.
- Yasuyuki Takamizawa
- Department of Colorectal Surgery, National Cancer Center Hospital, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan.
- Yukihiro Yoshida
- Department of Thoracic Surgery, National Cancer Center Hospital, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan.
- Satoshi Nakamura
- Radiation Safety and Quality Assurance Division, National Cancer Center Hospital, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan; Division of Research and Development for Boron Neutron Capture Therapy, National Cancer Center, Exploratory Oncology Research & Clinical Trial Center, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan; Medical Physics Laboratory, Division of Health Science, Graduate School of Medicine, Osaka University, Yamadaoka 1-7, Suita-shi, Osaka 565-0871, Japan.
- Nobuji Kouno
- Division of Medical AI Research and Development, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan; Department of Surgery, Kyoto University Graduate School of Medicine, 54 Shogoin Kawahara-cho, Sakyo-ku, Kyoto 606-8507, Japan.
- Amina Bolatkan
- Division of Medical AI Research and Development, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan; Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan.
- Yusuke Kurose
- Machine Intelligence for Medical Engineering Team, RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan; Research Center for Advanced Science and Technology, The University of Tokyo, 4-6-1 Komaba, Meguro-ku, Tokyo 153-8904, Japan.
- Tatsuya Harada
- Machine Intelligence for Medical Engineering Team, RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan; Research Center for Advanced Science and Technology, The University of Tokyo, 4-6-1 Komaba, Meguro-ku, Tokyo 153-8904, Japan.
- Ryuji Hamamoto
- Division of Medical AI Research and Development, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan; Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan.
12. Sunilkumar AP, Keshari Parida B, You W. Recent Advances in Dental Panoramic X-Ray Synthesis and Its Clinical Applications. IEEE Access 2024; 12:141032-141051. DOI: 10.1109/access.2024.3422650.
Affiliation(s)
- Anusree P. Sunilkumar
- Department of Information and Communication Engineering, Artificial Intelligence and Image Processing Laboratory (AIIP Laboratory), Sun Moon University, Asan-si, Republic of Korea
- Bikram Keshari Parida
- Department of Information and Communication Engineering, Artificial Intelligence and Image Processing Laboratory (AIIP Laboratory), Sun Moon University, Asan-si, Republic of Korea
- Wonsang You
- Department of Information and Communication Engineering, Artificial Intelligence and Image Processing Laboratory (AIIP Laboratory), Sun Moon University, Asan-si, Republic of Korea
13. Kim K, Lee JH, Je Oh S, Chung MJ. AI-based computer-aided diagnostic system of chest digital tomography synthesis: Demonstrating comparative advantage with X-ray-based AI systems. Comput Methods Programs Biomed 2023; 240:107643. PMID: 37348439; DOI: 10.1016/j.cmpb.2023.107643.
Abstract
BACKGROUND Compared with chest X-ray (CXR) imaging, which produces a single image projected from the front of the patient, chest digital tomosynthesis (CDTS) imaging can be more advantageous for lung lesion detection because it acquires multiple images projected from multiple angles. Various clinical comparative analyses and verification studies have demonstrated this, but there are no artificial intelligence (AI)-based comparative analyses. Existing AI-based computer-aided detection (CAD) systems for lung lesion diagnosis have been developed mainly on CXR images; a CAD system based on CDTS, which uses multi-angle images of the patient, has not been proposed, nor has its usefulness relative to CXR-based counterparts been verified. OBJECTIVE This study develops and tests a CDTS-based AI CAD system for detecting lung lesions to demonstrate performance improvements over CXR-based AI CAD. METHODS We used multiple (five) projection images as input for the CDTS-based AI model and a single projection image as input for the CXR-based AI model, then compared their performance. Multiple/single projection input images were obtained by virtual projection onto the three-dimensional (3D) stack of computed tomography (CT) slices of each patient's lungs, from which the bed area was removed. The five projections correspond to views from the front and from 30° and 60° to the left and right; the frontal projection served as input for the CXR-based AI model, while the CDTS-based AI model used all five. The proposed CDTS-based AI model consists of five sub-models, one per projection direction, whose predictions are combined by ensembling; each sub-model uses WideResNet-50. To train and evaluate the CXR- and CDTS-based AI models, 500 healthy, 206 tuberculosis, and 242 pneumonia cases were used with three-fold cross-validation. RESULTS The proposed CDTS-based AI CAD system yielded sensitivities of 0.782 and 0.785 and accuracies of 0.895 and 0.837 for (binary) detection of tuberculosis and pneumonia, respectively, against normal subjects, compared with sensitivities of 0.728 and 0.698 and accuracies of 0.874 and 0.826 for the CXR-based AI CAD, which uses only a single frontal projection. CDTS-based AI CAD thus improved the sensitivity for tuberculosis and pneumonia by 5.4% and 8.7%, respectively, without loss of accuracy. CONCLUSIONS This study demonstrates that CDTS-based AI CAD technology can outperform CXR-based CAD, suggesting that the clinical application of CDTS can be enhanced. Our code is available at https://github.com/kskim-phd/CDTS-CAD-P.
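The five-view ensemble is conceptually simple; the sketch below wires one WideResNet-50 per projection angle and averages softmax outputs. The angles and backbone come from the abstract, while the averaging and input handling are generic assumptions (the released repository above is authoritative).

```python
# Hedged sketch of a five-view ensemble for CDTS-style CAD: one WideResNet-50
# per projection direction, class probabilities averaged across views.
import torch
import torchvision.models as models

angles = ["front", "left30", "left60", "right30", "right60"]
nets = {a: models.wide_resnet50_2(num_classes=2).eval() for a in angles}

@torch.no_grad()
def predict(views):
    """views: dict mapping angle name -> (1, 3, H, W) projection image."""
    probs = [nets[a](views[a]).softmax(dim=1) for a in angles]
    return torch.stack(probs).mean(dim=0)  # (1, 2) ensemble-averaged prediction
```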
Affiliation(s)
- Kyungsu Kim
- Medical AI Research Center, Research Institute for Future Medicine, Samsung Medical Center, Seoul 06351, Republic of Korea; Department of Data Convergence and Future Medicine, Sungkyunkwan University School of Medicine, Seoul 06351, Republic of Korea.
- Ju Hwan Lee
- Department of Health Sciences and Technology, SAIHST, Sungkyunkwan University, Seoul 06351, Republic of Korea
- Seong Je Oh
- Department of Health Sciences and Technology, SAIHST, Sungkyunkwan University, Seoul 06351, Republic of Korea
- Myung Jin Chung
- Medical AI Research Center, Research Institute for Future Medicine, Samsung Medical Center, Seoul 06351, Republic of Korea; Department of Data Convergence and Future Medicine, Sungkyunkwan University School of Medicine, Seoul 06351, Republic of Korea; Department of Radiology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul 06351, Republic of Korea.
14. He X, Cai W, Li F, Fan Q, Zhang P, Cuaron JJ, Cerviño LI, Moran JM, Li X, Li T. Patient specific prior cross attention for kV decomposition in paraspinal motion tracking. Med Phys 2023; 50:5343-5353. PMID: 37538040; PMCID: PMC11167561; DOI: 10.1002/mp.16644.
Abstract
BACKGROUND X-ray image quality is critical for accurate intrafraction motion tracking in radiation therapy. PURPOSE This study aims to develop a deep-learning algorithm that improves kV image contrast by decomposing the image into bony and soft-tissue components. In particular, we designed a prior-attention mechanism in the neural network framework for optimal decomposition, showing that a patient-specific prior cross-attention (PCAT) mechanism can boost the performance of kV image decomposition. We demonstrate its use in paraspinal SBRT motion tracking with online kV imaging. METHODS Online 2D kV projections were acquired during paraspinal SBRT for patient motion monitoring. Patient-specific prior images were generated by randomly shifting and rotating spine-only DRRs created from the setup CBCT, simulating potential motions. The latent features of the prior images were incorporated into the PCAT using multi-head cross attention; the network learns to selectively amplify the transmission of projection-image features that correlate with features of the prior. The PCAT network structure consists of (1) a dual-branch generator that separates the spine and soft-tissue components of the kV projection image and (2) a dual-function discriminator (DFD) that provides the realness score of the predicted images. For supervision, we used a loss combining mean absolute error, discriminator loss, perceptual loss, total variation, and mean squared error on soft tissues. The proposed PCAT approach was benchmarked against previous work using a ResNet generative adversarial network (ResNetGAN) without prior information. RESULTS The trained PCAT effectively retained and preserved spine structure and texture while suppressing soft tissues in the kV projection images. The decomposed spine-only x-ray images achieved submillimeter matching accuracy at all beam angles and significantly reduced the maximum error to 0.44 mm (<2 pixels), compared with 0.92 mm (~4 pixels) for ResNetGAN. The PCAT-decomposed spine images also had higher PSNR and SSIM (p-value < 0.001). CONCLUSION By incorporating patient-specific prior knowledge into the deep learning algorithm, PCAT selectively learned the important latent features, significantly improving the robustness of kV projection image decomposition and leading to improved motion tracking accuracy in paraspinal SBRT.
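The core wiring of prior cross attention can be sketched with a standard multi-head attention layer: kV-projection tokens act as queries, and tokens from the simulated spine-only DRR priors act as keys and values. The dimensions, residual connection, and normalization below are assumptions, not the paper's exact architecture.

```python
# Hedged sketch of patient-specific prior cross attention in the spirit of
# PCAT: projection features attend to DRR prior features of the same spine.
import torch
import torch.nn as nn

class PriorCrossAttention(nn.Module):
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, kv_feats, prior_feats):
        attended, _ = self.attn(query=kv_feats, key=prior_feats, value=prior_feats)
        return self.norm(kv_feats + attended)  # residual: amplify prior-correlated features

x = torch.randn(1, 1024, 256)      # flattened kV projection tokens (assumed shape)
prior = torch.randn(1, 1024, 256)  # tokens from simulated spine-only DRRs
out = PriorCrossAttention()(x, prior)
```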
Affiliation(s)
- Xiuxiu He
- Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Weixing Cai
- Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Feifei Li
- Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Qiyong Fan
- Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Pengpeng Zhang
- Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- John J. Cuaron
- Department of Radiation Oncology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Laura I. Cerviño
- Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Jean M. Moran
- Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Xiang Li
- Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Tianfang Li
- Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
15. Xu K, Li T, Khan MS, Gao R, Antic SL, Huo Y, Sandler KL, Maldonado F, Landman BA. Body composition assessment with limited field-of-view computed tomography: A semantic image extension perspective. Med Image Anal 2023; 88:102852. PMID: 37276799; PMCID: PMC10527087; DOI: 10.1016/j.media.2023.102852.
Abstract
Field-of-view (FOV) tissue truncation beyond the lungs is common in routine lung screening computed tomography (CT). This poses limitations for opportunistic CT-based body composition (BC) assessment, as key anatomical structures are missing. Traditionally, extending the FOV of CT is treated as a reconstruction problem using limited data; however, this approach relies on projection-domain data, which might not be available in practice. In this work, we formulate the problem from a semantic image extension perspective, which requires only image data as input. The proposed two-stage method identifies a new FOV border based on the estimated extent of the complete body and imputes the missing tissues in the truncated region. Training samples are simulated using CT slices with the complete body in the FOV, making model development self-supervised. We evaluate the validity of the proposed method for automatic BC assessment in lung screening CT with limited FOV. The method effectively restores the missing tissues and reduces the BC assessment error introduced by FOV tissue truncation. In BC assessment on large-scale lung screening CT datasets, this correction improves both intra-subject consistency and correlation with anthropometric approximations. The developed method is available at https://github.com/MASILab/S-EFOV.
Affiliation(s)
- Kaiwen Xu
- Vanderbilt University, 2301 Vanderbilt Place, Nashville, 37235, United States.
- Thomas Li
- Vanderbilt University, 2301 Vanderbilt Place, Nashville, 37235, United States
- Mirza S Khan
- Vanderbilt University Medical Center, 1211 Medical Center Drive, Nashville, 37232, United States
- Riqiang Gao
- Vanderbilt University, 2301 Vanderbilt Place, Nashville, 37235, United States
- Sanja L Antic
- Vanderbilt University Medical Center, 1211 Medical Center Drive, Nashville, 37232, United States
- Yuankai Huo
- Vanderbilt University, 2301 Vanderbilt Place, Nashville, 37235, United States
- Kim L Sandler
- Vanderbilt University Medical Center, 1211 Medical Center Drive, Nashville, 37232, United States
- Fabien Maldonado
- Vanderbilt University Medical Center, 1211 Medical Center Drive, Nashville, 37232, United States
- Bennett A Landman
- Vanderbilt University, 2301 Vanderbilt Place, Nashville, 37235, United States; Vanderbilt University Medical Center, 1211 Medical Center Drive, Nashville, 37232, United States
16. Du Y, Wang L, Meng D, Chen B, An C, Liu H, Liu W, Xu Y, Fan Y, Feng D, Wang X, Xu X. Individualized Statistical Modeling of Lesions in Fundus Images for Anomaly Detection. IEEE Trans Med Imaging 2023; 42:1185-1196. PMID: 36446017; DOI: 10.1109/tmi.2022.3225422.
Abstract
Anomaly detection in fundus images remains challenging because fundus images often contain diverse types of lesions with various locations, sizes, shapes, and colors. Current methods achieve anomaly detection mainly by reconstructing the fundus image background, or separating it from the image, under the guidance of a set of normal fundus images. The reconstruction methods, however, ignore constraints from the lesions. The separation methods primarily model the diverse lesions with pixel-based independent and identically distributed (i.i.d.) properties, neglecting the individualized variations of different lesion types and their structural properties; hence, these methods may struggle to distinguish lesions from fundus image backgrounds, especially in the presence of normal personalized variations (NPV). To address these challenges, we propose a patch-based non-i.i.d. mixture of Gaussians (MoG) to model diverse lesions, adapting to their statistical distribution variations across fundus images and their patch-like structural properties. Further, we introduce the weighted Schatten p-norm as the metric of low-rank decomposition to enhance the accuracy of the learned fundus image backgrounds and reduce false positives caused by NPV. With individualized modeling of the diverse lesions and background learning, fundus image backgrounds and NPV are finely learned and distinguished from diverse lesions, ultimately improving anomaly detection. The proposed method is evaluated on two real-world databases and one artificial database, outperforming state-of-the-art methods.
17. Liu S, Cai T, Tang X, Zhang Y, Wang C. COVID-19 diagnosis via chest X-ray image classification based on multiscale class residual attention. Comput Biol Med 2022; 149:106065. PMID: 36081225; PMCID: PMC9433340; DOI: 10.1016/j.compbiomed.2022.106065.
Abstract
To detect COVID-19 effectively, a multiscale class residual attention (MCRA) network for chest X-ray (CXR) image classification is proposed. First, to overcome data shortage and improve the robustness of the network, pixel-level mixing of local image regions is introduced for data augmentation and noise reduction. Second, a multi-scale fusion strategy is adopted to extract global contextual information at different scales and enhance semantic representation. Finally, class residual attention is employed to generate spatial attention for each class, which avoids inter-class interference and enhances related features to further improve COVID-19 detection. Experimental results show that the network achieves superior diagnostic performance on the COVIDx dataset, with accuracy, PPV, sensitivity, specificity, and F1-score of 97.71%, 96.76%, 96.56%, 98.96%, and 96.64%, respectively; moreover, heat maps lend the deep model some interpretability.
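The pixel-level mixing of local regions can be sketched as a localized blend of two training images; the blend rule and parameters below are assumptions for illustration, not the authors' exact augmentation.

```python
# Hedged sketch of pixel-level mixing of a local region between two CXRs,
# a CutMix-style augmentation that blends rather than pastes the patch.
import torch

def local_region_mix(img_a, img_b, box, alpha=0.5):
    """img_*: (C, H, W) tensors in [0, 1]; box: (y0, y1, x0, x1) region to mix."""
    y0, y1, x0, x1 = box
    out = img_a.clone()
    out[:, y0:y1, x0:x1] = (alpha * img_a[:, y0:y1, x0:x1]
                            + (1 - alpha) * img_b[:, y0:y1, x0:x1])
    return out
```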
Affiliation(s)
- Shangwang Liu
- College of Computer and Information Engineering, Henan Normal University, Xinxiang, 453007, China; Engineering Lab of Intelligence Business & Internet of Things, Henan Province, China.
- Tongbo Cai
- College of Computer and Information Engineering, Henan Normal University, Xinxiang, 453007, China; Engineering Lab of Intelligence Business & Internet of Things, Henan Province, China
- Xiufang Tang
- College of Computer and Information Engineering, Henan Normal University, Xinxiang, 453007, China; Engineering Lab of Intelligence Business & Internet of Things, Henan Province, China
- Yangyang Zhang
- College of Computer and Information Engineering, Henan Normal University, Xinxiang, 453007, China; Engineering Lab of Intelligence Business & Internet of Things, Henan Province, China
- Changgeng Wang
- College of Computer and Information Engineering, Henan Normal University, Xinxiang, 453007, China; Engineering Lab of Intelligence Business & Internet of Things, Henan Province, China
18. Fan L, Wang Z, Zhou J. LDADN: a local discriminant auxiliary disentangled network for key-region-guided chest X-ray image synthesis augmented in pneumoconiosis detection. Biomed Opt Express 2022; 13:4353-4369. PMID: 36032572; PMCID: PMC9408261; DOI: 10.1364/boe.461888.
Abstract
Pneumoconiosis is deemed one of China's most common and serious occupational diseases; its high prevalence and treatment cost create enormous pressure on socio-economic development. However, due to the scarcity of labeled data and class-imbalanced training sets, computer-aided diagnosis of pneumoconiosis from chest X-ray (CXR) images remains a challenging task. Current CXR data augmentation solutions cannot sufficiently extract small-scale features in lesion areas or synthesize high-quality images, which may cause detection errors in the diagnosis phase. In this paper, we propose a local discriminant auxiliary disentangled network (LDADN) to synthesize CXR images for augmentation in pneumoconiosis detection. This model enables high-frequency transfer of details by leveraging batches of mutually independent local discriminators. Cooperating with local adversarial learning and a Laplacian filter, features in the lesion area can be disentangled by a single network. The results show that LDADN is superior to the compared models on the quantitative assessment metrics. When used for data augmentation, the synthesized images significantly boost detection accuracy to 99.31%. Furthermore, this study offers useful references for analyzing medical image data with insufficient labels or class imbalance.
19
Du Y, Wang L, Chen B, An C, Liu H, Fan Y, Wang X, Xu X. Anomaly detection in fundus images by self-adaptive decomposition via local and color based sparse coding. Biomed Opt Express 2022; 13:4261-4277. [PMID: 36032576 PMCID: PMC9408254 DOI: 10.1364/boe.461224]
Abstract
Anomaly detection in color fundus images is challenging due to the diversity of anomalies. Current studies detect anomalies from fundus images by learning their background images, while ignoring the rich characteristics of the anomalies themselves. In this paper, we propose a simultaneous modeling strategy in which both the sequential sparsity and the local and color saliency properties of anomalies are utilized for multi-perspective anomaly modeling. Meanwhile, a Schatten p-norm based metric is employed to better learn the heterogeneous background images, from which the anomalies are more readily discerned. Experiments and comparisons demonstrate the effectiveness and superior performance of the proposed method.
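The Schatten p-norm used for background modeling has a direct definition: the l_p norm of a matrix's singular values, which for p < 1 favors low-rank (homogeneous background) solutions. A minimal sketch follows; the choice p = 0.5 is an illustrative assumption.

```python
import numpy as np

def schatten_p_norm(M, p=0.5):
    """Compute (sum_i sigma_i^p)^(1/p) over the singular values of M."""
    sigma = np.linalg.svd(M, compute_uv=False)
    return float(np.sum(sigma ** p) ** (1.0 / p))

B = np.random.rand(64, 64)   # stand-in background matrix
print(schatten_p_norm(B, p=0.5))
```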
Affiliation(s)
- Yuchen Du: Department of Automation, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China; Department of Ophthalmology, Shanghai Key Laboratory of Ocular Fundus Diseases, Shanghai Engineering Center for Visual Science and Photo Medicine, Shanghai General Hospital, SJTU School of Medicine, 100 Haining Road, Shanghai, 200080, China; Department of Preventative Ophthalmology, Shanghai Eye Diseases Prevention and Treatment Center, Shanghai Eye Hospital, 380 Kangding Road, Shanghai, 200040, China; National Clinical Research Center for Eye Diseases, 380 Kangding Road, 200040, China
- Lisheng Wang: Department of Automation, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
- Benzhi Chen: Department of Automation, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
- Chengyang An: Department of Automation, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
- Hao Liu: Department of Automation, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
- Ying Fan: Department of Ophthalmology, Shanghai Key Laboratory of Ocular Fundus Diseases, Shanghai Engineering Center for Visual Science and Photo Medicine, Shanghai General Hospital, SJTU School of Medicine, 100 Haining Road, Shanghai, 200080, China; Department of Preventative Ophthalmology, Shanghai Eye Diseases Prevention and Treatment Center, Shanghai Eye Hospital, 380 Kangding Road, Shanghai, 200040, China; National Clinical Research Center for Eye Diseases, 380 Kangding Road, 200040, China
- Xiuying Wang: School of Computer Science, The University of Sydney, Sydney, NSW 2006, Australia
- Xun Xu: Department of Ophthalmology, Shanghai Key Laboratory of Ocular Fundus Diseases, Shanghai Engineering Center for Visual Science and Photo Medicine, Shanghai General Hospital, SJTU School of Medicine, 100 Haining Road, Shanghai, 200080, China; Department of Preventative Ophthalmology, Shanghai Eye Diseases Prevention and Treatment Center, Shanghai Eye Hospital, 380 Kangding Road, Shanghai, 200040, China; National Clinical Research Center for Eye Diseases, 380 Kangding Road, 200040, China
20
Zhu H, He X, Wang M, Zhang M, Qing L. Medical visual question answering via corresponding feature fusion combined with semantic attention. Math Biosci Eng 2022; 19:10192-10212. [PMID: 36031991 DOI: 10.3934/mbe.2022478]
Abstract
Medical visual question answering (Med-VQA) aims to leverage a pre-trained artificial intelligence model to answer clinical questions raised by doctors or patients about radiology images. However, owing to the high professional requirements of the medical field and the difficulty of annotating medical data, Med-VQA lacks sufficient large-scale, well-annotated radiology images for training. Researchers have mainly focused on improving the model's visual feature extractor to address this problem; few studies have focused on textual feature extraction, and most underestimate the interactions between corresponding visual and textual features. In this study, we propose a corresponding feature fusion (CFF) method to strengthen the interactions between specific features from corresponding radiology images and questions. In addition, we design a semantic attention (SA) module for textual feature extraction, which helps the model focus on the meaningful words in each question while reducing the attention spent on insignificant information. Extensive experiments demonstrate that the proposed method achieves competitive results on two benchmark datasets and outperforms existing state-of-the-art methods in answer prediction accuracy. The results also show that the model performs semantic understanding during answer prediction, which is advantageous in Med-VQA.
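The semantic-attention idea, weighting meaningful question words more heavily than filler, can be sketched in a few lines. The single learned scoring vector and the dimensions below are assumptions for illustration, not the paper's exact SA module.

```python
import numpy as np

def semantic_attention(word_embs, w):
    """word_embs: (T, d) token embeddings; w: (d,) learned scoring vector."""
    scores = word_embs @ w                     # one relevance score per token
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                   # softmax over the T tokens
    return weights @ word_embs                 # (d,) attended question vector

T, d = 12, 64
q_vec = semantic_attention(np.random.randn(T, d), np.random.randn(d))
```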
Affiliation(s)
- Han Zhu: College of Electronics and Information Engineering, Sichuan University, Chengdu 610064, China
- Xiaohai He: College of Electronics and Information Engineering, Sichuan University, Chengdu 610064, China
- Meiling Wang: College of Electronics and Information Engineering, Sichuan University, Chengdu 610064, China
- Mozhi Zhang: Department of Computer Science, University of Maryland, College Park, MD 20742, USA
- Linbo Qing: College of Electronics and Information Engineering, Sichuan University, Chengdu 610064, China
21
Liu X, Sanchez P, Thermos S, O'Neil AQ, Tsaftaris SA. Learning disentangled representations in the imaging domain. Med Image Anal 2022; 80:102516. [PMID: 35751992 DOI: 10.1016/j.media.2022.102516]
Abstract
Disentangled representation learning has been proposed as an approach to learning general representations even in the absence of, or with limited, supervision. A good general representation can be fine-tuned for new target tasks using modest amounts of data, or used directly in unseen domains, achieving remarkable performance in the corresponding task. This alleviation of the data and annotation requirements offers tantalising prospects for applications in computer vision and healthcare. In this tutorial paper, we motivate the need for disentangled representations, revisit key concepts, and describe practical building blocks and criteria for learning such representations. We survey applications in medical imaging, emphasising choices made in exemplar key works, and then discuss links to computer vision applications. We conclude by presenting limitations, challenges, and opportunities.
Affiliation(s)
- Xiao Liu: School of Engineering, The University of Edinburgh, Edinburgh EH9 3FG, UK
- Pedro Sanchez: School of Engineering, The University of Edinburgh, Edinburgh EH9 3FG, UK
- Spyridon Thermos: School of Engineering, The University of Edinburgh, Edinburgh EH9 3FG, UK
- Alison Q O'Neil: School of Engineering, The University of Edinburgh, Edinburgh EH9 3FG, UK; Canon Medical Research Europe, Edinburgh EH6 5NP, UK
- Sotirios A Tsaftaris: School of Engineering, The University of Edinburgh, Edinburgh EH9 3FG, UK; The Alan Turing Institute, London NW1 2DB, UK
22
van Velzen SGM, de Vos BD, Noothout JMH, Verkooijen HM, Viergever MA, Išgum I. Generative models for reproducible coronary calcium scoring. J Med Imaging (Bellingham) 2022; 9:052406. [PMID: 35664539 DOI: 10.1117/1.jmi.9.5.052406]
Abstract
Purpose: The coronary artery calcium (CAC) score, i.e., the amount of CAC quantified on computed tomography (CT), is a strong and independent predictor of coronary heart disease (CHD) events. However, CAC scoring suffers from limited interscan reproducibility, mainly because the clinical definition requires applying a fixed intensity threshold to segment calcifications. This limitation is especially pronounced in non-electrocardiogram-synchronized CT, where lesions are more affected by cardiac motion and partial volume effects. We therefore propose a CAC quantification method that does not require a segmentation threshold. Approach: Our method utilizes a generative adversarial network (GAN) in which a CT image with CAC is decomposed into an image without CAC and an image showing only CAC. The method, using a cycle-consistent GAN, was trained on 626 low-dose chest CTs and 514 radiotherapy treatment planning (RTP) CTs. Interscan reproducibility was compared with clinical calcium scoring in RTP CTs of 1662 patients, each with two scans. Results: The proposed method achieved a lower relative interscan difference in CAC mass: 47%, compared with 89% for manual clinical calcium scoring. The intraclass correlation coefficient of Agatston scores was 0.96 for the proposed method versus 0.91 for automatic clinical calcium scoring. Conclusions: The increased interscan reproducibility achieved by our method may improve the reliability of CHD risk categorization and the accuracy of CHD event prediction.
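Both reproducibility figures quoted above follow standard definitions; a minimal sketch of computing them for paired per-patient scores is given below. The averaging conventions are assumptions and may differ from the paper's exact methods.

```python
import numpy as np

def relative_interscan_diff(s1, s2):
    """Mean |s1 - s2| / pairwise mean score, as a percentage."""
    s1, s2 = np.asarray(s1, float), np.asarray(s2, float)
    denom = (s1 + s2) / 2.0
    mask = denom > 0                     # skip pairs with zero calcium
    return 100.0 * np.mean(np.abs(s1 - s2)[mask] / denom[mask])

def icc_oneway(s1, s2):
    """One-way random-effects ICC(1,1) for two scans per subject."""
    x = np.stack([s1, s2], axis=1).astype(float)   # (n subjects, k=2 scans)
    n, k = x.shape
    subj_means = x.mean(axis=1)
    grand = x.mean()
    msb = k * np.sum((subj_means - grand) ** 2) / (n - 1)         # between
    msw = np.sum((x - subj_means[:, None]) ** 2) / (n * (k - 1))  # within
    return (msb - msw) / (msb + (k - 1) * msw)

s1 = np.random.gamma(2.0, 50.0, size=100)        # stand-in Agatston scores
s2 = s1 + np.random.normal(0.0, 10.0, size=100)  # second scan with noise
print(relative_interscan_diff(s1, s2), icc_oneway(s1, s2))
```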
Affiliation(s)
- Sanne G M van Velzen: Amsterdam UMC location University of Amsterdam, Department of Biomedical Engineering and Physics, Amsterdam, The Netherlands; Amsterdam Cardiovascular Sciences, Heart Failure and Arrhythmias, Amsterdam, The Netherlands; University of Amsterdam, Informatics Institute, Faculty of Science, Amsterdam, The Netherlands; Utrecht University, University Medical Center Utrecht, Image Sciences Institute, Utrecht, The Netherlands
- Bob D de Vos: Amsterdam UMC location University of Amsterdam, Department of Biomedical Engineering and Physics, Amsterdam, The Netherlands; Amsterdam Cardiovascular Sciences, Heart Failure and Arrhythmias, Amsterdam, The Netherlands; University of Amsterdam, Informatics Institute, Faculty of Science, Amsterdam, The Netherlands
- Julia M H Noothout: Amsterdam UMC location University of Amsterdam, Department of Biomedical Engineering and Physics, Amsterdam, The Netherlands; Amsterdam Cardiovascular Sciences, Heart Failure and Arrhythmias, Amsterdam, The Netherlands; University of Amsterdam, Informatics Institute, Faculty of Science, Amsterdam, The Netherlands
- Helena M Verkooijen: University Medical Center Utrecht, Imaging Division, Utrecht, The Netherlands
- Max A Viergever: Utrecht University, University Medical Center Utrecht, Image Sciences Institute, Utrecht, The Netherlands
- Ivana Išgum: Amsterdam UMC location University of Amsterdam, Department of Biomedical Engineering and Physics, Amsterdam, The Netherlands; Amsterdam Cardiovascular Sciences, Heart Failure and Arrhythmias, Amsterdam, The Netherlands; University of Amsterdam, Informatics Institute, Faculty of Science, Amsterdam, The Netherlands; Amsterdam UMC location University of Amsterdam, Department of Radiology and Nuclear Medicine, Amsterdam, The Netherlands
23
Zhao SX, Chen Y, Yang KF, Luo Y, Ma BY, Li YJ. A Local and Global Feature Disentangled Network: Toward Classification of Benign-Malignant Thyroid Nodules From Ultrasound Image. IEEE Trans Med Imaging 2022; 41:1497-1509. [PMID: 34990353 DOI: 10.1109/tmi.2022.3140797]
Abstract
Thyroid nodules are among the most common nodular lesions, and the incidence of thyroid cancer has increased rapidly over the past three decades, making it one of the cancers with the highest incidence. As a non-invasive imaging modality, ultrasonography can help distinguish benign from malignant thyroid nodules and can be used for large-scale screening. In this study, inspired by the domain knowledge sonographers apply when reading ultrasound images, a local and global feature disentangled network (LoGo-Net) is proposed to classify benign and malignant thyroid nodules. The model imitates the dual-pathway structure of human vision and establishes a new feature extraction method to improve nodule recognition. A tissue-anatomy disentangled (TAD) block connects the dual pathways, decoupling the cues of local and global features based on a self-attention mechanism. To verify the effectiveness of the model, we constructed a large-scale dataset and conducted extensive experiments. The results show that our method achieves an accuracy of 89.33%, suggesting potential for clinical practice, including early cancer screening in remote or resource-poor areas.
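The TAD block is described as coupling the two pathways through self-attention; for readers unfamiliar with the mechanism, a minimal single-head scaled dot-product sketch follows. The shapes and the single-head form are assumptions, not the block's actual design.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """X: (N, d) feature tokens; Wq/Wk/Wv: (d, d) projections. Returns (N, d)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])            # scaled dot products
    A = np.exp(scores - scores.max(axis=-1, keepdims=True))
    A /= A.sum(axis=-1, keepdims=True)                 # row-wise softmax
    return A @ V                                       # attention-weighted values

N, d = 16, 32
rng = np.random.default_rng(0)
out = self_attention(rng.normal(size=(N, d)), rng.normal(size=(d, d)),
                     rng.normal(size=(d, d)), rng.normal(size=(d, d)))
```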
24
Sharma CM, Goyal L, Chariar VM, Sharma N. Lung Disease Classification in CXR Images Using Hybrid Inception-ResNet-v2 Model and Edge Computing. J Healthc Eng 2022; 2022:9036457. [PMID: 35368941 PMCID: PMC8968389 DOI: 10.1155/2022/9036457]
Abstract
Chest X-ray (CXR) imaging is one of the most widely used and economical tests for diagnosing a wide range of diseases. However, accurately diagnosing disease from CXR samples is challenging even for expert radiologists, and there remains an acute shortage of trained radiologists worldwide. In the present study, a range of machine learning (ML), deep learning (DL), and transfer learning (TL) approaches are evaluated for classifying diseases in an openly available CXR image dataset. A combination of the synthetic minority over-sampling technique (SMOTE) and weighted class balancing is used to alleviate the effects of class imbalance. A hybrid Inception-ResNet-v2 transfer learning model coupled with data augmentation and image enhancement gives the best accuracy. The model is deployed in an edge environment using Amazon IoT Core to automate disease detection in CXR images across three categories: pneumonia, COVID-19, and normal. A comparative analysis is provided across metrics such as precision, recall, accuracy, and AUC-ROC score. The proposed technique gives an average accuracy of 98.66%. The accuracies of the other TL models, SqueezeNet, VGG19, ResNet50, and MobileNetV2, are 97.33%, 91.66%, 90.33%, and 76.00%, respectively, while a DL model trained from scratch gives an accuracy of 92.43%. Two feature-based ML classification techniques, support vector machine with local binary patterns (SVM + LBP) and decision tree with histogram of oriented gradients (DT + HOG), yield accuracies of 87.98% and 86.87%, respectively.
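The two imbalance remedies the abstract names, SMOTE and weighted class balancing, are available off the shelf; a minimal sketch with stand-in feature vectors follows (the dataset, shapes, and three-class split are assumptions).

```python
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.utils.class_weight import compute_class_weight

X = np.random.randn(300, 128)                   # stand-in CXR feature vectors
y = np.array([0] * 200 + [1] * 70 + [2] * 30)   # imbalanced 3-class labels

# SMOTE synthesises minority-class samples by interpolating neighbours.
X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)

# Balanced class weights can be passed to a classifier's loss instead of,
# or in addition to, oversampling.
weights = compute_class_weight("balanced", classes=np.unique(y), y=y)
print(X_res.shape, dict(zip(np.unique(y).tolist(), weights.round(2))))
```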
25
Chain Graph Explanation of Neural Network Based on Feature-Level Class Confusion. Appl Sci (Basel) 2022. [DOI: 10.3390/app12031523]
Abstract
Despite increasing interest in developing interpretable machine learning methods, most recent studies provide explanations only for single instances, require additional datasets, and are sensitive to hyperparameters. This paper proposes a confusion graph that reveals model weaknesses by constructing a confusion dictionary. Unlike other methods, which focus on the performance variation caused by suppressing single neurons, it defines the role of each neuron from two perspectives: 'correction' and 'violation'. Furthermore, the method can identify class relationships at similar positions at the feature level, which can suggest improvements to the model. Finally, the proposed graph construction is model-agnostic and requires neither additional data nor tedious hyperparameter tuning. Experimental results show that omitting the channels guided by the proposed graph causes severe performance degradation, from 91% to 33%, even though the graph retains only 1% of the total neurons.
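The suppression probe underlying this kind of analysis is easy to state: zero one feature channel and re-measure accuracy. The sketch below uses a toy linear head as a placeholder for the paper's network.

```python
import numpy as np

def accuracy_after_suppression(features, head, labels, channel):
    """features: (N, C); head: (C, K) linear classifier; labels: (N,)."""
    f = features.copy()
    f[:, channel] = 0.0                      # suppress a single channel
    preds = (f @ head).argmax(axis=1)
    return float((preds == labels).mean())

N, C, K = 500, 32, 10
rng = np.random.default_rng(0)
feats = rng.normal(size=(N, C))
head = rng.normal(size=(C, K))
labels = (feats @ head).argmax(axis=1)       # labels consistent with the head
drops = [accuracy_after_suppression(feats, head, labels, c) for c in range(C)]
print(min(drops), max(drops))                # channels differ in importance
```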
26
Salahuddin Z, Woodruff HC, Chatterjee A, Lambin P. Transparency of deep neural networks for medical image analysis: A review of interpretability methods. Comput Biol Med 2022; 140:105111. [PMID: 34891095 DOI: 10.1016/j.compbiomed.2021.105111]
Abstract
Artificial intelligence (AI) has emerged as a useful aid in numerous clinical applications for diagnosis and treatment decisions. Owing to the rapid increase in available data and computational power, deep neural networks have shown performance equal to or better than that of clinicians in many tasks. To conform to the principles of trustworthy AI, an AI system must be transparent, robust, fair, and accountable. Current deep neural solutions are referred to as black boxes because the specifics of their decision-making processes are poorly understood. There is therefore a need to ensure the interpretability of deep neural networks before they can be incorporated into routine clinical workflows. In this narrative review, we used systematic keyword searches and domain expertise to identify nine types of interpretability method that have been applied to deep learning models for medical image analysis, grouped by the type of explanation generated and their technical similarities. We also report progress in evaluating the explanations produced by various interpretability methods. Finally, we discuss limitations, provide guidelines for using interpretability methods, and outline future directions for the interpretability of deep neural networks in medical image analysis.
Affiliation(s)
- Zohaib Salahuddin: The D-Lab, Department of Precision Medicine, GROW - School for Oncology and Developmental Biology, Maastricht University, Maastricht, the Netherlands
- Henry C Woodruff: The D-Lab, Department of Precision Medicine, GROW - School for Oncology and Developmental Biology, Maastricht University, Maastricht, the Netherlands; Department of Radiology and Nuclear Medicine, GROW - School for Oncology and Developmental Biology, Maastricht University Medical Centre+, Maastricht, the Netherlands
- Avishek Chatterjee: The D-Lab, Department of Precision Medicine, GROW - School for Oncology and Developmental Biology, Maastricht University, Maastricht, the Netherlands
- Philippe Lambin: The D-Lab, Department of Precision Medicine, GROW - School for Oncology and Developmental Biology, Maastricht University, Maastricht, the Netherlands; Department of Radiology and Nuclear Medicine, GROW - School for Oncology and Developmental Biology, Maastricht University Medical Centre+, Maastricht, the Netherlands
27
Li X, Jiang Y, Rodriguez-Andina JJ, Luo H, Yin S, Kaynak O. When medical images meet generative adversarial network: recent development and research opportunities. Discov Artif Intell 2021; 1:5. [DOI: 10.1007/s44163-021-00006-0]
Abstract
Deep learning techniques have promoted the rise of artificial intelligence (AI) and performed well in computer vision. Medical image analysis is an important application of deep learning that is expected to greatly reduce the workload of doctors, contributing to more sustainable health systems. However, most current AI methods for medical image analysis are based on supervised learning, which requires large amounts of annotated data; the number of available medical images is usually small, and acquiring medical image annotations is expensive. The generative adversarial network (GAN), an unsupervised method that has become very popular in recent years, can simulate the distribution of real data and reconstruct approximations of real data. GANs open exciting new ways to generate medical images, expanding the number available for deep learning methods, and the generated data can address insufficient data or imbalanced data categories. Adversarial training is another contribution of GANs to medical imaging and has been applied to many tasks, such as classification, segmentation, and detection. This paper surveys the research status of GANs in medical imaging and analyzes several GAN methods commonly applied in this area, covering GAN applications both for medical image synthesis and for adversarial learning in other medical image tasks. Open challenges and future research directions are also discussed.
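For readers new to GANs, the adversarial objective the survey builds on can be written as two plain loss functions; this is the standard non-saturating formulation, not any specific medical-imaging variant.

```python
import numpy as np

def discriminator_loss(d_real, d_fake):
    """Push D(x) toward 1 on real samples and D(G(z)) toward 0 on fakes."""
    return float(-np.mean(np.log(d_real) + np.log(1.0 - d_fake)))

def generator_loss(d_fake):
    """Non-saturating generator loss: push D(G(z)) toward 1."""
    return float(-np.mean(np.log(d_fake)))

# Stand-in discriminator outputs (probabilities in (0, 1)).
print(discriminator_loss(np.array([0.9]), np.array([0.1])),
      generator_loss(np.array([0.1])))
```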
28
Lee S, Summers RM. Clinical Artificial Intelligence Applications in Radiology: Chest and Abdomen. Radiol Clin North Am 2021; 59:987-1002. [PMID: 34689882 DOI: 10.1016/j.rcl.2021.07.001]
Abstract
Organ segmentation, chest radiograph classification, and lung and liver nodule detection are among the most popular artificial intelligence (AI) tasks in chest and abdominal radiology, owing to the wide availability of public datasets. AI algorithms have achieved performance comparable to that of humans, in less time, for several organ segmentation tasks and some lesion detection and classification tasks. This article reviews currently published work on AI applied to chest and abdominal radiology, including organ segmentation, lesion detection, classification, and prognosis prediction.
Affiliation(s)
- Sungwon Lee: Imaging Biomarkers and Computer-Aided Diagnosis Laboratory, Department of Radiology and Imaging Sciences, National Institutes of Health Clinical Center, Building 10, Room 1C224D, 10 Center Drive, Bethesda, MD 20892-1182, USA
- Ronald M Summers: Imaging Biomarkers and Computer-Aided Diagnosis Laboratory, Department of Radiology and Imaging Sciences, National Institutes of Health Clinical Center, Building 10, Room 1C224D, 10 Center Drive, Bethesda, MD 20892-1182, USA
29
Decomposing normal and abnormal features of medical images for content-based image retrieval of glioma imaging. Med Image Anal 2021; 74:102227. [PMID: 34543911 DOI: 10.1016/j.media.2021.102227]
Abstract
In medical imaging, the characteristics derived purely from a disease should reflect the extent to which abnormal findings deviate from normal features. Indeed, physicians often need corresponding images without the abnormal findings of interest or, conversely, images that contain similar abnormal findings regardless of the normal anatomical context. This is called comparative diagnostic reading of medical images and is essential for a correct diagnosis. To support it, content-based image retrieval (CBIR) that can selectively utilize normal and abnormal features in medical images as two separable semantic components will be useful. In this study, we propose a neural network architecture that decomposes the semantic components of medical images into two latent codes: a normal anatomy code and an abnormal anatomy code. The normal anatomy code represents the counterfactual normal anatomy that would have existed if the sample were healthy, whereas the abnormal anatomy code captures abnormal changes that reflect deviation from the normal baseline. By computing similarity based on the normal code, the abnormal code, or the combination of the two, our algorithm can retrieve images according to the selected semantic component from a dataset of brain magnetic resonance images of gliomas. It can also use a synthetic query vector combining normal and abnormal anatomy codes from two different query images. To evaluate whether the retrieved images match the targeted semantic component, the overlap of the ground-truth labels is calculated as a metric of semantic consistency. Our algorithm provides a flexible CBIR framework for handling the decomposed features, with qualitatively and quantitatively strong results.
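The three retrieval modes described, normal code, abnormal code, or their combination, reduce to a similarity search over the corresponding latent vectors. A minimal cosine-similarity sketch with assumed code dimensions follows.

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def retrieve(q_normal, q_abnormal, db_normal, db_abnormal, mode="both", topk=5):
    """db_*: (N, d) latent codes for the image database; returns top-k indices."""
    if mode == "normal":
        q, db = q_normal, db_normal
    elif mode == "abnormal":
        q, db = q_abnormal, db_abnormal
    else:  # combine the two semantic components
        q = np.concatenate([q_normal, q_abnormal])
        db = np.concatenate([db_normal, db_abnormal], axis=1)
    sims = np.array([cosine(q, row) for row in db])
    return np.argsort(-sims)[:topk]

d, N = 32, 100
rng = np.random.default_rng(0)
idx = retrieve(rng.normal(size=d), rng.normal(size=d),
               rng.normal(size=(N, d)), rng.normal(size=(N, d)), mode="abnormal")
```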
30
Chen RJ, Lu MY, Chen TY, Williamson DFK, Mahmood F. Synthetic data in machine learning for medicine and healthcare. Nat Biomed Eng 2021; 5:493-497. [PMID: 34131324 PMCID: PMC9353344 DOI: 10.1038/s41551-021-00751-8]
Abstract
The proliferation of synthetic data in artificial intelligence for medicine and healthcare raises concerns about the vulnerabilities of the software and the challenges of current policy.
Affiliation(s)
- Richard J Chen: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
- Ming Y Lu: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
- Tiffany Y Chen: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Drew F K Williamson: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Faisal Mahmood: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
31
Shrivastava AD, Kell DB. FragNet, a Contrastive Learning-Based Transformer Model for Clustering, Interpreting, Visualizing, and Navigating Chemical Space. Molecules 2021; 26:2065. [PMID: 33916824 PMCID: PMC8038408 DOI: 10.3390/molecules26072065]
Abstract
The question of molecular similarity is core to cheminformatics and is usually assessed via pairwise comparison of property vectors or molecular fingerprints. We recently exploited variational autoencoders to embed 6M molecules in a chemical space, such that their (Euclidean) distance within the latent space so formed could be assessed within the framework of the entire molecular set. However, the standard objective function used did not seek to manipulate the latent space so as to cluster the molecules by any perceived similarity. Using a set of some 160,000 molecules of biological relevance, we here bring together three modern elements of deep learning to create a novel and disentangled latent space: transformers, contrastive learning, and an embedded autoencoder. The effective dimensionality of the latent space was varied such that clear separation of individual types of molecules could be observed within individual dimensions. The capacity of the network was such that many dimensions were not populated at all. As before, we assessed the utility of the representation by comparing clozapine with its near neighbours, and did the same for various antibiotics related to flucloxacillin. Transformers, especially when coupled with contrastive learning as here, effectively provide one-shot learning and lead to a successful, disentangled representation of molecular latent spaces that uses the entire training set in its construction while allowing "similar" molecules to cluster together in an effective and interpretable way.
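The contrastive ingredient can be illustrated with the widely used NT-Xent loss over two augmented views of a batch; the temperature and batch layout below are standard assumptions rather than FragNet's exact setup.

```python
import numpy as np

def nt_xent(z1, z2, tau=0.5):
    """z1, z2: (N, d) embeddings of two views of the same N items."""
    z = np.concatenate([z1, z2], axis=0)               # (2N, d)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # L2-normalise rows
    sim = z @ z.T / tau                                # cosine similarities
    np.fill_diagonal(sim, -np.inf)                     # mask self-similarity
    n = z1.shape[0]
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])  # positive index
    m = sim.max(axis=1, keepdims=True)
    log_denom = m[:, 0] + np.log(np.exp(sim - m).sum(axis=1))  # logsumexp
    return float(np.mean(log_denom - sim[np.arange(2 * n), pos]))

rng = np.random.default_rng(0)
print(nt_xent(rng.normal(size=(8, 16)), rng.normal(size=(8, 16))))
```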
Affiliation(s)
- Aditya Divyakant Shrivastava: Department of Computer Science and Engineering, Nirma University, Ahmedabad 382481, India; Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Crown St., Liverpool L69 7ZB, UK
- Douglas B. Kell: Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Crown St., Liverpool L69 7ZB, UK; Novo Nordisk Foundation Centre for Biosustainability, Technical University of Denmark, Building 220, Kemitorvet, 2800 Kgs Lyngby, Denmark; Mellizyme Ltd., Liverpool Science Park, IC1, 131 Mount Pleasant, Liverpool L3 5TF, UK