1
Wang Y, Du B, Wang W, Xu C. Multi-tailed vision transformer for efficient inference. Neural Netw 2024; 174:106235. [PMID: 38564978 DOI: 10.1016/j.neunet.2024.106235] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Revised: 01/22/2024] [Accepted: 03/11/2024] [Indexed: 04/04/2024]
Abstract
Recently, Vision Transformer (ViT) has achieved promising performance in image recognition and is gradually becoming a powerful backbone in various vision tasks. To satisfy the sequential input of the Transformer, the tail of ViT first splits each image into a sequence of visual tokens with a fixed length. Then, the following self-attention layers construct the global relationship between tokens to produce useful representations for the downstream tasks. Empirically, representing the image with more tokens leads to better performance, yet the quadratic computational complexity of the self-attention layer in the number of tokens can seriously affect the efficiency of ViT's inference. To reduce computation, a few pruning methods progressively prune uninformative tokens in the Transformer encoder, while leaving the number of tokens before the Transformer untouched. In fact, feeding fewer tokens into the Transformer encoder directly reduces the subsequent computational cost. In this spirit, we propose a Multi-Tailed Vision Transformer (MT-ViT) in this paper. MT-ViT adopts multiple tails to produce visual sequences of different lengths for the following Transformer encoder. A tail predictor is introduced to decide which tail is the most efficient for the image to produce an accurate prediction. Both modules are optimized in an end-to-end fashion with the Gumbel-Softmax trick. Experiments on ImageNet-1K demonstrate that MT-ViT can achieve a significant reduction in FLOPs with no degradation of accuracy and outperforms the compared methods in both accuracy and FLOPs.
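As a rough illustration of why a shorter token sequence lowers cost, the sketch below counts tokens for two patch sizes and the multiply-accumulates of the two matrix products in one self-attention layer. The 224-pixel image, the patch sizes 16 and 32, and the embedding width 384 are assumptions chosen for the example, not values taken from the abstract, and the helper names are hypothetical.

```python
def num_tokens(image_size: int, patch_size: int) -> int:
    """Number of non-overlapping square patches in a square image."""
    if image_size % patch_size != 0:
        raise ValueError("patch_size must divide image_size")
    side = image_size // patch_size
    return side * side


def attention_macs(n_tokens: int, dim: int) -> int:
    """Multiply-accumulates for Q @ K^T plus attn @ V in one attention layer.

    Each of the two products costs n_tokens * n_tokens * dim, so the total
    grows quadratically with the number of tokens. Projections and the MLP
    are deliberately left out of this count.
    """
    return 2 * n_tokens * n_tokens * dim


if __name__ == "__main__":
    # Assumed settings: 224x224 input, embedding width 384.
    for patch in (16, 32):
        n = num_tokens(224, patch)
        print(f"patch={patch} tokens={n} attention_macs={attention_macs(n, 384)}")
```

Under these assumptions, going from 196 tokens to 49 tokens cuts the attention matrix products by a factor of 16, which is the kind of saving a shorter input sequence can offer.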
Affiliation(s)
- Yunke Wang
  - School of Computer Science, National Engineering Research Center for Multimedia Software, Institute of Artificial Intelligence and Wuhan Institute of Data Intelligence, Wuhan University, Wuhan, 430072, China.
- Bo Du
  - School of Computer Science, National Engineering Research Center for Multimedia Software, Institute of Artificial Intelligence and Wuhan Institute of Data Intelligence, Wuhan University, Wuhan, 430072, China.
- Wenyuan Wang
  - School of Electric Information, Wuhan University, Wuhan, 430072, China.
- Chang Xu
  - School of Computer Science, The University of Sydney, Sydney, Australia.
2
Gohla G, Hauser TK, Bombach P, Feucht D, Estler A, Bornemann A, Zerweck L, Weinbrenner E, Ernemann U, Ruff C. Speeding Up and Improving Image Quality in Glioblastoma MRI Protocol by Deep Learning Image Reconstruction. Cancers (Basel) 2024; 16:1827. [PMID: 38791906 PMCID: PMC11119715 DOI: 10.3390/cancers16101827] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2024] [Revised: 04/29/2024] [Accepted: 05/08/2024] [Indexed: 05/26/2024] Open
Abstract
A fully diagnostic MRI glioma protocol is key to monitoring therapy assessment but is time-consuming and especially challenging in critically ill and uncooperative patients. Artificial intelligence has demonstrated promise in reducing scan time and improving image quality simultaneously. The purpose of this study was to investigate the diagnostic performance, the impact on acquisition acceleration, and the image quality of a deep learning-optimized glioma protocol of the brain. Thirty-three patients with histologically confirmed glioblastoma underwent standardized brain tumor imaging according to the glioma consensus recommendations on a 3-Tesla MRI scanner. Conventional and deep learning-reconstructed (DLR) fluid-attenuated inversion recovery, and T2- and T1-weighted contrast-enhanced turbo spin echo images with an improved in-plane resolution, i.e., super-resolution, were acquired. Two experienced neuroradiologists independently evaluated the image datasets for subjective image quality, diagnostic confidence, tumor conspicuity, noise levels, artifacts, and sharpness. In addition, the tumor volume was measured in the image datasets according to Response Assessment in Neuro-Oncology (RANO) 2.0 and compared between both imaging techniques, and various clinical-pathological parameters were determined. The average time saving of DLR sequences was 30% per MRI sequence. Simultaneously, DLR sequences showed superior overall image quality (all p < 0.001), improved tumor conspicuity and image sharpness (all p < 0.001, respectively), and less image noise (all p < 0.001), while maintaining diagnostic confidence (all p > 0.05), compared to conventional images. Regarding RANO 2.0, the volume of non-enhancing non-target lesions (p = 0.963), enhancing target lesions (p = 0.993), and enhancing non-target lesions (p = 0.951) did not differ between reconstruction types.
The feasibility of the deep learning-optimized glioma protocol was demonstrated with a 30% reduction in acquisition time on average and an increased in-plane resolution. The evaluated DLR sequences improved subjective image quality and maintained diagnostic accuracy in tumor detection and tumor classification according to RANO 2.0.
Affiliation(s)
- Georg Gohla
  - Department of Diagnostic and Interventional Neuroradiology, Eberhard Karls-University Tübingen, 72076 Tübingen, Germany
- Till-Karsten Hauser
  - Department of Diagnostic and Interventional Neuroradiology, Eberhard Karls-University Tübingen, 72076 Tübingen, Germany
- Paula Bombach
  - Department of Neurology and Interdisciplinary Neuro-Oncology, University Hospital Tübingen, Hoppe-Seyler-Str. 3, 72076 Tübingen, Germany
  - Hertie Institute for Clinical Brain Research, Eberhard Karls University Tübingen Center of Neuro-Oncology, Ottfried-Müller-Straße 27, 72076 Tübingen, Germany
  - Center for Neuro-Oncology, Comprehensive Cancer Center Tübingen-Stuttgart, University Hospital of Tuebingen, Eberhard Karls University of Tübingen, Herrenberger Straße 23, 72070 Tübingen, Germany
- Daniel Feucht
  - Department of Neurosurgery, University Hospital Tübingen, Hoppe-Seyler-Str. 3, 72076 Tübingen, Germany
- Arne Estler
  - Department of Diagnostic and Interventional Neuroradiology, Eberhard Karls-University Tübingen, 72076 Tübingen, Germany
- Antje Bornemann
  - Department of Neuropathology, Institute of Pathology and Neuropathology, University Hospital Tübingen, Calwerstraße 3, 72076 Tübingen, Germany
- Leonie Zerweck
  - Department of Diagnostic and Interventional Neuroradiology, Eberhard Karls-University Tübingen, 72076 Tübingen, Germany
- Eliane Weinbrenner
  - Department of Diagnostic and Interventional Neuroradiology, Eberhard Karls-University Tübingen, 72076 Tübingen, Germany
- Ulrike Ernemann
  - Department of Diagnostic and Interventional Neuroradiology, Eberhard Karls-University Tübingen, 72076 Tübingen, Germany
- Christer Ruff
  - Department of Diagnostic and Interventional Neuroradiology, Eberhard Karls-University Tübingen, 72076 Tübingen, Germany
3
Grigas O, Damaševičius R, Maskeliūnas R. Positive Effect of Super-Resolved Structural Magnetic Resonance Imaging for Mild Cognitive Impairment Detection. Brain Sci 2024; 14:381. [PMID: 38672031 PMCID: PMC11048389 DOI: 10.3390/brainsci14040381] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2024] [Revised: 04/09/2024] [Accepted: 04/10/2024] [Indexed: 04/28/2024] Open
Abstract
This paper presents a novel approach to improving the detection of mild cognitive impairment (MCI) through the use of super-resolved structural magnetic resonance imaging (MRI) and optimized deep learning models. The study introduces enhancements to the perceptual quality of super-resolved 2D structural MRI images using advanced loss functions, modifications to the upscaler part of the generator, and experiments with various discriminators within a generative adversarial training setting. It empirically demonstrates the effectiveness of super-resolution in the MCI detection task, showcasing performance improvements across different state-of-the-art classification models. The paper also addresses the challenge of accurately capturing perceptual image quality, particularly when images contain checkerboard artifacts, and proposes a methodology that incorporates hyperparameter optimization through a Pareto optimal Markov blanket (POMB). This approach systematically explores the hyperparameter space, focusing on reducing overfitting and enhancing model generalizability. The research findings contribute to the field by demonstrating that super-resolution can significantly improve the quality of MRI images for MCI detection, highlighting the importance of choosing an adequate discriminator and the potential of super-resolution as a preprocessing step to boost classification model performance.
Affiliation(s)
- Ovidijus Grigas
  - Faculty of Informatics, Kaunas University of Technology, 50254 Kaunas, Lithuania
- Robertas Damaševičius
  - Faculty of Informatics, Kaunas University of Technology, 50254 Kaunas, Lithuania
  - Faculty of Applied Mathematics, Silesian University of Technology, 44-100 Gliwice, Poland
- Rytis Maskeliūnas
  - Faculty of Informatics, Kaunas University of Technology, 50254 Kaunas, Lithuania
4
Han Z, Huang W. Arbitrary scale super-resolution diffusion model for brain MRI images. Comput Biol Med 2024; 170:108003. [PMID: 38262200 DOI: 10.1016/j.compbiomed.2024.108003] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/25/2023] [Revised: 12/22/2023] [Accepted: 01/13/2024] [Indexed: 01/25/2024]
Abstract
Given the constraints posed by hardware capacity, scan duration, and patient cooperation, the reconstruction of magnetic resonance imaging (MRI) images emerges as a pivotal aspect of medical imaging research. Currently, deep learning-based super-resolution (SR) methods are widely discussed in medical image processing due to their ability to reconstruct high-quality, high-resolution (HR) images from low-resolution (LR) inputs. However, most existing MRI SR methods are designed for specific magnifications and cannot generate MRI images at arbitrary scales, which hinders radiologists from fully visualizing lesions. Moreover, current arbitrary-scale SR methods often suffer from issues like excessive smoothing and artifacts. In this paper, we propose an Arbitrary Scale Super-Resolution Diffusion Model (ASSRDM), which combines implicit neural representation with the denoising diffusion probabilistic model to achieve arbitrary-scale, high-fidelity medical image SR. Moreover, we formulate a continuous resolution regulation mechanism, comprising a multi-scale LR guidance network and a scaling factor. The scaling factor finely adjusts the resolution and dynamically influences the weighting of LR details and synthesized features in the final output. This capability allows the model to seamlessly adapt to the requirements of continuous resolution adjustments. Additionally, the multi-scale LR guidance network provides the denoising block with multi-resolution LR features to enrich texture information and restore high-frequency details. Extensive experiments conducted on the IXI and fastMRI datasets demonstrate that our ASSRDM exhibits superior performance compared to existing techniques and has tremendous potential in clinical practice.
Affiliation(s)
- Zhitao Han
  - School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China.
- Wenhui Huang
  - School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China.