1
Zhang Y, Balestra G, Zhang K, Wang J, Rosati S, Giannini V. MultiTrans: Multi-branch transformer network for medical image segmentation. Comput Methods Programs Biomed 2024; 254:108280. PMID: 38878361. DOI: 10.1016/j.cmpb.2024.108280.
Abstract
BACKGROUND AND OBJECTIVE The Transformer, notable for its ability to model global context, has been used to remedy the shortcomings of convolutional neural networks (CNNs) and break their dominance in medical image segmentation. However, the self-attention module is inefficient in both memory and computation, so many methods must build their Transformer branch on heavily downsampled feature maps or adopt tokenized image patches to fit their model onto accessible GPUs. This patch-wise operation prevents the network from extracting pixel-level intrinsic structure or dependencies within each patch, hurting performance on pixel-level classification tasks. METHODS To tackle these issues, we propose a memory- and computation-efficient self-attention module that enables reasoning on relatively high-resolution features, improving the efficiency of learning global information while effectively capturing fine spatial details. Furthermore, we design a novel Multi-Branch Transformer (MultiTrans) architecture that provides hierarchical features for handling objects of variable shape and size in medical images. By building four parallel Transformer branches on different levels of the CNN, our hybrid network aggregates both multi-scale global contexts and multi-scale local features. RESULTS MultiTrans achieves the highest segmentation accuracy on three medical image datasets with different modalities: Synapse, ACDC and M&Ms. Compared to Standard Self-Attention (SSA), the proposed Efficient Self-Attention (ESA) greatly reduces training memory and computational complexity while even slightly improving accuracy. Specifically, the training memory cost, FLOPs and parameter count of our ESA are 18.77%, 20.68% and 74.07% of those of the SSA. CONCLUSIONS Experiments on three medical image datasets demonstrate the generality and robustness of the designed network. The ablation study shows the efficiency and effectiveness of the proposed ESA. Code is available at: https://github.com/Yanhua-Zhang/MultiTrans-extension.
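The abstract does not describe how the Efficient Self-Attention (ESA) achieves its memory and FLOP savings. As a purely illustrative aside, the sketch below shows one widely used way to make self-attention tractable on relatively high-resolution feature maps: spatially reducing the key/value maps before computing attention (PVT-style spatial-reduction attention). The class name, sr_ratio, and tensor sizes are assumptions for the example and do not reproduce the authors' ESA.

```python
# Minimal sketch of spatial-reduction self-attention on CNN feature-map tokens.
# Keys/values are downsampled by a strided convolution, so the attention matrix
# shrinks from N x N to N x (N / sr_ratio^2). Illustrative only.
import torch
import torch.nn as nn

class SpatialReductionSelfAttention(nn.Module):
    def __init__(self, dim: int = 64, num_heads: int = 4, sr_ratio: int = 4):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.scale = self.head_dim ** -0.5
        self.q = nn.Linear(dim, dim)
        self.kv = nn.Linear(dim, dim * 2)
        self.proj = nn.Linear(dim, dim)
        # Strided convolution shrinks the key/value spatial grid by sr_ratio.
        self.sr = nn.Conv2d(dim, dim, kernel_size=sr_ratio, stride=sr_ratio)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor, H: int, W: int) -> torch.Tensor:
        # x: (B, N, C) with N = H * W tokens taken from a CNN feature map.
        B, N, C = x.shape
        q = self.q(x).reshape(B, N, self.num_heads, self.head_dim).transpose(1, 2)

        # Reduce the spatial size of keys/values before projecting them.
        x_ = x.transpose(1, 2).reshape(B, C, H, W)
        x_ = self.sr(x_).reshape(B, C, -1).transpose(1, 2)
        x_ = self.norm(x_)
        kv = self.kv(x_).reshape(B, -1, 2, self.num_heads, self.head_dim).permute(2, 0, 3, 1, 4)
        k, v = kv[0], kv[1]

        attn = (q @ k.transpose(-2, -1)) * self.scale
        attn = attn.softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(B, N, C)
        return self.proj(out)

# Example: tokens from a 56x56 feature map with 64 channels.
if __name__ == "__main__":
    x = torch.randn(2, 56 * 56, 64)
    esa = SpatialReductionSelfAttention(dim=64, num_heads=4, sr_ratio=4)
    print(esa(x, 56, 56).shape)  # torch.Size([2, 3136, 64])
```

With sr_ratio=4, the key/value sequence drops from 3136 to 196 tokens, which is where most of the memory saving comes from in this particular scheme; the paper's own module may use a different mechanism.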
Affiliation(s)
- Yanhua Zhang
- Department of Electronics and Telecommunications, Politecnico di Torino, Corso Duca degli Abruzzi 24, Turin, 10129, Italy; School of Astronautics, Northwestern Polytechnical University, 127 West Youyi Road, Xi'an, 710072, China.
- Gabriella Balestra
- Department of Electronics and Telecommunications, Politecnico di Torino, Corso Duca degli Abruzzi 24, Turin, 10129, Italy.
- Ke Zhang
- School of Astronautics, Northwestern Polytechnical University, 127 West Youyi Road, Xi'an, 710072, China.
- Jingyu Wang
- School of Astronautics, Northwestern Polytechnical University, 127 West Youyi Road, Xi'an, 710072, China.
- Samanta Rosati
- Department of Electronics and Telecommunications, Politecnico di Torino, Corso Duca degli Abruzzi 24, Turin, 10129, Italy.
- Valentina Giannini
- Department of Surgical Sciences, University of Turin, Turin, 10124, Italy; Radiology Unit, Candiolo Cancer Institute, FPO-IRCCS, Candiolo, 10060, Italy.
2
Li S, Wang H, Meng Y, Zhang C, Song Z. Multi-organ segmentation: a progressive exploration of learning paradigms under scarce annotation. Phys Med Biol 2024; 69:11TR01. PMID: 38479023. DOI: 10.1088/1361-6560/ad33b5.
Abstract
Precise delineation of multiple organs or abnormal regions in the human body from medical images plays an essential role in computer-aided diagnosis, surgical simulation, image-guided interventions, and especially in radiotherapy treatment planning. It is therefore of great significance to explore automatic segmentation approaches, among which deep learning-based approaches have evolved rapidly and achieved remarkable progress in multi-organ segmentation. However, obtaining an appropriately sized and fine-grained annotated dataset of multiple organs is extremely hard and expensive. Such annotation scarcity limits the development of high-performance multi-organ segmentation models but has spurred many annotation-efficient learning paradigms. Among these, studies on transfer learning leveraging external datasets, semi-supervised learning incorporating unannotated datasets, and partially-supervised learning integrating partially-labeled datasets have become the dominant ways to address this dilemma in multi-organ segmentation. We first review fully supervised methods, then present a comprehensive and systematic elaboration of the three aforementioned learning paradigms in the context of multi-organ segmentation from both technical and methodological perspectives, and finally summarize their challenges and future trends.
Affiliation(s)
- Shiman Li
- Digital Medical Research Center, School of Basic Medical Science, Fudan University, Shanghai Key Lab of Medical Image Computing and Computer Assisted Intervention, Shanghai 200032, People's Republic of China
- Haoran Wang
- Digital Medical Research Center, School of Basic Medical Science, Fudan University, Shanghai Key Lab of Medical Image Computing and Computer Assisted Intervention, Shanghai 200032, People's Republic of China
- Yucong Meng
- Digital Medical Research Center, School of Basic Medical Science, Fudan University, Shanghai Key Lab of Medical Image Computing and Computer Assisted Intervention, Shanghai 200032, People's Republic of China
- Chenxi Zhang
- Digital Medical Research Center, School of Basic Medical Science, Fudan University, Shanghai Key Lab of Medical Image Computing and Computer Assisted Intervention, Shanghai 200032, People's Republic of China
- Zhijian Song
- Digital Medical Research Center, School of Basic Medical Science, Fudan University, Shanghai Key Lab of Medical Image Computing and Computer Assisted Intervention, Shanghai 200032, People's Republic of China
3
Herr J, Stoyanova R, Mellon EA. Convolutional Neural Networks for Glioma Segmentation and Prognosis: A Systematic Review. Crit Rev Oncog 2024; 29:33-65. PMID: 38683153. DOI: 10.1615/critrevoncog.2023050852.
Abstract
Deep learning (DL) is poised to redefine the way medical images are processed and analyzed. Convolutional neural networks (CNNs), a specific type of DL architecture, are exceptional for high-throughput processing, allowing for the effective extraction of relevant diagnostic patterns from large volumes of complex visual data. This technology has garnered substantial interest in the field of neuro-oncology as a promising tool to enhance medical imaging throughput and analysis. A multitude of methods harnessing MRI-based CNNs have been proposed for brain tumor segmentation, classification, and prognosis prediction. They are often applied to gliomas, the most common primary brain cancer, to classify subtypes with the goal of guiding therapy decisions. Additionally, the difficulty of repeating brain biopsies to evaluate treatment response in the setting of often confusing imaging findings provides a unique niche for CNNs to help evaluate treatment response in gliomas. For example, glioblastoma, the most aggressive type of brain cancer, can grow due to poor treatment response, appear to grow acutely due to treatment-related inflammation as the tumor dies (pseudo-progression), or falsely appear to be regrowing after treatment as a result of brain damage from radiation (radiation necrosis). CNNs are being applied to resolve this diagnostic dilemma. This review provides a detailed synthesis of recent DL methods and applications for intratumor segmentation, glioma classification, and prognosis prediction. Furthermore, this review discusses the future direction of MRI-based CNNs in the field of neuro-oncology and challenges in model interpretability, data availability, and computational efficiency.
Affiliation(s)
- Radka Stoyanova
- Department of Radiation Oncology, University of Miami Miller School of Medicine, Sylvester Comprehensive Cancer Center, Miami, FL 33136, USA
- Eric Albert Mellon
- Department of Radiation Oncology, University of Miami Miller School of Medicine, Sylvester Comprehensive Cancer Center, Miami, FL 33136, USA
4
Wong KKL, Xu W, Ayoub M, Fu YL, Xu H, Shi R, Zhang M, Su F, Huang Z, Chen W. Brain image segmentation of the corpus callosum by combining Bi-Directional Convolutional LSTM and U-Net using multi-slice CT and MRI. Comput Methods Programs Biomed 2023; 238:107602. PMID: 37244234. DOI: 10.1016/j.cmpb.2023.107602.
Abstract
BACKGROUND AND OBJECTIVE Traditional disease diagnosis is usually performed by experienced physicians, but misdiagnosis or missed diagnosis still occurs. Exploring the relationship between changes in the corpus callosum and multiple brain infarcts requires extracting corpus callosum features from brain image data, which raises three key issues: (1) automation, (2) completeness, and (3) accuracy. Residual learning can facilitate network training, Bi-Directional Convolutional LSTM (BDC-LSTM) can exploit interlayer spatial dependencies, and HDC can expand the receptive field without losing resolution. METHODS In this paper, we propose a segmentation method combining BDC-LSTM and U-Net to segment the corpus callosum from multiple angles of brain images based on computed tomography (CT) and magnetic resonance imaging (MRI), using two sequence types, namely T2-weighted imaging and Fluid Attenuated Inversion Recovery (FLAIR). The two-dimensional slice sequences are segmented in the cross-sectional plane, and the segmentation results are combined to obtain the final result. The encoding, BDC-LSTM, and decoding stages all include convolutional neural networks. The encoding part uses asymmetric convolutional layers of different sizes and dilated convolutions to gather multi-slice information and extend the convolutional layers' receptive field. RESULTS The algorithm places BDC-LSTM between its encoding and decoding parts. On the brain image segmentation dataset with multiple cerebral infarcts, it attained scores of 0.876, 0.881, 0.887, and 0.912 for intersection over union (IoU), Dice similarity coefficient (DSC), sensitivity (SE), and positive predictive value (PPV). The experimental findings demonstrate that the algorithm outperforms its rivals in accuracy. CONCLUSION We obtained segmentation results for three image sets using three models, ConvLSTM, Pyramid-LSTM, and BDC-LSTM, and compared them to verify that BDC-LSTM is the best method for this segmentation task, enabling faster and more accurate segmentation of 3D medical images. We improve the convolutional neural network segmentation method, solving the over-segmentation problem to achieve high segmentation accuracy.
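For readers unfamiliar with the building blocks named above, the sketch below shows a minimal ConvLSTM cell and a bidirectional pass over a stack of adjacent slices, roughly where a BDC-LSTM block could sit between a U-Net encoder and decoder. It is a hedged illustration under assumed channel sizes and a simple concatenation fusion; it is not the code accompanying the paper.

```python
# Minimal ConvLSTM cell and a bidirectional pass over a (B, T, C, H, W) slice
# stack. Illustrative assumption of how a BDC-LSTM block could be wired between
# a U-Net encoder and decoder; channel sizes and fusion are hypothetical.
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    def __init__(self, in_ch: int, hidden_ch: int, kernel_size: int = 3):
        super().__init__()
        padding = kernel_size // 2
        # One convolution produces all four gates (input, forget, output, candidate).
        self.gates = nn.Conv2d(in_ch + hidden_ch, 4 * hidden_ch, kernel_size, padding=padding)
        self.hidden_ch = hidden_ch

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        i, f, o, g = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o), torch.tanh(g)
        c = f * c + i * g          # update the cell state
        h = o * torch.tanh(c)      # emit the hidden feature map for this slice
        return h, c

def bidirectional_convlstm(cell_fwd, cell_bwd, slices):
    """Run a ConvLSTM forward and backward over the slice axis and concatenate
    both hidden sequences along the channel axis."""
    B, T, C, H, W = slices.shape
    hf = cf = hb = cb = slices.new_zeros(B, cell_fwd.hidden_ch, H, W)
    fwd, bwd = [], [None] * T
    for t in range(T):
        hf, cf = cell_fwd(slices[:, t], (hf, cf))
        fwd.append(hf)
    for t in reversed(range(T)):
        hb, cb = cell_bwd(slices[:, t], (hb, cb))
        bwd[t] = hb
    return torch.stack([torch.cat([f, b], dim=1) for f, b in zip(fwd, bwd)], dim=1)

# Example: 5 adjacent slices of 32-channel encoder features at 64x64 resolution.
if __name__ == "__main__":
    feats = torch.randn(2, 5, 32, 64, 64)
    out = bidirectional_convlstm(ConvLSTMCell(32, 16), ConvLSTMCell(32, 16), feats)
    print(out.shape)  # torch.Size([2, 5, 32, 64, 64])
```

Because the gates are convolutional rather than fully connected, the recurrence preserves the spatial layout of each slice while propagating context between neighboring slices in both directions, which is the general motivation for using a bidirectional ConvLSTM in this setting.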
Affiliation(s)
- Kelvin K L Wong
- School of Information and Electronics, Hunan City University, Yiyang 413000, China.
- Wanni Xu
- College of Technology and Engineering, National Taiwan Normal University, Taipei 106, Taiwan; Department of Computer Information Engineering, Nanchang Institute of Technology, Nanchang 330044, China.
- Muhammad Ayoub
- School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- You-Lei Fu
- College of Technology and Engineering, National Taiwan Normal University, Taipei 106, Taiwan; Department of Computer Information Engineering, Nanchang Institute of Technology, Nanchang 330044, China
- Huasen Xu
- Department of Civil Engineering, Shanghai Normal University, Shanghai 201418, China
- Ruizheng Shi
- National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
- Mu Zhang
- Department of Emergency, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
- Feng Su
- Department of Emergency, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
- Zhiguo Huang
- Department of Emergency, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
- Weimin Chen
- School of Information and Electronics, Hunan City University, Yiyang 413000, China