1
Lim H, Gi Y, Ko Y, Jo Y, Hong J, Kim J, Ahn SH, Park HC, Kim H, Chung K, Yoon M. A device-dependent auto-segmentation method based on combined generalized and single-device datasets. Med Phys 2025; 52:2375-2383. PMID: 39699056. DOI: 10.1002/mp.17570.
Abstract
BACKGROUND Although generalized-dataset-based auto-segmentation models that account for various computed tomography (CT) scanners have shown great clinical potential, applying them to medical images from unseen scanners remains challenging because of device-dependent image features. PURPOSE This study investigates the performance of a device-dependent auto-segmentation model trained on a generalized dataset combined with data from a single CT scanner. METHODS We constructed two training datasets covering 21 chest and abdominal organs. The generalized dataset comprised 1203 publicly available scans from multiple CT scanners. The device-dependent dataset comprised 1253 scans: the same 1203 multi-scanner scans plus 50 scans from a single CT scanner. Using these datasets, a generalized-dataset-based model (GDSM) and a device-dependent-dataset-based model (DDSM) were trained with nnU-Net and tested on ten scans from the single CT scanner. The evaluation metrics included the Dice similarity coefficient (DSC), the Hausdorff distance (HD), and the average symmetric surface distance (ASSD), which were used to assess overall model performance. In addition, DSCdiff, HDratio, and ASSDratio, variants of these three metrics, were used to compare model performance across organs. RESULTS The average DSC was 0.9251 for the GDSM and 0.9323 for the DDSM; the average HD was 10.66 mm versus 9.139 mm; and the average ASSD was 0.8318 mm versus 0.6656 mm. Compared with the GDSM, the DDSM thus showed consistent improvements of 0.78%, 14%, and 20% on the DSC, HD, and ASSD metrics, respectively. The DDSM also had better DSCdiff values in 14 of the 21 tested organs, better HDratio values in 13 of 21, and better ASSDratio values in 14 of 21, and the averages of all three variant metrics favored the DDSM. CONCLUSION The results suggest that combining a generalized dataset with a single-scanner dataset yields an overall improvement in model performance on images from that device.
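For readers implementing the three core evaluation metrics named above, a minimal NumPy/SciPy sketch of DSC, HD, and ASSD for binary masks follows (not the authors' code; the study-specific DSCdiff, HDratio, and ASSDratio variants are omitted):

```python
import numpy as np
from scipy import ndimage

def dice(a, b):
    """Dice similarity coefficient (DSC) between two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

def surface_distances(a, b, spacing=(1.0, 1.0)):
    """Distances from the surface voxels of mask a to the surface of mask b."""
    a, b = a.astype(bool), b.astype(bool)
    surf_a = a ^ ndimage.binary_erosion(a)   # surface = mask minus its erosion
    surf_b = b ^ ndimage.binary_erosion(b)
    # Euclidean distance map to b's surface, sampled at a's surface voxels
    dist_to_b = ndimage.distance_transform_edt(~surf_b, sampling=spacing)
    return dist_to_b[surf_a]

def hd_and_assd(a, b, spacing=(1.0, 1.0)):
    """Hausdorff distance (HD) and average symmetric surface distance (ASSD)."""
    d_ab = surface_distances(a, b, spacing)
    d_ba = surface_distances(b, a, spacing)
    hd = max(d_ab.max(), d_ba.max())
    assd = (d_ab.sum() + d_ba.sum()) / (len(d_ab) + len(d_ba))
    return hd, assd
```

The `spacing` argument carries the voxel size so that HD and ASSD come out in millimetres, matching how these metrics are reported above.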
Affiliation(s)
- Hyeongjin Lim
- Department of Bio-medical Engineering, Korea University, Seoul, Republic of Korea

- Yongha Gi
- Department of Bio-medical Engineering, Korea University, Seoul, Republic of Korea

- Yousun Ko
- Department of Bio-medical Engineering, Korea University, Seoul, Republic of Korea

- Yunhui Jo
- Institute of Global Health Technology (IGHT), Korea University, Seoul, Republic of Korea

- Jinyoung Hong
- Department of Bio-medical Engineering, Korea University, Seoul, Republic of Korea

- Sung Hwan Ahn
- Department of Radiation Oncology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea

- Hee-Chul Park
- Department of Radiation Oncology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea

- Haeyoung Kim
- Department of Radiation Oncology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea

- Kwangzoo Chung
- Department of Radiation Oncology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea

- Myonggeun Yoon
- Department of Bio-medical Engineering, Korea University, Seoul, Republic of Korea
- FieldCure Ltd, Seoul, Republic of Korea
2
Tang R, Zhao H, Tong Y, Mu R, Wang Y, Zhang S, Zhao Y, Wang W, Zhang M, Liu Y, Gao J. A frequency attention-embedded network for polyp segmentation. Sci Rep 2025; 15:4961. PMID: 39929863. PMCID: PMC11811025. DOI: 10.1038/s41598-025-88475-6.
Abstract
Gastrointestinal polyps are observed and treated under endoscopy, which makes accurate segmentation of polyps in endoscopic images both clinically important and technically challenging. Current methodologies often falter in distinguishing complex polyp structures within diverse mucosal tissue environments. In this paper, we propose the Frequency Attention-Embedded Network (FAENet), a novel approach that leverages frequency-based attention mechanisms to significantly enhance polyp segmentation accuracy. FAENet segregates image data into high- and low-frequency components and processes them with intra-component and cross-component attention mechanisms, enabling precise delineation of polyp boundaries and internal structures. This design not only preserves essential edge details but also attentively refines the learned representation, ensuring robust segmentation across varied imaging conditions. Comprehensive evaluations on two public datasets, Kvasir-SEG and CVC-ClinicDB, demonstrate FAENet's superiority over several state-of-the-art models in terms of Dice coefficient, Intersection over Union (IoU), sensitivity, and specificity. The results affirm that FAENet's advanced attention mechanisms significantly improve segmentation quality, outperforming traditional and contemporary techniques, and suggest its potential to improve polyp segmentation in clinical practice, supporting accurate diagnosis and efficient treatment of gastrointestinal polyps.
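FAENet's decomposition is learned and attention-based; as a rough standalone illustration of what "splitting an image into high- and low-frequency components" means, here is a hedged NumPy sketch using an ideal Fourier-domain low-pass mask (the `cutoff` parameter is an arbitrary assumption, not taken from the paper):

```python
import numpy as np

def frequency_split(img, cutoff=0.1):
    """Split a 2-D grayscale image into low- and high-frequency parts
    using an ideal low-pass mask in the centred Fourier domain.
    `cutoff` is the mask radius as a fraction of the image diagonal."""
    h, w = img.shape
    spec = np.fft.fftshift(np.fft.fft2(img))      # centred spectrum
    yy, xx = np.ogrid[:h, :w]
    radius = cutoff * np.hypot(h, w)
    mask = (yy - h / 2.0) ** 2 + (xx - w / 2.0) ** 2 <= radius ** 2
    low = np.real(np.fft.ifft2(np.fft.ifftshift(spec * mask)))
    high = img - low                              # residual = high frequencies
    return low, high
```

By construction the two components sum back to the input, so edge detail (high band) and smooth context (low band) can be processed separately and recombined, which is the intuition behind frequency-attention designs.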
Affiliation(s)
- Rui Tang
- Department of Orthopedics, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450000, China

- Hejing Zhao
- Research Center on Flood and Drought Disaster Reduction of Ministry of Water Resource, China Institute of Water Resources and Hydropower Research, Beijing, 100038, China
- Water History Department, China Institute of Water Resources and Hydropower Research, Beijing, 100038, China

- Yao Tong
- School of Artificial Intelligence and Information Technology, Nanjing University of Chinese Medicine, Nanjing, 210023, China
- Jiangsu Province Engineering Research Center of TCM Intelligence Health Service, Nanjing University of Chinese Medicine, Nanjing, 210023, China

- Ruihui Mu
- College of Computer and Information, Xinxiang University, Xinxiang, 453000, China

- Yuqiang Wang
- Department of Orthopedics, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450000, China

- Shuhao Zhang
- Department of Orthopedics, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450000, China

- Yao Zhao
- Department of Orthopedics, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450000, China

- Weidong Wang
- Department of Orthopedics, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450000, China

- Min Zhang
- Department of Orthopedics, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450000, China

- Yilin Liu
- Department of Orthopedics, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450000, China

- Jianbo Gao
- Department of Radiology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
3
Pomohaci MD, Grasu MC, Băicoianu-Nițescu AŞ, Enache RM, Lupescu IG. Systematic Review: AI Applications in Liver Imaging with a Focus on Segmentation and Detection. Life (Basel) 2025; 15:258. PMID: 40003667. PMCID: PMC11856300. DOI: 10.3390/life15020258.
Abstract
The liver is a frequent focus in radiology due to its diverse pathology, and artificial intelligence (AI) could improve diagnosis and management. This systematic review aimed to assess and categorize research studies on AI applications in liver radiology from 2018 to 2024, classifying them according to area of interest (AOI), AI task, and imaging modality used. We excluded reviews as well as non-liver and non-radiology studies. Following the PRISMA guidelines, we identified 6680 articles from the PubMed/Medline, Scopus and Web of Science databases, of which 1232 were eligible. A further analysis of a subgroup of 329 studies focused on detection and/or segmentation tasks was performed. Liver lesions were the main AOI and CT was the most popular modality, while classification was the predominant AI task. Nearly half of the detection and/or segmentation studies (48.02%) used only public datasets, and 27.65% used only one public dataset. Code sharing was practiced by 10.94% of these articles. This review highlights the predominance of classification tasks, especially applied to liver lesion imaging, most often using CT. Detection and/or segmentation tasks relied mostly on public datasets, while external testing and code sharing were lacking. Future research should explore multi-task models and improve dataset availability to enhance AI's clinical impact in liver imaging.
Affiliation(s)
- Mihai Dan Pomohaci
- Department 8: Radiology, Discipline of Radiology, Medical Imaging and Interventional Radiology I, University of Medicine and Pharmacy “Carol Davila”, 050474 Bucharest, Romania
- Department of Radiology and Medical Imaging, Fundeni Clinical Institute, 022328 Bucharest, Romania

- Mugur Cristian Grasu
- Department 8: Radiology, Discipline of Radiology, Medical Imaging and Interventional Radiology I, University of Medicine and Pharmacy “Carol Davila”, 050474 Bucharest, Romania
- Department of Radiology and Medical Imaging, Fundeni Clinical Institute, 022328 Bucharest, Romania

- Alexandru-Ştefan Băicoianu-Nițescu
- Department 8: Radiology, Discipline of Radiology, Medical Imaging and Interventional Radiology I, University of Medicine and Pharmacy “Carol Davila”, 050474 Bucharest, Romania
- Department of Radiology and Medical Imaging, Fundeni Clinical Institute, 022328 Bucharest, Romania

- Robert Mihai Enache
- Department of Radiology and Medical Imaging, Fundeni Clinical Institute, 022328 Bucharest, Romania

- Ioana Gabriela Lupescu
- Department 8: Radiology, Discipline of Radiology, Medical Imaging and Interventional Radiology I, University of Medicine and Pharmacy “Carol Davila”, 050474 Bucharest, Romania
- Department of Radiology and Medical Imaging, Fundeni Clinical Institute, 022328 Bucharest, Romania
4
Guo T, Luan J, Gao J, Liu B, Shen T, Yu H, Ma G, Wang K. Computer-aided diagnosis of pituitary microadenoma on dynamic contrast-enhanced MRI based on spatio-temporal features. Expert Syst Appl 2025; 260:125414. DOI: 10.1016/j.eswa.2024.125414.
5
Rohilla S, Jain S. Detection of Brain Tumor Employing Residual Network-based Optimized Deep Learning. Curr Comput Aided Drug Des 2025; 21:15-27. PMID: 37587819. DOI: 10.2174/1573409920666230816090626.
Abstract
BACKGROUND Diagnosis and treatment planning play a vital role in improving the survival of oncological patients. However, the high variability in tumor shape, size, and structure makes automatic segmentation difficult. This paper proposes automatic and accurate detection and segmentation methods for brain tumors. METHODS A modified ResNet50 model was used for tumor detection, and a ResUNet-based convolutional neural network is proposed for segmentation. Detection and segmentation were performed on the same dataset, consisting of pre-contrast, FLAIR, and post-contrast MRI images of 110 patients collected from The Cancer Imaging Archive (TCIA). Owing to the use of residual networks, the authors observed improvements in evaluation parameters such as accuracy for tumor detection and the Dice similarity coefficient (DSC) for tumor segmentation. RESULTS The accuracy of tumor detection and the Dice similarity coefficient achieved by the segmentation model were 96.77% and 0.893, respectively, on the TCIA dataset. The results were compared against manual segmentation and existing segmentation techniques. The tumor mask was also individually compared to the ground truth using the SSIM value. The proposed detection and segmentation models were validated on the BraTS2015 and BraTS2017 datasets, with consistent results. CONCLUSION The use of residual networks in both the detection and segmentation models resulted in improved accuracy and DSC: the DSC increased by 5.9% compared with the UNet model, and the accuracy of the model increased from 92% to 96.77% on the test set.
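The abstract credits the gains to residual connections. As a generic one-function illustration of the idea (not the authors' ResNet50/ResUNet architecture), a residual block adds an identity shortcut around a learned transform, so the block only has to model the residual:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Minimal fully-connected residual block: y = ReLU(x + W2·ReLU(W1·x)).
    The identity shortcut lets the signal (and gradients) bypass the learned
    transform, the property credited with easing training of deep networks."""
    return relu(x + w2 @ relu(w1 @ x))
```

With zero weights the block reduces to the identity on non-negative inputs, which is why stacking many such blocks does not degrade an already-good shallow solution.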
Affiliation(s)
- Saransh Rohilla
- Department of Electronics and Communication Engineering, Jaypee University of Information Technology, Solan, Himachal Pradesh, India

- Shruti Jain
- Department of Electronics and Communication Engineering, Jaypee University of Information Technology, Solan, Himachal Pradesh, India
6
Dumbrique JIS, Hernandez RB, Cruz JML, Pagdanganan RM, Naval PC. Pneumothorax detection and segmentation from chest X-ray radiographs using a patch-based fully convolutional encoder-decoder network. Front Radiol 2024; 4:1424065. PMID: 39722784. PMCID: PMC11668597. DOI: 10.3389/fradi.2024.1424065.
Abstract
Pneumothorax, a life-threatening condition characterized by air accumulation in the pleural cavity, requires early and accurate detection for optimal patient outcomes. Chest X-ray radiographs are a common diagnostic tool due to their speed and affordability. However, detecting pneumothorax can be challenging for radiologists because the sole visual indicator is often a thin displaced pleural line. This research explores deep learning techniques to automate and improve the detection and segmentation of pneumothorax from chest X-ray radiographs. We propose a novel architecture that combines the advantages of fully convolutional neural networks (FCNNs) and Vision Transformers (ViTs) while using only convolutional modules to avoid the quadratic complexity of ViT's self-attention mechanism. This architecture utilizes a patch-based encoder-decoder structure with skip connections to effectively combine high-level and low-level features. Compared to prior research and baseline FCNNs, our model demonstrates significantly higher accuracy in detection and segmentation while maintaining computational efficiency. This is evident on two datasets: (1) the SIIM-ACR Pneumothorax Segmentation dataset and (2) a novel dataset we curated from The Medical City, a private hospital in the Philippines. Ablation studies further reveal that using a mixed Tversky and Focal loss function significantly improves performance compared to using solely the Tversky loss. Our findings suggest our model has the potential to improve diagnostic accuracy and efficiency in pneumothorax detection, potentially aiding radiologists in clinical settings.
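The paper's exact loss formulation is not given in the abstract; a common way to mix Tversky and focal terms for binary segmentation looks like the following NumPy sketch (the weights `alpha`, `beta`, `gamma`, and `w` are illustrative defaults, not the paper's values):

```python
import numpy as np

def tversky_loss(pred, target, alpha=0.7, beta=0.3, eps=1e-7):
    """Tversky loss for soft binary predictions in [0, 1].
    alpha weights false positives, beta false negatives;
    alpha = beta = 0.5 reduces to the soft Dice loss."""
    pred, target = pred.ravel(), target.ravel()
    tp = np.sum(pred * target)
    fp = np.sum(pred * (1 - target))
    fn = np.sum((1 - pred) * target)
    return 1.0 - (tp + eps) / (tp + alpha * fp + beta * fn + eps)

def focal_loss(pred, target, gamma=2.0, eps=1e-7):
    """Binary focal loss: down-weights easy, well-classified pixels."""
    pred = np.clip(pred.ravel(), eps, 1 - eps)
    target = target.ravel()
    ce = -(target * np.log(pred) + (1 - target) * np.log(1 - pred))
    p_t = np.where(target == 1, pred, 1 - pred)
    return np.mean((1 - p_t) ** gamma * ce)

def mixed_loss(pred, target, w=0.5):
    """Convex combination of the Tversky and focal terms."""
    return w * tversky_loss(pred, target) + (1 - w) * focal_loss(pred, target)
```

Pairing an overlap-based term (Tversky) with a pixel-wise term (focal) is a standard remedy for the extreme foreground/background imbalance of thin structures such as a displaced pleural line.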
Affiliation(s)
- Jakov Ivan S. Dumbrique
- Computer Vision and Machine Intelligence Group, Department of Computer Science, University of the Philippines-Diliman, Quezon City, Philippines
- Department of Mathematics, Ateneo de Manila University, Quezon City, Philippines

- Reynan B. Hernandez
- Ateneo School of Medicine and Public Health, Pasig, Philippines
- Department of Radiology, The Medical City, Pasig, Philippines

- Prospero C. Naval
- Computer Vision and Machine Intelligence Group, Department of Computer Science, University of the Philippines-Diliman, Quezon City, Philippines
7
Gul S, Khan MS, Hossain MSA, Chowdhury MEH, Sumon MSI. A Comparative Study of Decoders for Liver and Tumor Segmentation Using a Self-ONN-Based Cascaded Framework. Diagnostics (Basel) 2024; 14:2761. PMID: 39682669. DOI: 10.3390/diagnostics14232761.
Abstract
Background/Objectives: Accurate liver and tumor detection and segmentation are crucial in the diagnosis of early-stage liver malignancies. As opposed to manual interpretation, which is a difficult and time-consuming process, accurate tumor detection using a computer-aided diagnosis system can save both time and effort. Methods: We propose a cascaded encoder-decoder technique based on self-organized neural networks (Self-ONNs), a recent variant of operational neural networks (ONNs), for accurate segmentation and identification of liver tumors. The first encoder-decoder network segments the liver. To generate the liver region of interest, the segmented liver mask is applied to the input computed tomography (CT) image, which is then fed to the second Self-ONN model for tumor segmentation. Three further encoder-decoder architectures, U-Net, feature pyramid network (FPN), and U-Net++, were also investigated, with the encoder backbone varied among ResNet and DenseNet variants for transfer learning. Results: For the liver segmentation task, Self-ONN with a ResNet18 backbone achieved a Dice similarity coefficient (DSC) of 98.182% and an intersection over union (IoU) of 97.436%. Tumor segmentation with Self-ONN and the DenseNet201 encoder resulted in an outstanding DSC of 92.836% and an IoU of 91.748%. Conclusions: The suggested method can precisely locate liver tumors of various sizes and shapes, including small lesion patches that earlier research reported as challenging to find.
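The cascade's hand-off, masking the CT with the stage-1 liver prediction before stage-2 tumor segmentation, can be sketched in one function (the -1000 HU air fill value is an assumption, not stated in the abstract):

```python
import numpy as np

def cascade_roi(ct_slice, liver_mask, fill=-1000.0):
    """Stage-1 output gates stage-2 input: keep CT intensities inside the
    predicted liver mask and replace everything outside it with a uniform
    background value (here air, -1000 HU), so the tumor model only ever
    sees liver tissue."""
    return np.where(liver_mask.astype(bool), ct_slice, fill)
```

Restricting the second model's input this way shrinks its search space, which is the usual motivation for organ-then-lesion cascades.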
Affiliation(s)
- Sidra Gul
- Department of Computer Systems Engineering, University of Engineering and Technology, Peshawar 25000, Pakistan
- Artificial Intelligence in Healthcare, Intelligent Information Processing Lab, National Center of Artificial Intelligence, Peshawar 25000, Pakistan

- Muhammad Salman Khan
- Department of Electrical Engineering, College of Engineering, Qatar University, Doha 2713, Qatar

- Md Sakib Abrar Hossain
- Department of Electrical Engineering, College of Engineering, Qatar University, Doha 2713, Qatar

- Muhammad E H Chowdhury
- Department of Electrical Engineering, College of Engineering, Qatar University, Doha 2713, Qatar

- Md Shaheenur Islam Sumon
- Department of Electrical Engineering, College of Engineering, Qatar University, Doha 2713, Qatar
8
Azad R, Aghdam EK, Rauland A, Jia Y, Avval AH, Bozorgpour A, Karimijafarbigloo S, Cohen JP, Adeli E, Merhof D. Medical Image Segmentation Review: The Success of U-Net. IEEE Trans Pattern Anal Mach Intell 2024; 46:10076-10095. PMID: 39167505. DOI: 10.1109/tpami.2024.3435571.
Abstract
Automatic medical image segmentation is a crucial topic in the medical domain and a critical component of the computer-aided diagnosis paradigm. U-Net is the most widespread image segmentation architecture due to its flexibility, optimized modular design, and success across all medical image modalities. Over the years, the U-Net model has received tremendous attention from academic and industrial researchers who have extended it to address the scale and complexity of medical tasks. These extensions commonly enhance the U-Net's backbone, bottleneck, or skip connections, incorporate representation learning, combine it with a Transformer architecture, or address probabilistic prediction of the segmentation map. Having a compendium of previously proposed U-Net variants makes it easier for machine learning researchers to identify relevant research questions and understand the challenges of the biomedical tasks the model must address. In this work, we discuss the practical aspects of the U-Net model and organize each variant into a taxonomy. Moreover, to measure the performance of these strategies in a clinical application, we propose fair evaluations of some unique and well-known designs on well-known datasets. Furthermore, we provide a comprehensive implementation library with trained models, and, for ease of future studies, we created an online list of U-Net papers with their official implementations where available.
9
Qin J, Luo H, He F, Qin G. DSA-Former: A Network of Hybrid Variable Structures for Liver and Liver Tumour Segmentation. Int J Med Robot 2024; 20:e70004. PMID: 39535347. DOI: 10.1002/rcs.70004.
Abstract
BACKGROUND Accurately annotated CT images of liver tumours can effectively assist doctors in diagnosing and treating liver cancer. However, because of the low density contrast between the liver, its tumours, and the surrounding tissues, and because the structures involved span multiple scales, accurate automatic segmentation still faces challenges. METHODS We propose DSA-Former, a segmentation network that combines convolutional kernels and attention. It combines the morphological and edge features of liver tumour images, captures global and local features along with key inter-layer information, and integrates attention mechanisms to obtain detailed information and improve segmentation accuracy. RESULTS Compared to other methods, our approach demonstrates significant advantages in evaluation metrics such as the Dice coefficient, IoU, VOE, and HD95. Specifically, we achieve Dice coefficients of 96.8% for liver segmentation and 72.2% for liver tumour segmentation. CONCLUSION Our method offers enhanced precision in segmenting liver and liver tumour images, laying a robust foundation for liver cancer diagnosis and treatment.
Affiliation(s)
- Jun Qin
- School of Computer Science and Technology, Changchun University of Science and Technology, Changchun, China

- Huizhen Luo
- School of Computer Science and Technology, Changchun University of Science and Technology, Changchun, China

- Fei He
- School of Computer Science and Technology, Changchun University of Science and Technology, Changchun, China

- Guihe Qin
- School of Computer Science and Technology, Jilin University, Changchun, China
10
Bereska JI, Zeeuw M, Wagenaar L, Jenssen HB, Wesdorp NJ, van der Meulen D, Bereska LF, Gavves E, Janssen BV, Besselink MG, Marquering HA, van Waesberghe JHTM, Aghayan DL, Pelanis E, van den Bergh J, Nota IIM, Moos S, Kemmerich G, Syversveen T, Kolrud FK, Huiskens J, Swijnenburg RJ, Punt CJA, Stoker J, Edwin B, Fretland ÅA, Kazemier G, Verpalen IM. Development and external evaluation of a self-learning auto-segmentation model for Colorectal Cancer Liver Metastases Assessment (COALA). Insights Imaging 2024; 15:279. PMID: 39576456. PMCID: PMC11584830. DOI: 10.1186/s13244-024-01820-7.
Abstract
OBJECTIVES Total tumor volume (TTV) is associated with overall and recurrence-free survival in patients with colorectal cancer liver metastases (CRLM). However, the labor-intensive nature of manual assessment has hampered the clinical adoption of TTV as an imaging biomarker. This study aimed to develop and externally evaluate a CRLM auto-segmentation model on CT scans, to facilitate the clinical adoption of TTV. METHODS We developed an auto-segmentation model to segment CRLM using 783 contrast-enhanced portal venous phase CT scans (CT-PVP) of 373 patients. We used a self-learning setup: a teacher model was first trained on 99 CT-PVPs manually segmented by three radiologists, and was then used to segment CRLM in the remaining 663 CT-PVPs for training the student model. We used the Dice score and the intraclass correlation coefficient (ICC) to compare the student model's segmentations, and the TTV obtained from them, to those obtained from the merged radiologist segmentations. The student model was evaluated on an external test set of 50 CT-PVPs from 35 patients from Oslo University Hospital and an internal test set of 21 CT-PVPs from 10 patients from the Amsterdam University Medical Centers. RESULTS The model reached a mean Dice score of 0.85 (IQR: 0.05) on the internal test set and 0.83 (IQR: 0.10) on the external test set. The ICC between the volumes from the student model and from the merged segmentations was 0.97 on both test sets. CONCLUSION The developed colorectal cancer liver metastases auto-segmentation model achieved a high Dice score and near-perfect agreement for assessing TTV. CRITICAL RELEVANCE STATEMENT An AI model segments colorectal liver metastases on CT with high performance on two test sets. Accurate segmentation of colorectal liver metastases could facilitate the clinical adoption of total tumor volume as an imaging biomarker for prognosis and treatment response monitoring.
KEY POINTS Developed a colorectal liver metastases segmentation model to facilitate total tumor volume assessment. The model achieved high performance on internal and external test sets. The model can improve prognostic stratification and treatment planning for colorectal liver metastases.
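The teacher-student self-learning setup described above can be illustrated end-to-end on toy data with a stand-in classifier (a nearest-centroid model here, purely for illustration; the study used CT segmentation networks):

```python
import numpy as np

def fit_centroids(X, y):
    """Stand-in 'model': class centroids of the training data."""
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def predict(model, X):
    """Assign each sample to the nearest class centroid."""
    classes = sorted(model)
    d = np.stack([np.linalg.norm(X - model[c], axis=1) for c in classes])
    return np.array(classes)[d.argmin(axis=0)]

# toy data: a small manually labeled set and a large unlabeled pool
rng = np.random.default_rng(1)
X_lab = np.vstack([rng.normal(0, 0.5, (10, 2)), rng.normal(3, 0.5, (10, 2))])
y_lab = np.array([0] * 10 + [1] * 10)
X_unlab = np.vstack([rng.normal(0, 0.5, (100, 2)), rng.normal(3, 0.5, (100, 2))])

teacher = fit_centroids(X_lab, y_lab)   # 1) train teacher on manual labels
pseudo = predict(teacher, X_unlab)      # 2) teacher pseudo-labels the pool
student = fit_centroids(                # 3) train student on labels + pseudo-labels
    np.vstack([X_lab, X_unlab]), np.concatenate([y_lab, pseudo]))
```

The point of the pattern is that expensive manual annotation is only needed for the small teacher set; the student then learns from a much larger, automatically labeled pool.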
Affiliation(s)
- Jacqueline I Bereska
- Cancer Center Amsterdam, Amsterdam, The Netherlands
- Amsterdam UMC, University of Amsterdam, Department of Radiology and Nuclear Medicine, Amsterdam, The Netherlands
- Amsterdam UMC, University of Amsterdam, Department of Biomedical Engineering and Physics, Amsterdam, The Netherlands

- Michiel Zeeuw
- Cancer Center Amsterdam, Amsterdam, The Netherlands
- Amsterdam UMC, Vrije Universiteit Amsterdam, Department of Surgery, Amsterdam, The Netherlands

- Luuk Wagenaar
- Cancer Center Amsterdam, Amsterdam, The Netherlands
- Amsterdam UMC, University of Amsterdam, Department of Biomedical Engineering and Physics, Amsterdam, The Netherlands

- Håvard Bjørke Jenssen
- Oslo University Hospital, Department of Radiology and Nuclear Medicine, Oslo, Norway

- Nina J Wesdorp
- Cancer Center Amsterdam, Amsterdam, The Netherlands
- Amsterdam UMC, Vrije Universiteit Amsterdam, Department of Surgery, Amsterdam, The Netherlands

- Delanie van der Meulen
- Cancer Center Amsterdam, Amsterdam, The Netherlands
- Amsterdam UMC, Vrije Universiteit Amsterdam, Department of Surgery, Amsterdam, The Netherlands

- Leonard F Bereska
- University of Amsterdam, Video and Image Sense Lab, Amsterdam, The Netherlands

- Efstratios Gavves
- University of Amsterdam, Video and Image Sense Lab, Amsterdam, The Netherlands

- Boris V Janssen
- Cancer Center Amsterdam, Amsterdam, The Netherlands
- Amsterdam Gastroenterology Endocrinology and Metabolism, Amsterdam, The Netherlands
- Amsterdam UMC, University of Amsterdam, Department of Surgery, Amsterdam, The Netherlands

- Marc G Besselink
- Cancer Center Amsterdam, Amsterdam, The Netherlands
- Amsterdam Gastroenterology Endocrinology and Metabolism, Amsterdam, The Netherlands
- Amsterdam UMC, University of Amsterdam, Department of Surgery, Amsterdam, The Netherlands

- Henk A Marquering
- Cancer Center Amsterdam, Amsterdam, The Netherlands
- Amsterdam UMC, University of Amsterdam, Department of Radiology and Nuclear Medicine, Amsterdam, The Netherlands
- Amsterdam UMC, University of Amsterdam, Department of Biomedical Engineering and Physics, Amsterdam, The Netherlands

- Jan-Hein T M van Waesberghe
- Cancer Center Amsterdam, Amsterdam, The Netherlands
- Amsterdam UMC, Vrije Universiteit Amsterdam, Department of Radiology and Nuclear Medicine, Amsterdam, The Netherlands

- Davit L Aghayan
- Oslo University Hospital, Department of Hepato-Pancreato-Biliary Surgery, Oslo, Norway
- Oslo University Hospital, The Intervention Centre, Oslo, Norway

- Egidijus Pelanis
- Oslo University Hospital, Department of Hepato-Pancreato-Biliary Surgery, Oslo, Norway
- Oslo University Hospital, The Intervention Centre, Oslo, Norway

- Janneke van den Bergh
- Cancer Center Amsterdam, Amsterdam, The Netherlands
- Amsterdam UMC, Vrije Universiteit Amsterdam, Department of Radiology and Nuclear Medicine, Amsterdam, The Netherlands

- Irene I M Nota
- Cancer Center Amsterdam, Amsterdam, The Netherlands
- Amsterdam UMC, Vrije Universiteit Amsterdam, Department of Radiology and Nuclear Medicine, Amsterdam, The Netherlands

- Shira Moos
- Cancer Center Amsterdam, Amsterdam, The Netherlands
- Amsterdam UMC, Vrije Universiteit Amsterdam, Department of Radiology and Nuclear Medicine, Amsterdam, The Netherlands

- Gunter Kemmerich
- Oslo University Hospital, Department of Radiology and Nuclear Medicine, Oslo, Norway

- Trygve Syversveen
- Oslo University Hospital, Department of Radiology and Nuclear Medicine, Oslo, Norway

- Finn Kristian Kolrud
- Oslo University Hospital, Department of Radiology and Nuclear Medicine, Oslo, Norway

- Joost Huiskens
- Cancer Center Amsterdam, Amsterdam, The Netherlands
- Amsterdam UMC, Vrije Universiteit Amsterdam, Department of Surgery, Amsterdam, The Netherlands

- Rutger-Jan Swijnenburg
- Cancer Center Amsterdam, Amsterdam, The Netherlands
- Amsterdam UMC, Vrije Universiteit Amsterdam, Department of Radiology and Nuclear Medicine, Amsterdam, The Netherlands

- Cornelis J A Punt
- Amsterdam UMC, University of Amsterdam, Department of Medical Oncology, Amsterdam, The Netherlands
- Department of Epidemiology, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands

- Jaap Stoker
- Cancer Center Amsterdam, Amsterdam, The Netherlands
- Amsterdam UMC, University of Amsterdam, Department of Radiology and Nuclear Medicine, Amsterdam, The Netherlands
- Amsterdam Gastroenterology Endocrinology and Metabolism, Amsterdam, The Netherlands

- Bjørn Edwin
- Oslo University Hospital, Department of Hepato-Pancreato-Biliary Surgery, Oslo, Norway
- Oslo University Hospital, The Intervention Centre, Oslo, Norway

- Åsmund A Fretland
- Oslo University Hospital, Department of Hepato-Pancreato-Biliary Surgery, Oslo, Norway
- Oslo University Hospital, The Intervention Centre, Oslo, Norway

- Geert Kazemier
- Cancer Center Amsterdam, Amsterdam, The Netherlands
- Amsterdam UMC, Vrije Universiteit Amsterdam, Department of Surgery, Amsterdam, The Netherlands

- Inez M Verpalen
- Cancer Center Amsterdam, Amsterdam, The Netherlands
- Amsterdam UMC, University of Amsterdam, Department of Radiology and Nuclear Medicine, Amsterdam, The Netherlands
11
Delmoral JC, R S Tavares JM. Semantic Segmentation of CT Liver Structures: A Systematic Review of Recent Trends and Bibliometric Analysis: Neural Network-based Methods for Liver Semantic Segmentation. J Med Syst 2024; 48:97. [PMID: 39400739] [PMCID: PMC11473507] [DOI: 10.1007/s10916-024-02115-6]
Abstract
The use of artificial intelligence (AI) in the segmentation of liver structures in medical images has become a popular research focus in the past half-decade. The performance of AI tools for this task varies widely and has been tested in the literature on various datasets. However, no scientometric report has provided a systematic overview of this scientific area. This article presents a systematic and bibliometric review of recent advances in neural network modeling approaches, mainly deep learning, to outline the multiple research directions of the field in terms of algorithmic features. It therefore provides a detailed systematic review of the most relevant publications addressing fully automatic semantic segmentation of liver structures in computed tomography (CT) images in terms of algorithm modeling objective, performance benchmark, and model complexity. The review suggests that fully automatic hybrid 2D and 3D networks are the top performers in semantic segmentation of the liver. For liver tumor and vasculature segmentation, fully automatic generative approaches perform best. However, the reported performance benchmarks indicate that there is still much room for improvement in segmenting such small structures in high-resolution abdominal CT scans.
Affiliation(s)
- Jessica C Delmoral: Instituto de Ciência e Inovação em Engenharia Mecânica e Engenharia Industrial, Faculdade de Engenharia, Universidade do Porto, Rua Dr. Roberto Frias, s/n, 4200-465, Porto, Portugal
- João Manuel R S Tavares: Instituto de Ciência e Inovação em Engenharia Mecânica e Engenharia Industrial, Departamento de Engenharia Mecânica, Faculdade de Engenharia, Universidade do Porto, Rua Dr. Roberto Frias, s/n, 4200-465, Porto, Portugal
12
Wang L, Fatemi M, Alizad A. Artificial intelligence techniques in liver cancer. Front Oncol 2024; 14:1415859. [PMID: 39290245] [PMCID: PMC11405163] [DOI: 10.3389/fonc.2024.1415859]
Abstract
Hepatocellular Carcinoma (HCC), the most common primary liver cancer, is a significant contributor to worldwide cancer-related deaths. Various medical imaging techniques, including computed tomography, magnetic resonance imaging, and ultrasound, play a crucial role in accurately evaluating HCC and formulating effective treatment plans. Artificial Intelligence (AI) technologies have demonstrated potential in supporting physicians by providing more accurate and consistent medical diagnoses. Recent advancements have led to the development of AI-based multi-modal prediction systems. These systems integrate medical imaging with other modalities, such as electronic health record reports and clinical parameters, to enhance the accuracy of predicting biological characteristics and prognosis, including those associated with HCC. These multi-modal prediction systems pave the way for predicting the response to transarterial chemoembolization and microvascular invasion treatments and can assist clinicians in identifying the optimal patients with HCC who could benefit from interventional therapy. This paper provides an overview of the latest AI-based medical imaging models developed for diagnosing and predicting HCC. It also explores the challenges and potential future directions related to the clinical application of AI techniques.
Affiliation(s)
- Lulu Wang: Department of Engineering, School of Technology, Reykjavík University, Reykjavík, Iceland; Department of Physiology and Biomedical Engineering, Mayo Clinic College of Medicine and Science, Rochester, MN, United States
- Mostafa Fatemi: Department of Physiology and Biomedical Engineering, Mayo Clinic College of Medicine and Science, Rochester, MN, United States
- Azra Alizad: Department of Radiology, Mayo Clinic College of Medicine and Science, Rochester, MN, United States
13
d'Albenzio G, Kamkova Y, Naseem R, Ullah M, Colonnese S, Cheikh FA, Kumar RP. A dual-encoder double concatenation Y-shape network for precise volumetric liver and lesion segmentation. Comput Biol Med 2024; 179:108870. [PMID: 39024904] [DOI: 10.1016/j.compbiomed.2024.108870]
Abstract
Accurate segmentation of the liver and tumors from CT volumes is crucial for hepatocellular carcinoma diagnosis and pre-operative resection planning. Despite advances in deep learning-based methods for abdominal CT images, fully automated segmentation remains challenging due to class imbalance and structural variations, often requiring cascaded approaches that incur significant computational costs. In this paper, we present the Dual-Encoder Double Concatenation Network (DEDC-Net) for simultaneous segmentation of the liver and its tumors. DEDC-Net leverages both residual and skip connections to enhance feature reuse and optimize performance in liver and tumor segmentation tasks. Extensive qualitative and quantitative experiments on the LiTS dataset demonstrate that DEDC-Net outperforms existing state-of-the-art liver segmentation methods. An ablation study was conducted to evaluate different encoder backbones - specifically VGG19 and ResNet - and the impact of incorporating an attention mechanism. Our results indicate that DEDC-Net, without any additional attention gates, achieves a superior mean Dice Score (DS) of 0.898 for liver segmentation. Moreover, integrating residual connections into one encoder yielded the highest DS for tumor segmentation tasks. The robustness of our proposed network was further validated on two additional, unseen CT datasets: IRCADb-01 and COMET. Our model demonstrated superior lesion segmentation capabilities, particularly on IRCADb-01, achieving a DS of 0.629. The code implementation is publicly available at this website.
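Since this and several neighboring entries report Dice Scores, a minimal illustrative computation of the metric may help; this is a generic NumPy sketch with toy masks, not code from the cited paper:

```python
import numpy as np

def dice_score(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # both masks empty: perfect agreement by convention
    return 2.0 * np.logical_and(pred, truth).sum() / denom

pred = np.zeros((4, 4), dtype=int)
truth = np.zeros((4, 4), dtype=int)
pred[1:3, 1:3] = 1   # 4 predicted voxels
truth[1:4, 1:4] = 1  # 9 ground-truth voxels, 4 of them overlapping
print(round(dice_score(pred, truth), 4))  # 2*4 / (4+9) = 0.6154
```

The same formula extends unchanged to 3D volumes, since the masks are flattened by the sums.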
Affiliation(s)
- Gabriella d'Albenzio: The Intervention Center, Oslo University Hospital, Oslo, Norway; Department of Informatics, University of Oslo, Oslo, Norway
- Yuliia Kamkova: Department of Informatics, University of Oslo, Oslo, Norway; Department of Research and Development, Division of Emergencies and Critical Care, Oslo University Hospital, Oslo, Norway
- Rabia Naseem: COMSATS University Islamabad, Islamabad, Pakistan
- Mohib Ullah: Department of Computer Science, Norwegian University of Science and Technology, Gjøvik, Norway
- Stefania Colonnese: Department of Information Engineering, Electronics and Telecommunications (DIET), La Sapienza University of Rome, Rome, Italy
- Faouzi Alaya Cheikh: Department of Computer Science, Norwegian University of Science and Technology, Gjøvik, Norway
14
Zuo Y, Zhang B, Dong Y, He W, Bi Y, Liu X, Zeng X, Deng Z. Glypred: Lysine Glycation Site Prediction via CCU-LightGBM-BiLSTM Framework with Multi-Head Attention Mechanism. J Chem Inf Model 2024; 64:6699-6711. [PMID: 39121059] [DOI: 10.1021/acs.jcim.4c01034]
Abstract
Glycation, a type of posttranslational modification, preferentially occurs on lysine and arginine residues, impairing protein functionality and altering characteristics. This process is linked to diseases such as Alzheimer's, diabetes, and atherosclerosis. Traditional wet lab experiments are time-consuming, whereas machine learning has significantly streamlined the prediction of protein glycation sites. Despite promising results, challenges remain, including data imbalance, feature redundancy, and suboptimal classifier performance. This research introduces Glypred, a lysine glycation site prediction model combining ClusterCentroids Undersampling (CCU), LightGBM, and bidirectional long short-term memory network (BiLSTM) methodologies, with an additional multihead attention mechanism integrated into the BiLSTM. To achieve this, the study undertakes several key steps: selecting diverse feature types to capture comprehensive protein information, employing a cluster-based undersampling strategy to balance the data set, using LightGBM for feature selection to enhance model performance, and implementing a bidirectional LSTM network for accurate classification. Together, these approaches ensure that Glypred effectively identifies glycation sites with high accuracy and robustness. For feature encoding, five distinct feature types (AAC, KMER, DR, PWAA, and EBGW) were selected to capture a broad spectrum of protein sequence and biological information. These encoded features were integrated and validated to ensure comprehensive protein information acquisition. To address the issue of highly imbalanced positive and negative samples, various undersampling algorithms, including random undersampling, NearMiss, edited nearest neighbor rule, and CCU, were evaluated. CCU was ultimately chosen to remove redundant nonglycated training data, establishing a balanced data set that enhances the model's accuracy and robustness.
For feature selection, the LightGBM ensemble learning algorithm was employed to reduce feature dimensionality by identifying the most significant features. This approach accelerates model training, enhances generalization capabilities, and ensures good transferability of the model. Finally, a bidirectional long short-term memory network was used as the classifier, with a network structure designed to capture glycation modification site features from both forward and backward directions. To prevent overfitting, appropriate regularization parameters and dropout rates were introduced, achieving efficient classification. Experimental results show that Glypred achieved optimal performance. This model provides new insights for bioinformatics and encourages the application of similar strategies in other fields. A lysine glycation site prediction software tool was also developed using the PyQt5 library, offering researchers an auxiliary screening tool to reduce workload and improve efficiency. The software and data sets are available on GitHub: https://github.com/ZBYnb/Glypred.
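The cluster-centroid undersampling step described above can be sketched generically: replace the abundant negative class with K-means centroids so that both classes end up the same size. This is an illustrative NumPy reimplementation on made-up toy data, not the authors' Glypred pipeline:

```python
import numpy as np

def kmeans_centroids(X: np.ndarray, k: int, iters: int = 50, seed: int = 0) -> np.ndarray:
    """Plain Lloyd's algorithm; returns k centroids summarizing X."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assign each sample to its nearest centroid
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # move each centroid to the mean of its assigned samples
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return centroids

rng = np.random.default_rng(1)
X_neg = rng.normal(0.0, 1.0, size=(300, 5))  # abundant "nonglycated" samples
X_pos = rng.normal(2.0, 1.0, size=(40, 5))   # scarce "glycated" samples
X_neg_reduced = kmeans_centroids(X_neg, k=len(X_pos))
print(X_neg_reduced.shape)  # (40, 5) - majority class reduced to minority size
```

The reduced negative set plus the original positives then form the balanced training set fed to the classifier.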
Affiliation(s)
- Yun Zuo: School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214000, China
- Bangyi Zhang: School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214000, China
- Yinkang Dong: School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214000, China
- Wenying He: School of Artificial Intelligence, Hebei University of Technology, Tianjin 300130, China
- Yue Bi: Department of Biochemistry and Molecular Biology and Biomedicine Discovery Institute, Monash University, Clayton 3800, Australia
- Xiangrong Liu: Department of Computer Science and Technology, National Institute for Data Science in Health and Medicine, Xiamen Key Laboratory of Intelligent Storage and Computing, Xiamen University, Xiamen 361005, China
- Xiangxiang Zeng: School of Information Science and Engineering, Hunan University, Changsha 410012, China
- Zhaohong Deng: School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214000, China
15
Wu H, Min W, Gai D, Huang Z, Geng Y, Wang Q, Chen R. HD-Former: A hierarchical dependency Transformer for medical image segmentation. Comput Biol Med 2024; 178:108671. [PMID: 38870721] [DOI: 10.1016/j.compbiomed.2024.108671]
Abstract
Medical image segmentation is a compelling fundamental problem and an important auxiliary tool for clinical applications. Recently, the Transformer model has emerged as a valuable tool for addressing the limitations of convolutional neural networks (CNNs) by effectively capturing global relationships, and numerous hybrid architectures combining CNNs and Transformers have been devised to enhance segmentation performance. However, these models suffer from multilevel semantic feature gaps and fail to account for multilevel dependencies between space and channel. In this paper, we propose a hierarchical dependency Transformer for medical image segmentation, named HD-Former. First, we utilize a Compressed Bottleneck (CB) module to enrich shallow features and localize the target region. We then introduce the Dual Cross Attention Transformer (DCAT) module to fuse multilevel features and bridge the feature gap. In addition, we design a broad exploration network (BEN) that cascades convolution and self-attention from different perspectives to capture hierarchical, dense contextual semantic features both locally and globally. Finally, we exploit an uncertain multitask edge loss to adaptively map predictions to a consistent feature space, which can optimize segmentation edges. Extensive experiments on medical image segmentation datasets from ISIC, LiTS, Kvasir-SEG, and CVC-ClinicDB demonstrate that our HD-Former surpasses state-of-the-art methods in terms of both subjective visual performance and objective evaluation. Code: https://github.com/barcelonacontrol/HD-Former.
Affiliation(s)
- Haifan Wu: School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China
- Weidong Min: School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China; Institute of Metaverse, Nanchang University, Nanchang, 330031, China; Jiangxi Key Laboratory of Virtual Reality, Nanchang, 330031, China
- Di Gai: School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China; Institute of Metaverse, Nanchang University, Nanchang, 330031, China; Jiangxi Key Laboratory of Virtual Reality, Nanchang, 330031, China
- Zheng Huang: School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China
- Yuhan Geng: School of Public Health, University of Michigan, Ann Arbor, MI, 48105, USA
- Qi Wang: School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China; Institute of Metaverse, Nanchang University, Nanchang, 330031, China; Jiangxi Key Laboratory of Virtual Reality, Nanchang, 330031, China
- Ruibin Chen: School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China; Information Department, The First Affiliated Hospital of Nanchang University, Nanchang, 330096, China
16
Xuan P, Chu X, Cui H, Nakaguchi T, Wang L, Ning Z, Ning Z, Li C, Zhang T. Multi-view attribute learning and context relationship encoding enhanced segmentation of lung tumors from CT images. Comput Biol Med 2024; 177:108640. [PMID: 38833798] [DOI: 10.1016/j.compbiomed.2024.108640]
Abstract
Graph convolutional neural networks (GCNs) have shown promise in medical image segmentation owing to the flexibility of representing diverse image regions as graph nodes and propagating knowledge via graph edges. However, existing methods have not fully exploited the various attributes of image nodes and the context relationships among those attributes. We propose a new segmentation method with multi-similarity view enhancement and node attribute context learning (MNSeg). First, multiple views were formed by measuring the similarities among the image nodes, and MNSeg uses a GCN-based multi-view image node attribute learning (MAL) module to integrate the node attributes learnt from the multiple similarity views. Each similarity view contains the specific similarities among all the image nodes, and it was integrated with the node attributes from all the channels to form the enhanced attributes of the image nodes. Second, the context relationships among the attributes of image nodes are formulated by a transformer-based context relationship encoding (CRE) strategy to propagate these relationships across all the image nodes. During the transformer-based learning, the relationships were estimated based on self-attention over all the image nodes and then encoded into the learned node features. Finally, we design an attention at the attribute category level (ACA) to discriminate and fuse the diverse information learnt from MAL, CRE, and the original node attributes. ACA identifies the more informative attribute categories by adaptively learning their importance. We validate the performance of MNSeg on a public lung tumor CT dataset and an in-house non-small cell lung cancer (NSCLC) dataset collected from the hospital. The segmentation results show that MNSeg outperformed the compared segmentation methods in terms of spatial overlap and shape similarity. Ablation studies demonstrated the effectiveness of MAL, CRE, and ACA, and the generalization ability of MNSeg was demonstrated by consistent improvements in segmentation performance across different 3D segmentation backbones.
Affiliation(s)
- Ping Xuan: Department of Computer Science and Technology, Shantou University, Shantou, China; School of Computer Science and Technology, Heilongjiang University, Harbin, China
- Xiuqiang Chu: School of Computer Science and Technology, Heilongjiang University, Harbin, China
- Hui Cui: Department of Computer Science and Information Technology, La Trobe University, Melbourne, Australia
- Toshiya Nakaguchi: Center for Frontier Medical Engineering, Chiba University, Chiba, Japan
- Linlin Wang: Department of Radiation Oncology, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, China
- Zhiyuan Ning: School of Electrical and Information Engineering, The University of Sydney, Sydney, Australia
- Zhiyu Ning: School of Electrical and Information Engineering, The University of Sydney, Sydney, Australia
- Tiangang Zhang: School of Computer Science and Technology, Heilongjiang University, Harbin, China; School of Mathematical Science, Heilongjiang University, Harbin, China
17
Wan Y, Zhou D, Wang C, Liu Y, Bai C. [Multi-scale medical image segmentation based on pixel encoding and spatial attention mechanism]. Sheng Wu Yi Xue Gong Cheng Xue Za Zhi (Journal of Biomedical Engineering) 2024; 41:511-519. [PMID: 38932537] [PMCID: PMC11208660] [DOI: 10.7507/1001-5515.202310001]
Abstract
In response to the issues of single-scale information loss and large model parameter size during the sampling process in U-Net and its variants for medical image segmentation, this paper proposes a multi-scale medical image segmentation method based on pixel encoding and spatial attention. Firstly, by redesigning the input strategy of the Transformer structure, a pixel encoding module is introduced to enable the model to extract global semantic information from multi-scale image features, obtaining richer feature information. Additionally, deformable convolutions are incorporated into the Transformer module to accelerate convergence speed and improve module performance. Secondly, a spatial attention module with residual connections is introduced to allow the model to focus on the foreground information of the fused feature maps. Finally, through ablation experiments, the network is lightweighted to enhance segmentation accuracy and accelerate model convergence. The proposed algorithm achieves satisfactory results on the Synapse dataset, an official public dataset for multi-organ segmentation provided by the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), with Dice similarity coefficient (DSC) and 95% Hausdorff distance (HD95) scores of 77.65 and 18.34, respectively. The experimental results demonstrate that the proposed algorithm can enhance multi-organ segmentation performance, potentially filling the gap in multi-scale medical image segmentation algorithms, and providing assistance for professional physicians in diagnosis.
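The HD95 (95% Hausdorff distance) metric reported above can be illustrated with a generic point-set computation; this is a NumPy sketch, not the paper's evaluation code, and it assumes the boundary points have already been extracted from the masks:

```python
import numpy as np

def hd95(a_pts: np.ndarray, b_pts: np.ndarray) -> float:
    """95th-percentile Hausdorff distance between two 2D point sets.
    a_pts has shape (N, 2), b_pts has shape (M, 2)."""
    # full pairwise distance matrix between the two boundaries
    d = np.linalg.norm(a_pts[:, None, :] - b_pts[None, :, :], axis=2)
    a_to_b = d.min(axis=1)  # each point in A to its nearest point in B
    b_to_a = d.min(axis=0)  # each point in B to its nearest point in A
    # 95th percentile of the symmetric surface distances (robust to outliers)
    return float(np.percentile(np.concatenate([a_to_b, b_to_a]), 95))

a = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
b = a + np.array([0.5, 0.0])  # same square shifted by half a unit
print(hd95(a, b))  # 0.5
```

Using the 95th percentile instead of the maximum is what distinguishes HD95 from the plain Hausdorff distance, making it less sensitive to a single stray boundary point.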
Affiliation(s)
- Yulong Wan: School of Information Science and Engineering, Yunnan University, Kunming 650504, P. R. China
- Dongming Zhou: School of Information Science and Engineering, Yunnan University, Kunming 650504, P. R. China
- Changcheng Wang: School of Information Science and Engineering, Yunnan University, Kunming 650504, P. R. China
- Yisong Liu: School of Information Science and Engineering, Yunnan University, Kunming 650504, P. R. China
- Chongbin Bai: School of Information Science and Engineering, Yunnan University, Kunming 650504, P. R. China
18
Lin H, Zhao M, Zhu L, Pei X, Wu H, Zhang L, Li Y. Gaussian filter facilitated deep learning-based architecture for accurate and efficient liver tumor segmentation for radiation therapy. Front Oncol 2024; 14:1423774. [PMID: 38966060] [PMCID: PMC11222586] [DOI: 10.3389/fonc.2024.1423774]
Abstract
Purpose: Addressing the challenges of unclear tumor boundaries and the confusion between cysts and tumors in liver tumor segmentation, this study aims to develop an auto-segmentation method that combines a Gaussian filter with the nnU-Net architecture to effectively distinguish tumors from cysts, enhancing the accuracy of liver tumor auto-segmentation. Methods: First, 130 cases from the Liver Tumor Segmentation Challenge 2017 (LiTS2017) were used for training and validating the nnU-Net-based auto-segmentation model. Then, 14 cases from the 3D-IRCADb dataset and 25 liver cancer cases retrospectively collected at our hospital were used for testing. The Dice similarity coefficient (DSC) was used to evaluate the accuracy of the auto-segmentation model against manual contours. Results: The nnU-Net achieved an average DSC of 0.86 on the validation set (20 LiTS cases) and 0.82 on the public testing set (14 3D-IRCADb cases). On the clinical testing set, the standalone nnU-Net model achieved an average DSC of 0.75, which increased to 0.81 after post-processing with the Gaussian filter (P<0.05), demonstrating its effectiveness in mitigating the influence of liver cysts on liver tumor segmentation. Conclusion: Experiments show that the Gaussian filter is beneficial for improving the accuracy of liver tumor segmentation in the clinic.
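The Gaussian-filter post-processing idea described above can be sketched as smoothing the predicted probability/binary map and re-thresholding, which suppresses small isolated responses (such as cyst-like specks) while preserving large coherent regions. The `sigma` and threshold values here are illustrative assumptions, not the paper's settings:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def smooth_mask(prob: np.ndarray, sigma: float = 1.0, thr: float = 0.5) -> np.ndarray:
    """Gaussian-smooth a predicted probability/binary map, then re-threshold.
    Small isolated responses are diluted below the threshold and disappear,
    while the interior of large regions stays near 1 and survives."""
    return gaussian_filter(prob.astype(float), sigma=sigma) > thr

mask = np.zeros((20, 20))
mask[2, 2] = 1.0          # a one-pixel false positive
mask[8:16, 8:16] = 1.0    # an 8x8 "tumor" region
cleaned = smooth_mask(mask)
print(cleaned[2, 2], cleaned[11, 11])  # False True
```

A single pixel smoothed with sigma=1 peaks near 1/(2*pi) ≈ 0.16, well under the 0.5 threshold, which is why the speck vanishes while the block interior is retained.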
Affiliation(s)
- Hongyu Lin: Department of Oncology, First Hospital of Hebei Medical University, Shijiazhuang, China
- Min Zhao: Department of Oncology, First Hospital of Hebei Medical University, Shijiazhuang, China
- Lingling Zhu: Department of Oncology, First Hospital of Hebei Medical University, Shijiazhuang, China
- Xi Pei: Technology Development Department, Anhui Wisdom Technology Co., Ltd., Hefei, China
- Haotian Wu: Technology Development Department, Anhui Wisdom Technology Co., Ltd., Hefei, China
- Lian Zhang: Department of Oncology, First Hospital of Hebei Medical University, Shijiazhuang, China
- Ying Li: Department of Oncology, First Hospital of Hebei Medical University, Shijiazhuang, China
19
AlJabri M, Alghamdi M, Collado-Mesa F, Abdel-Mottaleb M. Recurrent attention U-Net for segmentation and quantification of breast arterial calcifications on synthesized 2D mammograms. PeerJ Comput Sci 2024; 10:e2076. [PMID: 38855260] [PMCID: PMC11157579] [DOI: 10.7717/peerj-cs.2076]
Abstract
Breast arterial calcifications (BAC) are a type of calcification commonly observed on mammograms and are generally considered benign and not associated with breast cancer. However, there is accumulating observational evidence of an association between BAC and cardiovascular disease, the leading cause of death in women. We present a deep learning method that could assist radiologists in detecting and quantifying BAC in synthesized 2D mammograms. We present a recurrent attention U-Net model consisting of encoder and decoder modules, each comprising multiple blocks that pair recurrent mechanisms with an attention module between them. The model also includes skip connections between the encoder and the decoder, similar to a U-shaped network. The attention module was used to enhance the capture of long-range dependencies and enable the network to effectively classify BAC from the background, whereas the recurrent blocks ensured better feature representation. The model was evaluated using a dataset containing 2,000 synthesized 2D mammogram images. We obtained 99.8861% overall accuracy, 69.6107% sensitivity, a 66.5758% F1 score, and a 59.5498% Jaccard coefficient. The presented model achieved promising performance compared with related models.
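The overlap metrics quoted above (accuracy, sensitivity, F1, Jaccard) all derive from the same pixel-level confusion counts; a generic NumPy sketch on toy masks, not the authors' evaluation code:

```python
import numpy as np

def seg_metrics(pred: np.ndarray, truth: np.ndarray) -> dict:
    """Pixel-level accuracy, sensitivity, F1, and Jaccard for binary masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    tp = np.logical_and(pred, truth).sum()
    fp = np.logical_and(pred, ~truth).sum()
    fn = np.logical_and(~pred, truth).sum()
    tn = np.logical_and(~pred, ~truth).sum()
    return {
        "accuracy": float((tp + tn) / (tp + tn + fp + fn)),
        "sensitivity": float(tp / (tp + fn)),       # recall on the BAC class
        "f1": float(2 * tp / (2 * tp + fp + fn)),   # Dice of the BAC class
        "jaccard": float(tp / (tp + fp + fn)),      # intersection over union
    }

pred = np.zeros((4, 4)); pred[0:2, 0:3] = 1   # 6 predicted pixels
truth = np.zeros((4, 4)); truth[1:3, 0:3] = 1  # 6 true pixels, 3 overlapping
m = seg_metrics(pred, truth)
print({k: round(v, 4) for k, v in m.items()})
# {'accuracy': 0.625, 'sensitivity': 0.5, 'f1': 0.5, 'jaccard': 0.3333}
```

The gap between near-perfect accuracy and much lower sensitivity/F1 in the paper reflects exactly this arithmetic: with a tiny foreground class, the true negatives dominate the accuracy term.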
Affiliation(s)
- Manar AlJabri: Department of Computer Science and Artificial Intelligence, Umm Al-Qura University, Makkah, Saudi Arabia; King Abdul Aziz University, Jeddah, Saudi Arabia
- Manal Alghamdi: Department of Computer Science and Artificial Intelligence, Umm Al-Qura University, Makkah, Saudi Arabia
- Fernando Collado-Mesa: Department of Radiology, Miller School of Medicine, University of Miami, Miami, Florida, United States
- Mohamed Abdel-Mottaleb: Department of Electrical and Computer Engineering, University of Miami, Miami, Florida, United States
20
Siami M, Barszcz T, Zimroz R. Advanced Image Analytics for Mobile Robot-Based Condition Monitoring in Hazardous Environments: A Comprehensive Thermal Defect Processing Framework. Sensors (Basel) 2024; 24:3421. [PMID: 38894210] [PMCID: PMC11174847] [DOI: 10.3390/s24113421]
Abstract
In hazardous environments like mining sites, mobile inspection robots play a crucial role in condition monitoring (CM) tasks, particularly by collecting various kinds of data, such as images. However, the sheer volume of collected image samples and existing noise pose challenges in processing and visualizing thermal anomalies. Recognizing these challenges, our study addresses the limitations of industrial big data analytics for mobile robot-generated image data. We present a novel, fully integrated approach involving a dimension reduction procedure. This includes a semantic segmentation technique utilizing the pre-trained VGG16 CNN architecture for feature selection, followed by random forest (RF) and extreme gradient boosting (XGBoost) classifiers for the prediction of the pixel class labels. We also explore unsupervised learning using the PCA-K-means method for dimension reduction and classification of unlabeled thermal defects based on anomaly severity. Our comprehensive methodology aims to efficiently handle image-based CM tasks in hazardous environments. To validate its practicality, we applied our approach in a real-world scenario, and the results confirm its robust performance in processing and visualizing thermal data collected by mobile inspection robots. This affirms the effectiveness of our methodology in enhancing the overall performance of CM processes.
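The PCA stage of the PCA-K-means pipeline mentioned above amounts to projecting feature vectors onto the top principal components before clustering; a generic SVD-based sketch with made-up data shapes, not the authors' implementation:

```python
import numpy as np

def pca_reduce(X: np.ndarray, n_components: int) -> np.ndarray:
    """Project feature vectors onto the top principal components via SVD."""
    Xc = X - X.mean(axis=0)                      # center the features
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T              # (n_samples, n_components)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))   # e.g. flattened thermal-patch features
Z = pca_reduce(X, 2)             # low-dimensional embedding for K-means
print(Z.shape)  # (200, 2)
```

K-means then runs on `Z` instead of the raw 64-dimensional features, which is both faster and less noise-sensitive, matching the dimension-reduction rationale in the abstract.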
Affiliation(s)
- Tomasz Barszcz: Faculty of Mechanical Engineering and Robotics, AGH University, Al. Mickiewicza 30, 30-059 Kraków, Poland
- Radoslaw Zimroz: Faculty of Geoengineering, Mining and Geology, Wrocław University of Science and Technology, Na Grobli 15, 50-421 Wrocław, Poland
21
Sun G, Pan Y, Kong W, Xu Z, Ma J, Racharak T, Nguyen LM, Xin J. DA-TransUNet: integrating spatial and channel dual attention with transformer U-net for medical image segmentation. Front Bioeng Biotechnol 2024; 12:1398237. [PMID: 38827037] [PMCID: PMC11141164] [DOI: 10.3389/fbioe.2024.1398237]
Abstract
Accurate medical image segmentation is critical for disease quantification and treatment evaluation. While traditional U-Net architectures and their transformer-integrated variants excel in automated segmentation tasks, they lack the ability to harness the image's intrinsic position and channel features. Existing models also struggle with parameter efficiency and computational complexity, often due to the extensive use of Transformers, and research employing dual attention mechanisms over position and channel has not been specifically optimized for the high-detail demands of medical images. To address these issues, this study proposes a novel deep medical image segmentation framework, called DA-TransUNet, which integrates the Transformer and a dual attention block (DA-Block) into the traditional U-shaped architecture. Tailored to the high-detail requirements of medical images, DA-TransUNet optimizes the intermediate channels of dual attention (DA) and employs DA in each skip connection to effectively filter out irrelevant information. This integration significantly enhances the model's capability to extract features, thereby improving the performance of medical image segmentation. DA-TransUNet is validated on medical image segmentation tasks, consistently outperforming state-of-the-art techniques across five datasets. In summary, DA-TransUNet makes significant strides in medical image segmentation, offering new insights into existing techniques. It strengthens model performance from the perspective of image features, thereby advancing the development of high-precision automated medical image diagnosis. The code and parameters of our model are publicly available at https://github.com/SUN-1024/DA-TransUnet.
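The position- and channel-attention maps that a dual attention block combines can be sketched schematically in NumPy. The real DA-TransUNet uses learned query/key/value projections on convolutional feature maps, so this is only an illustration of the two affinity computations, with toy shapes:

```python
import numpy as np

def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def channel_attention(F: np.ndarray) -> np.ndarray:
    """Channel attention: re-weight each channel by its affinity to all others.
    F has shape (C, N), with N = H*W flattened spatial positions."""
    A = softmax(F @ F.T, axis=-1)   # (C, C) channel affinity matrix
    return A @ F                    # (C, N) channel-attended features

def position_attention(F: np.ndarray) -> np.ndarray:
    """Position attention: each spatial location attends over all locations."""
    A = softmax(F.T @ F, axis=-1)   # (N, N) spatial affinity matrix
    return F @ A.T                  # (C, N) position-attended features

F = np.random.default_rng(0).normal(size=(8, 16))   # 8 channels, 4x4 map
out = channel_attention(F) + position_attention(F)  # fused, DA-style
print(out.shape)  # (8, 16)
```

Summing the two branches is the standard way dual-attention designs fuse spatial and channel context; in the actual model each branch additionally carries learned projection weights and a residual connection.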
Affiliation(s)
- Guanqun Sun: School of Information Engineering, Hangzhou Medical College, Hangzhou, China; School of Information Science, Japan Advanced Institute of Science and Technology, Nomi, Japan
- Yizhi Pan: School of Information Engineering, Hangzhou Medical College, Hangzhou, China
- Weikun Kong: Department of Electronic Engineering, Tsinghua University, Beijing, China
- Zichang Xu: Department of Systems Immunology, Immunology Frontier Research Institute (IFReC), Osaka University, Suita, Japan
- Jianhua Ma: Faculty of Computer and Information Sciences, Hosei University, Tokyo, Japan
- Teeradaj Racharak: School of Information Science, Japan Advanced Institute of Science and Technology, Nomi, Japan
- Le-Minh Nguyen: School of Information Science, Japan Advanced Institute of Science and Technology, Nomi, Japan
- Junyi Xin: School of Information Engineering, Hangzhou Medical College, Hangzhou, China; Zhejiang Engineering Research Center for Brain Cognition and Brain Diseases Digital Medical Instruments, Hangzhou Medical College, Hangzhou, China; Academy for Advanced Interdisciplinary Studies of Future Health, Hangzhou Medical College, Hangzhou, China
22
Li Y, Zhang L, Yu H, Wang J, Wang S, Liu J, Zheng Q. A comprehensive segmentation of chest X-ray improves deep learning-based WHO radiologically confirmed pneumonia diagnosis in children. Eur Radiol 2024; 34:3471-3482. [PMID: 37930411 DOI: 10.1007/s00330-023-10367-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/25/2023] [Revised: 08/29/2023] [Accepted: 08/31/2023] [Indexed: 11/07/2023]
Abstract
OBJECTIVES To investigate a comprehensive segmentation of chest X-ray (CXR) in promoting deep learning-based World Health Organization (WHO) radiologically confirmed pneumonia diagnosis in children. METHODS A total of 4400 participants between January 2016 and June 2021 were identified for a cross-sectional study and divided into primary endpoint pneumonia (PEP), other infiltrates, and normal groups according to the WHO's diagnostic criteria. The CXR was divided into six segments (left lung, right lung, mediastinum, diaphragm, ext-left lung, and ext-right lung) by adopting the RA-UNet. To demonstrate the benefits of lung field segmentation in pneumonia diagnosis, the segmented and unsegmented images, constituting seven segmentation combinations, were fed into the CBAM-ResNet for a three-category classification comparison. The interpretability of the CBAM-ResNet for pneumonia diagnosis was also examined by adopting a Grad-CAM module. RESULTS The RA-UNet achieved a high spatial overlap between manual and automatic segmentation (averaged DSC = 0.9639). When fed with all six segments, the CBAM-ResNet achieved superior three-category diagnosis performance (accuracy = 0.8243) over the other segmentation combinations and the deep learning models under comparison, an increase of around 6% in accuracy, precision, specificity, sensitivity, and F1-score, and around 3% in AUC. The Grad-CAM captured the pneumonia lesions more accurately, generating a more interpretable visualization and enhancing the reliability of our study in assisting pediatric pneumonia diagnosis. CONCLUSIONS The comprehensive segmentation of CXR could improve deep learning-based pneumonia diagnosis in childhood under the WHO's radiological standardized pneumonia classification, rather than the conventional dichotomy of bacterial versus viral pneumonia.
CLINICAL RELEVANCE STATEMENT The comprehensive segmentation of chest X-ray improves deep learning-based WHO-confirmed pneumonia diagnosis in children, laying a strong foundation for the potential inclusion of computer-aided pediatric CXR readings in the precise classification of pneumonia and in PCV vaccine efficacy trials in children. KEY POINTS • The chest X-ray was comprehensively segmented into six anatomical structures: left lung, right lung, mediastinum, diaphragm, ext-left lung, and ext-right lung. • The comprehensive segmentation improved the three-category classification of primary endpoint pneumonia, other infiltrates, and normal, with an increase of around 6% in accuracy, precision, specificity, sensitivity, and F1-score, and around 3% in AUC. • The comprehensive segmentation produced more accurate and interpretable visualization results in capturing the pneumonia lesions.
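The DSC quoted above is the standard overlap metric between an automatic and a manual mask; a minimal sketch of its computation (a hypothetical helper, not code from the paper):

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice similarity coefficient between two binary masks of equal shape."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    # 2|A∩B| / (|A| + |B|); eps guards against two empty masks
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)
```

Identical masks score 1.0; disjoint masks score (near) 0.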
Affiliation(s)
- Yuemei Li: School of Computer and Control Engineering, Yantai University, Yantai, 264005, China
- Lin Zhang: Department of Radiology, Xiamen Children's Hospital, Children's Hospital of Fudan University at Xiamen, Xiamen, Fujian, China
- Hu Yu: School of Computer and Control Engineering, Yantai University, Yantai, 264005, China
- Jian Wang: Department of Radiology, Xiamen Children's Hospital, Children's Hospital of Fudan University at Xiamen, Xiamen, Fujian, China
- Shuo Wang: Yantai University Trier College of Sustainable Technology, Yantai, 264005, Shandong Province, China; Trier University of Applied Sciences, D-54208, Trier, Germany
- Jungang Liu: Department of Radiology, Xiamen Children's Hospital, Children's Hospital of Fudan University at Xiamen, Xiamen, Fujian, China
- Qiang Zheng: School of Computer and Control Engineering, Yantai University, Yantai, 264005, China
23
Wang KN, Li SX, Bu Z, Zhao FX, Zhou GQ, Zhou SJ, Chen Y. SBCNet: Scale and Boundary Context Attention Dual-Branch Network for Liver Tumor Segmentation. IEEE J Biomed Health Inform 2024; 28:2854-2865. [PMID: 38427554 DOI: 10.1109/jbhi.2024.3370864] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/03/2024]
Abstract
Automated segmentation of liver tumors in CT scans is pivotal for diagnosing and treating liver cancer, offering a valuable alternative to labor-intensive manual processes and ensuring accurate and reliable clinical assessment. However, the inherent variability of liver tumors, coupled with the challenges posed by blurred boundaries in imaging characteristics, presents a substantial obstacle to precise segmentation. In this paper, we propose a novel dual-branch liver tumor segmentation model, SBCNet, to address these challenges effectively. Specifically, our proposed method introduces a contextual encoding module, which enables better identification of tumor variability using an advanced multi-scale adaptive kernel. Moreover, a boundary enhancement module is designed for the counterpart branch to enhance the perception of boundaries by incorporating contour learning with the Sobel operator. Finally, we propose a hybrid multi-task loss function that concurrently considers tumor scale and boundary features, fostering interaction across the tasks of the dual branches and further improving tumor segmentation. Experimental validation on the publicly available LiTS dataset demonstrates the practical efficacy of each module, with SBCNet yielding competitive results compared with other state-of-the-art methods for liver tumor segmentation.
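The core trick of the boundary enhancement module, deriving a contour signal from a mask with the Sobel operator, can be illustrated with a naive NumPy convolution; this is a schematic sketch, not SBCNet's implementation:

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T  # transpose gives the vertical-gradient kernel

def conv2d_valid(img, kernel):
    """Plain 'valid'-mode 2D correlation (no padding, stride 1)."""
    h, w = img.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (img[i:i + kh, j:j + kw] * kernel).sum()
    return out

def sobel_boundary(mask):
    """Boundary map of a binary mask: nonzero gradient magnitude."""
    gx = conv2d_valid(mask.astype(float), SOBEL_X)
    gy = conv2d_valid(mask.astype(float), SOBEL_Y)
    return np.hypot(gx, gy) > 0
```

A vertical edge in the mask yields a band of True values around the transition and False in the flat interior.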
24
Zhan F, Wang W, Chen Q, Guo Y, He L, Wang L. Three-Direction Fusion for Accurate Volumetric Liver and Tumor Segmentation. IEEE J Biomed Health Inform 2024; 28:2175-2186. [PMID: 38109246 DOI: 10.1109/jbhi.2023.3344392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/20/2023]
Abstract
Biomedical image segmentation of organs, tissues, and lesions has gained increasing attention in clinical treatment planning and navigation, which involves the exploration of two-dimensional (2D) and three-dimensional (3D) contexts in the biomedical image. Compared with 2D methods, 3D methods pay more attention to inter-slice correlations, which offer additional spatial information for image segmentation. An organ or tumor has a 3D structure that can be observed from three directions. Previous studies focus only on the vertical axis, limiting the understanding of the relationship between a tumor and its surrounding tissues; important information can also be obtained from the sagittal and coronal axes. Spatial information about organs and tumors can therefore be gathered from all three directions, i.e., the sagittal, coronal, and vertical axes, to better understand the invasion depth of a tumor and its relationship with the surrounding tissues. Moreover, the edges of organs and tumors in biomedical images may be blurred. To address these problems, we propose a three-direction fusion volumetric segmentation (TFVS) model for segmenting 3D biomedical images from three perspectives in the sagittal, coronal, and transverse planes. We use the dataset of the liver task provided by the Medical Segmentation Decathlon challenge to train our model. The TFVS method demonstrates competitive performance on the 3D-IRCADB dataset. In addition, the t-test and Wilcoxon signed-rank test are performed to show the statistical significance of the improvement of the proposed method over the baseline methods. The proposed method is expected to be beneficial in guiding and facilitating clinical diagnosis and treatment.
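At inference time, the three-direction idea reduces to fusing per-voxel predictions produced along the sagittal, coronal, and vertical axes; a minimal sketch of such a fusion step (simple probability averaging is our assumption here, not necessarily the paper's exact rule):

```python
import numpy as np

def fuse_three_directions(p_axial, p_sagittal, p_coronal, thr=0.5):
    """Fuse per-voxel foreground probabilities from 2D models run
    slice-by-slice along the three anatomical axes (each already
    re-stacked into the same 3D volume), then threshold."""
    fused = (p_axial + p_sagittal + p_coronal) / 3.0  # soft vote
    return fused > thr
```

A voxel is kept when the averaged probability across the three views exceeds the threshold, so two confident views can outvote one uncertain one.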
25
Luo X, Li P, Chen H, Zhou K, Piao S, Yang L, Hu B, Geng D. Automatic segmentation of hepatocellular carcinoma on dynamic contrast-enhanced MRI based on deep learning. Phys Med Biol 2024; 69:065008. [PMID: 38330492 DOI: 10.1088/1361-6560/ad2790] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/21/2023] [Accepted: 02/08/2024] [Indexed: 02/10/2024]
Abstract
Objective. Precise hepatocellular carcinoma (HCC) detection is crucial for clinical management. While studies have focused on computed tomography-based automatic algorithms, research on automatic detection based on dynamic contrast-enhanced (DCE) magnetic resonance imaging remains rare. This study develops an automatic detection and segmentation deep learning model for HCC using DCE. Approach. DCE images acquired from 2016 to 2021 were retrospectively collected. Then, 382 patients (301 male; 81 female) with 466 pathologically confirmed lesions were included and divided into an 80% training-validation set and a 20% independent test set. For external validation, 51 patients (42 male; 9 female) treated at another hospital from 2018 to 2021 were included. The U-net architecture was modified to accommodate multi-phasic DCE input. The model was trained on the training-validation set using five-fold cross-validation and further evaluated on the independent test set using comprehensive metrics for segmentation and detection performance. The proposed automatic segmentation model consisted of five main steps: phase registration, automatic liver region extraction using a pre-trained model, automatic HCC lesion segmentation using the multi-phasic deep learning model, ensembling of the five-fold predictions, and post-processing using connected component analysis to refine predictions and eliminate false positives. Main results. The proposed model achieved a mean dice similarity coefficient (DSC) of 0.81 ± 0.11, a sensitivity of 94.41 ± 15.50%, a precision of 94.19 ± 17.32%, and 0.14 ± 0.48 false positive lesions per patient in the independent test set. The model detected 88% (80/91) of HCC lesions at DSC > 0.5, and the DSC per tumor was 0.80 ± 0.13.
In the external set, the model detected 92% (58/62) of lesions with 0.12 ± 0.33 false positives per patient, and the DSC per tumor was 0.75 ± 0.10. Significance. This study developed an automatic detection and segmentation deep learning model for HCC using DCE, which yielded promising post-processed results in accurately identifying and delineating HCC lesions.
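The connected-component post-processing step, dropping predicted blobs below a volume threshold to suppress false positives, might look like the following pure-NumPy flood fill (6-connectivity and a minimum-voxel criterion are our assumptions; the paper's exact rule is not reproduced here):

```python
import numpy as np
from collections import deque

def remove_small_components(mask, min_voxels):
    """Keep only 6-connected components of a 3D binary mask that
    contain at least `min_voxels` voxels."""
    mask = mask.astype(bool)
    keep = np.zeros_like(mask)
    seen = np.zeros_like(mask)
    neighbors = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                 (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    for seed in zip(*np.nonzero(mask)):
        if seen[seed]:
            continue
        comp, queue = [seed], deque([seed])
        seen[seed] = True
        while queue:  # breadth-first flood fill of one component
            p = queue.popleft()
            for d in neighbors:
                n = (p[0] + d[0], p[1] + d[1], p[2] + d[2])
                if all(0 <= n[k] < mask.shape[k] for k in range(3)) \
                        and mask[n] and not seen[n]:
                    seen[n] = True
                    comp.append(n)
                    queue.append(n)
        if len(comp) >= min_voxels:  # volume criterion
            for p in comp:
                keep[p] = True
    return keep
```

In practice `scipy.ndimage.label` does the labeling far faster; the loop above just makes the logic explicit.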
Affiliation(s)
- Xiao Luo: Academy for Engineering and Technology, Fudan University, Shanghai, People's Republic of China
- Peiwen Li: Department of Radiology, Huashan Hospital, Fudan University, Shanghai, People's Republic of China
- Hongyi Chen: Academy for Engineering and Technology, Fudan University, Shanghai, People's Republic of China
- Kun Zhou: Academy for Engineering and Technology, Fudan University, Shanghai, People's Republic of China
- Sirong Piao: Department of Radiology, Huashan Hospital, Fudan University, Shanghai, People's Republic of China; Department of Radiology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, People's Republic of China
- Liqin Yang: Department of Radiology, Huashan Hospital, Fudan University, Shanghai, People's Republic of China; Shanghai Engineering Research Center of Intelligent Imaging for Critical Brain Diseases, Shanghai, People's Republic of China; Institute of Functional and Molecular Medical Imaging, Fudan University, Shanghai, People's Republic of China
- Bin Hu: Department of Radiology, Huashan Hospital, Fudan University, Shanghai, People's Republic of China
- Daoying Geng: Academy for Engineering and Technology, Fudan University, Shanghai, People's Republic of China; Department of Radiology, Huashan Hospital, Fudan University, Shanghai, People's Republic of China; Shanghai Engineering Research Center of Intelligent Imaging for Critical Brain Diseases, Shanghai, People's Republic of China; Institute of Functional and Molecular Medical Imaging, Fudan University, Shanghai, People's Republic of China
26
Siami M, Barszcz T, Wodecki J, Zimroz R. Semantic segmentation of thermal defects in belt conveyor idlers using thermal image augmentation and U-Net-based convolutional neural networks. Sci Rep 2024; 14:5748. [PMID: 38459162 PMCID: PMC10923815 DOI: 10.1038/s41598-024-55864-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/12/2023] [Accepted: 02/28/2024] [Indexed: 03/10/2024] Open
Abstract
The belt conveyor (BC) is the main means of horizontal transportation of bulk materials at mining sites. A sudden fault in BC modules may cause unexpected stops in production lines. With the increasing use of inspection mobile robots for condition monitoring (CM) of industrial infrastructure in hazardous environments, in this article we introduce an image processing pipeline for automatic segmentation of thermal defects in thermal images captured from BC idlers using a mobile robot. This study is motivated by the fact that CM of idler temperature is an important task for preventing sudden breakdowns in BC system networks. We compared the performance of three different U-Net-based convolutional neural network architectures for identifying thermal anomalies using a small number of hand-labeled thermal images. Experiments on the test data set showed that the attention residual U-Net with binary cross entropy as the loss function handled the semantic segmentation problem better than the approach of our previous research and the other studied U-Net variations.
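Binary cross entropy, the loss that performed best in this comparison, averages per-pixel log losses between predicted foreground probabilities and the binary label map; a minimal sketch:

```python
import numpy as np

def binary_cross_entropy(pred, target, eps=1e-7):
    """Mean BCE between per-pixel probabilities and binary labels.
    Clipping keeps log() finite at predictions of exactly 0 or 1."""
    pred = np.clip(pred, eps, 1 - eps)
    return -np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred))
```

A maximally uncertain prediction (0.5) on a positive pixel costs ln 2 ≈ 0.693; a confident correct prediction costs almost nothing.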
Affiliation(s)
- Mohammad Siami: AMC Vibro Sp. z o.o., Pilotow 2e, 31-462, Kraków, Poland
- Tomasz Barszcz: Faculty of Mechanical Engineering and Robotics, AGH University, Al. Mickiewicza 30, 30-059, Kraków, Poland
- Jacek Wodecki: Faculty of Geoengineering, Mining and Geology, Wroclaw University of Science and Technology, Na Grobli 15, 50-421, Wroclaw, Poland
- Radoslaw Zimroz: Faculty of Geoengineering, Mining and Geology, Wroclaw University of Science and Technology, Na Grobli 15, 50-421, Wroclaw, Poland
27
Wang J, Zhang B, Wang Y, Zhou C, Vonsky MS, Mitrofanova LB, Zou D, Li Q. CrossU-Net: Dual-modality cross-attention U-Net for segmentation of precancerous lesions in gastric cancer. Comput Med Imaging Graph 2024; 112:102339. [PMID: 38262134 DOI: 10.1016/j.compmedimag.2024.102339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/03/2023] [Revised: 10/20/2023] [Accepted: 01/15/2024] [Indexed: 01/25/2024]
Abstract
Gastric precancerous lesions (GPL) significantly elevate the risk of gastric cancer, and precise diagnosis and timely intervention are critical for patient survival. Due to the elusive pathological features of precancerous lesions, the early detection rate is less than 10%, which hinders lesion localization and diagnosis. In this paper, we provide a GPL pathological dataset and propose a novel method for improving the segmentation accuracy on a limited-scale dataset, namely RGB and Hyperspectral dual-modal pathological image Cross-attention U-Net (CrossU-Net). Specifically, we present a self-supervised pre-training model for hyperspectral images to serve downstream segmentation tasks. Secondly, we design a dual-stream U-Net-based network to extract features from different modal images. To promote information exchange between spatial information in RGB images and spectral information in hyperspectral images, we customize the cross-attention mechanism between the two networks. Furthermore, we use an intermediate agent in this mechanism to improve computational efficiency. Finally, we add a distillation loss to align predicted results for both branches, improving network generalization. Experimental results show that our CrossU-Net achieves accuracy and Dice of 96.53% and 91.62%, respectively, for GPL lesion segmentation, providing a promising spectral research approach for the localization and subsequent quantitative analysis of pathological features in early diagnosis.
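The cross-attention exchange between the two streams, tokens of one modality attending to tokens of the other, can be shown schematically with random, untrained projections (an illustration of generic cross-attention, not CrossU-Net's agent-mediated module):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, context, wq, wk, wv):
    """Tokens of one modality (queries, (N, d)) attend to tokens of the
    other modality (context, (M, d)); returns (N, d) fused features."""
    q, k, v = queries @ wq, context @ wk, context @ wv
    weights = softmax(q @ k.T / np.sqrt(k.shape[-1]), axis=-1)  # (N, M)
    return weights @ v

# Hypothetical token sets: RGB tokens query hyperspectral tokens
rng = np.random.default_rng(0)
d = 8
rgb_tokens = rng.normal(size=(16, d))   # e.g. flattened RGB patch features
hsi_tokens = rng.normal(size=(32, d))   # e.g. hyperspectral patch features
wq, wk, wv = (rng.normal(size=(d, d)) for _ in range(3))
fused = cross_attention(rgb_tokens, hsi_tokens, wq, wk, wv)
```

Swapping the roles of the two token sets gives the symmetric pass, so spatial and spectral information flow in both directions.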
Affiliation(s)
- Jiansheng Wang: Shanghai Key Laboratory of Multidimensional Information Processing, East China Normal University, Shanghai, China; Engineering Research Center of Nanophotonics & Advanced Instrument, Ministry of Education, East China Normal University, Shanghai, China
- Benyan Zhang: Department of Gastroenterology, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Yan Wang: Shanghai Key Laboratory of Multidimensional Information Processing, East China Normal University, Shanghai, China
- Chunhua Zhou: Department of Gastroenterology, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Maxim S Vonsky: D.I. Mendeleev Institute for Metrology, Moskovsky Pr 19, St Petersburg, Russia; Almazov National Medical Research Centre, Saint-Petersburg, Russia
- Duowu Zou: Department of Gastroenterology, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Qingli Li: Shanghai Key Laboratory of Multidimensional Information Processing, East China Normal University, Shanghai, China; Engineering Research Center of Nanophotonics & Advanced Instrument, Ministry of Education, East China Normal University, Shanghai, China; Engineering Center of SHMEC for Space Information and GNSS, Shanghai, China
28
Pu Q, Xi Z, Yin S, Zhao Z, Zhao L. Advantages of transformer and its application for medical image segmentation: a survey. Biomed Eng Online 2024; 23:14. [PMID: 38310297 PMCID: PMC10838005 DOI: 10.1186/s12938-024-01212-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/22/2023] [Accepted: 01/22/2024] [Indexed: 02/05/2024] Open
Abstract
PURPOSE Convolution operator-based neural networks have shown great success in medical image segmentation over the past decade. The U-shaped network with a codec structure is one of the most widely used models. Transformer, a technology originating in natural language processing, can capture long-distance dependencies and has been applied in Vision Transformer to achieve state-of-the-art performance on image classification tasks. Recently, researchers have extended the transformer to medical image segmentation tasks with promising results. METHODS This review comprises publications selected through a Web of Science search. We focused on papers published since 2018 that applied the transformer architecture to medical image segmentation. We conducted a systematic analysis of these studies and summarized the results. RESULTS To better convey the benefits of convolutional neural networks and transformers, the construction of the codec and transformer modules is first explained. Second, medical image segmentation models based on the transformer are summarized. The assessment metrics typically used for medical image segmentation tasks are then listed. Finally, a large number of medical segmentation datasets are described. CONCLUSION Even though pure transformer models without any convolution operator exist, the limited sample sizes of medical image segmentation datasets still restrict the growth of the transformer, although this can be alleviated by pretrained models. More often than not, researchers still design models using both transformer and convolution operators.
Affiliation(s)
- Qiumei Pu: School of Information Engineering, Minzu University of China, Beijing, 100081, China
- Zuoxin Xi: School of Information Engineering, Minzu University of China, Beijing, 100081, China; CAS Key Laboratory for Biomedical Effects of Nanomaterials and Nanosafety, Institute of High Energy Physics, Chinese Academy of Sciences, Beijing, 100049, China
- Shuai Yin: School of Information Engineering, Minzu University of China, Beijing, 100081, China
- Zhe Zhao: The Fourth Medical Center of PLA General Hospital, Beijing, 100039, China
- Lina Zhao: CAS Key Laboratory for Biomedical Effects of Nanomaterials and Nanosafety, Institute of High Energy Physics, Chinese Academy of Sciences, Beijing, 100049, China
29
Guo X, Wang Z, Wu P, Li Y, Alsaadi FE, Zeng N. ELTS-Net: An enhanced liver tumor segmentation network with augmented receptive field and global contextual information. Comput Biol Med 2024; 169:107879. [PMID: 38142549 DOI: 10.1016/j.compbiomed.2023.107879] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/20/2023] [Revised: 11/30/2023] [Accepted: 12/18/2023] [Indexed: 12/26/2023]
Abstract
The liver is one of the organs with the highest cancer incidence in the human body, and late-stage liver cancer is essentially incurable. Therefore, early diagnosis and lesion localization of liver cancer are of important clinical value. This study proposes an enhanced network architecture, ELTS-Net, based on the 3D U-Net model, to address the limitations of conventional image segmentation methods and the underutilization of spatial image features by the 2D U-Net structure. ELTS-Net expands upon the original network by incorporating dilated convolutions to increase the receptive field of the convolutional kernels. Additionally, an attention residual module, comprising an attention mechanism and residual connections, replaces the original convolutional module, serving as the primary component of the encoder and decoder. This design enables the network to capture global contextual information in both channel and spatial dimensions. Furthermore, deep supervision modules are integrated between different levels of the decoder network, providing additional feedback from deeper intermediate layers; this constrains the network weights to the target regions and optimizes segmentation results. Evaluation on the LiTS2017 dataset shows improvements over the baseline 3D U-Net model, achieving 95.2% liver segmentation accuracy and 71.9% tumor segmentation accuracy, improvements of 0.9% and 3.1%, respectively. The experimental results validate the superior segmentation performance of ELTS-Net compared with the other models under comparison, offering valuable guidance for clinical diagnosis and treatment.
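The receptive-field gain from dilated convolutions is easy to quantify: each stride-1 layer with kernel size k and dilation d adds (k-1)·d to the receptive field. A quick check (an illustrative helper, not code from the paper):

```python
def receptive_field(kernel_sizes, dilations):
    """Receptive field of a stack of stride-1 (dilated) convolutions:
    rf = 1 + sum over layers of (k - 1) * d."""
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        rf += (k - 1) * d
    return rf

# Three 3x3 layers: plain vs. dilation rates 1, 2, 4
plain = receptive_field([3, 3, 3], [1, 1, 1])
dilated = receptive_field([3, 3, 3], [1, 2, 4])
```

Dilation more than doubles the receptive field here (15 vs. 7 pixels) with the same parameter count.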
Affiliation(s)
- Xiaoyue Guo: College of Engineering, Peking University, Beijing 100871, China; Department of Instrumental and Electrical Engineering, Xiamen University, Fujian 361005, China
- Zidong Wang: Department of Computer Science, Brunel University London, Uxbridge UB8 3PH, UK
- Peishu Wu: Department of Instrumental and Electrical Engineering, Xiamen University, Fujian 361005, China
- Yurong Li: College of Electrical Engineering and Automation, Fuzhou University, Fujian 350116, China; Fujian Key Lab of Medical Instrumentation & Pharmaceutical Technology, Fujian 350116, China
- Fuad E Alsaadi: Communication Systems and Networks Research Group, Department of Electrical and Computer Engineering, Faculty of Engineering, King Abdulaziz University, Jeddah 21589, Saudi Arabia
- Nianyin Zeng: Department of Instrumental and Electrical Engineering, Xiamen University, Fujian 361005, China
30
Yuan L, Song J, Fan Y. MCNMF-Unet: a mixture Conv-MLP network with multi-scale features fusion Unet for medical image segmentation. PeerJ Comput Sci 2024; 10:e1798. [PMID: 38259898 PMCID: PMC10803052 DOI: 10.7717/peerj-cs.1798] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/22/2023] [Accepted: 12/15/2023] [Indexed: 01/24/2024]
Abstract
Recently, medical image segmentation schemes combining Vision Transformer (ViT) and multilayer perceptron (MLP) components have been widely used. However, one of their disadvantages is weak fusion of features across levels and a lack of flexible localization information. To reduce the semantic gap between the encoding and decoding stages, we propose a mixture conv-MLP network with multi-scale feature fusion Unet (MCNMF-Unet) for medical image segmentation. MCNMF-Unet is a U-shaped network based on convolution and MLP, which not only inherits the advantages of convolution in extracting underlying features and visual structures, but also utilizes the MLP to fuse the local and global information of each layer of the network. MCNMF-Unet performs multi-layer fusion and multi-scale feature map skip connections in each network stage so that all the feature information can be fully utilized and the vanishing gradient problem can be alleviated. Additionally, MCNMF-Unet incorporates a multi-axis and multi-window MLP module. This module is fully end-to-end and eliminates the need to consider the negative impact of image cropping. It not only fuses information from multiple dimensions and receptive fields but also reduces the number of parameters and the computational complexity. We evaluated the proposed model on the BUSI, ISIC2018, and CVC-ClinicDB datasets. The experimental results show that the performance of our proposed model is superior to most existing networks, with an IoU of 84.04% and an F1-score of 91.18%.
Affiliation(s)
- Lei Yuan: Key Laboratory of Light Field Manipulation and System Integration Applications in Fujian Province, School of Physics and Information Engineering, Minnan Normal University, Zhangzhou, Fujian, China
- Jianhua Song: Key Laboratory of Light Field Manipulation and System Integration Applications in Fujian Province, School of Physics and Information Engineering, Minnan Normal University, Zhangzhou, Fujian, China
- Yazhuo Fan: Key Laboratory of Light Field Manipulation and System Integration Applications in Fujian Province, School of Physics and Information Engineering, Minnan Normal University, Zhangzhou, Fujian, China
31
Xiao H, Li L, Liu Q, Zhang Q, Liu J, Liu Z. Context-aware and local-aware fusion with transformer for medical image segmentation. Phys Med Biol 2024; 69:025011. [PMID: 38086076 DOI: 10.1088/1361-6560/ad14c6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/25/2023] [Accepted: 12/12/2023] [Indexed: 01/11/2024]
Abstract
Objective. Convolutional neural networks (CNNs) have made significant progress in medical image segmentation tasks. However, for complex segmentation tasks, CNNs lack the ability to establish long-distance relationships, resulting in poor segmentation performance. The intra-class diversity and inter-class similarity of images increase the difficulty of segmentation. Additionally, some focus areas exhibit a scattered distribution, making segmentation even more challenging. Approach. Therefore, this work proposed a new Transformer model, FTransConv, to address the issues of inter-class similarity, intra-class diversity, and scattered distribution in medical image segmentation tasks. To achieve this, three Transformer-CNN modules were designed to extract global and local information, and a full-scale squeeze-excitation module was proposed in the decoder using the idea of full-scale connections. Main results. Without any pre-training, this work verified the effectiveness of FTransConv on three public COVID-19 CT datasets and MoNuSeg. Experiments have shown that FTransConv, which has only 26.98M parameters, outperformed other state-of-the-art models such as Swin-Unet, TransAttUnet, UCTransNet, LeViT-UNet, TransUNet, UTNet, and SAUNet++. The model achieved the best segmentation performance, with a DSC of 83.22% on the COVID-19 datasets and 79.47% on MoNuSeg. Significance. This work demonstrated that our method provides a promising solution for regions with high inter-class similarity, intra-class diversity, and scattered distribution in image segmentation.
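A plain squeeze-excitation block, the building block behind the proposed full-scale squeeze-excitation module (the full-scale connections themselves are omitted here), is just a global average pool, a small bottleneck MLP, and channel-wise gating; random weights stand in for learned ones:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def squeeze_excitation(feat, w1, w2):
    """feat: (C, H, W); w1: (C//r, C) bottleneck; w2: (C, C//r) expansion."""
    squeezed = feat.mean(axis=(1, 2))                       # (C,) global pool
    excited = sigmoid(w2 @ np.maximum(w1 @ squeezed, 0.0))  # (C,) gates in (0, 1)
    return feat * excited[:, None, None]                    # channel rescaling

# Hypothetical shapes: 8 channels, reduction ratio 2
rng = np.random.default_rng(1)
c, r = 8, 2
x = rng.normal(size=(c, 6, 6))
w1 = rng.normal(size=(c // r, c))
w2 = rng.normal(size=(c, c // r))
y = squeeze_excitation(x, w1, w2)
```

Because every gate lies strictly between 0 and 1, the block can only attenuate channels, never amplify them, which is the intended recalibration behavior.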
Affiliation(s)
- Hanguang Xiao: College of Artificial Intelligence, Chongqing University of Technology, Chongqing 401135, People's Republic of China
- Li Li: College of Artificial Intelligence, Chongqing University of Technology, Chongqing 401135, People's Republic of China
- Qiyuan Liu: College of Artificial Intelligence, Chongqing University of Technology, Chongqing 401135, People's Republic of China
- Qihang Zhang: College of Artificial Intelligence, Chongqing University of Technology, Chongqing 401135, People's Republic of China
- Junqi Liu: College of Artificial Intelligence, Chongqing University of Technology, Chongqing 401135, People's Republic of China
- Zhi Liu: College of Artificial Intelligence, Chongqing University of Technology, Chongqing 401135, People's Republic of China
32
Shao J, Luan S, Ding Y, Xue X, Zhu B, Wei W. Attention Connect Network for Liver Tumor Segmentation from CT and MRI Images. Technol Cancer Res Treat 2024; 23:15330338231219366. [PMID: 38179668 PMCID: PMC10771068 DOI: 10.1177/15330338231219366] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/21/2023] [Revised: 10/18/2023] [Accepted: 11/21/2023] [Indexed: 01/06/2024] Open
Abstract
Introduction: Currently, the incidence of liver cancer is on the rise annually. Precise identification of liver tumors is crucial for clinicians to strategize the treatment and combat liver cancer. Thus far, liver tumor contours have been derived through labor-intensive and subjective manual labeling. Computers have gained widespread application in the realm of liver tumor segmentation. Nonetheless, liver tumor segmentation remains a formidable challenge owing to the diverse range of volumes, shapes, and image intensities encountered. Methods: In this article, we introduce an innovative solution called the attention connect network (AC-Net) designed for automated liver tumor segmentation. Building upon the U-shaped network architecture, our approach incorporates 2 critical attention modules: the axial attention module (AAM) and the vision transformer module (VTM), which replace conventional skip-connections to seamlessly integrate spatial features. The AAM facilitates feature fusion by computing axial attention across feature maps, while the VTM operates on the lowest resolution feature maps, employing multihead self-attention, and reshaping the output into a feature map for subsequent concatenation. Furthermore, we employ a specialized loss function tailored to our approach. Our methodology begins with pretraining AC-Net using the LiTS2017 dataset and subsequently fine-tunes it using computed tomography (CT) and magnetic resonance imaging (MRI) data sourced from Hubei Cancer Hospital. Results: The performance metrics for AC-Net on CT data are as follows: dice similarity coefficient (DSC) of 0.90, Jaccard coefficient (JC) of 0.82, recall of 0.92, average symmetric surface distance (ASSD) of 4.59, Hausdorff distance (HD) of 11.96, and precision of 0.89. For AC-Net on MRI data, the metrics are DSC of 0.80, JC of 0.70, recall of 0.82, ASSD of 7.58, HD of 30.26, and precision of 0.84. 
Conclusion: The comparative experiments highlight that AC-Net exhibits exceptional tumor recognition accuracy when tested on the Hubei Cancer Hospital dataset, demonstrating highly competitive performance for practical clinical applications. Furthermore, the ablation experiments provide conclusive evidence of the efficacy of each module proposed in this article. For those interested, the code for this research article can be accessed at the following GitHub repository: https://github.com/killian-zero/py_tumor-segmentation.git.
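The DSC reported above is the standard overlap measure 2|A∩B|/(|A|+|B|); a minimal sketch on flat binary masks (a generic illustration, not the authors' evaluation code):

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient for binary masks given as flat 0/1 lists."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        return 1.0  # both masks empty: perfect agreement by convention
    return 2.0 * intersection / total
```

In practice the masks would be flattened 3D volumes; the arithmetic is unchanged.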
Affiliation(s)
- Jiakang Shao
- School of Integrated Circuits, Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Shunyao Luan
- School of Integrated Circuits, Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Yi Ding
- Department of Radiation Oncology, Hubei Cancer Hospital, TongJi Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Xudong Xue
- Department of Radiation Oncology, Hubei Cancer Hospital, TongJi Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Benpeng Zhu
- School of Integrated Circuits, Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Wei Wei
- Department of Radiation Oncology, Hubei Cancer Hospital, TongJi Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
33
Yang S, Liang Y, Wu S, Sun P, Chen Z. SADSNet: A robust 3D synchronous segmentation network for liver and liver tumors based on spatial attention mechanism and deep supervision. J Xray Sci Technol 2024; 32:707-723. [PMID: 38552134 DOI: 10.3233/xst-230312] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/01/2024]
Abstract
Highlights
• Introduce a data augmentation strategy to expand the morphologically diverse data required during training, improving the algorithm's ability to learn features from CT images of complex and varied tumor morphology.
• Design attention mechanisms for the encoding and decoding paths to extract fine pixel-level features, improve feature extraction capability, and achieve efficient spatial-channel feature fusion.
• Use a deep supervision layer to correct and decode the final image data, providing high result accuracy.
• The effectiveness of this method has been affirmed through validation on the LITS, 3DIRCADb, and SLIVER datasets.
BACKGROUND Accurately extracting the liver and liver tumors from medical images is an important step in lesion localization and diagnosis, surgical planning, and postoperative monitoring. However, the limited number of radiation therapists and the great number of images make this work time-consuming. OBJECTIVE This study designs a spatial attention deep supervised network (SADSNet) for simultaneous automatic segmentation of the liver and tumors. METHOD First, self-designed spatial attention modules are introduced at each layer of the encoder and decoder to extract image features at different scales and resolutions, helping the model better capture liver tumors and fine structures. The designed spatial attention module is implemented through two gate signals related to the liver and tumors, as well as by changing the size of the convolutional kernels. Second, deep supervision is added behind the three layers of the decoder to assist the backbone network in feature learning and improve gradient propagation, enhancing robustness. RESULTS The method was tested on the LITS, 3DIRCADb, and SLIVER datasets.
For the liver, it obtained Dice similarity coefficients of 97.03%, 96.11%, and 97.40%; surface Dice of 81.98%, 82.53%, and 86.29%; 95% Hausdorff distances of 8.96 mm, 8.26 mm, and 3.79 mm; and average surface distances of 1.54 mm, 1.19 mm, and 0.81 mm. It also achieved precise tumor segmentation, with Dice scores of 87.81% and 87.50%, surface Dice of 89.63% and 84.26%, 95% Hausdorff distances of 12.96 mm and 16.55 mm, and average surface distances of 1.11 mm and 3.04 mm on LITS and 3DIRCADb, respectively. CONCLUSION The experimental results show that the proposed method is effective and superior to several other methods. It can therefore provide technical support for liver and liver tumor segmentation in clinical practice.
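Deep supervision of the kind SADSNet adds behind three decoder layers typically enters training as a weighted sum of per-level losses. A sketch under that assumption (the weight schedule here is an illustrative choice, not taken from the paper):

```python
def deep_supervision_loss(level_losses, weights=None):
    """Combine per-decoder-level losses into one training objective.

    `level_losses` holds the loss computed at each supervised decoder depth,
    ordered coarsest to finest; by default coarser levels get smaller weights.
    """
    if weights is None:
        # Illustrative schedule: halve the weight at each coarser level, normalized.
        raw = [2.0 ** i for i in range(len(level_losses))]
        s = sum(raw)
        weights = [w / s for w in raw]
    return sum(w * l for w, l in zip(weights, level_losses))
```

At inference time the auxiliary heads are discarded and only the final decoder output is used.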
Affiliation(s)
- Sijing Yang
- School of Life and Environmental Science, Guilin University of Electronic Technology, Guilin, China
- Yongbo Liang
- School of Life and Environmental Science, Guilin University of Electronic Technology, Guilin, China
- Shang Wu
- School of Life and Environmental Science, Guilin University of Electronic Technology, Guilin, China
- Peng Sun
- School of Electronic Engineering and Automation, Guilin University of Electronic Technology, Guilin, China
- Zhencheng Chen
- School of Life and Environmental Science, Guilin University of Electronic Technology, Guilin, China
- School of Electronic Engineering and Automation, Guilin University of Electronic Technology, Guilin, China
- Guangxi Colleges and Universities Key Laboratory of Biomedical Sensors and Intelligent Instruments, Guilin, China
- Guangxi Engineering Technology Research Center of Human Physiological Information Noninvasive Detection, Guilin, China
34
Xi H, Dong H, Sheng Y, Cui H, Huang C, Li J, Zhu J. MSCT-UNET: multi-scale contrastive transformer within U-shaped network for medical image segmentation. Phys Med Biol 2023; 69:015022. [PMID: 38061069 DOI: 10.1088/1361-6560/ad135d] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/19/2023] [Accepted: 12/07/2023] [Indexed: 12/30/2023]
Abstract
Objective. Automatic multi-organ segmentation from anatomical images is essential in disease diagnosis and treatment planning. The U-shaped encoder-decoder neural network has achieved great success in various segmentation tasks. However, a pure convolutional neural network (CNN) is not suitable for modeling long-range relations because of its limited receptive field, and a pure transformer is not good at capturing pixel-level features. Approach. We propose a new hybrid network named MSCT-UNET, which fuses CNN features with transformer features at multiple scales and introduces multi-task contrastive learning to improve segmentation performance. Specifically, the multi-scale low-level features extracted from the CNN are further encoded through several transformers to build hierarchical global contexts. The cross-fusion block then fuses the low-level and high-level features in different directions, and the deeply fused features flow back to the CNN and transformer branches for the next scale's fusion. We introduce multi-task contrastive learning, comprising self-supervised global contrastive learning and supervised local contrastive learning, into MSCT-UNET. We also strengthen the decoder by using a transformer to better restore the segmentation map. Results. Evaluation on the ACDC, Synapse, and BraTS datasets demonstrates improved performance over the other methods compared, and ablation studies prove the effectiveness of our major innovations. Significance. The hybrid encoder of MSCT-UNET captures multi-scale long-range dependencies and fine-grained detail features at the same time, and the cross-fusion block fuses these features deeply. The multi-task contrastive learning of MSCT-UNET strengthens the representation ability of the encoder and jointly optimizes the networks. The source code is publicly available at: https://github.com/msctunet/MSCT_UNET.git.
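Several networks in this list, MSCT-UNET included, build on scaled dot-product self-attention, softmax(QKᵀ/√d)V. A stripped-down sketch with Q = K = V = X and no learned projections, showing only the attention arithmetic:

```python
import math

def self_attention(X):
    """Scaled dot-product self-attention over a list of feature vectors.

    For clarity Q = K = V = X (no learned projection matrices), so this
    demonstrates only the core computation: softmax(QK^T / sqrt(d)) V.
    """
    d = len(X[0])
    # Pairwise dot-product scores, scaled by sqrt(d).
    scores = [[sum(q * k for q, k in zip(qi, kj)) / math.sqrt(d) for kj in X]
              for qi in X]
    out = []
    for row in scores:
        # Numerically stable row-wise softmax.
        m = max(row)
        exps = [math.exp(s - m) for s in row]
        z = sum(exps)
        attn = [e / z for e in exps]
        # Weighted sum of value vectors.
        out.append([sum(a * v[j] for a, v in zip(attn, X)) for j in range(d)])
    return out
```

A real transformer block adds learned Q/K/V projections, multiple heads, and a feed-forward sublayer on top of this.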
Affiliation(s)
- Heran Xi
- School of Electronic Engineering, Heilongjiang University, Harbin, 150001, People's Republic of China
- Haoji Dong
- School of Computer Science and Technology, Heilongjiang University, Harbin, 150000, People's Republic of China
- Yue Sheng
- School of Computer Science and Technology, Heilongjiang University, Harbin, 150000, People's Republic of China
- Hui Cui
- Department of Computer Science and Information Technology, La Trobe University, Melbourne, 3000, Australia
- Chengying Huang
- School of Computer Science and Technology, Heilongjiang University, Harbin, 150000, People's Republic of China
- Jinbao Li
- Qilu University of Technology (Shandong Academy of Science), Shandong Artificial Intelligence Institute, Jinan, 250014, People's Republic of China
- Jinghua Zhu
- School of Computer Science and Technology, Heilongjiang University, Harbin, 150000, People's Republic of China
35
Wang Z, Zhu J, Fu S, Ye Y. Context fusion network with multi-scale-aware skip connection and twin-split attention for liver tumor segmentation. Med Biol Eng Comput 2023; 61:3167-3180. [PMID: 37470963 DOI: 10.1007/s11517-023-02876-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/29/2023] [Accepted: 06/20/2023] [Indexed: 07/21/2023]
Abstract
Manually annotating liver tumor contours is a time-consuming and labor-intensive task for clinicians, so automated segmentation is urgently needed in clinical diagnosis. However, automatic segmentation methods face challenges due to the heterogeneity, fuzzy boundaries, and irregularity of tumor tissue. In this paper, a novel deep learning-based approach with a multi-scale-aware (MSA) module and a twin-split attention (TSA) module is proposed for tumor segmentation. The MSA module bridges the semantic gap and reduces the loss of detailed information, while the TSA module recalibrates the channel response of the feature map. Finally, tumors can be counted from the segmentation results in a 3D perspective for cancer grading. Extensive experiments on the LiTS2017 dataset show the effectiveness of the proposed method, which achieves a Dice index of 85.97% and a Jaccard index of 81.56%, surpassing the state of the art. The method also achieved a Dice index of 83.67% and a Jaccard index of 80.11% in verification on the 3DIRCADb dataset, further reflecting its robustness and generalization ability.
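The Dice and Jaccard indices reported together here measure the same overlap and, for any single pair of masks, are interconvertible via J = D/(2 − D); averages over many cases, as papers report, need not satisfy the relation exactly. A small sketch:

```python
def jaccard_from_dice(d):
    """Convert a Dice index to the equivalent Jaccard index: J = D / (2 - D)."""
    return d / (2.0 - d)

def dice_from_jaccard(j):
    """Inverse relation: D = 2J / (1 + J)."""
    return 2.0 * j / (1.0 + j)
```

This is why Jaccard figures in the literature always sit at or below the corresponding Dice figures.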
Affiliation(s)
- Zhendong Wang
- School of Mathematics, Shandong University, Jinan, Shandong, 250100, China
- Jiehua Zhu
- Department of Mathematical Sciences, Georgia Southern University, Statesboro, GA, 30460, USA
- Shujun Fu
- School of Mathematics, Shandong University, Jinan, Shandong, 250100, China
- Yangbo Ye
- Department of Mathematics, The University of Iowa, Iowa City, IA, 52242, USA
36
Wan B, Hu B, Zhao M, Li K, Ye X. Deep learning-based magnetic resonance image segmentation technique for application to glioma. Front Med (Lausanne) 2023; 10:1172767. [PMID: 38053614 PMCID: PMC10694355 DOI: 10.3389/fmed.2023.1172767] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/23/2023] [Accepted: 04/10/2023] [Indexed: 12/07/2023] Open
Abstract
Introduction: Brain glioma segmentation is a critical task for medical diagnosis, monitoring, and treatment planning. Although deep learning-based fully convolutional neural networks have shown promising results in this field, their unstable segmentation quality remains a major concern. Moreover, they do not consider the unique genomic and basic data of brain glioma patients, which may lead to inaccurate diagnosis and treatment planning. Methods: This study proposes a new model that overcomes this problem by improving the overall architecture and incorporating an innovative loss function. First, we employed DeepLabv3+ as the overall architecture of the model and RegNet as the image encoder. We designed an attribute encoder module to incorporate the patient's genomic and basic data and the image depth information into a 2D convolutional neural network, which was combined with the image encoder and atrous spatial pyramid pooling module to form the encoder module for addressing the multimodal fusion problem. In addition, the cross-entropy loss and Dice loss are combined with linear weighting to solve the problem of sample imbalance. An innovative loss function is proposed to suppress regions of specific sizes, preventing segmentation errors in noise-like regions and yielding more stable segmentation results. Experiments were conducted on the Lower-Grade Glioma Segmentation Dataset, a widely used benchmark for brain tumor segmentation. Results: The proposed method achieved a Dice score of 94.36 and an intersection-over-union score of 91.83, outperforming other popular models.
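The linear weighting of cross-entropy and Dice loss described above is a standard recipe for sample imbalance. A plain-Python sketch on flat foreground probabilities (the weight `alpha` and smoothing `eps` are illustrative choices, not the paper's values):

```python
import math

def dice_loss(probs, targets, eps=1e-6):
    """Soft Dice loss on flat foreground probabilities and binary targets."""
    inter = sum(p * t for p, t in zip(probs, targets))
    total = sum(probs) + sum(targets)
    return 1.0 - (2.0 * inter + eps) / (total + eps)

def bce_loss(probs, targets, eps=1e-12):
    """Binary cross-entropy, clamped away from log(0)."""
    return -sum(
        t * math.log(max(p, eps)) + (1 - t) * math.log(max(1.0 - p, eps))
        for p, t in zip(probs, targets)
    ) / len(probs)

def combined_loss(probs, targets, alpha=0.5):
    """Linear weighting of cross-entropy and Dice loss."""
    return alpha * bce_loss(probs, targets) + (1.0 - alpha) * dice_loss(probs, targets)
```

The cross-entropy term gives dense per-pixel gradients while the Dice term directly optimizes the overlap metric, which is why the two are often mixed.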
Affiliation(s)
- Bing Wan
- Department of Radiology, China Three Gorges University, Affiliated Renhe Hospital, Yichang, Hubei
- Department of Radiology, Chongqing People’s Hospital, Chongqing, China
- Bingbing Hu
- School of Computer Science, Yangtze University, Jingzhou, China
- Ming Zhao
- School of Computer Science, Yangtze University, Jingzhou, China
- Kang Li
- Department of Radiology, Chongqing People’s Hospital, Chongqing, China
- Xu Ye
- Electronics and Information School, Yangtze University, Jingzhou, China
37
Hettihewa K, Kobchaisawat T, Tanpowpong N, Chalidabhongse TH. MANet: a multi-attention network for automatic liver tumor segmentation in computed tomography (CT) imaging. Sci Rep 2023; 13:20098. [PMID: 37973987 PMCID: PMC10654423 DOI: 10.1038/s41598-023-46580-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/26/2023] [Accepted: 11/02/2023] [Indexed: 11/19/2023] Open
Abstract
Automatic liver tumor segmentation is an application of paramount importance for liver tumor diagnosis and treatment planning, but it has become a highly challenging task due to the heterogeneity of tumor shape and intensity variation. Automatic liver tumor segmentation can help establish a diagnostic standard that provides relevant radiological information to all levels of expertise. Recently, deep convolutional neural networks have demonstrated superiority in feature extraction and learning for medical image segmentation. However, multi-layer dense feature stacks make such models inconsistent in imitating the visual attention and awareness of radiological expertise in tumor recognition and segmentation tasks. To bridge that gap in visual attention capability, attention mechanisms have been developed for better feature selection. In this paper, we propose a novel network named Multi-Attention Network (MANet), a fusion of attention mechanisms that learns to highlight important features while suppressing irrelevant ones for the tumor segmentation task. The proposed deep learning network follows U-Net as its basic architecture, and a residual mechanism is implemented in the encoder. The convolutional block attention module is split into channel attention and spatial attention modules, which are implemented in the encoder and decoder of the proposed architecture. The attention mechanism of Attention U-Net is integrated to extract low-level features and combine them with high-level ones. The developed architecture is trained and evaluated on the publicly available MICCAI 2017 Liver Tumor Segmentation dataset and the 3DIRCADb dataset under various evaluation metrics. MANet demonstrated promising results compared with state-of-the-art methods, with a comparatively small parameter overhead.
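The channel-attention half of the convolutional block attention module mentioned above follows the squeeze-and-excitation pattern: pool each channel to one scalar, gate it, and rescale the channel. A sketch in which a bare sigmoid stands in for the learned excitation MLP (an illustrative simplification):

```python
import math

def channel_attention(feature_maps):
    """Squeeze-and-excitation-style channel attention (sketch).

    `feature_maps` is a list of channels, each a flat list of activations.
    Squeeze: global average pool per channel. Excite: a plain sigmoid on
    the pooled value replaces the learned two-layer MLP of the real module.
    """
    gates = []
    for ch in feature_maps:
        pooled = sum(ch) / len(ch)                       # squeeze
        gates.append(1.0 / (1.0 + math.exp(-pooled)))    # excite (no learned weights)
    # Recalibrate: scale every channel by its gate.
    return [[gates[c] * v for v in ch] for c, ch in enumerate(feature_maps)]
```

Spatial attention is analogous but pools across channels to produce one gate per spatial location instead of one per channel.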
Affiliation(s)
- Kasun Hettihewa
- Perceptual Intelligent Computing Laboratory, Department of Computer Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok, 10330, Thailand
- Natthaporn Tanpowpong
- Department of Radiology, Faculty of Medicine, Chulalongkorn University, Bangkok, 10330, Thailand
- Thanarat H Chalidabhongse
- Perceptual Intelligent Computing Laboratory, Department of Computer Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok, 10330, Thailand
- Applied Digital Technology in Medicine (ATM) Research Group, Faculty of Engineering, Chulalongkorn University, Bangkok, 10330, Thailand
38
Chen Y, Yu L, Wang JY, Panjwani N, Obeid JP, Liu W, Liu L, Kovalchuk N, Gensheimer MF, Vitzthum LK, Beadle BM, Chang DT, Le QT, Han B, Xing L. Adaptive Region-Specific Loss for Improved Medical Image Segmentation. IEEE Trans Pattern Anal Mach Intell 2023; 45:13408-13421. [PMID: 37363838 PMCID: PMC11346301 DOI: 10.1109/tpami.2023.3289667] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/28/2023]
Abstract
Defining the loss function is an important part of neural network design and critically determines the success of deep learning modeling. A significant shortcoming of conventional loss functions is that they weight all regions in the input image volume equally, despite the fact that the system is known to be heterogeneous (i.e., some regions achieve high prediction performance more easily than others). Here, we introduce a region-specific loss to lift the implicit assumption of homogeneous weighting for better learning. We divide the entire volume into multiple sub-regions, each with an individualized loss constructed for optimal local performance. Effectively, this scheme imposes higher weightings on the sub-regions that are more difficult to segment, and vice versa. Furthermore, the regional false-positive and false-negative errors are computed for each input image during a training step, and the regional penalty is adjusted accordingly to enhance the overall accuracy of the prediction. Using different public and in-house medical image datasets, we demonstrate that the proposed regionally adaptive loss paradigm outperforms conventional methods in multi-organ segmentation, without any modification to the neural network architecture or additional data preparation.
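As a toy rendering of the regional weighting principle: split a flat binary volume into equal sub-regions, measure each region's false-positive-plus-false-negative rate, and weight each region's error by its share of the total. This sketch illustrates only the idea of up-weighting poorly segmented regions, not the authors' actual loss:

```python
def region_specific_loss(pred, target, num_regions=4):
    """Toy regionally adaptive loss on flat binary predictions and targets.

    Harder regions (larger FP + FN rate) receive proportionally larger
    weights, so the total loss emphasizes poorly segmented sub-regions.
    """
    n = len(pred)
    step = max(1, n // num_regions)
    regions = [(pred[i:i + step], target[i:i + step]) for i in range(0, n, step)]
    errors = []
    for p, t in regions:
        fp = sum(1 for pi, ti in zip(p, t) if pi == 1 and ti == 0)
        fn = sum(1 for pi, ti in zip(p, t) if pi == 0 and ti == 1)
        errors.append((fp + fn) / len(p))
    total_err = sum(errors)
    if total_err == 0:
        return 0.0
    # Each region's weight is its share of the total error.
    return sum((e / total_err) * e for e in errors)
```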
39
Tian Y, Zhang Z, Zhao B, Liu L, Liu X, Feng Y, Tian J, Kou D. Coarse-to-fine prior-guided attention network for multi-structure segmentation on dental panoramic radiographs. Phys Med Biol 2023; 68:215010. [PMID: 37816372 DOI: 10.1088/1361-6560/ad0218] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/29/2023] [Accepted: 10/10/2023] [Indexed: 10/12/2023]
Abstract
Objective. Accurate segmentation of various anatomical structures from dental panoramic radiographs is essential for the diagnosis and treatment planning of various diseases in digital dentistry. In this paper, we propose a novel deep learning-based method for accurate and fully automatic segmentation of the maxillary sinus, mandibular condyle, mandibular nerve, alveolar bone and teeth on panoramic radiographs. Approach. A two-stage coarse-to-fine prior-guided segmentation framework is proposed to segment multiple structures on dental panoramic radiographs. In the coarse stage, a multi-label segmentation network is used to generate the coarse segmentation mask, and in the fine-tuning stage, a prior-guided attention network with an encoder-decoder architecture is proposed to precisely predict the mask of each anatomical structure. First, a prior-guided edge fusion module is incorporated into the network at the input of each convolution level of the encoder path to generate edge-enhanced image feature maps. Second, a prior-guided spatial attention module is proposed to guide the network to extract relevant spatial features from foreground regions based on the combination of the prior information and the spatial attention mechanism. Finally, a prior-guided hybrid attention module is integrated at the bottleneck of the network to explore global context from both spatial and category perspectives. Main results. We evaluated the segmentation performance of our method on a testing dataset of 150 panoramic radiographs collected from real-world clinical scenarios. The results indicate that our proposed method achieves more accurate segmentation than state-of-the-art methods: the average Jaccard scores are 87.91%, 85.25%, 63.94%, 93.46% and 88.96% for the maxillary sinus, mandibular condyle, mandibular nerve, alveolar bone and teeth, respectively. Significance. The proposed method accurately segments multiple structures on panoramic radiographs and has the potential to be part of automatic pathology diagnosis from dental panoramic radiographs.
Affiliation(s)
- Yuan Tian
- Angelalign Inc. No. 500 Zhengli Road, Yangpu District, Shanghai, People's Republic of China
- Zhejia Zhang
- Angelalign Inc. No. 500 Zhengli Road, Yangpu District, Shanghai, People's Republic of China
- Bailiang Zhao
- Angelalign Inc. No. 500 Zhengli Road, Yangpu District, Shanghai, People's Republic of China
- Lichao Liu
- Angelalign Inc. No. 500 Zhengli Road, Yangpu District, Shanghai, People's Republic of China
- Xiaolin Liu
- Angelalign Inc. No. 500 Zhengli Road, Yangpu District, Shanghai, People's Republic of China
- Yang Feng
- Angelalign Inc. No. 500 Zhengli Road, Yangpu District, Shanghai, People's Republic of China
- Jie Tian
- Angelalign Inc. No. 500 Zhengli Road, Yangpu District, Shanghai, People's Republic of China
- Dazhi Kou
- Shanghai Supercomputer Center. No. 585 Guoshoujing Road, Pudong New District, Shanghai, People's Republic of China
40
Zou Y, Ge Y, Zhao L, Li W. MR-Trans: MultiResolution Transformer for medical image segmentation. Comput Biol Med 2023; 165:107456. [PMID: 37696179 DOI: 10.1016/j.compbiomed.2023.107456] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/30/2023] [Revised: 07/31/2023] [Accepted: 09/04/2023] [Indexed: 09/13/2023]
Abstract
In recent years, transformer-based methods such as TransUNet and SwinUNet have been successfully applied in medical image segmentation research. However, these are all high-to-low-resolution networks that recover high-resolution feature representations from low-resolution ones, a structure that loses low-level semantic information in the encoder stage. In this paper, we propose a new framework named MR-Trans that maintains high-resolution and low-resolution feature representations simultaneously. MR-Trans consists of three modules: a branch partition module, an encoder module, and a decoder module. We construct multi-resolution branches with different resolutions in the branch partition stage. In the encoder module, we adopt the Swin Transformer method to extract long-range dependencies on each branch and propose a new feature fusion strategy to fuse features of different scales between branches. A novel decoder network is proposed for MR-Trans by combining PSPNet and FPNet to improve recognition ability at different scales. Extensive experiments on two different datasets demonstrate that our method achieves better performance than previous state-of-the-art methods for medical image segmentation.
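MR-Trans first partitions the input into branches of different resolutions. One simple way to realize such a partition, used here purely as an illustration (a stand-in for the paper's branch partition module, whose exact construction is not specified in the abstract), is repeated stride-2 average pooling:

```python
def build_resolution_branches(feature, num_branches=3):
    """Create progressively lower-resolution copies of a 1D feature vector
    by average-pooling with stride 2, mimicking a branch partition stage."""
    branches = [feature]
    for _ in range(num_branches - 1):
        prev = branches[-1]
        # Average adjacent pairs: halves the resolution at each step.
        branches.append([(prev[i] + prev[i + 1]) / 2.0
                         for i in range(0, len(prev) - 1, 2)])
    return branches
```

Keeping all branches alive through the encoder, rather than discarding the high-resolution one, is the design point the abstract emphasizes.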
Affiliation(s)
- Yibo Zou
- School of Information, Shanghai Ocean University, Shanghai, 201306, China
- Yan Ge
- School of Information, Shanghai Ocean University, Shanghai, 201306, China
- Linlin Zhao
- School of Information, Shanghai Ocean University, Shanghai, 201306, China
- Wei Li
- Department of Pediatrics, Tongji Hospital, Tongji University School of Medicine, Shanghai, 200065, China
41
Shen L, Wang Q, Zhang Y, Qin F, Jin H, Zhao W. DSKCA-UNet: Dynamic selective kernel channel attention for medical image segmentation. Medicine (Baltimore) 2023; 102:e35328. [PMID: 37773842 PMCID: PMC10545043 DOI: 10.1097/md.0000000000035328] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 04/06/2023] [Accepted: 08/31/2023] [Indexed: 10/01/2023] Open
Abstract
U-Net has attained immense popularity owing to its performance in medical image segmentation; however, it cannot explicitly model long-range dependencies. By contrast, the transformer can effectively capture such dependencies by leveraging the self-attention (SA) of the encoder. Although SA, an important characteristic of the transformer, can find correlations within the input data, its quadratic computational complexity can retard the processing of high-dimensional data such as medical images. Furthermore, SA is limited because it overlooks correlations between samples, leaving considerable scope for improvement. To this end, based on Swin-UNet, we introduce a dynamic selective attention mechanism for the convolution kernels: the weight of each convolution kernel is calculated, and the results are fused dynamically. This attention mechanism permits each neuron to adaptively modify its receptive field size in response to multiscale input information. A local cross-channel interaction strategy without dimensionality reduction is introduced, which effectively eliminates the influence of downscaling on learning channel attention; through suitable cross-channel interactions, model complexity can be significantly reduced while maintaining performance. Subsequently, the global interaction between encoder features is used to extract more fine-grained features. Simultaneously, a mixed loss function of weighted cross-entropy loss and Dice loss alleviates category imbalance and achieves better results when sample numbers are unbalanced. We evaluated the proposed method on abdominal multi-organ and cardiac segmentation datasets, achieving a Dice similarity coefficient of 80.30% and a 95% Hausdorff distance of 14.55 on the Synapse dataset, and a Dice similarity coefficient of 90.80% on the ACDC dataset.
The experimental results show that our proposed method has good generalization ability and robustness, and it is a powerful tool for medical image segmentation.
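The dynamic selective-kernel mechanism described above computes a weight per convolution kernel and fuses the branch outputs accordingly. A sketch in which the softmax weights are derived from each branch's mean activation rather than a learned attention vector (an assumption for illustration):

```python
import math

def selective_kernel_fuse(branch_outputs):
    """Fuse outputs of convolution branches with different kernel sizes.

    Each branch gets a softmax weight computed here from its mean
    activation (a stand-in for the learned attention of the real module);
    the fused output is the weighted sum of the branches.
    """
    means = [sum(b) / len(b) for b in branch_outputs]
    # Numerically stable softmax over branch statistics.
    m = max(means)
    exps = [math.exp(x - m) for x in means]
    z = sum(exps)
    weights = [e / z for e in exps]
    length = len(branch_outputs[0])
    fused = [sum(w * b[i] for w, b in zip(weights, branch_outputs))
             for i in range(length)]
    return weights, fused
```

Because the weights depend on the input, each neuron's effective receptive field adapts dynamically, which is the property the abstract highlights.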
Affiliation(s)
- Longfeng Shen
- Anhui Engineering Research Center for Intelligent Computing and Application on Cognitive Behavior (ICACB), College of Computer Science and Technology, Huaibei Normal University, Huaibei, China
- Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China
- Anhui Big-Data Research Center on University Management, Huaibei, China
- Qiong Wang
- Anhui Engineering Research Center for Intelligent Computing and Application on Cognitive Behavior (ICACB), College of Computer Science and Technology, Huaibei Normal University, Huaibei, China
- Anhui Big-Data Research Center on University Management, Huaibei, China
- Yingjie Zhang
- Anhui Engineering Research Center for Intelligent Computing and Application on Cognitive Behavior (ICACB), College of Computer Science and Technology, Huaibei Normal University, Huaibei, China
- Anhui Big-Data Research Center on University Management, Huaibei, China
- Fenglan Qin
- Anhui Engineering Research Center for Intelligent Computing and Application on Cognitive Behavior (ICACB), College of Computer Science and Technology, Huaibei Normal University, Huaibei, China
- Anhui Big-Data Research Center on University Management, Huaibei, China
- Hengjun Jin
- People’s Hospital of Huaibei City, Huaibei, China
- Wei Zhao
- People’s Hospital of Huaibei City, Huaibei, China
42
Ma J, Yuan G, Guo C, Gang X, Zheng M. SW-UNet: a U-Net fusing sliding window transformer block with CNN for segmentation of lung nodules. Front Med (Lausanne) 2023; 10:1273441. [PMID: 37841008 PMCID: PMC10569032 DOI: 10.3389/fmed.2023.1273441] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/06/2023] [Accepted: 09/12/2023] [Indexed: 10/17/2023] Open
Abstract
Medical images are information carriers that visually reflect and record the anatomical structure of the human body, and they play an important role in clinical diagnosis, teaching, and research. Modern medicine has become increasingly inseparable from the intelligent processing of medical images. In recent years, there have been growing attempts to apply deep learning theory to medical image segmentation tasks, making it imperative to explore simple and efficient deep learning algorithms for this purpose. In this paper, we investigate the segmentation of lung nodule images. Addressing the problems of existing medical image segmentation algorithms, we study image fusion based on a hybrid channel-space attention mechanism and segmentation based on a hybrid architecture of convolutional neural networks (CNN) and Vision Transformer (ViT). To address the difficulty such algorithms have in capturing long-range feature dependencies, we propose SW-UNet, a medical image segmentation model based on a hybrid CNN and ViT framework. The self-attention mechanism and sliding-window design of the Vision Transformer are used to capture global feature associations and break the receptive field limitation that convolutional operations inherit from their inductive bias. At the same time, a widened self-attention vector is used to streamline the number of modules and compress the model size to suit the small amounts of data typical in medicine, which would otherwise make the model prone to overfitting. Experiments on the LUNA16 lung nodule image dataset validate the algorithm and show that the proposed network achieves efficient medical image segmentation at a lightweight scale. In addition, to validate the transferability of the model, we performed additional validation on other tumor datasets with desirable results.
Our research addresses the crucial need for improved medical image segmentation algorithms. By introducing the SW-UNet model, which combines CNN and ViT, we successfully capture long-range feature dependencies and break the receptive field limitations of traditional convolutional operations. This approach not only enhances the efficiency of medical image segmentation but also maintains model scalability and adaptability to small medical datasets. The positive outcomes on various tumor datasets emphasize the potential transferability and broad applicability of our proposed model in medical image analysis.
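The sliding-window design referenced above follows the Swin-style pattern: restrict self-attention to fixed-size windows, then shift the windows so information crosses window boundaries in alternating blocks. A 1D sketch of the two bookkeeping steps (illustrative only; the real windows operate on 2D feature maps):

```python
def window_partition(tokens, window_size):
    """Split a 1D token sequence into non-overlapping attention windows.

    Attention is then computed independently inside each window, cutting
    the quadratic cost of full self-attention over the whole sequence.
    """
    return [tokens[i:i + window_size] for i in range(0, len(tokens), window_size)]

def shift_windows(tokens, shift):
    """Cyclically shift the sequence so the next round of window attention
    mixes information across the previous window boundaries."""
    return tokens[shift:] + tokens[:shift]
```

Alternating plain and shifted window partitions is what lets locally windowed attention still propagate information globally across layers.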
Affiliation(s)
- Jiajun Ma
- Shenhua Hollysys Information Technology Co., Ltd., Beijing, China
- Gang Yuan
- The First Affiliated Hospital of Dalian Medical University, Dalian, China
- Chenhua Guo
- School of Software, North University of China, Taiyuan, China
- Minting Zheng
- The First Affiliated Hospital of Dalian Medical University, Dalian, China
43
Yang L, Zhai C, Liu Y, Yu H. CFHA-Net: A polyp segmentation method with cross-scale fusion strategy and hybrid attention. Comput Biol Med 2023; 164:107301. [PMID: 37573723 DOI: 10.1016/j.compbiomed.2023.107301] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/20/2023] [Revised: 07/10/2023] [Accepted: 07/28/2023] [Indexed: 08/15/2023]
Abstract
Colorectal cancer is a prevalent disease in modern times, with most cases arising from polyps. The segmentation of polyps has therefore garnered significant attention in the field of medical image segmentation. In recent years, variant networks derived from the U-Net architecture have demonstrated good performance on polyp segmentation challenges. In this paper, a polyp segmentation model, called CFHA-Net, is proposed that combines a cross-scale feature fusion strategy with a hybrid attention mechanism. Inspired by feature learning, the encoder unit incorporates a cross-scale context fusion (CCF) module that performs cross-layer feature fusion and enhances feature information at different scales. The skip connections are optimized by the proposed triple hybrid attention (THA) module, which aggregates spatial and channel attention features from three directions to improve long-range dependence between features and help identify polyp lesion boundaries. Additionally, a dense-receptive feature fusion (DFF) module, which combines dense connections with multi-receptive-field fusion, is added at the bottleneck layer to capture more comprehensive context information. Furthermore, a hybrid pooling (HP) module and a hybrid upsampling (HU) module are proposed to help the segmentation network acquire more contextual features. A series of experiments on three typical polyp segmentation datasets (CVC-ClinicDB, Kvasir-SEG, EndoTect) evaluates the effectiveness and generalization of the proposed CFHA-Net. The experimental results demonstrate the validity and generalization of the proposed method, with many performance metrics surpassing those of related advanced segmentation networks. The proposed CFHA-Net could therefore offer a promising solution to the challenges of polyp segmentation in medical image analysis. The source code is available at https://github.com/CXzhai/CFHA-Net.git.
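The hybrid pooling (HP) module mentioned above is described only at a high level in the abstract. A plausible minimal reading is a blend of the two classic pooling operators; the fixed mixing weight `alpha` below is our assumption, not a detail from the paper.

```python
import numpy as np

def hybrid_pool2d(x, k=2, alpha=0.5):
    """Hypothetical hybrid pooling: a blend of max- and average-pooling.

    x: (H, W) feature map with H and W divisible by k. Max pooling keeps
    sharp activations; average pooling keeps context. The blend weight
    `alpha` is illustrative only.
    """
    H, W = x.shape
    blocks = x.reshape(H // k, k, W // k, k)  # split into k-by-k tiles
    mx = blocks.max(axis=(1, 3))              # max pooling per tile
    av = blocks.mean(axis=(1, 3))             # average pooling per tile
    return alpha * mx + (1 - alpha) * av
```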
Affiliation(s)
- Lei Yang
- School of Electrical and Information Engineering, Zhengzhou University, Henan Province, 450001, China; Robot Perception and Control Engineering Laboratory of Henan Province, 450001, China
- Chenxu Zhai
- School of Electrical and Information Engineering, Zhengzhou University, Henan Province, 450001, China; Robot Perception and Control Engineering Laboratory of Henan Province, 450001, China
- Yanhong Liu
- School of Electrical and Information Engineering, Zhengzhou University, Henan Province, 450001, China; Robot Perception and Control Engineering Laboratory of Henan Province, 450001, China
- Hongnian Yu
- School of Electrical and Information Engineering, Zhengzhou University, Henan Province, 450001, China; Built Environment, Edinburgh Napier University, Edinburgh EH10 5DT, UK
44
Tu DY, Lin PC, Chou HH, Shen MR, Hsieh SY. Slice-Fusion: Reducing False Positives in Liver Tumor Detection for Mask R-CNN. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:3267-3277. [PMID: 37027274 DOI: 10.1109/tcbb.2023.3265394] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/19/2023]
Abstract
Automatic liver tumor detection from computed tomography (CT) makes clinical examinations more accurate. However, deep learning-based detection algorithms are characterized by high sensitivity and low precision, which hinders diagnosis because false-positive tumors must first be identified and excluded. These false positives arise because detection models incorrectly identify partial volume artifacts as lesions, which in turn stems from an inability to learn the perihepatic structure from a global perspective. To overcome this limitation, we propose a novel slice-fusion method that mines the global structural relationship between the tissues in the target CT slices and fuses the features of adjacent slices according to the importance of the tissues. Furthermore, we design a new network, Pinpoint-Net, based on our slice-fusion method and the Mask R-CNN detection model. We evaluated the proposed model on the Liver Tumor Segmentation Challenge (LiTS) dataset and our liver metastases dataset. Experiments demonstrated that our slice-fusion method not only enhances tumor detection by reducing the number of false-positive tumors smaller than 10 mm but also improves segmentation performance. Without bells and whistles, a single Pinpoint-Net showed outstanding performance in liver tumor detection and segmentation on the LiTS test dataset compared with other state-of-the-art models.
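The fusion of adjacent slices can be sketched in its simplest form. The actual method weights features by tissue importance inside the network; the fixed scalar weights here are purely illustrative, not the paper's learned weighting.

```python
import numpy as np

def fuse_adjacent_slices(volume, weights=(0.25, 0.5, 0.25)):
    """Toy fusion of each CT slice with its two neighbours.

    volume: (n_slices, H, W) array. Each output slice is a weighted sum
    of the previous, current, and next slices; edge slices are clamped.
    """
    w_prev, w_cur, w_next = weights
    n = volume.shape[0]
    fused = np.empty_like(volume, dtype=float)
    for i in range(n):
        prev_s = volume[max(i - 1, 0)]      # clamp at the first slice
        next_s = volume[min(i + 1, n - 1)]  # clamp at the last slice
        fused[i] = w_prev * prev_s + w_cur * volume[i] + w_next * next_s
    return fused
```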
45
肖 汉, 李 焕, 冉 智, 张 启, 张 勃, 韦 羽, 祝 秀. [Corona virus disease 2019 lesion segmentation network based on an adaptive joint loss function]. Sheng Wu Yi Xue Gong Cheng Xue Za Zhi (Journal of Biomedical Engineering) 2023; 40:743-752. [PMID: 37666765 PMCID: PMC10477394 DOI: 10.7507/1001-5515.202206051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Subscribe] [Scholar Register] [Received: 06/26/2022] [Revised: 05/30/2023] [Indexed: 09/06/2023]
Abstract
Corona virus disease 2019 (COVID-19) is an acute respiratory infectious disease that is highly contagious, highly variable, and has a long incubation period. The probability of misdiagnosis and missed diagnosis can be significantly decreased with automatic segmentation of COVID-19 lesions from computed tomography images, which helps doctors reach a rapid diagnosis and deliver precise treatment. To address difficulties such as the complex presentation of COVID-19 and blurred lesion boundaries that are challenging to segment, this paper introduced the level set generalized Dice loss function (LGDL), combining the level set segmentation method with a COVID-19 lesion segmentation network, and proposed a dual-path COVID-19 lesion segmentation network (Dual-SAUNet++). LGDL is an adaptive-weight joint loss obtained by combining the generalized Dice loss of the mask path with the mean squared error of the level set path. On the test set, the model achieved a Dice similarity coefficient of (87.81 ± 10.86)%, intersection over union of (79.20 ± 14.58)%, sensitivity of (94.18 ± 13.56)%, specificity of (99.83 ± 0.43)%, and Hausdorff distance of 18.29 ± 31.48 mm. The studies indicated that Dual-SAUNet++ has a strong anti-noise capability and can segment multi-scale lesions while attending to both their area and boundary information. The proposed method assists doctors in judging the severity of COVID-19 infection by accurately segmenting the lesions, and provides a reliable basis for subsequent clinical treatment.
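The generalized Dice loss named in this entry has a standard form that can be written down directly. The adaptive weighting of the paper's LGDL is not specified in the abstract, so `joint_loss` below mixes in the level-set path's mean squared error with a fixed weight `lam`, which is our simplification.

```python
import numpy as np

def generalized_dice_loss(pred, target, eps=1e-6):
    """Generalized Dice loss for soft segmentation maps.

    pred, target: (n_classes, n_pixels) arrays in [0, 1]. Each class is
    weighted by the inverse squared volume of its ground truth, the
    standard generalized-Dice weighting.
    """
    w = 1.0 / (target.sum(axis=1) ** 2 + eps)            # class weights
    inter = (w * (pred * target).sum(axis=1)).sum()      # weighted overlap
    union = (w * (pred + target).sum(axis=1)).sum()      # weighted totals
    return 1.0 - 2.0 * inter / (union + eps)

def joint_loss(pred, target, level_pred, level_target, lam=0.5):
    """Fixed-weight stand-in for the paper's adaptive joint loss."""
    mse = float(np.mean((level_pred - level_target) ** 2))
    return lam * generalized_dice_loss(pred, target) + (1 - lam) * mse
```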
Affiliation(s)
- 汉光 肖
- School of Artificial Intelligence, Chongqing University of Technology, Chongqing 401135, P. R. China
- 焕琪 李
- School of Artificial Intelligence, Chongqing University of Technology, Chongqing 401135, P. R. China
- 智强 冉
- School of Artificial Intelligence, Chongqing University of Technology, Chongqing 401135, P. R. China
- 启航 张
- School of Artificial Intelligence, Chongqing University of Technology, Chongqing 401135, P. R. China
- 勃龙 张
- School of Artificial Intelligence, Chongqing University of Technology, Chongqing 401135, P. R. China
- 羽佳 韦
- School of Artificial Intelligence, Chongqing University of Technology, Chongqing 401135, P. R. China
- 秀红 祝
- School of Artificial Intelligence, Chongqing University of Technology, Chongqing 401135, P. R. China
46
Wang Q, Xu L, Wang L, Yang X, Sun Y, Yang B, Greenwald SE. Automatic coronary artery segmentation of CCTA images using UNet with a local contextual transformer. Front Physiol 2023; 14:1138257. [PMID: 37675283 PMCID: PMC10478234 DOI: 10.3389/fphys.2023.1138257] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/05/2023] [Accepted: 08/01/2023] [Indexed: 09/08/2023] Open
Abstract
Coronary artery segmentation is an essential procedure in the computer-aided diagnosis of coronary artery disease. It aims to identify and segment the regions of interest in the coronary circulation for further processing and diagnosis. Currently, automatic segmentation of coronary arteries is often unreliable because of their small size and the poor distribution of contrast medium, problems that lead to over-segmentation or omission. To improve the performance of convolutional-neural-network (CNN) based coronary artery segmentation, we propose a novel automatic method, DR-LCT-UNet, with two innovative components: the Dense Residual (DR) module and the Local Contextual Transformer (LCT) module. The DR module aims to preserve unobtrusive features through dense residual connections, while the LCT module is an improved Transformer that focuses on local contextual information, so that coronary-artery-related information can be better exploited. The LCT and DR modules are effectively integrated into the skip connections and the encoder-decoder of the 3D segmentation network, respectively. Experiments on our CorArtTS2020 dataset show that the Dice similarity coefficient (DSC), recall, and precision of the proposed method reached 85.8%, 86.3%, and 85.8%, respectively, outperforming 3D-UNet (taken as the reference among the six other chosen comparison methods) by 2.1%, 1.9%, and 2.1%.
Affiliation(s)
- Qianjin Wang
- School of Computer Science and Engineering, Northeastern University, Shenyang, China
- Lisheng Xu
- College of Medicine and Biological and Information Engineering, Northeastern University, Shenyang, China
- Lu Wang
- School of Computer Science and Engineering, Northeastern University, Shenyang, China
- Xiaofan Yang
- School of Computer Science and Engineering, Northeastern University, Shenyang, China
- Yu Sun
- College of Medicine and Biological and Information Engineering, Northeastern University, Shenyang, China
- Department of Radiology, General Hospital of Northern Theater Command, Shenyang, China
- Key Laboratory of Cardiovascular Imaging and Research of Liaoning Province, Shenyang, China
- Benqiang Yang
- Department of Radiology, General Hospital of Northern Theater Command, Shenyang, China
- Key Laboratory of Cardiovascular Imaging and Research of Liaoning Province, Shenyang, China
- Stephen E. Greenwald
- Blizard Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, United Kingdom
47
Shamshad F, Khan S, Zamir SW, Khan MH, Hayat M, Khan FS, Fu H. Transformers in medical imaging: A survey. Med Image Anal 2023; 88:102802. [PMID: 37315483 DOI: 10.1016/j.media.2023.102802] [Citation(s) in RCA: 186] [Impact Index Per Article: 93.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/03/2022] [Revised: 03/11/2023] [Accepted: 03/23/2023] [Indexed: 06/16/2023]
Abstract
Following unprecedented success on natural language tasks, Transformers have been successfully applied to several computer vision problems, achieving state-of-the-art results and prompting researchers to reconsider the supremacy of convolutional neural networks (CNNs) as de facto operators. Capitalizing on these advances in computer vision, the medical imaging field has also witnessed growing interest in Transformers, which can capture global context, in contrast to CNNs with their local receptive fields. Inspired by this transition, in this survey we attempt to provide a comprehensive review of the applications of Transformers in medical imaging, covering various aspects ranging from recently proposed architectural designs to unsolved issues. Specifically, we survey the use of Transformers in medical image segmentation, detection, classification, restoration, synthesis, registration, clinical report generation, and other tasks. In particular, for each of these applications, we develop a taxonomy, identify application-specific challenges, provide insights to solve them, and highlight recent trends. Further, we provide a critical discussion of the field's current state as a whole, identifying key challenges and open problems and outlining promising future directions. We hope this survey will ignite further interest in the community and provide researchers with an up-to-date reference regarding applications of Transformer models in medical imaging. Finally, to cope with the rapid development in this field, we intend to regularly update the relevant latest papers and their open-source implementations at https://github.com/fahadshamshad/awesome-transformers-in-medical-imaging.
Affiliation(s)
- Fahad Shamshad
- MBZ University of Artificial Intelligence, Abu Dhabi, United Arab Emirates
- Salman Khan
- MBZ University of Artificial Intelligence, Abu Dhabi, United Arab Emirates; CECS, Australian National University, Canberra ACT 0200, Australia
- Syed Waqas Zamir
- Inception Institute of Artificial Intelligence, Abu Dhabi, United Arab Emirates
- Munawar Hayat
- Faculty of IT, Monash University, Clayton VIC 3800, Australia
- Fahad Shahbaz Khan
- MBZ University of Artificial Intelligence, Abu Dhabi, United Arab Emirates; Computer Vision Laboratory, Linköping University, Sweden
- Huazhu Fu
- Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), Singapore
48
Wu S, Yu H, Li C, Zheng R, Xia X, Wang C, Wang H. A Coarse-to-Fine Fusion Network for Small Liver Tumor Detection and Segmentation: A Real-World Study. Diagnostics (Basel) 2023; 13:2504. [PMID: 37568868 PMCID: PMC10417427 DOI: 10.3390/diagnostics13152504] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/19/2023] [Revised: 07/14/2023] [Accepted: 07/18/2023] [Indexed: 08/13/2023] Open
Abstract
Liver tumor semantic segmentation is a crucial task in medical image analysis that requires multiple MRI modalities. This paper proposes a novel coarse-to-fine fusion segmentation approach to detect and segment small liver tumors of various sizes. To enhance the segmentation accuracy of small liver tumors, the method incorporates a detection module and a CSR (convolution-SE-residual) module, which includes a convolution block, an SE (squeeze and excitation) module, and a residual module for fine segmentation. The proposed method demonstrates superior performance compared to conventional single-stage end-to-end networks. A private liver MRI dataset comprising 218 patients with a total of 3605 tumors, including 3273 tumors smaller than 3.0 cm, was collected to evaluate the proposed method. Five types of liver tumors are identified in this dataset: hepatocellular carcinoma (HCC), liver metastases, cholangiocarcinoma (ICC), hepatic cysts, and liver hemangiomas. The results indicate that the proposed method outperforms the single segmentation networks 3D UNet and nnU-Net as well as the fusion networks of 3D UNet and nnU-Net with nnDetection. The proposed architecture was evaluated on a test set of 44 images, with an average Dice similarity coefficient (DSC) and recall of 86.9% and 86.7%, respectively, a 1% improvement over the comparison methods. More importantly, compared to existing methods, our proposed approach demonstrates state-of-the-art performance in segmenting objects smaller than 10 mm, achieving a Dice score of 85.3% and a malignancy detection rate of 87.5%.
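The SE (squeeze-and-excitation) module named in the CSR block is a well-known channel recalibration mechanism, which can be sketched as follows. The weights, shapes, and reduction ratio here are our assumptions for illustration, not details from the paper.

```python
import numpy as np

def se_block(features, w1, w2):
    """Minimal squeeze-and-excitation channel recalibration.

    features: (C, H, W) feature map; w1: (C//r, C) and w2: (C, C//r) are
    the two gate layers' weights (learned in a real network). Squeeze =
    global average pool per channel; excitation = ReLU then sigmoid gate;
    output = channel-wise rescaling of the input map.
    """
    z = features.mean(axis=(1, 2))          # squeeze: one scalar per channel
    h = np.maximum(w1 @ z, 0.0)             # bottleneck + ReLU
    s = 1.0 / (1.0 + np.exp(-(w2 @ h)))     # sigmoid gate in (0, 1)
    return features * s[:, None, None]      # recalibrated feature map
```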
Affiliation(s)
- Shu Wu
- Zhiyu Software Information Co., Ltd., Shanghai 200030, China
- Hang Yu
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai 200433, China
- Cuiping Li
- Zhiyu Software Information Co., Ltd., Shanghai 200030, China
- Rencheng Zheng
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai 200433, China
- Xueqin Xia
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai 200433, China
- Chengyan Wang
- Human Phenome Institute, Fudan University, Shanghai 200433, China
- He Wang
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai 200433, China
- Human Phenome Institute, Fudan University, Shanghai 200433, China
- Department of Neurology, Zhongshan Hospital, Fudan University, Shanghai 200032, China
- Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence (Fudan University), Ministry of Education, Shanghai 200433, China
49
Zhong H, Li A, Chen Y, Huang Q, Chen X, Kang J, You Y. Comparative analysis of automatic segmentation of esophageal cancer using 3D Res-UNet on conventional and 40-keV virtual mono-energetic CT Images: a retrospective study. PeerJ 2023; 11:e15707. [PMID: 37483982 PMCID: PMC10358343 DOI: 10.7717/peerj.15707] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/07/2023] [Accepted: 06/15/2023] [Indexed: 07/25/2023] Open
Abstract
Objectives To assess the performance of 3D Res-UNet for fully automated segmentation of esophageal cancer (EC) and compare the segmentation accuracy between conventional images (CI) and 40-keV virtual mono-energetic images (VMI40 kev). Methods Patients who underwent spectral CT scanning and were diagnosed with EC by surgery or gastroscopic biopsy in our hospital from 2019 to 2020 were analyzed retrospectively. All arterial-phase spectral base images were transferred to a dedicated workstation to generate VMI40 kev and CI. Segmentation models of EC were constructed with the 3D Res-UNet neural network for VMI40 kev and CI, respectively. After optimization training, the Dice similarity coefficient (DSC), intersection over union (IOU), average symmetric surface distance (ASSD), and 95% Hausdorff distance (HD_95) of EC at the pixel level were calculated on the test set. The paired rank sum test was used to compare the results for VMI40 kev and CI. Results A total of 160 patients were included in the analysis and randomly divided into the training dataset (104 patients), validation dataset (26 patients), and test dataset (30 patients). Using VMI40 kev as input data in the training dataset resulted in higher model performance on the test dataset than using CI as input data (DSC: 0.875 vs 0.859, IOU: 0.777 vs 0.755, ASSD: 0.911 vs 0.981, HD_95: 4.41 vs 6.23, all p-values <0.05). Conclusion Fully automated segmentation of EC with 3D Res-UNet has high accuracy and clinical feasibility for both CI and VMI40 kev. Compared with CI, VMI40 kev showed slightly higher accuracy on this test dataset.
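Two of the pixel-level metrics reported above, DSC and IOU, have simple closed forms for binary masks and can be sketched directly. HD_95 and ASSD require surface extraction and distance transforms, so they are omitted from this illustrative snippet.

```python
import numpy as np

def dice_and_iou(pred, gt):
    """Dice similarity coefficient and intersection-over-union for two
    binary segmentation masks of the same shape."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()           # overlapping pixels
    dice = 2.0 * inter / (pred.sum() + gt.sum())     # DSC = 2|A∩B| / (|A|+|B|)
    iou = inter / np.logical_or(pred, gt).sum()      # IOU = |A∩B| / |A∪B|
    return float(dice), float(iou)
```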
Affiliation(s)
- Hua Zhong
- Department of Radiology, Zhong Shan Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen, Fujian, China
- Anqi Li
- Department of Radiology, Zhong Shan Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen, Fujian, China
- Yingdong Chen
- Department of Radiology, Zhong Shan Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen, Fujian, China
- Qianwen Huang
- Department of Radiology, Zhong Shan Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen, Fujian, China
- Xingbiao Chen
- Clinical Science, Philips Healthcare, Shanghai, China
- Jianghe Kang
- Department of Radiology, Zhong Shan Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen, Fujian, China
- Youkuang You
- Department of Radiology, Xiamen Xianyue Hospital, Xiamen, Fujian, China
50
Ma R, Hao L, Tao Y, Mendoza X, Khodeiry M, Liu Y, Shyu ML, Lee RK. RGC-Net: An Automatic Reconstruction and Quantification Algorithm for Retinal Ganglion Cells Based on Deep Learning. Transl Vis Sci Technol 2023; 12:7. [PMID: 37140906 PMCID: PMC10166122 DOI: 10.1167/tvst.12.5.7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/01/2022] [Accepted: 03/31/2023] [Indexed: 05/05/2023] Open
Abstract
Purpose The purpose of this study was to develop a deep learning-based, fully automated reconstruction and quantification algorithm that delineates the neurites and somas of retinal ganglion cells (RGCs). Methods We trained a deep learning-based multi-task image segmentation model, RGC-Net, that automatically segments the neurites and somas in RGC images. A total of 166 RGC scans with manual annotations from human experts were used to develop this model: 132 scans were used for training, and the remaining 34 scans were reserved as testing data. Post-processing techniques removed speckles and dead cells from the soma segmentation results to further improve the robustness of the model. Quantification analyses were also conducted to compare five different metrics obtained from our automated algorithm and from the manual annotations. Results Quantitatively, our segmentation model achieves average foreground accuracy, background accuracy, overall accuracy, and Dice similarity coefficient of 0.692, 0.999, 0.997, and 0.691 for the neurite segmentation task, and 0.865, 0.999, 0.997, and 0.850 for the soma segmentation task, respectively. Conclusions The experimental results demonstrate that RGC-Net can accurately and reliably reconstruct neurites and somas in RGC images. We also demonstrate that our algorithm is comparable to manual annotation by human experts in the quantification analyses. Translational Relevance Our deep learning model provides a new tool that can trace and analyze RGC neurites and somas efficiently, faster than manual analysis.
Affiliation(s)
- Rui Ma
- Department of Electrical and Computer Engineering, University of Miami, Coral Gables, FL, USA
- Lili Hao
- Bascom Palmer Eye Institute, University of Miami Miller School of Medicine, Miami, FL, USA
- Department of Ophthalmology, The First Affiliated Hospital of Jinan University, Guangzhou, China
- Yudong Tao
- Department of Electrical and Computer Engineering, University of Miami, Coral Gables, FL, USA
- Ximena Mendoza
- Bascom Palmer Eye Institute, University of Miami Miller School of Medicine, Miami, FL, USA
- Mohamed Khodeiry
- Bascom Palmer Eye Institute, University of Miami Miller School of Medicine, Miami, FL, USA
- Yuan Liu
- Bascom Palmer Eye Institute, University of Miami Miller School of Medicine, Miami, FL, USA
- Mei-Ling Shyu
- School of Science and Engineering, University of Missouri-Kansas City, Kansas City, MO, USA
- Richard K. Lee
- Department of Electrical and Computer Engineering, University of Miami, Coral Gables, FL, USA
- Bascom Palmer Eye Institute, University of Miami Miller School of Medicine, Miami, FL, USA