1
Yang X, Yang R, Liu X, Chen Z, Zheng Q. Recent Advances in Artificial Intelligence for Precision Diagnosis and Treatment of Bladder Cancer: A Review. Ann Surg Oncol 2025. PMID: 40221553; DOI: 10.1245/s10434-025-17228-6.
Abstract
BACKGROUND Bladder cancer is one of the top ten cancers globally, with its incidence steadily rising in China. Early detection and prognosis risk assessment play a crucial role in guiding subsequent treatment decisions for bladder cancer. However, traditional diagnostic methods such as bladder endoscopy, imaging, or pathology examinations heavily rely on the clinical expertise and experience of clinicians, exhibiting subjectivity and poor reproducibility. MATERIALS AND METHODS With the rise of artificial intelligence, novel approaches, particularly those employing deep learning technology, have shown significant advancements in clinical tasks related to bladder cancer, including tumor detection, molecular subtyping identification, tumor staging and grading, prognosis prediction, and recurrence assessment. RESULTS Artificial intelligence, with its robust data mining capabilities, enhances diagnostic efficiency and reproducibility when assisting clinicians in decision-making, thereby reducing the risks of misdiagnosis and underdiagnosis. This not only helps alleviate the current challenges of talent shortages and uneven distribution of medical resources but also fosters the development of precision medicine. CONCLUSIONS This study provides a comprehensive review of the latest research advances and prospects of artificial intelligence technology in the precise diagnosis and treatment of bladder cancer.
Affiliation(s)
- Xiangxiang Yang
  - Department of Urology, Renmin Hospital of Wuhan University, Wuhan, Hubei, People's Republic of China
  - Institute of Urologic Disease, Renmin Hospital of Wuhan University, Wuhan, Hubei, People's Republic of China
- Rui Yang
  - Department of Urology, Renmin Hospital of Wuhan University, Wuhan, Hubei, People's Republic of China
  - Institute of Urologic Disease, Renmin Hospital of Wuhan University, Wuhan, Hubei, People's Republic of China
- Xiuheng Liu
  - Department of Urology, Renmin Hospital of Wuhan University, Wuhan, Hubei, People's Republic of China
  - Institute of Urologic Disease, Renmin Hospital of Wuhan University, Wuhan, Hubei, People's Republic of China
- Zhiyuan Chen
  - Department of Urology, Renmin Hospital of Wuhan University, Wuhan, Hubei, People's Republic of China
  - Institute of Urologic Disease, Renmin Hospital of Wuhan University, Wuhan, Hubei, People's Republic of China
- Qingyuan Zheng
  - Department of Urology, Renmin Hospital of Wuhan University, Wuhan, Hubei, People's Republic of China
  - Institute of Urologic Disease, Renmin Hospital of Wuhan University, Wuhan, Hubei, People's Republic of China
2
Yao L, Xia Y, Chen Z, Li S, Yao J, Jin D, Liang Y, Lin J, Zhao B, Han C, Lu L, Zhang L, Liu Z, Chen X. A Colorectal Coordinate-Driven Method for Colorectum and Colorectal Cancer Segmentation in Conventional CT Scans. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:7395-7406. PMID: 38687670; DOI: 10.1109/tnnls.2024.3386610.
Abstract
Automated colorectal cancer (CRC) segmentation in medical imaging is the key to achieving automation of CRC detection, staging, and treatment response monitoring. Compared with magnetic resonance imaging (MRI) and computed tomography colonography (CTC), conventional computed tomography (CT) has enormous potential because of its broad implementation, superiority for the hollow viscera (colon), and convenience without needing bowel preparation. However, the segmentation of CRC in conventional CT is more challenging due to the difficulties presented by the unprepared bowel, such as distinguishing the colorectum from other structures with similar appearance and distinguishing the CRC from the contents of the colorectum. To tackle these challenges, we introduce DeepCRC-SL, the first automated segmentation algorithm for CRC and colorectum in conventional contrast-enhanced CT scans. We propose a topology-aware deep learning-based approach, which builds a novel 1-D colorectal coordinate system and encodes each voxel of the colorectum with a relative position along the coordinate system. We then induce an auxiliary regression task to predict the colorectal coordinate value of each voxel, aiming to integrate global topology into the segmentation network and thus improve the colorectum's continuity. Self-attention layers are utilized to capture global contexts for the coordinate regression task and enhance the ability to differentiate CRC and colorectum tissues. Moreover, a coordinate-driven self-learning (SL) strategy is introduced to leverage a large amount of unlabeled data to improve segmentation performance. We validate the proposed approach on a dataset including 227 labeled and 585 unlabeled CRC cases by fivefold cross-validation. Experimental results demonstrate that our method outperforms some recent related segmentation methods and achieves a segmentation accuracy (DSC) of 0.669 for CRC and 0.892 for the colorectum, reaching the performance (0.639 and 0.890, respectively) of a medical resident with two years of specialized CRC imaging fellowship.
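To make the coordinate idea concrete, here is a minimal PyTorch sketch of a joint objective that couples voxel-wise segmentation with regression of a normalized 1-D colorectal coordinate. The class count, loss weighting, and coordinate convention are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class CoordinateAuxiliaryLoss(nn.Module):
    """Joint loss: voxel-wise segmentation plus regression of a normalized
    1-D colorectal coordinate (assumed 0 at one end of the colorectum, 1 at the other)."""
    def __init__(self, coord_weight: float = 0.5):
        super().__init__()
        self.seg_loss = nn.CrossEntropyLoss()     # e.g., background / colorectum / CRC
        self.coord_loss = nn.SmoothL1Loss()
        self.coord_weight = coord_weight

    def forward(self, seg_logits, coord_pred, seg_target, coord_target, colorectum_mask):
        # seg_logits: (B, 3, D, H, W); coord_pred: (B, 1, D, H, W) in [0, 1]
        # seg_target, coord_target, colorectum_mask: (B, D, H, W)
        l_seg = self.seg_loss(seg_logits, seg_target)
        # Regress the coordinate only inside the colorectum, where it is defined
        # (assumes the mask is non-empty for the batch).
        mask = colorectum_mask.bool()
        l_coord = self.coord_loss(coord_pred.squeeze(1)[mask], coord_target[mask])
        return l_seg + self.coord_weight * l_coord
```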
3
Dautkulova A, Aider OA, Teulière C, Coste J, Chaix R, Ouachik O, Pereira B, Lemaire JJ. Automated segmentation of deep brain structures from Inversion-Recovery MRI. Comput Med Imaging Graph 2025; 120:102488. PMID: 39787737; DOI: 10.1016/j.compmedimag.2024.102488.
Abstract
Methods for the automated segmentation of brain structures are a major subject of medical research. The small structures of the deep brain have received scant attention, notably for lack of manual delineations by medical experts. In this study, we assessed an automated segmentation of a novel clinical dataset containing White Matter Attenuated Inversion-Recovery (WAIR) MRI images and five manually segmented structures (substantia nigra (SN), subthalamic nucleus (STN), red nucleus (RN), mammillary body (MB) and mammillothalamic fascicle (MT-fa)) in 53 patients with severe Parkinson's disease. T1 and DTI images were additionally used. We also assessed the reorientation of DTI diffusion vectors with reference to the ACPC line. A state-of-the-art nnU-Net method was trained and tested on subsets of 38 and 15 image datasets respectively. We used Dice similarity coefficient (DSC), 95% Hausdorff distance (95HD), and volumetric similarity (VS) as metrics to evaluate network efficiency in reproducing manual contouring. Random-effects models statistically compared values according to structures, accounting for between- and within-participant variability. Results show that WAIR significantly outperformed T1 for DSC (0.739 ± 0.073), 95HD (1.739 ± 0.398), and VS (0.892 ± 0.044). The DSC values for automated segmentation of MB, RN, SN, STN, and MT-fa decreased in that order, in line with the increasing complexity observed in manual segmentation. Based on training results, the reorientation of DTI vectors improved the automated segmentation.
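The evaluation metrics above are standard. For reference, a minimal NumPy sketch of the Dice similarity coefficient and volumetric similarity on binary masks (the 95% Hausdorff distance additionally requires surface-distance computation and is omitted here):

```python
import numpy as np

def dice(pred: np.ndarray, gt: np.ndarray) -> float:
    """Dice similarity coefficient (DSC) between two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    denom = pred.sum() + gt.sum()
    return 2.0 * inter / denom if denom else 1.0

def volumetric_similarity(pred: np.ndarray, gt: np.ndarray) -> float:
    """VS = 1 - |V_pred - V_gt| / (V_pred + V_gt), computed on voxel counts."""
    vp, vg = int(pred.astype(bool).sum()), int(gt.astype(bool).sum())
    return 1.0 - abs(vp - vg) / (vp + vg) if (vp + vg) else 1.0
```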
Affiliation(s)
- Aigerim Dautkulova
  - Université Clermont Auvergne, Clermont Auvergne INP, CNRS, Institut Pascal, F-63000 Clermont-Ferrand, France
- Omar Ait Aider
  - Université Clermont Auvergne, Clermont Auvergne INP, CNRS, Institut Pascal, F-63000 Clermont-Ferrand, France
- Céline Teulière
  - Université Clermont Auvergne, Clermont Auvergne INP, CNRS, Institut Pascal, F-63000 Clermont-Ferrand, France
- Jérôme Coste
  - Université Clermont Auvergne, Clermont Auvergne INP, CNRS, Institut Pascal, F-63000 Clermont-Ferrand, France; Université Clermont Auvergne, CNRS, CHU Clermont-Ferrand, Clermont Auvergne INP, Institut Pascal, F-63000 Clermont-Ferrand, France
- Rémi Chaix
  - Université Clermont Auvergne, Clermont Auvergne INP, CNRS, Institut Pascal, F-63000 Clermont-Ferrand, France; Université Clermont Auvergne, CNRS, CHU Clermont-Ferrand, Clermont Auvergne INP, Institut Pascal, F-63000 Clermont-Ferrand, France
- Omar Ouachik
  - Université Clermont Auvergne, Clermont Auvergne INP, CNRS, Institut Pascal, F-63000 Clermont-Ferrand, France
- Bruno Pereira
  - Direction de la Recherche et de l'Innovation, CHU Clermont-Ferrand, F-63000 Clermont-Ferrand, France
- Jean-Jacques Lemaire
  - Université Clermont Auvergne, Clermont Auvergne INP, CNRS, Institut Pascal, F-63000 Clermont-Ferrand, France; Université Clermont Auvergne, CNRS, CHU Clermont-Ferrand, Clermont Auvergne INP, Institut Pascal, F-63000 Clermont-Ferrand, France
4
Krishnan C, Onuoha E, Hung A, Sung KH, Kim H. Multi-attention Mechanism for Enhanced Pseudo-3D Prostate Zonal Segmentation. Journal of Imaging Informatics in Medicine 2025. PMID: 40021566; DOI: 10.1007/s10278-025-01401-0.
Abstract
This study presents a novel pseudo-3D Global-Local Channel Spatial Attention (GLCSA) mechanism designed to enhance prostate zonal segmentation in high-resolution T2-weighted MRI images. GLCSA captures complex, multi-dimensional features while maintaining computational efficiency by integrating global and local attention in channel and spatial domains, complemented by a slice interaction module simulating 3D processing. Applied across various U-Net architectures, GLCSA was evaluated on two datasets: a proprietary set of 44 patients and the public ProstateX dataset of 204 patients. Performance, measured using the Dice Similarity Coefficient (DSC) and Mean Surface Distance (MSD) metrics, demonstrated significant improvements in segmentation accuracy for both the transition zone (TZ) and peripheral zone (PZ), with minimal parameter increase (1.27%). GLCSA achieved DSC increases of 0.74% and 11.75% for TZ and PZ, respectively, in the proprietary dataset. In the ProstateX dataset, improvements were even more pronounced, with DSC increases of 7.34% for TZ and 24.80% for PZ. Comparative analysis showed GLCSA-UNet performing competitively against other 2D, 2.5D, and 3D models, with DSC values of 0.85 (TZ) and 0.65 (PZ) on the proprietary dataset and 0.80 (TZ) and 0.76 (PZ) on the ProstateX dataset. Similarly, MSD values were 1.14 (TZ) and 1.21 (PZ) on the proprietary dataset and 1.48 (TZ) and 0.98 (PZ) on the ProstateX dataset. Ablation studies highlighted the effectiveness of combining channel and spatial attention and the advantages of global embedding over patch-based methods. In conclusion, GLCSA offers a robust balance between the detailed feature capture of 3D models and the efficiency of 2D models, presenting a promising tool for improving prostate MRI image segmentation.
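As a rough illustration of combining channel and spatial attention (the actual GLCSA module is more elaborate, with separate global and local paths plus a slice-interaction component), a simplified PyTorch block might look like this:

```python
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    """Simplified channel + spatial attention block, in the spirit of (but not
    identical to) GLCSA; shapes and reduction ratio are assumptions."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x):
        x = x * self.channel_gate(x)                 # re-weight channels
        avg = x.mean(dim=1, keepdim=True)            # spatial descriptors
        mx, _ = x.max(dim=1, keepdim=True)
        return x * self.spatial_gate(torch.cat([avg, mx], dim=1))
```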
Affiliation(s)
- Chetana Krishnan
  - Department of Biomedical Engineering, The University of Alabama at Birmingham, Birmingham, AL, 35294, USA
- Ezinwanne Onuoha
  - Department of Biomedical Engineering, The University of Alabama at Birmingham, Birmingham, AL, 35294, USA
- Alex Hung
  - Department of Radiology, The University of California Los Angeles, Los Angeles, CA, 90404, USA
- Kyung Hyun Sung
  - Department of Radiology, The University of California Los Angeles, Los Angeles, CA, 90404, USA
- Harrison Kim
  - Department of Radiology, The University of Alabama at Birmingham, 1720 2nd Avenue South, VH G082, Birmingham, AL, 35294, USA
5
Xie J, Wei J, Shi H, Lin Z, Lu J, Zhang X, Wan C. A deep learning approach for early prediction of breast cancer neoadjuvant chemotherapy response on multistage bimodal ultrasound images. BMC Med Imaging 2025; 25:26. PMID: 39849366; PMCID: PMC11758756; DOI: 10.1186/s12880-024-01543-7.
Abstract
Neoadjuvant chemotherapy (NAC) is a systemic and systematic chemotherapy regimen for breast cancer patients before surgery. However, NAC is not effective for everyone, and the process is excruciating. Therefore, accurate early prediction of the efficacy of NAC is essential for the clinical diagnosis and treatment of patients. In this study, a novel convolutional neural network model with a bimodal layer-wise feature fusion module (BLFFM) and a temporal hybrid attention module (THAM) is proposed, which uses multistage bimodal ultrasound images as input for early prediction of the efficacy of neoadjuvant chemotherapy in locally advanced breast cancer (LABC) patients. The BLFFM can effectively mine the highly complex correlation and complementary feature information between gray-scale ultrasound (GUS) and color Doppler blood flow imaging (CDFI). The THAM is able to focus on key features of lesion progression before and after one cycle of NAC. The GUS and CDFI videos of 101 patients collected from cooperative medical institutions were preprocessed to obtain 3000 sets of multistage bimodal ultrasound image combinations for experiments. The experimental results show that the proposed model is effective and outperforms the compared models. The code will be published at https://github.com/jinzhuwei/BLTA-CNN.
Affiliation(s)
- Jiang Xie
  - School of Computer Engineering and Science, Shanghai University, Shanghai, 200444, China
- Jinzhu Wei
  - School of Medicine, Shanghai University, Shanghai, 200444, China
- Huachan Shi
  - School of Computer Engineering and Science, Shanghai University, Shanghai, 200444, China
- Zhe Lin
  - School of Computer Engineering and Science, Shanghai University, Shanghai, 200444, China
- Jinsong Lu
  - Department of Ultrasound, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200030, China
- Xueqing Zhang
  - Department of Pathology, Renji Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, 200030, China
- Caifeng Wan
  - Department of Ultrasound, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200030, China
  - Department of Breast Surgery, School of Medicine, Renji Hospital, Shanghai Jiao Tong University, Shanghai, 200030, China
6
He Y, Li B, He R, Fu G, Sun D, Shan D, Zhang Z. Adaptive fusion of dual-view for grading prostate cancer. Comput Med Imaging Graph 2025; 119:102479. PMID: 39708679; DOI: 10.1016/j.compmedimag.2024.102479.
Abstract
Accurate preoperative grading of prostate cancer is crucial for assisted diagnosis. Multi-parametric magnetic resonance imaging (MRI) is a commonly used non-invasive approach, however, the interpretation of MRI images is still subject to significant subjectivity due to variations in physicians' expertise and experience. To achieve accurate, non-invasive, and efficient grading of prostate cancer, this paper proposes a deep learning method that adaptively fuses dual-view MRI images. Specifically, a dual-view adaptive fusion model is designed. The model employs encoders to extract embedded features from two MRI sequences: T2-weighted imaging (T2WI) and apparent diffusion coefficient (ADC). The model reconstructs the original input images using the embedded features and adopts a cross-embedding fusion module to adaptively fuse the embedded features from the two views. Adaptive fusion refers to dynamically adjusting the fusion weights of the features from the two views according to different input samples, thereby fully utilizing complementary information. Furthermore, the model adaptively weights the prediction results from the two views based on uncertainty estimation, further enhancing the grading performance. To verify the importance of effective multi-view fusion for prostate cancer grading, extensive experiments are designed. The experiments evaluate the performance of single-view models, dual-view models, and state-of-the-art multi-view fusion algorithms. The results demonstrate that the proposed dual-view adaptive fusion method achieves the best grading performance, confirming its effectiveness for assisted grading diagnosis of prostate cancer. This study provides a novel deep learning solution for preoperative grading of prostate cancer, which has the potential to assist clinical physicians in making more accurate diagnostic decisions and has significant clinical application value.
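One common way to realize uncertainty-weighted fusion of two views is sketched below, using predictive entropy as the uncertainty proxy; the paper's uncertainty estimator and fusion weights may differ.

```python
import torch
import torch.nn.functional as F

def fuse_dual_view_logits(logits_t2w: torch.Tensor, logits_adc: torch.Tensor) -> torch.Tensor:
    """Fuse per-view class logits by down-weighting the more uncertain view.
    Uncertainty is approximated by predictive entropy (an assumption)."""
    def entropy(logits):
        p = F.softmax(logits, dim=1)
        return -(p * torch.log(p.clamp_min(1e-8))).sum(dim=1, keepdim=True)

    u1, u2 = entropy(logits_t2w), entropy(logits_adc)
    w1, w2 = 1.0 / (u1 + 1e-6), 1.0 / (u2 + 1e-6)
    probs = (w1 * F.softmax(logits_t2w, dim=1) +
             w2 * F.softmax(logits_adc, dim=1)) / (w1 + w2)
    return probs  # (B, num_grades) fused class probabilities
```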
Affiliation(s)
- Yaolin He
  - Department of Oncology, The Second Affiliated Hospital, Hengyang Medical School, University of South China, Hengyang, 421001, China
- Bowen Li
  - Department of Radiology, The Second Affiliated Hospital, Hengyang Medical School, University of South China, Hengyang, 421001, China
- Ruimin He
  - Department of Oncology, The Second Affiliated Hospital, Hengyang Medical School, University of South China, Hengyang, 421001, China
- Guangming Fu
  - Department of Oncology, The Second Xiangya Hospital of Central South University, Changsha, 410011, China
- Dan Sun
  - Department of Electrical & Systems Engineering, Washington University in St. Louis, St. Louis, MO 63112, USA
- Dongyong Shan
  - Department of Oncology, The Second Xiangya Hospital of Central South University, Changsha, 410011, China
- Zijian Zhang
  - National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, China; Department of Oncology, Xiangya Hospital, Central South University, Changsha, 410008, China
7
Shen C, Li W, Chen H, Wang X, Zhu F, Li Y, Wang X, Jin B. Complementary information mutual learning for multimodality medical image segmentation. Neural Netw 2024; 180:106670. PMID: 39299035; DOI: 10.1016/j.neunet.2024.106670.
Abstract
Radiologists must utilize medical images of multiple modalities for tumor segmentation and diagnosis due to the limitations of medical imaging technology and the diversity of tumor signals. This has led to the development of multimodal learning in medical image segmentation. However, the redundancy among modalities creates challenges for existing subtraction-based joint learning methods, such as misjudging the importance of modalities, ignoring specific modal information, and increasing cognitive load. These thorny issues ultimately decrease segmentation accuracy and increase the risk of overfitting. This paper presents the complementary information mutual learning (CIML) framework, which can mathematically model and address the negative impact of inter-modal redundant information. CIML adopts the idea of addition and removes inter-modal redundant information through inductive bias-driven task decomposition and message passing-based redundancy filtering. CIML first decomposes the multimodal segmentation task into multiple subtasks based on expert prior knowledge, minimizing the information dependence between modalities. Furthermore, CIML introduces a scheme in which each modality can extract information from other modalities additively through message passing. To achieve non-redundancy of extracted information, the redundant filtering is transformed into complementary information learning inspired by the variational information bottleneck. The complementary information learning procedure can be efficiently solved by variational inference and cross-modal spatial attention. Numerical results from the verification task and standard benchmarks indicate that CIML efficiently removes redundant information between modalities, outperforming SOTA methods regarding validation accuracy and segmentation effect. To emphasize, message-passing-based redundancy filtering allows neural network visualization techniques to visualize the knowledge relationship among different modalities, which reflects interpretability.
Affiliation(s)
- Chuyun Shen
  - School of Computer Science and Technology, East China Normal University, Shanghai 200062, China
- Wenhao Li
  - School of Data Science, The Chinese University of Hong Kong, Shenzhen; Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen 518172, China
- Haoqing Chen
  - School of Computer Science and Technology, East China Normal University, Shanghai 200062, China
- Xiaoling Wang
  - School of Computer Science and Technology, East China Normal University, Shanghai 200062, China
- Fengping Zhu
  - Huashan Hospital, Fudan University, Shanghai 200040, China
- Yuxin Li
  - Huashan Hospital, Fudan University, Shanghai 200040, China
- Xiangfeng Wang
  - School of Computer Science and Technology, East China Normal University, Shanghai 200062, China
- Bo Jin
  - School of Software Engineering, Shanghai Research Institute for Intelligent Autonomous Systems, Tongji University, Shanghai 200092, China
8
Su J, Luo Z, Wang C, Lian S, Lin X, Li S. Reconstruct incomplete relation for incomplete modality brain tumor segmentation. Neural Netw 2024; 180:106657. PMID: 39186839; DOI: 10.1016/j.neunet.2024.106657.
Abstract
Different brain tumor magnetic resonance imaging (MRI) modalities provide diverse tumor-specific information. Previous works have enhanced brain tumor segmentation performance by integrating multiple MRI modalities. However, multi-modal MRI data are often unavailable in clinical practice. An incomplete modality leads to missing tumor-specific information, which degrades the performance of existing models. Various strategies have been proposed to transfer knowledge from a full modality network (teacher) to an incomplete modality one (student) to address this issue. However, they neglect the fact that brain tumor segmentation is a structural prediction problem that requires voxel semantic relations. In this paper, we propose a Reconstruct Incomplete Relation Network (RIRN) that transfers voxel semantic relational knowledge from the teacher to the student. Specifically, we propose two types of voxel relations to incorporate structural knowledge: Class-relative relations (CRR) and Class-agnostic relations (CAR). The CRR groups voxels into different tumor regions and constructs a relation between them. The CAR builds a global relation between all voxel features, complementing the local inter-region relation. Moreover, we use adversarial learning to align the holistic structural prediction between the teacher and the student. Extensive experimentation on both the BraTS 2018 and BraTS 2020 datasets establishes that our method outperforms all state-of-the-art approaches.
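A simplified stand-in for relational knowledge transfer is to align pairwise voxel-feature similarity matrices between teacher and student. The sketch below is illustrative only; it omits the class-relative grouping and the adversarial alignment described above, and the subsampling size is an assumption.

```python
import torch
import torch.nn.functional as F

def class_agnostic_relation_loss(student_feat: torch.Tensor,
                                 teacher_feat: torch.Tensor,
                                 num_samples: int = 1024) -> torch.Tensor:
    """Align pairwise voxel-feature cosine-similarity matrices of student and
    teacher features (a simplified analogue of a class-agnostic relation)."""
    b, c = student_feat.shape[:2]
    s = student_feat.reshape(b, c, -1)                 # (B, C, N)
    t = teacher_feat.reshape(b, c, -1)
    idx = torch.randperm(s.shape[-1])[:num_samples]    # subsample voxels for tractability
    s, t = F.normalize(s[..., idx], dim=1), F.normalize(t[..., idx], dim=1)
    rel_s = torch.bmm(s.transpose(1, 2), s)            # (B, N, N) similarities
    rel_t = torch.bmm(t.transpose(1, 2), t)
    return F.mse_loss(rel_s, rel_t.detach())           # teacher provides the target relation
```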
Affiliation(s)
- Jiawei Su
  - School of Computer Engineering, Jimei University, Xiamen, China; The Department of Artificial Intelligence, Xiamen University, Fujian, China
- Zhiming Luo
  - The Department of Artificial Intelligence, Xiamen University, Fujian, China
- Chengji Wang
  - The School of Computer Science, Central China Normal University, Wuhan, China
- Sheng Lian
  - The College of Computer and Data Science, Fuzhou University, Fujian, China
- Xuejuan Lin
  - The Department of Traditional Chinese Medicine, Fujian University of Traditional Chinese Medicine, Fujian, China
- Shaozi Li
  - The Department of Artificial Intelligence, Xiamen University, Fujian, China
9
Gaillard L, Tjaberinga MC, Dremmen MHG, Mathijssen IMJ, Vrooman HA. Brain volume in infants with metopic synostosis: Less white matter volume with an accelerated growth pattern in early life. J Anat 2024; 245:894-902. PMID: 38417842; PMCID: PMC11547220; DOI: 10.1111/joa.14028.
Abstract
Metopic synostosis patients are at risk for neurodevelopmental disorders despite a negligible risk of intracranial hypertension. To gain insight into the underlying pathophysiology of metopic synostosis and associated neurodevelopmental disorders, we aimed to investigate brain volumes of non-syndromic metopic synostosis patients using preoperative MRI brain scans. MRI brain scans were processed with HyperDenseNet to calculate total intracranial volume (TIV), total brain volume (TBV), total grey matter volume (TGMV), total white matter volume (TWMV) and total cerebrospinal fluid volume (TCBFV). We compared global brain volumes of patients with controls corrected for age and sex using linear regression. Lobe-specific grey matter volumes were assessed in secondary analyses. We included 45 metopic synostosis patients and 14 controls (median age at MRI 0.56 years [IQR 0.36] and 1.1 years [IQR 0.47], respectively). We found no significant differences in TIV, TBV, TGMV or TCBFV in patients compared to controls. TWMV was significantly smaller in patients (-62,233 mm3 [95% CI = -96,968; -27,498], Holm-corrected p = 0.004), and raw data show an accelerated growth pattern of white matter in metopic synostosis patients. Grey matter volume analyses per lobe indicated increased cingulate (1378 mm3 [95% CI = 402; 2355]) and temporal grey matter (4747 [95% CI = 178; 9317]) volumes in patients compared to controls. To conclude, we found smaller TWMV with an accelerated white matter growth pattern in metopic synostosis patients, similar to white matter growth patterns seen in autism. TIV, TBV, TGMV and TCBFV were comparable in patients and controls. Secondary analyses suggest larger cingulate and temporal lobe volumes. These findings suggest a generalized intrinsic brain anomaly in the pathophysiology of neurodevelopmental disorders associated with metopic synostosis.
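The group comparison described above can be sketched as an ordinary least-squares model with group, age, and sex as covariates. The data below are synthetic placeholders, and the specification follows the paper only loosely (the study also applied Holm correction across multiple volume outcomes).

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 59  # 45 patients + 14 controls, as in the study
df = pd.DataFrame({
    "group": ["patient"] * 45 + ["control"] * 14,
    "age": rng.uniform(0.3, 1.5, n),            # years at MRI (synthetic)
    "sex": rng.choice(["M", "F"], n),
    "twmv": rng.normal(170_000, 30_000, n),     # total white matter volume, mm^3 (synthetic)
})

# Group difference in TWMV adjusted for age and sex.
fit = smf.ols("twmv ~ C(group, Treatment('control')) + age + C(sex)", data=df).fit()
print(fit.params, fit.pvalues, sep="\n")
```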
Affiliation(s)
- L. Gaillard
  - Department of Plastic and Reconstructive Surgery and Hand Surgery, Erasmus MC—Sophia Children's Hospital, University Medical Center Rotterdam, Rotterdam, The Netherlands
- M. C. Tjaberinga
  - Department of Plastic and Reconstructive Surgery and Hand Surgery, Erasmus MC—Sophia Children's Hospital, University Medical Center Rotterdam, Rotterdam, The Netherlands
- M. H. G. Dremmen
  - Department of Radiology and Nuclear Medicine, Erasmus MC—Sophia Children's Hospital, University Medical Center Rotterdam, Rotterdam, The Netherlands
- I. M. J. Mathijssen
  - Department of Plastic and Reconstructive Surgery and Hand Surgery, Erasmus MC—Sophia Children's Hospital, University Medical Center Rotterdam, Rotterdam, The Netherlands
- H. A. Vrooman
  - Department of Radiology and Nuclear Medicine, Erasmus MC—Sophia Children's Hospital, University Medical Center Rotterdam, Rotterdam, The Netherlands
10
Lu Y, Gao H, Qiu J, Qiu Z, Liu J, Bai X. DSIFNet: Implicit feature network for nasal cavity and vestibule segmentation from 3D head CT. Comput Med Imaging Graph 2024; 118:102462. PMID: 39556905; DOI: 10.1016/j.compmedimag.2024.102462.
Abstract
This study is dedicated to accurately segmenting the nasal cavity and its intricate internal anatomy from head CT images, which is critical for understanding nasal physiology, diagnosing diseases, and planning surgeries. The nasal cavity and its anatomical structures, such as the sinuses and vestibule, exhibit significant scale differences, with complex shapes and variable microstructures. These features require the segmentation method to have strong cross-scale feature extraction capabilities. To effectively address this challenge, we propose an image segmentation network named the Deeply Supervised Implicit Feature Network (DSIFNet). This network uniquely incorporates an Implicit Feature Function Module Guided by Local and Global Positional Information (LGPI-IFF), enabling effective fusion of features across scales and enhancing the network's ability to recognize details and overall structures. Additionally, we introduce a deep supervision mechanism based on implicit feature functions in the network's decoding phase, optimizing the utilization of multi-scale feature information, thus improving segmentation precision and detail representation. Furthermore, we constructed a dataset comprising 7116 CT volumes (including 1,292,508 slices) and implemented PixPro-based self-supervised pretraining to utilize unlabeled data for enhanced feature extraction. Our tests on nasal cavity and vestibule segmentation, conducted on a dataset comprising 128 head CT volumes (including 34,006 slices), demonstrate the robustness and superior performance of the proposed method, achieving leading results across multiple segmentation metrics.
Affiliation(s)
- Yi Lu
  - Image Processing Center, Beihang University, Beijing 102206, China
- Hongjian Gao
  - Image Processing Center, Beihang University, Beijing 102206, China
- Jikuan Qiu
  - Department of Otolaryngology, Head and Neck Surgery, Peking University First Hospital, Beijing 100034, China
- Zihan Qiu
  - Department of Otorhinolaryngology, Head and Neck Surgery, The Sixth Affiliated Hospital of Sun Yat-sen University, Sun Yat-sen University, Guangzhou 510655, China
- Junxiu Liu
  - Department of Otolaryngology, Head and Neck Surgery, Peking University First Hospital, Beijing 100034, China
- Xiangzhi Bai
  - Image Processing Center, Beihang University, Beijing 102206, China; The State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing 100191, China; Beijing Advanced Innovation Center for Biomedical Engineering, Beihang University, Beijing 100191, China
11
Yang H, Song Y, Li Y, Hong Z, Liu J, Li J, Zhang D, Fu L, Lu J, Qiu L. A Dual-Branch Residual Network with Attention Mechanisms for Enhanced Classification of Vaginal Lesions in Colposcopic Images. Bioengineering (Basel) 2024; 11:1182. PMID: 39768001; PMCID: PMC11673476; DOI: 10.3390/bioengineering11121182.
Abstract
Vaginal intraepithelial neoplasia (VAIN), linked to HPV infection, is a condition that is often overlooked during colposcopy, especially in the vaginal vault area, as clinicians tend to focus more on cervical lesions. This oversight can lead to missed or delayed diagnosis and treatment for patients with VAIN. Timely and accurate classification of VAIN plays a crucial role in the evaluation of vaginal lesions and the formulation of effective diagnostic approaches. The challenge is the high similarity between different classes and the low variability in the same class in colposcopic images, which can affect the accuracy, precision, and recall rates, depending on the image quality and the clinician's experience. In this study, a dual-branch lesion-aware residual network (DLRNet), designed for small medical sample sizes, is introduced, which classifies vaginal lesions by examining the relationship between cervical and vaginal lesions. The DLRNet model includes four main components: a lesion localization module, a dual-branch classification module, an attention-guidance module, and a pretrained network module. The dual-branch classification module combines the original images with segmentation maps obtained from the lesion localization module using a pretrained ResNet network to fine-tune parameters at different levels, explore lesion-specific features from both global and local perspectives, and facilitate layered interactions. The feature guidance module focuses the local branch network on vaginal-specific features by using spatial and channel attention mechanisms. The final integration involves a shared feature extraction module and independent fully connected layers, which represent and merge the dual-branch inputs. The weighted fusion method effectively integrates multiple inputs, enhancing the discriminative and generalization capabilities of the model. Classification experiments on 1142 collected colposcopic images demonstrate that this method raises the existing classification levels, achieving the classification of VAIN into three lesion grades, thus providing a valuable tool for the early screening of vaginal diseases.
Affiliation(s)
- Haima Yang
  - School of Optical Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
  - Key Laboratory of Space Active Opto-Electronics Technology, Chinese Academy of Sciences, Shanghai 200083, China
- Yeye Song
  - School of Optical Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
- Yuling Li
  - Department of Obstetrics and Gynecology, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai 200030, China
  - Shanghai Key Laboratory of Gynecologic Oncology, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai 200030, China
  - Department of Obstetrics and Gynecology, Shanxi Bethune Hospital, Taiyuan 050081, China
- Zubei Hong
  - Department of Obstetrics and Gynecology, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai 200030, China
  - Shanghai Key Laboratory of Gynecologic Oncology, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai 200030, China
- Jin Liu
  - School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
- Jun Li
  - School of Optical Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
- Dawei Zhang
  - School of Optical Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
- Le Fu
  - Shanghai First Maternity and Infant Hospital, School of Medicine, Tongji University, Shanghai 200092, China
- Jinyu Lu
  - School of Optical Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
- Lihua Qiu
  - Department of Obstetrics and Gynecology, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai 200030, China
  - Shanghai Key Laboratory of Gynecologic Oncology, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai 200030, China
12
Pan Y, Yong H, Lu W, Li G, Cong J. Brain tumor segmentation by combining MultiEncoder UNet with wavelet fusion. J Appl Clin Med Phys 2024; 25:e14527. PMID: 39284311; PMCID: PMC11540057; DOI: 10.1002/acm2.14527.
Abstract
BACKGROUND AND OBJECTIVE Accurate segmentation of brain tumors from multimodal magnetic resonance imaging (MRI) holds significant importance in clinical diagnosis and surgical intervention, while current deep learning methods cope with situations of multimodal MRI by an early fusion strategy that implicitly assumes that the modal relationships are linear, which tends to ignore the complementary information between modalities, negatively impacting the model's performance. Meanwhile, long-range relationships between voxels cannot be captured due to the localized character of the convolution procedure. METHOD Aiming at this problem, we propose a multimodal segmentation network based on a late fusion strategy that employs multiple encoders and a decoder for the segmentation of brain tumors. Each encoder is specialized for processing distinct modalities. Notably, our framework includes a feature fusion module based on a 3D discrete wavelet transform aimed at extracting complementary features among the encoders. Additionally, a 3D global context-aware module was introduced to capture the long-range dependencies of tumor voxels at a high level of features. The decoder combines fused and global features to enhance the network's segmentation performance. RESULT Our proposed model is experimented on the publicly available BraTS2018 and BraTS2021 datasets. The experimental results show competitiveness with state-of-the-art methods. CONCLUSION The results demonstrate that our approach applies a novel concept for multimodal fusion within deep neural networks and delivers more accurate and promising brain tumor segmentation, with the potential to assist physicians in diagnosis.
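A minimal sketch of wavelet-domain fusion of two 3D feature volumes with PyWavelets, assuming a simple fixed rule (average the approximation band, keep the larger detail coefficients); the paper's fusion module is learned and more involved.

```python
import numpy as np
import pywt

def wavelet_fuse_3d(feat_a: np.ndarray, feat_b: np.ndarray, wavelet: str = "haar") -> np.ndarray:
    """Fuse two 3-D volumes in the wavelet domain: average the low-frequency
    (approximation) band, keep the stronger response in the detail bands."""
    ca = pywt.dwtn(feat_a, wavelet, axes=(0, 1, 2))
    cb = pywt.dwtn(feat_b, wavelet, axes=(0, 1, 2))
    fused = {}
    for key in ca:
        if key == "aaa":                      # approximation band
            fused[key] = 0.5 * (ca[key] + cb[key])
        else:                                 # detail bands
            fused[key] = np.where(np.abs(ca[key]) >= np.abs(cb[key]), ca[key], cb[key])
    return pywt.idwtn(fused, wavelet, axes=(0, 1, 2))

fused = wavelet_fuse_3d(np.random.rand(32, 32, 32), np.random.rand(32, 32, 32))
```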
Affiliation(s)
- Yuheng Pan
  - Computer and Information Engineering Department, Tianjin Chengjian University, Tianjin, China
- Haohan Yong
  - Computer and Information Engineering Department, Tianjin Chengjian University, Tianjin, China
- Weijia Lu
  - Computer and Information Engineering Department, Tianjin Chengjian University, Tianjin, China
- Guoyan Li
  - Computer and Information Engineering Department, Tianjin Chengjian University, Tianjin, China
- Jia Cong
  - Computer and Information Engineering Department, Tianjin Chengjian University, Tianjin, China
13
Zhuang Y, Liu H, Fang W, Ma G, Sun S, Zhu Y, Zhang X, Ge C, Chen W, Long J, Song E. A 3D hierarchical cross-modality interaction network using transformers and convolutions for brain glioma segmentation in MR images. Med Phys 2024; 51:8371-8389. PMID: 39137295; DOI: 10.1002/mp.17354.
Abstract
BACKGROUND Precise glioma segmentation from multi-parametric magnetic resonance (MR) images is essential for brain glioma diagnosis. However, due to the indistinct boundaries between tumor sub-regions and the heterogeneous appearances of gliomas in volumetric MR scans, designing a reliable and automated glioma segmentation method is still challenging. Although existing 3D Transformer-based or convolution-based segmentation networks have obtained promising results via multi-modal feature fusion strategies or contextual learning methods, they widely lack the capability of hierarchical interactions between different modalities and cannot effectively learn comprehensive feature representations related to all glioma sub-regions. PURPOSE To overcome these problems, in this paper, we propose a 3D hierarchical cross-modality interaction network (HCMINet) using Transformers and convolutions for accurate multi-modal glioma segmentation, which leverages an effective hierarchical cross-modality interaction strategy to sufficiently learn modality-specific and modality-shared knowledge correlated to glioma sub-region segmentation from multi-parametric MR images. METHODS In the HCMINet, we first design a hierarchical cross-modality interaction Transformer (HCMITrans) encoder to hierarchically encode and fuse heterogeneous multi-modal features by Transformer-based intra-modal embeddings and inter-modal interactions in multiple encoding stages, which effectively captures complex cross-modality correlations while modeling global contexts. Then, we collaborate an HCMITrans encoder with a modality-shared convolutional encoder to construct the dual-encoder architecture in the encoding stage, which can learn the abundant contextual information from global and local perspectives. Finally, in the decoding stage, we present a progressive hybrid context fusion (PHCF) decoder to progressively fuse local and global features extracted by the dual-encoder architecture, which utilizes the local-global context fusion (LGCF) module to efficiently alleviate the contextual discrepancy among the decoding features. RESULTS Extensive experiments are conducted on two public and competitive glioma benchmark datasets, including the BraTS2020 dataset with 494 patients and the BraTS2021 dataset with 1251 patients. Results show that our proposed method outperforms existing Transformer-based and CNN-based methods using other multi-modal fusion strategies in our experiments. Specifically, the proposed HCMINet achieves state-of-the-art mean DSC values of 85.33% and 91.09% on the BraTS2020 online validation dataset and the BraTS2021 local testing dataset, respectively. CONCLUSIONS Our proposed method can accurately and automatically segment glioma regions from multi-parametric MR images, which is beneficial for the quantitative analysis of brain gliomas and helpful for reducing the annotation burden of neuroradiologists.
Affiliation(s)
- Yuzhou Zhuang
  - School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Hong Liu
  - School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Wei Fang
  - Wuhan Zhongke Industrial Research Institute of Medical Science Co., Ltd, Wuhan, China
- Guangzhi Ma
  - School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Sisi Sun
  - School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Yunfeng Zhu
  - School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Xu Zhang
  - Wuhan United Imaging Healthcare Surgical Technology Co., Ltd, Wuhan, China
- Chuanbin Ge
  - Wuhan United Imaging Healthcare Surgical Technology Co., Ltd, Wuhan, China
- Wenyang Chen
  - School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Jiaosong Long
  - School of Art and Design, Hubei University of Technology, Wuhan, China
- Enmin Song
  - School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
14
Zhong T, Wang Y, Xu X, Wu X, Liang S, Ning Z, Wang L, Niu Y, Li G, Zhang Y. A brain subcortical segmentation tool based on anatomy attentional fusion network for developing macaques. Comput Med Imaging Graph 2024; 116:102404. PMID: 38870599; DOI: 10.1016/j.compmedimag.2024.102404.
Abstract
Magnetic Resonance Imaging (MRI) plays a pivotal role in the accurate measurement of brain subcortical structures in macaques, which is crucial for unraveling the complexities of brain structure and function, thereby enhancing our understanding of neurodegenerative diseases and brain development. However, due to significant differences in brain size, structure, and imaging characteristics between humans and macaques, computational tools developed for human neuroimaging studies often encounter obstacles when applied to macaques. In this context, we propose an Anatomy Attentional Fusion Network (AAF-Net), which integrates multimodal MRI data with anatomical constraints in a multi-scale framework to address the challenges posed by the dynamic development, regional heterogeneity, and age-related size variations of the juvenile macaque brain, thus achieving precise subcortical segmentation. Specifically, we generate a Signed Distance Map (SDM) based on the initial rough segmentation of the subcortical region by a network as an anatomical constraint, providing comprehensive information on positions, structures, and morphology. Then we construct AAF-Net to fully fuse the SDM anatomical constraints and multimodal images for refined segmentation. To thoroughly evaluate the performance of our proposed tool, over 700 macaque MRIs from 19 datasets were used in this study. Specifically, we employed two manually labeled longitudinal macaque datasets to develop the tool and complete four-fold cross-validations. Furthermore, we incorporated various external datasets to demonstrate the proposed tool's generalization capabilities and promise in brain development research. We have made this tool available as an open-source resource at https://github.com/TaoZhong11/Macaque_subcortical_segmentation for direct application.
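Computing a signed distance map from a rough binary segmentation is straightforward with SciPy; the sketch below assumes the common convention of negative values inside the structure and positive values outside, which may differ from the authors' implementation.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def signed_distance_map(mask: np.ndarray, spacing=(1.0, 1.0, 1.0)) -> np.ndarray:
    """Signed Euclidean distance map of a binary mask: negative inside the
    structure, positive outside, zero at the boundary."""
    mask = mask.astype(bool)
    inside = distance_transform_edt(mask, sampling=spacing)
    outside = distance_transform_edt(~mask, sampling=spacing)
    return outside - inside
```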
Affiliation(s)
- Tao Zhong
  - School of Biomedical Engineering, Guangdong Provincial Key Laboratory of Medical Image Processing and Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, China
- Ya Wang
  - Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, USA
- Xiaotong Xu
  - School of Biomedical Engineering, Guangdong Provincial Key Laboratory of Medical Image Processing and Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, China
- Xueyang Wu
  - School of Biomedical Engineering, Guangdong Provincial Key Laboratory of Medical Image Processing and Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, China
- Shujun Liang
  - School of Biomedical Engineering, Guangdong Provincial Key Laboratory of Medical Image Processing and Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, China
- Zhenyuan Ning
  - School of Biomedical Engineering, Guangdong Provincial Key Laboratory of Medical Image Processing and Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, China
- Li Wang
  - Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, USA
- Yuyu Niu
  - Yunnan Key Laboratory of Primate Biomedical Research, Institute of Primate Translational Medicine, Kunming University of Science and Technology, China
- Gang Li
  - Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, USA
- Yu Zhang
  - School of Biomedical Engineering, Guangdong Provincial Key Laboratory of Medical Image Processing and Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, China
15
Tian X, Ye J, Zhang T, Zhang L, Liu X, Fu F, Shi X, Xu C. Multi-Path Fusion in SFCF-Net for Enhanced Multi-Frequency Electrical Impedance Tomography. IEEE Transactions on Medical Imaging 2024; 43:2814-2824. PMID: 38536679; DOI: 10.1109/tmi.2024.3382338.
Abstract
Multi-frequency electrical impedance tomography (mfEIT) offers a nondestructive imaging technology that reconstructs the distribution of electrical characteristics within a subject based on the impedance spectral differences among biological tissues. However, the technology faces challenges in imaging multi-class lesion targets when the conductivity of background tissues is frequency-dependent. To address these issues, we propose a spatial-frequency cross-fusion network (SFCF-Net) imaging algorithm, built on a multi-path fusion structure. This algorithm uses multi-path structures and hyper-dense connections to capture both spatial and frequency correlations between multi-frequency conductivity images, which achieves differential imaging for lesion targets of multiple categories through cross-fusion of information. According to both simulation and physical experiment results, the proposed SFCF-Net algorithm shows an excellent performance in terms of lesion imaging and category discrimination compared to the weighted frequency-difference, U-Net, and MMV-Net algorithms. The proposed algorithm enhances the ability of mfEIT to simultaneously obtain both structural and spectral information from the tissue being examined and improves the accuracy and reliability of mfEIT, opening new avenues for its application in clinical diagnostics and treatment monitoring.
16
Li Y, El Habib Daho M, Conze PH, Zeghlache R, Le Boité H, Tadayoni R, Cochener B, Lamard M, Quellec G. A review of deep learning-based information fusion techniques for multimodal medical image classification. Comput Biol Med 2024; 177:108635. PMID: 38796881; DOI: 10.1016/j.compbiomed.2024.108635.
Abstract
Multimodal medical imaging plays a pivotal role in clinical diagnosis and research, as it combines information from various imaging modalities to provide a more comprehensive understanding of the underlying pathology. Recently, deep learning-based multimodal fusion techniques have emerged as powerful tools for improving medical image classification. This review offers a thorough analysis of the developments in deep learning-based multimodal fusion for medical classification tasks. We explore the complementary relationships among prevalent clinical modalities and outline three main fusion schemes for multimodal classification networks: input fusion, intermediate fusion (encompassing single-level fusion, hierarchical fusion, and attention-based fusion), and output fusion. By evaluating the performance of these fusion techniques, we provide insight into the suitability of different network architectures for various multimodal fusion scenarios and application domains. Furthermore, we delve into challenges related to network architecture selection, handling incomplete multimodal data management, and the potential limitations of multimodal fusion. Finally, we spotlight the promising future of Transformer-based multimodal fusion techniques and give recommendations for future research in this rapidly evolving field.
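The three fusion schemes named above can be summarized in code. The toy PyTorch modules below are schematic and are not tied to any specific architecture from the review; layer sizes are arbitrary.

```python
import torch
import torch.nn as nn

class InputFusion(nn.Module):
    """Input-level fusion: stack modalities as channels of a single network."""
    def __init__(self, n_mod: int, n_cls: int):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(n_mod, 16, 3, padding=1), nn.ReLU(),
                                 nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, n_cls))
    def forward(self, xs):                        # xs: list of (B, 1, H, W) modalities
        return self.net(torch.cat(xs, dim=1))

class IntermediateFusion(nn.Module):
    """Intermediate fusion: one encoder per modality, features merged before the head."""
    def __init__(self, n_mod: int, n_cls: int):
        super().__init__()
        self.encoders = nn.ModuleList(
            nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                          nn.AdaptiveAvgPool2d(1), nn.Flatten()) for _ in range(n_mod))
        self.head = nn.Linear(16 * n_mod, n_cls)
    def forward(self, xs):
        return self.head(torch.cat([enc(x) for enc, x in zip(self.encoders, xs)], dim=1))

def output_fusion(logits_per_modality):
    """Output-level (decision) fusion: average the per-modality predictions."""
    return torch.stack(logits_per_modality, dim=0).softmax(dim=-1).mean(dim=0)
```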
Affiliation(s)
- Yihao Li
  - LaTIM UMR 1101, Inserm, Brest, France; University of Western Brittany, Brest, France
- Mostafa El Habib Daho
  - LaTIM UMR 1101, Inserm, Brest, France; University of Western Brittany, Brest, France
- Rachid Zeghlache
  - LaTIM UMR 1101, Inserm, Brest, France; University of Western Brittany, Brest, France
- Hugo Le Boité
  - Sorbonne University, Paris, France; Ophthalmology Department, Lariboisière Hospital, AP-HP, Paris, France
- Ramin Tadayoni
  - Ophthalmology Department, Lariboisière Hospital, AP-HP, Paris, France; Paris Cité University, Paris, France
- Béatrice Cochener
  - LaTIM UMR 1101, Inserm, Brest, France; University of Western Brittany, Brest, France; Ophthalmology Department, CHRU Brest, Brest, France
- Mathieu Lamard
  - LaTIM UMR 1101, Inserm, Brest, France; University of Western Brittany, Brest, France
17
Muthusivarajan R, Celaya A, Yung JP, Long JP, Viswanath SE, Marcus DS, Chung C, Fuentes D. Evaluating the relationship between magnetic resonance image quality metrics and deep learning-based segmentation accuracy of brain tumors. Med Phys 2024; 51:4898-4906. PMID: 38640464; PMCID: PMC11233231; DOI: 10.1002/mp.17059.
Abstract
BACKGROUND Magnetic resonance imaging (MRI) scans are known to suffer from a variety of acquisition artifacts as well as equipment-based variations that impact image appearance and segmentation performance. It is still unclear whether a direct relationship exists between magnetic resonance (MR) image quality metrics (IQMs) (e.g., signal-to-noise, contrast-to-noise) and segmentation accuracy. PURPOSE Deep learning (DL) approaches have shown significant promise for automated segmentation of brain tumors on MRI but depend on the quality of input training images. We sought to evaluate the relationship between IQMs of input training images and DL-based brain tumor segmentation accuracy toward developing more generalizable models for multi-institutional data. METHODS We trained a 3D DenseNet model on the BraTS 2020 cohorts for segmentation of tumor subregions enhancing tumor (ET), peritumoral edematous, and necrotic and non-ET on MRI; with performance quantified via a 5-fold cross-validated Dice coefficient. MRI scans were evaluated through the open-source quality control tool MRQy, to yield 13 IQMs per scan. The Pearson correlation coefficient was computed between whole tumor (WT) dice values and IQM measures in the training cohorts to identify quality measures most correlated with segmentation performance. Each selected IQM was used to group MRI scans as "better" quality (BQ) or "worse" quality (WQ), via relative thresholding. Segmentation performance was re-evaluated for the DenseNet model when (i) training on BQ MRI images with validation on WQ images, as well as (ii) training on WQ images, and validation on BQ images. Trends were further validated on independent test sets derived from the BraTS 2021 training cohorts. RESULTS For this study, multimodal MRI scans from the BraTS 2020 training cohorts were used to train the segmentation model and validated on independent test sets derived from the BraTS 2021 cohort. Among the selected IQMs, models trained on BQ images based on inhomogeneity measurements (coefficient of variance, coefficient of joint variation, coefficient of variation of the foreground patch) and the models trained on WQ images based on noise measurement peak signal-to-noise ratio (SNR) yielded significantly improved tumor segmentation accuracy compared to their inverse models. CONCLUSIONS Our results suggest that a significant correlation may exist between specific MR IQMs and DenseNet-based brain tumor segmentation performance. The selection of MRI scans for model training based on IQMs may yield more accurate and generalizable models in unseen validation.
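The correlation-and-threshold analysis can be sketched as follows; the median split used here is one possible relative threshold and the synthetic values are placeholders, not the study's data or exact rule.

```python
import numpy as np
from scipy.stats import pearsonr

def split_by_iqm(iqm_values: np.ndarray, dice_values: np.ndarray, threshold=None):
    """Correlate an image quality metric with per-scan Dice, then split scans
    into 'better' / 'worse' quality groups by a relative threshold."""
    r, p = pearsonr(iqm_values, dice_values)
    thr = np.median(iqm_values) if threshold is None else threshold
    better = iqm_values >= thr          # boolean mask of "better quality" scans
    return r, p, better

# Example with synthetic values for 20 scans.
rng = np.random.default_rng(1)
iqm = rng.normal(30, 5, 20)                              # e.g., an SNR-like measure
dice = 0.6 + 0.005 * iqm + rng.normal(0, 0.02, 20)
print(split_by_iqm(iqm, dice)[:2])                       # correlation and p-value
```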
Collapse
Affiliation(s)
| | - Adrian Celaya
- Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030 USA
- Department of Computational and Applied Mathematics, Rice University, Houston, TX 77005 USA
| | - Joshua P. Yung
- Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030 USA
| | - James P Long
- Department of Biostatistics, University of Texas MD Anderson Cancer Center, Houston, TX 77030 USA
| | - Satish E. Viswanath
- Department of Biomedical Engineering, Case Western Reserve University, Cleveland, OH 44106, USA
| | - Daniel S. Marcus
- Department of Radiology, Washington University School of Medicine, St. Louis, MO 63110 USA
| | - Caroline Chung
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030 USA
| | - David Fuentes
- Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030 USA
| |
18
Kondejkar T, Al-Heejawi SMA, Breggia A, Ahmad B, Christman R, Ryan ST, Amal S. Multi-Scale Digital Pathology Patch-Level Prostate Cancer Grading Using Deep Learning: Use Case Evaluation of DiagSet Dataset. Bioengineering (Basel) 2024; 11:624. PMID: 38927860; PMCID: PMC11200755; DOI: 10.3390/bioengineering11060624.
Abstract
Prostate cancer remains a prevalent health concern, emphasizing the critical need for early diagnosis and precise treatment strategies to mitigate mortality rates. Accurate prediction of cancer grade is paramount for timely intervention. This paper introduces an approach to prostate cancer grading, framing it as a classification problem. Leveraging ResNet models on multi-scale, patch-level digital pathology images from the DiagSet dataset, the proposed method demonstrates notable success, achieving an accuracy of 0.999 in identifying clinically significant prostate cancer. The study contributes to the evolving landscape of cancer diagnostics, offering a promising avenue for improved grading accuracy and, consequently, more effective treatment planning. By integrating innovative deep learning techniques with comprehensive datasets, our approach represents a step forward in the pursuit of personalized and targeted cancer care.
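As a rough illustration of the patch-level classification setup described above, the sketch below adapts a torchvision ResNet-18 to a binary patch classifier. The class count, patch size, normalization values, and training step are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn
from torchvision import models, transforms

# ResNet-18 backbone repurposed for binary patch-level grading
# (clinically significant cancer vs. not). Class count is assumed.
model = models.resnet18(weights=None)
model.fc = nn.Linear(model.fc.in_features, 2)

# Typical preprocessing for pathology patches (values assumed).
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
])

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(patches, labels):
    """One optimisation step on a batch of pathology patches."""
    model.train()
    optimizer.zero_grad()
    logits = model(patches)           # (B, 2)
    loss = criterion(logits, labels)  # labels: (B,) with values {0, 1}
    loss.backward()
    optimizer.step()
    return loss.item()
```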
Affiliation(s)
- Tanaya Kondejkar
- College of Engineering, Northeastern University, Boston, MA 02115, USA; (T.K.); (S.M.A.A.-H.)
| | | | - Anne Breggia
- MaineHealth Institute for Research, Scarborough, ME 04074, USA;
| | - Bilal Ahmad
- Maine Medical Center, Portland, ME 04102, USA; (B.A.); (R.C.); (S.T.R.)
| | - Robert Christman
- Maine Medical Center, Portland, ME 04102, USA; (B.A.); (R.C.); (S.T.R.)
| | - Stephen T. Ryan
- Maine Medical Center, Portland, ME 04102, USA; (B.A.); (R.C.); (S.T.R.)
| | - Saeed Amal
- The Roux Institute, Department of Bioengineering, College of Engineering, Northeastern University, Boston, MA 02115, USA
| |
19
Cheng Z, Wang S, Xin T, Zhou T, Zhang H, Shao L. Few-Shot Medical Image Segmentation via Generating Multiple Representative Descriptors. IEEE Transactions on Medical Imaging 2024; 43:2202-2214. PMID: 38265915; DOI: 10.1109/tmi.2024.3358295.
Abstract
Automatic medical image segmentation has witnessed significant development with the success of large models on massive datasets. However, acquiring and annotating vast medical image datasets is often impractical due to the time required, the specialized expertise needed, and patient privacy constraints. As a result, Few-shot Medical Image Segmentation (FSMIS) has become an increasingly compelling research direction. Conventional FSMIS methods usually learn prototypes from support images and apply nearest-neighbor searching to segment the query images. However, a single prototype cannot adequately represent the distribution of each class, leading to restricted performance. To address this problem, we propose to Generate Multiple Representative Descriptors (GMRD), which can comprehensively represent the commonality within the corresponding class distribution. In addition, we design a Multiple Affinity Maps based Prediction (MAMP) module to fuse the multiple affinity maps generated by the aforementioned descriptors. Furthermore, to address intra-class variation and enhance the representativeness of descriptors, we introduce two novel losses. Notably, our model is structured as a dual-path design to achieve a balance between foreground and background differences in medical images. Extensive experiments on four publicly available medical image datasets demonstrate that our method outperforms state-of-the-art methods, and the detailed analysis also verifies the effectiveness of the designed modules.
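The prototype baseline that GMRD improves upon, masked average pooling of support features followed by cosine-similarity matching of query features to the prototypes, can be sketched as follows. This is the conventional single-prototype scheme described above, not the GMRD model itself; the tensor shapes and the similarity scaling factor are assumptions.

```python
import torch
import torch.nn.functional as F

def masked_average_pooling(features, mask):
    """Compute a class prototype from support features.

    features: (B, C, H, W) feature maps from a shared encoder
    mask:     (B, 1, H, W) binary foreground mask of the support image
    returns:  (C,) prototype vector
    """
    mask = F.interpolate(mask, size=features.shape[-2:], mode="nearest")
    masked = features * mask
    return masked.sum(dim=(0, 2, 3)) / (mask.sum() + 1e-6)

def prototype_segmentation(query_feat, fg_proto, bg_proto, tau=20.0):
    """Assign each query location to foreground or background by cosine
    similarity to the two prototypes (tau is an assumed scaling factor)."""
    protos = torch.stack([bg_proto, fg_proto])            # (2, C)
    q = F.normalize(query_feat, dim=1)                    # (B, C, H, W)
    p = F.normalize(protos, dim=1)                        # (2, C)
    sim = torch.einsum("bchw,kc->bkhw", q, p) * tau       # (B, 2, H, W)
    return sim.softmax(dim=1)                             # soft masks
```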
20
Luo J, Dai P, He Z, Huang Z, Liao S, Liu K. Deep learning models for ischemic stroke lesion segmentation in medical images: A survey. Comput Biol Med 2024; 175:108509. PMID: 38677171; DOI: 10.1016/j.compbiomed.2024.108509.
Abstract
This paper provides a comprehensive review of deep learning models for ischemic stroke lesion segmentation in medical images. Ischemic stroke is a severe neurological disease and a leading cause of death and disability worldwide. Accurate segmentation of stroke lesions in medical images such as MRI and CT scans is crucial for diagnosis, treatment planning and prognosis. This paper first introduces common imaging modalities used for stroke diagnosis, discussing their capabilities in imaging lesions at different disease stages from the acute to chronic stage. It then reviews three major public benchmark datasets for evaluating stroke segmentation algorithms: ATLAS, ISLES and AISD, highlighting their key characteristics. The paper proceeds to provide an overview of foundational deep learning architectures for medical image segmentation, including CNN-based and transformer-based models. It summarizes recent innovations in adapting these architectures to the task of stroke lesion segmentation across the three datasets, analyzing their motivations, modifications and results. A survey of loss functions and data augmentations employed for this task is also included. The paper discusses various aspects related to stroke segmentation tasks, including prior knowledge, small lesions, and multimodal fusion, and then concludes by outlining promising future research directions. Overall, this comprehensive review covers critical technical developments in the field to support continued progress in automated stroke lesion segmentation.
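Among the loss functions surveyed for this task, the soft Dice loss is a representative choice for handling the extreme foreground/background imbalance of small ischemic lesions. The sketch below is a generic binary formulation, not tied to any particular model reviewed; the 3D tensor shapes are assumptions.

```python
import torch

def soft_dice_loss(logits, target, eps=1e-6):
    """Soft Dice loss for binary lesion segmentation.

    logits: (B, 1, D, H, W) raw network outputs
    target: (B, 1, D, H, W) binary ground-truth lesion masks
    """
    probs = torch.sigmoid(logits)
    dims = (2, 3, 4)
    intersection = (probs * target).sum(dims)
    denominator = probs.sum(dims) + target.sum(dims)
    dice = (2.0 * intersection + eps) / (denominator + eps)
    return 1.0 - dice.mean()
```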
Affiliation(s)
- Jialin Luo
- School of Computer Science and Engineering, Central South University, Changsha, Hunan, China
| | - Peishan Dai
- School of Computer Science and Engineering, Central South University, Changsha, Hunan, China.
| | - Zhuang He
- School of Computer Science and Engineering, Central South University, Changsha, Hunan, China
| | - Zhongchao Huang
- Department of Biomedical Engineering, School of Basic Medical Science, Central South University, Changsha, Hunan, China
| | - Shenghui Liao
- School of Computer Science and Engineering, Central South University, Changsha, Hunan, China
| | - Kun Liu
- Brain Hospital of Hunan Province (The Second People's Hospital of Hunan Province), Changsha, Hunan, China
| |
21
Yang Y, Yue S, Quan H. CS-UNet: Cross-scale U-Net with Semantic-position dependencies for retinal vessel segmentation. Network (Bristol, England) 2024; 35:134-153. PMID: 38050997; DOI: 10.1080/0954898x.2023.2288858.
Abstract
Accurate retinal vessel segmentation is the prerequisite for early recognition and treatment of retina-related diseases. However, segmenting retinal vessels is still challenging due to the intricate vessel tree in fundus images, which has a significant number of tiny vessels, low contrast, and lesion interference. For this task, the u-shaped architecture (U-Net) has become the de-facto standard and has achieved considerable success. However, U-Net is a pure convolutional network, which usually shows limitations in global modelling. In this paper, we propose a novel Cross-scale U-Net with Semantic-position Dependencies (CS-UNet) for retinal vessel segmentation. In particular, we first designed a Semantic-position Dependencies Aggregator (SPDA) and incorporate it into each layer of the encoder to better focus on global contextual information by integrating the relationship of semantic and position. To endow the model with the capability of cross-scale interaction, the Cross-scale Relation Refine Module (CSRR) is designed to dynamically select the information associated with the vessels, which helps guide the up-sampling operation. Finally, we have evaluated CS-UNet on three public datasets: DRIVE, CHASE_DB1, and STARE. Compared to most existing state-of-the-art methods, CS-UNet demonstrated better performance.
Collapse
Affiliation(s)
- Ying Yang
- College of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan, China
| | - Shengbin Yue
- College of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan, China
- Yunnan Provincial Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming, Yunnan, China
| | - Haiyan Quan
- College of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan, China
| |
22
Chen Q, Zhang J, Meng R, Zhou L, Li Z, Feng Q, Shen D. Modality-Specific Information Disentanglement From Multi-Parametric MRI for Breast Tumor Segmentation and Computer-Aided Diagnosis. IEEE Transactions on Medical Imaging 2024; 43:1958-1971. PMID: 38206779; DOI: 10.1109/tmi.2024.3352648.
Abstract
Breast cancer is becoming a significant global health challenge, with millions of fatalities annually. Magnetic Resonance Imaging (MRI) can provide various sequences for characterizing tumor morphology and internal patterns, and becomes an effective tool for detection and diagnosis of breast tumors. However, previous deep-learning based tumor segmentation methods from multi-parametric MRI still have limitations in exploring inter-modality information and focusing task-informative modality/modalities. To address these shortcomings, we propose a Modality-Specific Information Disentanglement (MoSID) framework to extract both inter- and intra-modality attention maps as prior knowledge for guiding tumor segmentation. Specifically, by disentangling modality-specific information, the MoSID framework provides complementary clues for the segmentation task, by generating modality-specific attention maps to guide modality selection and inter-modality evaluation. Our experiments on two 3D breast datasets and one 2D prostate dataset demonstrate that the MoSID framework outperforms other state-of-the-art multi-modality segmentation methods, even in the cases of missing modalities. Based on the segmented lesions, we further train a classifier to predict the patients' response to radiotherapy. The prediction accuracy is comparable to the case of using manually-segmented tumors for treatment outcome prediction, indicating the robustness and effectiveness of the proposed segmentation method. The code is available at https://github.com/Qianqian-Chen/MoSID.
23
Zhang W, Tao Y, Huang Z, Li Y, Chen Y, Song T, Ma X, Zhang Y. Multi-phase features interaction transformer network for liver tumor segmentation and microvascular invasion assessment in contrast-enhanced CT. Mathematical Biosciences and Engineering 2024; 21:5735-5761. PMID: 38872556; DOI: 10.3934/mbe.2024253.
Abstract
Precise segmentation of liver tumors from computed tomography (CT) scans is a prerequisite step in various clinical applications. Multi-phase CT imaging enhances tumor characterization, thereby assisting radiologists in accurate identification. However, existing automatic liver tumor segmentation models did not fully exploit multi-phase information and lacked the capability to capture global information. In this study, we developed a pioneering multi-phase feature interaction Transformer network (MI-TransSeg) for accurate liver tumor segmentation and a subsequent microvascular invasion (MVI) assessment in contrast-enhanced CT images. In the proposed network, an efficient multi-phase features interaction module was introduced to enable bi-directional feature interaction among multiple phases, thus maximally exploiting the available multi-phase information. To enhance the model's capability to extract global information, a hierarchical transformer-based encoder and decoder architecture was designed. Importantly, we devised a multi-resolution scales feature aggregation strategy (MSFA) to optimize the parameters and performance of the proposed model. Subsequent to segmentation, the liver tumor masks generated by MI-TransSeg were applied to extract radiomic features for the clinical applications of the MVI assessment. With Institutional Review Board (IRB) approval, a clinical multi-phase contrast-enhanced CT abdominal dataset was collected that included 164 patients with liver tumors. The experimental results demonstrated that the proposed MI-TransSeg was superior to various state-of-the-art methods. Additionally, we found that the tumor mask predicted by our method showed promising potential in the assessment of microvascular invasion. In conclusion, MI-TransSeg presents an innovative paradigm for the segmentation of complex liver tumors, thus underscoring the significance of multi-phase CT data exploitation. The proposed MI-TransSeg network has the potential to assist radiologists in diagnosing liver tumors and assessing microvascular invasion.
Affiliation(s)
- Wencong Zhang
- Department of Biomedical Engineering, College of Engineering, Shantou University, Shantou, China
- Department of Biomedical Engineering, College of Design and Engineering, National University of Singapore, Singapore
| | - Yuxi Tao
- Department of Radiology, The Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China
| | - Zhanyao Huang
- Department of Biomedical Engineering, College of Engineering, Shantou University, Shantou, China
| | - Yue Li
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
| | - Yingjia Chen
- Department of Biomedical Engineering, College of Engineering, Shantou University, Shantou, China
| | - Tengfei Song
- Department of Radiology, The Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China
| | - Xiangyuan Ma
- Department of Biomedical Engineering, College of Engineering, Shantou University, Shantou, China
| | - Yaqin Zhang
- Department of Biomedical Engineering, College of Engineering, Shantou University, Shantou, China
| |
24
Svanera M, Savardi M, Signoroni A, Benini S, Muckli L. Fighting the scanner effect in brain MRI segmentation with a progressive level-of-detail network trained on multi-site data. Med Image Anal 2024; 93:103090. PMID: 38241763; DOI: 10.1016/j.media.2024.103090.
Abstract
Many clinical and research studies of the human brain require accurate structural MRI segmentation. While traditional atlas-based methods can be applied to volumes from any acquisition site, recent deep learning algorithms ensure high accuracy only when tested on data from the same sites exploited in training (i.e., internal data). Performance degradation experienced on external data (i.e., unseen volumes from unseen sites) is due to the inter-site variability in intensity distributions, and to unique artefacts caused by different MR scanner models and acquisition parameters. To mitigate this site-dependency, often referred to as the scanner effect, we propose LOD-Brain, a 3D convolutional neural network with progressive levels-of-detail (LOD), able to segment brain data from any site. Coarser network levels are responsible for learning a robust anatomical prior helpful in identifying brain structures and their locations, while finer levels refine the model to handle site-specific intensity distributions and anatomical variations. We ensure robustness across sites by training the model on an unprecedentedly rich dataset aggregating data from open repositories: almost 27,000 T1w volumes from around 160 acquisition sites, at 1.5 - 3T, from a population spanning from 8 to 90 years old. Extensive tests demonstrate that LOD-Brain produces state-of-the-art results, with no significant difference in performance between internal and external sites, and robust to challenging anatomical variations. Its portability paves the way for large-scale applications across different healthcare institutions, patient populations, and imaging technology manufacturers. Code, model, and demo are available on the project website.
Affiliation(s)
- Michele Svanera
- Center for Cognitive Neuroimaging at the School of Psychology & Neuroscience, University of Glasgow, UK.
| | - Mattia Savardi
- Department of Medical and Surgical Specialties, Radiological Sciences, and Public Health, University of Brescia, Italy
| | - Alberto Signoroni
- Department of Medical and Surgical Specialties, Radiological Sciences, and Public Health, University of Brescia, Italy
| | - Sergio Benini
- Department of Information Engineering, University of Brescia, Italy
| | - Lars Muckli
- Center for Cognitive Neuroimaging at the School of Psychology & Neuroscience, University of Glasgow, UK
| |
25
Zhu Z, Sun M, Qi G, Li Y, Gao X, Liu Y. Sparse Dynamic Volume TransUNet with multi-level edge fusion for brain tumor segmentation. Comput Biol Med 2024; 172:108284. PMID: 38503086; DOI: 10.1016/j.compbiomed.2024.108284.
Abstract
3D MRI Brain Tumor Segmentation is of great significance in clinical diagnosis and treatment. Accurate segmentation results are critical for localization and spatial distribution of brain tumors using 3D MRI. However, most existing methods mainly focus on extracting global semantic features from the spatial and depth dimensions of a 3D volume, while ignoring voxel information, inter-layer connections, and detailed features. A 3D brain tumor segmentation network SDV-TUNet (Sparse Dynamic Volume TransUNet) based on an encoder-decoder architecture is proposed to achieve accurate segmentation by effectively combining voxel information, inter-layer feature connections, and intra-axis information. Volumetric data is fed into a 3D network consisting of extended depth modeling for dense prediction by using two modules: sparse dynamic (SD) encoder-decoder module and multi-level edge feature fusion (MEFF) module. The SD encoder-decoder module is utilized to extract global spatial semantic features for brain tumor segmentation, which employs multi-head self-attention and sparse dynamic adaptive fusion in a 3D extended shifted window strategy. In the encoding stage, dynamic perception of regional connections and multi-axis information interactions are realized through local tight correlations and long-range sparse correlations. The MEFF module achieves the fusion of multi-level local edge information in a layer-by-layer incremental manner and connects the fusion to the decoder module through skip connections to enhance the propagation ability of spatial edge information. The proposed method is applied to the BraTS2020 and BraTS2021 benchmarks, and the experimental results show its superior performance compared with state-of-the-art brain tumor segmentation methods. The source codes of the proposed method are available at https://github.com/SunMengw/SDV-TUNet.
Affiliation(s)
- Zhiqin Zhu
- College of Automation, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China.
| | - Mengwei Sun
- College of Automation, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China.
| | - Guanqiu Qi
- Computer Information Systems Department, State University of New York at Buffalo State, Buffalo, NY 14222, USA.
| | - Yuanyuan Li
- College of Automation, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China.
| | - Xinbo Gao
- College of Automation, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China.
| | - Yu Liu
- Department of Biomedical Engineering, Hefei University of Technology, Hefei 230009, China.
| |
26
Lee IC, Tsai YP, Lin YC, Chen TC, Yen CH, Chiu NC, Hwang HE, Liu CA, Huang JG, Lee RC, Chao Y, Ho SY, Huang YH. A hierarchical fusion strategy of deep learning networks for detection and segmentation of hepatocellular carcinoma from computed tomography images. Cancer Imaging 2024; 24:43. PMID: 38532511; PMCID: PMC10964581; DOI: 10.1186/s40644-024-00686-8.
Abstract
BACKGROUND Automatic segmentation of hepatocellular carcinoma (HCC) on computed tomography (CT) scans is urgently needed to assist diagnosis and radiomics analysis. The aim of this study is to develop a deep learning-based network to detect HCC from dynamic CT images. METHODS Dynamic CT images of 595 patients with HCC were used. Tumors in dynamic CT images were labeled by radiologists. Patients were randomly divided into training, validation and test sets in a ratio of 5:2:3, respectively. We developed a hierarchical fusion strategy of deep learning networks (HFS-Net). Global Dice, sensitivity, precision and F1-score were used to measure performance of the HFS-Net model. RESULTS The 2D DenseU-Net using dynamic CT images was more effective for segmenting small tumors, whereas the 2D U-Net using portal venous phase images was more effective for segmenting large tumors. The HFS-Net model performed better than the single-strategy deep learning models in segmenting small and large tumors. In the test set, the HFS-Net model achieved good performance in identifying HCC on dynamic CT images, with a global Dice of 82.8%. The overall sensitivity, precision and F1-score were 84.3%, 75.5% and 79.6% per slice, respectively, and 92.2%, 93.2% and 92.7% per patient, respectively. The sensitivities for tumors < 2 cm, 2-3 cm, 3-5 cm and > 5 cm were 72.7%, 92.9%, 94.2% and 100% per patient, respectively. CONCLUSIONS The HFS-Net model achieved good performance in the detection and segmentation of HCC from dynamic CT images, which may support radiologic diagnosis and facilitate automatic radiomics analysis.
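The per-slice and per-patient metrics reported above (sensitivity, precision, F1) follow directly from the confusion counts of the binary tumor masks. A minimal sketch is given below; the array shapes and the small epsilon guard are assumptions.

```python
import numpy as np

def detection_metrics(pred, truth, eps=1e-8):
    """Sensitivity, precision and F1 for binary tumor masks.

    pred, truth: boolean arrays of identical shape (e.g. one CT slice
    or one patient volume; shapes assumed for illustration).
    """
    tp = np.logical_and(pred, truth).sum()
    fp = np.logical_and(pred, ~truth).sum()
    fn = np.logical_and(~pred, truth).sum()
    sensitivity = tp / (tp + fn + eps)
    precision = tp / (tp + fp + eps)
    f1 = 2 * precision * sensitivity / (precision + sensitivity + eps)
    return sensitivity, precision, f1
```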
Affiliation(s)
- I-Cheng Lee
- Division of Gastroenterology and Hepatology, Department of Medicine, Taipei Veterans General Hospital, Taipei, Taiwan
- School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
| | - Yung-Ping Tsai
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
| | - Yen-Cheng Lin
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
| | - Ting-Chun Chen
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
| | - Chia-Heng Yen
- Institute of Computer Science and Engineering, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
| | - Nai-Chi Chiu
- Department of Radiology, Taipei Veterans General Hospital, Taipei, Taiwan
| | - Hsuen-En Hwang
- Department of Radiology, Taipei Veterans General Hospital, Taipei, Taiwan
| | - Chien-An Liu
- Department of Radiology, Taipei Veterans General Hospital, Taipei, Taiwan
| | - Jia-Guan Huang
- National Taiwan University School of Medicine, Taipei, Taiwan
| | - Rheun-Chuan Lee
- Department of Radiology, Taipei Veterans General Hospital, Taipei, Taiwan
| | - Yee Chao
- Cancer Center, Taipei Veterans General Hospital, Taipei, Taiwan
| | - Shinn-Ying Ho
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan.
- Department of Biological Science and Technology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan.
- Center for Intelligent Drug Systems and Smart Bio-devices (IDS2B), National Yang Ming Chiao Tung University, Hsinchu, Taiwan.
- College of Health Sciences, Kaohsiung Medical University, Kaohsiung, Taiwan.
| | - Yi-Hsiang Huang
- Division of Gastroenterology and Hepatology, Department of Medicine, Taipei Veterans General Hospital, Taipei, Taiwan.
- School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan.
- Institute of Clinical Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan.
- Healthcare and Service Center, Taipei Veterans General Hospital, Taipei, Taiwan.
| |
27
Fu Y, Ma L, Wan S, Ge S, Yang Z. A novel clinical artificial intelligence model for disease detection via retinal imaging. Innovation (N Y) 2024; 5:100575. PMID: 38379789; PMCID: PMC10876903; DOI: 10.1016/j.xinn.2024.100575.
Affiliation(s)
- Yidian Fu
- Department of Ophthalmology, Ninth People’s Hospital, Shanghai JiaoTong University School of Medicine, Shanghai 200011, China
- Shanghai Key Laboratory of Orbital Diseases and Ocular Oncology, Shanghai 200011, China
| | - Liang Ma
- Department of Ophthalmology, Ninth People’s Hospital, Shanghai JiaoTong University School of Medicine, Shanghai 200011, China
- Shanghai Key Laboratory of Orbital Diseases and Ocular Oncology, Shanghai 200011, China
| | - Sheng Wan
- PCA Lab, Key Lab of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, Nanjing 210000, China
- Jiangsu Key Lab of Image and Video Understanding for Social Security, School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210000, China
| | - Shengfang Ge
- Department of Ophthalmology, Ninth People’s Hospital, Shanghai JiaoTong University School of Medicine, Shanghai 200011, China
- Shanghai Key Laboratory of Orbital Diseases and Ocular Oncology, Shanghai 200011, China
| | - Zhi Yang
- Department of Ophthalmology, Ninth People’s Hospital, Shanghai JiaoTong University School of Medicine, Shanghai 200011, China
- Shanghai Key Laboratory of Orbital Diseases and Ocular Oncology, Shanghai 200011, China
| |
28
Chen J, Huang G, Yuan X, Zhong G, Zheng Z, Pun CM, Zhu J, Huang Z. Quaternion Cross-Modality Spatial Learning for Multi-Modal Medical Image Segmentation. IEEE J Biomed Health Inform 2024; 28:1412-1423. PMID: 38145537; DOI: 10.1109/jbhi.2023.3346529.
Abstract
Recently, deep neural networks (DNNs) have had a large impact on image processing, including medical image segmentation, and real-valued convolutions have been extensively utilized in multi-modal medical image segmentation to accurately segment lesions by learning from data. However, the weighted summation operation in such convolutions limits the ability to maintain the spatial dependence that is crucial for identifying different lesion distributions. In this paper, we propose a novel Quaternion Cross-modality Spatial Learning (Q-CSL) method that explores spatial information while considering the linkage between multi-modal images. Specifically, we introduce quaternions to represent data and coordinates that contain spatial information. Additionally, we propose a Quaternion Spatial-association Convolution to learn the spatial information. Subsequently, the proposed De-level Quaternion Cross-modality Fusion (De-QCF) module excavates inner space features and fuses cross-modality spatial dependency. Our experimental results demonstrate that our approach performs well compared with competitive methods, using only 0.01061 M parameters and 9.95 G FLOPs.
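To make the quaternion idea concrete, the sketch below implements a Hamilton-product linear layer in which four real weight matrices act jointly on four input components (for example, four co-registered MRI modalities treated as one quaternion-valued feature). It is a simplified illustration, not the paper's Q-CSL or De-QCF modules; shapes, initialisation, and the modality-to-component assignment are assumptions.

```python
import torch
import torch.nn as nn

class QuaternionLinear(nn.Module):
    """Hamilton-product linear map: four real weight matrices act
    jointly on the four quaternion components of the input."""

    def __init__(self, in_features, out_features):
        super().__init__()
        shape = (out_features, in_features)
        self.w_r = nn.Parameter(torch.randn(shape) * 0.02)
        self.w_i = nn.Parameter(torch.randn(shape) * 0.02)
        self.w_j = nn.Parameter(torch.randn(shape) * 0.02)
        self.w_k = nn.Parameter(torch.randn(shape) * 0.02)

    def forward(self, r, i, j, k):
        # Hamilton product W * h, expanded component-wise.
        out_r = r @ self.w_r.T - i @ self.w_i.T - j @ self.w_j.T - k @ self.w_k.T
        out_i = i @ self.w_r.T + r @ self.w_i.T + k @ self.w_j.T - j @ self.w_k.T
        out_j = j @ self.w_r.T - k @ self.w_i.T + r @ self.w_j.T + i @ self.w_k.T
        out_k = k @ self.w_r.T + j @ self.w_i.T - i @ self.w_j.T + r @ self.w_k.T
        return out_r, out_i, out_j, out_k

# Example: four co-registered modality feature vectors as components
# (modality ordering is an arbitrary assumption for illustration).
layer = QuaternionLinear(64, 32)
t1, t1ce, t2, flair = (torch.randn(8, 64) for _ in range(4))
outputs = layer(t1, t1ce, t2, flair)
```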
29
Hussain D, Al-Masni MA, Aslam M, Sadeghi-Niaraki A, Hussain J, Gu YH, Naqvi RA. Revolutionizing tumor detection and classification in multimodality imaging based on deep learning approaches: Methods, applications and limitations. Journal of X-Ray Science and Technology 2024; 32:857-911. PMID: 38701131; DOI: 10.3233/xst-230429.
Abstract
BACKGROUND The emergence of deep learning (DL) techniques has revolutionized tumor detection and classification in medical imaging, with multimodal medical imaging (MMI) gaining recognition for its precision in diagnosis, treatment, and progression tracking. OBJECTIVE This review comprehensively examines DL methods in transforming tumor detection and classification across MMI modalities, aiming to provide insights into advancements, limitations, and key challenges for further progress. METHODS Systematic literature analysis identifies DL studies for tumor detection and classification, outlining methodologies including convolutional neural networks (CNNs), recurrent neural networks (RNNs), and their variants. Integration of multimodality imaging enhances accuracy and robustness. RESULTS Recent advancements in DL-based MMI evaluation methods are surveyed, focusing on tumor detection and classification tasks. Various DL approaches, including CNNs, YOLO, Siamese Networks, Fusion-Based Models, Attention-Based Models, and Generative Adversarial Networks, are discussed with emphasis on PET-MRI, PET-CT, and SPECT-CT. FUTURE DIRECTIONS The review outlines emerging trends and future directions in DL-based tumor analysis, aiming to guide researchers and clinicians toward more effective diagnosis and prognosis. Continued innovation and collaboration are stressed in this rapidly evolving domain. CONCLUSION Conclusions drawn from literature analysis underscore the efficacy of DL approaches in tumor detection and classification, highlighting their potential to address challenges in MMI analysis and their implications for clinical practice.
Affiliation(s)
- Dildar Hussain
- Department of Artificial Intelligence and Data Science, Sejong University, Seoul, Korea
| | - Mohammed A Al-Masni
- Department of Artificial Intelligence and Data Science, Sejong University, Seoul, Korea
| | - Muhammad Aslam
- Department of Artificial Intelligence and Data Science, Sejong University, Seoul, Korea
| | - Abolghasem Sadeghi-Niaraki
- Department of Computer Science & Engineering and Convergence Engineering for Intelligent Drone, XR Research Center, Sejong University, Seoul, Korea
| | - Jamil Hussain
- Department of Artificial Intelligence and Data Science, Sejong University, Seoul, Korea
| | - Yeong Hyeon Gu
- Department of Artificial Intelligence and Data Science, Sejong University, Seoul, Korea
| | - Rizwan Ali Naqvi
- Department of Intelligent Mechatronics Engineering, Sejong University, Seoul, Korea
| |
30
He W, Zhang C, Dai J, Liu L, Wang T, Liu X, Jiang Y, Li N, Xiong J, Wang L, Xie Y, Liang X. A statistical deformation model-based data augmentation method for volumetric medical image segmentation. Med Image Anal 2024; 91:102984. PMID: 37837690; DOI: 10.1016/j.media.2023.102984.
Abstract
The accurate delineation of organs-at-risk (OARs) is a crucial step in treatment planning during radiotherapy, as it minimizes the potential adverse effects of radiation on surrounding healthy organs. However, manual contouring of OARs in computed tomography (CT) images is labor-intensive and susceptible to errors, particularly for low-contrast soft tissue. Deep learning-based artificial intelligence algorithms surpass traditional methods but require large datasets. Obtaining annotated medical images is both time-consuming and expensive, hindering the collection of extensive training sets. To enhance the performance of medical image segmentation, augmentation strategies such as rotation and Gaussian smoothing are employed during preprocessing. However, these conventional data augmentation techniques cannot generate more realistic deformations, limiting improvements in accuracy. To address this issue, this study introduces a statistical deformation model-based data augmentation method for volumetric medical image segmentation. By applying diverse and realistic data augmentation to CT images from a limited patient cohort, our method significantly improves the fully automated segmentation of OARs across various body parts. We evaluate our framework on three datasets containing tumor OARs from the head, neck, chest, and abdomen. Test results demonstrate that the proposed method achieves state-of-the-art performance in numerous OARs segmentation challenges. This innovative approach holds considerable potential as a powerful tool for various medical imaging-related sub-fields, effectively addressing the challenge of limited data access.
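For contrast with the statistical deformation model described above, the sketch below shows the simpler random elastic deformation that conventional augmentation pipelines use; the paper's method instead samples displacement fields from a statistical model built from inter-patient registrations. Parameter values and array shapes here are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates

def random_elastic_deform(volume, alpha=15.0, sigma=4.0, seed=None):
    """Apply a smooth random deformation field to a 3D CT volume.

    Plain random elastic augmentation for illustration only; alpha scales
    the displacement magnitude (in voxels) and sigma smooths the field.
    """
    rng = np.random.default_rng(seed)
    shape = volume.shape
    # One smooth random displacement map per axis.
    displacements = [
        gaussian_filter(rng.uniform(-1, 1, shape), sigma) * alpha
        for _ in range(3)
    ]
    grid = np.meshgrid(*[np.arange(s) for s in shape], indexing="ij")
    coords = [g + d for g, d in zip(grid, displacements)]
    return map_coordinates(volume, coords, order=1, mode="nearest")
```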
Affiliation(s)
- Wenfeng He
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China; School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, China
| | - Chulong Zhang
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
| | - Jingjing Dai
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
| | - Lin Liu
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
| | - Tangsheng Wang
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
| | - Xuan Liu
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
| | - Yuming Jiang
- Department of Radiation Oncology, Wake Forest University School of Medicine, Winston Salem, North Carolina 27157, USA
| | - Na Li
- Department of Biomedical Engineering, Guangdong Medical University, Dongguan, 523808, China
| | - Jing Xiong
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
| | - Lei Wang
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
| | - Yaoqin Xie
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
| | - Xiaokun Liang
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China.
| |
31
Deng Z, Huang G, Yuan X, Zhong G, Lin T, Pun CM, Huang Z, Liang Z. QMLS: quaternion mutual learning strategy for multi-modal brain tumor segmentation. Phys Med Biol 2023; 69:015014. PMID: 38061066; DOI: 10.1088/1361-6560/ad135e.
Abstract
Objective. Due to non-invasive imaging and the multimodality of magnetic resonance imaging (MRI) images, MRI-based multi-modal brain tumor segmentation (MBTS) studies have attracted more and more attention in recent years. With the great success of convolutional neural networks in various computer vision tasks, many MBTS models have been proposed to address the technical challenges of MBTS. However, the problem of limited data collection usually exists in MBTS tasks, so existing studies typically have difficulty fully exploring multi-modal MRI images to mine complementary information among different modalities. Approach. We propose a novel quaternion mutual learning strategy (QMLS), which consists of a voxel-wise lesion knowledge mutual learning mechanism (VLKML mechanism) and a quaternion multi-modal feature learning module (QMFL module). Specifically, the VLKML mechanism allows the networks to converge to a robust minimum so that aggressive data augmentation techniques can be applied to fully exploit the limited data. In particular, the quaternion-valued QMFL module treats different modalities as components of quaternions to sufficiently learn complementary information among different modalities in the hypercomplex domain while significantly reducing the number of parameters by about 75%. Main results. Extensive experiments on the BraTS 2020 and BraTS 2019 datasets indicate that QMLS achieves superior results to current popular methods with less computational cost. Significance. We propose a novel algorithm for the brain tumor segmentation task that achieves better performance with fewer parameters, which facilitates the clinical application of automatic brain tumor segmentation.
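The mutual learning mechanism referenced above can be illustrated with a generic deep-mutual-learning loss, in which two peer segmentation networks regularize each other's voxel-wise predictions through a symmetric KL term. This is a minimal sketch under that generic formulation, not the paper's exact VLKML mechanism; the tensor shapes and the temperature are assumptions.

```python
import torch
import torch.nn.functional as F

def mutual_learning_loss(logits_a, logits_b, temperature=1.0):
    """Symmetric KL term that pushes two peer networks' voxel-wise
    predictions towards each other (generic deep mutual learning).

    logits_a, logits_b: (B, C, D, H, W) segmentation logits
    """
    log_p_a = F.log_softmax(logits_a / temperature, dim=1)
    log_p_b = F.log_softmax(logits_b / temperature, dim=1)
    # Each network treats the peer's (detached) prediction as a soft target.
    kl_a = F.kl_div(log_p_a, log_p_b.exp().detach(), reduction="batchmean")
    kl_b = F.kl_div(log_p_b, log_p_a.exp().detach(), reduction="batchmean")
    return 0.5 * (kl_a + kl_b)
```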
Affiliation(s)
- Zhengnan Deng
- School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, 510006, People's Republic of China
| | - Guoheng Huang
- School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, 510006, People's Republic of China
| | - Xiaochen Yuan
- Faculty of Applied Sciences, Macao Polytechnic University, Macao, People's Republic of China
| | - Guo Zhong
- School of Information Science and Technology, Guangdong University of Foreign Studies, Guangzhou, 510006, People's Republic of China
| | - Tongxu Lin
- School of Automation, Guangdong University of Technology, Guangzhou, 510006, People's Republic of China
| | - Chi-Man Pun
- Department of Computer and Information Science, University of Macau, Macao, People's Republic of China
| | - Zhixin Huang
- Department of Neurology, Guangdong Second Provincial General Hospital, Guangzhou, 510317, People's Republic of China
| | - Zhixin Liang
- Department of Nuclear Medicine, Jinshazhou Hospital, Guangzhou University of Chinese Medicine, Guangzhou, 510168, People's Republic of China
| |
32
Mhlanga ST, Viriri S. Deep learning techniques for isointense infant brain tissue segmentation: a systematic literature review. Front Med (Lausanne) 2023; 10:1240360. PMID: 38193036; PMCID: PMC10773803; DOI: 10.3389/fmed.2023.1240360.
Abstract
Introduction To improve understanding of early brain development in health and disease, it is essential to precisely segment infant brain magnetic resonance imaging (MRI) into white matter (WM), gray matter (GM), and cerebrospinal fluid (CSF). However, in the isointense phase (6-8 months of age), owing to ongoing myelination and maturation, WM and GM display similar intensity levels in both T1-weighted and T2-weighted MRI, making tissue segmentation extremely difficult. Methods This publication presents a comprehensive review of isointense infant brain MRI segmentation approaches. Its main aim and contribution is to aid researchers by providing a thorough review that makes the search for isointense brain MRI segmentation methods easier. The systematic literature review is organised around four points of reference: (1) studies concerning isointense brain MRI segmentation; (2) research contributions, limitations, and future work; (3) frequently applied evaluation metrics and datasets; and (4) the findings of these studies. Results and discussion The review covers studies published between 2012 and 2022. A total of 19 primary studies of isointense brain MRI segmentation were selected to address the research question stated in this review.
Affiliation(s)
| | - Serestina Viriri
- School of Mathematics, Statistics and Computer Science, University of KwaZulu-Natal, Durban, South Africa
| |
33
Chen T, Hong R, Guo Y, Hao S, Hu B. MS²-GNN: Exploring GNN-Based Multimodal Fusion Network for Depression Detection. IEEE Transactions on Cybernetics 2023; 53:7749-7759. PMID: 36194716; DOI: 10.1109/tcyb.2022.3197127.
Abstract
Major depressive disorder (MDD) is one of the most common and severe mental illnesses, posing a huge burden on society and families. Recently, some multimodal methods have been proposed to learn a multimodal embedding for MDD detection and achieved promising performance. However, these methods ignore the heterogeneity/homogeneity among various modalities. Besides, earlier attempts ignore interclass separability and intraclass compactness. Inspired by the above observations, we propose a graph neural network (GNN)-based multimodal fusion strategy named modal-shared modal-specific GNN, which investigates the heterogeneity/homogeneity among various psychophysiological modalities as well as explores the potential relationship between subjects. Specifically, we develop a modal-shared and modal-specific GNN architecture to extract the inter/intramodal characteristics. Furthermore, a reconstruction network is employed to ensure fidelity within the individual modality. Moreover, we impose an attention mechanism on various embeddings to obtain a multimodal compact representation for the subsequent MDD detection task. We conduct extensive experiments on two public depression datasets and the favorable results demonstrate the effectiveness of the proposed algorithm.
34
Gu Y, Guan Y, Yu Z, Dong B. SegCoFusion: An Integrative Multimodal Volumetric Segmentation Cooperating With Fusion Pipeline to Enhance Lesion Awareness. IEEE J Biomed Health Inform 2023; 27:5860-5871. PMID: 37738185; DOI: 10.1109/jbhi.2023.3318131.
Abstract
Multimodal volumetric segmentation and fusion are two valuable techniques for surgical treatment planning, image-guided interventions, tumor growth detection, radiotherapy map generation, etc. In recent years, deep learning has demonstrated excellent capability in both of the above tasks, yet these methods inevitably face bottlenecks. On the one hand, recent segmentation studies, especially the U-Net-style series, have reached a performance ceiling on segmentation tasks. On the other hand, it is almost impossible to capture the ground truth of the fusion in multimodal imaging, due to differences in physical principles among imaging modalities. Hence, most existing studies in the field of multimodal medical image fusion, which fuse only two modalities at a time with hand-crafted proportions, are subjective and task-specific. To address the above concerns, this work proposes an integration of multimodal segmentation and fusion, namely SegCoFusion, which consists of a novel feature frequency dividing network named FDNet and a segmentation part using a dual-single path feature supplementing strategy to optimize the segmentation inputs and couple them with the fusion part. Furthermore, focusing on multimodal brain tumor volumetric fusion and segmentation, the qualitative and quantitative results demonstrate that SegCoFusion can break the performance ceiling of both segmentation and fusion methods. Moreover, the effectiveness of the proposed framework is also revealed by comparison with state-of-the-art fusion methods on 2D two-modality fusion tasks, where our method achieves better fusion performance than others. Therefore, the proposed SegCoFusion offers a novel perspective that improves volumetric fusion performance by cooperating with segmentation and enhances lesion awareness.
35
Lu S, Xiao X, Yan Z, Cheng T, Tan X, Zhao R, Wu H, Shen L, Zhang Z. Prognosis Forecast of Re-Irradiation for Recurrent Nasopharyngeal Carcinoma Based on Deep Learning Multi-Modal Information Fusion. IEEE J Biomed Health Inform 2023; 27:6088-6099. PMID: 37384472; DOI: 10.1109/jbhi.2023.3286656.
Abstract
Radiation therapy is the primary treatment for recurrent nasopharyngeal carcinoma. However, it may induce necrosis of the nasopharynx, leading to severe complications such as bleeding and headache. Therefore, forecasting necrosis of the nasopharynx and initiating timely clinical intervention have important implications for reducing the complications caused by re-irradiation. This research informs clinical decision-making by predicting the outcome of re-irradiation for recurrent nasopharyngeal carcinoma using deep learning multi-modal information fusion between multi-sequence magnetic resonance imaging and the planned dose. Specifically, we assume that the hidden variables of the model data can be divided into two categories: task-consistent and task-inconsistent. The task-consistent variables are characteristic variables contributing to the target tasks, while the task-inconsistent variables are not apparently helpful. These modal characteristics are adaptively fused when the relevant tasks are expressed through the construction of a supervised classification loss and a self-supervised reconstruction loss. The cooperation of the supervised classification loss and the self-supervised reconstruction loss preserves the information of the characteristic space while controlling potential interference. Finally, multi-modal fusion effectively combines information through an adaptive linking module. We evaluated this method on a multi-center dataset and found that prediction based on multi-modal feature fusion outperformed predictions based on single-modal inputs, partial modal fusion, or traditional machine learning methods.
36
Ding W, Li L, Qiu J, Wang S, Huang L, Chen Y, Yang S, Zhuang X. Aligning Multi-Sequence CMR Towards Fully Automated Myocardial Pathology Segmentation. IEEE Transactions on Medical Imaging 2023; 42:3474-3486. PMID: 37347625; DOI: 10.1109/tmi.2023.3288046.
Abstract
Myocardial pathology segmentation (MyoPS) is critical for the risk stratification and treatment planning of myocardial infarction (MI). Multi-sequence cardiac magnetic resonance (MS-CMR) images can provide valuable information. For instance, balanced steady-state free precession cine sequences present clear anatomical boundaries, while late gadolinium enhancement and T2-weighted CMR sequences visualize myocardial scar and edema of MI, respectively. Existing methods usually fuse anatomical and pathological information from different CMR sequences for MyoPS, but assume that these images have been spatially aligned. However, MS-CMR images are usually unaligned due to the respiratory motions in clinical practices, which poses additional challenges for MyoPS. This work presents an automatic MyoPS framework for unaligned MS-CMR images. Specifically, we design a combined computing model for simultaneous image registration and information fusion, which aggregates multi-sequence features into a common space to extract anatomical structures (i.e., myocardium). Consequently, we can highlight the informative regions in the common space via the extracted myocardium to improve MyoPS performance, considering the spatial relationship between myocardial pathologies and myocardium. Experiments on a private MS-CMR dataset and a public dataset from the MYOPS2020 challenge show that our framework could achieve promising performance for fully automatic MyoPS.
37
Li T, Xu Y, Wu T, Charlton JR, Bennett KM, Al-Hindawi F. BlobCUT: A Contrastive Learning Method to Support Small Blob Detection in Medical Imaging. Bioengineering (Basel) 2023; 10:1372. PMID: 38135963; PMCID: PMC10740534; DOI: 10.3390/bioengineering10121372.
Abstract
Medical imaging-based biomarkers derived from small objects (e.g., cell nuclei) play a crucial role in medical applications. However, detecting and segmenting small objects (a.k.a. blobs) remains a challenging task. In this research, we propose a novel 3D small blob detector called BlobCUT. BlobCUT is an unpaired image-to-image (I2I) translation model that falls under the Contrastive Unpaired Translation paradigm. It employs a blob synthesis module to generate synthetic 3D blobs with corresponding masks. This is incorporated into the iterative model training as the ground truth. The I2I translation process is designed with two constraints: (1) a convexity consistency constraint that relies on Hessian analysis to preserve the geometric properties and (2) an intensity distribution consistency constraint based on Kullback-Leibler divergence to preserve the intensity distribution of blobs. BlobCUT learns the inherent noise distribution from the target noisy blob images and performs image translation from the noisy domain to the clean domain, effectively functioning as a denoising process to support blob identification. To validate the performance of BlobCUT, we evaluate it on a 3D simulated dataset of blobs and a 3D MRI dataset of mouse kidneys. We conduct a comparative analysis involving six state-of-the-art methods. Our findings reveal that BlobCUT exhibits superior performance and training efficiency, utilizing only 56.6% of the training time required by the state-of-the-art BlobDetGAN. This underscores the effectiveness of BlobCUT in accurately segmenting small blobs while achieving notable gains in training efficiency.
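The intensity-distribution consistency constraint mentioned above can be approximated by a Kullback-Leibler divergence between intensity histograms of blob voxels before and after translation. The sketch below is a simple histogram-based version of such a term, not the paper's exact formulation; the bin count, intensity range, and masking are assumptions.

```python
import numpy as np

def kl_intensity_consistency(source, translated, mask, bins=64, eps=1e-8):
    """KL divergence between intensity histograms of blob voxels
    before and after image-to-image translation.

    source, translated: 3D volumes with intensities scaled to [0, 1]
    mask: boolean array marking blob voxels (assumed given)
    """
    p, _ = np.histogram(source[mask], bins=bins, range=(0, 1), density=True)
    q, _ = np.histogram(translated[mask], bins=bins, range=(0, 1), density=True)
    p = p / (p.sum() + eps) + eps
    q = q / (q.sum() + eps) + eps
    return float(np.sum(p * np.log(p / q)))
```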
Affiliation(s)
- Teng Li
- School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ 85281, USA; (T.L.); (Y.X.); (F.A.-H.)
| | - Yanzhe Xu
- School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ 85281, USA; (T.L.); (Y.X.); (F.A.-H.)
| | - Teresa Wu
- School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ 85281, USA; (T.L.); (Y.X.); (F.A.-H.)
| | - Jennifer R. Charlton
- Division Nephrology, Department of Pediatrics, University of Virginia, Charlottesville, VA 22903, USA;
| | - Kevin M. Bennett
- Department of Radiology, Washington University, St. Louis, MO 63130, USA;
| | - Firas Al-Hindawi
- School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ 85281, USA; (T.L.); (Y.X.); (F.A.-H.)
| |
38
Wu L, Wang H, Chen Y, Zhang X, Zhang T, Shen N, Tao G, Sun Z, Ding Y, Wang W, Bu J. Beyond radiologist-level liver lesion detection on multi-phase contrast-enhanced CT images by deep learning. iScience 2023; 26:108183. PMID: 38026220; PMCID: PMC10654534; DOI: 10.1016/j.isci.2023.108183.
Abstract
Accurate detection of liver lesions from multi-phase contrast-enhanced CT (CECT) scans is a fundamental step for precise liver diagnosis and treatment. However, the analysis of multi-phase contexts is heavily challenged by the misalignment caused by respiration coupled with the movement of organs. Here, we proposed an AI system for multi-phase liver lesion segmentation (named MULLET) for precise and fully automatic segmentation of real-patient CECT images. MULLET effectively embeds the important ROIs of CECT images and explores multi-phase contexts by introducing a transformer-based attention mechanism. Evaluated on 1,229 CECT scans from 1,197 patients, MULLET demonstrated significant performance gains in terms of Dice, Recall, and F2 score, which are 5.80%, 6.57%, and 5.87% higher than the state of the art, respectively. MULLET has been successfully deployed in real-world settings. The deployed AI web server provides a powerful system to boost clinical workflows of liver lesion diagnosis and could be straightforwardly extended to general CECT analyses.
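The F2 score reported above is the F-beta measure with beta = 2, which weights recall more heavily than precision, reflecting that missing a liver lesion is costlier than a false alarm. A minimal computation from confusion counts is sketched below; mask shapes are assumptions.

```python
import numpy as np

def fbeta_score(pred, truth, beta=2.0, eps=1e-8):
    """F-beta score for binary lesion masks; beta=2 favours recall."""
    tp = np.logical_and(pred, truth).sum()
    fp = np.logical_and(pred, ~truth).sum()
    fn = np.logical_and(~pred, truth).sum()
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall + eps)
```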
Affiliation(s)
- Lei Wu
- Zhejiang Provincial Key Laboratory of Service Robot, College of Computer Science, Zhejiang University, Hangzhou, China
- Pujian Technology, Hangzhou, Zhejiang, China
| | - Haishuai Wang
- Zhejiang Provincial Key Laboratory of Service Robot, College of Computer Science, Zhejiang University, Hangzhou, China
| | - Yining Chen
- Department of Hepatobiliary and Pancreatic Surgery, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Xiang Zhang
- Department of Computer Science, University of North Carolina at Charlotte, Charlotte, NC, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Tianyun Zhang
- Liangzhu Laboratory, Zhejiang University Medical Center, Hangzhou, Zhejiang, China
| | - Ning Shen
- Liangzhu Laboratory, Zhejiang University Medical Center, Hangzhou, Zhejiang, China
| | - Guangyu Tao
- Department of Radiology, Shanghai Chest Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Zhongquan Sun
- Department of Hepatobiliary and Pancreatic Surgery, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Yuan Ding
- Department of Hepatobiliary and Pancreatic Surgery, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Weilin Wang
- Department of Hepatobiliary and Pancreatic Surgery, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Jiajun Bu
- Zhejiang Provincial Key Laboratory of Service Robot, College of Computer Science, Zhejiang University, Hangzhou, China
| |
39
Misra S, Yoon C, Kim K, Managuli R, Barr RG, Baek J, Kim C. Deep learning-based multimodal fusion network for segmentation and classification of breast cancers using B-mode and elastography ultrasound images. Bioeng Transl Med 2023; 8:e10480. PMID: 38023698; PMCID: PMC10658476; DOI: 10.1002/btm2.10480.
Abstract
Ultrasonography is one of the key medical imaging modalities for evaluating breast lesions. For differentiating benign from malignant lesions, computer-aided diagnosis (CAD) systems have greatly assisted radiologists by automatically segmenting and identifying features of lesions. Here, we present deep learning (DL)-based methods to segment the lesions and then classify benign from malignant, utilizing both B-mode and strain elastography (SE-mode) images. We propose a weighted multimodal U-Net (W-MM-U-Net) model for segmenting lesions, in which optimal weights are assigned to the different imaging modalities through a weighted skip-connection method to emphasize their relative importance. We design a multimodal fusion framework (MFF) on cropped B-mode and SE-mode ultrasound (US) lesion images to classify benign and malignant lesions. The MFF consists of an integrated feature network (IFN) and a decision network (DN). Unlike other recent fusion methods, the proposed MFF method can simultaneously learn complementary information from convolutional neural networks (CNNs) trained using B-mode and SE-mode US images. The features from the CNNs are ensembled using the multimodal EmbraceNet model, and the DN classifies the images using those features. The experimental results (sensitivity of 100 ± 0.00% and specificity of 94.28 ± 7.00%) on real-world clinical data showed that the proposed method outperforms the existing single- and multimodal methods. The proposed method correctly classified seven benign patients as benign in three of five trials and six malignant patients as malignant in five of five trials. The proposed method would potentially enhance the classification accuracy of radiologists for breast cancer detection in US images.
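The weighted skip-connection idea described above can be illustrated with a short PyTorch sketch; this is a generic, hedged rendering of the mechanism (learnable per-modality weights normalized with a softmax), not the authors' W-MM-U-Net implementation, and the class and variable names are hypothetical.

```python
import torch
import torch.nn as nn

class WeightedSkipFusion(nn.Module):
    """Fuse skip features from several modality encoders with learnable weights
    (generic sketch of a weighted skip connection, not the authors' exact design)."""
    def __init__(self, num_modalities: int):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(num_modalities))  # one weight per modality

    def forward(self, feats):
        # feats: list of [B, C, H, W] tensors, one per modality
        w = torch.softmax(self.logits, dim=0)  # weights sum to one
        return sum(w[i] * f for i, f in enumerate(feats))

# usage: fuse B-mode and SE-mode skip features of matching shape
fusion = WeightedSkipFusion(num_modalities=2)
b_mode = torch.randn(1, 64, 128, 128)
se_mode = torch.randn(1, 64, 128, 128)
fused = fusion([b_mode, se_mode])  # [1, 64, 128, 128]
```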
Collapse
Affiliation(s)
- Sampa Misra
- Department of Electrical Engineering, Convergence IT Engineering, Mechanical Engineering, Medical Device Innovation Center, and Graduate School of Artificial Intelligence, Pohang University of Science and Technology, Pohang, South Korea
| | - Chiho Yoon
- Department of Electrical Engineering, Convergence IT Engineering, Mechanical Engineering, Medical Device Innovation Center, and Graduate School of Artificial Intelligence, Pohang University of Science and Technology, Pohang, South Korea
| | - Kwang‐Ju Kim
- Daegu-Gyeongbuk Research Center, Electronics and Telecommunications Research Institute (ETRI), Daegu, South Korea
| | - Ravi Managuli
- Department of Bioengineering, University of Washington, Seattle, Washington, USA
| | - Richard G. Barr
- Department of Radiology, Northeastern Ohio Medical University, Youngstown, Ohio, USA
| | - Jongduk Baek
- School of Integrated Technology, Yonsei University, Seoul, South Korea
| | - Chulhong Kim
- Department of Electrical Engineering, Convergence IT Engineering, Mechanical Engineering, Medical Device Innovation Center, and Graduate School of Artificial Intelligence, Pohang University of Science and Technology, Pohang, South Korea
| |
Collapse
|
40
|
Suh PS, Jung W, Suh CH, Kim J, Oh J, Heo H, Shim WH, Lim JS, Lee JH, Kim HS, Kim SJ. Development and validation of a deep learning-based automatic segmentation model for assessing intracranial volume: comparison with NeuroQuant, FreeSurfer, and SynthSeg. Front Neurol 2023; 14:1221892. [PMID: 37719763 PMCID: PMC10503131 DOI: 10.3389/fneur.2023.1221892] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/13/2023] [Accepted: 08/07/2023] [Indexed: 09/19/2023] Open
Abstract
Background and purpose To develop and validate a deep learning-based automatic segmentation model for assessing intracranial volume (ICV) and to compare its accuracy with that of NeuroQuant (NQ), FreeSurfer (FS), and SynthSeg. Materials and methods This retrospective study included 60 subjects [30 Alzheimer's disease (AD), 21 mild cognitive impairment (MCI), 9 cognitively normal (CN)] from a single tertiary hospital for the training and validation group (50:10). The test group included 40 subjects (20 AD, 10 MCI, 10 CN) from the ADNI dataset. We propose a robust ICV segmentation model based on the foundational 2D UNet architecture trained with four types of input images (both single and multimodality using scaled or unscaled T1-weighted and T2-FLAIR MR images). For comparison with our model, NQ, FS, and SynthSeg were also applied to the test group. We evaluated model performance by measuring the Dice similarity coefficient (DSC) and the average volume difference. Results The single-modality model trained with scaled T1-weighted images showed excellent performance with a DSC of 0.989 ± 0.002 and an average volume difference of 0.46% ± 0.38%. Our multimodality model trained with both unscaled T1-weighted and T2-FLAIR images showed similar performance with a DSC of 0.988 ± 0.002 and an average volume difference of 0.47% ± 0.35%. The overall average volume difference of our model was smaller than that of NQ (2.15% ± 1.72%), FS (3.69% ± 2.93%), and SynthSeg (1.88% ± 1.18%), indicating higher accuracy. Furthermore, our model outperformed the three others in each subgroup of patients with AD, MCI, and CN subjects. Conclusion Our deep learning-based automatic ICV segmentation model showed excellent performance for the automatic evaluation of ICV.
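The "average volume difference" reported above compares predicted and reference intracranial volumes; a minimal numpy sketch of one plausible way to compute it from binary masks is shown below (hypothetical helper, assuming a known per-voxel volume), not the authors' code.

```python
import numpy as np

def volume_difference_percent(pred_mask: np.ndarray, ref_mask: np.ndarray,
                              voxel_volume_ml: float = 0.001) -> float:
    """Absolute volume difference between two binary masks, as a percentage of
    the reference volume. voxel_volume_ml is the volume of a single voxel."""
    pred_vol = pred_mask.astype(bool).sum() * voxel_volume_ml
    ref_vol = ref_mask.astype(bool).sum() * voxel_volume_ml
    return abs(pred_vol - ref_vol) / ref_vol * 100.0
```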
Collapse
Affiliation(s)
- Pae Sun Suh
- Department of Radiology, Asan Medical Center, Seoul, Republic of Korea
| | | | - Chong Hyun Suh
- Department of Radiology, Asan Medical Center, Seoul, Republic of Korea
| | | | - Jio Oh
- R&D Center, VUNO, Seoul, Republic of Korea
| | - Hwon Heo
- Department of Radiology, Asan Medical Center, Seoul, Republic of Korea
| | - Woo Hyun Shim
- Department of Radiology, Asan Medical Center, Seoul, Republic of Korea
| | - Jae-Sung Lim
- Department of Neurology, Asan Medical Center, College of Medicine, University of Ulsan, Ulsan, Republic of Korea
| | - Jae-Hong Lee
- Department of Neurology, Asan Medical Center, College of Medicine, University of Ulsan, Ulsan, Republic of Korea
| | - Ho Sung Kim
- Department of Radiology, Asan Medical Center, Seoul, Republic of Korea
| | - Sang Joon Kim
- Department of Radiology, Asan Medical Center, Seoul, Republic of Korea
| |
Collapse
|
41
|
Liu H, Zhuang Y, Song E, Xu X, Ma G, Cetinkaya C, Hung CC. A modality-collaborative convolution and transformer hybrid network for unpaired multi-modal medical image segmentation with limited annotations. Med Phys 2023; 50:5460-5478. [PMID: 36864700 DOI: 10.1002/mp.16338] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/28/2022] [Revised: 02/07/2023] [Accepted: 02/22/2023] [Indexed: 03/04/2023] Open
Abstract
BACKGROUND Multi-modal learning is widely adopted to learn the latent complementary information between different modalities in multi-modal medical image segmentation tasks. Nevertheless, traditional multi-modal learning methods require spatially well-aligned and paired multi-modal images for supervised training, and therefore cannot leverage unpaired multi-modal images with spatial misalignment and modality discrepancy. For training accurate multi-modal segmentation networks using easily accessible and low-cost unpaired multi-modal images in clinical practice, unpaired multi-modal learning has recently received comprehensive attention. PURPOSE Existing unpaired multi-modal learning methods usually focus on the intensity distribution gap but ignore the scale variation problem between different modalities. Moreover, existing methods frequently employ shared convolutional kernels to capture common patterns in all modalities, but these kernels are typically inefficient at learning global contextual information. In addition, existing methods rely heavily on a large number of labeled unpaired multi-modal scans for training, which ignores the practical scenario in which labeled data are limited. To solve the above problems, we propose a modality-collaborative convolution and transformer hybrid network (MCTHNet) using semi-supervised learning for unpaired multi-modal segmentation with limited annotations, which not only collaboratively learns modality-specific and modality-invariant representations, but can also automatically leverage extensive unlabeled scans to improve performance. METHODS We make three main contributions in the proposed method. First, to alleviate the intensity distribution gap and scale variation problems across modalities, we develop a modality-specific scale-aware convolution (MSSC) module that can adaptively adjust the receptive field sizes and feature normalization parameters according to the input. Second, we propose a modality-invariant vision transformer (MIViT) module as the shared bottleneck layer for all modalities, which implicitly combines convolution-like local operations with the global processing of transformers to learn generalizable modality-invariant representations. Third, we design a multi-modal cross pseudo supervision (MCPS) method for semi-supervised learning, which enforces consistency between the pseudo segmentation maps generated by two perturbed networks to acquire abundant annotation information from unlabeled unpaired multi-modal scans. RESULTS Extensive experiments are performed on two unpaired CT and MR segmentation datasets, including a cardiac substructure dataset derived from the MMWHS-2017 dataset and an abdominal multi-organ dataset consisting of the BTCV and CHAOS datasets. Experimental results show that our proposed method significantly outperforms other existing state-of-the-art methods under various labeling ratios, and achieves segmentation performance close to that of single-modal methods trained with fully labeled data while leveraging only a small portion of labeled data. Specifically, when the labeling ratio is 25%, our proposed method achieves overall mean DSC values of 78.56% and 76.18% in cardiac and abdominal segmentation, respectively, which improves the average DSC value of the two tasks by 12.84% compared with single-modal U-Net models. CONCLUSIONS Our proposed method is beneficial for reducing the annotation burden of unpaired multi-modal medical images in clinical applications.
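The MCPS idea described above follows the general cross pseudo supervision scheme, in which each of two perturbed networks is trained against the hard pseudo-labels of its peer on unlabeled scans. A minimal, hedged PyTorch sketch of that generic consistency loss (not the authors' exact formulation; the function name is hypothetical) is:

```python
import torch
import torch.nn.functional as F

def cross_pseudo_supervision_loss(logits_a: torch.Tensor, logits_b: torch.Tensor) -> torch.Tensor:
    """Generic cross pseudo supervision: each network learns from the argmax
    pseudo-labels of its peer. logits_*: [B, num_classes, H, W] on the same
    unlabeled batch."""
    pseudo_a = logits_a.argmax(dim=1).detach()    # pseudo-labels from network A
    pseudo_b = logits_b.argmax(dim=1).detach()    # pseudo-labels from network B
    loss_a = F.cross_entropy(logits_a, pseudo_b)  # A supervised by B's labels
    loss_b = F.cross_entropy(logits_b, pseudo_a)  # B supervised by A's labels
    return 0.5 * (loss_a + loss_b)
```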
Collapse
Affiliation(s)
- Hong Liu
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
| | - Yuzhou Zhuang
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
| | - Enmin Song
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
| | - Xiangyang Xu
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
| | - Guangzhi Ma
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
| | - Coskun Cetinkaya
- Center for Machine Vision and Security Research, Kennesaw State University, Kennesaw, Georgia, USA
| | - Chih-Cheng Hung
- Center for Machine Vision and Security Research, Kennesaw State University, Kennesaw, Georgia, USA
| |
Collapse
|
42
|
Jiang M, Chiu B. A Dual-Stream Centerline-Guided Network for Segmentation of the Common and Internal Carotid Arteries From 3D Ultrasound Images. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:2690-2705. [PMID: 37015114 DOI: 10.1109/tmi.2023.3263537] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/19/2023]
Abstract
Segmentation of the carotid section encompassing the common carotid artery (CCA), the bifurcation and the internal carotid artery (ICA) from three-dimensional ultrasound (3DUS) is required to measure the vessel wall volume (VWV) and localized vessel-wall-plus-plaque thickness (VWT), shown to be sensitive to treatment effect. We propose an approach that combines a centerline extraction network (CHG-Net) and a dual-stream centerline-guided network (DSCG-Net) to segment the lumen-intima boundary (LIB) and media-adventitia boundary (MAB) from 3DUS images. Correct arterial location is essential for successful segmentation of the carotid section encompassing the bifurcation. We addressed this challenge by using the arterial centerline to enhance the localization accuracy of the segmentation network. The CHG-Net was developed to generate a heatmap indicating high-probability regions for the centerline location, which was then integrated with the 3DUS image by the DSCG-Net to generate the MAB and LIB. The DSCG-Net includes a scale-based and a spatial attention mechanism to fuse multi-level features extracted by the encoder, and a centerline heatmap reconstruction side-branch connected to the end of the encoder to increase the generalization ability of the network. Experiments involving 224 3DUS volumes produced a Dice similarity coefficient (DSC) of 95.8±1.9% and 92.3±5.4% for CCA MAB and LIB, respectively, and 93.2±4.4% and 89.0±10.0% for ICA MAB and LIB, respectively. Our approach outperformed four state-of-the-art 3D CNN models, even after their performance was boosted by centerline guidance. The efficiency afforded by the framework would allow it to be incorporated into the clinical workflow for improved quantification of plaque change.
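Centerline heatmaps of the kind produced by CHG-Net are often constructed for training by placing a Gaussian around each centerline voxel; the following numpy/scipy sketch illustrates that general construction only (hypothetical helper, not the authors' code), assuming integer voxel coordinates for the centerline.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def centerline_heatmap(shape, centerline_voxels, sigma: float = 3.0) -> np.ndarray:
    """Gaussian heatmap target around a set of centerline voxel coordinates.
    shape: volume shape; centerline_voxels: iterable of (z, y, x) indices."""
    mask = np.zeros(shape, dtype=bool)
    for z, y, x in centerline_voxels:
        mask[z, y, x] = True
    # distance of every voxel to the nearest centerline voxel
    dist = distance_transform_edt(~mask)
    return np.exp(-(dist ** 2) / (2 * sigma ** 2))
```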
Collapse
|
43
|
Dou M, Chen Z, Tang Y, Sheng L, Zhou J, Wang X, Yao Y. Segmentation of rectal tumor from multi-parametric MRI images using an attention-based fusion network. Med Biol Eng Comput 2023; 61:2379-2389. [PMID: 37084029 DOI: 10.1007/s11517-023-02828-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/03/2022] [Accepted: 03/08/2023] [Indexed: 04/22/2023]
Abstract
Accurate segmentation of rectal tumors is the most crucial task in determining the stage of rectal cancer and developing suitable therapies. However, complex image backgrounds, irregular edges, and poor contrast hinder the related research. This study presents an attention-based multi-modal fusion module to effectively integrate complementary information from different MRI images and suppress redundancy. In addition, a deep learning-based segmentation model (AF-UNet) is designed to achieve accurate segmentation of rectal tumors. This model takes multi-parametric MRI images as input and effectively integrates the features from different multi-parametric MRI images by embedding the attention fusion module. Finally, three types of MRI images (T2, ADC, DWI) of 250 patients with rectal cancer were collected, with the tumor regions delineated by two oncologists. The experimental results show that the proposed method is superior to state-of-the-art image segmentation methods, with a Dice coefficient of [Formula: see text], and also outperforms other multi-modal fusion methods. Graphical abstract: framework of the AF-UNet, which takes multi-modal MRI images as input, integrates complementary information using an attention mechanism, and suppresses redundancy.
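Attention-based fusion of multi-parametric MRI features, as described above, can be sketched generically as per-modality spatial attention maps normalized over the modality axis; the PyTorch snippet below is an illustrative sketch of that general mechanism (class and variable names are hypothetical, and this is not the authors' AF-UNet module).

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Softmax-weighted fusion of per-modality feature maps (generic sketch)."""
    def __init__(self, channels: int):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=1)  # shared scoring convolution

    def forward(self, feats):
        # feats: list of [B, C, H, W] tensors, one per MRI sequence
        scores = torch.stack([self.score(f) for f in feats], dim=0)  # [M, B, 1, H, W]
        attn = torch.softmax(scores, dim=0)                          # weights over modalities
        return sum(attn[i] * f for i, f in enumerate(feats))         # [B, C, H, W]

# usage with three hypothetical feature maps from T2, ADC, and DWI streams
t2, adc, dwi = (torch.randn(1, 32, 96, 96) for _ in range(3))
fused = AttentionFusion(channels=32)([t2, adc, dwi])
```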
Collapse
Affiliation(s)
- Meng Dou
- Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Zhebin Chen
- Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Yuanling Tang
- Department of Abdominal Oncology, West China Hospital, Sichuan University, Chengdu, China
| | - Leiming Sheng
- Department of Abdominal Oncology, West China Hospital, Sichuan University, Chengdu, China
| | - Jitao Zhou
- Department of Abdominal Oncology, West China Hospital, Sichuan University, Chengdu, China
| | - Xin Wang
- Department of Abdominal Oncology, West China Hospital, Sichuan University, Chengdu, China
| | - Yu Yao
- Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu, China.
- University of Chinese Academy of Sciences, Beijing, China.
| |
Collapse
|
44
|
Jiang M, Yuan B, Kou W, Yan W, Marshall H, Yang Q, Syer T, Punwani S, Emberton M, Barratt DC, Cho CCM, Hu Y, Chiu B. Prostate cancer segmentation from MRI by a multistream fusion encoder. Med Phys 2023; 50:5489-5504. [PMID: 36938883 DOI: 10.1002/mp.16374] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/08/2022] [Revised: 02/15/2023] [Accepted: 03/03/2023] [Indexed: 03/21/2023] Open
Abstract
BACKGROUND Targeted prostate biopsy guided by multiparametric magnetic resonance imaging (mpMRI) detects more clinically significant lesions than conventional systemic biopsy. Lesion segmentation is required for planning MRI-targeted biopsies. The requirement for integrating image features available in T2-weighted and diffusion-weighted images poses a challenge in prostate lesion segmentation from mpMRI. PURPOSE A flexible and efficient multistream fusion encoder is proposed in this work to facilitate the multiscale fusion of features from multiple imaging streams. A patch-based loss function is introduced to improve the accuracy in segmenting small lesions. METHODS The proposed multistream encoder fuses features extracted in the three imaging streams at each layer of the network, thereby allowing improved feature maps to propagate downstream and benefit segmentation performance. The fusion is achieved through a spatial attention map generated by optimally weighting the contribution of the convolution outputs from each stream. This design provides flexibility for the network to highlight image modalities according to their relative influence on segmentation performance. The encoder also performs multiscale integration by highlighting the input feature maps (low-level features) with the spatial attention maps generated from convolution outputs (high-level features). The Dice similarity coefficient (DSC), when used as a cost function, is relatively insensitive to incorrect segmentation of small lesions. We address this issue by introducing a patch-based loss function that averages the DSCs obtained from local image patches. This local average DSC is equally sensitive to large and small lesions, as the patch-based DSCs associated with small and large lesions have equal weights in the average. RESULTS The framework was evaluated on 931 sets of images acquired in several clinical studies at two centers in Hong Kong and the United Kingdom. In particular, the training, validation, and test sets contained 615, 144, and 172 sets of images, respectively. The proposed framework outperformed single-stream networks and three recently proposed multistream networks, attaining F1 scores of 82.2% and 87.6% at the lesion and patient levels, respectively. The average inference time for an axial image was 11.8 ms. CONCLUSION The accuracy and efficiency afforded by the proposed framework would accelerate the MRI interpretation workflow of MRI-targeted biopsy and focal therapies.
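A patch-based Dice loss of the kind described above averages local Dice scores so that small lesions contribute as much as large ones; the PyTorch sketch below illustrates one plausible formulation using non-overlapping patches (hypothetical function, not the authors' implementation).

```python
import torch

def patch_dice_loss(probs: torch.Tensor, target: torch.Tensor,
                    patch: int = 32, eps: float = 1e-6) -> torch.Tensor:
    """Average (1 - Dice) over non-overlapping local patches.
    probs, target: [B, 1, H, W]; H and W are assumed divisible by `patch`."""
    p = probs.unfold(2, patch, patch).unfold(3, patch, patch)   # [B, 1, nH, nW, patch, patch]
    t = target.unfold(2, patch, patch).unfold(3, patch, patch)
    inter = (p * t).sum(dim=(-1, -2))
    denom = p.sum(dim=(-1, -2)) + t.sum(dim=(-1, -2))
    dice = (2 * inter + eps) / (denom + eps)                    # per-patch Dice; empty patches -> 1
    return 1.0 - dice.mean()
```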
Collapse
Affiliation(s)
- Mingjie Jiang
- Department of Electrical Engineering, City University of Hong Kong, Hong Kong SAR, China
| | - Baohua Yuan
- Department of Electrical Engineering, City University of Hong Kong, Hong Kong SAR, China
- Aliyun School of Big Data, Changzhou University, Changzhou, China
| | - Weixuan Kou
- Department of Electrical Engineering, City University of Hong Kong, Hong Kong SAR, China
| | - Wen Yan
- Department of Electrical Engineering, City University of Hong Kong, Hong Kong SAR, China
- Centre for Medical Image Computing, Wellcome/EPSRC Centre for Interventional & Surgical Sciences, Department of Medical Physics & Biomedical Engineering, University College London, London, UK
| | - Harry Marshall
- Schulich School of Medicine & Dentistry, Western University, Ontario, Canada
| | - Qianye Yang
- Centre for Medical Image Computing, Wellcome/EPSRC Centre for Interventional & Surgical Sciences, Department of Medical Physics & Biomedical Engineering, University College London, London, UK
| | - Tom Syer
- Centre for Medical Imaging, University College London, London, UK
| | - Shonit Punwani
- Centre for Medical Imaging, University College London, London, UK
| | - Mark Emberton
- Division of Surgery & Interventional Science, University College London, London, UK
| | - Dean C Barratt
- Centre for Medical Image Computing, Wellcome/EPSRC Centre for Interventional & Surgical Sciences, Department of Medical Physics & Biomedical Engineering, University College London, London, UK
| | - Carmen C M Cho
- Prince of Wales Hospital and Department of Imaging and Intervention Radiology, Chinese University of Hong Kong, Hong Kong SAR, China
| | - Yipeng Hu
- Centre for Medical Image Computing, Wellcome/EPSRC Centre for Interventional & Surgical Sciences, Department of Medical Physics & Biomedical Engineering, University College London, London, UK
| | - Bernard Chiu
- Department of Electrical Engineering, City University of Hong Kong, Hong Kong SAR, China
| |
Collapse
|
45
|
Wang N, Lin S, Li X, Li K, Shen Y, Gao Y, Ma L. MISSU: 3D Medical Image Segmentation via Self-Distilling TransUNet. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:2740-2750. [PMID: 37018113 DOI: 10.1109/tmi.2023.3264433] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/19/2023]
Abstract
U-Nets have achieved tremendous success in medical image segmentation. Nevertheless, they may have limitations in global (long-range) contextual interactions and edge-detail preservation. In contrast, the Transformer module has an excellent ability to capture long-range dependencies by introducing the self-attention mechanism into the encoder. Although the Transformer module is designed to model long-range dependencies on the extracted feature maps, it still suffers from high computational and spatial complexity when processing high-resolution 3D feature maps. This motivates us to design an efficient Transformer-based UNet model and to study the feasibility of Transformer-based network architectures for medical image segmentation tasks. To this end, we propose to self-distill a Transformer-based UNet for medical image segmentation, which simultaneously learns global semantic information and local spatial-detailed features. In addition, a local multi-scale fusion block is proposed to refine fine-grained details from the encoder skip connections through self-distillation from the main CNN stem; this block is computed only during training and removed at inference with minimal overhead. Extensive experiments on the BraTS 2019 and CHAOS datasets show that our MISSU outperforms previous state-of-the-art methods. Code and models are available at: https://github.com/wangn123/MISSU.git.
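Self-distillation in this context broadly means an auxiliary branch is trained to match features of the main stem during training only; the PyTorch sketch below shows a generic feature-matching distillation loss of that kind, under the stated assumption that spatial sizes may differ (this is not the authors' exact formulation, and the function name is hypothetical).

```python
import torch
import torch.nn.functional as F

def feature_distillation_loss(student_feat: torch.Tensor, teacher_feat: torch.Tensor) -> torch.Tensor:
    """Generic feature-matching distillation: the auxiliary (student) features are
    pushed towards the main-stem (teacher) features, which are detached so that only
    the auxiliary branch receives this gradient."""
    if student_feat.shape[-2:] != teacher_feat.shape[-2:]:
        student_feat = F.interpolate(student_feat, size=teacher_feat.shape[-2:],
                                     mode="bilinear", align_corners=False)
    return F.mse_loss(student_feat, teacher_feat.detach())
```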
Collapse
|
46
|
Shen DD, Bao SL, Wang Y, Chen YC, Zhang YC, Li XC, Ding YC, Jia ZZ. An automatic and accurate deep learning-based neuroimaging pipeline for the neonatal brain. Pediatr Radiol 2023; 53:1685-1697. [PMID: 36884052 DOI: 10.1007/s00247-023-05620-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 10/20/2022] [Revised: 01/26/2023] [Accepted: 01/27/2023] [Indexed: 03/09/2023]
Abstract
BACKGROUND Accurate segmentation of neonatal brain tissues and structures is crucial for studying normal development and diagnosing early neurodevelopmental disorders. However, there is a lack of an end-to-end pipeline for automated segmentation and imaging analysis of the normal and abnormal neonatal brain. OBJECTIVE To develop and validate a deep learning-based pipeline for neonatal brain segmentation and analysis of structural magnetic resonance images (MRI). MATERIALS AND METHODS Two cohorts were enrolled in the study: cohort 1 (582 neonates from the developing Human Connectome Project) and cohort 2 (37 neonates imaged using a 3.0-tesla MRI scanner in our hospital). We developed a deep learning-based architecture capable of brain segmentation into 9 tissues and 87 structures. Extensive validations were then performed for accuracy, effectiveness, robustness, and generality of the pipeline. Furthermore, regional volume and cortical surface estimation were measured through an in-house bash script implemented with FSL (Oxford Centre for Functional MRI of the Brain Software Library) to ensure reliability of the pipeline. The Dice similarity coefficient (DSC), 95th percentile Hausdorff distance (H95), and intraclass correlation coefficient (ICC) were calculated to assess the quality of our pipeline. Finally, we fine-tuned and validated our pipeline on 2-dimensional thick-slice MRI in cohorts 1 and 2. RESULTS The deep learning-based model showed excellent performance for neonatal brain tissue and structural segmentation, with the best DSC and H95 of 0.96 and 0.99 mm, respectively. In terms of regional volume and cortical surface analysis, our model showed good agreement with ground truth. The ICC values for regional volume were all above 0.80. For the thick-slice image pipeline, the same trend was observed for brain segmentation and analysis. The best DSC and H95 were 0.92 and 3.00 mm, respectively. The regional volumes and surface curvature had ICC values just below 0.80. CONCLUSIONS We propose an automatic, accurate, stable, and reliable pipeline for neonatal brain segmentation and analysis from thin and thick structural MRI. The external validation showed very good reproducibility of the pipeline.
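The H95 metric reported above is the 95th percentile of symmetric surface distances between predicted and reference masks; as a reading aid, a minimal numpy/scipy sketch is shown below (hypothetical helper, assuming non-empty binary masks and known voxel spacing), not the authors' code.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt, binary_erosion

def hausdorff_95(pred: np.ndarray, ref: np.ndarray, spacing=(1.0, 1.0, 1.0)) -> float:
    """95th percentile symmetric surface distance between two non-empty binary masks."""
    def surface(mask):
        return mask & ~binary_erosion(mask)  # boundary voxels of the mask
    pred_s, ref_s = surface(pred.astype(bool)), surface(ref.astype(bool))
    d_to_ref = distance_transform_edt(~ref_s, sampling=spacing)[pred_s]   # pred surface -> ref surface
    d_to_pred = distance_transform_edt(~pred_s, sampling=spacing)[ref_s]  # ref surface -> pred surface
    return float(np.percentile(np.concatenate([d_to_ref, d_to_pred]), 95))
```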
Collapse
Affiliation(s)
- Dan Dan Shen
- Department of Medical Imaging, Affiliated Hospital and Medical School of Nantong University, NO.20 Xisi Road, Nantong, Jiangsu, 226001, People's Republic of China
| | - Shan Lei Bao
- Department of Nuclear Medicine, Affiliated Hospital and Medical School of Nantong University, Jiangsu, People's Republic of China
| | - Yan Wang
- Department of Medical Imaging, Affiliated Hospital and Medical School of Nantong University, NO.20 Xisi Road, Nantong, Jiangsu, 226001, People's Republic of China
| | - Ying Chi Chen
- Department of Medical Imaging, Affiliated Hospital and Medical School of Nantong University, NO.20 Xisi Road, Nantong, Jiangsu, 226001, People's Republic of China
| | - Yu Cheng Zhang
- Department of Medical Imaging, Affiliated Hospital and Medical School of Nantong University, NO.20 Xisi Road, Nantong, Jiangsu, 226001, People's Republic of China
| | - Xing Can Li
- Department of Medical Imaging, Affiliated Hospital and Medical School of Nantong University, NO.20 Xisi Road, Nantong, Jiangsu, 226001, People's Republic of China
| | - Yu Chen Ding
- Department of Medical Imaging, Affiliated Hospital and Medical School of Nantong University, NO.20 Xisi Road, Nantong, Jiangsu, 226001, People's Republic of China
| | - Zhong Zheng Jia
- Department of Medical Imaging, Affiliated Hospital and Medical School of Nantong University, NO.20 Xisi Road, Nantong, Jiangsu, 226001, People's Republic of China.
| |
Collapse
|
47
|
Wan P, Xue H, Liu C, Chen F, Kong W, Zhang D. Dynamic Perfusion Representation and Aggregation Network for Nodule Segmentation Using Contrast-Enhanced US. IEEE J Biomed Health Inform 2023; 27:3431-3442. [PMID: 37097791 DOI: 10.1109/jbhi.2023.3270307] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/26/2023]
Abstract
Dynamic contrast-enhanced ultrasound (CEUS) imaging has been widely applied in lesion detection and characterization, owing to the real-time observation of microvascular perfusion it offers. Accurate lesion segmentation is of great importance to quantitative and qualitative perfusion analysis. In this paper, we propose a novel dynamic perfusion representation and aggregation network (DpRAN) for the automatic segmentation of lesions using dynamic CEUS imaging. The core challenge of this work lies in modeling the enhancement dynamics of various perfusion areas. Specifically, we divide enhancement features into two scales: short-range enhancement patterns and long-range evolution tendency. To effectively represent real-time enhancement characteristics and aggregate them in a global view, we introduce the perfusion excitation (PE) gate and the cross-attention temporal aggregation (CTA) module, respectively. Unlike common temporal fusion methods, we also introduce an uncertainty estimation strategy to assist the model in first locating the critical enhancement point, at which a relatively distinctive enhancement pattern is displayed. The segmentation performance of our DpRAN method is validated on our collected CEUS datasets of thyroid nodules. We obtain a mean Dice similarity coefficient (DSC) and intersection over union (IoU) of 0.794 and 0.676, respectively. This superior performance demonstrates its efficacy in capturing distinctive enhancement characteristics for lesion recognition.
Collapse
|
48
|
Yang H, Zhou T, Zhou Y, Zhang Y, Fu H. Flexible Fusion Network for Multi-Modal Brain Tumor Segmentation. IEEE J Biomed Health Inform 2023; 27:3349-3359. [PMID: 37126623 DOI: 10.1109/jbhi.2023.3271808] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/03/2023]
Abstract
Automated brain tumor segmentation is crucial for aiding brain disease diagnosis and evaluating disease progress. Currently, magnetic resonance imaging (MRI) is a routinely adopted approach in the field of brain tumor segmentation that can provide different modality images. It is critical to leverage multi-modal images to boost brain tumor segmentation performance. Existing works commonly concentrate on generating a shared representation by fusing multi-modal data, while few methods take into account modality-specific characteristics. Besides, how to efficiently fuse arbitrary numbers of modalities remains a difficult task. In this study, we present a flexible fusion network (termed F2Net) for multi-modal brain tumor segmentation, which can flexibly fuse arbitrary numbers of multi-modal inputs to explore complementary information while maintaining the specific characteristics of each modality. Our F2Net is based on the encoder-decoder structure, which utilizes two Transformer-based feature learning streams and a cross-modal shared learning network to extract individual and shared feature representations. To effectively integrate the knowledge from the multi-modality data, we propose a cross-modal feature-enhanced module (CFM) and a multi-modal collaboration module (MCM), which aim to fuse the multi-modal features into the shared learning network and to incorporate the features from the encoders into the shared decoder, respectively. Extensive experimental results on multiple benchmark datasets demonstrate the effectiveness of our F2Net over other state-of-the-art segmentation methods.
Collapse
|
49
|
Li L, Hu Z, Huang Y, Zhu W, Zhao C, Wang Y, Chen M, Yu J. BP-Net: Boundary and perfusion feature guided dual-modality ultrasound video analysis network for fibrous cap integrity assessment. Comput Med Imaging Graph 2023; 107:102246. [PMID: 37210966 DOI: 10.1016/j.compmedimag.2023.102246] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/05/2023] [Revised: 05/09/2023] [Accepted: 05/10/2023] [Indexed: 05/23/2023]
Abstract
Ultrasonography is one of the main imaging methods for monitoring and diagnosing atherosclerosis due to its non-invasiveness and low cost. Automatic assessment of carotid plaque fibrous cap integrity using multi-modal ultrasound videos has significant diagnostic and prognostic value for patients with cardiovascular and cerebrovascular disease. However, the task faces several challenges, including high variation in plaque location and shape, the absence of an analysis mechanism focusing on the fibrous cap, and the lack of an effective mechanism to capture the relevance among multi-modal data for feature fusion and selection. To overcome these challenges, we propose a new target boundary and perfusion feature guided video analysis network (BP-Net) based on conventional B-mode ultrasound and contrast-enhanced ultrasound videos for assessing the integrity of the fibrous cap. Building on our previously proposed plaque auto-tracking network, our BP-Net further introduces a plaque edge attention module and a reverse mechanism to focus the dual-video analysis on the fibrous cap of plaques. Moreover, to fully explore the rich information on the fibrous cap and inside/outside the plaque, we propose a feature fusion module for the B-mode and contrast-enhanced videos to select the most valuable features for fibrous cap integrity assessment. Finally, multi-head convolution attention is proposed and embedded into a transformer-based network, which captures semantic features and global context information to accurately evaluate fibrous cap integrity. The experimental results demonstrate that the proposed method achieves high accuracy and generalizability, with an accuracy of 92.35% and an AUC of 0.935, outperforming state-of-the-art deep learning-based methods. A series of comprehensive ablation studies confirms the effectiveness of each proposed component and shows great potential for clinical application.
Collapse
Affiliation(s)
- Leyin Li
- School of Information Science and Technology, Fudan University, Shanghai, China
| | - Zhaoyu Hu
- School of Information Science and Technology, Fudan University, Shanghai, China
| | - Yunqian Huang
- Department of Ultrasound, Tongren Hospital, Shanghai Jiao Tong University, Shanghai, China
| | - Wenqian Zhu
- Department of Ultrasound, Tongren Hospital, Shanghai Jiao Tong University, Shanghai, China
| | - Chengqian Zhao
- School of Information Science and Technology, Fudan University, Shanghai, China
| | - Yuanyuan Wang
- School of Information Science and Technology, Fudan University, Shanghai, China
| | - Man Chen
- Department of Ultrasound, Tongren Hospital, Shanghai Jiao Tong University, Shanghai, China.
| | - Jinhua Yu
- School of Information Science and Technology, Fudan University, Shanghai, China.
| |
Collapse
|
50
|
Černý M, Kybic J, Májovský M, Sedlák V, Pirgl K, Misiorzová E, Lipina R, Netuka D. Fully automated imaging protocol independent system for pituitary adenoma segmentation: a convolutional neural network-based model on sparsely annotated MRI. Neurosurg Rev 2023; 46:116. [PMID: 37162632 DOI: 10.1007/s10143-023-02014-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2023] [Revised: 03/08/2023] [Accepted: 04/28/2023] [Indexed: 05/11/2023]
Abstract
This study aims to develop a fully automated, imaging protocol-independent system for pituitary adenoma segmentation from magnetic resonance imaging (MRI) scans that can work without user interaction, and to evaluate its accuracy and utility for clinical applications. We trained two independent artificial neural networks on MRI scans of 394 patients. The scans were acquired according to various imaging protocols over the course of 11 years on 1.5T and 3T MRI systems. The segmentation model assigned a class label to each input pixel (pituitary adenoma, internal carotid artery, normal pituitary gland, background). The slice segmentation model classified slices as clinically relevant (structures of interest in the slice) or irrelevant (anterior or posterior to the sella turcica). We used MRI data from another 99 patients to evaluate the performance of the model during training. We validated the model on a prospective cohort of 28 patients, achieving Dice coefficients of 0.910, 0.719, and 0.240 for the tumour, internal carotid artery, and normal gland labels, respectively. The slice selection model achieved 82.5% accuracy, 88.7% sensitivity, 76.7% specificity, and an AUC of 0.904. A human expert rated 71.4% of the segmentation results as accurate, 21.4% as slightly inaccurate, and 7.1% as coarsely inaccurate. Our model achieved good results, comparable with recent work by other authors, on the largest dataset to date and generalized well across various imaging protocols. We discuss future clinical applications and their considerations. Models and frameworks for clinical use have yet to be developed and evaluated.
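The slice-selection metrics quoted above (accuracy, sensitivity, specificity, AUC) can be reproduced from per-slice labels and prediction scores; a minimal scikit-learn sketch is shown below as a reading aid, with illustrative variable names and a hypothetical decision threshold of 0.5 (not taken from the original study).

```python
import numpy as np
from sklearn.metrics import roc_auc_score, confusion_matrix

def slice_selection_metrics(y_true: np.ndarray, y_score: np.ndarray, threshold: float = 0.5):
    """Accuracy, sensitivity, specificity, and AUC for binary slice relevance."""
    y_pred = (y_score >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    auc = roc_auc_score(y_true, y_score)
    return accuracy, sensitivity, specificity, auc
```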
Collapse
Affiliation(s)
- Martin Černý
- Department of Neurosurgery and Neurooncology, 1st Faculty of Medicine, Charles University, Central Military Hospital Prague, U Vojenské nemocnice 1200, 169 02, Praha 6, Czech Republic.
- 1st Faculty of Medicine, Charles University Prague, Kateřinská 1660/32, 121 08, Praha 2, Czech Republic.
| | - Jan Kybic
- Department of Cybernetics, Faculty of Electrical Engineering, Czech Technical University in Prague, Technická 2, 166 27, Praha 6, Czech Republic
| | - Martin Májovský
- Department of Neurosurgery and Neurooncology, 1st Faculty of Medicine, Charles University, Central Military Hospital Prague, U Vojenské nemocnice 1200, 169 02, Praha 6, Czech Republic
| | - Vojtěch Sedlák
- Department of Radiodiagnostics, Central Military Hospital Prague, U Vojenské nemocnice 1200, 169 02, Praha 6, Czech Republic
| | - Karin Pirgl
- Department of Neurosurgery and Neurooncology, 1st Faculty of Medicine, Charles University, Central Military Hospital Prague, U Vojenské nemocnice 1200, 169 02, Praha 6, Czech Republic
- 3rd Faculty of Medicine, Charles University Prague, Ruská 87, 100 00, Praha 10, Czech Republic
| | - Eva Misiorzová
- Department of Neurosurgery, Faculty of Medicine, University of Ostrava, University Hospital Ostrava, 17. listopadu 1790/5, 708 52, Ostrava-Poruba, Czech Republic
| | - Radim Lipina
- Department of Neurosurgery, Faculty of Medicine, University of Ostrava, University Hospital Ostrava, 17. listopadu 1790/5, 708 52, Ostrava-Poruba, Czech Republic
| | - David Netuka
- Department of Neurosurgery and Neurooncology, 1st Faculty of Medicine, Charles University, Central Military Hospital Prague, U Vojenské nemocnice 1200, 169 02, Praha 6, Czech Republic
| |
Collapse
|