1. Zhang Z, Zhang H, Zeng T, Yang G, Shi Z, Gao Z. Bridging multi-level gaps: Bidirectional reciprocal cycle framework for text-guided label-efficient segmentation in echocardiography. Med Image Anal 2025;102:103536. PMID: 40073581. DOI: 10.1016/j.media.2025.103536.
Abstract
Text-guided visual understanding is a potential solution for downstream task learning in echocardiography. It can reduce reliance on large labeled datasets and facilitate the learning of clinical tasks, because text can embed highly condensed clinical information into predictions for visual tasks. Contrastive language-image pretraining (CLIP)-based methods extract image-text features by constructing a contrastive pre-training process over sequences of matched texts and images, then adapt the pre-trained network parameters to improve downstream task performance with text guidance. However, these methods still face the challenge of a multi-level gap between image and text, stemming mainly from spatial-level, contextual-level, and domain-level gaps, which makes medical image-text pairs and dense prediction tasks difficult to handle. Therefore, we propose a bidirectional reciprocal cycle (BRC) framework to bridge these multi-level gaps. First, the BRC constructs pyramid reciprocal alignments of embedded global and local image-text feature representations, matching complex medical expertise with the corresponding phenomena. Second, the BRC enforces the forward inference to be consistent with the reverse mapping (i.e., text → feature is consistent with feature → text or feature → image), enforcing perception of the contextual relationship between input data and features. Third, the BRC can adapt to the specific downstream segmentation task, embedding complex text information to directly guide downstream tasks through a cross-modal attention mechanism. Compared with 22 existing methods, our BRC achieves state-of-the-art performance on segmentation tasks (DSC = 95.2%). Extensive experiments on 11,048 patients show that our method significantly improves accuracy and reduces reliance on labeled data (DSC increased from 81.5% to 86.6% with text assistance at a 1% labeled-data proportion).
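The CLIP-style contrastive pre-training this abstract builds on can be sketched as a symmetric InfoNCE loss over matched image/text embedding pairs. This is a minimal NumPy illustration of the generic CLIP objective, not the BRC framework itself; the batch size, dimensions, and temperature below are illustrative assumptions.

```python
import numpy as np

def clip_contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of matched image/text embeddings.

    img_emb, txt_emb: (N, D) arrays; row i of each is a matched pair.
    """
    # L2-normalize so the dot product is a cosine similarity
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature          # (N, N) similarity matrix
    labels = np.arange(len(logits))             # matched pairs lie on the diagonal

    def xent(l):
        # row-wise cross-entropy against the diagonal targets
        l = l - l.max(axis=1, keepdims=True)    # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    # average of the image->text and text->image directions
    return 0.5 * (xent(logits) + xent(logits.T))
```

Matched batches should score a lower loss than mismatched ones, which is what pre-training exploits.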
Affiliation(s)
- Zhenxuan Zhang
- School of Biomedical Engineering, Sun Yat-sen University, Shenzhen 518107, China; Bioengineering Department and Imperial-X, Imperial College London, W12 7SL London, UK
- Heye Zhang
- School of Biomedical Engineering, Sun Yat-sen University, Shenzhen 518107, China; Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Zhuhai 519080, China
- Tieyong Zeng
- Department of Mathematics, The Chinese University of Hong Kong, Shatin, Hong Kong
- Guang Yang
- Bioengineering Department and Imperial-X, Imperial College London, W12 7SL London, UK; National Heart and Lung Institute, Imperial College London, SW7 2AZ London, UK; Cardiovascular Research Centre, Royal Brompton Hospital, SW3 6NP London, UK; School of Biomedical Engineering & Imaging Sciences, King's College London, WC2R 2LS London, UK
- Zhenquan Shi
- School of Information Science and Technology, Nantong University, China.
- Zhifan Gao
- School of Biomedical Engineering, Sun Yat-sen University, Shenzhen 518107, China.
2. Zhang W, Yang T, Fan J, Wang H, Ji M, Zhang H, Miao J. U-shaped network combining dual-stream fusion mamba and redesigned multilayer perceptron for myocardial pathology segmentation. Med Phys 2025. PMID: 40247150. DOI: 10.1002/mp.17812.
Abstract
BACKGROUND Cardiac magnetic resonance imaging (CMR) provides critical pathological information, such as scars and edema, which are vital for diagnosing myocardial infarction (MI). However, due to the limited pathological information in single-sequence CMR images and the small size of pathological regions, automatic segmentation of myocardial pathology remains a significant challenge. PURPOSE In this paper, we propose a novel two-stage anatomical-pathological segmentation framework combining Kolmogorov-Arnold Networks (KAN) and Mamba, aiming to effectively segment myocardial pathology in multi-sequence CMR images. METHODS First, in the coarse segmentation stage, we employed a multiline parallel MambaUnet as the anatomical structure segmentation network to obtain shape prior information. This approach effectively addresses the class imbalance issue and aids in subsequent pathological segmentation. In the fine segmentation stage, we introduced a novel U-shaped segmentation network, KANMambaNet, which features a Dual-Stream Fusion Mamba module. This module enhances the network's ability to capture long-range dependencies while improving its capability to distinguish different pathological features in small regions. Additionally, we developed a Kolmogorov-Arnold Network-based multilayer perceptron (KAN MLP) module that utilizes learnable activation functions instead of fixed nonlinear functions. This design enhances the network's flexibility in handling various pathological features, enabling more accurate differentiation of the pathological characteristics at the boundary between edema and scar regions. Our method achieves competitive segmentation performance compared to state-of-the-art models, particularly in terms of the Dice coefficient. RESULTS We validated our model's performance on the MyoPS2020 dataset, achieving a Dice score of 0.8041 ± 0.0751 for myocardial edema and 0.9051 ± 0.0240 for myocardial scar.
Compared to the baseline model MambaUnet, our edema segmentation performance improved by 0.1420, and scar segmentation performance improved by 0.1081. CONCLUSIONS We developed an innovative two-stage anatomical-pathological segmentation framework that integrates KAN and Mamba, effectively segmenting myocardial pathology in multi-sequence CMR images. The experimental results demonstrate that our proposed method achieves superior segmentation performance compared to other state-of-the-art methods.
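The Dice score reported above is the standard overlap metric for segmentation. As a point of reference, a minimal NumPy implementation for binary masks (a generic sketch, not the paper's evaluation code):

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """Dice similarity coefficient between two binary masks:
    DSC = 2|A ∩ B| / (|A| + |B|), with eps guarding the empty-mask case."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)
```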
Affiliation(s)
- Wenjie Zhang
- School of Information Science and Engineering, Henan University of Technology, Zhengzhou, China
- Tiejun Yang
- School of Artificial Intelligence and Big Data, Henan University of Technology, Zhengzhou, China
- Key Laboratory of Grain Information Processing and Control (HAUT), Ministry of Education, Zhengzhou, China
- Henan Key Laboratory of Grain Photoelectric Detection and Control (HAUT), Zhengzhou, Henan, China
- Jiacheng Fan
- School of Information Science and Engineering, Henan University of Technology, Zhengzhou, China
- Heng Wang
- School of Information Science and Engineering, Henan University of Technology, Zhengzhou, China
- Mingzhu Ji
- School of Information Science and Engineering, Henan University of Technology, Zhengzhou, China
- Huiyao Zhang
- School of Information Science and Engineering, Henan University of Technology, Zhengzhou, China
- Jianyu Miao
- School of Artificial Intelligence and Big Data, Henan University of Technology, Zhengzhou, China
3. Li S, Li X, Wang P, Liu K, Wei B, Cong J. An enhanced visual state space model for myocardial pathology segmentation in multi-sequence cardiac MRI. Med Phys 2025. PMID: 40108817. DOI: 10.1002/mp.17761.
Abstract
BACKGROUND Myocardial pathology (scar and edema) segmentation plays a crucial role in the diagnosis, treatment, and prognosis of myocardial infarction (MI). However, the current mainstream models for myocardial pathology segmentation have the following limitations when faced with cardiac magnetic resonance (CMR) images containing multiple objects at widely varying scales: the long-range modeling ability of convolutional neural networks is insufficient, and the computational complexity of transformers is high, which makes myocardial pathology segmentation challenging. PURPOSE This study aims to develop a novel model that addresses the image characteristics and algorithmic challenges of the myocardial pathology segmentation task and improves the accuracy and efficiency of myocardial pathology segmentation. METHODS We developed a novel visual state space (VSS)-based deep neural network, MPS-Mamba. To accurately and adequately extract CMR image features, the encoder employs a dual-branch structure to extract global and local features of the image. The VSS branch overcomes the limitations of the current mainstream models by modeling long-range relationships with linear computational complexity, while the convolution-based branch provides complementary local information. Given the distinct properties of the two branches, we design a modular dual-branch fusion module to fuse them and enhance the feature representation of the dual encoder. To improve the modeling of objects of different scales in CMR images, a multi-scale feature fusion (MSF) module is designed to achieve effective integration and fine-grained expression of multi-scale information. To further incorporate anatomical knowledge into the segmentation results, a decoder with three decoding branches is designed to output segmentation results for scar, edema, and myocardium, respectively. In addition, multiple sets of constraint functions are used not only to improve the segmentation accuracy of myocardial pathology but also to effectively model the spatial relationship between myocardium, scar, and edema. RESULTS The proposed method was comprehensively evaluated on the MyoPS 2020 dataset. MPS-Mamba achieved an average Dice score of 0.717 ± 0.169 in myocardial scar segmentation, which is superior to current mainstream methods. MPS-Mamba also performed well in the edema segmentation task, with an average Dice score of 0.735 ± 0.073. The experimental results further demonstrate the effectiveness of MPS-Mamba in segmenting myocardial pathologies in multi-sequence CMR images, verifying its advantages in myocardial pathology segmentation tasks. CONCLUSIONS Given its effectiveness and superiority, MPS-Mamba is expected to become a useful myocardial pathology segmentation tool that can effectively assist clinical diagnosis.
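Visual state space models such as the VSS branch here build on a discretized linear state-space recurrence that can be evaluated sequentially in linear time. Stripped of selectivity and the hardware-aware scan that make Mamba practical, the core recurrence looks like the toy sketch below (an illustration of the generic SSM idea, not MPS-Mamba itself):

```python
import numpy as np

def ssm_scan(A, B, C, x):
    """Run the discrete linear state-space recurrence
        h_t = A @ h_{t-1} + B @ x_t,   y_t = C @ h_t
    over a sequence x of shape (T, d_in). Returns y of shape (T, d_out)."""
    T = x.shape[0]
    h = np.zeros(A.shape[0])   # hidden state, one vector carried across time
    ys = []
    for t in range(T):
        h = A @ h + B @ x[t]   # state update: linear in sequence length
        ys.append(C @ h)       # read-out
    return np.stack(ys)
```

With A = B = C = 1 the recurrence degenerates to a running sum, which makes its long-range (here: unbounded) memory easy to see.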
Affiliation(s)
- Shuning Li
- Center for Medical Artificial Intelligence, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Qingdao Academy of Chinese Medical Sciences, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Qingdao Key Laboratory of Artificial Intelligence Technology in Traditional Chinese Medicine, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Xiang Li
- Center for Medical Artificial Intelligence, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Qingdao Academy of Chinese Medical Sciences, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Qingdao Key Laboratory of Artificial Intelligence Technology in Traditional Chinese Medicine, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Pingping Wang
- Center for Medical Artificial Intelligence, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Qingdao Academy of Chinese Medical Sciences, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Qingdao Key Laboratory of Artificial Intelligence Technology in Traditional Chinese Medicine, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Kunmeng Liu
- Center for Medical Artificial Intelligence, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Qingdao Academy of Chinese Medical Sciences, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Qingdao Key Laboratory of Artificial Intelligence Technology in Traditional Chinese Medicine, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Benzheng Wei
- Center for Medical Artificial Intelligence, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Qingdao Academy of Chinese Medical Sciences, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Qingdao Key Laboratory of Artificial Intelligence Technology in Traditional Chinese Medicine, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Jinyu Cong
- Center for Medical Artificial Intelligence, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Qingdao Academy of Chinese Medical Sciences, Shandong University of Traditional Chinese Medicine, Qingdao, China
- Qingdao Key Laboratory of Artificial Intelligence Technology in Traditional Chinese Medicine, Shandong University of Traditional Chinese Medicine, Qingdao, China
4. Akbari S, Tabassian M, Pedrosa J, Queiros S, Papangelopoulou K, D'hooge J. BEAS-Net: A Shape-Prior-Based Deep Convolutional Neural Network for Robust Left Ventricular Segmentation in 2-D Echocardiography. IEEE Trans Ultrason Ferroelectr Freq Control 2024;71:1565-1576. PMID: 38913532. DOI: 10.1109/tuffc.2024.3418030.
Abstract
Left ventricle (LV) segmentation of 2-D echocardiography images is an essential step in the analysis of cardiac morphology and function and, more generally, in the diagnosis of cardiovascular diseases (CVD). Several deep learning (DL) algorithms have recently been proposed for automatic segmentation of the LV, showing significant performance improvements over traditional segmentation algorithms. However, unlike the traditional methods, prior information about the segmentation problem, e.g., anatomical shape information, is not usually incorporated when training DL algorithms. This can degrade the generalization performance of DL models on unseen images whose characteristics differ from those of the training images, e.g., low-quality testing images. In this study, a new shape-constrained deep convolutional neural network (CNN), called B-spline explicit active surface (BEAS)-Net, is introduced for automatic LV segmentation. BEAS-Net learns to associate the image features encoded by its convolutional layers with anatomical shape-prior information derived from the BEAS algorithm, generating physiologically meaningful segmentation contours when dealing with artifactual or low-quality images. The performance of the proposed network was evaluated using three different in vivo datasets and compared with a deep segmentation algorithm based on the U-Net model. Both networks yielded comparable results when tested on images of acceptable quality, but BEAS-Net outperformed the benchmark DL model on artifactual and low-quality images.
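One intuition for why a B-spline boundary representation acts as a shape prior: control-point averaging with the periodic uniform cubic B-spline mask (1/6, 4/6, 1/6) suppresses jagged, anatomically implausible wiggles on a closed contour. The snippet below is a generic illustration of that smoothing effect, not the BEAS or BEAS-Net algorithm:

```python
import numpy as np

def bspline_smooth_contour(points, iterations=1):
    """Smooth a closed 2-D contour by applying the periodic uniform cubic
    B-spline averaging mask (1/6, 4/6, 1/6) to its control points."""
    p = np.asarray(points, dtype=float)
    for _ in range(iterations):
        # circular neighbors via roll: the contour is closed
        p = (np.roll(p, 1, axis=0) + 4.0 * p + np.roll(p, -1, axis=0)) / 6.0
    return p
```

Because the mask weights sum to one and the neighborhood is circular, the contour centroid is preserved while high-frequency noise is damped.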
5. Zhang Z, Yu C, Zhang H, Gao Z. Embedding Tasks Into the Latent Space: Cross-Space Consistency for Multi-Dimensional Analysis in Echocardiography. IEEE Trans Med Imaging 2024;43:2215-2228. PMID: 38329865. DOI: 10.1109/tmi.2024.3362964.
Abstract
Multi-dimensional analysis in echocardiography has attracted attention due to its potential for clinical index quantification and computer-aided diagnosis. It can utilize various information sources to estimate multiple cardiac indices. However, it still faces the challenge of inter-task conflict, owing to regional confusion, global abnormalities, and time-accumulated errors. Task mapping methods have the potential to address inter-task conflict. However, they may overlook the inherent differences between tasks, especially for multi-level tasks (e.g., pixel-level, image-level, and sequence-level tasks). This may lead to inappropriate local and spurious task constraints. We propose cross-space consistency (CSC) to overcome this challenge. The CSC embeds multi-level tasks into the same level to reduce inherent task differences. This allows multi-level task features to be consistent in a unified latent space. The latent space extracts task-common features and constrains the distance between them, which constrains the task weight region that satisfies multiple task conditions. Extensive experiments compare the CSC with fifteen state-of-the-art echocardiographic analysis methods on five datasets (10,908 patients). The results show that the CSC can provide left ventricular (LV) segmentation (DSC = 0.932), keypoint detection (MAE = 3.06 mm), and keyframe identification (accuracy = 0.943). These results demonstrate that our method can provide a multi-dimensional analysis of cardiac function and is robust on large-scale datasets.
6. Freitas J, Gomes-Fonseca J, Tonelli AC, Correia-Pinto J, Fonseca JC, Queirós S. Automatic multi-view pose estimation in focused cardiac ultrasound. Med Image Anal 2024;94:103146. PMID: 38537416. DOI: 10.1016/j.media.2024.103146.
Abstract
Focused cardiac ultrasound (FoCUS) is a valuable point-of-care method for evaluating cardiovascular structures and function, but its scope is limited by equipment and operator's experience, resulting in primarily qualitative 2D exams. This study presents a novel framework to automatically estimate the 3D spatial relationship between standard FoCUS views. The proposed framework uses a multi-view U-Net-like fully convolutional neural network to regress line-based heatmaps representing the most likely areas of intersection between input images. The lines that best fit the regressed heatmaps are then extracted, and a system of nonlinear equations based on the intersection between view triplets is created and solved to determine the relative 3D pose between all input images. The feasibility and accuracy of the proposed pipeline were validated using a novel realistic in silico FoCUS dataset, demonstrating promising results. Interestingly, as shown in preliminary experiments, the estimation of the 2D images' relative poses enables the application of 3D image analysis methods and paves the way for 3D quantitative assessments in FoCUS examinations.
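Extracting "the lines that best fit the regressed heatmaps" can be done with a weighted total-least-squares fit, i.e., PCA on intensity-weighted pixel coordinates. The sketch below is a generic version of that step under the assumption of a single dominant line per heatmap, not the authors' implementation:

```python
import numpy as np

def fit_line_to_heatmap(heatmap):
    """Fit the dominant line in a 2-D heatmap by weighted PCA.

    Returns (point_on_line, unit_direction) in (x, y) pixel coordinates.
    Assumes the heatmap has at least one positive pixel."""
    ys, xs = np.nonzero(heatmap > 0)
    w = heatmap[ys, xs].astype(float)             # pixel intensities as weights
    pts = np.c_[xs, ys].astype(float)
    mean = (w[:, None] * pts).sum(0) / w.sum()    # weighted centroid
    d = pts - mean
    cov = (w[:, None] * d).T @ d / w.sum()        # weighted 2x2 covariance
    evals, evecs = np.linalg.eigh(cov)
    direction = evecs[:, np.argmax(evals)]        # principal axis = line direction
    return mean, direction
```

The total-least-squares formulation minimizes perpendicular distance to the line, which suits heatmaps where neither axis is a privileged "independent" variable.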
Affiliation(s)
- João Freitas
- Life and Health Sciences Research Institute (ICVS), School of Medicine, University of Minho, Braga, Portugal; ICVS/3B's - PT Government Associate Laboratory, Braga/Guimarães, Portugal; Algoritmi Center, School of Engineering, University of Minho, Guimarães, Portugal
- João Gomes-Fonseca
- Life and Health Sciences Research Institute (ICVS), School of Medicine, University of Minho, Braga, Portugal; ICVS/3B's - PT Government Associate Laboratory, Braga/Guimarães, Portugal
- Jorge Correia-Pinto
- Life and Health Sciences Research Institute (ICVS), School of Medicine, University of Minho, Braga, Portugal; ICVS/3B's - PT Government Associate Laboratory, Braga/Guimarães, Portugal; Department of Pediatric Surgery, Hospital de Braga, Braga, Portugal
- Jaime C Fonseca
- Algoritmi Center, School of Engineering, University of Minho, Guimarães, Portugal
- Sandro Queirós
- Life and Health Sciences Research Institute (ICVS), School of Medicine, University of Minho, Braga, Portugal; ICVS/3B's - PT Government Associate Laboratory, Braga/Guimarães, Portugal.
7. Li D, Peng Y, Sun J, Guo Y. A task-unified network with transformer and spatial-temporal convolution for left ventricular quantification. Sci Rep 2023;13:13529. PMID: 37598235. PMCID: PMC10439898. DOI: 10.1038/s41598-023-40841-y.
Abstract
Quantification of cardiac function is vital for diagnosing and treating cardiovascular diseases. Left ventricular function measurement is the most commonly used measure of cardiac function in clinical practice, and improving the accuracy of left ventricular quantitative assessment has long been a subject of medical research. Although considerable efforts have been made to measure the left ventricle (LV) automatically using deep learning methods, accurate quantification remains challenging because of the changing cardiac anatomy over the systolic-diastolic cycle. Besides, most methods use direct regression, which lacks visually interpretable analysis. In this work, a deep learning segmentation-and-regression task-unified network with transformer and spatial-temporal convolution is proposed to segment and quantify the LV simultaneously. The segmentation module leverages a U-Net-like 3D transformer model to predict the contours of three anatomical structures, while the regression module learns spatial-temporal representations from the original images and the reconstructed feature maps from the segmentation path to estimate the desired quantification metrics. Furthermore, we employ a joint task loss function to train the two modules. Our framework is evaluated on the MICCAI 2017 Left Ventricle Full Quantification Challenge dataset. The experimental results demonstrate the effectiveness of our framework, which achieves competitive cardiac quantification metrics and at the same time produces visualized segmentation results that are conducive to later analysis.
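A joint task loss of the kind described, combining a segmentation term and a regression term trained together, might look like the weighted sum below. The soft-Dice form, MSE term, and the alpha weight are illustrative assumptions, not the paper's exact loss:

```python
import numpy as np

def dice_loss(pred, target, eps=1e-7):
    """Soft Dice loss for a probability map vs. a binary mask."""
    inter = (pred * target).sum()
    return 1.0 - (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def joint_task_loss(seg_pred, seg_target, reg_pred, reg_target, alpha=0.5):
    """Weighted sum of a segmentation (soft Dice) term and a regression (MSE)
    term, so both modules are optimized jointly from one scalar objective."""
    mse = np.mean((reg_pred - reg_target) ** 2)
    return alpha * dice_loss(seg_pred, seg_target) + (1 - alpha) * mse
```

Sharing one scalar objective is what lets gradient updates trade off the two tasks instead of optimizing each in isolation.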
Affiliation(s)
- Dapeng Li
- Shandong University of Science and Technology, Qingdao, China
- Yanjun Peng
- Shandong University of Science and Technology, Qingdao, China.
- Shandong Province Key Laboratory of Wisdom Mining Information Technology, Qingdao, China.
- Jindong Sun
- Shandong University of Science and Technology, Qingdao, China
- Yanfei Guo
- Shandong University of Science and Technology, Qingdao, China
8. Tang S, Yu X, Cheang CF, Ji X, Yu HH, Choi IC. CLELNet: A continual learning network for esophageal lesion analysis on endoscopic images. Comput Methods Programs Biomed 2023;231:107399. PMID: 36780717. DOI: 10.1016/j.cmpb.2023.107399.
Abstract
BACKGROUND AND OBJECTIVE A deep learning-based intelligent diagnosis system can significantly reduce the burden of endoscopists in the daily analysis of esophageal lesions. Considering the need to add new tasks to the diagnosis system, a deep learning model that can be trained on a series of tasks incrementally using endoscopic images is essential for identifying the types and regions of esophageal lesions. METHOD In this paper, we proposed a continual learning-based esophageal lesion network (CLELNet), in which a convolutional autoencoder was designed to extract representation features of endoscopic images across different esophageal lesions. The proposed CLELNet consists of shared layers and task-specific layers. Shared layers are used to extract common features among different lesions, while task-specific layers complete different tasks. The first two tasks trained by the CLELNet are classification (task 1) and segmentation (task 2). We collected a dataset of esophageal endoscopic images from Macau Kiang Wu Hospital for training and testing the CLELNet. RESULTS The experimental results showed that the classification accuracy of task 1 was 95.96%, and the Intersection Over Union and the Dice Similarity Coefficient of task 2 were 65.66% and 78.08%, respectively. CONCLUSIONS The proposed CLELNet can realize task-incremental learning without forgetting previous tasks and thus become a useful computer-aided diagnosis system in esophageal lesion analysis.
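The structural idea behind shared layers plus task-specific layers is that adding a new task attaches a new head while the weights serving earlier tasks stay untouched, which is one ingredient of not forgetting. The toy sketch below uses random linear layers as a stand-in for the convolutional autoencoder (class and layer shapes are hypothetical, not CLELNet's architecture):

```python
import numpy as np

class SharedBackboneMultiTask:
    """Toy shared-layers + task-specific-heads layout: new tasks add a head
    without modifying the shared backbone or previously added heads."""

    def __init__(self, d_in, d_shared, seed=0):
        rng = np.random.default_rng(seed)
        self.W_shared = rng.normal(size=(d_in, d_shared)) * 0.1  # shared layers
        self.heads = {}                                          # task-specific layers

    def add_task(self, name, d_out, seed=1):
        rng = np.random.default_rng(seed)
        self.heads[name] = rng.normal(size=(self.W_shared.shape[1], d_out)) * 0.1

    def forward(self, x, task):
        h = np.tanh(x @ self.W_shared)   # common feature extraction
        return h @ self.heads[task]      # task-specific output
```

In a real continual-learning setup the shared layers are also trained, so extra machinery (regularization, rehearsal, etc.) is needed to prevent forgetting there; this sketch only shows the head isolation.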
Affiliation(s)
- Suigu Tang
- Faculty of Innovation Engineering-School of Computer Science and Engineering, Macau University of Science and Technology, Avenida Wai Long, Taipa, Macau SAR
- Xiaoyuan Yu
- Faculty of Innovation Engineering-School of Computer Science and Engineering, Macau University of Science and Technology, Avenida Wai Long, Taipa, Macau SAR
- Chak Fong Cheang
- Faculty of Innovation Engineering-School of Computer Science and Engineering, Macau University of Science and Technology, Avenida Wai Long, Taipa, Macau SAR.
- Xiaoyu Ji
- Faculty of Innovation Engineering-School of Computer Science and Engineering, Macau University of Science and Technology, Avenida Wai Long, Taipa, Macau SAR
- Hon Ho Yu
- Kiang Wu Hospital, Rua de Coelho do Amaral, Macau SAR
- I Cheong Choi
- Kiang Wu Hospital, Rua de Coelho do Amaral, Macau SAR
9. Shoaib MA, Chuah JH, Ali R, Dhanalakshmi S, Hum YC, Khalil A, Lai KW. Fully Automatic Left Ventricle Segmentation Using Bilateral Lightweight Deep Neural Network. Life (Basel) 2023;13:124. PMID: 36676073. PMCID: PMC9864753. DOI: 10.3390/life13010124.
Abstract
The segmentation of the left ventricle (LV) is one of the fundamental procedures that must be performed to obtain quantitative measures of the heart, such as its volume, area, and ejection fraction. In clinical practice, the delineation of the LV is still often conducted semi-automatically, leaving it open to operator subjectivity. Automatic LV segmentation from echocardiography images is a challenging task due to poorly defined boundaries and operator dependency. Recent research has demonstrated that deep learning can perform the segmentation process automatically. However, well-known state-of-the-art segmentation models still fall short in terms of accuracy and speed. This study aims to develop a single-stage lightweight segmentation model that precisely and rapidly segments the LV from 2D echocardiography images. In this research, a backbone network is used to acquire both low-level and high-level features. Two parallel blocks, known as the spatial feature unit and the channel feature unit, are employed for the enhancement and improvement of these features. The refined features are merged by an integrated unit to segment the LV. The performance of the model and the time taken to segment the LV are compared to other established segmentation models: DeepLab, FCN, and Mask R-CNN. The model achieved the highest values of the dice similarity index (0.9446), intersection over union (0.8445), and accuracy (0.9742). The evaluation metrics and processing time demonstrate that the proposed model not only provides superior quantitative results but also trains and segments the LV in less time, indicating its improved performance over competing segmentation models.
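The three reported metrics (Dice, IoU, accuracy) all derive from overlap counts between the predicted and reference masks, and Dice and IoU are linked by the identity IoU = DSC / (2 - DSC). A minimal sketch for binary masks (generic, not the paper's evaluation code):

```python
import numpy as np

def mask_metrics(pred, target):
    """Dice (DSC), IoU, and pixel accuracy for two binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    tp = np.logical_and(pred, target).sum()      # overlapping foreground
    union = np.logical_or(pred, target).sum()
    dsc = 2 * tp / (pred.sum() + target.sum()) if union else 1.0
    iou = tp / union if union else 1.0
    acc = (pred == target).mean()                # all pixels, both classes
    return dsc, iou, acc
```

The identity means Dice is always the more forgiving of the two overlap scores (DSC >= IoU), which is worth remembering when comparing papers that report different ones.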
Affiliation(s)
- Muhammad Ali Shoaib
- Department of Electrical Engineering, Faculty of Engineering, Universiti Malaya, Kuala Lumpur 50603, Malaysia
- Faculty of Information and Communication Technology, BUITEMS, Quetta 87300, Pakistan
- Joon Huang Chuah
- Department of Electrical Engineering, Faculty of Engineering, Universiti Malaya, Kuala Lumpur 50603, Malaysia
- Raza Ali
- Department of Electrical Engineering, Faculty of Engineering, Universiti Malaya, Kuala Lumpur 50603, Malaysia
- Faculty of Information and Communication Technology, BUITEMS, Quetta 87300, Pakistan
- Samiappan Dhanalakshmi
- Department of Electronics and Communication Engineering, SRM Institute of Science and Technology, Kattankulathur 603203, India
- Yan Chai Hum
- Department of Mechatronics and Biomedical Engineering (DMBE), Lee Kong Chian Faculty of Engineering and Science (LKC FES), Universiti Tunku Abdul Rahman (UTAR), Jalan Sungai Long, Bandar Sungai Long, Cheras, Kajang 43000, Malaysia
- Azira Khalil
- Faculty of Science and Technology, Universiti Sains Islam Malaysia (USIM), Nilai 71800, Malaysia
- Khin Wee Lai
- Department of Biomedical Engineering, Faculty of Engineering, Universiti Malaya, Kuala Lumpur 50603, Malaysia
10. Painchaud N, Duchateau N, Bernard O, Jodoin PM. Echocardiography Segmentation With Enforced Temporal Consistency. IEEE Trans Med Imaging 2022;41:2867-2878. PMID: 35533176. DOI: 10.1109/tmi.2022.3173669.
Abstract
Convolutional neural networks (CNN) have demonstrated their ability to segment 2D cardiac ultrasound images. However, despite recent successes according to which the intra-observer variability on end-diastole and end-systole images has been reached, CNNs still struggle to leverage temporal information to provide accurate and temporally consistent segmentation maps across the whole cycle. Such consistency is required to accurately describe the cardiac function, a necessary step in diagnosing many cardiovascular diseases. In this paper, we propose a framework to learn the 2D+time apical long-axis cardiac shape such that the segmented sequences can benefit from temporal and anatomical consistency constraints. Our method is a post-processing step that takes as input segmented echocardiographic sequences produced by any state-of-the-art method and processes them in two steps to (i) identify spatio-temporal inconsistencies according to the overall dynamics of the cardiac sequence and (ii) correct the inconsistencies. The identification and correction of cardiac inconsistencies rely on a constrained autoencoder trained to learn a physiologically interpretable embedding of cardiac shapes, where we can both detect and fix anomalies. We tested our framework on 98 full-cycle sequences from the CAMUS dataset, which are available alongside this paper. Our temporal regularization method not only improves the accuracy of the segmentation across the whole sequences, but also enforces temporal and anatomical consistency.
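The paper's autoencoder-based detection of spatio-temporal inconsistencies can be caricatured by a much simpler trend check: flag frames whose segmented area jumps away from a moving-median trend over the cycle. The window and threshold below are arbitrary illustrative choices, and this toy stands in for, rather than reproduces, the learned-embedding approach:

```python
import numpy as np

def flag_temporal_inconsistencies(areas, window=5, thresh=5.0):
    """Flag frames whose segmented area deviates from a moving-median trend
    by more than `thresh` (in area units): a crude temporal consistency check."""
    areas = np.asarray(areas, dtype=float)
    half = window // 2
    padded = np.pad(areas, half, mode='edge')        # replicate edges at the cycle ends
    trend = np.array([np.median(padded[i:i + window])
                      for i in range(areas.size)])   # robust per-frame trend
    return np.abs(areas - trend) > thresh
```

The median trend is robust to a single corrupted frame, so an isolated segmentation failure shows up as a large residual while the smooth cardiac-cycle variation does not.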
11. RE-3DLVNet: Refined estimation of the left ventricle volume via interactive 3D segmentation and reinforced quantification. Knowl Based Syst 2022. DOI: 10.1016/j.knosys.2022.109212.
12. Zhao C, Chen W, Qin J, Yang P, Xiang Z, Frangi AF, Chen M, Fan S, Yu W, Chen X, Xia B, Wang T, Lei B. IFT-Net: Interactive Fusion Transformer Network for Quantitative Analysis of Pediatric Echocardiography. Med Image Anal 2022;82:102648. DOI: 10.1016/j.media.2022.102648.
|
13
|
Sun J, Wu Z, Yu Z, Chen H, Du C, Xu L, Zhong J, Feng J, Coatrieux G, Coatrieux JL, Chen Y. Automatic video analysis framework for exposure region recognition in X-ray imaging automation. IEEE J Biomed Health Inform 2022; 26:4359-4370. [PMID: 35503854] [DOI: 10.1109/jbhi.2022.3172369]
Abstract
Deep learning-based automatic recognition of the scanning or exposure region in medical imaging automation is a promising new technique that can decrease the heavy workload of radiographers, optimize the imaging workflow and improve image quality. However, there is little related research and practice in X-ray imaging. In this paper, we focus on two key problems in X-ray imaging automation: automatic recognition of the exposure moment and of the exposure region. We propose an automatic video analysis framework based on a hybrid model that approaches real-time performance. The framework consists of three interdependent components: Body Structure Detection, Motion State Tracing, and Body Modeling. Body Structure Detection disassembles the patient representation to obtain the corresponding body keypoints and body bounding boxes (Bboxes); combining and analyzing these two types of body structure representations yields rich spatial location information about the patient's body structure. Motion State Tracing analyzes the motion state of the exposure region to recognize the appropriate exposure moment. The exposure region is calculated by Body Modeling when the exposure moment occurs. A large-scale dataset of X-ray examination scenes is built to validate the performance of the proposed method. Extensive experiments demonstrate the superiority of the proposed method in automatically recognizing the exposure moment and exposure region. This paradigm provides the first method that can automatically and accurately recognize the exposure region in X-ray imaging without the help of a radiographer.
|
14
|
Cui X, Cao Y, Liu Z, Sui X, Mi J, Zhang Y, Cui L, Li S. TRSA-Net: Task Relation Spatial co-Attention for Joint Segmentation, Quantification and Uncertainty Estimation on Paired 2D Echocardiography. IEEE J Biomed Health Inform 2022; 26:4067-4078. [PMID: 35503848] [DOI: 10.1109/jbhi.2022.3171985]
Abstract
The clinical workflow of cardiac assessment on 2D echocardiography requires both accurate segmentation and quantification of the left ventricle (LV) from paired apical 4-chamber and 2-chamber views. Moreover, uncertainty estimation is important for clinically understanding the performance of a model. However, current research on 2D echocardiography ignores this vital task when performing joint segmentation and quantification, motivating the need for a unified optimization method. In this paper, we propose a multitask model with Task Relation Spatial co-Attention (referred to as TRSA-Net) for joint segmentation, quantification, and uncertainty estimation on paired 2D echo. TRSA-Net achieves multitask joint learning by exploring the spatial correlation between tasks. The task relation spatial co-attention learns the spatial mapping among task-specific features via non-local and co-excitation operations, which forcibly joins the embedded spatial information in segmentation and quantification. Boundary-aware Structure Consistency (BSC) and a Joint Indices Constraint (JIC) are integrated into the multitask learning objective to guide the learning of the segmentation and quantification paths. The BSC promotes structural similarity of predictions, and the JIC explores the internal relationship between three quantitative indices. We validate the efficacy of our TRSA-Net on the public CAMUS dataset. Extensive comparison and ablation experiments show that our approach achieves competitive segmentation performance and highly accurate quantification results.
|
15
|
Cui X, Zhang P, Li Y, Liu Z, Xiao X, Zhang Y, Sun L, Cui L, Yang G, Li S. MCAL: An Anatomical Knowledge Learning Model for Myocardial Segmentation in 2-D Echocardiography. IEEE Trans Ultrason Ferroelectr Freq Control 2022; 69:1277-1287. [PMID: 35167446] [DOI: 10.1109/tuffc.2022.3151647]
Abstract
Segmentation of the left ventricular (LV) myocardium in 2-D echocardiography is essential for clinical decision making, especially in geometry measurement and index computation. However, segmenting the myocardium is time-consuming and challenging due to the fuzzy boundaries caused by low image quality. Ground-truth labels are typically employed as pixel-level class associations or shape regulation in segmentation, which provides only limited feature enhancement for 2-D echocardiography. We propose a training strategy named multiconstrained aggregate learning (MCAL), which leverages anatomical knowledge learned from ground-truth labels to infer segmented parts and discriminate boundary pixels. The new framework encourages the model to focus on features in accordance with the learned anatomical representations, and the training objectives incorporate a boundary distance transform weight (BDTW) that enforces a higher weight on the boundary region, helping to improve segmentation accuracy. The proposed method is built as an end-to-end framework with a top-down, bottom-up architecture with skip convolution fusion blocks and is evaluated on two datasets (our dataset and the public CAMUS dataset). The comparison study shows that the proposed network outperforms the other segmentation baselines, indicating that our method is beneficial for boundary-pixel discrimination in segmentation.
|
16
|
Shen L, Yu L, Zhao W, Pauly J, Xing L. Novel-view X-ray projection synthesis through geometry-integrated deep learning. Med Image Anal 2022; 77:102372. [PMID: 35131701] [PMCID: PMC8916089] [DOI: 10.1016/j.media.2022.102372]
Abstract
X-ray imaging is a widely used approach to view the internal structure of a subject for clinical diagnosis, image-guided interventions and decision-making. X-ray projections acquired at different view angles provide complementary information about the patient's anatomy and are required for stereoscopic or volumetric imaging of the subject. In reality, obtaining multiple-view projections inevitably increases radiation dose and complicates the clinical workflow. Here we investigate a strategy of obtaining the X-ray projection image at a novel view angle from a given projection image at a specific view angle to alleviate the need for actual projection measurements. Specifically, a Deep Learning-based Geometry-Integrated Projection Synthesis (DL-GIPS) framework is proposed for the generation of novel-view X-ray projections. The proposed deep learning model extracts geometry and texture features from a source-view projection, and then applies a geometry transformation to the geometry features to accommodate the change of view angle. At the final stage, the X-ray projection in the target view is synthesized from the transformed geometry and the shared texture features via an image generator. The feasibility and potential impact of the proposed DL-GIPS model are demonstrated using lung imaging cases. The proposed strategy can be generalized to the synthesis of multiple projections from multiple input views and potentially provides a new paradigm for various stereoscopic and volumetric imaging tasks with substantially reduced data-acquisition effort.
Affiliation(s)
- Liyue Shen: Department of Electrical Engineering, Stanford University, Stanford, CA, USA
- Lequan Yu: Department of Radiation Oncology, Stanford University, Stanford, CA, USA
- Wei Zhao: Department of Radiation Oncology, Stanford University, Stanford, CA, USA
- John Pauly: Department of Electrical Engineering, Stanford University, Stanford, CA, USA
- Lei Xing: Department of Electrical Engineering, Stanford University, Stanford, CA, USA; Department of Radiation Oncology, Stanford University, Stanford, CA, USA
|
17
|
Wang KN, Yang X, Miao J, Li L, Yao J, Zhou P, Xue W, Zhou GQ, Zhuang X, Ni D. AWSnet: An Auto-weighted Supervision Attention Network for Myocardial Scar and Edema Segmentation in Multi-sequence Cardiac Magnetic Resonance Images. Med Image Anal 2022; 77:102362. [DOI: 10.1016/j.media.2022.102362]
|
18
|
X-CTRSNet: 3D cervical vertebra CT reconstruction and segmentation directly from 2D X-ray images. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2021.107680]
|
19
|
Convolutional squeeze-and-excitation network for ECG arrhythmia detection. Artif Intell Med 2021; 121:102181. [PMID: 34763803] [DOI: 10.1016/j.artmed.2021.102181]
Abstract
Automatic detection of arrhythmia from the electrocardiogram (ECG) is of great significance for the prevention and treatment of cardiovascular diseases. In convolutional neural networks, the ECG signal is converted into multiple feature channels with equal weights through the convolution operation. Multiple feature channels provide richer and more comprehensive information, but also contain redundant information that can affect the diagnosis of arrhythmia; feature channels that contain arrhythmia information should therefore receive more attention and larger weights. In this paper, we introduce the Squeeze-and-Excitation (SE) block for the first time for the automatic detection of multiple types of arrhythmias from ECG. Our algorithm combines the residual convolutional module and the SE block to extract features from the original ECG signal. The SE block adaptively enhances discriminative features and suppresses noise by explicitly modeling the interdependence between channels, allowing it to adaptively integrate information from different feature channels of the ECG. A one-dimensional convolution over the time dimension extracts temporal information, and the shortcut connection of the SE-residual convolutional module makes the network easier to optimize. Thanks to the network's powerful feature extraction, which effectively captures discriminative arrhythmia features across multiple feature channels, our framework requires no extra data preprocessing such as the denoising used in other methods; this improves efficiency and preserves the collected biological information without loss. Experiments were conducted on the 12-lead ECG dataset of the China Physiological Signal Challenge (CPSC) 2018 and the dataset of the PhysioNet/Computing in Cardiology (CinC) Challenge 2017. The results show that our model achieves strong performance and has great clinical potential.
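The squeeze-excite-scale recipe the abstract describes can be sketched in plain Python. This is an illustrative toy, not the paper's implementation: the weight matrices `w1`/`w2` stand in for the learned fully-connected layers, and channels are plain lists of floats.

```python
import math

def se_reweight(x, w1, w2):
    """Toy Squeeze-and-Excitation reweighting for a 1-D multichannel signal.

    x  : list of C channels, each a list of T floats
    w1 : C x C/r weight matrix of the squeeze FC layer (hypothetical)
    w2 : C/r x C weight matrix of the excitation FC layer (hypothetical)
    """
    # Squeeze: global average pooling per channel
    z = [sum(ch) / len(ch) for ch in x]
    # Excitation: FC -> ReLU -> FC -> sigmoid
    h = [max(0.0, sum(z[i] * w1[i][j] for i in range(len(z))))
         for j in range(len(w1[0]))]
    s = [1.0 / (1.0 + math.exp(-sum(h[j] * w2[j][k] for j in range(len(h)))))
         for k in range(len(w2[0]))]
    # Scale: reweight each channel by its excitation score
    return [[s[c] * v for v in x[c]] for c in range(len(x))]
```

With identity weights, a channel whose average activation is higher receives a larger sigmoid gate, which is the channel-attention effect the SE block exploits.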
|
20
|
Lyu T, Yang G, Zhao X, Shu H, Luo L, Chen D, Xiong J, Yang J, Li S, Coatrieux JL, Chen Y. Dissected aorta segmentation using convolutional neural networks. Comput Methods Programs Biomed 2021; 211:106417. [PMID: 34587564] [DOI: 10.1016/j.cmpb.2021.106417]
Abstract
BACKGROUND AND OBJECTIVE: Aortic dissection is a severe cardiovascular pathology in which an injury of the intimal layer of the aorta allows blood to flow into the aortic wall, forcing the wall layers apart. The condition has a high mortality rate and requires an in-depth understanding of the 3-D morphology of the dissected aorta to plan the right treatment. An accurate automatic segmentation algorithm is therefore needed. METHOD: In this paper, we propose a deep-learning-based algorithm to segment the dissected aorta on computed tomography angiography (CTA) images. The algorithm consists of two steps. First, a 3-D convolutional neural network (CNN) divides the 3-D volume into two anatomical portions. Second, two 2-D CNNs based on the pyramid scene parsing network (PSPNet) segment each portion separately. An edge extraction branch is added to the 2-D model to achieve higher segmentation accuracy in the intimal flap area. RESULTS: The experiments and comparisons show that the proposed solution performs well, with an average Dice index over 92%. The combination of 3-D and 2-D models improves aorta segmentation accuracy compared to 3-D-only models and segmentation robustness compared to 2-D-only models. The edge extraction branch improves the Dice index near aorta boundaries from 73.41% to 81.39%. CONCLUSIONS: The proposed algorithm captures the aorta structure well while avoiding false positives on the intimal flaps.
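The Dice index quoted throughout these abstracts measures overlap between a predicted and a ground-truth mask. A minimal sketch on toy flat binary masks (not the paper's evaluation pipeline):

```python
# Dice similarity coefficient (DSC) between two binary masks,
# given as flat lists of 0/1 values. DSC = 2|A ∩ B| / (|A| + |B|).
def dice(mask_a, mask_b):
    assert len(mask_a) == len(mask_b)
    inter = sum(a * b for a, b in zip(mask_a, mask_b))
    total = sum(mask_a) + sum(mask_b)
    if total == 0:          # both masks empty: define DSC as 1.0
        return 1.0
    return 2.0 * inter / total
```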
Affiliation(s)
- Tianling Lyu: Laboratory of Imaging Science and Technology, Southeast University, Nanjing, China
- Guanyu Yang: Laboratory of Imaging Science and Technology, Southeast University, Nanjing, China
- Xingran Zhao: Laboratory of Imaging Science and Technology, Southeast University, Nanjing, China
- Huazhong Shu: Laboratory of Imaging Science and Technology, Southeast University, Nanjing, China
- Limin Luo: Laboratory of Imaging Science and Technology, Southeast University, Nanjing, China
- Duanduan Chen: Department of Biomedical Engineering, Beijing Institute of Technology, Beijing, China
- Jian Yang: School of Optoelectronics, Beijing Institute of Technology, Beijing, China
- Shuo Li: Digital Imaging Group of London, London, Canada
- Yang Chen: Laboratory of Imaging Science and Technology, Southeast University, Nanjing, China; School of Cyber Science and Engineering, Southeast University, Nanjing, China; Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, Nanjing, China
|
21
|
de Siqueira VS, Borges MM, Furtado RG, Dourado CN, da Costa RM. Artificial intelligence applied to support medical decisions for the automatic analysis of echocardiogram images: A systematic review. Artif Intell Med 2021; 120:102165. [PMID: 34629153] [DOI: 10.1016/j.artmed.2021.102165]
Abstract
The echocardiogram is widely used in heart disease diagnosis. However, its analysis is largely dependent on the physician's experience. In this regard, artificial intelligence has become an essential technology to assist physicians. This study is a Systematic Literature Review (SLR) of primary state-of-the-art studies that used Artificial Intelligence (AI) techniques to automate echocardiogram analyses. Searches on the leading scientific article indexing platforms using a search string returned approximately 1400 articles. After applying the inclusion and exclusion criteria, 118 articles were selected for the detailed SLR. This SLR presents a thorough investigation of AI applied to support medical decisions for the main types of echocardiogram (transthoracic, transesophageal, Doppler, stress, and fetal). Data extraction indicated that the primary research interest of the studies comprised four groups: 1) improvement of image quality; 2) identification of the cardiac window vision plane; 3) quantification and analysis of cardiac functions; and 4) detection and classification of cardiac diseases. The articles were categorized and grouped to show the main contributions of the literature to each type of echocardiogram (ECHO). The results indicate that Deep Learning (DL) methods presented the best results for the detection and segmentation of the heart walls, the right and left atria and ventricles, and the classification of heart diseases using images/videos obtained by echocardiography. Models using Convolutional Neural Networks (CNNs) and their variations showed the best results across all groups. The evidence produced by the tabulated studies indicates that DL has contributed significantly to advances in automated echocardiogram analysis. Although several solutions have been presented for automated ECHO analysis, this area of research still has great potential for further studies to improve the accuracy of results already known in the literature.
Affiliation(s)
- Vilson Soares de Siqueira: Federal Institute of Tocantins, Av. Bernado Sayão, S/N, Santa Maria, Colinas do Tocantins, TO, Brazil; Federal University of Goias, Alameda Palmeiras, Quadra D, Câmpus Samambaia, Goiânia, GO, Brazil
- Moisés Marcos Borges: Diagnostic Imaging Center - CDI, Av. Portugal, 1155, St. Marista, Goiânia, GO, Brazil
- Rogério Gomes Furtado: Diagnostic Imaging Center - CDI, Av. Portugal, 1155, St. Marista, Goiânia, GO, Brazil
- Colandy Nunes Dourado: Diagnostic Imaging Center - CDI, Av. Portugal, 1155, St. Marista, Goiânia, GO, Brazil. http://www.cdigoias.com.br
- Ronaldo Martins da Costa: Federal University of Goias, Alameda Palmeiras, Quadra D, Câmpus Samambaia, Goiânia, GO, Brazil
|
22
|
Guo S, Xu L, Feng C, Xiong H, Gao Z, Zhang H. Multi-level semantic adaptation for few-shot segmentation on cardiac image sequences. Med Image Anal 2021; 73:102170. [PMID: 34380105] [DOI: 10.1016/j.media.2021.102170]
Abstract
Obtaining manual labels for cardiac image sequences is time-consuming and labor-intensive. Few-shot segmentation can utilize limited labels to learn new tasks. However, it suffers from two challenges: spatial-temporal distribution bias and long-term information bias. These challenges derive from the impact of the time dimension on cardiac image sequences, resulting in serious over-adaptation. In this paper, we propose multi-level semantic adaptation (MSA) for few-shot segmentation on cardiac image sequences. The MSA addresses the two biases by exploring domain adaptation and weight adaptation on the semantic features at multiple levels: sequence-level, frame-level, and pixel-level. First, the MSA proposes a dual-level feature adjustment for domain adaptation in the spatial and temporal directions. This adjustment explicitly aligns the frame-level and sequence-level features to improve model adaptation across diverse modalities. Second, the MSA explores a hierarchical attention metric for weight adaptation of the frame-level and pixel-level features. This metric focuses on similar frames and the target region to promote model discrimination of border features. Extensive experiments demonstrate that our MSA is effective for few-shot segmentation on cardiac image sequences across three modalities, i.e., MR, CT, and echo (the average Dice is 0.9243), and is superior to ten state-of-the-art methods.
Affiliation(s)
- Saidi Guo: School of Biomedical Engineering, Sun Yat-sen University, China
- Lin Xu: General Hospital of the Southern Theatre Command, PLA, Guangdong, China; The First School of Clinical Medicine, Southern Medical University, Guangdong, China
- Cheng Feng: Department of Ultrasound, The Third People's Hospital of Shenzhen, Guangdong, China
- Huahua Xiong: Department of Ultrasound, The First Affiliated Hospital of Shenzhen University, Shenzhen Second People's Hospital, Guangdong, China
- Zhifan Gao: School of Biomedical Engineering, Sun Yat-sen University, China
- Heye Zhang: School of Biomedical Engineering, Sun Yat-sen University, China
|
23
|
He Y, Li T, Ge R, Yang J, Kong Y, Zhu J, Shu H, Yang G, Li S. Few-shot Learning for Deformable Medical Image Registration with Perception-Correspondence Decoupling and Reverse Teaching. IEEE J Biomed Health Inform 2021; 26:1177-1187. [PMID: 34232899] [DOI: 10.1109/jbhi.2021.3095409]
Abstract
Deformable medical image registration estimates the deformation that aligns the regions of interest (ROIs) of two images to the same spatial coordinate system. However, recent unsupervised registration models only have correspondence ability without perception, causing misalignment on blurred anatomies and distortion of task-unconcerned backgrounds. Label-constrained (LC) registration models embed perception ability via labels, but the lack of texture constraints in labels and the expensive labeling costs cause distortion inside ROIs and overfitted perception. We propose the first few-shot deformable medical image registration framework, Perception-Correspondence Registration (PC-Reg), which embeds perception ability into registration models with only a few labels, thus greatly improving registration accuracy and reducing distortion. 1) We propose Perception-Correspondence Decoupling, which decouples the perception and correspondence actions of registration into two CNNs. Independent optimization and feature representations are therefore available, avoiding interference with the correspondence due to the lack of texture constraints. 2) For few-shot learning, we propose Reverse Teaching, which aligns labeled and unlabeled images to each other to provide supervision information for the structure and style knowledge in unlabeled images, thus generating additional training data. These data reversely teach our perception CNN more style and structure knowledge, improving its generalization ability. Our experiments on three datasets with only five labels demonstrate that PC-Reg has competitive registration accuracy and an effective distortion-reducing ability. Compared with LC-VoxelMorph (λ=1), we achieve Reg-DSC improvements of 12.5%, 6.3% and 1.0% on the three datasets, revealing our framework's great potential for clinical application.
|
24
|
Dual attention enhancement feature fusion network for segmentation and quantitative analysis of paediatric echocardiography. Med Image Anal 2021; 71:102042. [PMID: 33784600] [DOI: 10.1016/j.media.2021.102042]
Abstract
Paediatric echocardiography is a standard method for screening congenital heart disease (CHD). The segmentation of paediatric echocardiography is essential for the subsequent extraction of clinical parameters and interventional planning. However, it remains a challenging task due to (1) considerable variation of key anatomic structures, (2) poor lateral resolution affecting accurate boundary definition, and (3) speckle noise and artefacts in echocardiographic images. In this paper, we propose a novel deep network to address these challenges comprehensively. We first present a dual-path feature extraction module (DP-FEM) to extract rich features via a channel attention mechanism. A high- and low-level feature fusion module (HL-FFM) is devised based on spatial attention, which selectively fuses rich semantic information from high-level features with spatial cues from low-level features. In addition, a hybrid loss is designed to deal with pixel-level misalignment and boundary ambiguities. Based on the segmentation results, we derive key clinical parameters for diagnosis and treatment planning. We extensively evaluate the proposed method on 4,485 two-dimensional (2D) paediatric echocardiograms from 127 echocardiographic videos. The proposed method consistently achieves better segmentation performance than other state-of-the-art methods, which demonstrates its feasibility for automatic segmentation and quantitative analysis of paediatric echocardiography. Our code is publicly available at https://github.com/end-of-the-century/Cardiac.
|
25
|
Xu C, Zhang D, Chong J, Chen B, Li S. Synthesis of gadolinium-enhanced liver tumors on nonenhanced liver MR images using pixel-level graph reinforcement learning. Med Image Anal 2021; 69:101976. [PMID: 33535110] [DOI: 10.1016/j.media.2021.101976]
Abstract
If successful, the synthesis of gadolinium (Gd)-enhanced liver tumors on nonenhanced liver MR images will be critical for liver tumor diagnosis and treatment. This synthesis offers a safe, efficient, and low-cost clinical alternative that eliminates the use of contrast agents in the current clinical workflow and would significantly benefit global healthcare systems. In this study, we propose a novel pixel-level graph reinforcement learning method (Pix-GRL). This method directly takes regular nonenhanced liver images as input and outputs AI-enhanced liver tumor images, making them comparable to traditional Gd-enhanced liver tumor images. In Pix-GRL, each pixel has a pixel-level agent; the agent explores the pixel's features and outputs a pixel-level action to iteratively change the pixel value, ultimately generating AI-enhanced liver tumor images. Most importantly, Pix-GRL creatively embeds a graph convolution to represent all the pixel-level agents. The graph convolution is deployed in the agent both for feature exploration, improving effectiveness through the aggregation of long-range contextual features, and for outputting the action, enhancing efficiency through shared parameter training between agents. Moreover, Pix-GRL uses a novel reward to measure each pixel-level action, significantly improving performance by considering the improvement each action brings to a pixel's own future state as well as to those of neighboring pixels. Pix-GRL upgrades existing medical DRL methods from a single agent to multiple pixel-level agents, becoming the first DRL method for medical image synthesis. Comprehensive experiments on three types of liver tumor datasets (benign, cancerous, and healthy controls) with 325 patients (24,375 images) show that our novel Pix-GRL method outperforms existing medical image synthesis methods. It achieved an SSIM of 0.85 ± 0.06 and a Pearson correlation coefficient of 0.92 in terms of tumor size. These results demonstrate the potential to develop a successful clinical alternative to Gd-enhanced liver MR imaging.
Affiliation(s)
- Chenchu Xu: School of Computer Science and Technology, Anhui University, Hefei, China; Department of Medical Imaging, Western University, London, ON, Canada
- Dong Zhang: Department of Medical Imaging, Western University, London, ON, Canada
- Jaron Chong: Department of Medical Imaging, Western University, London, ON, Canada
- Bo Chen: School of Health Science, Western University, London, ON, Canada
- Shuo Li: Department of Medical Imaging, Western University, London, ON, Canada
|
26
|
Leclerc S, Smistad E, Ostvik A, Cervenansky F, Espinosa F, Espeland T, Rye Berg EA, Belhamissi M, Israilov S, Grenier T, Lartizien C, Jodoin PM, Lovstakken L, Bernard O. LU-Net: A Multistage Attention Network to Improve the Robustness of Segmentation of Left Ventricular Structures in 2-D Echocardiography. IEEE Trans Ultrason Ferroelectr Freq Control 2020; 67:2519-2530. [PMID: 32746187] [DOI: 10.1109/tuffc.2020.3003403]
Abstract
Segmentation of cardiac structures is one of the fundamental steps in estimating volumetric indices of the heart. This step is still performed semiautomatically in clinical routine and is thus prone to interobserver and intraobserver variability. Recent studies have shown that deep learning has the potential to perform fully automatic segmentation. However, the current best solutions still suffer from a lack of robustness in terms of accuracy and number of outliers. The goal of this work is to introduce a novel network designed to improve the overall segmentation accuracy of left ventricular structures (endocardial and epicardial borders) while enhancing the estimation of the corresponding clinical indices and reducing the number of outliers. This network is based on a multistage framework in which the localization and segmentation steps are optimized jointly through an end-to-end scheme. Results obtained on a large open-access dataset show that our method outperforms the current best-performing deep learning solution with a lighter architecture and achieved an overall segmentation accuracy lower than the intraobserver variability for the epicardial border (on average, a mean absolute error of 1.5 mm and a Hausdorff distance of 5.1 mm) with 11% outliers. Moreover, we demonstrate that our method can closely reproduce the expert analysis for the end-diastolic and end-systolic left ventricular volumes, with a mean correlation of 0.96 and a mean absolute error of 7.6 ml. Concerning the ejection fraction of the left ventricle, results are more contrasted, with a mean correlation coefficient of 0.83 and a mean absolute error of 5.0%, producing scores slightly below the intraobserver margin. Based on this observation, areas for improvement are suggested.
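The Hausdorff distance reported above is a boundary metric: the largest of all distances from a point on one contour to the closest point on the other. A minimal sketch on 2-D point sets (illustrative only, not the paper's evaluation code):

```python
import math

def _directed_hd(pts_a, pts_b):
    # max over points in A of the distance to the nearest point in B
    return max(min(math.dist(a, b) for b in pts_b) for a in pts_a)

def hausdorff(pts_a, pts_b):
    """Symmetric Hausdorff distance between two sets of (x, y) points."""
    return max(_directed_hd(pts_a, pts_b), _directed_hd(pts_b, pts_a))
```

In practice, contours are dense point sets sampled from the segmentation borders, and clinical papers often report a robust variant (e.g. the 95th-percentile Hausdorff distance) to reduce sensitivity to single outlier points.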
|
27
|
Zhou GQ, Huo EZ, Yuan M, Zhou P, Wang RL, Wang KN, Chen Y, He XP. A Single-Shot Region-Adaptive Network for Myotendinous Junction Segmentation in Muscular Ultrasound Images. IEEE Trans Ultrason Ferroelectr Freq Control 2020; 67:2531-2542. [PMID: 32167889] [DOI: 10.1109/tuffc.2020.2979481]
Abstract
Tracking the myotendinous junction (MTJ) in consecutive ultrasound images is crucial for understanding the mechanics and pathological conditions of the muscle-tendon unit. However, the lack of reliable and efficient identification of the MTJ due to poor image quality and boundary ambiguity restricts its application in motion analysis. In recent years, with the rapid development of deep learning, the region-based convolutional neural network (RCNN) has shown great potential for simultaneous object detection and instance segmentation in medical images. This article proposes a region-adaptive network (RAN) to localize the MTJ region and segment it in a single shot. Our model learns the salient information of the MTJ with the help of a composite architecture: a region-based multitask learning network explores the region containing the MTJ, while a parallel end-to-end U-shaped path extracts the MTJ structure from the adaptively selected region to combat data imbalance and boundary ambiguity. On ultrasound images of the gastrocnemius, we showed that the RAN achieves superior segmentation performance, with an average Dice score of 80.1%, compared with the state-of-the-art Mask RCNN method. Our proposed method is robust and reliable for advanced muscle and tendon function examinations based on ultrasound imaging.
|
28
|
Dynamically constructed network with error correction for accurate ventricle volume estimation. Med Image Anal 2020; 64:101723. [DOI: 10.1016/j.media.2020.101723] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/25/2019] [Revised: 05/07/2020] [Accepted: 05/08/2020] [Indexed: 11/20/2022]
|
29
|
He Y, Yang G, Yang J, Chen Y, Kong Y, Wu J, Tang L, Zhu X, Dillenseger JL, Shao P, Zhang S, Shu H, Coatrieux JL, Li S. Dense biased networks with deep priori anatomy and hard region adaptation: Semi-supervised learning for fine renal artery segmentation. Med Image Anal 2020; 63:101722. [DOI: 10.1016/j.media.2020.101722] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2019] [Revised: 05/02/2020] [Accepted: 05/06/2020] [Indexed: 12/24/2022]
|
30
|
Ge R, Yang G, Chen Y, Luo L, Feng C, Ma H, Ren J, Li S. K-Net: Integrate Left Ventricle Segmentation and Direct Quantification of Paired Echo Sequence. IEEE TRANSACTIONS ON MEDICAL IMAGING 2020; 39:1690-1702. [PMID: 31765307 DOI: 10.1109/tmi.2019.2955436] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
The integration of segmentation and direct quantification of the left ventricle (LV) from the paired apical-view echo sequences (i.e., apical 4-chamber and 2-chamber together) clinically achieves a comprehensive cardiac assessment: multiview segmentation for anatomical morphology and multidimensional quantification for contractile function. Direct quantification of the LV, i.e., automatically quantifying multiple LV indices directly from the image via task-aware feature representation and regression, avoids the accumulation of errors from intermediate targets. This integration reflects cardiac activity stereoscopically from the paired orthogonal cross-view sequences, overcoming the limited observation afforded by a single plane. We propose a K-shaped Unified Network (K-Net), the first end-to-end framework to simultaneously segment the LV from the apical 4-chamber and 2-chamber views and directly quantify the LV along the major- and minor-axis dimensions (1D), area (2D), and volume (3D), in sequence.
It works via four components: 1) the K-Net architecture with the Attention Junction enables learning of the heterogeneous tasks of segmentation (pixel-wise classification) and direct quantification (image-wise regression) by interactively introducing information from segmentation to promote a spatial attention map that guides quantification toward the LV-related region, and transferring quantification feedback to impose a global constraint on segmentation; 2) the Bi-ResLSTMs distributed layer-by-layer in K-Net hierarchically extract spatial-temporal information from the echo sequence, with bidirectional recurrent and short-cut connections to model spatial-temporal information among all frames; 3) the Information Valve tailing the Bi-ResLSTMs selectively exchanges information among multiple views, stimulating complementary information and suppressing redundant information to create an efficient cross-flow for each view; 4) the Evolution Loss comprehensively guides sequential data learning, with a static constraint on frame values and a dynamic constraint on inter-frame value changes. The experiments show that our K-Net achieves high performance, with a Dice coefficient up to 91.44% and a mean absolute error of the major-axis dimension down to 2.74 mm, which reveals its clinical potential.
|
31
|
Dong S, Luo G, Tam C, Wang W, Wang K, Cao S, Chen B, Zhang H, Li S. Deep Atlas Network for Efficient 3D Left Ventricle Segmentation on Echocardiography. Med Image Anal 2020; 61:101638. [DOI: 10.1016/j.media.2020.101638] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/02/2019] [Revised: 01/06/2020] [Accepted: 01/09/2020] [Indexed: 10/25/2022]
|