1
Huang Z, Shi J, Li X. Quantum Few-Shot Image Classification. IEEE TRANSACTIONS ON CYBERNETICS 2025; 55:194-206. [PMID: 39453810 DOI: 10.1109/tcyb.2024.3476339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2024]
Abstract
Few-shot learning algorithms frequently exhibit suboptimal performance due to the limited availability of labeled data. This article presents a novel quantum few-shot image classification methodology aimed at enhancing the efficacy of few-shot learning algorithms at both the data and parameter levels. Initially, a quantum augmentation image representation technique is introduced, leveraging the local phase of quantum states to support few-shot learning algorithms at the data level. This approach enriches classical data while maintaining its intrinsic physical properties. Subsequently, a parameterized quantum circuit is employed to construct the classification model. This circuit, characterized by a reduced number of trainable parameters, shows increased resilience to overfitting, thereby offering a significant advantage at the parameter level for few-shot learning algorithms. The proposed approach is validated using three datasets, with experimental results indicating that it outperforms classical methods in few-shot learning scenarios while requiring fewer computational resources.
2
Cai Z, Jing XY, Shao L. Visual-Depth Matching Network: Deep RGB-D Domain Adaptation With Unequal Categories. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:4623-4635. [PMID: 33201832 DOI: 10.1109/tcyb.2020.3032194] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/11/2023]
Abstract
Existing domain adaptation (DA) methods generally assume that different domains have an identical label space and that the training data are sampled from a single domain only. This unrealistic assumption is quite restrictive for real-world applications, since it neglects the more practical scenario in which the source domain contains categories that are not shared by the target domain and the training data are collected from multiple modalities. In this article, we address a more difficult but practical problem: recognizing RGB images by training on RGB-D data under the label space inequality scenario. There are three challenges in this task: 1) the source and target domains are affected by the domain mismatch issue, so the trained models perform imperfectly on the test data; 2) depth images are absent in the target domain (e.g., target images are captured by smartphones), whereas the source domain contains both RGB and depth data, which makes ordinary visual recognition approaches hard to apply to this task; and 3) in the real world, the source and target domains always have different numbers of categories, which makes the negative transfer bottleneck more prominent. To tackle these challenges, we formulate a deep model, called visual-depth matching network (VDMN), in which two new modules and a matching component are trained jointly in an end-to-end fashion to identify the common and outlier categories effectively. The significance of VDMN is that it can take advantage of depth information and handle the domain distribution mismatch under label inequality simultaneously. The experimental results reveal that VDMN exceeds the state-of-the-art performance on various DA datasets, especially under the label inequality scenario.
3
Xue X, Li Y, Yin X, Shang C, Peng T, Shen Q. Semantic-Aware Real-Time Correlation Tracking Framework for UAV Videos. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:2418-2429. [PMID: 32701457 DOI: 10.1109/tcyb.2020.3005453] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 06/11/2023]
Abstract
The discriminative correlation filter (DCF) has contributed tremendously to object tracking, owing to its high computational efficiency. However, it suffers from performance degradation in unmanned aerial vehicle (UAV) tracking. This article presents a novel semantic-aware real-time correlation tracking framework (SARCT) for UAV videos to enhance the performance of DCF trackers without incurring excessive computing cost. Specifically, SARCT first constructs an additional detection module to generate ROI proposals and to filter out any response from target-irrelevant areas. Then, a novel semantic segmentation module based on semantic template generation and semantic coefficient prediction is introduced to capture semantic information, which provides a precise ROI mask and thereby effectively suppresses background interference in the ROI proposals. By sharing features and specific network layers between object detection and semantic segmentation, SARCT reduces parameter redundancy to attain sufficient speed for real-time applications. Systematic experiments are conducted on three typical aerial datasets to evaluate the performance of the proposed SARCT. The results demonstrate that SARCT significantly improves the accuracy of conventional DCF-based trackers, outperforming state-of-the-art deep trackers.
4
Lei T, Wang R, Zhang Y, Wan Y, Liu C, Nandi AK. DefED-Net: Deformable Encoder-Decoder Network for Liver and Liver Tumor Segmentation. IEEE TRANSACTIONS ON RADIATION AND PLASMA MEDICAL SCIENCES 2022. [DOI: 10.1109/trpms.2021.3059780] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Indexed: 11/07/2022]
5
Li FY, Li W, Gao X, Xiao B. A Novel Framework with Weighted Decision Map Based on Convolutional Neural Network for Cardiac MR Segmentation. IEEE J Biomed Health Inform 2021; 26:2228-2239. [PMID: 34851840 DOI: 10.1109/jbhi.2021.3131758] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/10/2022]
Abstract
For diagnosing cardiovascular disease, an accurate segmentation method is needed. There are several unresolved issues in the complex field of cardiac magnetic resonance imaging, some of which have been partially addressed by using deep neural networks. To solve the two problems of over-segmentation and under-segmentation of anatomical shapes in the short-axis view from different cardiac magnetic resonance sequences, we propose a novel two-stage framework with a weighted decision map based on convolutional neural networks to segment the myocardium (Myo), left ventricle (LV), and right ventricle (RV) simultaneously. The framework comprises a decision map extractor and a cardiac segmenter. A cascaded U-Net++ is used as the decision map extractor to acquire the decision map that decides the category of each pixel. The cardiac segmenter is a multiscale dual-path feature aggregation network (MDFA-Net), which consists of a densely connected network and an asymmetric encoding and decoding network. The input to the cardiac segmenter is derived from processed original images weighted by the output of the decision map extractor. We conducted experiments on two datasets: the multi-sequence cardiac magnetic resonance segmentation challenge 2019 (MS-CMRSeg 2019) and the myocardial pathology segmentation challenge 2020 (MyoPS 2020). Test results obtained on MyoPS 2020 show that the proposed method achieves average Dice coefficients of 84.70%, 86.00%, and 86.31% in the segmentation of Myo, LV, and RV, respectively.
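The Dice coefficient reported above measures the overlap between a predicted and a reference segmentation, Dice = 2|A ∩ B| / (|A| + |B|), computed separately for each structure. The following is a minimal sketch of that metric only, not the authors' evaluation code; the function name, the label encoding (1 = Myo, 2 = LV, 3 = RV), the toy label sequences, and the empty-mask convention are all assumptions made for illustration.

```python
def dice_coefficient(pred, truth, label):
    """Dice = 2|A ∩ B| / (|A| + |B|) for one label over flat label sequences."""
    pred_hits = [p == label for p in pred]
    true_hits = [t == label for t in truth]
    intersection = sum(p and t for p, t in zip(pred_hits, true_hits))
    total = sum(pred_hits) + sum(true_hits)
    # Assumed convention: if the label is absent from both masks, score 1.0.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical flattened label maps: 0 = background, 1 = Myo, 2 = LV, 3 = RV.
truth = [0, 1, 1, 2, 2, 2, 3, 3]
pred = [0, 1, 2, 2, 2, 2, 3, 0]

# One Dice score per structure, as in the per-class figures quoted above.
scores = {label: dice_coefficient(pred, truth, label) for label in (1, 2, 3)}
```

In practice the per-class scores would be averaged over all test cases; the toy sequences here stand in for flattened 2D label maps.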
6
Cai Q, Qian Y, Zhou S, Li J, Yang YH, Wu F, Zhang D. AVLSM: Adaptive Variational Level Set Model for Image Segmentation in the Presence of Severe Intensity Inhomogeneity and High Noise. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2021; 31:43-57. [PMID: 34793300 DOI: 10.1109/tip.2021.3127848] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/13/2023]
Abstract
Intensity inhomogeneity and noise are two common issues in images that inevitably pose significant challenges for image segmentation, and the difficulty is particularly pronounced when the two issues appear simultaneously in one image. As a result, most existing level set models yield poor performance when applied to such images. To this end, this paper proposes a novel hybrid level set model, named adaptive variational level set model (AVLSM), which integrates an adaptive scale bias field correction term and a denoising term into one level set framework and can therefore correct severe intensity inhomogeneity and denoise simultaneously during segmentation. Specifically, an adaptive scale bias field correction term is first defined to correct severe intensity inhomogeneity by adaptively adjusting the scale according to the degree of intensity inhomogeneity during segmentation. More importantly, the proposed adaptive scale truncation function in the term is model-agnostic: it can be applied to most off-the-shelf models and improves their performance for image segmentation with severe intensity inhomogeneity. Then, a denoising energy term is constructed based on the variational model, which can remove not only common additive noise but also the multiplicative noise that often occurs in medical images during segmentation. Finally, by integrating the two proposed energy terms into a variational level set framework, the AVLSM is proposed. The experimental results on synthetic and real images demonstrate the superiority of AVLSM over most state-of-the-art level set models in terms of accuracy, robustness and running time.
7
Yu H, Sun P, He F, Hu Z. A weighted region-based level set method for image segmentation with intensity inhomogeneity. PLoS One 2021; 16:e0255948. [PMID: 34411147 PMCID: PMC8376002 DOI: 10.1371/journal.pone.0255948] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2021] [Accepted: 07/27/2021] [Indexed: 11/18/2022]
Abstract
Image segmentation is a fundamental task in image processing and is still a challenging problem when processing images with high noise, low resolution and intensity inhomogeneity. In this paper, a weighted region-based level set method, which is based on the techniques of local statistical theory, level set theory and curve evolution, is proposed. Specifically, a new weighted pressure force function (WPF) is first presented to flexibly drive the closed contour to shrink or expand outside and inside of the object. Second, a faster and smoother regularization term is added to ensure the stability of the curve evolution and that there is no need for initialization in curve evolution. Third, the WPF is integrated into the region-based level set framework to accelerate the speed of the curve evolution and improve the accuracy of image segmentation. Experimental results on medical and natural images demonstrate that the proposed segmentation model is more efficient and robust to noise than other state-of-the-art models.
Affiliation(s)
- Haiping Yu, School of Computer Science, Huanggang Normal University, Huanggang, China
- Ping Sun, School of Computer Science, Huanggang Normal University, Huanggang, China
- Fazhi He, School of Computer Science, Wuhan University, Wuhan, China
- Zhihua Hu, School of Computer Science, Huanggang Normal University, Huanggang, China
8
Li S, Jiang H, Li H, Yao YD. AW-SDRLSE: Adaptive Weighting and Scalable Distance Regularized Level Set Evolution for Lymphoma Segmentation on PET Images. IEEE J Biomed Health Inform 2021; 25:1173-1184. [PMID: 32841130 DOI: 10.1109/jbhi.2020.3017546] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/08/2022]
Abstract
Accurate lymphoma segmentation on Positron Emission Tomography (PET) images is of great importance for medical diagnoses, such as distinguishing benign from malignant cases. To this end, this paper proposes an adaptive weighting and scalable distance regularized level set evolution (AW-SDRLSE) method for delineating lymphoma boundaries on 2D PET slices. AW-SDRLSE has three important characteristics: 1) A scalable distance regularization term is proposed, in which a parameter q can, in theory, control the contour's convergence rate and precision. 2) A novel dynamic annular mask is proposed to calculate the mean intensities of local interior and exterior regions and to define the region energy term. 3) As the level set method is sensitive to parameters, we propose an adaptive weighting strategy for the length and area energy terms using local region intensity and boundary direction information. AW-SDRLSE is evaluated on 90 cases of real PET data with a mean Dice coefficient of 0.8796. Comparative results demonstrate the accuracy and robustness of AW-SDRLSE as well as its performance advantages over related level set methods. In addition, experimental results indicate that AW-SDRLSE can serve as a fine segmentation method that significantly improves the lymphoma segmentation results obtained by deep learning (DL) methods.
9
Yu J, Yao J, Zhang J, Yu Z, Tao D. SPRNet: Single-Pixel Reconstruction for One-Stage Instance Segmentation. IEEE TRANSACTIONS ON CYBERNETICS 2021; 51:1731-1742. [PMID: 32167921 DOI: 10.1109/tcyb.2020.2969046] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Indexed: 06/10/2023]
Abstract
Object instance segmentation is one of the most fundamental but challenging tasks in computer vision, and it requires pixel-level image understanding. Most existing approaches address this problem by adding a mask prediction branch to a two-stage object detector with a region proposal network (RPN). Although they produce good segmentation results, the efficiency of these two-stage approaches is far from satisfactory, restricting their applicability in practice. In this article, we propose a one-stage framework, single-pixel reconstruction net (SPRNet), which performs efficient instance segmentation by introducing a single-pixel reconstruction (SPR) branch to off-the-shelf one-stage detectors. The added SPR branch reconstructs the pixel-level mask directly from every single pixel in the convolution feature map. Using the same ResNet-50 backbone, SPRNet achieves mask AP comparable to Mask R-CNN at a higher inference speed and gains all-round improvements in box AP at every scale compared with RetinaNet.
10
Zhang X, Wei Y, Yang Y, Huang TS. SG-One: Similarity Guidance Network for One-Shot Semantic Segmentation. IEEE TRANSACTIONS ON CYBERNETICS 2020; 50:3855-3865. [PMID: 32497014 DOI: 10.1109/tcyb.2020.2992433] [Citation(s) in RCA: 47] [Impact Index Per Article: 9.4] [Indexed: 06/11/2023]
Abstract
One-shot image semantic segmentation poses the challenging task of recognizing object regions from unseen categories with only one annotated example as supervision. In this article, we propose a simple yet effective similarity guidance network to tackle the one-shot (SG-One) segmentation problem. We aim at predicting the segmentation mask of a query image with reference to one densely labeled support image of the same category. To obtain a robust representative feature of the support image, we first adopt a masked average pooling strategy for producing the guidance features by only taking the pixels belonging to the support image into account. We then leverage the cosine similarity to build the relationship between the guidance features and the features of pixels from the query image. In this way, the possibilities embedded in the produced similarity maps can be adopted to guide the process of segmenting objects. Furthermore, our SG-One is a unified framework that can efficiently process both support and query images within one network and be learned in an end-to-end manner. We conduct extensive experiments on Pascal VOC 2012. In particular, our SG-One achieves an mIoU score of 46.3%, surpassing the baseline methods.
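The two operations named in this abstract, masked average pooling over the support features and a per-pixel cosine similarity against the query features, can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the SG-One implementation: the feature maps are plain nested lists of shape H x W x C, the binary mask is assumed to mark the annotated support pixels, and all function names, the `eps` stabilizer, and the toy values are hypothetical.

```python
import math


def masked_average_pooling(features, mask):
    """Average the C-dimensional feature vectors at positions where mask is 1.

    features: H x W x C nested lists; mask: H x W nested lists of 0/1.
    """
    channels = len(features[0][0])
    total = [0.0] * channels
    count = 0
    for feat_row, mask_row in zip(features, mask):
        for vec, m in zip(feat_row, mask_row):
            if m:
                count += 1
                for c in range(channels):
                    total[c] += vec[c]
    if count == 0:
        raise ValueError("support mask is empty")
    return [t / count for t in total]


def cosine_similarity(u, v, eps=1e-8):
    """Cosine of the angle between u and v; eps guards against zero norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / (norm + eps)


def similarity_map(query_features, guidance):
    """Compare every query pixel's feature vector with the guidance vector."""
    return [[cosine_similarity(vec, guidance) for vec in row]
            for row in query_features]


# Toy 2 x 2 support feature map with 2 channels; the mask keeps the left column.
support_features = [[[1.0, 0.0], [0.0, 1.0]],
                    [[1.0, 0.0], [0.0, 1.0]]]
support_mask = [[1, 0],
                [1, 0]]
guidance = masked_average_pooling(support_features, support_mask)

# Toy 1 x 2 query feature map: one pixel aligned with the guidance, one orthogonal.
query_features = [[[2.0, 0.0], [0.0, 3.0]]]
sim = similarity_map(query_features, guidance)
```

Query pixels whose features point in the same direction as the guidance vector score close to 1, and orthogonal ones score close to 0, which is what lets the similarity map act as a soft guide for the segmentation branch.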
11
Sheng B, Li P, Mo S, Li H, Hou X, Wu Q, Qin J, Fang R, Feng DD. Retinal Vessel Segmentation Using Minimum Spanning Superpixel Tree Detector. IEEE TRANSACTIONS ON CYBERNETICS 2019; 49:2707-2719. [PMID: 29994327 DOI: 10.1109/tcyb.2018.2833963] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Indexed: 06/08/2023]
Abstract
The retinal vessel is one of the determining factors in an ophthalmic examination. Automatic extraction of retinal vessels from low-quality retinal images remains a challenging problem. In this paper, we propose a robust and effective approach that qualitatively improves the detection of low-contrast and narrow vessels. Rather than using the pixel grid, we use a superpixel as the elementary unit of our vessel segmentation scheme. We regularize this scheme by combining the geometrical structure, texture, color, and space information in the superpixel graph. The segmentation results are then refined by employing the efficient minimum spanning superpixel tree to detect and capture both the global and local structure of the retinal images. Such an effective and structure-aware tree detector significantly improves the detection around the pathologic area. Experimental results show that the proposed technique achieves advantageous connectivity-area-length (CAL) scores of 80.92% and 69.06% on two public datasets, namely, DRIVE and STARE, thereby outperforming state-of-the-art segmentation methods. In addition, tests on the challenging retinal image database further demonstrate the effectiveness of our method. Our technique provides an automated method for effectively extracting the vessels from fundus images.