1. Geng X, Jiao L, Liu X, Li L, Chen P, Liu F, Yang S. A Spatial-Spectral Relation-Guided Fusion Network for Multisource Optical RS Image Classification. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:8991-9004. [PMID: 38954572] [DOI: 10.1109/tnnls.2024.3413799]
Abstract
Multisource optical remote sensing (RS) image classification has obtained extensive research interest with demonstrated superiority. Existing approaches mainly improve classification performance by exploiting complementary information from multisource data. However, these approaches are insufficient in effectively extracting data features and utilizing correlations of multisource optical RS images. For this purpose, this article proposes a generalized spatial-spectral relation-guided fusion network (S2RGF-Net) for multisource optical RS image classification. First, we elaborate on spatial- and spectral-domain-specific feature encoders based on data characteristics to explore the rich feature information of optical RS data deeply. Subsequently, two relation-guided fusion strategies are proposed at the dual-level (intradomain and interdomain) to integrate multisource image information effectively. In the intradomain feature fusion, an adaptive de-redundancy fusion module (ADRF) is introduced to eliminate redundancy so that the spatial and spectral features are complete and compact, respectively. In interdomain feature fusion, we construct a spatial-spectral joint attention module (SSJA) based on interdomain relationships to sufficiently enhance the complementary features, so as to facilitate later fusion. Experiments on various multisource optical RS datasets demonstrate that S2RGF-Net outperforms other state-of-the-art (SOTA) methods.
2. Kim M, Park T, Kang J, Kim MJ, Kwon MJ, Oh BY, Kim JW, Ha S, Yang WS, Cho BJ, Son I. Development and validation of automated three-dimensional convolutional neural network model for acute appendicitis diagnosis. Sci Rep 2025; 15:7711. [PMID: 40044743] [PMCID: PMC11882796] [DOI: 10.1038/s41598-024-84348-6]
Abstract
Rapid, accurate preoperative imaging diagnosis of appendicitis is critical to surgical decisions in emergency care. This study developed a fully automated diagnostic framework using a 3D convolutional neural network (CNN) to identify appendicitis from contrast-enhanced abdominopelvic computed tomography images and clinical information of patients with abdominal pain. A deep learning model, Information of Appendix (IA), was developed, and the volume of interest (VOI) region corresponding to the anatomical location of the appendix was automatically extracted. It was analysed using a two-stage binary algorithm with transfer learning. The algorithm predicted three categories: non-, simple, and complicated appendicitis. The 3D-CNN architecture incorporated ResNet, DenseNet, and EfficientNet. In stage 1 classification (non-appendicitis versus appendicitis), the IA model utilising DenseNet169 demonstrated 79.5% accuracy (76.4-82.6%), 70.1% sensitivity (64.7-75.0%), 87.6% specificity (83.7-90.7%), and an area under the curve (AUC) of 0.865 (0.862-0.867), with a negative appendectomy rate of 12.4%. In stage 2, differentiating simple from complicated appendicitis, the IA model exhibited 76.1% accuracy (70.3-81.9%), 82.6% sensitivity (62.9-90.9%), 74.2% specificity (67.0-80.3%), and an AUC of 0.827 (0.820-0.833). This IA model can provide physicians with reliable diagnostic information on appendicitis with generality and reproducibility within the VOI.
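For readers who want to see the two-stage cascade concretely, the following is a minimal PyTorch sketch of the inference flow described above. The tiny 3D CNN, the 0.5 thresholds, and the class names are illustrative assumptions standing in for the study's ResNet/DenseNet/EfficientNet backbones and trained weights.

```python
import torch
import torch.nn as nn

class Small3DCNN(nn.Module):
    """Illustrative 3D CNN backbone (not the paper's ResNet/DenseNet/EfficientNet)."""
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool3d(1),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):                      # x: (B, 1, D, H, W) appendix VOI
        return self.classifier(self.features(x).flatten(1))

stage1 = Small3DCNN()   # non-appendicitis vs. appendicitis
stage2 = Small3DCNN()   # simple vs. complicated appendicitis

@torch.no_grad()
def predict(voi: torch.Tensor) -> str:
    """Two-stage binary cascade: stage 2 runs only if stage 1 predicts appendicitis."""
    p_appendicitis = torch.softmax(stage1(voi), dim=1)[0, 1]
    if p_appendicitis < 0.5:
        return "non-appendicitis"
    p_complicated = torch.softmax(stage2(voi), dim=1)[0, 1]
    return "complicated appendicitis" if p_complicated >= 0.5 else "simple appendicitis"

print(predict(torch.randn(1, 1, 32, 64, 64)))  # random tensor in place of a real VOI
```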
Affiliation(s)
- Minsung Kim: Department of Surgery, Hallym University Medical Center, Hallym Sacred Heart Hospital, Hallym University College of Medicine, 22 Gwanpyeong-ro 170 beon-gil, Pyeongan-dong, Dongan-gu, Anyang, Gyeonggi-do, Republic of Korea
- Taeyong Park: Medical Artificial Intelligence Center, Hallym University Medical Center, Anyang, Republic of Korea
- Jaewoong Kang: Medical Artificial Intelligence Center, Hallym University Medical Center, Anyang, Republic of Korea
- Min-Jeong Kim: Department of Radiology, Hallym Sacred Heart Hospital, Hallym University College of Medicine, Anyang, Republic of Korea
- Mi Jung Kwon: Department of Pathology, Hallym Sacred Heart Hospital, Hallym University College of Medicine, Anyang, Republic of Korea
- Bo Young Oh: Department of Surgery, Hallym University Medical Center, Hallym Sacred Heart Hospital, Hallym University College of Medicine, 22 Gwanpyeong-ro 170 beon-gil, Pyeongan-dong, Dongan-gu, Anyang, Gyeonggi-do, Republic of Korea
- Jong Wan Kim: Department of Surgery, Dongtan Sacred Heart Hospital, Hallym University College of Medicine, Hwaseong, Republic of Korea
- Sangook Ha: Department of Emergency Medicine, Hallym University Sacred Heart Hospital, Hallym University Medical Center, Anyang, Republic of Korea
- Won Seok Yang: Department of Emergency Medicine, Hallym University Sacred Heart Hospital, Hallym University Medical Center, Anyang, Republic of Korea
- Bum-Joo Cho: Medical Artificial Intelligence Center, Hallym University Medical Center, Anyang, Republic of Korea
- Iltae Son: Department of Surgery, Hallym University Medical Center, Hallym Sacred Heart Hospital, Hallym University College of Medicine, 22 Gwanpyeong-ro 170 beon-gil, Pyeongan-dong, Dongan-gu, Anyang, Gyeonggi-do, Republic of Korea
3. Tian J, Saddik AE, Xu X, Li D, Cao Z, Shen HT. Intrinsic Consistency Preservation With Adaptively Reliable Samples for Source-Free Domain Adaptation. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:4738-4749. [PMID: 38379234] [DOI: 10.1109/tnnls.2024.3362948]
Abstract
Unsupervised domain adaptation (UDA) aims to alleviate the domain shift by transferring knowledge learned from a labeled source dataset to an unlabeled target domain. Although UDA has seen promising progress recently, it requires access to data from both domains, making it problematic in source-data-absent scenarios. In this article, we investigate a practical task, source-free domain adaptation (SFDA), that alleviates the limitation of the widely studied UDA, namely the need to simultaneously acquire source and target data. In addition, we further study the imbalanced SFDA (ISFDA) problem, which addresses intra-domain class imbalance and inter-domain label shift in SFDA. We observe two key issues in SFDA: 1) target data form clusters in the representation space regardless of whether the target data points are aligned with the source classifier and 2) target samples with higher classification confidence are more reliable and show less variation in their classification confidence during adaptation. Motivated by these observations, we propose a unified method, named intrinsic consistency preservation with adaptively reliable samples (ICPR), to jointly cope with SFDA and ISFDA. Specifically, ICPR first encourages intrinsic consistency in the predictions of neighbors for unlabeled samples with weak augmentation (standard flip-and-shift), regardless of their reliability. ICPR then generates strongly augmented views specifically for adaptively selected reliable samples and is trained to enforce the intrinsic consistency between weakly and strongly augmented views of the same image with respect to the predictions of neighbors and of the samples themselves. Additionally, we propose to use a prototype-like classifier to avoid the classification confusion caused by severe intra-domain class imbalance and inter-domain label shift. We demonstrate the effectiveness and general applicability of ICPR on six benchmarks of both SFDA and ISFDA tasks. The reproducible code of our proposed ICPR method is available at https://github.com/CFM-MSG/Code_ICPR.
4. Tan Y, Zhang E, Li Y, Huang SL, Zhang XP. Transferability-Guided Cross-Domain Cross-Task Transfer Learning. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:2423-2436. [PMID: 38315592] [DOI: 10.1109/tnnls.2024.3358094]
Abstract
We propose two novel transferability metrics, fast optimal transport-based conditional entropy (F-OTCE) and joint correspondence OTCE (JC-OTCE), to evaluate how much a source model (task) can benefit the learning of a target task and to learn more generalizable representations for cross-domain cross-task transfer learning. Unlike the original OTCE metric, which requires evaluating the empirical transferability on auxiliary tasks, our metrics are auxiliary-free and can therefore be computed much more efficiently. Specifically, F-OTCE estimates transferability by first solving an optimal transport (OT) problem between source and target distributions and then using the optimal coupling to compute the negative conditional entropy (NCE) between the source and target labels. It can also serve as an objective function to enhance downstream transfer learning tasks, including model finetuning and domain generalization (DG). Meanwhile, JC-OTCE improves the transferability accuracy of F-OTCE by including label distances in the OT problem, though it incurs additional computation costs. Extensive experiments demonstrate that F-OTCE and JC-OTCE outperform state-of-the-art auxiliary-free metrics by 21.1% and 25.8%, respectively, in correlation coefficient with the ground-truth transfer accuracy. By eliminating the training cost of auxiliary tasks, the two metrics reduce the total computation time of the previous method from 43 min to 9.32 and 10.78 s, respectively, for a pair of tasks. When applied to model finetuning and DG tasks, F-OTCE shows significant improvements in transfer accuracy in few-shot classification experiments, with up to 4.41% and 2.34% accuracy gains, respectively.
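The F-OTCE recipe (an entropic OT coupling between source and target features, then the negative conditional entropy between source and target labels under that coupling) can be sketched in a few lines of NumPy. This is a hedged reconstruction from the description above, not the authors' implementation; the Sinkhorn solver, the squared-Euclidean cost, and the function names are our own assumptions.

```python
import numpy as np

def sinkhorn(a, b, M, reg=0.1, n_iter=200):
    """Entropy-regularized OT coupling between histograms a, b with cost matrix M."""
    K = np.exp(-M / reg)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

def f_otce_score(Xs, ys, Xt, yt, reg=0.1):
    """OTCE-style transferability score: higher (closer to 0) means more transferable.
    Xs, Xt are feature arrays; ys, yt are integer label arrays for the two tasks."""
    ns, nt = len(Xs), len(Xt)
    M = ((Xs[:, None, :] - Xt[None, :, :]) ** 2).sum(-1)          # squared Euclidean cost
    P = sinkhorn(np.full(ns, 1 / ns), np.full(nt, 1 / nt), M, reg)
    # Joint label distribution induced by the coupling (plain loops for clarity).
    J = np.zeros((ys.max() + 1, yt.max() + 1))
    for i in range(ns):
        for j in range(nt):
            J[ys[i], yt[j]] += P[i, j]
    J /= J.sum()
    P_ys = np.clip(J.sum(axis=1, keepdims=True), 1e-12, None)
    cond = np.clip(J / P_ys, 1e-12, None)
    return float(np.sum(J * np.log(cond)))                        # = -H(Y_t | Y_s)
```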
5. Lin J, Tang Y, Wang J, Zhang W. Constrained Maximum Cross-Domain Likelihood for Domain Generalization. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:2013-2027. [PMID: 37440378] [DOI: 10.1109/tnnls.2023.3292242]
Abstract
As a recent noticeable topic, domain generalization aims to learn a generalizable model on multiple source domains, which is expected to perform well on unseen test domains. Great efforts have been made to learn domain-invariant features by aligning distributions across domains. However, existing works are often designed based on some relaxed conditions which are generally hard to satisfy and fail to realize the desired joint distribution alignment. In this article, we propose a novel domain generalization method, which originates from an intuitive idea that a domain-invariant classifier can be learned by minimizing the Kullback-Leibler (KL)-divergence between posterior distributions from different domains. To enhance the generalizability of the learned classifier, we formalize the optimization objective as an expectation computed on the ground-truth marginal distribution. Nevertheless, it also presents two obvious deficiencies, one of which is the side-effect of entropy increase in KL-divergence and the other is the unavailability of ground-truth marginal distributions. For the former, we introduce a term named maximum in-domain likelihood to maintain the discrimination of the learned domain-invariant representation space. For the latter, we approximate the ground-truth marginal distribution with source domains under a reasonable convex hull assumption. Finally, a constrained maximum cross-domain likelihood (CMCL) optimization problem is deduced, by solving which the joint distributions are naturally aligned. An alternating optimization strategy is carefully designed to approximately solve this optimization problem. Extensive experiments on four standard benchmark datasets, i.e., Digits-DG, PACS, Office-Home, and miniDomainNet, highlight the superior performance of our method.
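A minimal PyTorch sketch of the central quantity, the KL divergence between class posteriors produced for the same inputs under different domains, together with a plain cross-entropy term playing the role of the maximum in-domain likelihood. This is our own hedged reading of the abstract; the paper's full objective, convex-hull marginal approximation, and alternating optimization are not reproduced here.

```python
import torch
import torch.nn.functional as F

def cross_domain_posterior_kl(logits_dom_a: torch.Tensor, logits_dom_b: torch.Tensor) -> torch.Tensor:
    """Symmetrized KL between the posteriors p(y|x) predicted by two domain-specific
    heads on the same batch of inputs (an assumed setup, not the paper's exact estimator)."""
    log_p = F.log_softmax(logits_dom_a, dim=1)
    log_q = F.log_softmax(logits_dom_b, dim=1)
    kl_pq = F.kl_div(log_q, log_p.exp(), reduction="batchmean")  # KL(p || q)
    kl_qp = F.kl_div(log_p, log_q.exp(), reduction="batchmean")  # KL(q || p)
    return 0.5 * (kl_pq + kl_qp)

def in_domain_likelihood(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Maximum in-domain likelihood term: ordinary cross-entropy that keeps the
    aligned representation space discriminative."""
    return F.cross_entropy(logits, labels)
```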
6. Li S, Zhang R, Gong K, Xie M, Ma W, Gao G. Source-Free Active Domain Adaptation via Augmentation-Based Sample Query and Progressive Model Adaptation. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:2538-2550. [PMID: 38127604] [DOI: 10.1109/tnnls.2023.3338294]
Abstract
Active domain adaptation (ADA), which enormously improves the performance of unsupervised domain adaptation (UDA) at the expense of annotating limited target data, has attracted a surge of interest. However, in real-world applications, the source data in conventional ADA are not always accessible due to data privacy and security issues. To alleviate this dilemma, we introduce a more practical and challenging setting, dubbed source-free ADA (SFADA), where one can select a small quota of target samples for label query to assist the model learning, but labeled source data are unavailable. Therefore, how to query the most informative target samples and how to mitigate the domain gap without the aid of source data are two key challenges in SFADA. To address SFADA, we propose a unified method, SQAdapt, via augmentation-based Sample Query and progressive model Adaptation. Specifically, an active selection module (ASM) is built for target label query, which exploits data augmentation to select the most informative target samples with high predictive sensitivity and uncertainty. Then, we further introduce a classifier adaptation module (CAM) to leverage both the labeled and unlabeled target data for progressively calibrating the classifier weights. Meanwhile, the source-like target samples with low selection scores are taken as source surrogates to realize distribution alignment in the source-free scenario by the proposed distribution alignment module (DAM). Moreover, as a general active label query method, SQAdapt can be easily integrated into other source-free UDA (SFUDA) methods and improve their performance. Comprehensive experiments on multiple benchmarks have shown that SQAdapt achieves superior performance and even surpasses most ADA methods.
7. Qu J, Dong W, Yang Y, Zhang T, Li Y, Du Q. Cycle-Refined Multidecision Joint Alignment Network for Unsupervised Domain Adaptive Hyperspectral Change Detection. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:2634-2647. [PMID: 38170657] [DOI: 10.1109/tnnls.2023.3347301]
Abstract
Hyperspectral change detection, which provides abundant information on land cover changes on the Earth's surface, has become one of the most crucial tasks in remote sensing. Recently, deep-learning-based change detection methods have shown remarkable performance, but the acquisition of labeled data is extremely expensive and time-consuming. It is intuitive to learn changes from a scene with sufficient labeled data and adapt them to an unlabeled new scene. However, the nonnegligible domain shift between different scenes leads to inevitable performance degradation. In this article, a cycle-refined multidecision joint alignment network (CMJAN) is proposed for unsupervised domain adaptive hyperspectral change detection, which realizes progressive alignment of the data distributions between the source and target domains with cycle-refined high-confidence labeled samples. There are two key characteristics: 1) progressively mitigating the distribution discrepancy to learn a domain-invariant difference feature representation and 2) updating the high-confidence training samples of the target domain in a cyclic manner. The benefit is that the domain shift between the source and target domains is progressively alleviated to promote change detection performance on the target domain in an unsupervised manner. Experimental results on different datasets demonstrate that the proposed method achieves better performance than state-of-the-art change detection methods.
8. Wang Y, Zheng W, Li Q, Chen S. Dual-Correction-Adaptation Network for Noisy Knowledge Transfer. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:1081-1091. [PMID: 37856271] [DOI: 10.1109/tnnls.2023.3322390]
Abstract
Unsupervised domain adaptation (UDA) promotes target learning via a single-directional transfer from a label-rich source domain to an unlabeled target domain, while the reverse adaptation from target to source has not been jointly considered. In real teaching practice, a teacher helps students learn and also gains improvement from students, and such a virtuous cycle inspires us to explore dual-directional transfer between domains. In fact, target pseudo-labels predicted by the source commonly involve noise due to model bias; moreover, the source domain usually contains innate noise, which inevitably aggravates target noise, leading to noise amplification. Transfer from target to source exploits target knowledge to rectify the adaptation, consequently enabling better source transfer and forming a virtuous transfer cycle. To this end, we propose a dual-correction-adaptation network (DualCAN), in which adaptation and correction cycle between domains, such that learning in both domains can be boosted gradually. To the best of our knowledge, this is the first naive attempt at dual-directional adaptation. Empirical results validate DualCAN with remarkable performance gains, particularly for extremely noisy tasks (e.g., approximately +10% on Office-31 with 40% label corruption).
9. Ge C, Huang R, Xie M, Lai Z, Song S, Li S, Huang G. Domain Adaptation via Prompt Learning. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:1160-1170. [PMID: 37943650] [DOI: 10.1109/tnnls.2023.3327962]
Abstract
Unsupervised domain adaptation (UDA) aims to adapt models learned from a well-annotated source domain to a target domain, where only unlabeled samples are given. Current UDA approaches learn domain-invariant features by aligning source and target feature spaces through statistical discrepancy minimization or adversarial training. However, these constraints could lead to the distortion of semantic feature structures and loss of class discriminability. In this article, we introduce a novel prompt learning paradigm for UDA, named domain adaptation via prompt learning (DAPrompt). In contrast to prior works, our approach learns the underlying label distribution for target domain rather than aligning domains. The main idea is to embed domain information into prompts, a form of representation generated from natural language, which is then used to perform classification. This domain information is shared only by images from the same domain, thereby dynamically adapting the classifier according to each domain. By adopting this paradigm, we show that our model not only outperforms previous methods on several cross-domain benchmarks but also is very efficient to train and easy to implement.
10. Li Z, Cai R, Chen J, Yan Y, Chen W, Zhang K, Ye J. Time-series domain adaptation via sparse associative structure alignment: Learning invariance and variance. Neural Netw 2024; 180:106659. [PMID: 39216292] [DOI: 10.1016/j.neunet.2024.106659]
Abstract
Domain adaptation on time-series data, which is often encountered in industrial applications such as anomaly detection and sensor data forecasting but has received limited attention in academia, is an important yet challenging task in real-world scenarios. Most existing methods for time-series data adopt the covariate shift assumption devised for non-time-series data to extract the domain-invariant representation, but this assumption is hard to meet in practice due to the complex dependence among variables, and a small change in the time lags may lead to a huge change in future values. To address this challenge, we leverage the stableness of causal structures across different domains. To further avoid strong assumptions in causal discovery, such as the linear non-Gaussian assumption, we relax the problem to mining the stable sparse associative structures instead of discovering the causal structures directly. Besides the domain-invariant structures, we also find that some domain-specific information, such as the strengths of the structures, is important for prediction. Based on the aforementioned intuition, we extend the sparse associative structure alignment model in the conference version to the Sparse Associative Structure Alignment model with domain-specific information enhancement (SASA2 for short), which aligns the invariant unweighted sparse associative structures and considers the variant information for time-series unsupervised domain adaptation. Specifically, we first generate the segment set to exclude the obstacle of offsets. Second, we extract the unweighted sparse associative structures via sparse attention mechanisms. Third, we extract the domain-specific information via an autoregressive module. Finally, we employ a unidirectional alignment restriction to guide the transformation from the source to the target. Moreover, we further provide a generalization analysis to show the theoretical superiority of our method. Compared with existing methods, our method yields state-of-the-art performance, with a 5% relative improvement on three real-world datasets covering different applications: air quality, in-hospital healthcare, and anomaly detection. Furthermore, visualization results of the sparse associative structures illustrate what knowledge can be transferred, boosting the transparency and interpretability of our method.
Affiliation(s)
- Zijian Li: Guangdong University of Technology, Guangzhou, 510006, Guangdong, China; Mohamed bin Zayed University of Artificial Intelligence, Masdar City, Abu Dhabi, United Arab Emirates
- Ruichu Cai: Guangdong University of Technology, Guangzhou, 510006, Guangdong, China
- Jiawei Chen: Guangdong University of Technology, Guangzhou, 510006, Guangdong, China
- Yuguang Yan: Guangdong University of Technology, Guangzhou, 510006, Guangdong, China
- Wei Chen: Guangdong University of Technology, Guangzhou, 510006, Guangdong, China
- Keli Zhang: Huawei Noah's Ark Lab, Shenzhen, 518116, Guangdong, China
- Junjian Ye: Huawei Noah's Ark Lab, Shenzhen, 518116, Guangdong, China
11. Ma S, Yuan Z, Wu Q, Huang Y, Hu X, Leung CH, Wang D, Huang Z. Deep Into the Domain Shift: Transfer Learning Through Dependence Regularization. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:14409-14423. [PMID: 37279130] [DOI: 10.1109/tnnls.2023.3279099]
Abstract
Classical domain adaptation methods acquire transferability by regularizing the overall distributional discrepancies between features in the source domain (labeled) and features in the target domain (unlabeled). They often do not differentiate whether the domain differences come from the marginals or the dependence structures. In many business and financial applications, the labeling function usually has different sensitivities to the changes in the marginals versus changes in the dependence structures. Measuring the overall distributional differences will not be discriminative enough in acquiring transferability. Without the needed structural resolution, the learned transfer is less optimal. This article proposes a new domain adaptation approach in which one can measure the differences in the internal dependence structure separately from those in the marginals. By optimizing the relative weights among them, the new regularization strategy greatly relaxes the rigidness of the existing approaches. It allows a learning machine to pay special attention to places where the differences matter the most. Experiments on three real-world datasets show that the improvements are quite notable and robust compared to various benchmark domain adaptation models.
12. Wang H, Zhang J, Li Y, Wang D, Zhang T, Yang F, Li Y, Zhang Y, Yang L, Li P. Deep-learning features based on F18 fluorodeoxyglucose positron emission tomography/computed tomography (18F-FDG PET/CT) to predict preoperative colorectal cancer lymph node metastasis. Clin Radiol 2024; 79:e1152-e1158. [PMID: 38955636] [DOI: 10.1016/j.crad.2024.05.017]
Abstract
AIM: The objective of this study was to create and validate a prognostic model for lymph node metastasis (LNM) in colorectal cancer (CRC) that integrates clinical, radiomics, and deep transfer learning features.
MATERIALS AND METHODS: In this study, we analyzed data from 119 CRC patients who underwent F18 fluorodeoxyglucose positron emission tomography/computed tomography (18F-FDG PET/CT) scanning. The patient cohort was divided into training and validation subsets in an 8:2 ratio, with an additional 33 external data points for testing. Initially, we conducted univariate analysis to screen clinical parameters. Radiomics features were extracted from manually drawn images using pyradiomics, and deep-learning features, radiomics features, and clinical features were selected using Least Absolute Shrinkage and Selection Operator (LASSO) regression and the Spearman correlation coefficient. We then constructed a model by training a support vector machine (SVM) and evaluated the performance of the prediction model by comparing the area under the curve (AUC), sensitivity, and specificity. Finally, we developed nomograms combining clinical and radiological features for interpretation and analysis.
RESULTS: The deep learning radiomics (DLR) nomogram model, developed by integrating deep learning, radiomics, and clinical features, exhibited excellent performance. The AUC was 0.934 (95% confidence interval [CI]: 0.884-0.983) in the training cohort, 0.902 (95% CI: 0.769-1.000) in the validation cohort, and 0.836 (95% CI: 0.673-0.998) in the test cohort.
CONCLUSION: We developed a preoperative predictive machine-learning model using deep transfer learning, radiomics, and clinical features to differentiate LNM status in CRC, aiding treatment decision-making for patients.
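The modelling pipeline (LASSO-based feature selection followed by an SVM evaluated with AUC) maps naturally onto scikit-learn. The sketch below uses random placeholder arrays in place of the study's PET/CT features and omits the Spearman filtering and nomogram steps; it is an assumption-laden illustration, not the authors' code.

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.metrics import roc_auc_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholder data: 119 patients, 200 concatenated clinical/radiomics/deep features.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(119, 200)), rng.integers(0, 2, size=119)

# LASSO regression for feature selection: keep features with non-zero coefficients.
lasso = make_pipeline(StandardScaler(), LassoCV(cv=5, random_state=0)).fit(X, y)
keep = np.flatnonzero(lasso[-1].coef_ != 0)
if keep.size == 0:                     # random placeholder data may select nothing
    keep = np.arange(X.shape[1])

# SVM on the selected features with an ~8:2 train/validation split, scored by AUC.
svm = make_pipeline(StandardScaler(), SVC(probability=True, random_state=0))
svm.fit(X[:95][:, keep], y[:95])
prob = svm.predict_proba(X[95:][:, keep])[:, 1]
print("validation AUC:", roc_auc_score(y[95:], prob))
```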
Affiliation(s)
- H Wang: Department of PET/CT, The Second Affiliated Hospital of Harbin Medical University, Baojian Road, Nangang District, Harbin, Heilongjiang Province, China
- J Zhang: Department of PET/CT, The Second Affiliated Hospital of Harbin Medical University, Baojian Road, Nangang District, Harbin, Heilongjiang Province, China
- Y Li: Department of PET/CT, The Second Affiliated Hospital of Harbin Medical University, Baojian Road, Nangang District, Harbin, Heilongjiang Province, China
- D Wang: Department of PET/CT, The Second Affiliated Hospital of Harbin Medical University, Baojian Road, Nangang District, Harbin, Heilongjiang Province, China
- T Zhang: Department of PET/CT, The Second Affiliated Hospital of Harbin Medical University, Baojian Road, Nangang District, Harbin, Heilongjiang Province, China
- F Yang: Department of PET/CT, The Second Affiliated Hospital of Harbin Medical University, Baojian Road, Nangang District, Harbin, Heilongjiang Province, China
- Y Li: Department of PET/CT, The Second Affiliated Hospital of Harbin Medical University, Baojian Road, Nangang District, Harbin, Heilongjiang Province, China
- Y Zhang: Department of PET/CT, The Second Affiliated Hospital of Harbin Medical University, Baojian Road, Nangang District, Harbin, Heilongjiang Province, China
- L Yang: PET/MR Department, Harbin Medical University Cancer Hospital, Haping Road, Nangang District, Harbin, Heilongjiang Province, China
- P Li: Department of PET/CT, The Second Affiliated Hospital of Harbin Medical University, Baojian Road, Nangang District, Harbin, Heilongjiang Province, China
13. Kong Z, Zhang W, Liu F, Luo W, Liu H, Shen L, Ramachandra R. Taming Self-Supervised Learning for Presentation Attack Detection: De-Folding and De-Mixing. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:10639-10650. [PMID: 37027593] [DOI: 10.1109/tnnls.2023.3243229]
Abstract
Biometric systems are vulnerable to presentation attacks (PAs) performed using various PA instruments (PAIs). Even though there are numerous PA detection (PAD) techniques based on both deep learning and hand-crafted features, the generalization of PAD for unknown PAI is still a challenging problem. In this work, we empirically prove that the initialization of the PAD model is a crucial factor for generalization, which is rarely discussed in the community. Based on such observation, we proposed a self-supervised learning-based method, denoted as DF-DM. Specifically, DF-DM is based on a global-local view coupled with de-folding and de-mixing to derive the task-specific representation for PAD. During de-folding, the proposed technique will learn region-specific features to represent samples in a local pattern by explicitly minimizing the generative loss. While de-mixing drives detectors to obtain the instance-specific features with global information for more comprehensive representation by minimizing the interpolation-based consistency. Extensive experimental results show that the proposed method can achieve significant improvements in terms of both face and fingerprint PAD in more complicated and hybrid datasets when compared with the state-of-the-art methods. When training in CASIA-FASD and Idiap Replay-Attack, the proposed method can achieve an 18.60% equal error rate (EER) in OULU-NPU and MSU-MFSD, exceeding the baseline performance by 9.54%. The source code of the proposed technique is available at https://github.com/kongzhecn/dfdm.
14. Huang J, Xiao N, Zhang L. Balancing Transferability and Discriminability for Unsupervised Domain Adaptation. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:5807-5814. [PMID: 36107892] [DOI: 10.1109/tnnls.2022.3201623]
Abstract
Unsupervised domain adaptation (UDA) aims to leverage a sufficiently labeled source domain to classify or represent the fully unlabeled target domain with a different distribution. Generally, the existing approaches try to learn a domain-invariant representation for feature transferability and add class discriminability constraints for feature discriminability. However, the feature transferability and discriminability are usually not synchronized, and there are even some contradictions between them, which is often ignored and, thus, reduces the accuracy of recognition. In this brief, we propose a deep multirepresentations adversarial learning (DMAL) method to explore and mitigate the inconsistency between feature transferability and discriminability in UDA task. Specifically, we consider feature representation learning at both the domain level and class level and explore four types of feature representations: domain-invariant, domain-specific, class-invariant, and class-specific. The first two types indicate the transferability of features, and the last two indicate the discriminability. We develop an adversarial learning strategy between the four representations to make the feature transferability and discriminability to be gradually synchronized. A series of experimental results verify that the proposed DMAL achieves comparable and promising results on six UDA datasets.
15. Liu X, Xing F, You J, Lu J, Kuo CCJ, Fakhri GE, Woo J. Subtype-Aware Dynamic Unsupervised Domain Adaptation. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:2820-2834. [PMID: 35895653] [DOI: 10.1109/tnnls.2022.3192315]
Abstract
Unsupervised domain adaptation (UDA) has been successfully applied to transfer knowledge from a labeled source domain to target domains without their labels. Recently introduced transferable prototypical networks (TPNs) further address class-wise conditional alignment. In TPN, while the closeness of class centers between source and target domains is explicitly enforced in a latent space, the underlying fine-grained subtype structure and the cross-domain within-class compactness have not been fully investigated. To counter this, we propose a new approach to adaptively perform a fine-grained subtype-aware alignment to improve the performance in the target domain without the subtype label in both domains. The insight of our approach is that the unlabeled subtypes in a class have the local proximity within a subtype while exhibiting disparate characteristics because of different conditional and label shifts. Specifically, we propose to simultaneously enforce subtype-wise compactness and class-wise separation, by utilizing intermediate pseudo-labels. In addition, we systematically investigate various scenarios with and without prior knowledge of subtype numbers and propose to exploit the underlying subtype structure. Furthermore, a dynamic queue framework is developed to evolve the subtype cluster centroids steadily using an alternative processing scheme. Experimental results, carried out with multiview congenital heart disease data and VisDA and DomainNet, show the effectiveness and validity of our subtype-aware UDA, compared with state-of-the-art UDA methods.
16. Han H, Liu H, Qiao J. Data-Knowledge-Driven Self-Organizing Fuzzy Neural Network. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:2081-2093. [PMID: 35802545] [DOI: 10.1109/tnnls.2022.3186671]
Abstract
Fuzzy neural networks (FNNs) hold the advantages of knowledge leveraging and adaptive learning, which have been widely used in nonlinear system modeling. However, it is difficult for FNNs to obtain the appropriate structure in the situation of insufficient data, which limits its generalization performance. To solve this problem, a data-knowledge-driven self-organizing FNN (DK-SOFNN) with a structure compensation strategy and a parameter reinforcement mechanism is proposed in this article. First, a structure compensation strategy is proposed to mine structural information from empirical knowledge to learn the structure of DK-SOFNN. Then, a complete model structure can be acquired by sufficient structural information. Second, a parameter reinforcement mechanism is developed to determine the parameter evolution direction of DK-SOFNN that is most suitable for the current model structure. Then, a robust model can be obtained by the interaction between parameters and dynamic structure. Finally, the proposed DK-SOFNN is theoretically analyzed on the fixed structure case and dynamic structure case. Then, the convergence conditions can be obtained to guide practical applications. The merits of DK-SOFNN are demonstrated by some benchmark problems and industrial applications.
17. Sha X, Sun Z, Zhang J, Ong YS. Who Wants to Shop With You: Joint Product-Participant Recommendation for Group-Buying Service. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:2353-2363. [PMID: 35853062] [DOI: 10.1109/tnnls.2022.3190003]
Abstract
Recent years have witnessed the great success of group buying (GB) in social e-commerce, opening up a new way of online shopping. In this business model, a user can launch a GB as an initiator to share her interested product with social friends. The GB is clinched once enough friends join in as participants to copurchase the shared product. As such, a successful GB depends on not only whether the initiator can find her interested product but also whether the friends are willing to join in as participants. Most existing recommenders are incompetent in such complex scenario, as they merely seek to help users find their preferred products and cannot help identify potential participants to join in a GB. To this end, we propose a novel joint product-participant recommendation (J2PRec) framework, which recommends both candidate products and participants for maximizing the success rate of a GB. Specifically, J2PRec first designs a relational graph embedding module, which effectively encodes the various relations in GB for learning enhanced user and product embeddings. It then jointly learns the product and participant recommendation tasks under a probabilistic framework to maximize the GB likelihood, i.e., boost the success rate of a GB. Extensive experiments on three real-world datasets demonstrate the superiority of J2PRec for GB recommendation.
18. Huang S, Wang T, Xiong H, Wen B, Huan J, Dou D. Temporal Output Discrepancy for Loss Estimation-Based Active Learning. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:2109-2123. [PMID: 35853066] [DOI: 10.1109/tnnls.2022.3186855]
Abstract
While deep learning succeeds in a wide range of tasks, it highly depends on the massive collection of annotated data, which is expensive and time-consuming. To lower the cost of data annotation, active learning has been proposed to interactively query an oracle to annotate a small proportion of informative samples in an unlabeled dataset. Inspired by the fact that samples with higher loss are usually more informative to the model than samples with lower loss, in this article we present a novel deep active learning approach that queries the oracle for data annotation when an unlabeled sample is believed to incorporate high loss. The core of our approach is a measurement, temporal output discrepancy (TOD), that estimates the sample loss by evaluating the discrepancy of outputs given by models at different optimization steps. Our theoretical investigation shows that TOD lower-bounds the accumulated sample loss, and thus it can be used to select informative unlabeled samples. On the basis of TOD, we further develop an effective unlabeled data sampling strategy as well as an unsupervised learning criterion for active learning. Due to the simplicity of TOD, our methods are efficient, flexible, and task-agnostic. Extensive experimental results demonstrate that our approach achieves superior performance over state-of-the-art active learning methods on image classification and semantic segmentation tasks. In addition, we show that TOD can be utilized to select the model with potentially the highest testing accuracy from a pool of candidate models.
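The core TOD measurement is easy to state in code: score each unlabeled sample by the distance between the outputs of the same network taken at two optimization steps, then query the highest-scoring samples. The PyTorch sketch below is a hedged illustration of that idea only; the paper's theoretical bound and unsupervised criterion are not reproduced, and the loader/model names are assumptions.

```python
import torch

@torch.no_grad()
def temporal_output_discrepancy(model_now, model_prev, unlabeled_loader, device="cpu"):
    """Per-sample L2 distance between outputs of the same network at two
    optimization steps; larger values are taken as a proxy for higher loss."""
    model_now.eval(); model_prev.eval()
    scores = []
    for x in unlabeled_loader:          # loader is assumed to yield input tensors only
        x = x.to(device)
        gap = (model_now(x) - model_prev(x)).pow(2).sum(dim=1).sqrt()
        scores.append(gap.cpu())
    return torch.cat(scores)

# Usage sketch: annotate the k samples with the largest discrepancy.
# scores = temporal_output_discrepancy(model, snapshot_of_model, pool_loader)
# query_indices = scores.topk(k=100).indices
```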
19. Zhou F, Qi X, Zhang K, Trajcevski G, Zhong T. MetaGeo: A General Framework for Social User Geolocation Identification With Few-Shot Learning. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:8950-8964. [PMID: 35259118] [DOI: 10.1109/tnnls.2022.3154204]
Abstract
Identifying the geolocation of social media users is an important problem in a wide range of applications, spanning from disease outbreaks, emergency detection, local event recommendation, to fake news localization, online marketing planning, and even crime control and prevention. Researchers have attempted to propose various models by combining different sources of information, including text, social relation, and contextual data, which indeed has achieved promising results. However, existing approaches still suffer from certain constraints, such as: 1) a very few samples are available and 2) prediction models are not easy to be generalized for users from new regions-which are challenges that motivate our study. In this article, we propose a general framework for identifying user geolocation-MetaGeo, which is a meta-learning-based approach, learning the prior distribution of the geolocation task in order to quickly adapt the prediction toward users from new locations. Different from typical meta-learning settings that only learn a new concept from few-shot samples, MetaGeo improves the geolocation prediction with conventional settings by ensembling numerous mini-tasks. In addition, MetaGeo incorporates probabilistic inference to alleviate two issues inherent in training with few samples: location uncertainty and task ambiguity. To demonstrate the effectiveness of MetaGeo, we conduct extensive experimental evaluations on three real-world datasets and compare the performance with several state-of-the-art benchmark models. The results demonstrate the superiority of MetaGeo in both the settings where the predicted locations/regions are known or have not been seen during training.
20. Chen S, Hong Z, Harandi M, Yang X. Domain Neural Adaptation. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:8630-8641. [PMID: 35259116] [DOI: 10.1109/tnnls.2022.3151683]
Abstract
Domain adaptation is concerned with the problem of generalizing a classification model to a target domain with little or no labeled data, by leveraging the abundant labeled data from a related source domain. The source and target domains possess different joint probability distributions, making it challenging for model generalization. In this article, we introduce domain neural adaptation (DNA): an approach that exploits a nonlinear deep neural network to 1) match the source and target joint distributions in the network activation space and 2) learn the classifier in an end-to-end manner. Specifically, we employ the relative chi-square divergence to compare the two joint distributions, and show that the divergence can be estimated by seeking the maximal value of a quadratic functional over the reproducing kernel Hilbert space. The analytic solution to this maximization problem enables us to explicitly express the divergence estimate as a function of the neural network mapping. We optimize the network parameters to minimize the estimated joint distribution divergence and the classification loss, yielding a classification model that generalizes well to the target domain. Empirical results on several visual datasets demonstrate that our solution is statistically better than its competitors.
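For reference, the (Pearson) chi-square divergence that the abstract builds on can be written as below; the paper works with a relative variant estimated by maximizing a quadratic functional over an RKHS, which is not reproduced here, and the density notation is our own.

```latex
\chi^{2}(P \,\|\, Q)
  \;=\; \int \frac{\bigl(p(x) - q(x)\bigr)^{2}}{q(x)}\,\mathrm{d}x
  \;=\; \mathbb{E}_{Q}\!\left[\Bigl(\tfrac{p(X)}{q(X)}\Bigr)^{2}\right] - 1,
```

where p and q denote the densities of the two joint distributions being compared.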
21. Han H, Liu H, Yang C, Qiao J. Transfer Learning Algorithm With Knowledge Division Level. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:8602-8616. [PMID: 35230958] [DOI: 10.1109/tnnls.2022.3151646]
Abstract
One of the major challenges for transfer learning algorithms is the domain drifting problem, where the knowledge of the source scene is inappropriate for the task of the target scene. To solve this problem, a transfer learning algorithm with knowledge division level (KDTL) is proposed to subdivide the knowledge of the source scene and leverage it according to different drifting degrees. The main properties of KDTL are threefold. First, a comparative evaluation mechanism is developed to detect and subdivide the knowledge into three kinds: ineffective knowledge, usable knowledge, and efficient knowledge. The ineffective and usable knowledge can then be identified to avoid the negative transfer problem. Second, an integrated framework is designed to prune the ineffective knowledge in the elastic layer, reconstruct the usable knowledge in the refined layer, and learn the efficient knowledge in the leveraged layer. The efficient knowledge can then be acquired to improve the learning performance. Third, the proposed KDTL is theoretically analyzed in different phases, and its convergence property, error bound, and computational complexity are provided to guide practical applications. Finally, the proposed KDTL is tested on several benchmark problems and some real problems. The experimental results demonstrate that KDTL achieves significant improvement over some state-of-the-art algorithms.
22. Long T, Sun Y, Gao J, Hu Y, Yin B. Domain Adaptation as Optimal Transport on Grassmann Manifolds. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:7196-7209. [PMID: 35061594] [DOI: 10.1109/tnnls.2021.3139119]
Abstract
Domain adaptation in the Euclidean space is a challenging task on which researchers have recently made great progress. However, in practice, there are rich data representations that are not Euclidean. For example, many high-dimensional data in computer vision are in general modeled by a low-dimensional manifold. This prompts the demand for exploring domain adaptation between non-Euclidean manifold spaces. This article is concerned with domain adaptation over the classic Grassmann manifolds. An optimal transport-based domain adaptation model on Grassmann manifolds is proposed. The model implements the adaptation between datasets by minimizing the Wasserstein distances between the projected source data and the target data on Grassmann manifolds. Four regularization terms are introduced to keep task-related consistency in the adaptation process. Furthermore, to reduce the computational cost, a simplified model preserving the necessary adaptation property, together with an efficient algorithm, is proposed and tested. The experiments on several publicly available datasets show that the proposed model outperforms several relevant baseline domain adaptation methods.
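Loosely, the objective described above can be written as a joint minimization over a Grassmann point (a d-dimensional subspace U of R^D) and an OT coupling; this is our own schematic reading of the abstract, with the four regularization terms abbreviated into a single placeholder.

```latex
\min_{U \in \mathrm{Gr}(d, D)} \;\; \min_{\pi \in \Pi(\mu_s, \mu_t)}
\int \bigl\lVert U^{\top} x_s - U^{\top} x_t \bigr\rVert^{2} \, \mathrm{d}\pi(x_s, x_t)
\;+\; \lambda \, \Omega(U, \pi),
```

where \mu_s and \mu_t are the empirical source and target distributions and \Omega collects the consistency regularizers.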
23. Wang F, Wan Y, Li Z, Qi F, Li J. A cross-subject decoding algorithm for patients with disorder of consciousness based on P300 brain computer interface. Front Neurosci 2023; 17:1167125. [PMID: 37547152] [PMCID: PMC10398338] [DOI: 10.3389/fnins.2023.1167125]
Abstract
Background: Brain-computer interface (BCI) technology, which directly connects the brain and external devices, may provide a new way of communication for some patients with disorder of consciousness (DOC). However, the EEG of DOC patients differs significantly from that of healthy individuals and is difficult to collect, so current decoding algorithms are trained on only a small amount of each patient's own data and perform poorly.
Methods: In this study, a decoding algorithm called WD-ADSTCN, based on domain adaptation, is proposed to improve P300 signal detection in DOC patients. We used the Wasserstein distance to filter data from the normal population to enlarge the training set. Furthermore, an adversarial approach is adopted to resolve the differences between the normal and patient data.
Results: The results showed that in cross-subject P300 detection for DOC patients, 7 of 11 patients achieved an average accuracy of over 70%. Furthermore, their clinical diagnoses changed and their CRS-R scores improved three months after the experiment.
Conclusion: These results demonstrate that the proposed method could be employed in a P300 BCI system for DOC patients, which has important implications for the clinical diagnosis and prognosis of these patients.
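The data-filtering step (choosing healthy-subject recordings whose feature distribution is closest to the patient's in Wasserstein distance) can be sketched with SciPy's 1-D Wasserstein distance. This is a hedged illustration of that step only; feature extraction, the classifier, and the adversarial alignment are assumed to happen elsewhere, and the function and variable names are our own.

```python
import numpy as np
from scipy.stats import wasserstein_distance

def select_source_subjects(patient_feat, healthy_feats, k=5):
    """Pick the k healthy subjects whose per-feature-averaged 1-D Wasserstein
    distance to the patient's feature distribution is smallest.
    patient_feat: (n_trials, n_features); healthy_feats: list of such arrays."""
    dists = []
    for subj in healthy_feats:
        per_feat = [wasserstein_distance(patient_feat[:, j], subj[:, j])
                    for j in range(patient_feat.shape[1])]
        dists.append(np.mean(per_feat))
    order = np.argsort(dists)
    return order[:k], np.asarray(dists)[order[:k]]

# Usage sketch: keep the closest healthy subjects as extra training data.
# idx, d = select_source_subjects(patient_features, healthy_feature_list, k=5)
```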
Affiliation(s)
- Fei Wang: School of Software, South China Normal University, Guangzhou, China; Pazhou Lab, Guangzhou, China
- Yinxing Wan: School of Software, South China Normal University, Guangzhou, China
- Zhuorong Li: School of Software, South China Normal University, Guangzhou, China
- Feifei Qi: Pazhou Lab, Guangzhou, China; School of Internet Finance and Information Engineering, Guangdong University of Finance, Guangzhou, China
- Jingcong Li: School of Software, South China Normal University, Guangzhou, China; Pazhou Lab, Guangzhou, China
24. Moradi M, Hamidzadeh J. A domain adaptation method by incorporating belief function in twin quarter-sphere SVM. Knowl Inf Syst 2023. [DOI: 10.1007/s10115-023-01857-y]
25. Wei P, Vo TV, Qu X, Ong YS, Ma Z. Transfer Kernel Learning for Multi-Source Transfer Gaussian Process Regression. IEEE Transactions on Pattern Analysis and Machine Intelligence 2023; 45:3862-3876. [PMID: 35727778] [DOI: 10.1109/tpami.2022.3184696]
Abstract
Multi-source transfer regression is a practical and challenging problem, where capturing the diverse relatedness of different domains is the key to adaptive knowledge transfer. In this article, we propose an effective way of explicitly modeling the domain relatedness of each domain pair through transfer kernel learning. Specifically, we first discuss the advantages and disadvantages of existing transfer kernels in handling the multi-source transfer regression problem. To cope with the limitations of the existing transfer kernels, we further propose a novel multi-source transfer kernel k_ms. The proposed k_ms assigns a learnable parametric coefficient to model the relatedness of each inter-domain pair, and simultaneously regulates the relatedness of each intra-domain pair to be 1. Moreover, to capture the heterogeneous data characteristics of multiple domains, k_ms exploits different standard kernels for different domain pairs. We further provide a theorem that not only guarantees the positive semi-definiteness of k_ms but also conveys a semantic interpretation of the learned domain relatedness. Moreover, the theorem can be easily used in the learning of the corresponding transfer Gaussian process model with k_ms. Extensive empirical studies show the effectiveness of our proposed method in domain relatedness modeling and transfer performance.
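Read literally, the kernel described above has roughly the following form (our own notation and a hedged reconstruction, not the paper's exact definition):

```latex
k_{ms}\bigl((x, s), (x', s')\bigr) \;=\; \lambda_{s s'} \, k_{s s'}(x, x'),
\qquad \lambda_{s s} = 1, \qquad \lambda_{s s'} \ \text{learnable for } s \neq s',
```

where s and s' index domains and each domain pair may use a different standard base kernel k_{ss'} (e.g., an RBF kernel with its own length scale).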
26. Yang J, Yang J, Wang S, Cao S, Zou H, Xie L. Advancing Imbalanced Domain Adaptation: Cluster-Level Discrepancy Minimization With a Comprehensive Benchmark. IEEE Transactions on Cybernetics 2023; 53:1106-1117. [PMID: 34398781] [DOI: 10.1109/tcyb.2021.3093888]
Abstract
Unsupervised domain adaptation methods have been proposed to tackle the problem of covariate shift by minimizing the distribution discrepancy between the feature embeddings of source domain and target domain. However, the standard evaluation protocols assume that the conditional label distributions of the two domains are invariant, which is usually not consistent with the real-world scenarios such as long-tailed distribution of visual categories. In this article, the imbalanced domain adaptation (IDA) is formulated for a more realistic scenario where both label shift and covariate shift occur between the two domains. Theoretically, when label shift exists, aligning the marginal distributions may result in negative transfer. Therefore, a novel cluster-level discrepancy minimization (CDM) is developed. CDM proposes cross-domain similarity learning to learn tight and discriminative clusters, which are utilized for both feature-level and distribution-level discrepancy minimization, palliating the negative effect of label shift during domain transfer. Theoretical justifications further demonstrate that CDM minimizes the target risk in a progressive manner. To corroborate the effectiveness of CDM, we propose two evaluation protocols according to the real-world situation and benchmark existing domain adaptation approaches. Extensive experiments demonstrate that negative transfer does occur due to label shift, while our approach achieves significant improvement on imbalanced datasets, including Office-31, Image-CLEF, and Office-Home.
27. Huang X, Zhou N, Huang J, Zhang H, Pedrycz W, Choi KS. Center transfer for supervised domain adaptation. Appl Intell 2023; 53:1-17. [PMID: 36718382] [PMCID: PMC9878501] [DOI: 10.1007/s10489-022-04414-2]
Abstract
Domain adaptation (DA) is a popular strategy for pattern recognition and classification tasks. It leverages a large amount of data from the source domain to help train the model applied in the target domain. Supervised domain adaptation (SDA) approaches are desirable when only few labeled samples from the target domain are available. They can be easily adopted in many real-world applications where data collection is expensive. In this study, we propose a new supervision signal, namely center transfer loss (CTL), to efficiently align features under the SDA setting in the deep learning (DL) field. Unlike most previous SDA methods that rely on pairing up training samples, the proposed loss is trainable only using one-stream input based on the mini-batch strategy. The CTL exhibits two main functionalities in training to increase the performance of DL models, i.e., domain alignment and increasing the feature's discriminative power. The hyper-parameter to balance these two functionalities is waived in CTL, which is the second improvement from the previous approaches. Extensive experiments completed on well-known public datasets show that the proposed method performs better than recent state-of-the-art approaches.
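A minimal PyTorch sketch of a center-transfer-style loss consistent with the description above: shared per-class centers pull features from either domain together, using only a single-stream mini-batch and no sample pairing. The exact CTL formulation in the paper may differ; this is an illustrative reconstruction.

```python
import torch
import torch.nn as nn

class CenterTransferLoss(nn.Module):
    """Shared per-class centers; features from source or target batches are pulled
    toward the center of their class, which both aligns the domains and keeps
    each class compact in the feature space."""
    def __init__(self, num_classes: int, feat_dim: int):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, feats: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        return ((feats - self.centers[labels]) ** 2).sum(dim=1).mean()

# Usage sketch inside a training step on a mixed source/target labeled mini-batch:
# loss = F.cross_entropy(logits, labels) + 0.01 * ctl(features, labels)
```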
Affiliation(s)
- Xiuyu Huang: Center for Smart Health, The Hong Kong Polytechnic University, Hong Kong SAR, 999077, China; Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB T6G 2R3, Canada
- Nan Zhou: School of Electronic Information and Electronic Engineering, Chengdu University, Chengdu, 610000, China
- Jian Huang: College of Control Engineering, Chengdu University of Information Technology, Chengdu, 610101, China
- Huaidong Zhang: School of Computer Science and Engineering, South China University of Technology, Guangzhou, 510000, China
- Witold Pedrycz: Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB T6G 2R3, Canada; Systems Research Institute, Polish Academy of Sciences, 00-901 Warsaw, Poland; Department of Electrical and Computer Engineering, Faculty of Engineering, King Abdulaziz University, Jeddah, 21589, Saudi Arabia; Faculty of Engineering and Natural Sciences, Department of Computer Engineering, Istinye University, Sariyer/Istanbul, Türkiye
- Kup-Sze Choi: Center for Smart Health, The Hong Kong Polytechnic University, Hong Kong SAR, 999077, China
Collapse
|
28
|
Lu Y, Wong WK, Zeng B, Lai Z, Li X. Guided Discrimination and Correlation Subspace Learning for Domain Adaptation. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2023; 32:2017-2032. [PMID: 37018080 DOI: 10.1109/tip.2023.3261758] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/19/2023]
Abstract
As a branch of transfer learning, domain adaptation leverages useful knowledge from a source domain for a target domain to solve target tasks. Most existing domain adaptation methods focus on how to diminish the conditional distribution shift and learn invariant features between different domains. However, two important factors are overlooked by most existing methods: 1) the transferred features should be not only domain invariant but also discriminative and correlated, and 2) negative transfer should be avoided as much as possible for the target tasks. To fully consider these factors, we propose a guided discrimination and correlation subspace learning (GDCSL) method for cross-domain image classification. GDCSL considers the domain-invariant, category-discriminative, and correlation learning of data. Specifically, GDCSL introduces the discriminative information associated with the source and target data by minimizing the intraclass scatter and maximizing the interclass distance. By designing a new correlation term, GDCSL extracts the most correlated features from the source and target domains for image classification. The global structure of the data can be preserved in GDCSL because the target samples are represented by the source samples. To avoid negative transfer, we use a sample reweighting method to detect target samples with different confidence levels. A semi-supervised extension of GDCSL (Semi-GDCSL) is also proposed, along with a novel label selection scheme to ensure the correctness of the target pseudo-labels. Comprehensive experiments are conducted on several cross-domain benchmarks, and the results verify the effectiveness of the proposed methods over state-of-the-art domain adaptation methods.
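The discriminative part of such subspace objectives is easy to sketch. The NumPy snippet below computes generic within-class and between-class scatter traces (minimize the first, maximize the second); it is a simplified stand-in, not GDCSL's full objective with its correlation term and sample reweighting.

```python
import numpy as np

def scatter_terms(X, y):
    """Generic discriminative terms used by many subspace DA methods:
    within-class scatter (to minimize) and between-class scatter (to maximize),
    both summarized as matrix traces."""
    classes = np.unique(y)
    mean_all = X.mean(axis=0)
    Sw = np.zeros((X.shape[1], X.shape[1]))
    Sb = np.zeros_like(Sw)
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)                       # intraclass scatter
        Sb += len(Xc) * np.outer(mc - mean_all, mc - mean_all)  # interclass scatter
    return np.trace(Sw), np.trace(Sb)
```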
Collapse
|
29
|
Zhan Q, Liu G, Xie X, Sun G, Tang H. Effective Transfer Learning Algorithm in Spiking Neural Networks. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:13323-13335. [PMID: 34270439 DOI: 10.1109/tcyb.2021.3079097] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
As the third generation of neural networks, spiking neural networks (SNNs) have recently gained much attention because of their high energy efficiency on neuromorphic hardware. However, as with traditional artificial neural networks (ANNs), training deep SNNs requires a large amount of labeled data that is expensive to obtain in real-world applications. To address this issue, transfer learning has been proposed and widely used for traditional ANNs, but it has seen limited use in SNNs. In this article, we propose an effective transfer learning framework for deep SNNs based on domain-invariant representations. Specifically, we analyze the suitability of centered kernel alignment (CKA) as a domain distance measure, relative to maximum mean discrepancy (MMD), in deep SNNs. In addition, we study feature transferability across different layers by testing on the Office-31, Office-Caltech-10, and PACS datasets. The experimental results demonstrate the transferability of SNNs and show the effectiveness of the proposed transfer learning framework using CKA in SNNs.
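Linear CKA itself has a compact closed form, which the following NumPy sketch implements as a domain-distance proxy; this is a generic implementation, not the SNN-specific code of the paper.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between two feature matrices of shape (n_samples, dim):
    values near 1 indicate highly similar representations."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    hsic = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, ord="fro")
    norm_y = np.linalg.norm(Y.T @ Y, ord="fro")
    return hsic / (norm_x * norm_y)

# toy usage: treat (1 - similarity) as a crude domain distance
src = np.random.randn(128, 64)
tgt = np.random.randn(128, 64)
distance = 1.0 - linear_cka(src, tgt)
```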
Collapse
|
30
|
Zhou L, Ye M, Zhang D, Zhu C, Ji L. Prototype-Based Multisource Domain Adaptation. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:5308-5320. [PMID: 33852394 DOI: 10.1109/tnnls.2021.3070085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
Unsupervised domain adaptation aims to transfer knowledge from a labeled source domain to an unlabeled target domain. Recently, multisource domain adaptation (MDA) has begun to attract attention; its performance should go beyond simply mixing all source domains together for knowledge transfer. In this article, we propose a novel prototype-based method for MDA. Specifically, because the target domain has no labels, we use prototypes to transfer semantic category information from the source domains to the target domain. First, a feature extraction network is applied to both source and target domains, and the extracted features are disentangled into domain-invariant and domain-specific features. Based on these two kinds of features, the inherent class prototypes and the domain prototypes are estimated, respectively, and a mapping from prototypes to the extracted feature space is learned through feature reconstruction. The class prototypes of all source and target domains can then be constructed in the extracted feature space from the domain prototypes and inherent class prototypes. By forcing the extracted features to be close to the corresponding class prototypes in all domains, the feature extraction network is progressively adjusted, and in the end the inherent class prototypes serve as the classifier in the target domain. Our contribution is that, through the inherent class prototypes and domain prototypes, semantic category information from the source domains is transferred to the target domain by constructing the corresponding class prototypes. In our method, all source and target domains are aligned twice at the feature level, yielding better domain-invariant features and features closer to their class prototypes, respectively. Several experiments on public datasets demonstrate the effectiveness of our method.
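The prototype-as-classifier idea can be sketched in a few lines of PyTorch: build per-class mean embeddings and label target samples by the nearest prototype. This assumes every class appears in the labeled data and omits the disentanglement and reconstruction steps of the paper.

```python
import torch
import torch.nn.functional as F

def class_prototypes(features, labels, num_classes):
    """Per-class mean embeddings ("prototypes"); assumes every class is present."""
    return torch.stack([features[labels == c].mean(dim=0) for c in range(num_classes)])

def prototype_predict(target_features, prototypes):
    """Label each target sample by its most cosine-similar prototype,
    i.e. the prototypes act as the target-domain classifier."""
    sims = F.normalize(target_features, dim=1) @ F.normalize(prototypes, dim=1).t()
    return sims.argmax(dim=1)
```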
Collapse
|
31
|
Xu F, Ma B, Chang H, Shan S. PRDP: Person Reidentification With Dirty and Poor Data. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:11014-11026. [PMID: 34473639 DOI: 10.1109/tcyb.2021.3105970] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
In this article, we propose a novel method to simultaneously solve the data problems of dirty quality and poor quantity for person reidentification (ReID). Dirty quality refers to wrong labels in image annotations. Poor quantity means that some identities have very few images (FewIDs). Training with these mislabeled data or FewIDs using a triplet loss leads to low generalization performance. To solve the label error problem, we propose a weighted label correction strategy based on cross-entropy (wLCCE). Specifically, according to the influence range of the wrong labels, we first classify the mislabeled images into point label errors and set label errors. We then propose a weighted triplet loss (WTL) to correct the two types of label error, respectively. To alleviate the poor quantity issue, we propose a feature simulation based on autoencoder (FSAE) method to generate virtual samples for FewIDs. For the authenticity of the simulated features, we transfer the difference pattern of identities with multiple images (MultIDs) to FewIDs by training an autoencoder (AE)-based simulator. In this way, the FewIDs obtain richer expressions that distinguish them from other identities. By dealing with the dirty and poor data problems, we can learn more robust ReID models using the triplet loss. We conduct extensive experiments on two public person ReID datasets, 1) Market-1501 and 2) DukeMTMC-reID, to verify the effectiveness of our approach.
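A per-triplet weighting of the standard triplet loss conveys the flavor of down-weighting suspect labels; the sketch below is a generic weighted triplet loss, not the paper's wLCCE/WTL formulation.

```python
import torch

def weighted_triplet_loss(anchor, positive, negative, weights, margin=0.3):
    """Triplet loss with per-triplet weights; giving low weight to triplets that
    are suspected of carrying label errors is one simple way to soften their
    influence on training."""
    d_ap = (anchor - positive).pow(2).sum(dim=1)
    d_an = (anchor - negative).pow(2).sum(dim=1)
    per_triplet = torch.clamp(d_ap - d_an + margin, min=0.0)
    return (weights * per_triplet).sum() / weights.sum().clamp(min=1e-8)
```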
Collapse
|
32
|
Unsupervised domain adaptation via discriminative feature learning and classifier adaptation from center-based distances. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.109022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]
|
33
|
TSTELM: Two-Stage Transfer Extreme Learning Machine for Unsupervised Domain Adaptation. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2022; 2022:1582624. [PMID: 35898785 PMCID: PMC9313952 DOI: 10.1155/2022/1582624] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 01/18/2022] [Revised: 06/21/2022] [Accepted: 06/23/2022] [Indexed: 11/26/2022]
Abstract
As a single-layer feedforward network (SLFN), the extreme learning machine (ELM) has been successfully applied to classification and regression in machine learning owing to its fast training speed and good generalization. However, it performs poorly in domain adaptation, where the distributions of the training and testing data are inconsistent. In this article, we propose a novel ELM called the two-stage transfer extreme learning machine (TSTELM) to solve this problem. At the statistical matching stage, we adopt maximum mean discrepancy (MMD) to narrow the output-layer distribution difference between domains. At the subspace alignment stage, we align the source and target model parameters, design a target cross-domain mean approximation, and add an output weight approximation to further promote knowledge transfer across domains. Moreover, the prediction for a test sample is jointly determined by the ELM parameters generated at the two stages. Finally, we investigate the proposed approach on classification tasks and conduct experiments on four public domain adaptation datasets. The results indicate that TSTELM effectively enhances the knowledge transfer ability of ELM, achieving higher accuracy than existing transfer and non-transfer classifiers.
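A plain ELM is a few lines of NumPy: random hidden weights plus a closed-form ridge solution for the output weights. The sketch below shows that base learner and, separately, a linear-kernel MMD statistic on hidden activations of two domains; how TSTELM folds the MMD penalty into the closed-form solution is not reproduced here, and all parameter values are illustrative.

```python
import numpy as np

def train_elm(X, Y, n_hidden=200, reg=1e-2, seed=0):
    """Plain ELM: random input weights, sigmoid hidden layer, ridge-regression
    output weights (Y is expected to be one-hot encoded)."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))          # hidden activations
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ Y)
    return W, b, beta

def output_layer_mmd(Hs, Ht):
    """Squared MMD with a linear kernel: distance between mean hidden embeddings."""
    diff = Hs.mean(axis=0) - Ht.mean(axis=0)
    return float(diff @ diff)
```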
Collapse
|
34
|
Ragab M, Eldele E, Chen Z, Wu M, Kwoh CK, Li X. Self-Supervised Autoregressive Domain Adaptation for Time Series Data. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; PP:1341-1351. [PMID: 35737606 DOI: 10.1109/tnnls.2022.3183252] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
Unsupervised domain adaptation (UDA) has successfully addressed the domain shift problem for visual applications. Yet, these approaches may have limited performance on time series data for the following reasons. First, they mainly rely on the large-scale dataset (i.e., ImageNet) for source pretraining, which is not applicable to time series data. Second, they ignore the temporal dimension of the feature space of the source and target domains during the domain alignment step. Finally, most prior UDA methods can only align global features without considering the fine-grained class distribution of the target domain. To address these limitations, we propose a SeLf-supervised AutoRegressive Domain Adaptation (SLARDA) framework. In particular, we first design a self-supervised learning module that uses forecasting as an auxiliary task to improve the transferability of source features. Second, we propose a novel autoregressive domain adaptation technique that incorporates the temporal dependence of both source and target features during domain alignment. Finally, we develop an ensemble teacher model to align the class-wise distribution in the target domain via a confident pseudo-labeling approach. Extensive experiments have been conducted on three real-world time series applications with 30 cross-domain scenarios. The results demonstrate that our proposed SLARDA method significantly outperforms state-of-the-art approaches for time series domain adaptation. Our source code is available at: https://github.com/mohamedr002/SLARDA.
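The confident pseudo-labeling step can be illustrated with a thresholded teacher prediction, as in the PyTorch sketch below; the threshold value and the single-teacher setup are assumptions, and SLARDA's ensemble teacher and autoregressive alignment are not shown.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def confident_pseudo_labels(teacher, target_batch, threshold=0.9):
    """Keep only target samples whose teacher prediction is confident enough;
    the retained pairs can then drive class-wise alignment in the target domain."""
    probs = F.softmax(teacher(target_batch), dim=1)
    conf, labels = probs.max(dim=1)
    keep = conf >= threshold
    return target_batch[keep], labels[keep], keep
```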
Collapse
|
35
|
Kong L, Hu B, Liu X, Lu J, You J, Liu X. Constraining pseudo-label in self-training unsupervised domain adaptation with energy-based model. INT J INTELL SYST 2022. [DOI: 10.1002/int.22930] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
Affiliation(s)
- Lingsheng Kong: Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun, China
- Bo Hu: Department of Accounting, National University of Singapore, Singapore
- Xiongchang Liu: Department of Information and Electrical Engineering, China University of Mining and Technology, Beijing, China
- Jun Lu: Beth Israel Deaconess Medical Center and Harvard Medical School, Boston, MA, USA
- Jane You: Department of Computing, The Hong Kong Polytechnic University, Hong Kong, China
- Xiaofeng Liu: Gordon Center for Medical Imaging, Harvard University, Cambridge, MA, USA
Collapse
|
36
|
Artificial Intelligence-Based Prediction of Oroantral Communication after Tooth Extraction Utilizing Preoperative Panoramic Radiography. Diagnostics (Basel) 2022; 12:diagnostics12061406. [PMID: 35741216 PMCID: PMC9221677 DOI: 10.3390/diagnostics12061406] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/29/2022] [Revised: 06/02/2022] [Accepted: 06/04/2022] [Indexed: 02/01/2023] Open
Abstract
Oroantral communication (OAC) is a common complication after extraction of upper molars. Thorough preoperative panoramic radiography analysis might potentially help predict OAC following tooth extraction. In this exploratory study, we evaluated n = 300 consecutive cases (100 OAC and 200 controls) and trained five deep learning models (VGG16, InceptionV3, MobileNetV2, EfficientNet, and ResNet50) to predict OAC versus non-OAC (binary classification task) from the input images. Further, four oral and maxillofacial experts evaluated the respective panoramic radiographs, and performance metrics (accuracy, area under the curve (AUC), precision, recall, F1-score, and receiver operating characteristic curve) were determined for all diagnostic approaches. Cohen's kappa was used to evaluate the agreement between expert evaluations. The deep learning algorithms reached high specificity (highest specificity 100% for InceptionV3) but low sensitivity (highest sensitivity 42.86% for MobileNetV2). The AUCs for VGG16, InceptionV3, MobileNetV2, EfficientNet, and ResNet50 were 0.53, 0.60, 0.67, 0.51, and 0.56, respectively. Experts 1-4 reached AUCs of 0.550, 0.629, 0.500, and 0.579, respectively. The specificity of the expert evaluations ranged from 51.74% to 95.02%, whereas sensitivity ranged from 14.14% to 59.60%. Cohen's kappa revealed poor agreement among the expert evaluations (kappa = 0.1285). Overall, the present data indicate that OAC cannot be sufficiently predicted from preoperative panoramic radiography. The false-negative rate, i.e., the rate of positive cases (OAC) missed by the deep learning algorithms, ranged from 57.14% to 95.24%. Surgeons should not rely solely on panoramic radiography when evaluating the probability of OAC occurrence. Clinical testing for OAC is warranted after each upper-molar tooth extraction.
Collapse
|
37
|
Simplifying Text Mining Activities: Scalable and Self-Tuning Methodology for Topic Detection and Characterization. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12105125] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/05/2023]
Abstract
In recent years, the number and heterogeneity of large scientific datasets have been growing steadily. Moreover, the analysis of these data collections is not a trivial task. There are many algorithms capable of analyzing large datasets, but parameters need to be set for each of them. Moreover, larger datasets also mean greater complexity. All this leads to the need to develop innovative, scalable, and parameter-free solutions. The goal of this research activity is to design and develop an automated data analysis engine that effectively and efficiently analyzes large collections of text data with minimal user intervention. Both parameter-free algorithms and self-assessment strategies have been proposed to suggest algorithms and specific parameter values for each step that characterizes the analysis pipeline. The proposed solutions have been tailored to text corpora characterized by variable term distributions and different document lengths. In particular, a new engine called ESCAPE (enhanced self-tuning characterization of document collections after parameter evaluation) has been designed and developed. ESCAPE integrates two different solutions for document clustering and topic modeling: the joint approach and the probabilistic approach. Both methods include ad hoc self-optimization strategies to configure the specific algorithm parameters. Moreover, novel visualization techniques and quality metrics have been integrated to analyze the performances of both approaches and to help domain experts interpret the discovered knowledge. Both approaches are able to correctly identify meaningful partitions of a given document corpus by grouping them according to topics.
Collapse
|
38
|
Long T, Sun Y, Gao J, Hu Y, Yin B. Video Domain Adaptation based on Optimal Transport in Grassmann Manifolds. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.01.044] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/05/2022]
|
39
|
Kim HE, Cosa-Linan A, Santhanam N, Jannesari M, Maros ME, Ganslandt T. Transfer learning for medical image classification: a literature review. BMC Med Imaging 2022; 22:69. [PMID: 35418051 PMCID: PMC9007400 DOI: 10.1186/s12880-022-00793-7] [Citation(s) in RCA: 183] [Impact Index Per Article: 61.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/25/2021] [Accepted: 03/30/2022] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND Transfer learning (TL) with convolutional neural networks aims to improve performance on a new task by leveraging knowledge of similar tasks learned in advance. It has made a major contribution to medical image analysis, as it overcomes the data scarcity problem and saves time and hardware resources. However, transfer learning has been configured arbitrarily in the majority of studies. This review paper attempts to provide guidance for selecting a model and TL approach for the medical image classification task. METHODS 425 peer-reviewed articles were retrieved from two databases, PubMed and Web of Science, published in English, up until December 31, 2020. Articles were assessed by two independent reviewers, with the aid of a third reviewer in the case of discrepancies. We followed the PRISMA guidelines for paper selection, and 121 studies were regarded as eligible for the scope of this review. We investigated articles focused on selecting backbone models and TL approaches, including feature extractor, feature extractor hybrid, fine-tuning, and fine-tuning from scratch. RESULTS The majority of studies (n = 57) empirically evaluated multiple models, followed by deep models (n = 33) and shallow models (n = 24). Inception, one of the deep models, was the most employed in the literature (n = 26). With respect to TL, the majority of studies (n = 46) empirically benchmarked multiple approaches to identify the optimal configuration. The remaining studies applied only a single approach, of which feature extractor (n = 38) and fine-tuning from scratch (n = 27) were the two most favored. Only a few studies applied feature extractor hybrid (n = 7) or fine-tuning (n = 3) with pretrained models. CONCLUSION The investigated studies demonstrated the efficacy of transfer learning despite data scarcity. We encourage data scientists and practitioners to use deep models (e.g., ResNet or Inception) as feature extractors, which can save computational costs and time without degrading predictive power.
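The "feature extractor" recipe recommended in the conclusion is straightforward to set up in PyTorch/torchvision: freeze a pretrained backbone and train only a new classification head. The weights argument below follows recent torchvision releases (older versions use pretrained=True instead), so treat this as a version-dependent sketch rather than the review's own code.

```python
import torch.nn as nn
from torchvision import models

def build_feature_extractor(num_classes):
    """Feature-extractor transfer learning: keep the ImageNet-pretrained backbone
    frozen and train only a freshly initialized classification head."""
    backbone = models.resnet50(weights="IMAGENET1K_V1")   # string form needs torchvision >= 0.13
    for p in backbone.parameters():
        p.requires_grad = False                           # freeze pretrained features
    backbone.fc = nn.Linear(backbone.fc.in_features, num_classes)  # new trainable head
    return backbone
```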
Collapse
Affiliation(s)
- Hee E Kim: Department of Biomedical Informatics at the Center for Preventive Medicine and Digital Health (CPD-BW), Medical Faculty Mannheim, Heidelberg University, Theodor-Kutzer-Ufer 1-3, 68167 Mannheim, Germany
- Alejandro Cosa-Linan: Department of Biomedical Informatics at the Center for Preventive Medicine and Digital Health (CPD-BW), Medical Faculty Mannheim, Heidelberg University, Theodor-Kutzer-Ufer 1-3, 68167 Mannheim, Germany
- Nandhini Santhanam: Department of Biomedical Informatics at the Center for Preventive Medicine and Digital Health (CPD-BW), Medical Faculty Mannheim, Heidelberg University, Theodor-Kutzer-Ufer 1-3, 68167 Mannheim, Germany
- Mahboubeh Jannesari: Department of Biomedical Informatics at the Center for Preventive Medicine and Digital Health (CPD-BW), Medical Faculty Mannheim, Heidelberg University, Theodor-Kutzer-Ufer 1-3, 68167 Mannheim, Germany
- Mate E Maros: Department of Biomedical Informatics at the Center for Preventive Medicine and Digital Health (CPD-BW), Medical Faculty Mannheim, Heidelberg University, Theodor-Kutzer-Ufer 1-3, 68167 Mannheim, Germany
- Thomas Ganslandt: Department of Biomedical Informatics at the Center for Preventive Medicine and Digital Health (CPD-BW), Medical Faculty Mannheim, Heidelberg University, Theodor-Kutzer-Ufer 1-3, 68167 Mannheim, Germany; Chair of Medical Informatics, Friedrich-Alexander-Universität Erlangen-Nürnberg, Wetterkreuz 15, 91058 Erlangen, Germany
Collapse
|
40
|
Dong Y, Liu Q, Du B, Zhang L. Weighted Feature Fusion of Convolutional Neural Network and Graph Attention Network for Hyperspectral Image Classification. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2022; 31:1559-1572. [PMID: 35077363 DOI: 10.1109/tip.2022.3144017] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
Convolutional neural networks (CNNs) and graph neural networks (GNNs), such as graph attention networks (GATs), are two classic neural network models applied to grid data and graph data, respectively. Both have achieved outstanding performance in the field of hyperspectral image (HSI) classification and have attracted great interest. However, CNNs face the problem of small samples, and GNNs incur a huge computational cost, which restricts the performance of both models. In this paper, we propose a weighted feature fusion of convolutional neural network and graph attention network (WFCG) for HSI classification, exploiting the characteristics of the superpixel-based GAT and the pixel-based CNN, which prove to be complementary. We first establish the GAT with the help of superpixel-based encoder and decoder modules. We then construct the CNN branch in combination with an attention mechanism. Finally, the features of the two neural network models are fused with learned weights. Rigorous experiments on three real-world HSI datasets show that WFCG can fully explore the high-dimensional features of HSIs and obtain results competitive with other state-of-the-art methods.
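A weighted fusion of two branch outputs can be as simple as a pair of learnable softmax weights, as in the PyTorch sketch below; this is a simplified stand-in for the fusion step, not the full WFCG architecture.

```python
import torch
import torch.nn as nn

class WeightedFusion(nn.Module):
    """Fuse two feature tensors of the same shape with learnable softmax weights,
    in the spirit of combining a CNN branch and a GAT branch."""
    def __init__(self):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(2))   # start as an equal-weight average

    def forward(self, feat_cnn, feat_gat):
        w = torch.softmax(self.logits, dim=0)
        return w[0] * feat_cnn + w[1] * feat_gat
```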
Collapse
|
41
|
Zheng Z, Yang J, Yu Z, Wang Y, Sun Z, Zheng B. Not every sample is efficient: Analogical generative adversarial network for unpaired image-to-image translation. Neural Netw 2022; 148:166-175. [PMID: 35144150 DOI: 10.1016/j.neunet.2022.01.013] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/30/2021] [Revised: 11/22/2021] [Accepted: 01/20/2022] [Indexed: 10/19/2022]
Abstract
Image translation aims to learn an effective mapping function that converts an image from a source domain to a target domain. With the proposal and further development of generative adversarial networks (GANs), generative models have achieved great breakthroughs. Image-to-image (I2I) translation methods mainly fall into two categories: paired and unpaired. Paired methods usually require a large number of input-output sample pairs to perform one-sided image translation, which heavily limits their practicability. To address the lack of paired samples, CycleGAN and its extensions utilize the cycle-consistency loss to provide an elegant and generic solution for unpaired I2I translation between two domains. This thread of dual-learning-based methods usually adopts a random sampling strategy for optimization and does not consider the content similarity between samples. However, not every sample is effective for the desired optimization or leads to optimal convergence. Inspired by analogical learning, which exploits the relationships and similarities between sample observations, we propose a novel generic metric-based sampling strategy to effectively select samples from different domains for training. In addition, we introduce a novel analogical adversarial loss to force the model to learn from the effective samples and alleviate the influence of negative samples. Experimental results on various vision tasks demonstrate the superior performance of the proposed method. The proposed method is also a generic framework that can be easily extended to other I2I translation methods and yields a performance gain.
Collapse
Affiliation(s)
- Ziqiang Zheng: Ocean University of China / Sanya Oceanographic Institution, Ocean University of China, No. 238 Songling Road, Qingdao/Sanya, Shandong/Hainan, China
- Jie Yang: University of Electronic Science and Technology of China, Chengdu, Sichuan, China
- Zhibin Yu: Ocean University of China / Sanya Oceanographic Institution, Ocean University of China, No. 238 Songling Road, Qingdao/Sanya, Shandong/Hainan, China
- Yubo Wang: School of Life Science and Technology, Xidian University, Xi'an, Shaanxi, China
- Zhijian Sun: Crossocean of Suzhou Technology, No. 218 East Qingdao Road, Suzhou, China; Key Laboratory of System Control and Information Processing, Shanghai Jiao Tong University, Shanghai, China
- Bing Zheng: Ocean University of China / Sanya Oceanographic Institution, Ocean University of China, No. 238 Songling Road, Qingdao/Sanya, Shandong/Hainan, China
Collapse
|
42
|
Chen S, Harandi M, Jin X, Yang X. Semi-Supervised Domain Adaptation via Asymmetric Joint Distribution Matching. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2021; 32:5708-5722. [PMID: 33055040 DOI: 10.1109/tnnls.2020.3027364] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
An intrinsic problem in domain adaptation is the joint distribution mismatch between the source and target domains. Therefore, it is crucial to match the two joint distributions such that the source domain knowledge can be properly transferred to the target domain. Unfortunately, in semi-supervised domain adaptation (SSDA) this problem remains unsolved. In this article, we therefore present an asymmetric joint distribution matching (AJDM) approach, which seeks a pair of asymmetric matrices to linearly match the source and target joint distributions under the relative chi-square divergence. Specifically, we introduce a least squares method to estimate the divergence, which is free from estimating the two joint distributions. Furthermore, we show that our AJDM approach can be generalized to a kernel version, enabling it to handle nonlinearity in the data. From the perspective of Riemannian geometry, learning the linear and nonlinear mappings is formulated as optimization problems defined on the product of Riemannian manifolds. Numerical experiments on synthetic and real-world datasets demonstrate the effectiveness of the proposed approach and testify to its superiority over existing SSDA techniques.
Collapse
|
43
|
Hedegaard L, Sheikh-Omar OA, Iosifidis A. Supervised Domain Adaptation: A Graph Embedding Perspective and a Rectified Experimental Protocol. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2021; 30:8619-8631. [PMID: 34648445 DOI: 10.1109/tip.2021.3118978] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
Domain Adaptation is the process of alleviating distribution gaps between data from different domains. In this paper, we show that Domain Adaptation methods using pair-wise relationships between source and target domain data can be formulated as a Graph Embedding in which the domain labels are incorporated into the structure of the intrinsic and penalty graphs. Specifically, we analyse the loss functions of three existing state-of-the-art Supervised Domain Adaptation methods and demonstrate that they perform Graph Embedding. Moreover, we highlight some generalisation and reproducibility issues related to the experimental setup commonly used to demonstrate the few-shot learning capabilities of these methods. To assess and compare Supervised Domain Adaptation methods accurately, we propose a rectified evaluation protocol, and report updated benchmarks on the standard datasets Office31 (Amazon, DSLR, and Webcam), Digits (MNIST, USPS, SVHN, and MNIST-M) and VisDA (Synthetic, Real).
Collapse
|
44
|
He Q, Dai Q, Wu X, He JY. A novel class restriction loss for unsupervised domain adaptation. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.07.033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
|
45
|
Jin Q, Cui H, Sun C, Meng Z, Wei L, Su R. Domain adaptation based self-correction model for COVID-19 infection segmentation in CT images. EXPERT SYSTEMS WITH APPLICATIONS 2021; 176:114848. [PMID: 33746369 PMCID: PMC7954643 DOI: 10.1016/j.eswa.2021.114848] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/30/2020] [Revised: 01/29/2021] [Accepted: 03/02/2021] [Indexed: 05/03/2023]
Abstract
The capability to generalize to unseen domains is crucial for deep learning models in real-world scenarios. However, currently available medical image datasets, such as those of COVID-19 CT images, exhibit large variations in infections and domain shift problems. To address this issue, we propose a prior-knowledge-driven domain adaptation and a dual-domain enhanced self-correction learning scheme. Based on this learning scheme, a domain adaptation based self-correction model (DASC-Net) is proposed for COVID-19 infection segmentation on CT images. DASC-Net consists of a novel attention and feature domain enhanced domain adaptation model (AFD-DA) to handle domain shifts and a self-correction learning process to refine segmentation results. The innovations in AFD-DA include an image-level activation feature extractor with attention to lung abnormalities and a multi-level discrimination module for hierarchical feature domain alignment. The proposed self-correction learning process adaptively aggregates the learned model and the corresponding pseudo labels to propagate aligned source and target domain information, alleviating overfitting to noise caused by pseudo labels. Extensive experiments on three publicly available COVID-19 CT datasets demonstrate that DASC-Net consistently outperforms state-of-the-art segmentation, domain shift, and coronavirus infection segmentation methods. Ablation analysis further shows the effectiveness of the major components of our model. DASC-Net enriches the theory of domain adaptation and self-correction learning in medical imaging and can be generalized to multi-site COVID-19 infection segmentation on CT images for clinical deployment.
Collapse
Affiliation(s)
- Qiangguo Jin: School of Computer Software, College of Intelligence and Computing, Tianjin University, Tianjin, China; CSIRO Data61, Sydney, Australia
- Hui Cui: Department of Computer Science and Information Technology, La Trobe University, Melbourne, Australia
- Zhaopeng Meng: School of Computer Software, College of Intelligence and Computing, Tianjin University, Tianjin, China; Tianjin University of Traditional Chinese Medicine, Tianjin, China
- Leyi Wei: School of Software, Shandong University, Shandong, China
- Ran Su: School of Computer Software, College of Intelligence and Computing, Tianjin University, Tianjin, China
Collapse
|
46
|
Yuan C, Yang L. Capped L2,p-norm metric based robust least squares twin support vector machine for pattern classification. Neural Netw 2021; 142:457-478. [PMID: 34273616 DOI: 10.1016/j.neunet.2021.06.028] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/03/2020] [Revised: 06/25/2021] [Accepted: 06/29/2021] [Indexed: 11/27/2022]
Abstract
Least squares twin support vector machine (LSTSVM) is an effective and efficient learning algorithm for pattern classification. However, the distance in LSTSVM is measured by the squared L2-norm metric, which may magnify the influence of outliers. In this paper, a novel robust least squares twin support vector machine framework, termed CL2,p-LSTSVM, is proposed for binary classification; it utilizes a capped L2,p-norm distance metric to reduce the influence of noise and outliers. The goal of CL2,p-LSTSVM is to minimize the capped L2,p-norm intra-class distance dispersion and eliminate the influence of outliers during training, where the value of the metric is controlled by the capped parameter, which ensures better robustness. The proposed metric includes and extends the traditional metrics by setting appropriate values of p and the capped parameter. This strategy not only retains the advantages of LSTSVM but also improves robustness when solving binary classification problems with outliers. However, the nonconvexity of the metric makes it difficult to optimize. We design an effective iterative algorithm to solve CL2,p-LSTSVM; in each iteration, two systems of linear equations are solved. We also present insightful analyses of the computational complexity and convergence of the algorithm. Moreover, we extend CL2,p-LSTSVM to a nonlinear classifier and to semi-supervised classification. Experiments are conducted on artificial datasets, UCI benchmark datasets, and image datasets to evaluate our method. Under different noise settings and different evaluation criteria, the experimental results show that CL2,p-LSTSVM is more robust than state-of-the-art approaches in most cases, which demonstrates the feasibility and effectiveness of the proposed method.
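The capped L2,p metric itself is a one-liner, shown below in NumPy: distances are raised to the power p and clipped at a cap so that no single outlier can dominate the objective. The parameter values are illustrative, not the paper's settings.

```python
import numpy as np

def capped_l2p_distance(X, center, p=1.0, cap=10.0):
    """Capped L2,p distance of each row of X to a class center:
    min(||x - c||_2^p, cap). The cap bounds the influence of outliers,
    and p < 2 further softens large residuals."""
    d = np.linalg.norm(X - center, axis=1) ** p
    return np.minimum(d, cap)
```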
Collapse
Affiliation(s)
- Chao Yuan: College of Information and Electrical Engineering, China Agricultural University, Beijing, Haidian, 100083, China
- Liming Yang: College of Science, China Agricultural University, Beijing, Haidian, 100083, China
Collapse
|
48
|
Yang T, Tang X, Liu R. Dual temporal gated multi-graph convolution network for taxi demand prediction. Neural Comput Appl 2021. [DOI: 10.1007/s00521-021-06092-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]
Abstract
Taxi demand prediction is essential for building efficient transportation systems in smart cities. It helps to properly allocate vehicles, ease traffic pressure, and improve passengers' experience. Traditional taxi demand prediction methods mostly rely on time-series forecasting techniques, which cannot model the nonlinearity embedded in the data. Recent studies have started to incorporate Euclidean spatial features through grid-based methods. By considering the spatial correlations among different regions, such methods capture how temporal events affect regions with adjacent links or intersections, improving prediction precision. Graph-based models have also been proposed to encode non-Euclidean correlations. However, the temporal periodicity of the data is often overlooked, and the study units are usually constructed as oversimplified grids. In this paper, we define places with specific semantic and humanistic experiences as study units, using a fuzzy set method based on adaptive kernel density estimation. We then introduce a dual temporal gated multi-graph convolution network to predict future taxi demand. Specifically, multi-graph convolution is used to model spatial correlations with graphs, including the neighborhood, functional similarities, and landscape similarities based on street view images. For temporal dependency modeling, we design dual temporal gated branches to capture information hidden in both recent and periodic observations. Experiments on two real-world datasets show the effectiveness of our model over the baselines.
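The multi-graph convolution component can be sketched as a learnable mixture of per-graph propagations, as in the PyTorch module below; the temporal gating branches of the paper are omitted, and the module structure is an assumption for illustration.

```python
import torch
import torch.nn as nn

class MultiGraphConv(nn.Module):
    """One multi-graph convolution step: aggregate node features over several
    adjacency matrices (neighborhood, functional similarity, ...) and combine
    the results with learnable mixing weights."""
    def __init__(self, num_graphs, in_dim, out_dim):
        super().__init__()
        self.linears = nn.ModuleList([nn.Linear(in_dim, out_dim) for _ in range(num_graphs)])
        self.mix = nn.Parameter(torch.ones(num_graphs) / num_graphs)

    def forward(self, adjs, x):
        # adjs: list of (N, N) row-normalized adjacency matrices; x: (N, in_dim)
        out = 0
        for w, adj, lin in zip(self.mix, adjs, self.linears):
            out = out + w * (adj @ lin(x))
        return torch.relu(out)
```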
Collapse
|
49
|
Guo N, Gu K, Qiao J, Bi J. Improved deep CNNs based on Nonlinear Hybrid Attention Module for image classification. Neural Netw 2021; 140:158-166. [PMID: 33765531 DOI: 10.1016/j.neunet.2021.01.005] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2020] [Revised: 10/18/2020] [Accepted: 01/07/2021] [Indexed: 11/27/2022]
Abstract
Recent years have witnessed numerous successful applications of incorporating attention modules into feed-forward convolutional neural networks. Along this line of research, we design a novel lightweight general-purpose attention module that takes channel attention and spatial attention into consideration simultaneously. Specifically, inspired by the characteristics of channel attention and spatial attention, a nonlinear hybrid method is proposed to combine these two types of attention feature maps, which is highly beneficial for network fine-tuning. Further, the parameters of each attention branch are adjustable, making the attention module more flexible and adaptable. From another point of view, we found that the currently popular SE and CBAM modules are actually two particular cases of our proposed attention module. We also explore the latest attention module, ADCM. To validate the module, we conduct experiments on the CIFAR-10, CIFAR-100, and Fashion-MNIST datasets. Results show that, after integrating our attention module, existing networks tend to train more efficiently and perform better compared with state-of-the-art competitors. Also, it is worth stressing two points: (1) our attention module can be used in existing state-of-the-art deep architectures and achieves better performance at a small computational cost; (2) the module can be added to existing deep architectures simply by stacking network blocks integrated with our module.
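One way to read a nonlinear hybrid of channel and spatial attention is a learnable mixing of an SE-style channel map and a convolutional spatial map, as in the PyTorch sketch below; the specific mixing and gating choices are illustrative assumptions, not the paper's exact module.

```python
import torch
import torch.nn as nn

class HybridAttention(nn.Module):
    """Channel attention and spatial attention combined through a learnable,
    nonlinear mix; SE (channel-only) and CBAM (sequential) can be seen as
    special cases of such a combination."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.channel = nn.Sequential(                 # squeeze-and-excitation style branch
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        self.spatial = nn.Sequential(                 # single-channel spatial map
            nn.Conv2d(channels, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )
        self.alpha = nn.Parameter(torch.tensor(0.5))  # learnable mixing coefficient

    def forward(self, x):
        ca = self.channel(x)                          # (B, C, 1, 1)
        sa = self.spatial(x)                          # (B, 1, H, W)
        attn = self.alpha * ca + (1 - self.alpha) * sa   # broadcasts to (B, C, H, W)
        return x * torch.sigmoid(attn)                # nonlinear hybrid gating
```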
Collapse
Affiliation(s)
- Nan Guo: Beijing Key Laboratory of Computational Intelligence and Intelligent System, Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Ke Gu: Beijing Key Laboratory of Computational Intelligence and Intelligent System, Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Junfei Qiao: Beijing Key Laboratory of Computational Intelligence and Intelligent System, Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Jing Bi: Beijing Key Laboratory of Computational Intelligence and Intelligent System, Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
Collapse
|
50
|
Shao J, Du B, Wu C, Gong M, Liu T. HRSiam: High-Resolution Siamese Network, Towards Space-Borne Satellite Video Tracking. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2021; 30:3056-3068. [PMID: 33556007 DOI: 10.1109/tip.2020.3045634] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
Tracking moving objects in space-borne satellite videos is a new and challenging task. The main difficulty stems from the extremely small size of the target of interest. First, because the target usually occupies only a few pixels, it is hard to obtain discriminative appearance features. Second, a small object can easily suffer from occlusion and illumination variation, making its features less distinguishable from those of surrounding regions. Current state-of-the-art tracking approaches mainly consider high-level deep features of a single frame with low spatial resolution and hardly benefit from the inter-frame motion information inherent in videos. Thus, they fail to accurately locate such small objects and handle challenging scenarios in satellite videos. In this article, we design a lightweight parallel network with high spatial resolution to locate small objects in satellite videos. This architecture guarantees real-time and precise localization when applied to Siamese trackers. Moreover, a pixel-level refining model based on online moving object detection and adaptive fusion is proposed to enhance tracking robustness in satellite videos. It models the video sequence over time to detect moving targets at the pixel level and can take full advantage of both tracking and detection. We conduct quantitative experiments on real satellite video datasets, and the results show that the proposed high-resolution Siamese network (HRSiam) achieves state-of-the-art tracking performance while running at over 30 FPS.
Collapse
|