1
Zhang N, Jiang Z, Li M, Zhang D. A novel multi-feature learning model for disease diagnosis using face skin images. Comput Biol Med 2024; 168:107837. [PMID: 38086142 DOI: 10.1016/j.compbiomed.2023.107837] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2023] [Revised: 11/15/2023] [Accepted: 12/07/2023] [Indexed: 01/10/2024]
Abstract
BACKGROUND Facial skin characteristics can provide valuable information about a patient's underlying health conditions. OBJECTIVE In practice, there are often samples with divergent characteristics (commonly known as divergent samples) that can be attributed to environmental factors, living conditions, or genetic elements. These divergent samples significantly degrade the accuracy of diagnoses. METHODOLOGY To tackle this problem, we propose a novel multi-feature learning method called Multi-Feature Learning with Centroid Matrix (MFLCM), which aims to mitigate the influence of divergent samples on the accurate classification of samples located on the boundary. In this approach, we introduce a novel discriminator that incorporates a centroid matrix strategy and simultaneously adapt it to a classifier in a unified model. We effectively apply the centroid matrix to the embedding feature spaces, which are transformed from the multi-feature observation space, by calculating a relaxed Hamming distance. The purpose of the centroid vectors for each category is to act as anchors, ensuring that samples from the same class are positioned close to their corresponding centroid vector while being pushed further away from the remaining centroids. RESULTS Validation of the proposed method with a clinical facial skin dataset showed that it achieved F1 scores of 92.59%, 83.35%, 82.84% and 85.46% for the detection of Healthy, Diabetes Mellitus (DM), Fatty Liver (FL) and Chronic Renal Failure (CRF), respectively. CONCLUSION Experimental results demonstrate the superiority of the proposed method compared with typical single-view-based classifiers and state-of-the-art multi-feature approaches. To the best of our knowledge, this study is the first to demonstrate the concept of multi-feature learning using only facial skin images as an effective non-invasive approach for simultaneously identifying DM, FL and CRF in Han Chinese, the largest ethnic group in the world.
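The abstract reports one F1 score per class. As a reminder of how such per-class scores are usually derived, here is a minimal sketch that computes them from a confusion matrix. The function name `per_class_f1` and the toy counts are illustrative assumptions and are not taken from the paper.

```python
def per_class_f1(confusion):
    """Per-class F1 from a square confusion matrix.

    confusion[i][j] = number of samples of true class i predicted as class j.
    """
    n = len(confusion)
    scores = []
    for k in range(n):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp  # predicted k, truly other
        fn = sum(confusion[k]) - tp                       # truly k, predicted other
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


# Hypothetical two-class counts, NOT the paper's data.
toy_confusion = [[8, 2],
                 [1, 9]]
toy_scores = per_class_f1(toy_confusion)
```

F1 is the harmonic mean of precision and recall, so a class with few true positives scores low even when overall accuracy is high, which is presumably why the abstract lists each disease separately.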
Affiliation(s)
- Nannan Zhang: The Chinese University of Hong Kong (Shenzhen), Shenzhen, China; Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen, China; Shenzhen Research Institute of Big Data, Shenzhen, China.
- Zhixing Jiang: The Chinese University of Hong Kong (Shenzhen), Shenzhen, China; Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen, China; Shenzhen Research Institute of Big Data, Shenzhen, China.
- Mu Li: Harbin Institute of Technology at Shenzhen, Shenzhen, China.
- David Zhang: The Chinese University of Hong Kong (Shenzhen), Shenzhen, China; Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen, China; Shenzhen Research Institute of Big Data, Shenzhen, China.
2
Zhang N, Jiang Z, Li J, Zhang D. Multiple color representation and fusion for diabetes mellitus diagnosis based on back tongue images. Comput Biol Med 2023; 155:106652. [PMID: 36805220 DOI: 10.1016/j.compbiomed.2023.106652] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/21/2022] [Revised: 02/02/2023] [Accepted: 02/08/2023] [Indexed: 02/16/2023]
Abstract
Tongue images have been proved to be effective in diabetes mellitus (DM) diagnosis. Because it does not require collecting a blood sample, the tongue-image-based diagnosis approach is non-invasive and convenient for patients. Meanwhile, the colors of tongues play an important role in aiding accurate diagnosis. However, the tongues' colors fall on a small color gamut, which makes it difficult for existing color descriptors to identify and distinguish the tiny differences between tongues. To tackle this problem, we introduce a novel color descriptor by representing the colors with the clustering centers, namely color centroid points, of the color points sampled from tongue images. In order to boost the capacity of the descriptor, we extend it into three color spaces, i.e., RGB, HSV and LAB, to mine a rich set of color information and exploit the complementary information among the three spaces. Since there exist correlation and complementarity among the features extracted from the three color spaces, we propose a novel multiple color features fusion method for DM diagnosis. Particularly, two projections are learned to project the multiple features to their corresponding shared and specific subspaces, in which their similarity and diversity are first measured by the Euclidean distance and the Hilbert-Schmidt Independence Criterion (HSIC), respectively. To fully exploit the similar and complementary information, the two components are jointly transformed to their label vector, efficiently embedding the discriminant prior into the model, leading to significant improvement in the diagnosis outcomes. Experimental results on a clinical tongue dataset substantiated the effectiveness of our proposed clustering-based color descriptor and the proposed multiple colors fusion approach. Overall, the proposed pipeline for the diagnosis of DM using back tongue images achieved an average accuracy of up to 93.38%, indicating its potential toward realization of a clinical diagnostic tool for DM.
Without loss of generality, we also assessed the performance of the novel multiple features fusion method on two public datasets. The experiments demonstrate the superiority of our multiple features learning model in general real-life applications.
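The color-centroid idea described above (cluster sampled color points, then use the cluster centers as the descriptor) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: it uses plain Lloyd's k-means with a deterministic initialization, covers only RGB and HSV because the Python standard library has no LAB conversion, treats hue as an ordinary coordinate even though it is circular, and the names `kmeans` and `color_descriptor` are hypothetical.

```python
import colorsys


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means on a list of equal-length tuples; returns sorted centroids."""
    # Deterministic init (an assumption for reproducibility): first k distinct points.
    centroids = []
    for p in points:
        if p not in centroids:
            centroids.append(p)
        if len(centroids) == k:
            break
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            clusters[nearest].append(p)
        # Move each centroid to the mean of its members; keep it if the cluster is empty.
        centroids = [
            tuple(sum(col) / len(members) for col in zip(*members)) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return sorted(centroids)


def color_descriptor(rgb_pixels, k=2):
    """Concatenate the k color centroids found in RGB and in HSV (channels in [0, 1])."""
    hsv_pixels = [colorsys.rgb_to_hsv(*p) for p in rgb_pixels]
    descriptor = []
    for space in (rgb_pixels, hsv_pixels):
        for centroid in kmeans(space, k):
            descriptor.extend(centroid)
    return descriptor


# Hypothetical sampled pixels: two reddish and two bluish points.
toy_pixels = [(0.9, 0.1, 0.1), (0.8, 0.2, 0.2), (0.1, 0.1, 0.9), (0.2, 0.2, 0.8)]
toy_centroids = kmeans(toy_pixels, 2)
toy_descriptor = color_descriptor(toy_pixels, 2)
```

Sorting the centroids gives the descriptor a stable ordering, so that descriptors from different images are comparable element by element. How the paper orders its centroids is not stated in the abstract.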
Affiliation(s)
- Nannan Zhang: The Chinese University of Hong Kong (Shenzhen), Shenzhen, China; Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen, China; Shenzhen Research Institute of Big Data, Shenzhen, China.
- Zhixing Jiang: The Chinese University of Hong Kong (Shenzhen), Shenzhen, China; Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen, China; Shenzhen Research Institute of Big Data, Shenzhen, China.
- JinXing Li: Harbin Institute of Technology at Shenzhen, Shenzhen, China.
- David Zhang: The Chinese University of Hong Kong (Shenzhen), Shenzhen, China; Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen, China; Shenzhen Research Institute of Big Data, Shenzhen, China.
3
Zhang Q, Wen J, Zhou J, Zhang B. Missing-view completion for fatty liver disease detection. Comput Biol Med 2022; 150:106097. [PMID: 36244304 DOI: 10.1016/j.compbiomed.2022.106097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2022] [Revised: 08/22/2022] [Accepted: 09/10/2022] [Indexed: 11/15/2022]
Abstract
Fatty liver disease is a common disease that causes extra fat storage in an individual's liver. Patients with fatty liver disease may progress to cirrhosis and liver failure, further leading to liver cancer. The prevalence of fatty liver disease ranges from 10% to 30% in many countries. In general, detecting fatty liver requires professional imaging modalities such as computed tomography and ultrasound, together with medical experts' practical experience. For this reason, intelligent electronic noninvasive diagnostic approaches are desired at present. Currently, most existing works in the area of computerized noninvasive disease detection apply one view (modality) or perform multi-view (several modalities) analysis, e.g., face, tongue, and/or sublingual, for disease detection. The multi-view data of patients provide more complementary information for diagnosis. However, due to the conditions of data acquisition, interference by human factors, etc., many multi-view data are defective, with some views missing, which makes these multi-view data difficult to evaluate. This factor largely affects the performance of disease classification and the development of fully computerized noninvasive methods. Thus, the purpose of this study is to address the missing-view issue in noninvasive disease detection. In this work, a multi-view dataset containing facial, sublingual vein, and tongue images is first processed to produce the corresponding features for incomplete multi-view disease diagnostic evaluation. We then propose a novel method, multi-view completion, to process the incomplete multi-view data and complete the missing-view information for classifying fatty liver disease patients versus healthy candidates. In particular, this method can explore the intra-view and inter-view information to produce the missing-view data effectively. 
Extensive experiments on a collected dataset with 220 fatty liver patients and 220 healthy samples show that our proposed approach achieves better diagnostic results with missing-view completion compared to the original incomplete multi-view data under various classifiers. Related results prove that our method can effectively process the missing-view issue and improve the noninvasive disease detection performance.
Affiliation(s)
- Qi Zhang: PAMI Research Group, Dept. of Computer and Information Science, University of Macau, Macau, China
- Jie Wen: School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen, China
- Jianhang Zhou: PAMI Research Group, Dept. of Computer and Information Science, University of Macau, Macau, China
- Bob Zhang: PAMI Research Group, Dept. of Computer and Information Science, University of Macau, Macau, China; Beijing Key Laboratory of Big Data Technology for Food Safety, Beijing Technology and Business University, Beijing, China.
4
Feature fusion based on joint sparse representations and wavelets for multiview classification. Pattern Anal Appl 2022. [DOI: 10.1007/s10044-022-01110-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]
Abstract
Feature-level-based fusion has attracted much interest. Generally, a dataset can be created in different views, features, or modalities. To improve the classification rate, local information is shared among different views by various fusion methods. However, almost all the methods use the views without considering their common aspects. In this paper, wavelet transform is considered to extract high and low frequencies of the views as common aspects to improve the classification rate. The fusion method for the decomposed parts is based on joint sparse representation in which a number of scenarios can be considered. The presented approach is tested on three datasets. The results obtained by this method prove competitive performance in terms of the datasets compared to the state-of-the-art results.
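To make the "high and low frequencies" step concrete, here is a minimal sketch of a single-level wavelet decomposition of a feature vector. The abstract does not say which wavelet is used, so the Haar wavelet here is an assumption chosen for brevity, and `haar_step` is a hypothetical name.

```python
import math


def haar_step(signal):
    """Single-level orthonormal Haar transform of an even-length sequence.

    Returns (low, high): pairwise scaled sums (approximation, low frequency)
    and pairwise scaled differences (detail, high frequency).
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    low = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return low, high


# Hypothetical feature vector from one view.
toy_low, toy_high = haar_step([4.0, 4.0, 2.0, 0.0])
```

Because the transform is orthonormal, the energy of the input is split between the two bands without loss, so each band can be fused separately and nothing from the original view is discarded.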
5
Guo C, Jiang Z, He H, Liao Y, Zhang D. Wrist pulse signal acquisition and analysis for disease diagnosis: A review. Comput Biol Med 2022; 143:105312. [PMID: 35203039 DOI: 10.1016/j.compbiomed.2022.105312] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 11/07/2021] [Revised: 01/22/2022] [Accepted: 02/07/2022] [Indexed: 11/26/2022]
Abstract
Pulse diagnosis (PD) plays an indispensable role in healthcare in China, India, Korea, and other Orient countries. It requires considerable training and experience to master. The results of pulse diagnosis rely heavily on the practitioner's subjective analysis, which means that the results from different physicians may be inconsistent. To overcome these drawbacks, computational pulse diagnosis (CPD) is used with advanced sensing techniques and analytical methods. Focusing on the main processes of CPD, this paper provides a systematic review of the latest advances in pulse signal acquisition, signal preprocessing, feature extraction, and signal recognition. The most relevant principles and applications are presented along with current progress. Extensive comparisons and analyses are conducted to evaluate the merits of different methods employed in CPD. While much progress has been made, a lack of datasets and benchmarks has limited the development of CPD. To address this gap and facilitate further research, we present a benchmark to evaluate different methods. We conclude with observations of the status and prospects of CPD.
Affiliation(s)
- Chaoxun Guo: The Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, Guangdong, China; Shenzhen Research Institute of Big Data, Shenzhen, 518172, Guangdong, China; Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen, 518172, Guangdong, China.
- Zhixing Jiang: The Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, Guangdong, China; Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen, 518172, Guangdong, China.
- Haoze He: New York University, New York, 10012, New York, United States
- Yining Liao: The Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, Guangdong, China; Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen, 518172, Guangdong, China
- David Zhang: The Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, Guangdong, China; Shenzhen Research Institute of Big Data, Shenzhen, 518172, Guangdong, China; Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen, 518172, Guangdong, China.
6
Hierarchical Fusion Using Subsets of Multi-Features for Historical Arabic Manuscript Dating. J Imaging 2022; 8:jimaging8030060. [PMID: 35324615 PMCID: PMC8954291 DOI: 10.3390/jimaging8030060] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/26/2022] [Revised: 02/21/2022] [Accepted: 02/24/2022] [Indexed: 11/21/2022] Open
Abstract
Automatic dating tools for historical documents can greatly assist paleographers and save them time and effort. This paper describes a novel method for estimating the date of historical Arabic documents that employs hierarchical fusions of multiple features. A set of traditional features and features extracted by a residual network (ResNet) are fused in a hierarchical approach using joint sparse representation. To address noise during the fusion process, a new approach based on subsets of multiple features is considered. Following that, supervised and unsupervised classifiers are used for classification. We show that hierarchical fusion based on subsets of multiple features yields promising and significantly improved results on the KERTAS dataset.
7
Li J, Zhang B, Lu G, Xu Y, Wu F, Zhang D. Harmonization Shared Autoencoder Gaussian Process Latent Variable Model With Relaxed Hamming Distance. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2021; 32:5093-5107. [PMID: 33027008 DOI: 10.1109/tnnls.2020.3026876] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/11/2023]
Abstract
Multiview learning has shown its superiority in visual classification compared with the single-view-based methods. Especially, due to the powerful representation capacity, the Gaussian process latent variable model (GPLVM)-based multiview approaches have achieved outstanding performances. However, most of them only follow the assumption that the shared latent variables can be generated from or projected to the multiple observations but fail to exploit the harmonization in the back constraint and adaptively learn a classifier according to these learned variables, which would result in performance degradation. To tackle these two issues, in this article, we propose a novel harmonization shared autoencoder GPLVM with a relaxed Hamming distance (HSAGP-RHD). Particularly, an autoencoder structure with the Gaussian process (GP) prior is first constructed to learn the shared latent variable for multiple views. To enforce the agreement among various views in the encoder, a harmonization constraint is embedded into the model by enforcing consistency of the view-specific similarities. Furthermore, we also propose a novel discriminative prior, which is directly imposed on the latent variable to simultaneously learn the fused features and adaptive classifier in a unified model. In detail, the centroid matrix corresponding to the centroids of different categories is first obtained. A relaxed Hamming distance (RHD)-based measurement is subsequently presented to measure the similarity and dissimilarity between the latent variable and centroids, not only allowing us to get the closed-form solutions but also encouraging the points belonging to the same class to be close, while those belonging to different classes to be far. Due to this novel prior, the category of an out-of-sample point can also be assigned simply in the testing phase. 
Experimental results conducted on three real-world data sets demonstrate the effectiveness of the proposed method compared with state-of-the-art approaches.
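The centroid-based assignment described above can be illustrated with a small sketch. The abstract does not give the exact form of the relaxed Hamming distance, so the formula below is an assumption: for two codes of length k with entries in {-1, +1}, the Hamming distance equals (k - <x, c>) / 2, and one simple relaxation applies the same expression to real-valued vectors with entries in [-1, 1]. The names `relaxed_hamming` and `assign_class` and the toy centroids are hypothetical.

```python
def relaxed_hamming(x, c):
    """(k - <x, c>) / 2.

    Equals the Hamming distance when x and c are +/-1 codes; stays
    non-negative for real-valued entries in [-1, 1]. Assumed form, not
    necessarily the one used in the paper.
    """
    return (len(x) - sum(a * b for a, b in zip(x, c))) / 2.0


def assign_class(x, centroids):
    """centroids: dict mapping label -> centroid vector. Returns the closest label."""
    return min(centroids, key=lambda label: relaxed_hamming(x, centroids[label]))


# Hypothetical class centroids and a latent point for an unseen sample.
toy_centroids = {"A": [1, -1, 1], "B": [-1, 1, 1]}
toy_label = assign_class([0.9, -0.8, 0.7], toy_centroids)
```

Because the expression is linear in the inner product, it is differentiable in the latent variable, which is consistent with the abstract's remark about closed-form solutions, although the actual derivation is in the paper and not reproduced here.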
8
Zhang Q, Zhou J, Zhang B. Computational Traditional Chinese Medicine diagnosis: A literature survey. Comput Biol Med 2021; 133:104358. [PMID: 33831712 DOI: 10.1016/j.compbiomed.2021.104358] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 02/18/2021] [Revised: 03/23/2021] [Accepted: 03/24/2021] [Indexed: 12/22/2022]
Abstract
BACKGROUND AND OBJECTIVE Traditional Chinese Medicine (TCM) diagnosis is based on the theoretical principles and knowledge, where it is steeped in thousands of years of history to diagnose various types of diseases and syndromes. It can be generally divided into four main diagnostic approaches: 1. Inspection, 2. Auscultation and olfaction, 3. Inquiry, and 4. Palpation, which are widely used in TCM hospitals in China and around the world. With the development of intelligent computing technology in recent years, computational TCM diagnosis has grown rapidly. METHODS In this paper, we aim to systematically summarize the development of computational TCM diagnosis based on four diagnostic approaches, mainly focusing on digital acquisition devices, collected datasets, and computational detection approaches (algorithms). Furthermore, all related works of this field are compared and explored in detail. RESULTS This survey provides the principles, applications, and current progress in computing for readers and researchers in terms of computational TCM diagnosis. Moreover, the future development direction, prospect, and technological trend of computational TCM diagnosis will also be discussed in this study. CONCLUSIONS Recent computational TCM diagnosis works are compared in detail to show the pros/cons, where we provide some meaningful suggestions and opinions on the future research approaches in this area. This work is useful for disease detection in computational TCM diagnosis as well as health management in the smart healthcare area. INDEX TERMS Computational diagnosis, Traditional Chinese Medicine, survey, smart healthcare.
Affiliation(s)
- Qi Zhang: The PAMI Research Group, Department of Computer and Information Science, Faculty of Science and Technology, University of Macau, Macau SAR, People's Republic of China
- Jianhang Zhou: The PAMI Research Group, Department of Computer and Information Science, Faculty of Science and Technology, University of Macau, Macau SAR, People's Republic of China
- Bob Zhang: The PAMI Research Group, Department of Computer and Information Science, Faculty of Science and Technology, University of Macau, Macau SAR, People's Republic of China.
9
Hu Y, Wen G, Liao H, Wang C, Dai D, Yu Z. Automatic Construction of Chinese Herbal Prescriptions From Tongue Images Using CNNs and Auxiliary Latent Therapy Topics. IEEE TRANSACTIONS ON CYBERNETICS 2021; 51:708-721. [PMID: 31059462 DOI: 10.1109/tcyb.2019.2909925] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 06/09/2023]
Abstract
The tongue image provides important physical information of humans. It is of great importance for diagnoses and treatments in clinical medicine. Herbal prescriptions are simple, noninvasive, and have low side effects. Thus, they are widely applied in China. Studies on the automatic construction of herbal prescriptions from tongue images are of great significance: they let deep learning explore the relevance of tongue images to herbal prescriptions, and the resulting technology can be applied to healthcare services in mobile medical systems. In order to adapt to tongue images taken in a variety of photographic environments and to construct herbal prescriptions, a neural network framework for prescription construction is designed. It includes single/double convolution channels and fully connected layers. Furthermore, an auxiliary therapy topic loss mechanism is proposed to model the therapy of Chinese doctors and alleviate the interference of sparse output labels on the diversity of results. The experiments use real-world tongue images and the corresponding prescriptions, and the model generates prescriptions that are close to the real samples, which verifies the feasibility of the proposed method for the automatic construction of herbal prescriptions from tongue images. Also, it provides a reference for automatic herbal prescription construction from more physical information.
10
Li J, Lu G, Zhang B, You J, Zhang D. Shared Linear Encoder-Based Multikernel Gaussian Process Latent Variable Model for Visual Classification. IEEE TRANSACTIONS ON CYBERNETICS 2021; 51:534-547. [PMID: 31170087 DOI: 10.1109/tcyb.2019.2915789] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 06/09/2023]
Abstract
Multiview learning has been widely studied in various fields and achieved outstanding performances in comparison to many single-view-based approaches. In this paper, a novel multiview learning method based on the Gaussian process latent variable model (GPLVM) is proposed. In contrast to existing GPLVM methods which only assume that there are transformations from the latent variable to the multiple observed inputs, our proposed method simultaneously takes a back constraint into account, encoding multiple observations to the latent variable by enjoying the Gaussian process (GP) prior. Particularly, to overcome the difficulty of the covariance matrix calculation in the encoder, a linear projection is designed to map different observations to a consistent subspace first. The obtained variable in this subspace is then projected to the latent variable in the manifold space with the GP prior. Furthermore, different from most GPLVM methods which strongly assume that the covariance matrices follow a certain kernel function, for example, radial basis function (RBF), we introduce a multikernel strategy to design the covariance matrix, being more reasonable and adaptive for the data representation. In order to apply the presented approach to the classification, a discriminative prior is also embedded to the learned latent variables to encourage samples belonging to the same category to be close and those belonging to different categories to be far. Experimental results on three real-world databases substantiate the effectiveness and superiority of the proposed method compared with state-of-the-art approaches.
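The multikernel strategy mentioned above replaces a single fixed kernel with a combination of several. A minimal sketch, assuming a non-negative weighted sum of an RBF kernel and a linear kernel, is shown below. The choice of these two kernels, the fixed weights, and the name `multikernel_cov` are illustrative assumptions; in the paper the combination is learned, and its exact form is not given in the abstract.

```python
import math


def rbf(x, y, gamma=1.0):
    """Radial basis function kernel exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def linear(x, y):
    """Linear kernel <x, y>."""
    return sum(a * b for a, b in zip(x, y))


def multikernel_cov(X, weights=(0.5, 0.5), gamma=1.0):
    """Covariance matrix K[i][j] = w_rbf * rbf(x_i, x_j) + w_lin * linear(x_i, x_j)."""
    w_rbf, w_lin = weights
    return [
        [w_rbf * rbf(xi, xj, gamma) + w_lin * linear(xi, xj) for xj in X]
        for xi in X
    ]


# Hypothetical latent points.
toy_K = multikernel_cov([(1.0, 0.0), (0.0, 1.0)])
```

A non-negative weighted sum of positive semidefinite kernels is itself positive semidefinite, so the combined matrix remains a valid GP covariance whatever non-negative weights are chosen.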
11
12
Li J, Li M, Lu G, Zhang B, Yin H, Zhang D. Similarity and diversity induced paired projection for cross-modal retrieval. Inf Sci (N Y) 2020. [DOI: 10.1016/j.ins.2020.06.032] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 10/24/2022]
13
Jiang Z, Guo C, Zang J, Lu G, Zhang D. Features fusion of multichannel wrist pulse signal based on KL-MGDCCA and decision level combination. Biomed Signal Process Control 2020. [DOI: 10.1016/j.bspc.2019.101751] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 10/25/2022]
14
Automated detection of diabetic subject using pre-trained 2D-CNN models with frequency spectrum images extracted from heart rate signals. Comput Biol Med 2019; 113:103387. [PMID: 31421276 DOI: 10.1016/j.compbiomed.2019.103387] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Received: 06/04/2019] [Revised: 08/08/2019] [Accepted: 08/08/2019] [Indexed: 11/24/2022]
Abstract
In this study, a deep-transfer learning approach is proposed for the automated diagnosis of diabetes mellitus (DM), using heart rate (HR) signals obtained from electrocardiogram (ECG) data. Recent progress in deep learning has contributed significantly to improvement in the quality of healthcare. In order for deep learning models to perform well, large datasets are required for training. However, a difficulty in the biomedical field is the lack of clinical data with expert annotation. A recent, commonly implemented technique to train deep learning models using small datasets is to transfer the weighting, developed from a large dataset, to the current model. This deep learning transfer strategy is generally employed for two-dimensional signals. Herein, the weighting of models pre-trained using two-dimensional large image data was applied to one-dimensional HR signals. The one-dimensional HR signals were then converted into frequency spectrum images, which were utilized for application to well-known pre-trained models, specifically: AlexNet, VggNet, ResNet, and DenseNet. The DenseNet pre-trained model yielded the highest classification average accuracy of 97.62%, and sensitivity of 100%, to detect DM subjects via HR signal recordings. In the future, we intend to further test this developed model by utilizing additional data along with cloud-based storage to diagnose DM via heart signal analysis.
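The first step of turning a one-dimensional HR signal into a frequency-domain representation can be sketched with a plain discrete Fourier transform. This is only an illustration: the abstract does not say how the spectrum images are actually rendered (window length, scaling, color map), so everything beyond "compute a magnitude spectrum" is an assumption, and `magnitude_spectrum` and the toy signal are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(signal):
    """One-sided DFT magnitude of a real signal, normalized by its length.

    Naive O(n^2) implementation, adequate for a short illustrative signal.
    """
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        spectrum.append(abs(acc) / n)
    return spectrum


# Hypothetical signal: a pure tone completing 2 cycles over 16 samples.
toy_signal = [math.cos(2 * math.pi * 2 * t / 16) for t in range(16)]
toy_spectrum = magnitude_spectrum(toy_signal)
toy_peak_bin = max(range(len(toy_spectrum)), key=toy_spectrum.__getitem__)
```

Computing such spectra over successive windows and stacking them as image rows or columns is one common way to obtain a two-dimensional input for an image-pretrained network. Whether the study does exactly this is not stated in the abstract.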
15
Li J, Zhang B, Lu G, Ren H, Zhang D. Visual Classification With Multikernel Shared Gaussian Process Latent Variable Model. IEEE TRANSACTIONS ON CYBERNETICS 2019; 49:2886-2899. [PMID: 29994781 DOI: 10.1109/tcyb.2018.2831457] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 06/08/2023]
Abstract
Multiview learning methods often achieve improvement compared with single-view-based approaches in many applications. Due to the powerful nonlinear ability and probabilistic perspective of Gaussian process (GP), some GP-based multiview efforts were presented. However, most of these methods make a strong assumption on the kernel function (e.g., radial basis function), which limits the capacity for modeling real data. In order to address this issue, in this paper, we propose a novel multiview approach by combining a multikernel and GP latent variable model. Instead of designing a deterministic kernel function, multiple kernel functions are established to automatically adapt to various types of data. Considering a simple way of obtaining latent variables at the testing stage, a projection from the observed space to the latent space as a back constraint has also been simultaneously introduced into the proposed method. Additionally, different from some existing methods which apply the classifiers off-line, a hinge loss is embedded into the model to jointly learn the classification hyperplane, encouraging the latent variables belonging to the different classes to be separated. An efficient algorithm based on the gradient descent technique is constructed to optimize our method. Finally, we apply the proposed approach to three real-world datasets and the associated results demonstrate the effectiveness and superiority of our model compared with other state-of-the-art methods.
16
Wu G, Zhang D, Chen W, Zuo W, Xia Z. Robust Deep Softmax Regression Against Label Noise for Unsupervised Domain Adaptation. INT J PATTERN RECOGN 2019. [DOI: 10.1142/s0218001419400020] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022]
Abstract
Domain adaptation aims to generalize the classification model from a source domain to a different but related target domain. Recent studies have revealed the benefit of deep convolutional features trained on a large dataset (e.g. ImageNet) in alleviating domain discrepancy. However, the literature shows that the transferability of features decreases as (i) the difference between the source and target domains increases, or (ii) the layers move toward the top of the network. Therefore, even with deep features, domain adaptation remains necessary. In this paper, we propose a novel unsupervised domain adaptation (UDA) model for deep neural networks, which is learned with the labeled source samples and the unlabeled target ones simultaneously. For target samples without labels, pseudo labels are assigned to them according to their maximum classification scores during training of the UDA model. However, due to the domain discrepancy, label noise is generally inevitable, which degrades the performance of the domain adaptation model. Thus, to effectively utilize the target samples, three specific robust deep softmax regression (RDSR) functions are applied to those with high, medium and low classification confidence, respectively. Extensive experiments show that our method yields state-of-the-art results, demonstrating the effectiveness of the robust deep softmax regression classifier in UDA.
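The pseudo-labeling step described above (take the class with the maximum score, then treat samples differently according to confidence) can be sketched as follows. The thresholds 0.9 and 0.6 and the name `pseudo_label` are hypothetical, and the three RDSR loss functions themselves are not reproduced here because the abstract does not define them.

```python
def pseudo_label(probs, high=0.9, low=0.6):
    """Return (label, tier) for one target sample's class probabilities.

    label is the index of the maximum score; tier is "high", "medium" or "low"
    depending on where that score falls relative to the (assumed) thresholds.
    """
    label = max(range(len(probs)), key=probs.__getitem__)
    confidence = probs[label]
    if confidence >= high:
        tier = "high"
    elif confidence >= low:
        tier = "medium"
    else:
        tier = "low"
    return label, tier


# Hypothetical softmax outputs for three unlabeled target samples.
toy_results = [
    pseudo_label([0.05, 0.92, 0.03]),
    pseudo_label([0.20, 0.70, 0.10]),
    pseudo_label([0.50, 0.30, 0.20]),
]
```

Splitting samples by tier lets a training loop apply a different loss to each group, so that low-confidence pseudo labels, which are the most likely to be wrong, contribute less or in a more noise-tolerant way.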
Affiliation(s)
- Guangbin Wu: State Key Laboratory of Robotics and System, Harbin Institute of Technology, Harbin, P. R. China
- David Zhang: Department of Computing, The Hong Kong Polytechnic University, Hong Kong, P. R. China
- Weishan Chen: State Key Laboratory of Robotics and System, Harbin Institute of Technology, Harbin, P. R. China
- Wangmeng Zuo: School of Computer Science and Technology, Harbin Institute of Technology, Harbin, P. R. China
- Zhuang Xia: State Key Laboratory of Robotics and System, Harbin Institute of Technology, Harbin, P. R. China
17
18
19
Li J, Zhang B, Zhang D. Shared Autoencoder Gaussian Process Latent Variable Model for Visual Classification. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2018; 29:4272-4286. [PMID: 29990089 DOI: 10.1109/tnnls.2017.2761401] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 06/08/2023]
Abstract
Multiview learning reveals the latent correlation among different modalities and utilizes the complementary information to achieve a better performance in many applications. In this paper, we propose a novel multiview learning model based on the Gaussian process latent variable model (GPLVM) to learn a set of nonlinear and nonparametric mapping functions and obtain a shared latent variable in the manifold space. Different from the previous work on the GPLVM, the proposed shared autoencoder Gaussian process (SAGP) latent variable model assumes that there is an additional mapping from the observed data to the shared manifold space. Due to the introduction of the autoencoder framework, both nonlinear projections from and to the observation are considered simultaneously. Additionally, instead of the fully connected layers used in the conventional autoencoder, the SAGP achieves the mappings utilizing the GP, which remarkably reduces the number of estimated parameters and avoids the phenomenon of overfitting. To make the proposed method adaptive for classification, a discriminative regularization is embedded into the proposed method. In the optimization process, an efficient algorithm based on the alternating direction method and gradient descent techniques is designed to solve the encoder and decoder parts alternately. Experimental results on three real-world data sets substantiate the effectiveness and superiority of the proposed approach as compared with the state of the art.
20
Kou L, Zhang D, Liu D. A Novel Medical E-Nose Signal Analysis System. SENSORS 2017; 17:s17040402. [PMID: 28379168 PMCID: PMC5419773 DOI: 10.3390/s17040402] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 10/31/2016] [Revised: 02/04/2017] [Accepted: 02/16/2017] [Indexed: 11/29/2022]
Abstract
It has been proven that certain biomarkers in people’s breath have a relationship with diseases and blood glucose levels (BGLs). As a result, it is possible to detect diseases and predict BGLs by analysis of breath samples captured by e-noses. In this paper, a novel optimized medical e-nose system specified for disease diagnosis and BGL prediction is proposed. A large-scale breath dataset has been collected using the proposed system. Experiments have been organized on the collected dataset and the experimental results have shown that the proposed system can well solve the problems of existing systems. The methods have effectively improved the classification accuracy.
Affiliation(s)
- Lu Kou: Biometrics Research Center, Department of Computing, The Hong Kong Polytechnic University, Kowloon 999077, Hong Kong, China.
- David Zhang: Biometrics Research Center, Department of Computing, The Hong Kong Polytechnic University, Kowloon 999077, Hong Kong, China; Department of Computer Science, Harbin Institute of Technology Shenzhen graduate school, Shenzhen 518055, China.
- Dongxu Liu: Department of Computer Science, Harbin Institute of Technology Shenzhen graduate school, Shenzhen 518055, China.