1
Shen XJ, Xu Z, Wang L, Li Z, Liu G, Fan J, Zha Z. Extraordinarily Time- and Memory-Efficient Large-Scale Canonical Correlation Analysis in Fourier Domain: From Shallow to Deep. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:14989-15003. [PMID: 37527324 DOI: 10.1109/tnnls.2023.3282785] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/03/2023]
Abstract
Canonical correlation analysis (CCA) is a correlation analysis technique that is widely used in statistics and the machine-learning community. However, the high complexity of the training process places a heavy burden on processing units and memory, making CCA nearly impractical for large-scale data. To overcome this issue, this article develops a novel CCA method that carries out the analysis in the Fourier domain. Applying the Fourier transform to the data converts the traditional eigenvector computation of CCA into finding some predefined discriminative Fourier bases that can be learned with only element-wise dot product and sum operations, without complex time-consuming calculations. As the eigenvalues come from the sum of individual sample products, they can be estimated in parallel. Besides, thanks to the data characteristic of pattern repeatability, the eigenvalues can be well estimated with partial samples. Accordingly, a progressive estimation scheme is proposed, in which the eigenvalues are estimated by feeding data batch by batch until the ordering of the eigenvalue sequence is stable. As a result, the proposed method is extraordinarily fast and memory efficient. Furthermore, we extend this idea to nonlinear kernel and deep models and obtain satisfactory accuracy and extremely fast training, as expected. An extensive discussion of the fast Fourier transform (FFT)-CCA is given in terms of time and memory efficiency. Experimental results on several large-scale correlation datasets, such as MNIST8M, X-RAY MICROBEAM SPEECH, and Twitter Users Data, demonstrate the superiority of the proposed algorithm over state-of-the-art (SOTA) large-scale CCA methods: our method achieves almost the same accuracy while training 1000 times faster. This makes our proposed models best-practice models for dealing with large-scale correlation datasets. The source code is available at https://github.com/Mrxuzhao/FFTCCA.
2
Pan Z, Wang Y, Cao Y, Gui W. VAE-Based Interpretable Latent Variable Model for Process Monitoring. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:6075-6088. [PMID: 37310819 DOI: 10.1109/tnnls.2023.3282047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Latent variable-based process monitoring (PM) models have been extensively developed with shallow learning approaches, such as multivariate statistical analysis and kernel techniques. Owing to their explicit projection objectives, the extracted latent variables are usually meaningful and easily interpretable in mathematical terms. Recently, deep learning (DL) has been introduced to PM and has exhibited excellent performance because of its powerful representation capability. However, its complex nonlinearity prevents it from being interpreted in a human-friendly way. How to design a proper network structure that achieves satisfactory PM performance for DL-based latent variable models (LVMs) remains unclear. In this article, a variational autoencoder-based interpretable LVM (VAE-ILVM) is developed for PM. Based on Taylor expansions, two propositions are put forward to guide the design of appropriate activation functions for VAE-ILVM, so that nonvanishing fault impact terms are contained in the generated monitoring metrics (MMs). During threshold learning, the sequence counting how often test statistics exceed the threshold is treated as a martingale, a representative of weakly dependent stochastic processes. A de la Peña inequality is then adopted to learn a suitable threshold. Finally, two chemical examples verify the effectiveness of the proposed method. The use of the de la Peña inequality significantly reduces the minimum sample size required for modeling.
3
Yang C, Liu Q, Liu Y, Cheung YM. Transfer Dynamic Latent Variable Modeling for Quality Prediction of Multimode Processes. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:6061-6074. [PMID: 37079407 DOI: 10.1109/tnnls.2023.3265762] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/03/2023]
Abstract
Quality prediction is beneficial to intelligent inspection, advanced process control, operation optimization, and product quality improvement in complex industrial processes. Most existing work assumes that training samples and testing samples follow similar data distributions. This assumption, however, does not hold for practical multimode processes with dynamics. In practice, traditional approaches mostly establish a prediction model using samples from the principal operating mode (POM), where samples are abundant. Such a model is inapplicable to other modes with few samples. In view of this, this article proposes a novel dynamic latent variable (DLV)-based transfer learning approach, called transfer DLV regression (TDLVR), for quality prediction of multimode processes with dynamics. The proposed TDLVR can not only derive the dynamics between process variables and quality variables in the POM but also extract the co-dynamic variations among process variables between the POM and the new mode. This effectively overcomes the marginal distribution discrepancy of the data and enriches the information available for the new mode. To make full use of the available labeled samples from the new mode, an error compensation mechanism is incorporated into the established TDLVR, termed compensated TDLVR (CTDLVR), to adapt to the conditional distribution discrepancy. Empirical studies show the efficacy of the proposed TDLVR and CTDLVR methods in several case studies, including numerical simulation examples and two real industrial process examples.
4
Ranta R, Le Cam S, Chaudet B, Tyvaert L, Maillard L, Colnat-Coulbois S, Louis-Dorr V. Approximate Canonical Correlation Analysis for common/specific subspace decompositions. Biomed Signal Process Control 2021. [DOI: 10.1016/j.bspc.2021.102780] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
5
Duan M, Li K, Li K, Tian Q. A Novel Multi-task Tensor Correlation Neural Network for Facial Attribute Prediction. ACM T INTEL SYST TEC 2021. [DOI: 10.1145/3418285] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 10/22/2022]
Abstract
Multi-task learning plays an important role in face multi-attribute prediction. At present, most studies extract the information shared between attributes by sharing all convolutional layers. However, it is not appropriate to treat the low-level and high-level features of multiple face attributes equally, because the high-level features are more biased toward the specific content of the category. In this article, a novel multi-attribute tensor correlation neural network (MTCN) is used to predict face attributes. MTCN shares all attribute features at the low-level layers and then distinguishes each attribute feature at the high-level layers. To better extract the correlations among high-level attribute features, each sub-network explores useful information from the other networks to enhance its original information. A tensor canonical correlation analysis method is then used to seek the correlations among the highest-level attributes, which enhances the original information of each attribute. After that, these features are mapped into a highly correlated space through the correlation matrix. Finally, extensive experiments verify the performance of MTCN on the CelebA and LFWA datasets, where MTCN achieves the best performance compared with the latest multi-attribute recognition algorithms under the same settings.
Affiliation(s)
- Keqin Li
- State University of New York, USA
6
Daura LU, Tian G, Yi Q, Sophian A. Wireless power transfer-based eddy current non-destructive testing using a flexible printed coil array. PHILOSOPHICAL TRANSACTIONS. SERIES A, MATHEMATICAL, PHYSICAL, AND ENGINEERING SCIENCES 2020; 378:20190579. [PMID: 32921233 PMCID: PMC7536023 DOI: 10.1098/rsta.2019.0579] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/05/2023]
Abstract
Eddy current testing (ECT) has been employed as a traditional non-destructive testing and evaluation (NDT&E) tool for many years. It has developed from single frequency to multiple frequencies, and eventually to pulsed and swept-frequency excitation. Recent progress in wireless power transfer (WPT) and flexible printed devices opens opportunities to address the challenges of defect detection and reconstruction under complex geometric situations. In this paper, a transmitter-receiver (Tx-Rx) flexible printed coil (FPC) array that uses the WPT approach and features dual resonance responses is proposed for the first time. The dual resonance responses can provide multiple parameters of samples, such as defect characteristics, lift-offs and material properties, while the flexible coil array allows area mapping of complex structures. To validate the proposed approach, experimental investigations of a single excitation coil with multiple receiving coils using the WPT principle were conducted on a curved pipe surface with a natural dent defect. The FPC array has a single excitation coil and 16 receiving (Rx) coils, which are used to measure the dent at 21 C-scan points on the dedicated dent sample. The experimental data were then used for training and evaluation of the dual resonance responses in terms of multiple feature extraction, selection and fusion for quantitative NDE. Four features, which include resonant magnitudes and principal components of the two resonant areas, were investigated for mapping and reconstructing the dent defect through correlation analysis for feature selection and feature fusion by deep learning. The results show that deep learning-based multiple feature fusion has outstanding performance for 3D defect reconstruction in WPT-based FPC-ECT. This article is part of the theme issue 'Advanced electromagnetic non-destructive evaluation and smart monitoring'.
Affiliation(s)
- Lawal Umar Daura
- School of Engineering, Newcastle University, Newcastle upon Tyne NE1 7RU, UK
- Electrical Engineering Department, Faculty of Engineering, Bayero University, Kano, Nigeria
- GuiYun Tian
- School of Engineering, Newcastle University, Newcastle upon Tyne NE1 7RU, UK
- School of Automation Engineering, University of Electronic Science and Technology, Chengdu, People's Republic of China
- Qiuji Yi
- School of Engineering, Newcastle University, Newcastle upon Tyne NE1 7RU, UK
- Ali Sophian
- Department of Mechatronics Engineering, Faculty of Engineering, International Islamic University Malaysia, Kuala Lumpur, Malaysia
7
Yu J, Zhu C, Zhang J, Huang Q, Tao D. Spatial Pyramid-Enhanced NetVLAD With Weighted Triplet Loss for Place Recognition. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2020; 31:661-674. [PMID: 31034423 DOI: 10.1109/tnnls.2019.2908982] [Citation(s) in RCA: 108] [Impact Index Per Article: 21.6] [Indexed: 06/09/2023]
Abstract
We propose an end-to-end place recognition model based on a novel deep neural network. First, we exploit the spatial pyramid structure of the images to enhance the vector of locally aggregated descriptors (VLAD) such that the enhanced VLAD features reflect the structural information of the images. To encode this feature extraction into the deep learning method, we build a spatial pyramid-enhanced VLAD (SPE-VLAD) layer. Next, we impose weight constraints on the terms of the traditional triplet loss (T-loss) function such that the weighted T-loss (WT-loss) function avoids suboptimal convergence of the learning process. The loss function works well under weakly supervised scenarios because it determines the semantically positive and negative samples of each query through not only the GPS tags but also the Euclidean distance between the image representations. The SPE-VLAD layer and the WT-loss layer are integrated with the VGG-16 network or ResNet-18 network to form a novel end-to-end deep neural network that can be easily trained via standard backpropagation. We conduct experiments on three benchmark data sets, and the results demonstrate that the proposed model outperforms the state-of-the-art deep learning approaches applied to place recognition.
8
Hou S, Liu H, Sun Q. Sparse regularized discriminative canonical correlation analysis for multi-view semi-supervised learning. Neural Comput Appl 2019. [DOI: 10.1007/s00521-018-3582-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 11/29/2022]
9
Yu Y, Tang S, Aizawa K, Aizawa A. Category-Based Deep CCA for Fine-Grained Venue Discovery From Multimodal Data. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2019; 30:1250-1258. [PMID: 30106743 DOI: 10.1109/tnnls.2018.2856253] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 06/08/2023]
Abstract
In this work, travel destinations and business locations are taken as venues. Discovering a venue from a photograph is very important for visual context-aware applications. Unfortunately, few efforts have paid attention to complicated real images such as venue photographs generated by users. Our goal is fine-grained venue discovery from heterogeneous social multimodal data. To this end, we propose a novel deep learning model, category-based deep canonical correlation analysis. Given a photograph as input, this model performs: 1) exact venue search (find the venue where the photograph was taken) and 2) group venue search (find relevant venues that have the same category as the photograph), using the cross-modal correlation between the input photograph and the textual description of venues. In this model, data in different modalities are projected to the same space via deep networks. Pairwise correlation (between different modality data from the same venue) for exact venue search and category-based correlation (between different modality data from different venues with the same category) for group venue search are jointly optimized. Because a photograph cannot fully reflect the rich text description of a venue, the number of photographs per venue in the training phase is increased to capture more aspects of a venue. We build a new venue-aware multimodal data set by integrating Wikipedia featured articles and Foursquare venue photographs. Experimental results on this data set confirm the feasibility of the proposed method. Moreover, the evaluation on another publicly available data set confirms that the proposed method outperforms the state of the art for cross-modal retrieval between image and text.
10
11
Dong X, Wu F, Jing XY. Semi-supervised multiple kernel intact discriminant space learning for image recognition. Neural Comput Appl 2018. [DOI: 10.1007/s00521-018-3367-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 11/24/2022]