1
Zhang L, Cui W, Li B, Chen Z, Wu M, Gee TS. Privacy-Preserving Cross-Environment Human Activity Recognition. IEEE TRANSACTIONS ON CYBERNETICS 2023; 53:1765-1775. [PMID: 34818206 DOI: 10.1109/tcyb.2021.3126831] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Recent studies have demonstrated the success of using the channel state information (CSI) from the WiFi signal to analyze human activities in a fixed and well-controlled environment. Those systems usually degrade when deployed in new environments. A straightforward solution to this limitation is to collect and annotate data samples from different environments with advanced learning strategies. Although workable as reported, those methods are often privacy sensitive because the training algorithms need to access the data from different environments, which may be owned by different organizations. We present a practical method for WiFi-based privacy-preserving cross-environment human activity recognition (HAR). It collects and shares information from different environments while maintaining the privacy of the individuals involved. At the core of our approach is the Johnson-Lindenstrauss transform, which is theoretically shown to be differentially private. Based on that, we further design an adversarial learning strategy to generate environment-invariant representations for HAR. We demonstrate the effectiveness of the proposed method with different data modalities from two real-life environments. More specifically, on the raw CSI dataset, it shows 2.18% and 1.24% improvements over challenging baselines for the two environments, respectively. Moreover, with the discrete wavelet transform features, it further yields 5.71% and 1.55% improvements, respectively.
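The Johnson-Lindenstrauss transform mentioned in this abstract can be illustrated with a minimal sketch. The code below is not the authors' implementation and does not reproduce their privacy analysis or adversarial training; it only shows the basic property that a scaled Gaussian random projection approximately preserves pairwise distances. The function names, dimensions, and seed are assumptions chosen for illustration.

```python
import math
import random


def jl_matrix(k, d, seed=0):
    """Return a k x d Gaussian random matrix scaled by 1/sqrt(k)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]


def project(matrix, x):
    """Apply the projection y = R x."""
    return [sum(r * xj for r, xj in zip(row, x)) for row in matrix]


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def distortion(k=200, d=500, seed=0):
    """Ratio of projected to original distance for two random points."""
    rng = random.Random(seed + 1)
    u = [rng.uniform(-1.0, 1.0) for _ in range(d)]
    v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
    R = jl_matrix(k, d, seed)
    return distance(project(R, u), project(R, v)) / distance(u, v)


if __name__ == "__main__":
    # A ratio close to 1 means the distance survived the projection.
    print(round(distortion(), 3))
```

Only the projected vectors would be shared in such a scheme; how much privacy that gives depends on the analysis in the paper, which this sketch does not attempt to reproduce.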
2
Liu B, Jiang L, Fan S. Reducing Anthropomorphic Hand Degrees of Actuation with Grasp-Function-Dependent and Joint-Element-Sparse Hand Synergies. INT J HUM ROBOT 2022. [DOI: 10.1142/s0219843621500171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
In this paper, a set of grasp-function-dependent and joint-element-sparse hand synergies is proposed. First, hand synergies were extracted from five basic categories of movements by principal component analysis (PCA). Then, varimax rotation was applied to these synergies, so that each sparse synergy represented only a limited number of joints. Next, according to their contribution to these sparse synergies, finger joints were clustered into different joint modules. Finally, by integrating the joint modules across the different categories of hand movements, the minimum number of actuators and joint synergic modules for anthropomorphic hands was determined. The results showed that 5 groups of joint modules and 7–9 actuators achieve the best performance in grasp function and motion flexibility. Furthermore, through the reasonable design of adaptive and hyperextension functional joint modules, anthropomorphic hands can better meet the requirements of different tasks such as power grasping and precision pinching. Compared with the traditional finger-based actuation strategy, the joint coupling scheme achieved better anthropomorphic performance and a larger workspace. These findings will benefit the mechanical structure design and control methods of anthropomorphic hands.
Affiliation(s)
- Bingchen Liu
  - State Key Laboratory of Robotics and Systems, Harbin Institute of Technology (HIT), Harbin 150001, P. R. China
- Li Jiang
  - State Key Laboratory of Robotics and Systems, Harbin Institute of Technology (HIT), Harbin 150001, P. R. China
- Shaowei Fan
  - State Key Laboratory of Robotics and Systems, Harbin Institute of Technology (HIT), Harbin 150001, P. R. China
3
Wu H, Song C, Yue S, Wang Z, Xiao J, Liu Y. Dynamic video mix-up for cross-domain action recognition. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2021.11.054] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/28/2022]
4
Liu B, Jiang L, Fan S, Dai J. Learning Grasp Configuration Through Object-Specific Hand Primitives for Posture Planning of Anthropomorphic Hands. Front Neurorobot 2021; 15:740262. [PMID: 34603004 PMCID: PMC8480411 DOI: 10.3389/fnbot.2021.740262] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2021] [Accepted: 08/20/2021] [Indexed: 11/13/2022] Open
Abstract
The proposal of postural synergy theory has provided a new approach to solve the problem of controlling anthropomorphic hands with multiple degrees of freedom. However, generating the grasp configuration for new tasks in this context remains challenging. This study proposes a method to learn grasp configuration according to the shape of the object by using postural synergy theory. By referring to past research, an experimental paradigm is first designed that enables the grasping of 50 typical objects in grasping and operational tasks. The angles of the finger joints of 10 subjects were then recorded while they performed these tasks. Following this, four hand primitives were extracted by using principal component analysis, and a low-dimensional synergy subspace was established. The problem of planning the trajectories of the joints was thus transformed into that of determining the synergy input for trajectory planning in low-dimensional space. The average synergy inputs for the trajectories of each task were obtained through Gaussian mixture regression, and several Gaussian processes were trained to infer the input trajectories from a given shape descriptor for similar tasks. Finally, the feasibility of the proposed method was verified by simulations involving the generation of grasp configurations for prosthetic hand control. The error in the reconstructed posture was compared with those obtained by using postural synergies in past work. The results show that the proposed method can realize movements similar to those of the human hand during grasping actions, and its range of use can be extended from simple grasping tasks to complex operational tasks.
Affiliation(s)
- Bingchen Liu
  - State Key Laboratory of Robotics and Systems, Harbin Institute of Technology, Harbin, China
- Li Jiang
  - State Key Laboratory of Robotics and Systems, Harbin Institute of Technology, Harbin, China
- Shaowei Fan
  - State Key Laboratory of Robotics and Systems, Harbin Institute of Technology, Harbin, China
- Jinghui Dai
  - State Key Laboratory of Robotics and Systems, Harbin Institute of Technology, Harbin, China
5
Zhang Y, Zhou Z, Pan W, Bai H, Liu W, Wang L, Lin C. Epilepsy Signal Recognition Using Online Transfer TSK Fuzzy Classifier Underlying Classification Error and Joint Distribution Consensus Regularization. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021; 18:1667-1678. [PMID: 32750863 DOI: 10.1109/tcbb.2020.3002562] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
In this study, an online transfer TSK fuzzy classifier, O-T-TSK-FC, is proposed for recognizing epilepsy signals. Compared with most existing transfer learning models, O-T-TSK-FC has merits in the following three aspects: 1) Since different patients often respond to the same neuronal firing stimulation in different neural manners, the labeled data in the source domain cannot accurately represent the primary EEG data in the target domain. Therefore, we design an objective function that can integrate subject-specific data in the target domain to induce the target predictive function. 2) A new regularization used for knowledge transfer is proposed from the perspective of error consensus, and its rationality is explained from the perspective of probability density estimation. 3) Clustering is used to partition source domains so as to reduce the computation of O-T-TSK-FC without affecting its performance. Based on the EEG signals collected from Bonn University, six different online scenarios for transfer learning are constructed. Experimental results on them show that O-T-TSK-FC performs better and more robustly than the benchmark algorithms.
6
Smart aging monitoring and early dementia recognition (SAMEDR): uncovering the hidden wellness parameter for preventive well-being monitoring to categorize cognitive impairment and dementia in community-dwelling elderly subjects through AI. Neural Comput Appl 2021. [DOI: 10.1007/s00521-021-06139-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 10/21/2022]
Abstract
The weakening of reasoning caused by dementia degrades performance in activities of daily living (ADL). The present work identifies care needs and dangers and monitors the effect of dementia on an individual. It contrasts ADL patterns between dementia-affected people and healthy elderly people using heterogeneous sensors. More than 300,000 sensor activation records were collected from dementia patients and healthy controls with wellness sensor networks. The generated ADLs were visualized and interpreted through activity maps, diversity, and other wellness parameters to categorize healthy and dementia-affected elderly subjects. Diversity differed significantly between diseased and healthy subjects. Heterogeneous unobtrusive sensor data are used to evaluate behavioral patterns associated with ADL, which helps to reveal the impact of cognitive degradation and to measure ADL variation throughout dementia. The primary focus of activity recognition in the current research is to transfer models of dementia subject occupied homes to generalized age-matched healthy subject data models to utilize new services, label classified datasets, and produce limited datasets due to less training. The research proposes a novel Smart Aging Monitoring and Early Dementia Recognition system that exchanges data models between dementia subject occupied homes (DSOH) and healthy subject occupied homes (HSOH) to address the shortage of training data. The key attributes are then mapped onto each other using a sensor data fusion that retains the diversity among the various HSOH and DSOH while reducing the divergence between them.
Additional tests were conducted to quantify the quality of the proposed framework: first, against the accuracy of feature mapping techniques; next, the merit of categorizing data at DSOH; and last, the ability of the proposed framework to function well with noisy data. The results are encouraging and highlight the limits of the proposed approach.
7
Hang W, Liang S, Choi KS, Chung FL, Wang S. Selective Transfer Classification Learning With Classification-Error-Based Consensus Regularization. IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE 2021. [DOI: 10.1109/tetci.2019.2892762] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/06/2022]
8
Kouw WM, Loog M. A Review of Domain Adaptation without Target Labels. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2021; 43:766-785. [PMID: 31603771 DOI: 10.1109/tpami.2019.2945942] [Citation(s) in RCA: 112] [Impact Index Per Article: 28.0] [Indexed: 05/22/2023]
Abstract
Domain adaptation has become a prominent problem setting in machine learning and related fields. This review asks the question: How can a classifier learn from a source domain and generalize to a target domain? We present a categorization of approaches, divided into what we refer to as sample-based, feature-based, and inference-based methods. Sample-based methods focus on weighting individual observations during training based on their importance to the target domain. Feature-based methods revolve around mapping, projecting, and representing features such that a source classifier performs well on the target domain. Inference-based methods incorporate adaptation into the parameter estimation procedure, for instance through constraints on the optimization procedure. Additionally, we review a number of conditions that allow for formulating bounds on the cross-domain generalization error. Our categorization highlights recurring ideas and raises questions important to further research.
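The sample-based idea described in this abstract, weighting source observations by their importance to the target domain, can be sketched with a toy example. This is an illustrative assumption and not code from the review: the discrete distributions `p_source` and `p_target`, the per-sample losses, and the helper names are all hypothetical, and the density ratio is taken as known, whereas in practice it has to be estimated.

```python
def importance_weights(xs, p_source, p_target):
    """Weight each source sample x by p_target(x) / p_source(x)."""
    return [p_target[x] / p_source[x] for x in xs]


def weighted_mean(values, weights):
    """Self-normalized weighted average."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical toy setup: one discrete feature with known domain distributions.
p_source = {"a": 0.8, "b": 0.2}
p_target = {"a": 0.5, "b": 0.5}
loss = {"a": 0.0, "b": 1.0}  # per-sample loss of some fixed classifier

source_xs = ["a"] * 8 + ["b"] * 2  # samples in source proportions
losses = [loss[x] for x in source_xs]

# Unweighted average: the risk on the source domain.
naive = sum(losses) / len(losses)

# Importance-weighted average: an estimate of the risk on the target domain.
weights = importance_weights(source_xs, p_source, p_target)
adapted = weighted_mean(losses, weights)

if __name__ == "__main__":
    print(naive, adapted)
```

In this toy case the weighted estimate matches the true target risk, 0.5 * 0 + 0.5 * 1 = 0.5, while the unweighted source average underestimates it because sample "b" is rare in the source domain.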
9
Zheng Q, He Z, Liang C, Chen J, Lin CW, Tao D. Transferring fashion to surveillance with weak labels. Neural Comput Appl 2020. [DOI: 10.1007/s00521-020-05528-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
10
Human action recognition with bag of visual words using different machine learning methods and hyperparameter optimization. Neural Comput Appl 2019. [DOI: 10.1007/s00521-019-04365-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Indexed: 10/26/2022]
11
Kittler J, Zor C. Delta Divergence: A Novel Decision Cognizant Measure of Classifier Incongruence. IEEE TRANSACTIONS ON CYBERNETICS 2019; 49:2331-2343. [PMID: 29993566 DOI: 10.1109/tcyb.2018.2825353] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 06/08/2023]
Abstract
In pattern recognition, disagreement between two classifiers regarding the predicted class membership of an observation can be indicative of an anomaly and its nuance. Since, in general, classifiers base their decisions on class a posteriori probabilities, the most natural approach to detecting classifier incongruence is to use divergence. However, existing divergences are not particularly suitable to gauge classifier incongruence. In this paper, we postulate the properties that a divergence measure should satisfy and propose a novel divergence measure, referred to as delta divergence. In contrast to existing measures, it focuses on the dominant (most probable) hypotheses and, thus, reduces the effect of the probability mass distributed over the non dominant hypotheses (clutter). The proposed measure satisfies other important properties, such as symmetry, and independence of classifier confidence. The relationship of the proposed divergence to some baseline measures, and its superiority, is shown experimentally.
12
Prevete R, Donnarumma F, d'Avella A, Pezzulo G. Evidence for sparse synergies in grasping actions. Sci Rep 2018; 8:616. [PMID: 29330467 PMCID: PMC5766604 DOI: 10.1038/s41598-017-18776-y] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 06/02/2017] [Accepted: 11/30/2017] [Indexed: 01/09/2023] Open
Abstract
Converging evidence shows that hand-actions are controlled at the level of synergies and not single muscles. One intriguing aspect of synergy-based action-representation is that it may be intrinsically sparse and the same synergies can be shared across several distinct types of hand-actions. Here, adopting a normative angle, we consider three hypotheses for hand-action optimal-control: sparse-combination hypothesis (SC) – sparsity in the mapping between synergies and actions - i.e., actions implemented using a sparse combination of synergies; sparse-elements hypothesis (SE) – sparsity in synergy representation – i.e., the mapping between degrees-of-freedom (DoF) and synergies is sparse; double-sparsity hypothesis (DS) – a novel view combining both SC and SE – i.e., both the mapping between DoF and synergies and between synergies and actions are sparse, each action implementing a sparse combination of synergies (as in SC), each using a limited set of DoFs (as in SE). We evaluate these hypotheses using hand kinematic data from six human subjects performing nine different types of reach-to-grasp actions. Our results support DS, suggesting that the best action representation is based on a relatively large set of synergies, each involving a reduced number of degrees-of-freedom, and that distinct sets of synergies may be involved in distinct tasks.
Affiliation(s)
- Roberto Prevete
  - Department of Electric Engineering and Information Technologies (DIETI), Università di Napoli Federico II, Naples, Italy
- Francesco Donnarumma
  - Institute of Cognitive Sciences and Technologies, National Research Council (ISTC-CNR), Via S. Martino della Battaglia, 44, 00185, Rome, Italy
- Andrea d'Avella
  - Department of Biomedical and Dental Sciences and Morphofunctional Imaging, University of Messina, Messina, Italy
  - Laboratory of Neuromotor Physiology, Santa Lucia Foundation, Rome, Italy
- Giovanni Pezzulo
  - Institute of Cognitive Sciences and Technologies, National Research Council (ISTC-CNR), Via S. Martino della Battaglia, 44, 00185, Rome, Italy
13
Yan K, Kou L, Zhang D. Learning Domain-Invariant Subspace Using Domain Features and Independence Maximization. IEEE TRANSACTIONS ON CYBERNETICS 2018; 48:288-299. [PMID: 28092587 DOI: 10.1109/tcyb.2016.2633306] [Citation(s) in RCA: 50] [Impact Index Per Article: 7.1] [Indexed: 06/06/2023]
Abstract
Domain adaptation algorithms are useful when the distributions of the training and the test data are different. In this paper, we focus on the problem of instrumental variation and time-varying drift in the field of sensors and measurement, which can be viewed as discrete and continuous distributional change in the feature space. We propose maximum independence domain adaptation (MIDA) and semi-supervised MIDA to address this problem. Domain features are first defined to describe the background information of a sample, such as the device label and acquisition time. Then, MIDA learns a subspace which has maximum independence with the domain features, so as to reduce the interdomain discrepancy in distributions. A feature augmentation strategy is also designed to project samples according to their backgrounds so as to improve the adaptation. The proposed algorithms are flexible and fast. Their effectiveness is verified by experiments on synthetic datasets and four real-world ones on sensors, measurement, and computer vision. They can greatly enhance the practicability of sensor systems, as well as extend the application scope of existing domain adaptation algorithms by uniformly handling different kinds of distributional change.
14
Kankanhalli M. Benchmarking a Multimodal and Multiview and Interactive Dataset for Human Action Recognition. IEEE TRANSACTIONS ON CYBERNETICS 2017; 47:1781-1794. [PMID: 27429453 DOI: 10.1109/tcyb.2016.2582918] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Indexed: 06/06/2023]
Abstract
Human action recognition is an active research area in both the computer vision and machine learning communities. In the past decades, the machine learning problem has evolved from the conventional single-view learning problem to cross-view learning, cross-domain learning, and multitask learning, for which a large number of algorithms have been proposed in the literature. Despite the large number of action recognition datasets, most of them are designed for a subset of the four learning problems, and comparisons between algorithms can be further limited by variances within datasets, experimental configurations, and other factors. To the best of our knowledge, there exists no dataset that allows concurrent analysis of the four learning problems. In this paper, we introduce a novel multimodal and multiview and interactive (M2I) dataset, which is designed for the evaluation of human action recognition methods under all four scenarios. This dataset consists of 1760 action samples from 22 action categories, including nine person-person interactive actions and 13 person-object interactive actions. We systematically benchmark state-of-the-art approaches on the M2I dataset on all four learning problems. Overall, we evaluated 13 approaches with nine popular feature and descriptor combinations. Our comprehensive analysis demonstrates that the M2I dataset is challenging due to significant intraclass and view variations and multiple similar action categories, and that it provides a solid foundation for the evaluation of existing state-of-the-art algorithms.
15
Rodriguez M, Orrite C, Medrano C, Makris D. One-Shot Learning of Human Activity With an MAP Adapted GMM and Simplex-HMM. IEEE TRANSACTIONS ON CYBERNETICS 2017; 47:1769-1780. [PMID: 28113739 DOI: 10.1109/tcyb.2016.2558447] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 06/06/2023]
Abstract
This paper presents a novel activity class representation using a single sequence for training. The contribution of this representation lies in the ability to train a one-shot learning recognition system, useful in new scenarios where capturing and labeling sequences is expensive or impractical. The method uses a universal background model of local descriptors obtained from source databases available online and adapts it to a new sequence in the target scenario through a maximum a posteriori adaptation. Each activity sample is encoded as a sequence of normalized bags of features and modeled by a new hidden Markov model formulation, where the expectation-maximization algorithm for training is modified to deal with observations consisting of vectors in a unit simplex. Extensive recognition experiments have been performed using one-shot learning over the public datasets Weizmann, KTH, and IXMAS. These experiments demonstrate the discriminative properties of the representation and the validity of its application in recognition systems, achieving state-of-the-art results.
16
Zhang J, Han Y, Tang J, Hu Q, Jiang J. Semi-Supervised Image-to-Video Adaptation for Video Action Recognition. IEEE TRANSACTIONS ON CYBERNETICS 2017; 47:960-973. [PMID: 26992186 DOI: 10.1109/tcyb.2016.2535122] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Indexed: 06/05/2023]
Abstract
Human action recognition has been well explored in applications of computer vision. Many successful action recognition methods have shown that action knowledge can be effectively learned from motion videos or still images. For the same action, the appropriate action knowledge learned from different types of media, e.g., videos or images, may be related. However, less effort has been made to improve the performance of action recognition in videos by adapting the action knowledge conveyed from images to videos. Most of the existing video action recognition methods suffer from the problem of lacking sufficient labeled training videos. In such cases, over-fitting would be a potential problem and the performance of action recognition is restrained. In this paper, we propose an adaptation method to enhance action recognition in videos by adapting knowledge from images. The adapted knowledge is utilized to learn the correlated action semantics by exploring the common components of both labeled videos and images. Meanwhile, we extend the adaptation method to a semi-supervised framework which can leverage both labeled and unlabeled videos. Thus, the over-fitting can be alleviated and the performance of action recognition is improved. Experiments on public benchmark datasets and real-world datasets show that our method outperforms several other state-of-the-art action recognition methods.
17
Zhang L, Shum HPH, Shao L. Manifold Regularized Experimental Design for Active Learning. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2017; 26:969-981. [PMID: 28114017 DOI: 10.1109/tip.2016.2635440] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 06/06/2023]
Abstract
Various machine learning and data mining tasks in classification require abundant data samples to be labeled for training. Conventional active learning methods aim at labeling the most informative samples to alleviate the labor of the user. Many previous studies in active learning select one sample after another in a greedy manner. However, this is not very effective because the classification model has to be retrained for each newly labeled sample. Moreover, many popular active learning approaches utilize the most uncertain samples by leveraging the classification hyperplane of the classifier, which is not appropriate since the classification hyperplane is inaccurate when the training data are small-sized. The problem of insufficient training data in real-world systems limits the potential applications of these approaches. This paper presents a novel method of active learning called manifold regularized experimental design (MRED), which can label multiple informative samples at one time for training. In addition, MRED gives an explicit geometric explanation for the selected samples to be labeled by the user. Different from existing active learning methods, our method avoids the intrinsic problems caused by insufficiently labeled samples in real-world applications. Various experiments on synthetic datasets, the Yale face database and the Corel image database have been carried out to show how MRED outperforms existing methods.
18
Zhang Z, Liu S, Wang C, Xiao B, Zhou W. Multiple Continuous Virtual Paths Based Cross-View Action Recognition. INT J PATTERN RECOGN 2016. [DOI: 10.1142/s0218001416550144] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022]
Abstract
In this paper, we propose a novel method for cross-view action recognition via multiple continuous virtual paths which connect the source view and the target view. Each point on one virtual path is a virtual view which is obtained by a linear transformation of an action descriptor. All the virtual views are concatenated into an infinite-dimensional feature to characterize continuous changes from the source to the target view. To utilize these infinite-dimensional features directly, we propose a virtual view kernel (VVK) to compute the similarity between two infinite-dimensional features, which can be readily used to construct any kernelized classifier. In addition, a constraint term is introduced to fully utilize the information contained in the unlabeled samples, which are easier to obtain from the target view. The rationale behind the constraint is that any action video belongs to only one class. To further explore complementary visual information, we utilize multiple continuous virtual paths. The original source and target views are projected to different auxiliary source and target views using the random projection technique. Then we fuse all the VVKs generated from all pairs of auxiliary views. Our method is verified on the IXMAS and MuHAVi datasets, and the experimental results demonstrate that our method achieves better performance than the state-of-the-art methods.
Affiliation(s)
- Zhong Zhang
  - Tianjin Key Laboratory of Wireless Mobile, Communications and Power Transmission, Tianjin Normal University, Tianjin 300387, P. R. China
- Shuang Liu
  - Tianjin Key Laboratory of Wireless Mobile, Communications and Power Transmission, Tianjin Normal University, Tianjin 300387, P. R. China
- Chunheng Wang
  - The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, P. R. China
- Baihua Xiao
  - The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, P. R. China
- Wen Zhou
  - The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, P. R. China
19
Liu F, Xu X, Qiu S, Qing C, Tao D. Simple to Complex Transfer Learning for Action Recognition. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2016; 25:949-960. [PMID: 26841395 DOI: 10.1109/tip.2015.2512107] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Indexed: 06/05/2023]
Abstract
Recognizing complex human actions is very challenging, since training a robust learning model requires a large amount of labeled data, which is difficult to acquire. Considering that each complex action is composed of a sequence of simple actions which can be easily obtained from existing data sets, this paper presents a simple to complex action transfer learning model (SCA-TLM) for complex human action recognition. SCA-TLM improves the performance of complex action recognition by leveraging the abundant labeled simple actions. In particular, it optimizes the weight parameters so that the complex actions to be learned can be reconstructed from simple actions. The optimal reconstruction coefficients are acquired by minimizing the objective function, and the target weight parameters are then represented as a combination of source weight parameters. The main advantage of the proposed SCA-TLM compared with existing approaches is that we exploit simple actions to recognize complex actions instead of only using complex actions as training samples. To validate the proposed SCA-TLM, we conduct extensive experiments on two well-known complex action data sets: 1) Olympic Sports data set and 2) UCF50 data set. The results show the effectiveness of the proposed SCA-TLM for complex action recognition.
20
Gu X, Chung FL, Ishibuchi H, Wang S. Multitask Coupled Logistic Regression and its Fast Implementation for Large Multitask Datasets. IEEE TRANSACTIONS ON CYBERNETICS 2015; 45:1953-1966. [PMID: 25423663 DOI: 10.1109/tcyb.2014.2362771] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 06/04/2023]
Abstract
When facing multitask-learning problems, it is desirable that the learning method find the correct input-output features, share the commonality among multiple domains, and scale up to large multitask datasets. We introduce the multitask coupled logistic regression (LR) framework called LR-based multitask classification learning algorithm (MTC-LR), which is a new method for generating a classifier for each task, capable of sharing the commonality among multitask domains. The basic idea of MTC-LR is to use individual LR-based classifiers, one for each task domain, but, in contrast to other support vector machine (SVM)-based proposals, to learn the parameter vectors of all individual classifiers globally by using the conjugate gradient method, without the kernel trick, in a way that is easily extended to a scaled version. We theoretically show that the addition of a new term in the cost function of the set of LRs (that penalizes the diversity among multiple tasks) produces a coupling of multiple tasks that allows MTC-LR to improve the learning performance in a LR way. This finding makes it easy to integrate MTC-LR with a state-of-the-art fast LR algorithm called dual coordinate descent method (CDdual) to develop its fast version, MTC-LR-CDdual, for large multitask datasets. The proposed algorithm MTC-LR-CDdual is also theoretically analyzed. Our experimental results on artificial and real datasets indicate the effectiveness of the proposed algorithm MTC-LR-CDdual in classification accuracy, speed, and robustness.
|
21
|
Devanne M, Wannous H, Berretti S, Pala P, Daoudi M, Del Bimbo A. 3-D Human Action Recognition by Shape Analysis of Motion Trajectories on Riemannian Manifold. IEEE TRANSACTIONS ON CYBERNETICS 2015; 45:1340-1352. [PMID: 25216492 DOI: 10.1109/tcyb.2014.2350774] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/03/2023]
Abstract
Recognizing human actions in 3-D video sequences is an important open problem that is currently at the heart of many research domains including surveillance, natural interfaces and rehabilitation. However, the design and development of models for action recognition that are both accurate and efficient is a challenging task due to the variability of the human pose, clothing and appearance. In this paper, we propose a new framework to extract a compact representation of a human action captured through a depth sensor, and enable accurate action recognition. The proposed solution builds on fitting a human skeleton model to acquired data so as to represent the 3-D coordinates of the joints and their change over time as a trajectory in a suitable action space. Thanks to such a 3-D joint-based framework, the proposed solution is capable of capturing both the shape and the dynamics of the human body simultaneously. The action recognition problem is then formulated as the problem of computing the similarity between the shapes of trajectories in a Riemannian manifold. Classification using k-nearest neighbors is finally performed on this manifold, taking advantage of Riemannian geometry in the open curve shape space. Experiments are carried out on four representative benchmarks to demonstrate the potential of the proposed solution in terms of accuracy and latency for low-latency action recognition. Comparative results with state-of-the-art methods are reported.
|
22
|
Liu AA, Su YT, Jia PP, Gao Z, Hao T, Yang ZX. Multipe/single-view human action recognition via part-induced multitask structural learning. IEEE TRANSACTIONS ON CYBERNETICS 2015; 45:1194-1208. [PMID: 25167566 DOI: 10.1109/tcyb.2014.2347057] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/03/2023]
Abstract
This paper proposes a unified framework for multiple/single-view human action recognition. First, we propose the hierarchical partwise bag-of-words representation, which encodes both local and global visual saliency based on the body structure cue. Then, we formulate multiple/single-view human action recognition as a part-regularized multitask structural learning (MTSL) problem, which has two advantages for both model learning and feature selection: 1) preserving the consistency between the body-based action classification and the part-based action classification with the complementary information among different action categories and multiple views and 2) discovering both action-specific and action-shared feature subspaces to strengthen the generalization ability of model learning. Moreover, we contribute two novel human action recognition datasets, TJU (a single-view multimodal dataset) and MV-TJU (a multiview multimodal dataset). The proposed method is validated on three kinds of challenging datasets, including two single-view RGB datasets (KTH and TJU), two well-known depth datasets (MSR action 3-D and MSR daily activity 3-D), and one novel multiview multimodal dataset (MV-TJU). The extensive experimental results show that this method can outperform the popular 2-D/3-D part model-based methods and several other competing methods for multiple/single-view human action recognition in both RGB and depth modalities. To our knowledge, this paper is the first to demonstrate the applicability of MTSL with part-based regularization to multiple/single-view human action recognition in both RGB and depth modalities.
|
23
|
Zhang S, Yao H, Sun X, Wang K, Zhang J, Lu X, Zhang Y. Action recognition based on overcomplete independent components analysis. Inf Sci (N Y) 2014. [DOI: 10.1016/j.ins.2013.12.052] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
|
24
|
Liu L, Shao L, Zhen X, Li X. Learning discriminative key poses for action recognition. IEEE TRANSACTIONS ON CYBERNETICS 2013; 43:1860-1870. [PMID: 23757577 DOI: 10.1109/tsmcb.2012.2231959] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/02/2023]
Abstract
In this paper, we present a new approach for human action recognition based on key-pose selection and representation. Poses in video frames are described by the proposed extensive pyramidal features (EPFs), which include the Gabor, Gaussian, and wavelet pyramids. These features are able to encode the orientation, intensity, and contour information and therefore provide an informative representation of human poses. Because not all poses in a sequence are discriminative and representative, we further utilize the AdaBoost algorithm to learn a subset of discriminative poses. Given the boosted poses for each video sequence, a new classifier named weighted local naive Bayes nearest neighbor is proposed for the final action classification, which is demonstrated to be more accurate and robust than other classifiers, e.g., support vector machine (SVM) and naive Bayes nearest neighbor. The proposed method is systematically evaluated on the KTH data set, the Weizmann data set, the multiview IXMAS data set, and the challenging HMDB51 data set. Experimental results show that our method outperforms the state-of-the-art techniques in terms of recognition rate.
|
25
|
Seah CW, Ong YS, Tsang IW. Combating Negative Transfer From Predictive Distribution Differences. IEEE TRANSACTIONS ON CYBERNETICS 2013; 43:1153-1165. [PMID: 26502426 DOI: 10.1109/tsmcb.2012.2225102] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/05/2023]
Abstract
Domain adaptation (DA), which leverages labeled data from related source domains, comes in handy when the label information of the target domain is scarce or unavailable. However, as the source data do not come from the same origin as that of the target domain, the predictive distributions of the source and target domains are likely to differ in reality. At the extreme, the predictive distributions of the source domains can differ completely from that of the target domain. In such a case, using the learned source classifier to assist in the prediction of target data can result in prediction performance that is poorer than that obtained when the source data are omitted. This phenomenon is known as negative transfer, and its impact is known to be more severe in the multiclass context. To combat negative transfer due to differing predictive distributions across domains, we first introduce the notion of positive transferability for the assessment of synergy between the source and target domains in their prediction models, and we also propose a criterion to measure the positive transferability between sample pairs of different domains in terms of their prediction distributions. With the new measure, a predictive distribution matching (PDM) regularizer and a PDM framework learn the target classifier by favoring source data with large positive transferability while inferring the labels of target unlabeled data. Extensive experiments are conducted to validate the efficacy of the proposed PDM framework using several commonly used multidomain benchmark data sets, including Sentiment, Reuters, and Newsgroup, in the context of both binary-class and multiclass domains. Subsequently, the PDM framework is put to work on a real-world scenario pertaining to water cluster molecule identification. The experimental results illustrate the adverse impact of negative transfer on several state-of-the-art DA methods, whereas the proposed framework exhibits excellent and robust predictive performance.
|