1. Chai Z, Qin H. Dynamic Motion Transition: A Hybrid Data-Driven and Model-Driven Method for Human Pose Transitions. IEEE Transactions on Visualization and Computer Graphics 2025; 31:1848-1861. PMID: 38427540; DOI: 10.1109/tvcg.2024.3372421.
Abstract
The rapid, accurate, and robust computation of virtual human figures' "in-between" pose transitions from available, and sometimes sparse, inputs is of fundamental significance to 3D interactive graphics and computer animation. Various methods have been proposed in recent decades to produce natural, lifelike human pose transitions automatically. Nevertheless, conventional purely model-driven methods require heuristic knowledge (e.g., least motion guided by physical laws) and ad-hoc cues (e.g., splines with non-uniform time warp) that are difficult to obtain, learn, and infer. With the rapid emergence of large-scale datasets readily available to animators in recent years, deep models afford a powerful alternative for tackling these challenges. However, purely data-driven methods still suffer from remaining challenges, such as unseen data in practice and limited generative power in model/domain/data transfer, and the measurement of generative power has generally been omitted in these works. In essence, data-driven methods rely solely on the quality and quantity of their training datasets. In this paper, we propose a hybrid approach built upon the seamless integration of data-driven and model-driven methods, called Dynamic Motion Transition (DMT), with the following salient modeling advantages: (1) data augmentation based on limited captured human locomotion data and the concept of force derived directly from physical laws; (2) at the fine level, learning of the forces that drive skeleton joints to move, with a Conditional Temporal Transformer (CTT) trained to learn local force changes; and (3) at the coarse level, the effective and flexible creation of the subsequent step motion using Dynamic Movement Primitives (DMP) until the target is reached.
Our extensive experiments confirm that our model outperforms state-of-the-art methods under the newly devised metric, by virtue of the least-action loss function. In addition, our novel method and system are of immediate benefit to many other animation tasks, such as motion synthesis and control, and motion tracking and prediction, in this big-data graphics era.
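The Dynamic Movement Primitives named in contribution (3) follow a standard point-attractor formulation; the sketch below is a minimal one-dimensional DMP rollout, assuming the common gain choice β = α/4 for critical damping and an optional forcing term (the paper's CTT-learned forcing is not reproduced here):

```python
import numpy as np

def dmp_rollout(x0, goal, tau=1.0, alpha=25.0, dt=0.01, steps=500, forcing=None):
    """Minimal point-attractor Dynamic Movement Primitive.

    Integrates  tau * dv/dt = alpha * (beta * (goal - x) - v) + f(t)
                tau * dx/dt = v
    with beta = alpha / 4 (critical damping). `forcing` is an optional
    callable f(step) -> float; with forcing=None the trajectory converges
    smoothly to `goal`.
    """
    beta = alpha / 4.0
    x, v = float(x0), 0.0
    traj = [x]
    for t in range(steps):
        f = forcing(t) if forcing is not None else 0.0
        dv = (alpha * (beta * (goal - x) - v) + f) / tau
        v += dv * dt           # semi-implicit Euler step on velocity
        x += v * dt / tau      # then on position
        traj.append(x)
    return np.array(traj)
```

With the forcing term omitted the rollout is just a critically damped spring; in a DMT-style pipeline the learned forcing term would shape the transient toward captured motion while the attractor still guarantees that the target pose is reached.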

2. Ma H, Yang Z, Liu H. Fine-Grained Unsupervised Temporal Action Segmentation and Distributed Representation for Skeleton-Based Human Motion Analysis. IEEE Transactions on Cybernetics 2022; 52:13411-13424. PMID: 34932492; DOI: 10.1109/tcyb.2021.3132016.
Abstract
Understanding the fine-grained temporal structure of human actions and its semantic interpretation benefits many real-world tasks, such as the analysis of sports movements, rehabilitation exercises, and daily-life activities. Current action segmentation methods rely mainly on deep neural networks to derive feature embeddings of actions from motion data, while fine-grained analysis of human actions remains underexplored owing to the absence of clear, generic definitions of subactions and of related datasets. Moreover, the motion representations obtained by current action segmentation methods lack the semantic or mathematical interpretability needed to evaluate action/subaction similarity in quantitative motion analysis. Toward the goal of fine-grained, interpretable, scalable, and efficient action segmentation, we propose a novel unsupervised action segmentation and distributed representation framework based on intuitive motion primitives defined on pose data. Metrics for comprehensive evaluation of unsupervised fine-grained action segmentation are proposed, and both public and self-constructed datasets are adopted in the experiments. The results show that the proposed method achieves good performance and generality across different subjects, datasets, and application scenarios.
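The abstract does not define its motion primitives precisely; one common unsupervised heuristic in this family segments a joint-angle trajectory wherever the velocity changes sign (a direction reversal). The sketch below is such a hypothetical segmenter, not the authors' method:

```python
import numpy as np

def segment_by_velocity_sign(angles, min_len=3):
    """Split a 1-D joint-angle trajectory into primitive segments at
    velocity zero-crossings (movement direction reversals). Segments
    shorter than `min_len` frames are merged into the preceding one.
    Returns a list of (start, end) frame index pairs.
    """
    vel = np.diff(angles)
    signs = np.sign(vel)
    boundaries = [0]
    for i in range(1, len(signs)):
        # a boundary where the direction of motion flips
        if signs[i] != 0 and signs[i - 1] != 0 and signs[i] != signs[i - 1]:
            boundaries.append(i)
    boundaries.append(len(angles))
    segments = []
    for s, e in zip(boundaries[:-1], boundaries[1:]):
        if segments and e - s < min_len:
            segments[-1] = (segments[-1][0], e)  # merge short tail
        else:
            segments.append((s, e))
    return segments
```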

3. Harris EJ, Khoo IH, Demircan E. A Survey of Human Gait-Based Artificial Intelligence Applications. Frontiers in Robotics and AI 2022; 8:749274. PMID: 35047564; PMCID: PMC8762057; DOI: 10.3389/frobt.2021.749274.
Abstract
We performed an electronic database search of published works from 2012 to mid-2021 that focus on human gait studies and apply machine learning techniques. We identified six key applications of machine learning using gait data: 1) gait analysis, where analysis techniques and certain biomechanical factors are improved by artificial intelligence algorithms; 2) health and wellness, with applications in gait monitoring for abnormal gait detection, recognition of human activities, fall detection, and sports performance; 3) human pose tracking, using one-person or multi-person tracking and localization systems such as OpenPose and Simultaneous Localization and Mapping (SLAM); 4) gait-based biometrics, with applications in person identification, authentication, and re-identification, as well as gender and age recognition; 5) "smart gait" applications, ranging from smart socks, shoes, and other wearables to smart homes and smart retail stores that incorporate continuous monitoring and control systems; and 6) animation that reconstructs human motion using gait data, simulation, and machine learning techniques. Our goal is to provide a single broad-based survey of the applications of machine learning technology in gait analysis and to identify future areas of potential study and growth. We discuss the machine learning techniques that have been used, with a focus on the tasks they perform, the problems they attempt to solve, and the trade-offs they navigate.
Affiliation(s)
- Elsa J Harris: Human Performance and Robotics Laboratory, Department of Mechanical and Aerospace Engineering, California State University Long Beach, Long Beach, CA, United States
- I-Hung Khoo: Department of Electrical Engineering and Department of Biomedical Engineering, California State University Long Beach, Long Beach, CA, United States
- Emel Demircan: Human Performance and Robotics Laboratory, Department of Mechanical and Aerospace Engineering, and Department of Biomedical Engineering, California State University Long Beach, Long Beach, CA, United States

4. Bekhouch A, Bouchrika I, Doghmane N. Gait biometrics: investigating the use of the lower inner regions for people identification from landmark frames. IET Biometrics 2020. DOI: 10.1049/iet-bmt.2020.0001.
Affiliation(s)
- Amara Bekhouch: Faculty of Science and Technology, University of Souk Ahras, Souk Ahras 41000, Algeria
- Imed Bouchrika: Faculty of Science and Technology, University of Souk Ahras, Souk Ahras 41000, Algeria
- Nouredine Doghmane: Department of Computer Science, University of Annaba, Annaba 23000, Algeria

5. Joukov V, Cesic J, Westermann K, Markovic I, Petrovic I, Kulic D. Estimation and Observability Analysis of Human Motion on Lie Groups. IEEE Transactions on Cybernetics 2020; 50:1321-1332. PMID: 31567105; DOI: 10.1109/tcyb.2019.2933390.
Abstract
This article proposes a framework for human pose estimation from wearable sensors that relies on a Lie group representation to model the geometry of human movement. Human body joints are modeled by matrix Lie groups, using the special orthogonal groups SO(2) and SO(3) for joint poses and the special Euclidean group SE(3) for the base-link pose. To estimate human joint pose, velocity, and acceleration, we develop the equations for employing the extended Kalman filter on Lie groups (LG-EKF), explicitly accounting for the non-Euclidean geometry of the state space. We present an observability analysis of an arbitrarily long kinematic chain of SO(3) elements based on a differential-geometric approach, representing a generalization of the kinematic chains of a human body. Observability is investigated for the system using marker position measurements. The proposed algorithm is compared with two competing approaches, 1) the extended Kalman filter (EKF) and 2) the unscented Kalman filter (UKF), both based on the Euler-angle parametrization, in simulations and in extensive real-world experiments. The results show that the proposed approach achieves significant improvements over the Euler-angle-based filters: it provides more accurate pose estimates, is not sensitive to gimbal lock, and estimates covariances more consistently.
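The SO(3) machinery underlying the LG-EKF can be illustrated with the exponential map and an on-manifold state propagation; the sketch below uses the standard Rodrigues formula and omits the filter's covariance update entirely:

```python
import numpy as np

def skew(w):
    """Map R^3 -> so(3): the skew-symmetric matrix [w]_x."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_so3(w):
    """Rodrigues formula: exponential map from so(3) to SO(3)."""
    theta = np.linalg.norm(w)
    if theta < 1e-9:
        return np.eye(3) + skew(w)  # first-order approximation near zero
    K = skew(w / theta)
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def propagate(R, omega, dt):
    """On-manifold propagation R_{k+1} = R_k * Exp(omega * dt): the state
    stays a valid rotation matrix, with no Euler angles involved."""
    return R @ exp_so3(np.asarray(omega, dtype=float) * dt)
```

Because the state is advanced by group composition rather than by integrating Euler angles, the estimate remains on the SO(3) manifold and gimbal lock cannot occur, which is the geometric point the abstract makes.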

6. Li P, Chen S. Shared Gaussian Process Latent Variable Model for Incomplete Multiview Clustering. IEEE Transactions on Cybernetics 2020; 50:61-73. PMID: 30176618; DOI: 10.1109/tcyb.2018.2863790.
Abstract
In recent years, many multiview learning methods have been proposed that integrate the complementary information of multiple views and can significantly improve the performance of machine learning tasks compared with single-view methods. However, most of these methods fail to learn good models when the multiview data are unpaired (or partially paired) or incomplete (or partially complete). Although some previous attempts have been made to address these problems, they often yield poor results when dealing with incomplete multiview data containing a relatively large number of missing instances. In fact, the incomplete problem is more challenging than the unpaired problem, since less shared information can be captured by the model in the former case. In this paper, we propose a shared Gaussian process (GP) latent variable model for incomplete multiview clustering that gains the merits of both worlds (i.e., GPs and multiview learning). Specifically, it jointly learns a set of intentionally aligned representative auxiliary points in the individual views, not only to compensate for missing instances but also to implement a group-level constraint. Thus, the shared information among the views can be explicitly built into the model. All hyper-parameters and auxiliary points are learned simultaneously by variational inference. Compared with existing methods, our method naturally inherits the advantages of GPs. Furthermore, it extends straightforwardly to cases with more than two views without adding any complexity to the formulation. In the experiments, we compare it with state-of-the-art methods for incomplete multiview data clustering to demonstrate its superiority.

7. Wang SH, Sun J, Phillips P, Zhao G, Zhang YD. Polarimetric synthetic aperture radar image segmentation by convolutional neural network using graphical processing units. Journal of Real-Time Image Processing 2018; 15:631-642. DOI: 10.1007/s11554-017-0717-0.

8. Satoh S. Person Reidentification via Discrepancy Matrix and Matrix Metric. IEEE Transactions on Cybernetics 2018; 48:3006-3020. PMID: 28991756; DOI: 10.1109/tcyb.2017.2755044.
Abstract
Person reidentification (re-id), an important task in video surveillance and forensics applications, has been widely studied. Previous research efforts toward solving the person re-id problem have primarily focused on constructing robust vector descriptions by exploiting appearance characteristics, or on learning a discriminative distance metric from labeled vectors. Based on the human cognition and identification process, we propose a new pattern that transforms the feature description from a characteristic vector to a discrepancy matrix. In particular, to better identify a person, it converts the distance metric from a vector metric to a matrix metric, which consists of intra-discrepancy and inter-discrepancy projection parts. We introduce a consistency term and a discriminative term to form the objective function. To solve it efficiently, we use a simple gradient-descent method within an alternating optimization process over the two projections. Experimental results on public datasets demonstrate the effectiveness of the proposed pattern compared with state-of-the-art approaches.

9. Saeed A, Ozcelebi T, Lukkien J. Synthesizing and Reconstructing Missing Sensory Modalities in Behavioral Context Recognition. Sensors 2018; 18:2967. PMID: 30200575; PMCID: PMC6165109; DOI: 10.3390/s18092967.
Abstract
Detection of human activities, along with the associated context, is of key importance for various application areas, including assisted living and well-being. To predict a user's context in daily-life situations, a system needs to learn from multimodal data that are often imbalanced and noisy, with missing values. The model is also likely to encounter missing sensors in real-life conditions (such as a user not wearing a smartwatch), and it fails to infer the context if any of the modalities used for training are missing. In this paper, we propose a method based on an adversarial autoencoder for handling missing sensory features and synthesizing realistic samples. We empirically demonstrate the capability of our method, in comparison with classical approaches for filling in missing values, on a large-scale activity recognition dataset collected in the wild. We develop a fully connected classification network by extending the encoder and systematically evaluate its multi-label classification performance when several modalities are missing. Furthermore, we show class-conditional artificial data generation, with visual and quantitative analysis on a context classification task, demonstrating the strong generative power of adversarial autoencoders.
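The imputation idea can be illustrated without the adversarial component: train a reconstruction model on inputs whose modality has been masked out. The sketch below collapses the encoder-decoder to a single linear map fitted in closed form on toy data; the modality sizes, the A/B correlation, and the least-squares fit are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: modality A (2 dims) and modality B (2 dims), with B = 2*A,
# standing in for two correlated sensor streams (e.g., phone and watch IMU).
A = rng.normal(size=(200, 2))
X_full = np.hstack([A, 2.0 * A])        # complete feature vectors

# Corrupt training inputs by dropping modality B (a "missing sensor").
X_masked = X_full.copy()
X_masked[:, 2:] = 0.0

# "Encoder-decoder" collapsed to one linear map, fitted by least squares.
# A real adversarial autoencoder uses nonlinear nets plus a discriminator;
# this only illustrates learning to reconstruct the missing modality.
W, *_ = np.linalg.lstsq(X_masked, X_full, rcond=None)

def impute(x_masked):
    """Reconstruct the full feature vector from a modality-masked input."""
    return x_masked @ W
```

Because modality B is a deterministic function of modality A in this toy setup, the fitted map recovers it exactly; with real sensor data the reconstruction is only approximate, which is why the paper evaluates against classical imputation baselines.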
Affiliation(s)
- Aaqib Saeed: Department of Mathematics and Computer Science, Eindhoven University of Technology, Eindhoven, The Netherlands
- Tanir Ozcelebi: Department of Mathematics and Computer Science, Eindhoven University of Technology, Eindhoven, The Netherlands
- Johan Lukkien: Department of Mathematics and Computer Science, Eindhoven University of Technology, Eindhoven, The Netherlands

10. Feature Representation and Data Augmentation for Human Activity Classification Based on Wearable IMU Sensor Data Using a Deep LSTM Neural Network. Sensors 2018; 18:2892. PMID: 30200377; PMCID: PMC6165524; DOI: 10.3390/s18092892.
Abstract
Wearable inertial measurement unit (IMU) sensors are powerful enablers for the acquisition of motion data. Specifically, in human activity recognition (HAR), IMU sensor data collected from human motion are categorically combined to form datasets for learning human activities. However, successful learning of human activities from motion data requires the design of proper feature representations of IMU sensor data and suitable classifiers. Furthermore, the scarcity of labelled data impedes understanding of the performance capabilities of data-driven learning models. To tackle these challenges, this article makes two primary contributions: first, a spectrogram-based feature extraction approach using raw IMU sensor data; second, an ensemble of data augmentations in feature space to address the data scarcity problem. Performance tests were conducted on a deep long short-term memory (LSTM) neural network architecture to explore the influence of the feature representations and augmentations on activity recognition accuracy. The proposed feature extraction approach, combined with the data augmentation ensemble, produces state-of-the-art accuracy results in HAR. A performance evaluation of each augmentation approach shows its influence on classification accuracy. Finally, in addition to our own dataset, the proposed data augmentation technique is evaluated on the University of California, Irvine (UCI) public online HAR dataset and yields state-of-the-art accuracy at various learning rates.
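The two contributions can be sketched together: a log-magnitude spectrogram as the feature representation, and additive Gaussian jitter as one representative member of an augmentation ensemble. The window length, hop size, and noise level below are illustrative choices, not the article's settings:

```python
import numpy as np

def spectrogram_features(signal, win=32, hop=16):
    """Log-magnitude STFT spectrogram of one IMU channel, computed with a
    Hann window: frames of length `win`, advanced by `hop` samples.
    Returns an array of shape (n_frames, win // 2 + 1)."""
    n_frames = 1 + (len(signal) - win) // hop
    window = np.hanning(win)
    frames = np.stack([signal[i * hop:i * hop + win] * window
                       for i in range(n_frames)])
    spec = np.abs(np.fft.rfft(frames, axis=1))
    return np.log(spec + 1e-10)

def jitter(signal, sigma=0.05, seed=None):
    """One common feature-space augmentation: additive Gaussian noise,
    producing a perturbed copy of the signal with the same label."""
    rng = np.random.default_rng(seed)
    return signal + rng.normal(0.0, sigma, size=signal.shape)
```

In a pipeline like the one described, each jittered copy of a window contributes an extra labelled spectrogram to the LSTM's training set, which is how the augmentation ensemble counters label scarcity.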

11.
Affiliation(s)
- Fang Su: School of Economics and Management, Shaanxi University of Science & Technology, Xi’an, P.R. China
- Jing-Yan Wang: Science Division, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates

12. Zou Q, Ni L, Wang Q, Li Q, Wang S. Robust Gait Recognition by Integrating Inertial and RGBD Sensors. IEEE Transactions on Cybernetics 2018; 48:1136-1150. PMID: 28368842; DOI: 10.1109/tcyb.2017.2682280.
Abstract
Gait has been considered a promising and unique biometric for person identification. Traditionally, gait data are collected using color sensors (e.g., a CCD camera), depth sensors (e.g., a Microsoft Kinect), or inertial sensors (e.g., an accelerometer). However, a single type of sensor may capture only part of the dynamic gait features, making gait recognition sensitive to complex covariate conditions and leading to fragile gait-based person identification systems. In this paper, we propose to combine all three types of sensors for gait data collection and gait recognition, which can be used for important identification applications, such as identity recognition to access a restricted building or area. We propose two new algorithms, EigenGait and TrajGait, to extract gait features from the inertial data and the RGBD (color and depth) data, respectively. Specifically, EigenGait extracts general gait dynamics from accelerometer readings in the eigenspace, and TrajGait extracts more detailed sub-dynamics by analyzing 3D dense trajectories. Both extracted features are then fed into a supervised classifier for gait recognition and person identification. Experiments on 50 subjects, with comparisons to several other state-of-the-art gait recognition approaches, show that the proposed approach achieves higher recognition accuracy and robustness.
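EigenGait's idea of describing gait dynamics in an eigenspace can be sketched as PCA over fixed-length accelerometer windows; the function below is a hypothetical reconstruction of that step only, not the authors' full pipeline (TrajGait's dense-trajectory analysis and the supervised classifier are omitted):

```python
import numpy as np

def eigengait_features(windows, k=3):
    """Project fixed-length accelerometer windows onto their top-k
    principal directions ("eigengaits"), yielding low-dimensional
    gait descriptors.

    windows: (n_samples, window_len) array of accelerometer windows.
    Returns a (n_samples, k) feature array.
    """
    mean = windows.mean(axis=0)
    centered = windows - mean
    # Principal directions = top right singular vectors of the centered data
    # (equivalently, top eigenvectors of the sample covariance).
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    basis = Vt[:k]                # (k, window_len)
    return centered @ basis.T     # (n_samples, k)
```

The resulting low-dimensional descriptors would then be concatenated with the RGBD-derived features and fed to the supervised classifier the abstract mentions.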

13. Nabila M, Mohammed AI, Yousra BJ. Gait-based human age classification using a silhouette model. IET Biometrics 2017. DOI: 10.1049/iet-bmt.2016.0176.
Affiliation(s)
- Mansouri Nabila: ReDCAD Laboratory, University of Sfax, Sfax, Tunisia; UVHC, LAMIH Laboratory, University of Lille North, Valenciennes, France