1
Feli M, Kazemi K, Azimi I, Liljeberg P, Rahmani AM. Multitask learning approach for PPG applications: Case studies on signal quality assessment and physiological parameters estimation. Comput Biol Med 2025; 188:109798. [PMID: 39946784 DOI: 10.1016/j.compbiomed.2025.109798] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2024] [Revised: 01/30/2025] [Accepted: 02/03/2025] [Indexed: 03/05/2025]
Abstract
Wearable technology has expanded the applications of photoplethysmography (PPG) in remote health monitoring, enabling real-time measurement of various physiological parameters, such as heart rate (HR), heart rate variability (HRV), and respiration rate (RR). While existing studies mainly focus on individual parameters derived from PPG, they often overlook the shared characteristics among these physiological parameters. Multitask learning (MTL) offers a promising solution by training a single model to perform multiple related tasks, leveraging their interdependencies. However, the potential of MTL has not been thoroughly investigated in the context of PPG analysis. In this paper, we develop MTL approaches that exploit shared underlying characteristics across PPG-related tasks to improve the performance of PPG-based applications. We propose customized multitask deep learning models for two applications: (1) PPG quality assessment for HR and HRV features collected in free-living conditions and (2) simultaneous HR and RR estimation from PPG. Our models are evaluated on a PPG dataset collected from 46 subjects wearing smartwatches during their daily activities. Results demonstrate that the proposed MTL methods significantly outperform baseline single-task models, achieving higher accuracy in quality assessment and reduced error rates in HR and RR estimation.
Affiliation(s)
- Mohammad Feli
  - Department of Computing, University of Turku, Turku, Finland
- Kianoosh Kazemi
  - Department of Computing, University of Turku, Turku, Finland
- Iman Azimi
  - Department of Computer Science, University of California, Irvine, USA
- Pasi Liljeberg
  - Department of Computing, University of Turku, Turku, Finland
- Amir M Rahmani
  - Department of Computer Science, University of California, Irvine, USA; School of Nursing, University of California, Irvine, USA
2
Wang K, Zhang K, Liu B, Chen W, Han M. Early prediction of sudden cardiac death risk with Nested LSTM based on electrocardiogram sequential features. BMC Med Inform Decis Mak 2024; 24:94. [PMID: 38600479 PMCID: PMC11005267 DOI: 10.1186/s12911-024-02493-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Accepted: 03/26/2024] [Indexed: 04/12/2024] Open
Abstract
Electrocardiogram (ECG) signals are very important for heart disease diagnosis. In this paper, a novel early prediction method based on Nested Long Short-Term Memory (Nested LSTM) is developed for sudden cardiac death risk detection. First, wavelet denoising and normalization techniques are utilized for reliable reconstruction of ECG signals from extreme noise conditions. Then, a nested LSTM structure is adopted, which can guide the memory forgetting and memory selection of ECG signals, so as to improve the data processing ability and prediction accuracy of ECG signals. To demonstrate the effectiveness of the proposed method, four different models with different signal prediction techniques are used for comparison. The extensive experimental results show that this method can realize an accurate prediction of the cardiac beat's starting point and track the trend of ECG signals effectively. This study holds significant value for timely intervention for patients at risk of sudden cardiac death.
Affiliation(s)
- Ke Wang
  - College of Information Science and Technology, Zhejiang Shuren University, Hangzhou, 310015, China
- Kai Zhang
  - Comprehensive Technical Service Center of Wenzhou Customs, Wenzhou, 325299, China
- Banteng Liu
  - College of Information Science and Technology, Zhejiang Shuren University, Hangzhou, 310015, China
- Wei Chen
  - Zhejiang University, Hangzhou, 310058, China
  - Binjiang Institute of Zhejiang University, Hangzhou, 310053, China
- Meng Han
  - Zhejiang University, Hangzhou, 310058, China
  - Binjiang Institute of Zhejiang University, Hangzhou, 310053, China
3
Kang Y, Yang G, Eom H, Han S, Baek S, Noh S, Shin Y, Park C. GAN-based patient information hiding for an ECG authentication system. Biomed Eng Lett 2023; 13:197-207. [PMID: 37124113 PMCID: PMC10130315 DOI: 10.1007/s13534-023-00266-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2022] [Revised: 01/17/2023] [Accepted: 02/01/2023] [Indexed: 05/02/2023] Open
Abstract
Various biometrics such as the face, irises, and fingerprints, which can be obtained in a relatively simple way in modern society, are used in personal authentication systems to identify individuals. These biometric data are extracted from an individual's physiological data and yield high performance in identifying an individual using unique data patterns. Biometric identification is also used in portable devices such as mobile devices because it is more secure than cryptographic token-based authentication methods. However, physiological data could include personal health information such as arrhythmia related patterns in electrocardiogram (ECG) signals. To protect sensitive health information from hackers, the biomarkers of certain diseases or disorders that exist in ECG signals need to be hidden. Additionally, to implement the inference models for both arrhythmia detection and personal authentication in a mobile device, a lightweight model such as a multi-task deep learning model should be considered. This study demonstrates a multi-task neural network model that simultaneously identifies an individual's ECG and arrhythmia patterns using a small network. Finally, the computational efficiency and model size of the single-task and multi-task models were compared based on the number of parameters. Although the multi-task model has 20,000 fewer parameters than the single-task model, they yielded similar performance, which demonstrates the efficient structure of the multi-task model.
Affiliation(s)
- Youngshin Kang
  - Department of Computer Engineering, Kwangwoon University, Seoul, KR 01897 Republic of Korea
- Geunbo Yang
  - Department of Computer Engineering, Kwangwoon University, Seoul, KR 01897 Republic of Korea
- Heesang Eom
  - Department of Computer Engineering, Kwangwoon University, Seoul, KR 01897 Republic of Korea
- Seungwoo Han
  - Department of Intelligent Information System and Embedded Software Engineering, Kwangwoon University, Seoul, KR 01897 Republic of Korea
- Suwhan Baek
  - Department of Computer Engineering, Kwangwoon University, Seoul, KR 01897 Republic of Korea
- Seungil Noh
  - Department of Cybersecurity, Korea University, Seoul, KR 02841 Republic of Korea
- Youngjoo Shin
  - Department of Cybersecurity, Korea University, Seoul, KR 02841 Republic of Korea
- Cheolsoo Park
  - Department of Computer Engineering, Kwangwoon University, Seoul, KR 01897 Republic of Korea
4
Wang R, Fan J, Li Y. Deep Multi-Scale Fusion Neural Network for Multi-Class Arrhythmia Detection. IEEE J Biomed Health Inform 2020; 24:2461-2472. [DOI: 10.1109/jbhi.2020.2981526] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Indexed: 11/10/2022]
5
Pan Y, Tsang IW, Singh AK, Lin CT, Sugiyama M. Stochastic Multichannel Ranking with Brain Dynamics Preferences. Neural Comput 2020; 32:1499-1530. [PMID: 32521213 DOI: 10.1162/neco_a_01293] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/04/2022]
Abstract
A driver's cognitive state of mental fatigue significantly affects his or her driving performance and, more importantly, public safety. Previous studies have leveraged reaction time (RT) as the metric for mental fatigue and aimed at estimating the exact value of RT using electroencephalogram (EEG) signals within a regression model. However, because RTs are easily corrupted and nonsmooth during data collection, methods that focus on predicting the exact value of a noisy measurement, RT, generally suffer from poor generalization performance. Considering that human RT is the reflection of brain dynamics preference (BDP) rather than a single regression output of EEG signals, we propose a novel channel-reliability-aware ranking (CArank) model for the multichannel ranking problem. CArank learns from BDPs using EEG data robustly and aims at preserving the ordering corresponding to RTs. In particular, we introduce a transition matrix to characterize the reliability of each channel used in the EEG data, which helps in learning with BDPs only from informative EEG channels. To handle large-scale EEG signals, we propose a stochastic-generalized expectation maximum (SGEM) algorithm to update CArank in an online fashion. Comprehensive empirical analysis on EEG signals from 40 participants shows that our CArank achieves substantial improvements in reliability while simultaneously detecting noisy or less informative EEG channels.
Affiliation(s)
- Yuangang Pan
  - Centre for Artificial Intelligence, University of Technology Sydney, Sydney 2007, Australia
- Ivor W Tsang
  - Centre for Artificial Intelligence, University of Technology Sydney, Sydney 2007, Australia
- Avinash K Singh
  - Centre for Artificial Intelligence, University of Technology Sydney, Sydney 2007, Australia
- Chin-Teng Lin
  - Centre for Artificial Intelligence, University of Technology Sydney, Sydney 2007, Australia
- Masashi Sugiyama
  - Center for Advanced Intelligence Project, RIKEN, Tokyo 103-0027, and Graduate School of Frontier Sciences, University of Tokyo, Tokyo 2777-8563, Japan
6
Zeng H, Wu Z, Zhang J, Yang C, Zhang H, Dai G, Kong W. EEG Emotion Classification Using an Improved SincNet-Based Deep Learning Model. Brain Sci 2019; 9:E326. [PMID: 31739605 PMCID: PMC6895992 DOI: 10.3390/brainsci9110326] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 10/03/2019] [Revised: 11/01/2019] [Accepted: 11/12/2019] [Indexed: 02/08/2023] Open
Abstract
Deep learning (DL) methods are increasingly widely used, for example in speech and image recognition. However, designing an appropriate DL model to classify electroencephalogram (EEG) signals accurately and efficiently remains a challenge, mainly because EEG signals differ significantly between subjects, vary over time within a single subject, and are characterized by non-stability, strong randomness, and a low signal-to-noise ratio. SincNet is an efficient classifier for speaker recognition, but it has some drawbacks when classifying EEG signals. In this paper, we propose an improved SincNet-based classifier, SincNet-R, which consists of three convolutional layers and three deep neural network (DNN) layers. We then use SincNet-R to test classification accuracy and robustness on emotional EEG signals. Comparisons with the original SincNet model and other traditional classifiers, such as CNN, LSTM, and SVM, show that our proposed SincNet-R model has higher classification accuracy and better algorithm robustness.
Affiliation(s)
- Hong Zeng
  - School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China
  - Industrial NeuroScience Lab, University of Rome “La Sapienza”, 00161 Rome, Italy
- Zhenhua Wu
  - School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China
- Jiaming Zhang
  - School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China
- Chen Yang
  - School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China
- Hua Zhang
  - School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China
- Guojun Dai
  - School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China
- Wanzeng Kong
  - School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China
7
A Review on Automatic Facial Expression Recognition Systems Assisted by Multimodal Sensor Data. Sensors 2019; 19:s19081863. [PMID: 31003522 PMCID: PMC6514576 DOI: 10.3390/s19081863] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 02/28/2019] [Revised: 04/15/2019] [Accepted: 04/15/2019] [Indexed: 11/28/2022]
Abstract
Facial Expression Recognition (FER) can be applied in a wide range of research areas, such as mental disease diagnosis and human social/physiological interaction detection. With the emergence of advanced hardware and sensor technologies, FER systems have been developed to support real-world application scenes rather than laboratory environments. Although laboratory-controlled FER systems achieve very high accuracy, around 97%, transferring the technology from the laboratory to real-world applications faces the great barrier of very low accuracy, approximately 50%. In this survey, we comprehensively discuss three significant challenges in unconstrained real-world environments, namely illumination variation, head pose, and subject-dependence, which may not be resolved by only analysing images/videos in the FER system. We focus on those sensors that may provide extra information and help FER systems detect emotion in both static images and video sequences. We introduce three categories of sensors that may help improve the accuracy and reliability of an expression recognition system by tackling the challenges mentioned above in pure image/video processing. The first group is detailed-face sensors, such as eye-trackers, which detect a small dynamic change of a face component and may help differentiate background noise from facial features. The second is non-visual sensors, such as audio, depth, and EEG sensors, which provide extra information in addition to the visual dimension and improve recognition reliability, for example under illumination variation and position shift. The last is target-focused sensors, such as infrared thermal sensors, which can help FER systems filter out useless visual content and may help resist illumination variation. We also discuss the methods of fusing different inputs obtained from multimodal sensors in an emotion system.
We comparatively review the most prominent multimodal emotional expression recognition approaches and point out their advantages and limitations. We briefly introduce the benchmark data sets related to FER systems for each category of sensors and extend our survey to the open challenges and issues. Meanwhile, we design a framework for an expression recognition system that uses multimodal sensor data (provided by the three categories of sensors) to provide complete information about emotions and assist pure face image/video analysis. We theoretically analyse the feasibility and achievability of our new expression recognition system, especially for use in in-the-wild environments, and point out future directions for designing an efficient emotional expression recognition system.