1
Zheng X, Yan W, Liu B, Wu YI, Tu H. Estimation of heart rate and respiratory rate by fore-background spatiotemporal modeling of videos. Biomed Opt Express 2025;16:760-777. PMID: 39958852; PMCID: PMC11828443; DOI: 10.1364/boe.546968.
Abstract
Heart rate (HR) and respiratory rate (RR) are two critical physiological parameters that can be estimated from video recordings. However, the accuracy of remote estimation of HR and RR is affected by fluctuations in ambient illumination. To address this adverse effect, we propose a fore-background spatiotemporal (FBST) method for estimating HR and RR from videos captured by consumer-grade cameras. Initially, we identify the foreground regions of interest (ROIs) on the face and chest, as well as the background ROIs in non-body areas of the videos. Subsequently, we construct the foreground and background spatiotemporal maps based on the dichromatic reflectance model. We then introduce a lightweight network equipped with adaptive spatiotemporal layers to process the spatiotemporal maps and automatically generate a feature map of the non-illumination perturbation pulses. This feature map serves as input to a ResNet-18 network to estimate the physiological rhythm. Finally, we extract pulse signals and estimate HR and RR concurrently. Experiments conducted on three public datasets and one private dataset demonstrate the superiority of the proposed FBST method in terms of accuracy and computational efficiency. These findings provide novel insights into non-intrusive human physiological measurements using common devices.
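The abstract above builds separate foreground and background spatiotemporal maps before feeding them to a network. As a rough illustration of what such a map can look like, the Python sketch below averages the RGB values over a grid of blocks inside an ROI for every frame and stacks the resulting vectors over time; the paper's actual FBST construction (dichromatic reflectance model, adaptive spatiotemporal layers) is not reproduced, and the function name and grid size are illustrative assumptions.

```python
import numpy as np

def spatiotemporal_map(frames, roi_mask, grid=(5, 5)):
    """For each frame, average the R, G, B values inside each cell of a grid laid
    over the ROI bounding box, then stack the per-frame vectors over time
    (rows = time, columns = cell x channel)."""
    ys, xs = np.where(roi_mask)
    top, bottom, left, right = ys.min(), ys.max() + 1, xs.min(), xs.max() + 1
    rows = []
    for frame in frames:                                  # frame: H x W x 3
        patch = frame[top:bottom, left:right].astype(np.float64)
        h, w, _ = patch.shape
        cells = []
        for i in range(grid[0]):
            for j in range(grid[1]):
                block = patch[i * h // grid[0]:(i + 1) * h // grid[0],
                              j * w // grid[1]:(j + 1) * w // grid[1]]
                cells.append(block.reshape(-1, 3).mean(axis=0))   # mean R, G, B
        rows.append(np.concatenate(cells))
    return np.asarray(rows)                               # shape: (T, cells * 3)

# Synthetic usage; a background map would be built the same way from a non-skin mask.
frames = np.random.rand(100, 72, 72, 3)
mask = np.zeros((72, 72), dtype=bool); mask[10:60, 15:55] = True
st_map = spatiotemporal_map(frames, mask)                 # (100, 75)
```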
Affiliation(s)
- Xiujuan Zheng: College of Electrical Engineering, Sichuan University, Chengdu 610065, China
- Wenqin Yan: College of Electrical Engineering, Sichuan University, Chengdu 610065, China
- Boxiang Liu: College of Electrical Engineering, Sichuan University, Chengdu 610065, China; Key Laboratory of Information and Automation Technology of Sichuan Province, Sichuan University, Chengdu 610065, China
- Yue Ivan Wu: College of Electronics and Information Engineering, Sichuan University, Chengdu 610065, China
- Haiyan Tu: College of Electrical Engineering, Sichuan University, Chengdu 610065, China; Key Laboratory of Information and Automation Technology of Sichuan Province, Sichuan University, Chengdu 610065, China
2
Chen CC, Lin SX, Jeong H. Low-Complexity Timing Correction Methods for Heart Rate Estimation Using Remote Photoplethysmography. Sensors (Basel) 2025;25:588. PMID: 39860958; PMCID: PMC11768942; DOI: 10.3390/s25020588.
Abstract
With the rise of modern healthcare monitoring, heart rate (HR) estimation using remote photoplethysmography (rPPG) has gained attention for its non-contact, continuous tracking capabilities. However, most HR estimation methods rely on stable, fixed sampling intervals, while practical image capture often involves irregular frame rates and missing data, leading to inaccuracies in HR measurements. This study addresses these issues by introducing low-complexity timing correction methods, including linear, cubic, and filter interpolation, to improve HR estimation from rPPG signals under conditions of irregular sampling and data loss. Through a comparative analysis, this study offers insights into efficient timing correction techniques for enhancing HR estimation from rPPG, particularly suitable for edge-computing applications where low computational complexity is essential. Cubic interpolation can provide robust performance in reconstructing signals but requires higher computational resources, while linear and filter interpolation offer more efficient solutions. The proposed low-complexity timing correction methods improve the reliability of rPPG-based HR estimation, making it a more robust solution for real-world healthcare applications.
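To make the timing-correction idea concrete, here is a minimal Python sketch that resamples an irregularly timestamped rPPG trace onto a uniform grid using linear or cubic interpolation via scipy; the paper's filter-interpolation variant and its exact parameters are not reproduced, and the simulated capture settings are illustrative assumptions.

```python
import numpy as np
from scipy.interpolate import interp1d

def resample_uniform(t, x, fs=30.0, kind="linear"):
    """Resample an irregularly timestamped trace x(t) onto a uniform time grid.
    t: timestamps in seconds (jittered, possibly with dropped frames);
    kind: 'linear' or 'cubic'."""
    t_uniform = np.arange(t[0], t[-1], 1.0 / fs)
    f = interp1d(t, x, kind=kind, fill_value="extrapolate")
    return t_uniform, f(t_uniform)

# Simulate ~30 fps capture with timing jitter and 5% dropped frames.
rng = np.random.default_rng(0)
t = np.cumsum(rng.normal(1 / 30, 0.005, 600)); t -= t[0]
keep = rng.random(t.size) > 0.05
t, x = t[keep], np.sin(2 * np.pi * 1.2 * t[keep])       # 1.2 Hz ~ 72 bpm pulse
t_lin, x_lin = resample_uniform(t, x, kind="linear")
t_cub, x_cub = resample_uniform(t, x, kind="cubic")
```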
Affiliation(s)
- Chun-Chi Chen: Electrical Engineering Department, National Chiayi University, Chiayi 600355, Taiwan
- Song-Xian Lin: Electrical Engineering Department, National Chiayi University, Chiayi 600355, Taiwan
- Hyundoo Jeong: Department of Biomedical and Robotics Engineering, Incheon National University, Incheon 22012, Republic of Korea
3
Yan W, Zhuang J, Chen Y, Zhang Y, Zheng X. MFF-Net: A Lightweight Multi-Frequency Network for Measuring Heart Rhythm from Facial Videos. Sensors (Basel) 2024;24:7937. PMID: 39771677; PMCID: PMC11679567; DOI: 10.3390/s24247937.
Abstract
Remote photoplethysmography (rPPG) is a useful camera-based health monitoring method that can measure the heart rhythm from facial videos. Many well-established deep learning models can provide highly accurate and robust results in measuring heart rate (HR) and heart rate variability (HRV). However, these methods are unable to effectively eliminate illumination variation and motion artifact disturbances, and their substantial computational resource requirements significantly limit their applicability in real-world scenarios. Hence, we propose a lightweight multi-frequency network named MFF-Net to measure heart rhythm from facial videos in a short time. Firstly, we propose a multi-frequency mode signal fusion (MFF) mechanism, which can separate the characteristics of different modes of the original rPPG signals and send them to processors with independent parameters, helping the network recover blood volume pulse (BVP) signals accurately in complex noise environments. In addition, to help the network extract the characteristics of different modal signals effectively, we designed a temporal multiscale convolution module (TMSC-module) and a spectrum self-attention module (SSA-module). The TMSC-module can expand the receptive field of the signal-refining network, obtain more abundant multiscale information, and transmit it to the signal reconstruction network. The SSA-module can help the signal reconstruction network locate clearly inferior segments during reconstruction so as to make better decisions when merging multi-dimensional signals. Finally, to address the over-fitting that easily occurs in the network, we propose an over-fitting sampling training scheme to further improve the fitting ability of the network. Comprehensive experiments were conducted on three benchmark datasets, and we estimated HR and HRV based on the BVP signals derived by MFF-Net. Compared with state-of-the-art methods, our approach achieves better performance on both HR and HRV estimation with a lower computational burden. We conclude that the proposed MFF-Net has the potential to be applied in many real-world scenarios.
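The exact multi-frequency decomposition used by the MFF mechanism is not specified in the abstract. As a hedged illustration, the sketch below splits a raw trace into a respiratory band and a cardiac band with Butterworth band-pass filters, each of which could then be handled by a branch with its own parameters; the band limits are assumptions, not the paper's settings.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def split_bands(signal, fs, bands=((0.1, 0.5), (0.7, 3.0))):
    """Split a raw trace into frequency bands (Hz): by default a respiratory
    band (0.1-0.5 Hz) and a cardiac band (0.7-3.0 Hz)."""
    out = []
    for low, high in bands:
        b, a = butter(3, [low / (fs / 2), high / (fs / 2)], btype="band")
        out.append(filtfilt(b, a, signal))
    return out

fs = 30.0
t = np.arange(0, 20, 1 / fs)
raw = 0.5 * np.sin(2 * np.pi * 0.3 * t) + np.sin(2 * np.pi * 1.3 * t) + 0.2 * np.random.randn(t.size)
resp_band, cardiac_band = split_bands(raw, fs)
```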
Affiliation(s)
- Wenqin Yan: College of Electrical Engineering, Sichuan University, Chengdu 610065, China; Key Laboratory of Information and Automation Technology of Sichuan Province, Chengdu 610065, China
- Jialiang Zhuang: College of Electrical Engineering, Sichuan University, Chengdu 610065, China
- Yuheng Chen: College of Electrical Engineering, Sichuan University, Chengdu 610065, China; Key Laboratory of Information and Automation Technology of Sichuan Province, Chengdu 610065, China
- Yun Zhang: School of Information Science and Technology, Xi’an Jiaotong University, Xi’an 710049, China
- Xiujuan Zheng: College of Electrical Engineering, Sichuan University, Chengdu 610065, China; Key Laboratory of Information and Automation Technology of Sichuan Province, Chengdu 610065, China
4
Zou B, Zhao Y, Hu X, He C, Yang T. Remote physiological signal recovery with efficient spatio-temporal modeling. Front Physiol 2024;15:1428351. PMID: 39469440; PMCID: PMC11513465; DOI: 10.3389/fphys.2024.1428351.
Abstract
Contactless physiological signal measurement has great applications in various fields, such as affective computing and health monitoring. Physiological measurements based on remote photoplethysmography (rPPG) are realized by capturing the weak periodic color changes caused by the variation in the light absorption of the skin surface during the systole and diastole stages of a functioning heart. This measurement mode has the advantages of contactless operation, simplicity, and low cost. In recent years, several deep learning-based rPPG measurement methods have been proposed. However, the features learned by deep learning models are vulnerable to motion and illumination artefacts, and are unable to fully exploit the intrinsic temporal characteristics of the rPPG signal. This paper presents an efficient spatiotemporal modeling-based rPPG recovery method for physiological signal measurements. First, two modules are utilized in the rPPG task: 1) 3D central difference convolution for temporal context modeling with enhanced representation and generalization capacity, and 2) Huber loss for robust intensity-level rPPG recovery. Second, a dual-branch structure for both motion and appearance modeling and a soft attention mask are adapted to take full advantage of the central difference convolution. Third, a multi-task setting for joint cardiac and respiratory signal measurement is introduced to benefit from the internal relevance between the two physiological signals. Last, extensive experiments performed on three public databases show that the proposed method outperforms prior state-of-the-art methods with a Pearson's correlation coefficient higher than 0.96 on all three datasets. The generalization ability of the proposed method is also evaluated by cross-database and video compression experiments. The effectiveness and necessity of each module are confirmed by ablation studies.
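A small sketch of the Huber loss mentioned above for robust intensity-level rPPG recovery; the threshold delta is a tunable value chosen here for illustration, not the paper's setting.

```python
import numpy as np

def huber_loss(pred, target, delta=1.0):
    """Huber loss: quadratic for small residuals, linear for large ones, which
    damps the influence of outlier samples such as motion spikes."""
    r = pred - target
    quad = 0.5 * r ** 2
    lin = delta * (np.abs(r) - 0.5 * delta)
    return np.where(np.abs(r) <= delta, quad, lin).mean()

loss = huber_loss(np.array([1.0, 2.0, 10.0]), np.array([1.1, 2.2, 3.0]))
```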
Affiliation(s)
- Bochao Zou: School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China; Shunde Graduate School of University of Science and Technology Beijing, Guangdong, China
- Yu Zhao: Key Laboratory of Complex System Control Theory and Application, Tianjin University of Technology, Tianjin, China
- Xiaocheng Hu: China Academy of Electronics and Information Technology, Beijing, China
- Changyu He: China Academy of Electronics and Information Technology, Beijing, China
- Tianwa Yang: China University of Political Science and Law, Beijing, China
5
Nguyen N, Nguyen L, Li H, Bordallo López M, Álvarez Casado C. Evaluation of video-based rPPG in challenging environments: Artifact mitigation and network resilience. Comput Biol Med 2024;179:108873. PMID: 39053334; DOI: 10.1016/j.compbiomed.2024.108873.
Abstract
Video-based remote photoplethysmography (rPPG) has emerged as a promising technology for non-contact vital sign monitoring, especially under controlled conditions. However, the accurate measurement of vital signs in real-world scenarios faces several challenges, including artifacts induced by video codecs, low-light noise, degradation, low dynamic range, occlusions, and hardware and network constraints. In this article, a systematic and comprehensive investigation of these issues is conducted, measuring their detrimental effects on the quality of rPPG measurements. Additionally, practical strategies are proposed for mitigating these challenges to improve the dependability and resilience of video-based rPPG systems. Methods for effective biosignal recovery in the presence of network limitations are detailed, along with denoising and inpainting techniques aimed at preserving video frame integrity. Compared to previous studies, this paper addresses a broader range of variables and demonstrates improved accuracy across various rPPG methods, emphasizing generalizability for practical applications in diverse scenarios with varying data quality. Extensive evaluations and direct comparisons demonstrate the effectiveness of these approaches in enhancing rPPG measurements under challenging environments, contributing to the development of more reliable and effective remote vital sign monitoring technologies.
Affiliation(s)
- Nhi Nguyen: Center for Machine Vision and Signal Analysis (CMVS), University of Oulu, Oulu, Finland
- Le Nguyen: Center for Machine Vision and Signal Analysis (CMVS), University of Oulu, Oulu, Finland
- Honghan Li: Center for Machine Vision and Signal Analysis (CMVS), University of Oulu, Oulu, Finland; Division of Bioengineering, Graduate School of Engineering Science, Osaka University, Osaka, Japan
- Miguel Bordallo López: Center for Machine Vision and Signal Analysis (CMVS), University of Oulu, Oulu, Finland; VTT Technical Research Center of Finland Ltd., Oulu, Finland
6
Chen W, Yi Z, Lim LJR, Lim RQR, Zhang A, Qian Z, Huang J, He J, Liu B. Deep learning and remote photoplethysmography powered advancements in contactless physiological measurement. Front Bioeng Biotechnol 2024;12:1420100. PMID: 39104628; PMCID: PMC11298756; DOI: 10.3389/fbioe.2024.1420100.
Abstract
In recent decades, there has been ongoing development in the application of computer vision (CV) in the medical field. As conventional contact-based physiological measurement techniques often restrict a patient's mobility in the clinical environment, the ability to achieve continuous, comfortable and convenient monitoring is thus a topic of interest to researchers. One type of CV application is remote imaging photoplethysmography (rPPG), which can predict vital signs using a video or image. While contactless physiological measurement techniques have excellent application prospects, the lack of uniformity or standardization of contactless vital monitoring methods limits their application in remote healthcare/telehealth settings. Several methods have been developed to address this limitation and to handle the heterogeneity of video signals caused by movement, lighting, and equipment. The fundamental approaches include traditional algorithms with optimization and emerging deep learning (DL) algorithms. This article aims to provide an in-depth review of current Artificial Intelligence (AI) methods using CV and DL in contactless physiological measurement and a comprehensive summary of the latest developments in contactless measurement techniques for skin perfusion, respiratory rate, blood oxygen saturation, heart rate, heart rate variability, and blood pressure.
Affiliation(s)
- Wei Chen: Department of Hand Surgery, Beijing Jishuitan Hospital, Capital Medical University, Beijing, China
- Zhe Yi: Department of Hand Surgery, Beijing Jishuitan Hospital, Capital Medical University, Beijing, China
- Lincoln Jian Rong Lim: Department of Medical Imaging, Western Health, Footscray Hospital, Footscray, VIC, Australia; Department of Surgery, The University of Melbourne, Melbourne, VIC, Australia
- Rebecca Qian Ru Lim: Department of Hand & Reconstructive Microsurgery, Singapore General Hospital, Singapore, Singapore
- Aijie Zhang: Department of Hand Surgery, Beijing Jishuitan Hospital, Capital Medical University, Beijing, China
- Zhen Qian: Institute of Intelligent Diagnostics, Beijing United-Imaging Research Institute of Intelligent Imaging, Beijing, China
- Jiaxing Huang: Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
- Jia He: Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
- Bo Liu: Department of Hand Surgery, Beijing Jishuitan Hospital, Capital Medical University, Beijing, China; Beijing Research Institute of Traumatology and Orthopaedics, Beijing, China
7
Wang Y, Ren Y, Wang T, Li D, Cai H, Ji B. High-accuracy heart rate detection using multispectral IPPG technology combined with a deep learning algorithm. J Biophotonics 2024:e202400119. PMID: 38932695; DOI: 10.1002/jbio.202400119.
Abstract
Image photoplethysmography (IPPG) is a non-contact technology for detecting physiological parameters and has been widely used in heart rate (HR) detection. However, traditional imaging devices still suffer from issues such as a narrow receiving spectral range and inferior performance under motion. In this paper, we propose an HR detection method based on multispectral video. By combining multispectral imaging with IPPG technology, our method provides more accurate physiological information. To realize real-time estimation of HR directly from facial multispectral videos, we propose a new end-to-end neural network, namely IPPGResNet18. The IPPGResNet18 model was trained on the multispectral video dataset and achieved good results: MAE = 2.793, RMSE = 3.695, SD = 3.707, p = 0.304. The experimental results demonstrate high HR detection accuracy under motion using this method. For real-time monitoring of HR during movement, our method is clearly superior to conventional technical solutions.
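For reference, the reported error statistics (MAE, RMSE, and the SD of the error) are conventionally computed from per-clip HR estimates against a reference device as in the sketch below; the numbers in the usage line are illustrative only, not values from the paper.

```python
import numpy as np

def hr_error_metrics(hr_est, hr_ref):
    """MAE, RMSE, and standard deviation of the HR estimation error (bpm)."""
    err = np.asarray(hr_est, float) - np.asarray(hr_ref, float)
    mae = np.abs(err).mean()
    rmse = np.sqrt((err ** 2).mean())
    sd = err.std(ddof=1)
    return mae, rmse, sd

mae, rmse, sd = hr_error_metrics([72.5, 80.1, 65.0], [71.0, 82.0, 66.5])
```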
Affiliation(s)
- Yu Wang: School of Physics, Changchun University of Science and Technology, Changchun, China; Key Laboratory of Jilin Province for Spectral Detection Science and Technology, Changchun University of Science and Technology, Changchun, China
- Yu Ren: School of Physics, Changchun University of Science and Technology, Changchun, China; Key Laboratory of Jilin Province for Spectral Detection Science and Technology, Changchun University of Science and Technology, Changchun, China
- Tingting Wang: School of Physics, Changchun University of Science and Technology, Changchun, China; Key Laboratory of Jilin Province for Spectral Detection Science and Technology, Changchun University of Science and Technology, Changchun, China
- Dongliang Li: School of Physics, Changchun University of Science and Technology, Changchun, China; Key Laboratory of Jilin Province for Spectral Detection Science and Technology, Changchun University of Science and Technology, Changchun, China
- Hongxing Cai: School of Physics, Changchun University of Science and Technology, Changchun, China; Key Laboratory of Jilin Province for Spectral Detection Science and Technology, Changchun University of Science and Technology, Changchun, China
- Boyu Ji: School of Physics, Changchun University of Science and Technology, Changchun, China; School of Physics, Zhongshan Institute of Changchun University of Science and Technology, Zhongshan, China
8
Xiang G, Yao S, Peng Y, Deng H, Wu X, Wang K, Li Y, Wu F. An effective cross-scenario remote heart rate estimation network based on global-local information and video transformer. Phys Eng Sci Med 2024;47:729-739. PMID: 38504066; DOI: 10.1007/s13246-024-01401-4.
Abstract
Remote photoplethysmography (rPPG) technology is a non-contact physiological signal measurement method, characterized by non-invasiveness and ease of use. It has broad application potential in medical health, human factors engineering, and other fields. However, current rPPG technology is highly susceptible to variations in lighting conditions, head pose changes, and partial occlusions, posing significant challenges for its widespread application. In order to improve the accuracy of remote heart rate estimation and enhance model generalization, we propose PulseFormer, a dual-path network based on the transformer. By integrating local and global information and utilizing fast and slow paths, PulseFormer effectively captures the temporal variations of key regions and the spatial variations of the global area, facilitating the extraction of rPPG feature information while mitigating the impact of background noise variations. Heart rate estimation results show that PulseFormer achieves state-of-the-art performance on popular public rPPG datasets. Additionally, we establish a dataset containing facial expressions and synchronized physiological signals in driving scenarios and test the model pre-trained on the public data on this collected dataset. The results indicate that PulseFormer exhibits strong generalization capabilities across different data distributions in cross-scenario settings. Therefore, this model is applicable for heart rate estimation of individuals in various scenarios.
Affiliation(s)
- Guoliang Xiang: Key Laboratory of Traffic Safety on Track of Ministry of Education, School of Traffic & Transportation Engineering, Central South University, Changsha 410075, China
- Song Yao: Key Laboratory of Traffic Safety on Track of Ministry of Education, School of Traffic & Transportation Engineering, Central South University, Changsha 410075, China
- Yong Peng: Key Laboratory of Traffic Safety on Track of Ministry of Education, School of Traffic & Transportation Engineering, Central South University, Changsha 410075, China
- Hanwen Deng: Key Laboratory of Traffic Safety on Track of Ministry of Education, School of Traffic & Transportation Engineering, Central South University, Changsha 410075, China
- Xianhui Wu: Key Laboratory of Traffic Safety on Track of Ministry of Education, School of Traffic & Transportation Engineering, Central South University, Changsha 410075, China
- Kui Wang: Key Laboratory of Traffic Safety on Track of Ministry of Education, School of Traffic & Transportation Engineering, Central South University, Changsha 410075, China
- Yingli Li: Key Laboratory of Traffic Safety on Track of Ministry of Education, School of Traffic & Transportation Engineering, Central South University, Changsha 410075, China
- Fan Wu: Key Laboratory of Traffic Safety on Track of Ministry of Education, School of Traffic & Transportation Engineering, Central South University, Changsha 410075, China
9
Zhu F, Niu Q, Li X, Zhao Q, Su H, Shuai J. FM-FCN: A Neural Network with Filtering Modules for Accurate Vital Signs Extraction. Research (Wash DC) 2024;7:0361. PMID: 38737196; PMCID: PMC11082448; DOI: 10.34133/research.0361.
Abstract
Neural networks excel at capturing local spatial patterns through convolutional modules, but they may struggle to identify and effectively utilize the morphological and amplitude periodic nature of physiological signals. In this work, we propose a novel network named filtering module fully convolutional network (FM-FCN), which fuses traditional filtering techniques with neural networks to amplify physiological signals and suppress noise. First, instead of using a fully connected layer, we use an FCN to preserve the time-dimensional correlation information of physiological signals, enabling multiple cycles of signals in the network and providing a basis for signal processing. Second, we introduce the FM as a network module that adapts to eliminate unwanted interference, leveraging the structure of the filter. This approach builds a bridge between deep learning and signal processing methodologies. Finally, we evaluate the performance of FM-FCN using remote photoplethysmography. Experimental results demonstrate that FM-FCN outperforms the second-ranked method in terms of both blood volume pulse (BVP) signal and heart rate (HR) accuracy. It substantially improves the quality of BVP waveform reconstruction, with a decrease of 20.23% in mean absolute error (MAE) and an increase of 79.95% in signal-to-noise ratio (SNR). Regarding HR estimation accuracy, FM-FCN achieves a decrease of 35.85% in MAE, a decrease of 29.65% in error standard deviation, and a decrease of 32.88% in the width of the 95% limits of agreement, meeting clinical standards for HR accuracy requirements. The results highlight its potential in improving the accuracy and reliability of vital sign measurement through high-quality BVP signal extraction. The codes and datasets are available online at https://github.com/zhaoqi106/FM-FCN.
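The abstract describes a filtering module (FM) that embeds filter structure inside the network, but its exact design is not given here. As a hedged analogue, the sketch below applies a band-pass FIR kernel (a windowed sinc from scipy.signal.firwin) as a convolution, which is the kind of operation such a module can express; the cutoff frequencies and tap count are assumptions.

```python
import numpy as np
from scipy.signal import firwin

def bandpass_fir_module(signal, fs, low=0.7, high=3.0, numtaps=127):
    """Apply a linear-phase band-pass FIR filter (0.7-3.0 Hz by default) as a
    convolution; in a network, such a kernel could be fixed or treated as a
    constrained, learnable weight."""
    taps = firwin(numtaps, [low, high], pass_zero=False, fs=fs)
    return np.convolve(signal, taps, mode="same")

fs = 30.0
t = np.arange(0, 20, 1 / fs)
noisy_bvp = np.sin(2 * np.pi * 1.2 * t) + 0.5 * np.random.randn(t.size)
filtered = bandpass_fir_module(noisy_bvp, fs)
```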
Affiliation(s)
- Fangfang Zhu: Department of Physics, and Fujian Provincial Key Laboratory for Soft Functional Materials Research, Xiamen University, Xiamen 361005, China; National Institute for Data Science in Health and Medicine, and State Key Laboratory of Cellular Stress Biology, Innovation Center for Cell Signaling Network, Xiamen University, Xiamen 361005, China
- Qichao Niu: Vitalsilicon Technology Co. Ltd., Jiaxing, Zhejiang 314006, China
- Xiang Li: Department of Physics, and Fujian Provincial Key Laboratory for Soft Functional Materials Research, Xiamen University, Xiamen 361005, China
- Qi Zhao: School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan 114051, China
- Honghong Su: Yangtze Delta Region Institute of Tsinghua University, Zhejiang, Jiaxing 314006, China
- Jianwei Shuai: Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou 325001, China; Oujiang Laboratory (Zhejiang Lab for Regenerative Medicine, Vision and Brain Health), Wenzhou 325001, China
10
Fontes L, Machado P, Vinkemeier D, Yahaya S, Bird JJ, Ihianle IK. Enhancing Stress Detection: A Comprehensive Approach through rPPG Analysis and Deep Learning Techniques. Sensors (Basel) 2024;24:1096. PMID: 38400254; PMCID: PMC10892284; DOI: 10.3390/s24041096.
Abstract
Stress has emerged as a major concern in modern society, significantly impacting human health and well-being. Statistical evidence underscores the extensive social influence of stress, especially in terms of work-related stress and associated healthcare costs. This paper addresses the critical need for accurate stress detection, emphasising its far-reaching effects on health and social dynamics. Focusing on remote stress monitoring, it proposes an efficient deep learning approach for stress detection from facial videos. In contrast to the research on wearable devices, this paper proposes novel Hybrid Deep Learning (DL) networks for stress detection based on remote photoplethysmography (rPPG), employing Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), and 1D Convolutional Neural Network (1D-CNN) models with hyperparameter optimisation and augmentation techniques to enhance performance. The proposed approach yields a substantial improvement in accuracy and efficiency in stress detection, achieving up to 95.83% accuracy with the UBFC-Phys dataset while maintaining excellent computational efficiency. The experimental results demonstrate the effectiveness of the proposed Hybrid DL models for rPPG-based stress detection.
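As a hedged sketch of one of the hybrid families named above (a 1D-CNN feeding an LSTM) for classifying stress from an rPPG-derived window, the PyTorch toy model below uses illustrative layer sizes and a two-class head; it is not the paper's architecture or its hyperparameter-optimised configuration.

```python
import torch
import torch.nn as nn

class CnnLstmStress(nn.Module):
    """Toy hybrid: 1D convolutional features over an rPPG window, an LSTM over
    the resulting sequence, and a binary stress/no-stress classifier head."""
    def __init__(self, hidden=32, n_classes=2):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2),
        )
        self.lstm = nn.LSTM(input_size=16, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):                        # x: (batch, time) rPPG samples
        f = self.conv(x.unsqueeze(1))            # (batch, 16, time / 2)
        out, _ = self.lstm(f.transpose(1, 2))    # (batch, time / 2, hidden)
        return self.head(out[:, -1])             # logits from the last step

logits = CnnLstmStress()(torch.randn(4, 300))    # 4 windows of 300 samples
```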
Affiliation(s)
- Isibor Kennedy Ihianle: Department of Computer Science, Nottingham Trent University, Nottingham NG1 4FQ, UK; (L.F.); (P.M.); (D.V.); (S.Y.); (J.J.B.)
11
Lee S, Lee M, Sim JY. DSE-NN: Deeply Supervised Efficient Neural Network for Real-Time Remote Photoplethysmography. Bioengineering (Basel) 2023;10:1428. PMID: 38136019; PMCID: PMC10740871; DOI: 10.3390/bioengineering10121428.
Abstract
Non-contact remote photoplethysmography can be used in a variety of medical and healthcare fields by measuring vital signs continuously and unobtrusively. Recently, end-to-end deep learning methods have been proposed to replace the existing handcrafted features. However, since the existing deep learning methods are known as black box models, the problem of interpretability has been raised, and the same problem exists in the remote photoplethysmography (rPPG) network. In this study, we propose a method to visualize temporal and spectral representations for hidden layers, deeply supervise the spectral representation of intermediate layers through the depth of networks and optimize it for a lightweight model. The optimized network improves performance and enables fast training and inference times. The proposed spectral deep supervision helps to achieve not only high performance but also fast convergence speed through the regularization of the intermediate layers. The effect of the proposed methods was confirmed through a thorough ablation study on public datasets. As a result, similar or outperforming results were obtained in comparison to state-of-the-art models. In particular, our model achieved an RMSE of 1 bpm on the PURE dataset, demonstrating its high accuracy. Moreover, it excelled on the V4V dataset with an impressive RMSE of 6.65 bpm, outperforming other methods. We observe that our model began converging from the very first epoch, a significant improvement over other models in terms of learning efficiency. Our approach is expected to be generally applicable to models that learn spectral domain information as well as to the applications of regression that require the representations of periodicity.
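A hedged sketch of the kind of spectral supervision described above: take the power spectrum of an intermediate predicted signal, restrict it to a plausible HR band, and penalise divergence from the ground-truth HR bin. The band limits and the cross-entropy form are assumptions, not the paper's exact deep-supervision loss.

```python
import torch
import torch.nn.functional as F

def spectral_supervision_loss(pred_sig, hr_bpm, fs=30.0, band=(0.7, 3.0)):
    """pred_sig: (batch, T) intermediate rPPG prediction; hr_bpm: (batch,) reference HR.
    Cross-entropy between the band-limited power spectrum and the true HR frequency bin."""
    T = pred_sig.shape[-1]
    spec = torch.fft.rfft(pred_sig, dim=-1).abs() ** 2       # (batch, T//2 + 1)
    freqs = torch.fft.rfftfreq(T, d=1.0 / fs)                # Hz
    mask = (freqs >= band[0]) & (freqs <= band[1])
    band_spec, band_freqs = spec[:, mask], freqs[mask]
    target_bin = torch.argmin((band_freqs[None, :] - hr_bpm[:, None] / 60.0).abs(), dim=1)
    log_p = torch.log_softmax(band_spec, dim=-1)             # spectrum as logits
    return F.nll_loss(log_p, target_bin)

loss = spectral_supervision_loss(torch.randn(4, 300), torch.tensor([72.0, 80.0, 65.0, 90.0]))
```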
Affiliation(s)
- Joo Yong Sim: Department of Mechanical Systems Engineering, Sookmyung Women’s University, Seoul 04310, Republic of Korea; (S.L.); (M.L.)
12
Casado CA, Lopez MB. Face2PPG: An Unsupervised Pipeline for Blood Volume Pulse Extraction From Faces. IEEE J Biomed Health Inform 2023;27:5530-5541. PMID: 37610907; DOI: 10.1109/jbhi.2023.3307942.
Abstract
Photoplethysmography (PPG) signals have become a key technology in many fields, such as medicine, well-being, or sports. Our work proposes a set of pipelines to extract remote PPG signals (rPPG) from the face robustly, reliably, and configurably. We identify and evaluate the possible choices in the critical steps of unsupervised rPPG methodologies. We assess a state-of-the-art processing pipeline in six different datasets, incorporating important corrections in the methodology that ensure reproducible and fair comparisons. In addition, we extend the pipeline by proposing three novel ideas: 1) a new method to stabilize the detected face based on a rigid mesh normalization; 2) a new method to dynamically select the different regions in the face that provide the best raw signals; and 3) a new RGB to rPPG transformation method, called Orthogonal Matrix Image Transformation (OMIT), based on QR decomposition, that increases robustness against compression artifacts. We show that all three changes introduce noticeable improvements in retrieving rPPG signals from faces, obtaining state-of-the-art results compared with unsupervised, non-learning-based methodologies and, in some databases, results very close to those of supervised, learning-based methods. We perform a comparative study to quantify the contribution of each proposed idea. In addition, we present a series of observations that could help in future implementations.
13
Lin B, Tao J, Xu J, He L, Liu N, Zhang X. Estimation of vital signs from facial videos via video magnification and deep learning. iScience 2023;26:107845. PMID: 37790274; PMCID: PMC10542939; DOI: 10.1016/j.isci.2023.107845.
Abstract
The continuous monitoring of vital signs is one of the hottest topics in healthcare. Recent technological advances in sensors, signal processing, and image processing have spawned the development of non-contact techniques such as remote photoplethysmography (rPPG). To address the common problems of rPPG, including weak extracted signals, body movements, and generalization with limited data resources, we proposed a dual-path estimation method based on video magnification and deep learning. First, image processing is applied to detect, track, and magnify facial ROIs automatically. Then, the steady part of the wave of each processed ROI is used for the extraction of features including heart rate, PTT, and pulse wave waveform features. Blood pressure is estimated from the features via a small CNN. The results comply with the current standard and promise potential clinical applications in the future.
Affiliation(s)
- Bin Lin: Key Laboratory of Opto-Electronic Science and Technology for Medicine of Ministry of Education, Fujian Provincial Key Laboratory of Photonics Technology, College of Photonic and Electronic Engineering, Fujian Normal University, Fuzhou, Fujian 350117, China
- Jing Tao: Key Laboratory of Opto-Electronic Science and Technology for Medicine of Ministry of Education, Fujian Provincial Key Laboratory of Photonics Technology, College of Photonic and Electronic Engineering, Fujian Normal University, Fuzhou, Fujian 350117, China
- Jingjing Xu: Key Laboratory of Opto-Electronic Science and Technology for Medicine of Ministry of Education, Fujian Provincial Key Laboratory of Photonics Technology, College of Photonic and Electronic Engineering, Fujian Normal University, Fuzhou, Fujian 350117, China
- Liang He: Key Laboratory of Opto-Electronic Science and Technology for Medicine of Ministry of Education, Fujian Provincial Key Laboratory of Photonics Technology, College of Photonic and Electronic Engineering, Fujian Normal University, Fuzhou, Fujian 350117, China
- Nenrong Liu: Fujian Provincial Key Laboratory of Quantum Manipulation and New Energy Materials, Fujian Provincial Collaborative Innovation Center for Advanced High-Field Superconducting Materials and Engineering, College of Physics and Energy, Fujian Normal University, Fuzhou, Fujian 350117, China
- Xianzeng Zhang: Key Laboratory of Opto-Electronic Science and Technology for Medicine of Ministry of Education, Fujian Provincial Key Laboratory of Photonics Technology, College of Photonic and Electronic Engineering, Fujian Normal University, Fuzhou, Fujian 350117, China
14
Zhang Q, Lin X, Zhang Y, Liu Q, Cai F. Non-contact high precision pulse-rate monitoring system for moving subjects in different motion states. Med Biol Eng Comput 2023;61:2769-2783. PMID: 37474842; DOI: 10.1007/s11517-023-02884-1.
Abstract
Remote photoplethysmography (rPPG) enables contact-free monitoring of the pulse rate by using a color camera. The fundamental limitation is that motion artifacts and changes in ambient light conditions greatly affect the accuracy of pulse-rate monitoring. We propose the use of a high-speed camera and a motion suppression algorithm with high computational efficiency. This system incorporates a number of major improvements, including reproduction of pulse wave details, high-precision pulse-rate monitoring of moving subjects, and excellent scene scalability. A series of quantization methods were used to evaluate the effect of different frame rates and different algorithms in pulse-rate monitoring of moving subjects. The experimental results show that use of 180-fps video and a Plane-Orthogonal-to-Skin (POS) algorithm can produce high-precision pulse-rate monitoring results, with a mean absolute error of less than 5 bpm and a relative accuracy reaching 94.5%. Thus, it has significant potential to improve personal health care and intelligent health monitoring.
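The POS projection referred to above is a standard rPPG algorithm; a compact sketch operating on per-frame mean RGB traces is shown below, with the window length and normalization simplified relative to a production implementation.

```python
import numpy as np

def pos_pulse(rgb, fs):
    """rgb: (T, 3) mean skin R, G, B per frame. Returns a pulse signal via the
    POS projection: temporally normalize each short window, project onto the
    plane orthogonal to the skin tone, and overlap-add the tuned combination."""
    T = rgb.shape[0]
    win = int(1.6 * fs)                             # ~1.6 s windows
    P = np.array([[0, 1, -1], [-2, 1, 1]], dtype=float)
    h = np.zeros(T)
    for start in range(0, T - win + 1):
        C = rgb[start:start + win].T                # (3, win)
        Cn = C / (C.mean(axis=1, keepdims=True) + 1e-9)
        S = P @ Cn                                  # (2, win)
        p = S[0] + (S[0].std() / (S[1].std() + 1e-9)) * S[1]
        h[start:start + win] += p - p.mean()        # overlap-add
    return h

fs = 180.0                                          # e.g., the study's 180-fps video
pulse = pos_pulse(np.abs(np.random.randn(1800, 3)) + 1.0, fs)
```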
Affiliation(s)
- Qing Zhang: School of Biomedical Engineering, Hainan University, Haikou, 570228, Hainan, China
- Xingsen Lin: School of Biomedical Engineering, Hainan University, Haikou, 570228, Hainan, China
- Yuxin Zhang: School of Biomedical Engineering, Hainan University, Haikou, 570228, Hainan, China
- Qian Liu: School of Biomedical Engineering, Hainan University, Haikou, 570228, Hainan, China
- Fuhong Cai: School of Biomedical Engineering, Hainan University, Haikou, 570228, Hainan, China
15
Hino Y, Ashida K, Ogawa-Ochiai K, Tsumura N. Noise-Robust Pulse Wave Estimation from Near-Infrared Face Video Images Using the Wiener Estimation Method. J Imaging 2023;9:202. PMID: 37888309; PMCID: PMC10607892; DOI: 10.3390/jimaging9100202.
Abstract
In this paper, we propose a noise-robust pulse wave estimation method from near-infrared face video images. Pulse wave estimation in a near-infrared environment is expected to be applied to non-contact monitoring in dark areas. The conventional method cannot consider noise when performing estimation. As a result, the accuracy of pulse wave estimation in noisy environments is not very high. This may adversely affect the accuracy of heart rate data and other data obtained from pulse wave signals. Therefore, the objective of this study is to perform pulse wave estimation that is robust to noise. The Wiener estimation method, which is a simple linear computation that can consider noise, was used in this study. Experimental results showed that the combination of the proposed method and signal processing (detrending and bandpass filtering) increased the SNR (signal-to-noise ratio) by more than 2.5 dB compared to the conventional method and signal processing. The correlation coefficient between the pulse wave signal measured using a pulse wave meter and the estimated pulse wave signal was 0.30 larger on average for the proposed method. Furthermore, the AER (absolute error rate) between the heart rate measured with the pulse wave meter and the estimated heart rate was 0.82% on average for the proposed method, which was lower than the value for the conventional method (12.53% on average). These results show that the proposed method is more robust to noise than the conventional method for pulse wave estimation.
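A small sketch of the signal-processing steps named above (detrending followed by band-pass filtering) applied to an estimated pulse trace; the Wiener estimation step itself is not reproduced, and the cutoff frequencies are typical values rather than the paper's.

```python
import numpy as np
from scipy.signal import detrend, butter, filtfilt

def clean_pulse(raw, fs, low=0.7, high=3.0):
    """Remove the slow baseline drift, then keep only the cardiac band."""
    x = detrend(raw)                                       # linear detrend
    b, a = butter(4, [low / (fs / 2), high / (fs / 2)], btype="band")
    return filtfilt(b, a, x)

fs = 30.0
t = np.arange(0, 30, 1 / fs)
raw = np.sin(2 * np.pi * 1.1 * t) + 0.3 * t + 0.4 * np.random.randn(t.size)  # drift + noise
pulse = clean_pulse(raw, fs)
```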
Affiliation(s)
- Yuta Hino: Graduate School of Science and Engineering, Chiba University, Chiba 263-8522, Japan
- Koichi Ashida: Graduate School of Science and Engineering, Chiba University, Chiba 263-8522, Japan
- Keiko Ogawa-Ochiai: Kampo Clinical Center, Hiroshima University Hospital, Hiroshima 734-8511, Japan
- Norimichi Tsumura: Graduate School of Science and Engineering, Chiba University, Chiba 263-8522, Japan; Kampo Clinical Center, Hiroshima University Hospital, Hiroshima 734-8511, Japan
16
Hu M, Wu X, Wang X, Xing Y, An N, Shi P. Contactless blood oxygen estimation from face videos: A multi-model fusion method based on deep learning. Biomed Signal Process Control 2023;81:104487. PMID: 36530216; PMCID: PMC9735266; DOI: 10.1016/j.bspc.2022.104487.
Abstract
Blood oxygen saturation (SpO2), a key indicator of respiratory function, has received increasing attention during the COVID-19 pandemic. Clinical results show that patients with COVID-19 are likely to have distinctly lower SpO2 before the onset of significant symptoms. To address the shortcomings of current methods for monitoring SpO2 from face videos, this paper proposes a novel multi-model fusion method based on deep learning for SpO2 estimation. The method includes the feature extraction network named Residuals and Coordinate Attention (RCA) and the multi-model fusion SpO2 estimation module. The RCA network uses a residual block cascade and a coordinate attention mechanism to focus on the correlation between feature channels and the location information of the feature space. The multi-model fusion module includes the Color Channel Model (CCM) and the Network-Based Model (NBM). To fully use the color feature information in face videos, an image generator is constructed in the CCM to calculate SpO2 by reconstructing the red and blue channel signals. Besides, to reduce the disturbance of other physiological signals, a novel two-part loss function is designed in the NBM. Given the complementarity of the features and models that the CCM and NBM focus on, a Multi-Model Fusion Model (MMFM) is constructed. The experimental results on the PURE and VIPL-HR datasets show that all three models meet the clinical requirement (mean absolute error ≤ 2%) and demonstrate that multi-model fusion can fully exploit the SpO2 features of face videos and improve SpO2 estimation performance. Our research achievements will facilitate applications in remote medicine and home health.
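The Color Channel Model builds on red/blue channel signals; the classical relation behind such estimates is the ratio of ratios, sketched below. The calibration constants A and B are placeholders that would have to be fitted against a reference oximeter; they are not values from the paper.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def spo2_ratio_of_ratios(red, blue, fs, A=100.0, B=5.0):
    """Estimate SpO2 (%) from red and blue channel traces via
    R = (AC_red/DC_red) / (AC_blue/DC_blue) and SpO2 ~ A - B * R.
    A and B are illustrative calibration constants."""
    b, a = butter(3, [0.7 / (fs / 2), 3.0 / (fs / 2)], btype="band")
    def ac_dc(x):
        dc = x.mean()                       # slowly varying baseline
        ac = filtfilt(b, a, x).std()        # pulsatile amplitude
        return ac, dc
    ac_r, dc_r = ac_dc(np.asarray(red, float))
    ac_b, dc_b = ac_dc(np.asarray(blue, float))
    R = (ac_r / dc_r) / (ac_b / dc_b)
    return A - B * R

fs = 30.0
t = np.arange(0, 20, 1 / fs)
red = 1.0 + 0.010 * np.sin(2 * np.pi * 1.2 * t)
blue = 1.0 + 0.008 * np.sin(2 * np.pi * 1.2 * t)
spo2_est = spo2_ratio_of_ratios(red, blue, fs)
```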
Affiliation(s)
- Min Hu: Key Laboratory of Knowledge Engineering with Big Data, Ministry of Education, Anhui Province Key Laboratory of Affective Computing and Advanced Intelligent Machine, Hefei University of Technology, Hefei, Anhui 230601, China
- Xia Wu: Key Laboratory of Knowledge Engineering with Big Data, Ministry of Education, Anhui Province Key Laboratory of Affective Computing and Advanced Intelligent Machine, Hefei University of Technology, Hefei, Anhui 230601, China
- Xiaohua Wang: Key Laboratory of Knowledge Engineering with Big Data, Ministry of Education, Anhui Province Key Laboratory of Affective Computing and Advanced Intelligent Machine, Hefei University of Technology, Hefei, Anhui 230601, China
- Yan Xing: School of Mathematics, Hefei University of Technology, Hefei, Anhui 230601, China
- Ning An: Key Laboratory of Knowledge Engineering with Big Data, Ministry of Education, Anhui Province Key Laboratory of Affective Computing and Advanced Intelligent Machine, Hefei University of Technology, Hefei, Anhui 230601, China; National Smart Eldercare International S&T Cooperation Base, Hefei University of Technology, Hefei, Anhui 230601, China
- Piao Shi: Key Laboratory of Knowledge Engineering with Big Data, Ministry of Education, Anhui Province Key Laboratory of Affective Computing and Advanced Intelligent Machine, Hefei University of Technology, Hefei, Anhui 230601, China
17
Jaiswal KB, Meenpal T. Heart rate estimation network from facial videos using spatiotemporal feature image. Comput Biol Med 2022;151:106307. PMID: 36403356; PMCID: PMC9671618; DOI: 10.1016/j.compbiomed.2022.106307.
Abstract
Remote health monitoring has become almost inevitable since the SARS-CoV-2 pandemic and will continue to be accepted as a healthcare measure in the future. However, contactless measurement of vital signs such as heart rate (HR) is difficult because the amplitude of the physiological signal is very weak and can easily be degraded by noise. The sources of noise include head movements and variations in illumination or acquisition devices. In this paper, a video-based, noise-robust cardiopulmonary measurement method is proposed. 3D videos are converted to 2D Spatio-Temporal Images (STI), which suppresses noise while preserving the temporal information of the remote photoplethysmography (rPPG) signal. The proposed model provides the CNN with a new wavelet-derived motion representation, which enables estimation of HR under heterogeneous lighting conditions and continuous motion. The STI is formed by concatenating the feature vectors obtained after wavelet decomposition of subsequent frames and is provided as input to the CNN to map the corresponding HR values. The proposed approach utilizes the ability of the CNN to learn visual patterns. The proposed approach yields better results in terms of HR estimation on four benchmark datasets: MAHNOB-HCI, MMSE-HR, UBFC-rPPG and VIPL-HR.
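As a hedged illustration of forming a spatio-temporal image from wavelet features, the sketch below takes the approximation coefficients of a 2D wavelet decomposition of each grayscale ROI crop as that frame's feature vector and stacks the vectors over time; the wavelet, decomposition level, and pooling are assumptions rather than the paper's exact design.

```python
import numpy as np
import pywt   # PyWavelets

def wavelet_sti(gray_frames, wavelet="haar", level=2):
    """gray_frames: iterable of equally sized 2D grayscale ROI crops.
    Each frame contributes one row: the flattened approximation coefficients
    of its 2D wavelet decomposition. The stacked rows form the STI."""
    rows = []
    for frame in gray_frames:
        approx = pywt.wavedec2(frame.astype(np.float64), wavelet, level=level)[0]
        rows.append(approx.ravel())
    return np.vstack(rows)                 # shape: (num_frames, num_coefficients)

sti = wavelet_sti(np.random.rand(150, 64, 64))   # 150 frames of a 64x64 ROI
```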
18
Kiddle A, Barham H, Wegerif S, Petronzio C. Dynamic region of interest selection in remote photoplethysmography: proof of principle. JMIR Form Res 2023;7:e44575. PMID: 36995742; PMCID: PMC10131655; DOI: 10.2196/44575.
Abstract
BACKGROUND: Remote photoplethysmography (rPPG) can record vital signs (VSs) by detecting subtle changes in the light reflected from the skin. Lifelight (Xim Ltd) is a novel software being developed as a medical device for the contactless measurement of VSs using rPPG via integral cameras on smart devices. Research to date has focused on extracting the pulsatile VS from the raw signal, which can be influenced by factors such as ambient light, skin thickness, facial movements, and skin tone.
OBJECTIVE: This preliminary proof-of-concept study outlines a dynamic approach to rPPG signal processing wherein green channel signals from the most relevant areas of the face (the midface, comprising the cheeks, nose, and top of the lip) are optimized for each subject using tiling and aggregation (T&A) algorithms.
METHODS: High-resolution 60-second videos were recorded during the VISION-MD study. The midface was divided into 62 tiles of 20×20 pixels, and the signals from multiple tiles were evaluated using bespoke algorithms through weighting according to the signal-to-noise ratio in the frequency domain (SNR-F) score or segmentation. Midface signals before and after T&A were categorized by a trained observer blinded to the data processing as 0 (high quality, suitable for algorithm training), 1 (suitable for algorithm testing), or 2 (inadequate quality). On secondary analysis, observer categories were compared for signals predicted to improve categories following T&A based on the SNR-F score. Observer ratings and SNR-F scores were also compared before and after T&A for Fitzpatrick skin tones 5 and 6, wherein rPPG is hampered by light absorption by melanin.
RESULTS: The analysis used 4310 videos recorded from 1315 participants. Category 2 and 1 signals had lower mean SNR-F scores than category 0 signals. T&A improved the mean SNR-F score using all algorithms. Depending on the algorithm, 18% (763/4212) to 31% (1306/4212) of signals improved by at least one category, with up to 10% (438/4212) improving into category 0, and 67% (2834/4212) to 79% (3337/4212) remaining in the same category. Importantly, 9% (396/4212) to 21% (875/4212) improved from category 2 (not usable) into category 1. All algorithms showed improvements. No more than 3% (137/4212) of signals were assigned to a lower-quality category following T&A. On secondary analysis, 62% of signals (32/52) were recategorized, as predicted from the SNR-F score. T&A improved SNR-F scores in darker skin tones; 41% of signals (151/369) improved from category 2 to 1 and 12% (44/369) from category 1 to 0.
CONCLUSIONS: The T&A approach to dynamic region of interest selection improved signal quality, including in dark skin tones. The method was verified by comparison with a trained observer's rating. T&A could overcome factors that compromise whole-face rPPG. This method's performance in estimating VS is currently being assessed.
TRIAL REGISTRATION: ClinicalTrials.gov NCT04763746; https://clinicaltrials.gov/ct2/show/NCT04763746
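A generic sketch of a frequency-domain SNR score for per-tile signals and an SNR-weighted aggregate, in the spirit of the SNR-F weighting described above; the scoring band, peak window, and weighting rule are assumptions, not Lifelight's proprietary T&A algorithms.

```python
import numpy as np

def snr_f(signal, fs, band=(0.7, 3.0), half_width=0.1):
    """Ratio (dB) of spectral power within +/- half_width Hz of the dominant
    in-band peak to the remaining in-band power."""
    spec = np.abs(np.fft.rfft(signal - signal.mean())) ** 2
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    peak = freqs[in_band][np.argmax(spec[in_band])]
    near = in_band & (np.abs(freqs - peak) <= half_width)
    signal_p, noise_p = spec[near].sum(), spec[in_band & ~near].sum()
    return 10 * np.log10(signal_p / (noise_p + 1e-12))

def aggregate_tiles(tile_signals, fs):
    """Weight each tile's trace by its SNR-F score (floored at zero) and average."""
    scores = np.array([snr_f(s, fs) for s in tile_signals])
    w = np.clip(scores, 0, None)
    w = w / w.sum() if w.sum() > 0 else np.full(len(tile_signals), 1 / len(tile_signals))
    return np.average(np.vstack(tile_signals), axis=0, weights=w)

fs = 30.0
tiles = [np.sin(2 * np.pi * 1.2 * np.arange(0, 20, 1 / fs)) + 0.3 * np.random.randn(600)
         for _ in range(8)]
combined = aggregate_tiles(tiles, fs)
```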
19
Li B, Jiang W, Peng J, Li X. Deep learning-based remote-photoplethysmography measurement from short-time facial video. Physiol Meas 2022;43. PMID: 36215976; DOI: 10.1088/1361-6579/ac98f1.
Abstract
Objective. Efficient non-contact heart rate (HR) measurement from facial video has received much attention in health monitoring. Past methods relied on prior knowledge and an unproven hypothesis to extract remote photoplethysmography (rPPG) signals, e.g., manually designed regions of interest (ROIs) and the skin reflection model.
Approach. This paper presents a short-time, end-to-end HR estimation framework based on facial features and the temporal relationships of video frames. In the proposed method, a deep 3D multi-scale network with a cross-layer residual structure is designed to construct an autoencoder and extract robust rPPG features. Then, a spatial-temporal fusion mechanism is proposed to help the network focus on features related to rPPG signals. Both shallow and fused 3D spatial-temporal features are distilled to suppress redundant information in complex environments. Finally, a data augmentation strategy is presented to address the uneven distribution of HR in existing datasets.
Main results. The experimental results on four face-rPPG datasets show that our method outperforms state-of-the-art methods and requires fewer video frames. Compared with the previous best results, the proposed method improves the root mean square error (RMSE) by 5.9%, 3.4% and 21.4% on the OBF dataset (intra-test), COHFACE dataset (intra-test) and UBFC dataset (cross-test), respectively.
Significance. Our method achieves good results on diverse datasets (i.e., highly compressed video, low resolution and illumination variation), demonstrating that our method can extract stable rPPG signals in a short time.
Affiliation(s)
- Bin Li: School of Information Science and Technology, Northwest University, Xi'an, People's Republic of China
- Wei Jiang: School of Information Science and Technology, Northwest University, Xi'an, People's Republic of China
- Jinye Peng: School of Information Science and Technology, Northwest University, Xi'an, People's Republic of China
- Xiaobai Li: Center for Machine Vision and Signal Analysis, University of Oulu, Oulu, Finland
20
Man PK, Cheung KL, Sangsiri N, Shek WJ, Wong KL, Chin JW, Chan TT, So RHY. Blood Pressure Measurement: From Cuff-Based to Contactless Monitoring. Healthcare (Basel) 2022;10:2113. PMID: 36292560; PMCID: PMC9601911; DOI: 10.3390/healthcare10102113.
Abstract
Blood pressure (BP) determines whether a person has hypertension and offers implications as to whether he or she could be affected by cardiovascular disease. Cuff-based sphygmomanometers have traditionally provided both accuracy and reliability, but they require bulky equipment and relevant skills to obtain precise measurements. BP measurement from photoplethysmography (PPG) signals has become a promising alternative for convenient and unobtrusive BP monitoring. Moreover, the recent developments in remote photoplethysmography (rPPG) algorithms have enabled new innovations for contactless BP measurement. This paper illustrates the evolution of BP measurement techniques from the biophysical theory, through the development of contact-based BP measurement from PPG signals, and to the modern innovations of contactless BP measurement from rPPG signals. We consolidate knowledge from a diverse background of academic research to highlight the importance of multi-feature analysis for improving measurement accuracy. We conclude with the ongoing challenges, opportunities, and possible future directions in this emerging field of research.
Affiliation(s)
- Ping-Kwan Man: PanopticAI, Hong Kong Science and Technology Parks, New Territories, Hong Kong, China
- Kit-Leong Cheung: PanopticAI, Hong Kong Science and Technology Parks, New Territories, Hong Kong, China; Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, China
- Nawapon Sangsiri: PanopticAI, Hong Kong Science and Technology Parks, New Territories, Hong Kong, China; Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, China
- Wilfred Jin Shek: PanopticAI, Hong Kong Science and Technology Parks, New Territories, Hong Kong, China; Department of Biomedical Sciences, King’s College London, London WC2R 2LS, UK
- Kwan-Long Wong: PanopticAI, Hong Kong Science and Technology Parks, New Territories, Hong Kong, China; Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Jing-Wei Chin: PanopticAI, Hong Kong Science and Technology Parks, New Territories, Hong Kong, China; Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Tsz-Tai Chan: PanopticAI, Hong Kong Science and Technology Parks, New Territories, Hong Kong, China; Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Richard Hau-Yue So: PanopticAI, Hong Kong Science and Technology Parks, New Territories, Hong Kong, China; Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
21
Jaiswal KB, Meenpal T. rPPG-FuseNet: Non-contact heart rate estimation from facial video via RGB/MSR signal fusion. Biomed Signal Process Control 2022. DOI: 10.1016/j.bspc.2022.104002.
22
PERSIST: Improving micro-expression spotting using better feature encodings and multi-scale Gaussian TCN. Appl Intell 2022. DOI: 10.1007/s10489-022-03553-w.
23
Heart Rate Measurement Based on 3D Central Difference Convolution with Attention Mechanism. Sensors (Basel) 2022;22:688. PMID: 35062649; PMCID: PMC8781886; DOI: 10.3390/s22020688.
Abstract
Remote photoplethysmography (rPPG) is a video-based non-contact heart rate measurement technology. However, most existing rPPG methods fail to deal with the spatiotemporal features of the video, which are significant for the extraction of the rPPG signal. In this paper, we propose a 3D central difference convolutional network (CDCA-rPPGNet) to measure heart rate, with an attention mechanism to combine spatial and temporal features. First, we crop regions of interest using facial landmarks and stitch them together. Next, the high-quality regions of interest are fed to CDCA-rPPGNet, which is based on a central difference convolution that can enhance the spatiotemporal representation and capture rich, relevant time contexts by collecting time-difference information. In addition, we integrate an attention module into the neural network, aiming to strengthen its ability to extract channel and spatial features from the video, so as to obtain more accurate rPPG signals. In summary, the three main contributions of this paper are as follows: (1) the proposed network, based on central difference convolution, can better capture the subtle color changes to recover the rPPG signals; (2) the proposed ROI extraction method provides high-quality input to the network; (3) the attention module is used to strengthen the ability of the network to extract features. Extensive experiments are conducted on two public datasets, the PURE dataset and the UBFC-rPPG dataset. Our proposed method achieves 0.46 MAE (bpm), 0.90 RMSE (bpm) and an R value (Pearson's correlation coefficient) of 0.99 on the PURE dataset, and 0.60 MAE (bpm), 1.38 RMSE (bpm) and an R value of 0.99 on the UBFC dataset, which proves the effectiveness of our proposed approach.
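A hedged PyTorch sketch of a 3D central difference convolution in its commonly used form, where the vanilla convolution is combined with a theta-weighted central-difference term computed from the summed kernel; the paper's full CDCA-rPPGNet and its attention module are not reproduced, and theta is illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CDC3d(nn.Module):
    """3D central difference convolution, combined form:
    y = conv3d(x, W) - theta * sum(W) * x, where the second term is computed
    with a 1x1x1 convolution whose weights are the kernel summed over its
    temporal and spatial extent."""
    def __init__(self, in_ch, out_ch, kernel_size=3, padding=1, theta=0.6):
        super().__init__()
        self.conv = nn.Conv3d(in_ch, out_ch, kernel_size, padding=padding, bias=False)
        self.theta = theta

    def forward(self, x):                        # x: (B, C, T, H, W)
        out = self.conv(x)
        if self.theta == 0:
            return out
        kernel_sum = self.conv.weight.sum(dim=(2, 3, 4), keepdim=True)  # (out, in, 1, 1, 1)
        out_center = F.conv3d(x, kernel_sum, stride=self.conv.stride, padding=0)
        return out - self.theta * out_center

feat = CDC3d(3, 16)(torch.randn(1, 3, 16, 64, 64))   # e.g. a 16-frame RGB clip
```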